Title: | Evaluation of Algorithm Collections Using Item Response Theory |
---|---|
Description: | An evaluation framework for algorithm portfolios using Item Response Theory (IRT). We use continuous and polytomous IRT models to evaluate algorithms and introduce algorithm characteristics such as stability, effectiveness and anomalousness (Kandanaarachchi, Smith-Miles 2020) <doi:10.13140/RG.2.2.11363.09760>. |
Authors: | Sevvandi Kandanaarachchi [aut, cre] |
Maintainer: | Sevvandi Kandanaarachchi <[email protected]> |
License: | GPL-3 |
Version: | 0.2.3 |
Built: | 2025-01-16 02:37:54 UTC |
Source: | https://github.com/sevvandi/airt |
This function computes the actual and predicted effectiveness of a given algorithm for different tolerance values.
algo_effectiveness_crm(mod, num = 1)
mod |
A fitted |
num |
The algorithm number for which the effectiveness is computed. |
A list with the following components:
effective |
The |
predictedEff |
The area under the predicted effectiveness curve. |
actualEff |
The area under the actual effectiveness curve. |
set.seed(1)
x1 <- runif(100)
x2 <- runif(100)
x3 <- runif(100)
X <- cbind.data.frame(x1, x2, x3)
max_item <- rep(1,3)
min_item <- rep(0,3)
mod <- cirtmodel(X, max.item=max_item, min.item=min_item)
out <- algo_effectiveness_crm(mod$model, num=1)
out
This function computes the actual and predicted effectiveness of a given algorithm for different tolerance values.
algo_effectiveness_poly(mod, num = 1)
mod |
A fitted |
num |
The algorithm number |
A list with the following components:
effective |
The |
predictedEff |
The area under the predicted effectiveness curve. |
actualEff |
The area under the actual effectiveness curve. |
set.seed(1)
x1 <- sample(1:5, 100, replace = TRUE)
x2 <- sample(1:5, 100, replace = TRUE)
x3 <- sample(1:5, 100, replace = TRUE)
X <- cbind.data.frame(x1, x2, x3)
mod <- pirtmodel(X)
out <- algo_effectiveness_poly(mod$model, num=1)
out
This function fits a continuous Item Response Theory (IRT) model to the algorithm performance data. The function EstCRMitem in the R package EstCRM is updated to accommodate negative discrimination.
cirtmodel(df, scale = FALSE, scale.method = NULL, max.item = 1, min.item = 0)
df |
The performance data in a matrix or dataframe with good performances having high values and poor performances having low values. |
scale |
If |
scale.method |
The method to scale the data. The default is |
max.item |
A vector with the maximum performance value for each algorithm.
This can be used to inform the maximum performance value for each algorithm.
It is used only if scale is |
min.item |
A vector with the minimum performance value for each algorithm.
This can be used to inform the minimum performance value for each algorithm.
It is used only if scale is |
A list with the following components:
model |
The IRT model. |
anomalous |
A binary value for each algorithm. It is set to 1 if an algorithm is anomalous. Otherwise it is set to 0. |
consistency |
The consistency of each algorithm. |
difficulty_limit |
The difficulty limit of each algorithm. A higher difficulty limit indicates that the algorithm can tackle harder problems. |
Zopluoglu C (2022). EstCRM: Calibrating Parameters for the Samejima's Continuous IRT Model. R package version 1.6, https://CRAN.R-project.org/package=EstCRM.
set.seed(1)
x1 <- runif(100)
x2 <- runif(100)
x3 <- runif(100)
X <- cbind.data.frame(x1, x2, x3)
mod <- cirtmodel(X)
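The components `anomalous`, `consistency` and `difficulty_limit` are derived from the fitted item parameters. The base R sketch below shows one way such quantities could be computed. It assumes, and this mapping is not stated in this manual, that each algorithm has a discrimination parameter `alpha` and a difficulty parameter `beta`, that an algorithm is flagged anomalous when its discrimination is negative, that consistency is the reciprocal of the absolute discrimination, and that the difficulty limit is the negated difficulty parameter. The parameter table and its column names are hypothetical; inspect the output of `cirtmodel` for the values the package actually returns.

```r
# Hypothetical parameter table; the values are made up for illustration.
paras <- data.frame(
  alpha = c(1.8, -0.6, 0.4),   # assumed discrimination parameters
  beta  = c(-0.5, 0.2, 1.1),   # assumed difficulty parameters
  row.names = c("algo1", "algo2", "algo3")
)

# Assumption: negative discrimination marks an anomalous algorithm.
anomalous <- as.integer(paras$alpha < 0)

# Assumption: flatter response curves (small |alpha|) mean higher consistency.
consistency <- 1 / abs(paras$alpha)

# Assumption: the difficulty limit is the negated difficulty parameter.
difficulty_limit <- -paras$beta

data.frame(anomalous, consistency, difficulty_limit,
           row.names = rownames(paras))
```

Under these assumptions only the second algorithm, whose discrimination is negative, is flagged as anomalous.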
This dataset contains the performance of 10 classification algorithms on 235 datasets discussed in the paper Instance Spaces for Machine Learning Classification by M. A. Munoz, L. Villanova, D. Baatar, and K. A. Smith-Miles.
classification_cts
A dataframe of 235 x 10 dimensions.
Each row contains the performance of the 10 classification algorithms on a single dataset.
Each column contains the algorithm performance of a single algorithm.
https://katesmithmiles.wixsite.com/home/matilda
This dataset contains the performance of 10 classification algorithms on 235 datasets discussed in the paper Instance Spaces for Machine Learning Classification by M. A. Munoz, L. Villanova, D. Baatar, and K. A. Smith-Miles.
classification_poly
A dataframe of 235 x 10 dimensions.
Each row contains the performance of the 10 classification algorithms on a single dataset.
Each column contains the algorithm performance of a single algorithm.
https://katesmithmiles.wixsite.com/home/matilda
This function computes the actual and predicted effectiveness of the collection of algorithms for different tolerance values.
effectiveness_crm(model)

## S3 method for class 'effectivenesscrm'
autoplot(object, plottype = 1, ...)
model |
The output of the function cirtmodel. |
object |
For autoplot: The output of the function effectiveness_crm |
plottype |
For autoplot: If plottype = 1, then actual effectiveness is plotted, if plottype = 2, then predicted effectiveness is plotted. If plottype = 3, area under the actual effectiveness curve (AUAEC) is plotted against area under the predicted effectiveness curve (AUPEC). |
... |
Other arguments currently ignored. |
A list with the following components:
effectivenessAUC |
The area under the actual and predicted effectiveness curves. |
actcurves |
The |
prdcurves |
The |
set.seed(1)
x1 <- runif(200)
x2 <- 2*x1 + rnorm(200, mean=0, sd=0.1)
x3 <- 1 - x1 + rnorm(200, mean=0, sd=0.1)
X <- cbind.data.frame(x1, x2, x3)
mod <- cirtmodel(X, scale = TRUE, scale.method = "multiple")
out <- effectiveness_crm(mod)
out
# For the actual effectiveness plot
autoplot(out, plottype = 1)
# For the predicted effectiveness plot
autoplot(out, plottype = 2)
# For actual and predicted effectiveness plot
autoplot(out, plottype = 3)
This function computes the actual and predicted effectiveness of the collection of algorithms for different tolerance values.
effectiveness_poly(model)

## S3 method for class 'effectivenesspoly'
autoplot(object, plottype = 1, ...)
model |
The output of pirtmodel function. |
object |
For autoplot: The output of the function effectiveness_poly |
plottype |
For autoplot: If plottype = 1, then actual effectiveness is plotted, if plottype = 2, then predicted effectiveness is plotted. If plottype = 3, area under the actual effectiveness curve (AUAEC) is plotted against area under the predicted effectiveness curve (AUPEC). |
... |
Other arguments currently ignored. |
A list with the following components:
effectivenessAUC |
The area under the actual and predicted effectiveness curves. |
actcurves |
The |
prdcurves |
The |
set.seed(1)
x1 <- sample(1:5, 100, replace = TRUE)
x2 <- sample(1:5, 100, replace = TRUE)
x3 <- sample(1:5, 100, replace = TRUE)
X <- cbind.data.frame(x1, x2, x3)
mod <- pirtmodel(X)
out <- effectiveness_poly(mod)
out
# For actual effectiveness curves
autoplot(out, plottype = 1)
# For predicted effectiveness curves
autoplot(out, plottype = 2)
# For Actual and Predicted Effectiveness (AUAEC, AUPEC)
autoplot(out, plottype = 3)
This function makes a dataframe from the continuous IRT model. The autoplot function produces the heatmaps.
heatmaps_crm(model, thetarange = c(-6, 6))

## S3 method for class 'heatmapcrm'
autoplot(
  object,
  xlab = "Theta",
  nrow = 2,
  ratio = 1,
  col_scheme = "plasma",
  ...
)
model |
Output from the function |
thetarange |
The range for |
object |
For autoplot: output of heatmaps_crm function. |
xlab |
For autoplot: xlabel. |
nrow |
For autoplot: number of rows of heatmaps to plot. |
ratio |
For autoplot: ratio for coord_fixed in ggplot. |
col_scheme |
For autoplot: the color scheme for heatmaps. Default value is plasma. |
... |
Other arguments currently ignored. |
Dataframe with output probabilities from the IRT model for all algorithms, an object of class heatmapcrm.
data(classification_cts)
model <- cirtmodel(classification_cts)
obj <- heatmaps_crm(model)
head(obj$df)
autoplot(obj)
This function performs the latent trait analysis of the datasets/problems after fitting a continuous IRT model. It fits a smoothing spline to the points to compute the latent trait. The autoplot function plots the latent trait and the performance.
latent_trait_analysis(
  df,
  scale = FALSE,
  scale.method = NULL,
  max.item = 1,
  min.item = 0,
  paras,
  epsilon = 0.01
)

## S3 method for class 'latenttrait'
autoplot(
  object,
  xlab = "Problem Difficulty",
  ylab = "Performance",
  plottype = 1,
  nrow = 2,
  se = TRUE,
  ratio = 3,
  ...
)
df |
The performance data in a matrix or dataframe with good performances having high values and poor performances having low values. |
scale |
If |
scale.method |
The method to scale the data. The default is |
max.item |
A vector with the maximum performance value for each algorithm.
This can be used to inform the maximum performance value for each algorithm.
It is used only if scale is |
min.item |
A vector with the minimum performance value for each algorithm.
This can be used to inform the minimum performance value for each algorithm.
It is used only if scale is |
paras |
The parameters from fitting |
epsilon |
A value defining good algorithm performance. If |
object |
For autoplot: the output of the function latent_trait_analysis. |
xlab |
For autoplot: the xlabel. |
ylab |
For autoplot: the ylabel. |
plottype |
For autoplot: plottype = 1 for all algorithm performances in a single plot, plottype = 2 for using facet_wrap to plot individual algorithms, plottype = 3 to plot the smoothing splines and plottype = 4 to plot strengths and weaknesses. |
nrow |
For autoplot: If |
se |
For autoplot: for plotting splines with standard errors. |
ratio |
For autoplot: for plotting strengths and weaknesses, ratio between x and y axis. |
... |
Other arguments currently ignored. |
A list with the following components:
crmtheta |
The problem trait output computed from the R package EstCRM. |
strengths |
The strengths of each algorithm and the positions on the latent trait where it performs well. |
longdf |
The dataset in long format of latent trait occupancy. |
plt |
The ggplot object showing the fitted smoothing splines. |
widedf |
The dataset in wide format with latent trait. |
thetas |
The easiness of the problem set instances. |
weakness |
The weaknesses of each algorithm and the positions on the latent trait where it performs poorly. |
# This is a dummy example.
set.seed(1)
x1 <- runif(200)
x2 <- 2*x1 + rnorm(200, mean=0, sd=0.1)
x3 <- 1 - x1 + rnorm(200, mean=0, sd=0.1)
X <- cbind.data.frame(x1, x2, x3)
max_item <- rep(max(x1, x2, x3),3)
min_item <- rep(min(x1, x2, x3),3)
mod <- cirtmodel(X, max.item=max_item, min.item=min_item)
out <- latent_trait_analysis(X, min.item= min_item, max.item = max_item, paras = mod$model$param)
out
# To plot performance against the problem difficulty
autoplot(out)
# To plot individual panels
autoplot(out, plottype = 2)
# To plot smoothing splines
autoplot(out, plottype = 3)
# To plot strengths and weaknesses
autoplot(out, plottype = 4)
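The description above mentions fitting a smoothing spline to the performance values along the latent trait. The sketch below illustrates that single step in base R with `stats::smooth.spline`, using simulated data. The latent trait values, the two performance curves and the comparison of the smoothed curves are assumptions made for illustration; they are not the computation that `latent_trait_analysis` performs internally.

```r
set.seed(1)

# Hypothetical latent trait positions for 200 problems.
theta <- sort(runif(200, -3, 3))

# Two simulated algorithms: A performs well at low theta, B is flat.
perf_a <- plogis(-theta) + rnorm(200, sd = 0.05)
perf_b <- 0.5 + rnorm(200, sd = 0.05)

# Fit one smoothing spline per algorithm (smoothing chosen by GCV by default).
fit_a <- smooth.spline(theta, perf_a)
fit_b <- smooth.spline(theta, perf_b)

# Evaluate both smoothed curves on a common grid of latent trait values.
grid <- seq(-3, 3, length.out = 50)
curve_a <- predict(fit_a, grid)$y
curve_b <- predict(fit_b, grid)$y

# Record which algorithm has the higher smoothed performance at each point.
best <- ifelse(curve_a >= curve_b, "A", "B")
table(best)
```

Regions of the grid where one smoothed curve lies above the others give an informal picture of where that algorithm is strong.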
This function converts continuous performance data to polytomous data with 5 categories.
make_polyIRT_data(df, method = 1)
df |
The input data in a dataframe or a matrix |
method |
If |
The polytomous data frame.
set.seed(1)
x1 <- runif(500)
x2 <- runif(500)
x3 <- runif(500)
x <- cbind(x1, x2, x3)
xout <- make_polyIRT_data(x)
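To show what a conversion from continuous to five-category data can look like, the sketch below bins each column into five ordered categories with equal-width intervals in base R. The helper name `bin_to_five` and the equal-width rule are assumptions made for illustration; the binning that `make_polyIRT_data` applies for each value of `method` is not specified here and may differ.

```r
# Hypothetical helper (not part of airt): equal-width binning into 5 levels.
bin_to_five <- function(x) {
  x <- as.matrix(x)
  out <- apply(x, 2, function(col) {
    # Six equally spaced break points give five intervals over the column range.
    breaks <- seq(min(col), max(col), length.out = 6)
    as.integer(cut(col, breaks = breaks, include.lowest = TRUE, labels = FALSE))
  })
  as.data.frame(out)
}

set.seed(1)
x <- cbind(x1 = runif(500), x2 = runif(500), x3 = runif(500))
xpoly <- bin_to_five(x)

# Counts per category for the first algorithm, 1 (lowest) to 5 (highest).
table(xpoly$x1)
```

The result has the same shape as the input, with integer codes from 1 to 5 that a polytomous model can take as input.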
This function computes the goodness of the IRT model for all algorithms for different goodness tolerances.
model_goodness_crm(model)

## S3 method for class 'modelgoodnesscrm'
autoplot(object, ...)
model |
The output of function cirtmodel. |
object |
For autoplot: The output of model_goodness_crm. |
... |
Other arguments currently ignored. |
A list with the following components:
goodnessAUC |
The area under the model goodness curve for each algorithm. |
curves |
The |
residuals |
The residuals for each algorithm using the AIRT model. |
set.seed(1)
x1 <- runif(200)
x2 <- 2*x1 + rnorm(200, mean=0, sd=0.1)
x3 <- 1 - x1 + rnorm(200, mean=0, sd=0.1)
X <- cbind.data.frame(x1, x2, x3)
mod <- cirtmodel(X, scale = TRUE, scale.method = "multiple")
out <- model_goodness_crm(mod)
out
autoplot(out)
This function computes the goodness of the IRT model for a given algorithm for different goodness tolerances.
model_goodness_for_algo_crm(mod, num = 1)
mod |
A fitted |
num |
The algorithm number, for which the goodness of the IRT model is computed. |
A list with the following components:
xy |
The |
auc |
The area under the model goodness curve. |
residuals |
The difference between actual and fitted performance values. |
set.seed(1)
x1 <- runif(100)
x2 <- runif(100)
x3 <- runif(100)
X <- cbind.data.frame(x1, x2, x3)
max_item <- rep(1,3)
min_item <- rep(0,3)
mod <- cirtmodel(X, max.item=max_item, min.item=min_item)
out <- model_goodness_for_algo_crm(mod$model, num=1)
out
This function computes the goodness of the IRT model fit for a given algorithm using the empirical cumulative distribution function of errors.
model_goodness_for_algo_poly(mod, num = 1)
mod |
A fitted |
num |
The algorithm number |
A list with the following components:
xy |
The |
auc |
The area under the CDF. |
mse |
The mean squared error. |
set.seed(1)
x1 <- sample(1:5, 100, replace = TRUE)
x2 <- sample(1:5, 100, replace = TRUE)
x3 <- sample(1:5, 100, replace = TRUE)
X <- cbind.data.frame(x1, x2, x3)
mod <- pirtmodel(X)
out <- model_goodness_for_algo_poly(mod$model, num=1)
out
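The idea of measuring goodness through the empirical cumulative distribution function of errors can be sketched in base R as follows. Given actual and fitted values, the helper scales the absolute errors to [0, 1], evaluates their ECDF on a grid of tolerances, and integrates the resulting curve with the trapezoidal rule. The helper name `goodness_from_errors`, the scaling by `max_error` and the tolerance grid are assumptions for illustration; the exact definition used by `model_goodness_for_algo_poly` may differ.

```r
# Hypothetical helper (not part of airt): goodness of fit from the ECDF of errors.
goodness_from_errors <- function(actual, fitted, max_error) {
  # Scale absolute errors to [0, 1] so the area under the curve is at most 1.
  err <- abs(actual - fitted) / max_error
  tol <- seq(0, 1, by = 0.01)            # grid of goodness tolerances
  prop <- ecdf(err)(tol)                 # share of problems with error <= tolerance
  # Trapezoidal rule for the area under the ECDF curve.
  auc <- sum(diff(tol) * (head(prop, -1) + tail(prop, -1)) / 2)
  list(xy = cbind(tol, prop), auc = auc, mse = mean((actual - fitted)^2))
}

set.seed(1)
actual <- sample(1:5, 100, replace = TRUE)
# Simulated fitted categories: off by at most one level, clipped to 1..5.
fitted <- pmin(pmax(actual + sample(-1:1, 100, replace = TRUE), 1), 5)

out <- goodness_from_errors(actual, fitted, max_error = 4)
out$auc
out$mse
```

In this sketch a perfect fit gives an area of 1, and the area shrinks as more problems have large errors.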
This function computes the goodness of the IRT model for all algorithms using the empirical cumulative distribution function of errors.
model_goodness_poly(model)

## S3 method for class 'modelgoodnesspoly'
autoplot(object, ...)
model |
The output from pirtmodel function. |
object |
For autoplot: The output of the model_goodness_poly function. |
... |
Other arguments currently ignored. |
A list with the following components:
goodnessAUC |
The area under the model goodness curve for each algorithm. |
mse |
The mean squared error. |
curves |
The |
set.seed(1)
x1 <- sample(1:5, 100, replace = TRUE)
x2 <- sample(1:5, 100, replace = TRUE)
x3 <- sample(1:5, 100, replace = TRUE)
X <- cbind.data.frame(x1, x2, x3)
mod <- pirtmodel(X)
out <- model_goodness_poly(mod)
out
autoplot(out)
This function fits a polytomous Item Response Theory (IRT) model using the R package mirt to the algorithm performance data.
pirtmodel(dat, ncycle = NULL, vpara = TRUE)
dat |
The performance data in a matrix or dataframe. |
ncycle |
The number of cycles for |
vpara |
It |
A list with the following components:
model |
The IRT model using the R package |
anomalous |
A binary value for each algorithm. It is set to 1 if an algorithm is anomalous. Otherwise it is set to 0. |
consistency |
The consistency of each algorithm. |
difficulty_limit |
The difficulty limits for each algorithm. A higher threshold indicates that the algorithm can tackle harder problems. |
R. Philip Chalmers (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. doi:10.18637/jss.v048.i06
set.seed(1)
x1 <- sample(1:5, 100, replace = TRUE)
x2 <- sample(1:5, 100, replace = TRUE)
x3 <- sample(1:5, 100, replace = TRUE)
X <- cbind.data.frame(x1, x2, x3)
mod <- pirtmodel(X)
This function makes a dataframe from the polytomous IRT model. The autoplot function can be used to plot trace lines.
tracelines_poly(model)

## S3 method for class 'tracelinespoly'
autoplot(
  object,
  xlab = "Theta",
  ylab = "Probability",
  nrow = 2,
  title = "Tracelines",
  ...
)
model |
Output from the function |
object |
For autoplot: output of tracelines_poly function. |
xlab |
For autoplot: xlabel. |
ylab |
For autoplot: ylabel. |
nrow |
For autoplot: number of rows of trace line panels to plot. |
title |
For autoplot: the title for the plot. |
... |
Other arguments currently ignored. |
Dataframe with output probabilities from the IRT model for all algorithms, an object of the class tracelinespoly.
data(classification_poly)
mod <- pirtmodel(classification_poly)
obj <- tracelines_poly(mod)
head(obj$df)
autoplot(obj)
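For readers unfamiliar with trace lines, the sketch below computes category probabilities as a function of theta for a single item with five categories. It assumes a generalized partial credit parameterisation with made-up parameters; this manual does not state which item type `pirtmodel` requests from mirt, so the helper `gpcm_probs` and its parameters `a` and `b` are purely illustrative.

```r
# Hypothetical helper (not part of airt or mirt): category probabilities for
# one item under a generalized partial credit parameterisation.
gpcm_probs <- function(theta, a, b) {
  z <- c(0, cumsum(a * (theta - b)))   # the lowest category has exponent 0
  ez <- exp(z - max(z))                # subtract the maximum for numerical stability
  ez / sum(ez)                         # normalise so the probabilities sum to 1
}

thetas <- seq(-6, 6, length.out = 121)
a <- 1.2                               # made-up discrimination
b <- c(-2, -0.5, 0.5, 2)               # four made-up thresholds give five categories

# One row per theta value, one column per performance category.
trace <- t(sapply(thetas, gpcm_probs, a = a, b = b))
colnames(trace) <- paste0("cat", 1:5)
head(trace)
```

Each column of `trace` is one trace line; a call such as `matplot(thetas, trace, type = "l")` would draw them against theta.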