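library("gamlss2")
## Simulate a binary response whose log-odds are nonlinear in x1 and linear in x2.
## The seed is fixed only so that the simulation is reproducible.
set.seed(123)
n <- 400
x1 <- runif(n, -3, 3)
x2 <- runif(n, -3, 3)
eta <- sin(x1) + 0.5 * x2
p <- plogis(eta)
y <- rbinom(n, size = 1, prob = p)
d <- data.frame(y = y, x1 = x1, x2 = x2)
## Fit three candidate models of increasing flexibility.
m1 <- gamlss2(y ~ x1, family = BI, data = d)
m2 <- gamlss2(y ~ s(x1), family = BI, data = d)
m3 <- gamlss2(y ~ s(x1) + x2, family = BI, data = d)
## Compare the calibration of all three models in a single plot.
calibration(m1, m2, m3)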
## Inspect the binned calibration data without drawing the plot.
head(calibration(m1, m2, m3, plot = FALSE))

Calibration Plots for Binary Responses
Description
Compute and plot calibration curves for models with 0/1 responses. The function can handle one or several fitted gamlss2 models and compares their calibration in a single plot.
Usage
calibration(..., newdata = NULL,
y = NULL, parameter = NULL, breaks = seq(0, 1, by = 0.1),
minn = 20, main = "Calibration plot",
xlab = "Predicted probability",
ylab = "Observed proportion", plot = TRUE,
add_loess = TRUE, smooth_n = 200,
col = NULL, lty = NULL, legend = TRUE, pos = "topleft",
xlim = NULL, ylim = NULL)
Arguments
... | One or several fitted gamlss2 model objects to be assessed.
newdata | Optional data frame for out-of-sample calibration.
y | Optional numeric or factor vector with the binary response.
parameter | Character, which parameter should be used for prediction.
breaks | Numeric vector of break points used to construct bins for the predicted probabilities.
minn | Integer, the minimum number of observations required in each bin.
main | Character, main title for the plot.
xlab | Character, label for the x-axis.
ylab | Character, label for the y-axis.
plot | Logical, should a calibration plot be produced?
add_loess | Logical, should a loess smooth be added for each model?
smooth_n | Integer, number of evaluation points for the loess curve.
col | Colors used for the models.
lty | Line types used for the loess curves.
legend | Logical, should a legend be added when more than one model is supplied?
pos | Character, legend position passed to legend.
xlim | The x limits of the plot.
ylim | The y limits of the plot.
Details
For each fitted model, predicted probabilities are obtained via predict using type = "parameter" and the selected parameter. If the corresponding gamlss2.family object provides a probabilities() method, this is used to transform the parameter vector into probabilities. For multi-column outputs, the first column is used.
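The first-column rule can be illustrated with a small base R sketch. The helper first_column_probs() is hypothetical and is not part of gamlss2; it only shows one plausible way to reduce a vector, matrix, or data frame of predictions to a single numeric vector of probabilities, under the assumption that the values already lie in [0, 1].

```r
## Hypothetical helper, not part of gamlss2: reduce a prediction
## object to a plain numeric vector of probabilities.
first_column_probs <- function(pred) {
  ## Matrices and data frames: keep only the first column.
  if (is.matrix(pred) || is.data.frame(pred)) {
    pred <- pred[, 1]
  }
  pred <- as.numeric(pred)
  ## Probabilities are assumed to lie in [0, 1].
  stopifnot(all(pred >= 0 & pred <= 1, na.rm = TRUE))
  pred
}

## A plain vector passes through unchanged.
p_vec <- first_column_probs(c(0.2, 0.7, 0.9))

## A two-column data frame is reduced to its first column.
p_df <- first_column_probs(data.frame(mu = c(0.1, 0.5), sigma = c(0.3, 0.4)))
```

The column names mu and sigma in the example are illustrative only; which columns a real prediction contains depends on the fitted family.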
Predicted probabilities are grouped into bins defined by breaks. Within each bin, the mean predicted probability, the observed proportion of 1s, and the number of observations are computed. Bins with fewer than minn observations are dropped.
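The binning step described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's internal code: the inputs probs and y are simulated here, the closed lower bin edge (include.lowest = TRUE) is an assumption, and the column names simply mirror those listed in the Value section.

```r
## Simulated inputs (assumption): predicted probabilities and 0/1 outcomes.
set.seed(1)
n <- 500
probs <- runif(n)
y <- rbinom(n, size = 1, prob = probs)

## Same defaults as in the Usage section.
breaks <- seq(0, 1, by = 0.1)
minn <- 20

## Assign each predicted probability to a bin.
interval <- cut(probs, breaks = breaks, include.lowest = TRUE)

## Per-bin mean prediction, observed proportion of 1s, and bin size.
tab <- data.frame(
  interval = levels(interval),
  probs = as.numeric(tapply(probs, interval, mean)),
  y = as.numeric(tapply(y, interval, mean)),
  n = as.integer(table(interval))
)

## Drop bins with fewer than minn observations.
tab <- tab[tab$n >= minn, ]
```

For a well-calibrated model the points (tab$probs, tab$y) lie close to the diagonal, which is what the calibration plot displays.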
Value
Invisibly returns a data frame with the columns interval, probs, y, n, and model. For a single model, the model column is dropped.
References
Van Calster B, McLernon DJ, van Smeden M, Wynants L, Steyerberg EW (2019). Calibration: the Achilles heel of predictive analytics. BMC Medicine, 17, 230. doi:10.1186/s12916-019-1466-7
See Also
gamlss2, predict