library("gamlss2")
<- function(...) {
Normal <- list(
fam "family" = "Normal",
"names" = c("mu", "sigma"),
"links" = c("mu" = "identity", "sigma" = "log"),
"score" = list(
"mu" = function(y, par, ...) {
- par$mu) / (par$sigma^2)
(y
},"sigma" = function(y, par, ...) {
-1 + (y - par$mu)^2 / (par$sigma^2)
}
),"hess" = list(
"mu" = function(y, par, ...) {
1 / (par$sigma^2)
},"sigma" = function(y, par, ...) {
rep(2, length(y))
},"mu.sigma" = function(y, par, ...) {
rep(0, length(y))
}
),"loglik" = function(y, par, ...) {
sum(dnorm(y, par$mu, par$sigma, log = TRUE))
},"mu" = function(par, ...) {
$mu
par
},"pdf" = function(y, par, log = FALSE) {
dnorm(y, mean = par$mu, sd = par$sigma, log = log)
},"cdf" = function(y, par, ...) {
pnorm(y, mean = par$mu, sd = par$sigma, ...)
},"random" = function(n, par) {
rnorm(n, mean = par$mu, sd = par$sigma)
},"quantile" = function(p, par) {
qnorm(p, mean = par$mu, sd = par$sigma)
},"initialize" = list(
"mu" = function(y, ...) { (y + mean(y)) / 2 },
"sigma" = function(y, ...) { rep(sd(y), length(y)) }
),"mean" = function(par) par$mu,
"variance" = function(par) par$sigma^2,
"valid.response" = function(x) {
if(is.factor(x) | is.character(x))
stop("the response should be numeric!")
return(TRUE)
}
)
class(fam) <- "gamlss2.family"
return(fam)
}
## load the abdominal circumference data
data("abdom", package = "gamlss.data")
## specify the model Formula
<- y ~ s(x) | s(x)
f
## estimate model
<- gamlss2(f, data = abdom, family = Normal)
b
## plot estimated effects
plot(b, which = "effects")
## plot diagnostics
plot(b, which = "resid")
## predict parameters
<- predict(b)
par
## predict quantiles
<- sapply(c(0.05, 0.5, 0.95), function(q) family(b)$quantile(q, par))
pq
## visualize
plot(y ~ x, data = abdom, pch = 19,
col = rgb(0.1, 0.1, 0.1, alpha = 0.3))
matplot(abdom$x, pq, type = "l", lwd = 2,
lty = 1, col = 4, add = TRUE)
## another example using only the density
## function, all derivatives are approximated
## in this case; for residual diagnostics,
## the $cdf() and $quantile() function is needed, too.
<- function(...) {
Gamma <- list(
fam "names" = c("mu", "sigma"),
"links" = c("mu" = "log", "sigma" = "log"),
"pdf" = function(y, par, log = FALSE, ...) {
<- par$sigma
shape <- par$mu/par$sigma
scale dgamma(y, shape = shape, scale = scale, log = log)
},"cdf" = function(y, par, lower.tail = TRUE, log.p = FALSE) {
<- par$sigma
shape <- par$mu/par$sigma
scale pgamma(y, shape = shape, scale = scale,
lower.tail = lower.tail, log.p = log.p)
},"quantile" = function(p, par, lower.tail = TRUE, log.p = FALSE) {
<- par$sigma
shape <- par$mu/par$sigma
scale qgamma(p, shape = shape, scale = scale,
lower.tail = lower.tail, log.p = log.p)
}
)
class(fam) <- "gamlss2.family"
return(fam)
}
## example using the Munich rent data
data("rent", package = "gamlss.data")
## model formula
<- R ~ ti(Fl) + ti(A) + ti(Fl, A, bs = "ps") |
f ti(Fl) + ti(A) + ti(Fl, A, bs = "ps")
## estimate model
<- gamlss2(f, data = rent, family = Gamma)
b
## visualize estimated effects
plot(b, which = "effects")
## diagnostics, needs the $cdf() and $quantile() function!
plot(b, which = "resid")
Family Objects in gamlss2
Description
Family objects within the package gamlss2 are used to specify the information required to use a model fitting function. This includes details such as parameter names, corresponding link functions, the density function, log-likelihood function and derivatives of the log-likelihood with respect to the predictors. In addition, family objects are used in the calculation of post-modeling statistics, such as residual diagnostics and random number generation. An overview can be found in the accompanying details and examples.
Details
A gamlss2 family object must be a list
of class “gamlss2.family”
, containing the necessary elements to define the response distribution and how it is handled during model fitting. The minimum requirements are as follows:
-
family
: the name of the distribution (character string). -
names
: a character vector of parameter names (e.g.,c(“mu”, “sigma”)
). -
links
: a named character vector specifying the link function for each parameter (e.g.,c(mu = “identity”, sigma = “log”)
), or a list of link functions, e.g., seesoftplus
. -
pdf(y, par, log = FALSE, …)
: a function to evaluate the (log-)density.
The pdf()
function must accept the response y
, a named list par
of evaluated parameter values (e.g., par$mu
, par$sigma
), a logical log
, and optional additional arguments.
Optionally, the family object may include:
-
score
: a named list of functions (one per parameter), each computing the first derivative of the log-likelihood with respect to the linear predictor:score[param]
. -
hess
: a named list of functions computing second derivatives (negative Hessian). For parametersmu
andsigma
, this includes:hess[“mu”]
,hess[“sigma”]
, and optionally cross derivatives likehess[“mu.sigma”]
. -
loglik(y, par, …)
: a function computing the total log-likelihood. -
cdf(y, par, …)
: cumulative distribution function. -
quantile(p, par, …)
: quantile function. -
random(n, par, …)
: random number generator. -
mean(par, …)
: mean function. -
variance(par, …)
: variance function. -
skewness(par, …)
,kurtosis(par, …)
: optional higher-order moment functions. -
initialize
: a named list of initialization functions, one per parameter (e.g.,initialize$mu(y, …)
), used to generate starting values. -
valid.response(x)
: a function that checks whether the response is valid (e.g., numeric, non-factor). -
optimizer()
: an optional function to define a custom optimization method for use withgamlss2
, see alsolink{RS}
.
If the analytical score or Hessian functions are not provided, they will be approximated numerically. If quantile residuals are to be computed, a cdf()
function must be provided.
See Also
gamlss2