Generalized Additive Models for Location Scale and Shape

Description

Estimation of generalized additive models for location scale and shape (GAMLSS). The model fitting function gamlss2() provides flexible infrastructures to estimate the parameters of a response distribution. The number of distributional parameters is not fixed, see gamlss2.family. Moreover, gamlss2() supports all smooth term constructors from the mgcv package in addition to the classical model terms as provided by gamlss and gamlss.add.

Usage

gamlss2(formula, ...)

## S3 method for class 'formula'
gamlss2(formula, data, family = NO,
  subset, na.action, weights, offset, start = NULL,
  control = gamlss2_control(...), ...)

## S3 method for class 'list'
gamlss2(formula, ...)

Arguments

formula A GAM-type formula or Formula. All smooth terms of the mgcv package are supported, see also formula.gam. For gamlss.list() formula is a list of formulas.
data A data frame or list or environment containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which gamlss2 is called.
family A gamlss.family or gamlss2.family object used to define distribution and the link functions of the parameters.
subset An optional vector specifying a subset of observations to be used in the fitting process.
na.action NA processing for setting up the model.frame.
weights An optional vector of prior weights to be used in the fitting process. Should be NULL or a numeric vector.
offset This can be used to specify an a priori known components to be included in the linear predictors during fitting. Please note that if only a single numeric vector is provided, the offset will be assigned to the first specified parameter of the distribution. In the case of multiple offsets, a data frame or list must be supplied. Each offset is assigned in the same order as the parameters of the distribution specified in the family object.
start Starting values for estimation algorithms.
control A list of control arguments, see gamlss2_control.
Arguments passed to gamlss2_control.

Details

The model fitting function gamlss2() provides flexible infrastructures for the estimation of GAMLSS.

  • Distributional models are specified using family objects, either from the gamlss.dist package or using gamlss2.family objects.

  • Estimation is carried out through a Newton-Raphson/Fisher scoring algorithm, see function RS. The estimation algorithms can also be exchanged using gamlss2_control. Additionally, if an optimizer is specified by the family object, this optimizer function will be employed for estimation.

  • The return value is determined by the object returned from the optimizer function, typically an object of class “gamlss2”. Default methods and extractor functions are available for this class. Nevertheless, users have the flexibility to supply their own optimizer function, along with user-specific methods tailored for the returned object.

Value

The return value is determined by the object returned from the optimizer function. By default, the optimization is performed using the RS optimizer function (see gamlss2_control), yielding an object of class “gamlss2”. Default methods and extractor functions are available for this class.

References

Rigby RA, Stasinopoulos DM (2005). “Generalized Additive Models for Location, Scale and Shape (with Discussion).” Journal of the Royal Statistical Society, Series C (Applied Statistics), 54, 507–554. doi:10.1111/j.1467-9876.2005.00510.x

Rigby RA, Stasinopoulos DM, Heller GZ, De Bastiani F (2019). Distributions for Modeling Location, Scale, and Shape: Using GAMLSS in R, Chapman and Hall/CRC. doi:10.1201/9780429298547

Stasinopoulos DM, Rigby RA (2007). “Generalized Additive Models for Location Scale and Shape (GAMLSS) in R.” Journal of Statistical Software, 23(7), 1–46. doi:10.18637/jss.v023.i07

Stasinopoulos DM, Rigby RA, Heller GZ, Voudouris V, De Bastiani F (2017). Flexible Regression and Smoothing: Using GAMLSS in R, Chapman and Hall/CRC. doi:10.1201/b21973

See Also

RS, gamlss2_control, gamlss2.family

Examples

library("gamlss2")


## load the abdominal circumference data
data("abdom", package = "gamlss.data")

## specify the model Formula
f <- y ~ s(x) | s(x) | s(x) | s(x)

## estimate model
b <- gamlss2(f, data = abdom, family = BCT)
GAMLSS-RS iteration  1: Global Deviance = 4774.4529 eps = 0.534347     
GAMLSS-RS iteration  2: Global Deviance = 4770.23 eps = 0.000884     
GAMLSS-RS iteration  3: Global Deviance = 4770.1675 eps = 0.000013     
GAMLSS-RS iteration  4: Global Deviance = 4770.1566 eps = 0.000002     
## model summary
summary(b)
Call:
gamlss2(formula = f, data = abdom, family = BCT)
---
Family: BCT 
Link function: mu = identity, sigma = log, nu = identity, tau = log
*--------
Parameter: mu 
---
Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)  226.334      1.257     180   <2e-16 ***
---
Smooth terms:
     s(x)
edf 4.551
*--------
Parameter: sigma 
---
Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -2.92263    0.01101  -265.5   <2e-16 ***
---
Smooth terms:
     s(x)
edf 2.564
*--------
Parameter: nu 
---
Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept) -0.18024    0.04599  -3.919 9.93e-05 ***
---
Smooth terms:
      s(x)
edf 1.0015
*--------
Parameter: tau 
---
Coefficients:
            Estimate Std. Error t value Pr(>|t|)    
(Intercept)   2.6547     0.0144   184.4   <2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
---
Smooth terms:
      s(x)
edf 1.0004
*--------
n = 610 df =  13.12 res.df =  596.88
Deviance = 4770.1566 Null Dev. Red. = 33.39%
AIC = 4796.3906 elapsed =  0.88sec
## plot estimated effects
plot(b, which = "effects")

## plot diagnostics
plot(b, which = "resid")

## predict parameters
par <- predict(b)

## predict quantiles
pq <- sapply(c(0.05, 0.5, 0.95), function(q) family(b)$q(q, par))

## visualize
plot(y ~ x, data = abdom, pch = 19,
  col = rgb(0.1, 0.1, 0.1, alpha = 0.3))
matplot(abdom$x, pq, type = "l", lwd = 2,
  lty = 1, col = 4, add = TRUE)

## use of starting values
m <- gamlss2(f, data = abdom, family = BCT,
  start = c(mu = 200, sigma = 0.1, nu = 0, tau = 10))
GAMLSS-RS iteration  1: Global Deviance = 4789.3844 eps = 0.556049     
GAMLSS-RS iteration  2: Global Deviance = 4775.2264 eps = 0.002956     
GAMLSS-RS iteration  3: Global Deviance = 4772.53 eps = 0.000564     
GAMLSS-RS iteration  4: Global Deviance = 4770.7437 eps = 0.000374     
GAMLSS-RS iteration  5: Global Deviance = 4769.9311 eps = 0.000170     
GAMLSS-RS iteration  6: Global Deviance = 4769.9052 eps = 0.000005     
## fix some parameters
m <- gamlss2(f, data = abdom, family = BCT,
  start = c(mu = 200, sigma = 0.1, nu = 0, tau = 10),
  fixed = c(nu = TRUE, tau = TRUE))
GAMLSS-RS iteration  1: Global Deviance = 4799.4176 eps = 0.555119     
GAMLSS-RS iteration  2: Global Deviance = 4795.2801 eps = 0.000862     
GAMLSS-RS iteration  3: Global Deviance = 4795.2666 eps = 0.000002     
## estimated coefficients (intercepts)
coef(m)
   mu.p.(Intercept) sigma.p.(Intercept)    nu.p.(Intercept)   tau.p.(Intercept) 
         226.349307           -2.922923            0.000000            2.302585 
## starting values using full predictors
m <- gamlss2(f, data = abdom, family = BCT,
  start = fitted(m))
GAMLSS-RS iteration  1: Global Deviance = 4902.0841 eps = 0.372275     
GAMLSS-RS iteration  2: Global Deviance = 4775.4062 eps = 0.025841     
GAMLSS-RS iteration  3: Global Deviance = 4774.5631 eps = 0.000176     
GAMLSS-RS iteration  4: Global Deviance = 4774.539 eps = 0.000005     
## same with
m <- gamlss2(f, data = abdom, family = BCT,
  start = m)
GAMLSS-RS iteration  1: Global Deviance = 4774.4529 eps = 0.534347     
GAMLSS-RS iteration  2: Global Deviance = 4770.23 eps = 0.000884     
GAMLSS-RS iteration  3: Global Deviance = 4770.1675 eps = 0.000013     
GAMLSS-RS iteration  4: Global Deviance = 4770.1566 eps = 0.000002