The package is designed to follow the workflow of well-established model fitting functions like lm() or glm(), so estimating a full distributional regression model takes little more than a single function call.
To illustrate the workflow using gamlss2, we analyze the HarzTraffic data, where we model the number of motorcycles (response bikes) at Sonnenberg in the Harz region of Germany. The data can be loaded with data("HarzTraffic").
The data consist of seasonal time information (variable yday) along with a number of environmental variables (e.g., mean daily temperature). As a first model, we estimate a linear regression model with normal errors (which is the default):
b <- gamlss2(bikes ~ temp + rain + sunshine + wind, data = HarzTraffic)
GAMLSS-RS iteration 1: Global Deviance = 14325.7146 eps = 0.046095
GAMLSS-RS iteration 2: Global Deviance = 14325.7146 eps = 0.000000
Note that the output of summary(b) is very similar to that of lm() and glm(); the main difference is that coefficient summaries are provided for all parameters of the distribution. In this case, the model is estimated using the NO family of the gamlss.dist package, a two-parameter distribution with parameters mu and sigma.
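Written out, the model fitted above can be sketched as follows. The symbols \beta_j and \gamma_0 are notation introduced here for illustration only. The sketch assumes the default links of the NO family (identity for mu, log for sigma) and that sigma is intercept-only because the formula specifies only the mean.

```latex
\begin{aligned}
\texttt{bikes}_i &\sim \mathcal{N}(\mu_i, \sigma^2), \\
\mu_i &= \beta_0 + \beta_1\,\texttt{temp}_i + \beta_2\,\texttt{rain}_i
        + \beta_3\,\texttt{sunshine}_i + \beta_4\,\texttt{wind}_i, \\
\log(\sigma) &= \gamma_0 .
\end{aligned}
```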
Residual Diagnostics
So far we have estimated a simple linear model with Gaussian errors, which assumes that the response variable, the number of motorcycles (bikes), follows a normal distribution with constant variance. This assumption may not hold, especially when the response is count data, which often exhibit overdispersion or non-constant variance.
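One quick numerical check for overdispersion in counts compares the sample variance with the sample mean. The following is a minimal base R sketch on simulated data, not on HarzTraffic. The negative binomial settings and the name dispersion_ratio are assumptions chosen only for illustration.

```r
# Simulated counts, NOT the HarzTraffic data: negative binomial draws
# with mean 50 and size 2 (both values chosen only for illustration).
set.seed(1)
y <- rnbinom(500, mu = 50, size = 2)

# For a Poisson variable the variance equals the mean, so a ratio
# well above 1 points to overdispersion.
dispersion_ratio <- var(y) / mean(y)
print(dispersion_ratio)
```

The same ratio could be computed for the observed response, e.g., on HarzTraffic$bikes, as a rough first impression before looking at model-based diagnostics.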
To assess whether the normal distribution with constant variance is appropriate, we can start by examining diagnostic plots.
plot(b)
These plots help us visually inspect the residuals for any deviations from the assumptions of normality and constant variance.
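To make clear what such plots can reveal, here is a minimal base R sketch on simulated data, not on HarzTraffic and not using gamlss2. It fits a Gaussian linear model with lm() to data whose error spread grows with the covariate, so the constant-variance assumption is violated by construction. All object names and simulation settings are assumptions made for illustration.

```r
# Simulated data, NOT HarzTraffic: the error spread grows with x,
# so the constant-variance assumption is violated by construction.
set.seed(123)
n <- 300
x <- runif(n, 0, 10)
y <- 2 + 3 * x + rnorm(n, sd = 0.5 + 0.5 * x)

# Gaussian linear model with a single, constant error variance.
m <- lm(y ~ x)
r <- rstandard(m)

# Normal Q-Q plot: points far from the line suggest non-normality.
qqnorm(r)
qqline(r)

# Residuals against fitted values: a funnel shape suggests
# non-constant variance.
plot(fitted(m), r, xlab = "Fitted values", ylab = "Standardized residuals")
abline(h = 0, lty = 2)

# Rough numeric companion to the second plot: rank correlation between
# fitted values and absolute residuals (near 0 under constant variance).
spread_trend <- cor(fitted(m), abs(r), method = "spearman")
print(spread_trend)
```

A clearly positive value of spread_trend, together with a funnel shape in the second plot, indicates that the spread of the residuals increases with the fitted mean.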
References
Rigby, R. A., and D. M. Stasinopoulos. 2005. “Generalized Additive Models for Location, Scale and Shape.” Journal of the Royal Statistical Society Series C (Applied Statistics) 54 (3): 507–54. https://doi.org/10.1111/j.1467-9876.2005.00510.x.