library("gamlss")
library("gamlss2")
library("nlme")
data("Orthodont")
Random Effects
Random effects can be included as an additive term for any distribution parameter in a gamlss2
model by using the re()
function in package gamlss2
. Within the GAMLSS model fitting, the random effects additive term is then fitted locally by an interface with lme
function of the nlme
package.
1 Random effects only in the \(\mu\) model
First we will consider the case of random effects only in the \(\mu\) model. In this case the estimates of the fixed effects for the \(\sigma\) distribution parameter (and also \(\nu\) and \(\tau\)) in the gamlss2
fit do not use REML estimation.
This is not a problem if the total (effective) degrees of freedom (df) for estimating the fixed and random effects for \(\mu\) are small relative to the total df (i.e. the sample size). Example 1 below illustrates this.
However, if the total (effective) df for estimating the fixed and random effects for \(\mu\) are a significant proportion of the total df, then the estimates of the fixed effects for \(\sigma\) will be seriously negatively biased. Example 2 below illustrates this. [The estimated fixed effects for \(\mu\) are OK (but not their estimated standard errors), and the random effects parameters are REML type estimates and are also OK. So, if these estimates are of primary interest, and the estimate of the \(\sigma\) parameter is not of interest, then there is no problem.]
The total (effective) df for a random effect is ALWAYS less than the df for the corresponding fixed effects. So a quick check [of whether there might be a problem with serious negative bias in the estimates of the fixed effects for \(\sigma\) in the gamlss() fit] is to compare the corresponding fixed effects df with the total df. For example, in a simple random intercepts model for \(\mu\), if m is the number of individuals (or levels of the random effects factor), and n is the total sample size, then compare m with n, and if the proportion m/n is, say, greater than 0.05, there may be problem. Similarly for a random intercepts and slopes model for \(\mu\), then compare 2m with n, and look and the proportion 2m/n.
In the first example below using re() in gamlss for random effects works [i.e. where the total number of individuals (or factor levels) is very small relative to the total number of observations and so REML estimation is not needed, for example, a relatively low number of individuals, each with a lot of repeated measurements, OR a random factor with a low number of levels, each with a lot of observations]. We show that using LME locally in gamlss() by the re() argument gives very similar results, to using LME directly.
2 Example 2
In the second example below using the Orthodont
data, the number of individuals (or factor levels) is significant relative to the total number of observations, and there is a problem with a seriously negatively biased estimate of \(\sigma\). We show that using lme
locally in gamlss2
by the re()
function gives a very different estimate of \(\sigma\) than using LME directly. The Orthodont
data set is analysed in detail on pages 147-155 of Pinheiro and Bates (2000) “Mixed-Effects Models in S and S-PLUS”.
First fit a random intercepts model using LME:
<- lme(distance ~I(age-11),random =~ 1|Subject, data=Orthodont)
l1 l1
Linear mixed-effects model fit by REML
Data: Orthodont
Log-restricted-likelihood: -223.5013
Fixed: distance ~ I(age - 11)
(Intercept) I(age - 11)
24.0231481 0.6601852
Random effects:
Formula: ~1 | Subject
(Intercept) Residual
StdDev: 2.114724 1.431592
Number of Observations: 108
Number of Groups: 27
The model is
\[ \begin{split} \texttt{distance}_{ij} \sim& NO(0, \sigma) \\ \mu =& \beta_{0}+ \beta_1 (\texttt{age}-11) +\alpha_j \\ \end{split} \] where \(\alpha_j \sim NO(0, \sigma_a)\) The fitted model gives estimates \(\beta_0 = 24.02\), \(\beta_1 = 0.660\), \(\sigma_a = 2.1147\) and \(\sigma = 1.432\).
Now fit the random intercepts model using function re()
in gamlss2
: Note that gamlss2
has problem in interpreting the I()
function so we create the variable age_11
there is a problem with random intercepts in gamlss2
so we use gamlss
<- gamlss2(distance ~ I(age-11) + re(random =~ 1 | Subject), data = Orthodont) b1
GAMLSS-RS iteration 1: Global Deviance = 355.3077 eps = 0.339068
GAMLSS-RS iteration 2: Global Deviance = 355.3077 eps = 0.000000
<- gamlss(distance ~ I(age-11) + re(random =~ 1 | Subject), data = Orthodont) m1
GAMLSS-RS iteration 1: Global Deviance = 355.3077
GAMLSS-RS iteration 2: Global Deviance = 355.3077
## extract fitted random intercept special mode term
<- specials(b1, term = "random", elements = "model")
re
## model summary
summary(re)
Linear mixed-effects model fit by maximum likelihood
Data: x$data
AIC BIC logLik
449.3895 457.4359 -221.6948
Random effects:
Formula: ~1 | Subject
(Intercept) Residual
StdDev: 2.072142 1.783505
Variance function:
Structure: fixed weights
Formula: ~weights_w
Fixed effects: x$fixed
Value Std.Error DF t-value p-value
(Intercept) 6.824005e-16 0.4235944 81 1.610976e-15 1
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-3.68695129 -0.53862941 -0.01232442 0.49100161 3.74701483
Number of Observations: 108
Number of Groups: 27
## same with gamlss
getSmo(m1)
Linear mixed-effects model fit by maximum likelihood
Data: Data
Log-likelihood: -221.6948
Fixed: fix.formula
(Intercept)
1.278595e-15
Random effects:
Formula: ~1 | Subject
(Intercept) Residual
StdDev: 2.072142 1.13493
Variance function:
Structure: fixed weights
Formula: ~W.var
Number of Observations: 108
Number of Groups: 27
Compare estimated coefficients
## main model coefficients
coef(b1)
mu.p.(Intercept) mu.p.I(age - 11) sigma.p.(Intercept)
24.0231481 0.6601852 0.2260047
summary(l1)
Linear mixed-effects model fit by REML
Data: Orthodont
AIC BIC logLik
455.0025 465.6563 -223.5013
Random effects:
Formula: ~1 | Subject
(Intercept) Residual
StdDev: 2.114724 1.431592
Fixed effects: distance ~ I(age - 11)
Value Std.Error DF t-value p-value
(Intercept) 24.023148 0.4296605 80 55.91193 0
I(age - 11) 0.660185 0.0616059 80 10.71626 0
Correlation:
(Intr)
I(age - 11) 0
Standardized Within-Group Residuals:
Min Q1 Med Q3 Max
-3.66453932 -0.53507984 -0.01289591 0.48742859 3.72178465
Number of Observations: 108
Number of Groups: 27
## random effect coefficients
plot(coef(re)[, 1] + coef(b1)["mu.p.(Intercept)"], coef(l1)[, 1],
xlab = "gamlss2 random intercepts", ylab = "lme() random intercepts",
main = "Random Intercepts Comparison")
abline(0, 1, lty = 2, col = 4)
The fitted model gives estimates \(\beta_0 = 24.02\), \(\beta_1 = 0.660\), \(\sigma_a = 2.072\) and \(\sigma = 1.254\) The estimate of \(\sigma\) is seriously negatively biased, because it is not a REML
estimate.
For this simple random intercepts model, the estimate of \(\sigma\) can be adjusted by multiplying by the local LME residual given by 1.135 giving: adjusted \(\sigma = 1.254*1.135 = 1.42\), which is very close to the LME estimate of \(\sigma\).
However this adjustment does NOT work for more complex random effects models.