Ordered Logistic Family for Ordinal Regression

Description

Defines the ordered logistic (cumulative logit) family for modeling ordinal response variables within the gamlss2 framework. This implementation supports flexible modeling of the location and threshold (cutpoint) parameters, including effects of covariates.

Usage

ologit(k)

Arguments

k An integer specifying the number of response categories. Must be k >= 2.

Details

This family implements a cumulative logit model for ordinal responses with k ordered categories. The linear predictor models a latent location parameter, and the cutpoints between response categories are parameterized via a monotonic transformation:

  • The first cutpoint is modeled directly (theta1).

  • The remaining cutpoints are expressed as theta1 + exp(delta_j) for j = 2, …, k - 1, ensuring that the thresholds remain ordered.

The ologit() family supports modeling the location and threshold differences (delta_j) as functions of covariates using additive predictors in gamlss2 via the “|” formula interface.

The family returns an object of class “gamlss2.family”, which includes methods for evaluating the log-likelihood, simulating from the model, and computing predicted probabilities.

Value

A “gamlss2.family” object to be used with gamlss2.

See Also

gamlss2, gamlss2.family, polr

Examples

library("gamlss2")

## Example using the housing data from the MASS package:
library("MASS")

## Fit standard cumulative logit model using polr().
m <- polr(Sat ~ Infl + Type + Cont, weights = Freq, data = housing)
summary(m)
Call:
polr(formula = Sat ~ Infl + Type + Cont, data = housing, weights = Freq)

Coefficients:
                Value Std. Error t value
InflMedium     0.5664    0.10465   5.412
InflHigh       1.2888    0.12716  10.136
TypeApartment -0.5724    0.11924  -4.800
TypeAtrium    -0.3662    0.15517  -2.360
TypeTerrace   -1.0910    0.15149  -7.202
ContHigh       0.3603    0.09554   3.771

Intercepts:
            Value   Std. Error t value
Low|Medium  -0.4961  0.1248    -3.9739
Medium|High  0.6907  0.1255     5.5049

Residual Deviance: 3479.149 
AIC: 3495.149 
## Convert response to integer for use with gamlss2.
housing$Satint <- as.integer(housing$Sat)

## Fit equivalent model using gamlss2.
b <- gamlss2(Satint ~ Infl + Type + Cont,
  data = housing, weights = Freq,
  family = ologit(k = 3))
GAMLSS-RS iteration  1: Global Deviance = 3483.3719 eps = 0.322658     
GAMLSS-RS iteration  2: Global Deviance = 3479.4958 eps = 0.001112     
GAMLSS-RS iteration  3: Global Deviance = 3479.1765 eps = 0.000091     
GAMLSS-RS iteration  4: Global Deviance = 3479.1514 eps = 0.000007     
summary(b)
Call:
gamlss2(formula = Satint ~ Infl + Type + Cont, data = housing, 
    family = ologit(k = 3), weights = Freq)
---
Family: Ordered Logit (3 categories) 
Link function: location = identity, theta1 = identity, delta2 = identity
*--------
Parameter: location 
---
Coefficients:
               Estimate Std. Error t value Pr(>|t|)  
(Intercept)     -0.1949  3481.4686   0.000   1.0000  
InflMedium       0.5671     0.5367   1.057   0.2947  
InflHigh         1.2904     0.5572   2.316   0.0238 *
TypeApartment   -0.5731     0.6345  -0.903   0.3698  
TypeAtrium      -0.3666     0.6364  -0.576   0.5666  
TypeTerrace     -1.0924     0.6401  -1.707   0.0928 .
ContHigh         0.3607     0.4462   0.808   0.4219  
*--------
Parameter: theta1 
---
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)   -0.6932  3481.4687       0        1
*--------
Parameter: delta2 
---
Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   0.1724     0.2191   0.787    0.434
*--------
n = 72 df =  9 res.df =  63
Deviance = 3479.1514 Null Dev. Red. = 2099.21%
AIC = 3497.1514 elapsed =  0.03sec
## Compare coefficients.
coef(m)
   InflMedium      InflHigh TypeApartment    TypeAtrium   TypeTerrace 
    0.5663937     1.2888191    -0.5723501    -0.3661866    -1.0910149 
     ContHigh 
    0.3602841 
coef(b)
  location.p.(Intercept)    location.p.InflMedium      location.p.InflHigh 
              -0.1949313                0.5671416                1.2904361 
location.p.TypeApartment    location.p.TypeAtrium   location.p.TypeTerrace 
              -0.5730733               -0.3666344               -1.0924039 
     location.p.ContHigh     theta1.p.(Intercept)     delta2.p.(Intercept) 
               0.3607349               -0.6931503                0.1723565 
## Predict class probabilities.
pm <- predict(m, type = "p")
pb <- predict(b)
pb <- family(b)$probabilities(pb)

print(head(pm))
        Low    Medium      High
1 0.3784493 0.2876752 0.3338755
2 0.3784493 0.2876752 0.3338755
3 0.3784493 0.2876752 0.3338755
4 0.2568264 0.2742122 0.4689613
5 0.2568264 0.2742122 0.4689613
6 0.2568264 0.2742122 0.4689613
print(head(pb))
       Pr(Y=1)   Pr(Y=2)   Pr(Y=3)
[1,] 0.3779593 0.2879814 0.3340593
[2,] 0.3779593 0.2879814 0.3340593
[3,] 0.3779593 0.2879814 0.3340593
[4,] 0.2562864 0.2743603 0.4693533
[5,] 0.2562864 0.2743603 0.4693533
[6,] 0.2562864 0.2743603 0.4693533