Extended Processing of "Fake" Formulas

Description

Create a simplified or "fake" version of a model formula for internal use in gamlss2. Model formulas in gamlss2 may contain complex terms such as smoothing functions, tensor-product interactions, distributional components, or multiple right-hand sides. Such extended formulas cannot be passed directly to model.frame, which expects a traditional formula as used in lm or glm.
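As a minimal illustration of the problem, the sketch below passes a formula with a smooth term straight to model.frame using base R only. The data frame d and its columns are made up for illustration. In a session where no attached package defines s(), the call fails because s() cannot be found; if a package such as mgcv is attached, the call is still expected to fail, since s() then returns an object that model.frame cannot store as a variable.

```r
## Made-up data, for illustration only.
set.seed(1)
d <- data.frame(y = rnorm(10), x1 = rnorm(10), x2 = runif(10))

## model.frame() tries to evaluate every term, including s(x2).
## The error is caught so that the script keeps running.
res <- tryCatch(
  model.frame(y ~ x1 + s(x2), data = d),
  error = function(e) e
)

## TRUE if the extended formula could not be processed.
print(inherits(res, "error"))
print(conditionMessage(res))
```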

fake_formula() therefore "fakes" a standard formula by rewriting the original model specification into a form that is acceptable to model.frame, while preserving all variable transformations (e.g., log(), exp(), arithmetic expressions) and extracting any special model terms. These special terms can then be handled separately during model setup and fitting.
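The general idea can be sketched with base R tools. The code below is not the gamlss2 implementation; it is a simplified illustration built on terms() and its specials argument, and the object names (special_terms, special_args, plain, fake) are hypothetical. It assumes a single right-hand side and that every argument of a special term is a variable or a transformation of one (no options such as k = 10, no interaction terms).

```r
## Illustrative sketch only; not the gamlss2 implementation.
f <- y ~ x1 + s(x2) + x3 + te(log(x3), x4)

## terms() flags calls to the functions named in 'specials'
## without evaluating them.
tt <- terms(f, specials = c("s", "te"))

## All variables as unevaluated expressions, response first.
vars <- as.list(attr(tt, "variables"))[-1]

## Positions of the special terms within 'vars'.
idx <- sort(unlist(as.list(attr(tt, "specials")), use.names = FALSE))

## The special terms themselves, as character strings.
special_terms <- vapply(vars[idx], deparse, character(1))

## The arguments of the special terms, e.g., x2, log(x3), x4.
special_args <- unlist(lapply(vars[idx], function(cl) {
  vapply(as.list(cl)[-1], deparse, character(1))
}))

## Ordinary terms: everything except the response and the specials.
plain <- vapply(vars[-c(1, idx)], deparse, character(1))

## Reassemble a formula that model.frame() can evaluate.
fake <- reformulate(c(plain, special_args), response = "y")

print(special_terms)
print(fake)
```

Keeping the special terms as strings alongside the rewritten formula mirrors the split described above: the strings can be processed later, while the rewritten formula only contains expressions that can be evaluated on the data.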

In summary, the function separates the parsing of model terms from the creation of the data frame: it provides a clean, simplified formula that ensures correct extraction and evaluation of variables.
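To show what the simplified formula buys, the following sketch evaluates one with model.frame in base R. The formula is typed by hand here (it has the same form as the printed results in the Examples) and the data frame d is made up; neither is produced by gamlss2 in this snippet.

```r
## Made-up data, for illustration only.
set.seed(1)
d <- data.frame(
  y = rnorm(10), x1 = rnorm(10), x2 = runif(10),
  x3 = runif(10, 1, 2), x4 = runif(10)
)

## A simplified formula, written by hand: no special terms,
## but the transformation log(x3) is kept.
ff <- y ~ x1 + x3 + x2 + log(x3) + x4

## model.frame() evaluates each term once, including log(x3).
mf <- model.frame(ff, data = d)
print(names(mf))
print(head(mf, 3))
```

The transformed variable is stored under the column name "log(x3)", so later steps can look it up without re-evaluating the transformation.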

Usage

fake_formula(formula, specials = NULL,
  nospecials = FALSE, onlyspecials = FALSE)

Arguments

formula A formula, Formula, or a list of formulas.
specials Character vector of names of special functions in the formula; see terms.formula.
nospecials Logical, should the variables of special model terms be omitted from the "fake formula"?
onlyspecials Logical, should only the special model terms be returned?

Value

Depending on the input formula, the function returns a formula or Formula. If onlyspecials = TRUE, a character vector or a list of character vectors with the special model terms is returned.

Note

In some versions of the RStudio IDE, printing or inspecting a fake_formula() object may trigger a message of the form

‘length = 2’ in coercion to ‘logical(1)’.

This is due to the way RStudio internally inspects objects of class Formula, whose length() method returns a two-element vector (reflecting the number of left- and right-hand side components). The message does not indicate a problem with fake_formula(), and the returned object is valid and works correctly with model.frame() and subsequent processing. The message does not appear in a standard R session outside RStudio.
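The two-element length can be checked directly with the Formula package. The sketch below is only an illustration: the formula is made up, and the `&&` test is one construct that can produce a message of this kind in recent R versions, not necessarily the check RStudio performs. Depending on the R version, the test yields an error, a warning, or silently uses the first element, so no particular outcome is assumed.

```r
## Runs only if the Formula package is installed.
if (requireNamespace("Formula", quietly = TRUE)) {
  ## One response, two right-hand side parts (made-up formula).
  fo <- Formula::Formula(y ~ x1 + x2 | z1)

  ## Number of left- and right-hand side parts.
  len <- length(fo)
  print(len)

  ## A scalar logical test applied to a length-2 value;
  ## the result depends on the R version, so it is caught.
  chk <- tryCatch(
    len > 0 && TRUE,
    error = function(e) conditionMessage(e),
    warning = function(w) conditionMessage(w)
  )
  print(chk)
}
```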

See Also

gamlss2

Examples

library("gamlss2")

## basic formula, log(x3) should be kept
f <- y ~ x1 + x2 + log(x3)
ff <- fake_formula(f)
print(ff)
y ~ x1 + x2 + log(x3)
## including special model terms
## again, keep log(x3)
f <- y ~ x1 + s(x2) + x3 + te(log(x3), x4)
ff <- fake_formula(f)
print(ff)
~x1 + x3 + x2 + log(x3) + x4
## multiple parts on the right-hand side
f <- y ~ x1 + s(x2) + x3 + te(log(x3), x4) | x2 + sqrt(x5)
ff <- fake_formula(f)
print(ff)
y ~ x1 + x3 + x2 + log(x3) + x4 | x2 + sqrt(x5)
## collapse all formula parts
print(formula(ff, collapse = TRUE))
y ~ x1 + x3 + x2 + log(x3) + x4 + (x2 + sqrt(x5))
print(formula(ff, collapse = TRUE, update = TRUE))
y ~ x1 + x3 + x2 + log(x3) + x4 + sqrt(x5)
## list of formulas
f <- list(
  y ~ x1 + s(x2) + x3 + te(log(x3), x4),
    ~ x2 + sqrt(x5),
    ~ z2 + x1 + exp(x3)
)
ff <- fake_formula(f)
print(ff)
y ~ x1 + x3 + x2 + log(x3) + x4 | x2 + sqrt(x5) | z2 + x1 + exp(x3)
## extract separate parts on the right-hand side
formula(ff, rhs = 1)
y ~ x1 + x3 + x2 + log(x3) + x4
formula(ff, rhs = 2)
y ~ x2 + sqrt(x5)
formula(ff, rhs = 3)
y ~ z2 + x1 + exp(x3)
## formula with multiple responses and multiple parts
f <- y1 | y2 | y3 ~ x1 + s(x2) + x3 + te(log(x3), x4) | x2 + ti(x5)
ff <- fake_formula(f)
print(ff)
y1 | y2 | y3 ~ x1 + x3 + x2 + log(x3) + x4 | x2 + x5
## list of formulas with multiple responses
f <- list(
  y1 ~ x1 + s(x2) + x3 + te(log(x3), x4),
  y2 ~ x2 + sqrt(x5),
  y3 ~ z2 + x1 + exp(x3) + s(x10)
)
ff <- fake_formula(f)
print(ff)
y1 | y2 | y3 ~ x1 + x3 + x2 + log(x3) + x4 | x2 + sqrt(x5) | 
    z2 + x1 + exp(x3) + x10
## extract the formula without the special terms
ff <- fake_formula(f, nospecials = TRUE)
print(ff)
y1 | y2 | y3 ~ x1 + x3 | x2 + sqrt(x5) | z2 + x1 + exp(x3)
## extract only special terms
ff <- fake_formula(f, onlyspecials = TRUE)
print(ff)
[[1]]
[1] "s(x2)"          "te(log(x3),x4)"

[[2]]
character(0)

[[3]]
[1] "s(x10)"