Similar to the well-known model.frame() function that is used, e.g., by the linear model fitting function lm() or by glm() for generalized linear models, the bamlss.frame() function extracts a “model frame” for fitting distributional regression models. Internally, the function parses model formulae, one for each parameter of the distribution, using the infrastructure of the Formula package (Zeileis and Croissant 2010) in combination with model.matrix() processing for linear effects and smooth.construct() processing of the mgcv package to set up design and penalty matrices for unspecified smooth function estimation (???); see also, e.g., the documentation of the functions s() and te().
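The two building blocks mentioned above can be tried out in isolation. The following is a minimal sketch, not the internal code of bamlss.frame(): it uses base R and mgcv only, and the data frame toy with columns x1, x2 and y is a hypothetical example that does not appear elsewhere in this document.

```r
library(mgcv)  # provides s(), te() and smoothCon()

set.seed(1)
# Hypothetical toy data, for illustration only.
toy <- data.frame(x1 = runif(50), x2 = runif(50))
toy$y <- 1 + 2 * toy$x1 + sin(2 * pi * toy$x2) + rnorm(50, sd = 0.1)

# Linear part: model frame and design matrix, as lm() would build them.
mf <- model.frame(y ~ x1, data = toy)
X_lin <- model.matrix(y ~ x1, data = mf)

# Smooth part: smoothCon() returns a list of smooth objects;
# each one carries a design matrix X and a list of penalty matrices S.
sm <- smoothCon(s(x2, k = 10), data = toy)[[1]]
X_smooth <- sm$X
S_smooth <- sm$S[[1]]

dim(X_lin)     # rows = observations, columns = intercept and x1
dim(X_smooth)  # rows = observations, columns = basis functions
dim(S_smooth)  # square, one row and column per basis function
```

The penalty matrix has one row and one column per basis function, so its dimension always matches the number of columns of the smooth design matrix.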
The most important arguments are:
bamlss.frame(formula, data = NULL, family = "gaussian",
  weights = NULL, subset = NULL, offset = NULL,
  na.action = na.omit, contrasts = NULL, ...)
The argument formula can be a classical model formula, e.g., as used by the lm() function, or an extended bamlss formula including smooth term specifications like s() or te(), which is internally parsed by the function bamlss.formula(). Note that the bamlss package uses special family objects, which can be passed either as a character string without the "_bamlss" extension of the bamlss family name (see the manual ?bamlss.family for a list of available families and the corresponding vignette BAMLSS Families), or as the family function itself. In addition, all families of the gamlss (???) and gamlss.dist (Stasinopoulos and Rigby 2019) packages are supported.
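The two ways of specifying the family can be compared side by side. This is a sketch under assumptions: it uses GAMart(), a data simulator from bamlss, as example data, and the formula num ~ s(x2) is chosen only for illustration.

```r
library(bamlss)

set.seed(111)
d <- GAMart()  # assumption: simulated example data from bamlss

# Family given as a character string, without the "_bamlss" extension.
bf_chr <- bamlss.frame(num ~ s(x2), data = d, family = "gaussian")

# Family given as the family function itself.
bf_fun <- bamlss.frame(num ~ s(x2), data = d, family = gaussian_bamlss)

# Both calls should set up the same distribution parameters.
names(bf_chr$x)
names(bf_fun$x)
```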
The returned object, a named list of class "bamlss.frame", can be employed with all model fitting engines. The most important elements used for estimation are:
x: A named list whose elements correspond to the parameters specified within the family object. For each distribution parameter, the list contains all design and penalty matrices needed for modeling (see the upcoming example).
y: The response data.
family: The processed family object.
To better understand the structure of a "bamlss.frame" object, a print method is provided. For illustration, we simulate data and set up a "bamlss.frame" object for a Gaussian distributional regression model including smooth terms. First, a model formula is needed.
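A possible setup for the objects d and f used in the next step is sketched here. The choice of GAMart() as the data simulator and the exact formula are assumptions. They were picked so that the response is num, the parameters are mu and sigma, and the smooth terms are s(x2), s(x3) and te(lon,lat), which matches the structure printed further down.

```r
library(bamlss)

# Assumption: simulated example data from GAMart() in bamlss.
set.seed(111)
d <- GAMart()

# One formula per distribution parameter; the first one carries the
# response and belongs to mu, the second one is named after sigma.
f <- list(
  num   ~ s(x2) + s(x3) + te(lon, lat),
  sigma ~ s(x2) + s(x3) + te(lon, lat)
)
```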
Afterwards the model frame can be computed with
bf <- bamlss.frame(f, data = d, family = "gaussian")
The print method for "bamlss.frame" objects gives a compact overview of this structure.
print(bf)
## 'bamlss.frame' structure:
## ..$ call
## ..$ model.frame
## ..$ formula
## ..$ family
## ..$ terms
## ..$ x
## .. ..$ mu
## .. .. ..$ formula
## .. .. ..$ fake.formula
## .. .. ..$ terms
## .. .. ..$ model.matrix
## .. .. ..$ smooth.construct
## .. ..$ sigma
## .. .. ..$ formula
## .. .. ..$ fake.formula
## .. .. ..$ terms
## .. .. ..$ model.matrix
## .. .. ..$ smooth.construct
## ..$ y
## .. ..$ num
## ..$ delete
When writing a new estimation engine, the user can work directly with the model.matrix elements for linear effects and with the smooth.construct list for smooth effects. The smooth.construct element is a named list that is compiled with the smoothCon() function of the mgcv package, which uses the generic smooth.construct() method to set up smooth terms.
## [1] "s(x2)" "s(x3)" "te(lon,lat)"
In this example, the list contains three smooth term objects for each of the parameters mu and sigma.
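The elements an estimation engine would start from can be pulled out as follows. This is a sketch under assumptions: the data come from GAMart() and the formula is chosen to match the printed structure above. It also assumes that the smooth term objects keep the X (design matrix) and S (list of penalty matrices) elements of mgcv smooth objects.

```r
library(bamlss)

set.seed(111)
d <- GAMart()  # assumption: simulated example data from bamlss
f <- list(
  num   ~ s(x2) + s(x3) + te(lon, lat),
  sigma ~ s(x2) + s(x3) + te(lon, lat)
)
bf <- bamlss.frame(f, data = d, family = "gaussian")

# Names of the smooth terms set up for parameter mu.
names(bf$x$mu$smooth.construct)

# Design matrix of the linear part of mu.
X_lin <- bf$x$mu$model.matrix

# Design and penalty matrix of one smooth term of mu.
sx2 <- bf$x$mu$smooth.construct[["s(x2)"]]
dim(sx2$X)
dim(sx2$S[[1]])
```

A tensor product term such as te(lon,lat) typically carries more than one penalty matrix in S, so code that loops over smooth terms should not assume a single penalty.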
See also the vignette Estimation Engines, which presents more details on how to work with bamlss.frame().
Stasinopoulos, Mikis, and Robert Rigby. 2019. gamlss.dist: Distributions for Generalized Additive Models for Location Scale and Shape. https://CRAN.R-project.org/package=gamlss.dist.
Umlauf, Nikolaus, Nadja Klein, Achim Zeileis, and Thorsten Simon. 2024. bamlss: Bayesian Additive Models for Location Scale and Shape (and Beyond). https://CRAN.R-project.org/package=bamlss.
Zeileis, Achim, and Yves Croissant. 2010. “Extended Model Formulas in R: Multiple Parts and Multiple Responses.” Journal of Statistical Software 34 (1): 1–13. https://doi.org/10.18637/jss.v034.i01.