## Overview

Family objects are important building blocks in the design of BAMLSS models. They specify the distribution by collecting functions of the density, respective log-likelihood, first-order derivatives of the log-likelihood w.r.t. predictors (the score function), and (optionally) second-order derivatives of the log-likelihood w.r.t. predictors or their expectation (the Hessian).

The bamlss package can be easily extended by constructing families for specific tasks, i.e. problems for which a likelihood can be formulated. However, the most important distributions are shipped with bamlss. Finally, it is also possible to employ gamlss families.

## How to build a bamlss family

We illustrate how to build a bamlss family by hand along the Gaussian distribution, with density $f(y\,|\,\mu,\sigma) = \frac{1}{\sqrt{2\pi}\sigma} \cdot \exp \left( \frac{-(y-\mu)^2}{2\sigma^2} \right),$ and log-likelihood function $\ell(\mu,\sigma\,|\,y) = - \frac{1}{2} \log(2\pi) - \log(\sigma) - \frac{(y-\mu)^2}{2\sigma^2},$ for an individual observation. The sum of the log-likelihood function over all observations is the target function of the optimization problem.

In the distributional regression framework the parameters are linked to predictors by link functions, $\mu = \eta_\mu, \qquad \log(\sigma) = \eta_\sigma.$ For the Gaussian $$\mu$$ and $$\sigma$$ are linked to $$\eta_\mu$$ and $$\eta_\sigma$$ by the identity function and the logarithm, respectively.

The score functions in bamlss are the first derivatives of the log-likelihood w.r.t. the predictors: $s_\mu = \frac{\partial\ell}{\partial\eta_\mu} = \frac{\partial\ell}{\partial\mu} \cdot \frac{\partial\mu}{\partial\eta_\mu} = \frac{y-\mu}{\sigma^2},$ and $s_\sigma = \frac{\partial\ell}{\partial\eta_\sigma} = \frac{\partial\ell}{\partial\sigma} \cdot \frac{\partial\sigma}{\partial\eta_\sigma} = -1 + \frac{(y-\mu)^2}{\sigma^2}.$

For the second derivative of the log-likelihood we are able to obtain the negative expectation, $\mathsf{E}(-\partial^2\ell / \partial\eta_{\mu}^{2} ) = \sigma^{-2},$ and $\mathsf{E}(-\partial^2\ell / \partial\eta_{\sigma}^{2} ) = 2.$

Now we have to write a function that returns a family.bamlss object (S3) which encapsulates functions for density, score and Hessian, and the names of the family, parameter and link functions:

Name of element Value
family character string with the name of the family
names vector of character strings with the names of the parameters
links vector of character strings with the names of the link functions
d a function returning the density with arguments y, par, log = FALSE (see below)
score a list with functions (one for each parameter) returning the first derivatives of the log-likelihood w.r.t. predictors
hess a list with functions (one for each parameter) returning the negative second derivatives of the log-likelihood w.r.t. predictors

Nearly all functions take as first argument the response y and as second argument a named list holding the evaluated parameters par of the distribution on the scale of the predictors $$\eta_{\star}$$.

The implementation looks like this:

mygauss <- function(...) {
f <- list(
"family" = "mygauss",
"names"  = c("mu", "sigma"),
"links"  = c(mu = "identity", sigma = "log"),
"d" = function(y, par, log = FALSE) {
dnorm(y, mean = par$mu, sd = par$sigma, log = log)
},
"score" = list(
mu = function(y, par, ...) {
drop((y - par$mu) / (par$sigma^2))
},
sigma = function(y, par, ...) {
drop(-1 + (y - par$mu)^2 / (par$sigma^2))
}
),
"hess" = list(
mu = function(y, par, ...) {
drop(1 / (par\$sigma^2))
},
sigma = function(y, par, ...) {
rep(2, length(y))
}
)
)
class(f) <- "family.bamlss"
return(f)
}
library("bamlss")

## Simulate some data.
d <- GAMart()

## Specify formula.
f <- ~ s(x1) + s(x2) + s(x3) + s(lon, lat)
f <- list(mu = update(f, num ~ .), sigma = f)

## Model.
b <- bamlss(f, data = d, family = mygauss)

Optionally, the family.bamlss object can be extended by functions for

• the cumulative distribution function p(y, par, ...),
• the quantile function (the inverse cdf) q(p, par),
• a random number generator r(n, par),
• the log-likelihood loglik(y, par),
• the expectation mu(par, ...),
• initial values for optimization, which has to be a list containing a function for each parameter,

which can help to speed up optimization, or be convenient for predictions and simulations.

## Available families

There are some families implemented:

### Continuous response

#### Gaussian distribution

Or normal distribution.

Density: $f(y\,|\,\mu,\sigma) = \frac{1}{\sqrt{2\pi}\sigma} \cdot \exp \left( \frac{-(y-\mu)^2}{2\sigma^2} \right),$ with $$\sigma > 0$$.

• $$\mu = \eta_\mu$$
• $$\log(\sigma) = \eta_\sigma$$

Function: gaussian_bamlss(...)

Alternative parameterizations:

• gaussian2_bamlss(...) models inverse sigma with a log link.
• Gaussian_bamlss(...) one parametric family modelling only mu.

#### Generalized logistic distribution

Or skewed-logistic distribution.

Density: $f(y\,|\,\mu,\sigma,\alpha)=\frac{\alpha\cdot\exp\left(-\frac{y-\mu}{\sigma}\right)} {\sigma\cdot\left(1+\exp\left(-\frac{y-\mu}{\sigma}\right)\right)^{2\alpha}},$ with $$\sigma, \alpha > 0$$.

• $$\mu = \eta_\mu$$
• $$\log(\sigma) = \eta_\sigma$$
• $$\log(\alpha) = \eta_\alpha$$

Function: glogis_bamlss(...)

### Positive continuous response

#### Gamma distribution

Two-parameter distribution with $$\mu > 0$$ for the expectation of the distribution and $$\sigma > 0$$ for its shape.

Density: $f(y\,|\,\mu,\sigma)=\frac{y^{\sigma-1}\cdot\exp\left(-\frac{\sigma\cdot y}{\mu}\right)} {\left(\frac{\mu}{\sigma}\right)^\sigma\cdot\Gamma(\sigma)}.$

• $$\log(\mu) = \eta_\mu$$
• $$\log(\sigma) = \eta_\sigma$$

Function: gamma_bamlss(...)

#### Generalized Pareto distribution

Density: $f(y\,|\,\xi,\sigma)=\frac{1}{\sigma}\cdot\left(1+\frac{\xi\cdot y}{\sigma} \right)^{-\left(1+1/\xi\right)},$ with $$\xi, \sigma > 0$$.

• $$\log(\xi) = \eta_\xi$$
• $$\log(\sigma) = \eta_\sigma$$

Function: gpareto_bamlss(...)

#### Weibull distribution

Density: $f(y\,|\,\lambda,\alpha)=\frac{\alpha}{\lambda}\cdot\left(\frac{y}{\lambda}\right)^{\alpha-1} \cdot\exp\left(-\left(\frac{y}{\lambda}\right)^\alpha\right),$ with $$\lambda, \alpha > 0$$.

• $$\log(\lambda) = \eta_\lambda$$
• $$\log(\alpha) = \eta_\alpha$$

Function: weibull_bamlss(...)

#### Lognormal distribution

Density: $f(y\,|\,\mu,\sigma) = \frac{1}{\sqrt{2\pi}\sigma} \cdot \exp \left( \frac{-(\log(y)-\mu)^2}{2\sigma^2} \right),$ with $$\sigma > 0$$.

• $$\mu = \eta_\mu$$
• $$\log(\sigma) = \eta_\sigma$$

Function: lognormal_bamlss(...)

### Censored continuous responses

#### Zero-censored normal distribution

Density: $f(y\,|\,\mu,\sigma)= \begin{cases} \frac{1}{\sigma} \phi\left(\frac{y-\mu}{\sigma} \right) & y > 0 \\ \Phi\left(\frac{-\mu}{\sigma} \right) & \mbox{else,} \end{cases}$ where $$\phi$$ is the probability density and $$\Phi$$ the cumulative distribution function of the standard normal distribution.

• $$\mu = \eta_\mu$$
• $$\log(\sigma) = \eta_\sigma$$

Function: cnorm_bamlss(...)

### Interval responses

#### Beta distribution

The beta distribution, with density $f_{Beta}(y\,|\,\alpha,\beta)=\frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\cdot\Gamma(\beta)} \cdot y^{\alpha-1} \cdot (1-y)^{\beta-1}$ is implemented using the Balding-Nichols parameterization (Balding and Nichols 1995).

Density: $f(y\,|\,\mu,\sigma)=f_{Beta}\left(y\,\middle|\,\alpha = \mu \cdot \frac{1-\sigma}{\sigma}, \beta = (1-\mu) \cdot \frac{1-\sigma}{\sigma} \right),$ with $$0 < \mu < 1$$ and $$0 < \sigma < 1$$.

• $$\mathrm{logit}(\mu)=\eta_\mu$$
• $$\mathrm{logit}(\sigma)=\eta_\sigma$$

Function: beta_bamlss(...)

### Discrete responses

#### Bernoulli distribution

Often called binomial, though the Bernoulli is a special case of the binomial distribution.

Density: $f(y\,|\,\pi) = \pi^y \cdot (1-\pi)^{1-y},$ where $$\pi$$ is a probability.

• $$\mathrm{logit}(\pi)=\eta_\pi$$

Function: binomial_bamlss(link = "logit", ...)

#### Multinomial model

Say, $$y$$ is a categorial response with $$c+1$$ nomial categories, and $$y_{1}^{\star}, \dots, y_{c+1}^{\star}$$ are associated dummy variables. $$p_1, \dots, p_{c+1}$$ are probabilities for an observation falling in one of the categories. Category $$c+1$$ serves as reference, i.e. $$p_{c+1} = 1 - p_1 - \dots - p_{c}$$.

Density: $f(y\,|\,\pi_1,\dots,\pi_c) = \prod_{s=1}^{c+1} p_{s}^{y_{s}^{\star}}, \quad \text{where} \quad p_r = \frac{\pi_r}{1 + \pi_1 + \dots + \pi_c},$

• $$\log(\pi_r) = \log\left(\frac{p_r}{p_{c+1}}\right) = \eta_r,$$

where $$\eta_r$$ specifies the log-odds between category $$r$$ and the reference category $$c+1$$.

Function: multinomial_bamlss(...)

### Count data responses

#### Poisson distribution

Density: $f(y\,|\,\lambda) = \frac{\lambda^y \cdot \exp(-\lambda)}{y!},$ with $$\lambda > 0$$.

• $$\log(\lambda) = \eta_\lambda$$

Function: poisson_bamlss(...)

#### Negative binomial (type II)

Density: $f(y\, |\, \mu, \theta) = \frac{\Gamma(\theta + y)}{\Gamma({\theta}) \cdot y!} \cdot \frac{\mu^y \cdot \theta^\theta}{(\mu + \theta)^{\theta + y}},$ with $$\mu,\theta > 0$$.

• $$\log(\mu) = \eta_\mu$$
• $$\log(\theta) = \eta_\theta$$

Function: nbinom_bamlss(...)

#### Zero-truncated negative binomial

Density: $f(y\,|\,\mu,\theta)=\frac{f_{\mathrm{NB}}(y\,|\,\mu,\theta)} {1-f_{\mathrm{NB}}(0\,|\,\mu,\theta)},$ where $$f_{\mathrm{NB}}$$ is the density of the negative binomial.

• $$\log(\mu) = \eta_\mu$$
• $$\log(\theta) = \eta_\theta$$

Function: ztnbinom_bamlss(...)

### Multivariate responses

#### Multivariate normal distribution

Say $$y$$ is a $$k$$ dimensional observation.

Density: $f(y\,|\,\mu,\Sigma) = \frac{1}{\sqrt{(2\pi)^{k} |\Sigma|}} \cdot \exp \left(-\frac{1}{2} (y-\mu)^\top \Sigma^{-1} (y-\mu) \right),$ where $$\Sigma$$ can be decomposed into $$\Sigma = D \Omega D$$ with $D= \begin{pmatrix} \sigma_1 & 0 & \cdots & 0 \\ 0 & \sigma_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & \sigma_k \end{pmatrix}, \qquad \text{and} \qquad \Omega= \begin{pmatrix} 1 & \rho_{12} & \cdots & \rho_{1k} \\ \rho_{12} & 1 & \cdots & \rho_{2k} \\ \vdots & \vdots & \ddots & \vdots \\ \rho_{1k} & \rho_{2k} & \cdots & 1 \end{pmatrix}.$

• $$\mu_j = \eta_{\mu_j}$$
• $$\log(\sigma_j) = \eta_{\sigma_j}$$
• $$\mathrm{rhogit}(\rho_{jl}) = \eta_{\rho_{jl}}$$

Function: mvnorm_bamlss(k = 2, ...)

Alternative parameterizations:

• mvnormAR1_bamlss(k = 2, ...) with $\Omega= \begin{pmatrix} 1 & \rho & \rho^2 & \cdots & \rho^{k-1} \\ \rho & 1 & \rho & \cdots & \rho^{k-2} \\ \rho^2 & \rho & 1 & \cdots & \rho^{k-3} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ \rho^{k-1} & \rho^{k-2} & \rho^{k-3} & \cdots & 1 \end{pmatrix},$ and $$\mathrm{rhogit}(\rho) = \eta_{\rho}$$.

### Survival

• cox_bamlss(...)

### Others

• AR1_bamlss(ar.start = NULL, ...)
• beta1_bamlss(ar.start, ...)

## R package gamlss support

The bamlss package supports gamlss families via a transfer function that builds the bamlss.family object from a GAMLSS distribution funciton (Stasinopoulos and Rigby 2007), e.g. NO(), internally:

## Load the gamlss families.
library("gamlss.dist")

## Estimate model.
b <- bamlss(f, data = d, family = NO)

## References

Balding, David J., and Richard A. Nichols. 1995. “A Method for Quantifying Differentiation Between Populations at Multi-Allelic Loci and Its Implications for Investigating Identity and Paternity.” Genetica 96 (1): 3–12. doi:10.1007/BF01441146.

Stasinopoulos, D. Mikis, and Robert A. Rigby. 2007. “Generalized Additive Models for Location Scale and Shape (Gamlss) in R.” Journal of Statistical Software 23 (7): 1–46. doi:10.18637/jss.v023.i07.

Umlauf, Nikolaus, Nadja Klein, Achim Zeileis, and Thorsten Simon. 2019. bamlss: Bayesian Additive Models for Location Scale and Shape (and Beyond). https://CRAN.R-project.org/package=bamlss.