# PumasModel-Error models

The error models in Pumas are formulated as the conditional distribution of the observed variable, $Y$, conditionally on the random effects, $η$. In the following, $μ = \mathrm{E}(Y|η)$ is the individual prediction. The individual prediction $μ$ generally depends on the random effect $η$ but the dependency will be suppressed in the notation for simplicity.

A broad range of distributions are supported and a subset of the more common ones is presented below. A small code example is presented for each of the error models presented as well as the conditional mean (individual prediction) and variance of each of the models. A summary of all error models is presented as a table towards the end of this page.

In Pumas we don't differentiate between FOCE (First Order Conditional Estimation) and FOCEI (FOCE with interaction). We detect if it is necessary to include the interaction terms as part of our automatic differentiation instead.

## Gaussian models

The Gaussian models are the traditional workhorse models for non-linear mixed effects models. In Pumas, the Gaussian models are defined by using the Normal structure which is parameterized with the mean and the standard deviation.

@derived begin
y ~ @. Normal(μ, σ)
end
$$$\mathrm{E}(Y|η) = μ \quad \mathrm{Var}(Y|η) = σ^2$$$

The dispersion parameter of this model is $σ^2$.

### Proportional

@derived begin
y ~ @. Normal(μ, μ*σ)
end
$$$\mathrm{E}(Y|η) = μ \quad \mathrm{Var}(Y|η) = (μσ)^2$$$

The dispersion parameter of this model is $(μσ)^2$, i.e. a function of $η$ via $μ$.

In some cases when using numerical ODE integration, it can be difficult to ensure that $μ$ remains positive. In such cases, it usually helps to writeNormal(μ, abs(μ)*σ) or alternatively Normal(μ, μ*σ + 1e-10).

@derived begin
y ~ @. Normal(μ, sqrt(σ_add^2 + (μ*σ_prop)^2))
end
$$$\mathrm{E}(Y|η) = μ \quad \mathrm{Var}(Y|η) = σ_{add}^2 + (μσ_{prop})^2$$$

The dispersion parameter of this model is $σ_{add}^2 + (μσ_{prop})^2$, i.e. a function of $η$ via $μ$.

### Log-normal (exponential)

@derived begin
y ~ @. LogNormal(log(μ), σ)
end
$$$\mathrm{E}(Y|η) = \mathrm{E}\left(\exp\left(\log(μ) + \frac{σ^2}{2}\right)|η\right) \approx μ \quad \mathrm{Var}(Y|η) \approx (μσ)^2$$$

for small values of $σ$ so this model is very similar to the Proportional error model.

The dispersion parameter of this model is $σ^2$.

The log-normal model is equivalent to the following model

@derived begin
logy ~ @. Normal(log(μ), σ)
end

where logy = log.(y). Notice that this is different from some other software packages where the log-normal model is equivalent to a proportional error model when using the first-order approximation methods.

## Other continuous models

### Student's t

For data with some extreme observations, it is sometimes useful to use an error model with heavier tails such that Student's t distribution. The LocationScale structure is used to allow for location shift and scaling of one-parameter TDist(ν) distribution type.

@derived
y ~ @. LocationScale(μ, σ, TDist(ν))
end
$$$\mathrm{E}(Y|η) = μ \quad \mathrm{Var}(Y|η) = σ^2 \quad \textrm{if } ν>2$$$

The Student's t distributed model does not allow for FOCE estimation but requires Pumas.LaplaceI().

### Gamma

The Gamma model is traditionally used in the literature on generalized linear models, see P. McCullagh, J.A. Nelder (1989), for data where the standard deviation is proportional to the mean which is the same property that the Gaussian proportional error model has.

In contrast to the Gaussian proportional error model, the dispersion parameter of the Gamma model is $ν^{-1}$ which does not depend on the mean parameter.

@derived begin
y ~ @. Gamma(ν, μ/ν)
end
$$$\mathrm{E}(Y|η) = μ \quad \mathrm{Var}(Y|η) = \frac{μ}{ν}$$$

which shows how the Gamma model is very similar to the Gaussian proportional error model. They share the first two moments when setting $σ^2=ν^{-1}$.

## Models for discrete data

### Bernoulli (logistic regression)

@derived begin
dv ~ @. Bernoulli(μ)
end
$$$\mathrm{E}(Y|η) = μ \quad \mathrm{Var}(Y|η) = μ(1 - μ)$$$

The dispersion parameter of this model is $1$.

### Poisson

@derived begin
y ~ @. Poisson(μ)
end
$$$\mathrm{E}(Y|η) = μ \quad \mathrm{Var}(Y|η) = μ$$$

The dispersion parameter of this model is $1$ which allows for estimation with the Pumas.FOCE() approximation.

### Negative binomial

Notice that this distribution has many parametrizations in the literature.

@derived begin
p := n/(μ + n)
y ~ @. NegativeBinomial(n, p)
end

where $p$ is the probability of success and $n$ is the size (or dispersion) parameter.

$$$\mathrm{E}(Y|η) = μ \quad \mathrm{Var}(Y|η) = \frac{μ(μ + n)}{n}$$$

### Categorical

Categorical error models can be used for modeling unordered (nominal) and ordered (ordinal) data. The categorical error model is slightly different from other discete error model in that the magnitude of the observed variable is irrelevant. For conditional mean $\mathrm{E}(Y|η)$ is not defined for the categorical error model and instead we consider the probability $P(Y=y|η)$ of each of the outcomes $y \in 1,...,k$. This can be formulated in Pumas as

@derived begin
y ~ @. Categorical(p₁, p₂, p₃)
end

where the pᵢ parameters should sum to one. This model can be estimated with the Pumas.FOCE() estimation methods.

## Models for incomplete data

The two most common models for incomplete data with a known truncations point are the truncated and censored Gaussian models, see A Hald (1949). These kinds of models were introduced in pharmacometrics in Stuart L Beal (2001) for handling data below the limit of quantification. The paper used the name M2 for the truncated model and M3 for the censored model.

A common assumption for the two models below is that the core interest is in the latent variable and not in the truncated or censored version of the variable. In consequence, predictions and weighted residuals are computed for the latent variable, not the truncated/censored variable. The residuals associated with censored observations are computed but might not be easily interpreted.

Currently, it is not possible to fit the models for incomplete data using the Pumas.FOCE approximation.

### Truncated (M2)

In the truncated model, all observations outside the upper and lower truncation points $u$ and $l$ are omitted and the probability mass between the truncation points is adjusted accordingly. It is assumed that the data can be modeled with some latent variable $Y^*$ but that only the variable $Y$ is observed where

$$$Y = \begin{cases} Y^* \quad &\textrm{if } l \leqslant Y^* \leqslant u \\ missing \quad &\textrm{if } Y^* < l \vee Y^* > u \end{cases}$$$

When fitting data to this model all incomplete data records should be either removed or coded as missing

@derived begin
y ~ @. truncated(Normal(μ, σ), l, u)
end
$$$\mathrm{E}(Y^*|η) = μ \quad \mathrm{Var}(Y^*|η) = σ^2$$$

Notice, the mean and variance are computed for the latent variable $Y^*$.

The Pumas.FOCE() estimation method is not applicable for this error model. For random effects models based on a truncated error model, the Pumas.LaplaceI() estimation method should be used.

### Censored (M3)

In the censored model, the number of censored observations is assumed to be known. Hence, instead of simply removed probability mass associated with observations outside the truncation points $u$ and $l$, the censored model introduces positive probability mass at the truncation points. It is assumed that the data can be modeled with latent variable $Y^*$ but that only the variable $Y$ is observed where

$$$Y = \begin{cases} u\quad &\textrm{if } Y^* \geqslant u \\ Y^* \quad &\textrm{if } l < Y^* < u \\ l \quad &\textrm{if } Y^* \leqslant l \end{cases}$$$

Hence, incomplete data records should be encoded with a value below (above) or equal to the lower (upper) truncation point.

@derived begin
y ~ @. Censored(Normal(μ, σ), lower, upper)
end
$$$\mathrm{E}(Y^*|η) = μ \quad \mathrm{Var}(Y^*|η) = σ^2$$$

Notice, the mean and variance are computed for the latent variable $Y^*$.

The Pumas.FOCE() estimation method is not applicable for this error model. For random effects models based on a Censored error model, the Pumas.LaplaceI() estimation method should be used.

## Summary of Error Models

DataModelDerived Block<br>ExpressionDispersionApproximation Method
Continuous
Additivey ~ @. Normal(μ, σ)σ^2Pumas.FOCE()
Proportionaly ~ @. Normal(μ, μ*σ)(μσ)^2Pumas.FOCE()
Additive &<br>Proportionaly ~ @. Normal(μ, sqrt(σ_add^2 + (μ*σ_prop)^2))σ_add^2 + (μ*σ_prop)^2Pumas.FOCE()
Log Normaly ~ @. LogNormal(log(μ), σ)σ^2Pumas.FOCE()
Log Normallogy ~ @. Normal(log(μ), σ)σ^2Pumas.FOCE()
Student's Ty ~ @. LocationScale(μ, σ, TDist(ν))Pumas.LaplaceI()
Gammay ~ @. Gamma(ν, μ/ν)ν^-1Pumas.FOCE()
Discrete
Bernoullidv ~ @. Bernoulli(μ)1Pumas.FOCE()
Poissony ~ @. Poisson(μ)1Pumas.FOCE()
Negative Binomialp := n/(μ + n)<br> y ~ @. NegativeBinomial(n, p)nPumas.FOCE()
Categoricaly ~ @. Categorical(p₁, p₂, p₃)Pumas.FOCE() or <br>Pumas.FOCE()
Incomplete Data
Truncated (M2)y ~ @. truncated(Normal(μ, σ), l, u)Pumas.LaplaceI()
Censored (M3)y ~ @. Censored(Normal(μ, σ), lower, upper)Pumas.LaplaceI()