PumasEMModel-Domains

Similar to a $PumasModel$, a $PumasEMModel$ also requires you to specify domains. However, unlike a $PumasModel$ where this is done explicitly through use of types such as $VectorDomain$ or $PSDDomain$, this is done implicitly in the $PumasEMModel$ by specifying a probability distribution. Currently, only the $Normal$, $LogNormal$, and $LogitNormal$ distributions are supported, correspoding to domains of $(-∞,∞)$, $(0,∞)$, and $(0,1)$, respectively.

Declaring parameters: $@param$ and $@random$ blocks

In the $PumasEMModel$, both the $@param$ and $@random$ block allow you to declare variables with with a distribution-defined domain using the same syntax. The distinction is that $@param$ declares fixed effects, while $@random$ implicitly adds random effects. They are declared using a formula syntax:

@param begin
    Vc     ~ 1 | LogNormal
end
@random begin
    θbioav ~ 1 | LogitNormal
    CL     ~ 1 | LogNormal
end

This declares one fixed effect, $Vc ∈ (0,∞)$ and two subject-level random effects, $θbioav ∈ (0,1)$ and $CL ∈ (0,∞)$.

Note

Unlike in the $@param$ block of a $PumasModel$, the distribution in an $@param$ block of a $PumasEMModel$ is only used for inferring the domain; it does not place a prior on the associated parameter. The use of a distribution here is merely to provide consistency in specification between the $@param$ and the $@random$ blocks.

Then in $simobs(model, data, param)$ or $fit(model, data, param, approx)$, the parameters $param$ must be a $NamedTuple$ of values where the type fits in the domain. For example, the following is a valid definition of $param$ for the previous example, assuming our error model contained a single $dv$ with a $CombinedNormal$ error model:

param = (
    Vc = 2.0,
    θbioav = 0.75,
    CL = 0.1
    Ω = (1.0, 1.0), # diagonal elements of covariance matrix.
    σ = ((1.0,1.0),) # (σₐ, σₚ) of `CombinedNormal`
    )

Note that unlike in a $PumasModel$, we do not define the $Ω$ (variance of random effects) or $σ$ (dispersion in the error model) parameters in the $@param$ or $@random$ blocks, as these are defined implicitly by the number of formulas in the $@random$ and the error models in the $@error$ block. $simobs$ requires we provide values for $Ω$ and $σ$, although $fit$ will initialize $Ω$ to the identity matrix and each $σ$ to $1.0$ if absent.

Tip

If fitting with SAEM, it is recommended to use larger values for $Ω$ and $σ$ than what you estimate them to be. Doing so aids exploration of the parameter space, and enables the algorithm to escape local optima.

The structure of $Ω$ may be specified using an $@covariance$ block, which allows you to specify the size of dense blocks along the diagonal. For example, if we have 6 random effects in the model,

@covariance (2,1,3)

indicates a $6×6$ block-diagonal covariance matrix with $2×2$, $1×1$, and $3×3$ diagonal blocks. We could initialize this via:

using StaticArrays
param = (
    ...
    Ω = ( @SMatrix([1.0 0.05;
                    0.05 1.0]),
		  1.3,
	  @SMatrix([2.0 0.7 0.3;
	            0.7 1.5 0.2;
		    0.3 0.2 1.8]) )
    )

$Ω$s are specified as a $Tuple$ with one element per diagonal block. If a block is $1×1$, then it is given by a scalar. Otherwise, by an $SMatrix$. If absent, $Ω$ defaults to being diagonal.

Optionally, $ω$ may be specified instead, where each element of the tuple corresponds to the lower triangular Cholesky factor of the corresponding block of $Ω$. For example, the param field using $ω$ corresponding to $Ω$ example above would be

using StaticArrays
param = (
    ...
    ω = ( @SMatrix([1.0                                0.0;
                    0.05                0.998749217771909]),
	   	    1.140175425099138,
          @SMatrix([1.4142135623730951                 0.0                 0.0;
	            0.4949747468305832  1.1202678251204041                 0.0;
		    0.21213203435596423 0.08480115010871586 1.3220471871080242]) )
    )

Note

Currently fitting $PumasEMModel$s with $LaplaceI()$ requires structurally diagonal covariance matrices.

Formulas with covariates

It is possible to incorporate constant-time covariates in the $@param$ and $@random$ formulas. For example, given a $logwt$ covariate, we can specify

@random begin
    CL ~ 1 + logwt | LogNormal
    Vc ~ 1         | LogNormal
end

Under the hood, this defines

CL = exp(μ_cl + logwt*β_cl + η₁)
Vc = exp(μ_vc + η₂)

where $μ_cl$, $β_cl$, and $μ_vc$ are fixed effects, and $η₁$ & $η₂$ are the random effects corresponding to these formulas. That is, each term appearing in a formula is associated with a fixed effect. For example, $1$ and $logwt$ for $CL$'s formula are each multiplied by a corresponding fixed effect coefficient. Formulas in the $@random$ block also add a random effects term ($η$). Each of these parameters and the fixed effect are defined on the unconstrained scale, that is they're defined on $(-∞,∞)$, and then constrained to the appropriate domain. The only valid operator is $+$, while valid terms are either $1$ or the names of constant time covariates.

When specifying initial values, the transform/inverse transforms are applied to parameters corresponding to $1$ terms. Example valid param fields for this @random block are:

param = (
    ...
    CL = (4.0, 0.75), # 1 + logwt
    Vc =  70.0,       # 1
    ...
    )

Thus, given $logwt$ values of $0$ and $1$ respectively, the corresponding $CL$ values are $exp(log(4)) = 4$ and $exp(log(4)+0.75) ≈ 8.468$.