# Estimating Parameters of Pumas Models

Pumas can use the observational data of a `Subject`

or `Population`

to estimate the parameters in many types of models. This is done by two classes of methods. First, maximum likelihood methods find the parameters such that the observational data has the highest probability of occurring according to the chosen error distributions. Second, Bayesian methods find a posterior probability distribution for the parameters to describe the chance that a parameter has a given value given the data. The following section describes how to fit an NLME model in Pumas via the two methods.

Maximum likelihood methods find the parameters such that the observational data has the highest probability of occurring according to the chosen error distributions.

Bayesian methods find a posterior probability distribution for the parameters to describe the chance that a parameter has a given value given the data.

Maximum A Posteriori (MAP) can be considered as something in between maximum likelihood estimation and Bayesian inference. It incorporates priors of some or all variables and can be interpreted as penalized maximum likelihood or finding the mode of the posterior distribution in a Bayesian context.

The following sections describe how to fit a NLME model in Pumas via the two methods.

## Defining Data for Estimation

The observed data should be parsed using the same names as those found in the model. For example, if `subject.observations`

is a `NamedTuple`

with names `dv`

and `resp`

, the `@derived`

block or function in the `PumasModel`

should define distributions with matching names. Similarly, the `error`

block should define distributions with matching names if using a `PumasEMModel`

instead. If `dv`

is a scalar in the observation data, then `dv`

from `@derived`

should also be a scalar. Likewise, if `dv`

is an array like a time series, then `dv`

should be a size-matching time series when returned from `@derived`

. The likelihood of observing multiple dependent variables is calculated under the assumption of independence between the two.

## Maximum Likelihood Estimation

Maximum Likelihood Estimation (MLE) is performed using the `fit`

function:

`StatsAPI.fit`

— Method```
fit(
model::PumasModel,
population::Population,
param::NamedTuple,
approx::Union{LikelihoodApproximation, MAP};
optim_alg = Optim.BFGS(linesearch = Optim.LineSearches.BackTracking(), initial_stepnorm = 1.0),
optim_options::NamedTuple = NamedTuple(),
optimize_fn = nothing,
constantcoef::Tuple = (),
ensemblealg::SciMLBase.EnsembleAlgorithm = EnsembleThreads(),
checkidentification = true,
diffeq_options = NamedTuple())
```

Fit the Pumas model `model`

to the dataset `population`

with starting values `param`

using the estimation method `approx`

. Currently supported values for the `approx`

argument are `FO`

, `FOCE`

, `LaplaceI`

, `NaivePooled`

, and `BayesMCMC`

. See the online documentation for more details about the different methods.

To control the optimization procedure for finding population parameters (fixed effects), use the `optim_alg`

and `optim_options`

keyword arguments. In previous versions of Pumas the argument `optimize_fn`

was used, but is now discouraged and will be removed in a later version of Pumas. These options control the optimization all `approx`

methods except `BayesMCMC`

. The default optimization function uses the quasi-Newton routine `BFGS`

method from the `Optim`

package. It can be changed by setting the `optim_alg`

to an algorithm implemented in Optim.jl as long as it does not use second order derivatives. Optimization specific options can be passed in using the `optim_options`

keyword and has to be a NamedTuple with names and values that match the `Optim.Options`

type. For example, the optimization trace can be disabled and the algorithm can be changed to L-BFGS by passing `optim_alg=Optim.LBFGS(), optim_options = ;(show_trace=false)`

to `fit`

. See `Optim`

for more defails.

It is possible to fix one or more parameters of the fit by passing a Tuple of Symbols as the `constantcoef`

argument with elements corresponding to the names of the fixed parameters, e.g. `constantcoef=(:σ,)`

.

When models include an `@random`

block and fitting with `NaivePooled`

is requested, it is required that the user sets all varibility parameters to zero with `constantcoef`

such that these can be ignored in the optimization, e.g. `constantcoef = (:Ω,)`

while overwriting the corresponding values in `param`

with `(; init_params..., Ω = zeros(3, 3))`

.

Parallelization of the optimization is supported for most estimation methods via the ensemble interface of DifferentialEquations.jl. Currently, the only supported options are:

`EnsembleThreads()`

: the default. Accelerate fits by using multiple threads.`EnsembleSerial()`

: fit using a single thread.`EnsembleDistributed()`

: fit by using multiple worker processes.

The `fit`

function will check if any gradients and throw an exception if any of the elements are exactly zero unless `checkidentification`

is set to `false`

.

Further keyword arguments can be passed via the `diffeq_options`

argument. This allows for passing arguments to the differential equations solver such as `alg`

, `abstol`

, and `reltol`

. The default values for these are `AutoVern7(Rodas5(autodiff=true))`

, `1e-12`

, and `1e-8`

respectively. See the DifferentialEquations.jl documentation for more details.

The return type of `fit`

is a `FittedPumasModel`

.

### Marginal Likelihood Approximations

The following choices are available for the likelihood approximations:

`FO()`

: first order approximation.`FOCE()`

: first order conditional estimation (with automatic detection for when to include the interaction terms).`LaplaceI()`

: second order Laplace approximation with interaction.

Only `LaplaceI()`

is supported when using a `PumasEMModel`

:

`StatsAPI.fit`

— Method```
fit(
model::PumasEMModel,
population::Population,
param::NamedTuple,
approx::Union{SAEM,LaplaceI};
ensemblealg = EnsembleThreads(),
rng = default_rng()
)
```

Fit the `PumasEMModel`

to the dataset `population`

using the initial values specified in `param`

. The supported methods for the `approx`

argument are currently `SAEM`

and `LaplaceI`

. See the online documentation for more details about these two methods.

When fitting with `SAEM`

the variance terms of the random effects (`Ω`

) and the dispersion parameters for the error model (`σ`

) are initialized to the identity matrix or `1`

as appropriate. They may also be specified. With SAEM, it is reccomended to choose an init larger than the true values to facilitate exploration of the parameter space and avoid getting trapped in local optima early. Currently `LaplaceI`

fits with a `PumasEMModel`

require `Ω`

to be diagonal.

Options for `ensemblealg`

are `EnsembleSerial()`

and `EnsembleThreads()`

(the default). The fit will be parallel if `ensemblealg == EnsembleThreads()`

. SAEM imposes the additional requirement for threaded fits that either the `rng`

used supports `Future.randjump`

or that `rng`

be a vector of rngs, one per thread. The `default_rng()`

supports `randjump()`

.

### FittedPumasModel

The relevant fields of a `FittedPumasModel`

are:

`model`

: the`model`

used in the estimation process.`data`

: the`Population`

that was estimated.`optim`

: the result returned by the optimizer.`approx`

: the marginal likelihood approximation that was used.`param`

: the optimal parameters.

## Bayesian Estimation

Bayesian parameter estimation is performed by using the `fit`

function as follows:

`StatsAPI.fit`

— MethodDistributions.fit( model::PumasModel, data::Population, param::NamedTuple, alg::BayesMCMC, )

Samples from the posterior distribution of the `model`

's population and subject-specific parameters given the observations in `data`

using the Hamiltonian Monte Carlo (HMC) based No-U-Turn Sampler (NUTS) algorithm. The MCMC sampler is initialized starting from `param`

for the population parameters. The subject-specific parameters are also apprioriately initialized. The Bayesian inference algorithm's options are passed in the `alg`

struct. Run `?BayesMCMC`

for more details on the options that can be set.

We use a `NUTS`

sampler with the generalized no-U-turn termination criterion and multinomial sampling on an Hamiltonian system whose kinetic energy is specified with a diagonal metric (diagonal matrix with positive diagonal entries). For numerical integration of the Hamiltonian system, we use the ordinary leapfrog integrator. The adaptation is done using the Stan’s windowed adaptation routine with a default target acceptance ratio of `0.8`

.

The result is a `BayesMCMCResults`

type.

`BayesMCMCResults`

The MCMC chain is stored in the `chain`

field of the returned `BayesMCMCResults`

. Additionally, the result can be converted into a `Chains`

object from `MCMCChains`

, allowing utilization of diagnostics and visualization tooling. This is discussed further in the Bayesian Workflow tutorial.

## Maximum A Posteriori (MAP) Estimation

MAP estimation is carried out using the `fit`

function and by wrapping the chosen likelihood approximation in the `MAP`

type. The `fit`

signature is the same as for maximum likelihood. If we have a model (`model`

), a `Population`

(`data`

), and an initial parameter specification (`param`

) we can carry out the MAP estimation with the FOCE approximation of the log-likelihood using

`result = fit(model, data, param, MAP(FOCE()))`

The usual post-processing and diagnostics are available and are calculated according to the approximation wrapped in `MAP`

.