Estimating Parameters of Pumas Models
Pumas can use the observational data of a `Subject` or `Population` to estimate the parameters in many types of models. This is done by the following classes of methods:

- Maximum likelihood methods find the parameters such that the observational data has the highest probability of occurring according to the chosen error distributions.
- Bayesian methods find a posterior probability distribution for the parameters, describing the chance that a parameter has a given value given the data.
- Maximum A Posteriori (MAP) estimation can be considered as something in between maximum likelihood estimation and Bayesian inference. It incorporates priors on some or all variables and can be interpreted as penalized maximum likelihood or as finding the mode of the posterior distribution in a Bayesian context.

The following sections describe how to fit an NLME model in Pumas via these methods.
Defining Data for Estimation
The observed data should be parsed using the same names as those found in the model. For example, if `subject.observations` is a `NamedTuple` with names `dv` and `resp`, the `derived` block or function in the `PumasModel` should define distributions with matching names. Similarly, the `error` block should define distributions with matching names if using a `PumasEMModel` instead. If `dv` is a scalar in the observation data, then `dv` from `derived` should also be a scalar. Likewise, if `dv` is an array like a time series, then `dv` should be a size-matching time series when returned from `derived`. The likelihood of observing multiple dependent variables is calculated under the assumption of independence between them.
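As a minimal sketch, the model below defines a `dv` distribution in its `@derived` block to match an observation column named `dv`; the parameter names and the one-compartment dynamics are illustrative, not prescribed:

```julia
using Pumas

model = @model begin
    @param begin
        tvcl ∈ RealDomain(lower = 0.0)  # illustrative typical clearance
        tvvc ∈ RealDomain(lower = 0.0)  # illustrative typical volume
        σ ∈ RealDomain(lower = 0.0)     # additive error scale
    end
    @pre begin
        CL = tvcl
        Vc = tvvc
    end
    @dynamics Central1
    @derived begin
        # The name `dv` must match the name in subject.observations
        dv ~ @. Normal(Central / Vc, σ)
    end
end
```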
Maximum Likelihood Estimation
Maximum Likelihood Estimation (MLE) is performed using the `fit` function. This function's signature is:
```julia
Distributions.fit(model::AbstractPumasModel,
                  data::Population,
                  param::NamedTuple,
                  approx::LikelihoodApproximation;
                  optimize_fn = DEFAULT_OPTIMIZE_FN,
                  constantcoef::NamedTuple = NamedTuple(),
                  omegas::Tuple = tuple(),
                  ensemblealg::DiffEqBase.EnsembleAlgorithm = EnsembleSerial(),
                  checkidentification = true,
                  diffeq_options)
```
Fit the Pumas model `model` to the dataset `data` with starting values `param` using the estimation method `approx`. Currently supported values for the `approx` argument are `FO`, `FOCE`, `LaplaceI`, `NaivePooled`, and `BayesMCMC`. See the bottom of this page for how to do Maximum A Posteriori (MAP) estimation based on these likelihood approximations. Example fit syntax:
```julia
myfit = fit(mymodel, drug_x_population, drug_x_parameters, Pumas.FOCE())
```
Note that the `approx` value has to be qualified with `Pumas` and has to end with empty parentheses.
The argument `optimize_fn` is used for optimizing the objective function for all `approx` methods except `BayesMCMC`. The default optimization function uses the quasi-Newton `BFGS` method from the Optim package. Optimization-specific arguments can be passed to `DefaultOptimizeFN`, e.g. the optimization trace can be disabled by passing `DefaultOptimizeFN(show_trace = false)`. See Optim for more details. Example syntax:
```julia
myfit = fit(mymodel, drug_x_population, drug_x_parameters, Pumas.FOCE(),
    optimize_fn = DefaultOptimizeFN(show_trace = false))
```
If `model` is a `PumasModel`, it is possible to fix one or more parameters of the fit by passing a `NamedTuple` as the `constantcoef` argument with keys and values corresponding to the names and values of the fixed parameters, e.g. `constantcoef = (σ = 0.1,)`. Example syntax:
```julia
myfit = fit(mymodel, drug_x_population, drug_x_parameters, Pumas.FOCE(),
    constantcoef = (tvcl = 4,),
    optimize_fn = DefaultOptimizeFN(show_trace = false))
```
When models include an `@random` block and fitting with `NaivePooled` is requested, the user must supply the names of the random effect parameters as the `omegas` argument so that these can be ignored in the optimization, e.g. `omegas = (:Ω,)`. Example syntax:
```julia
myfit = fit(mymodel, drug_x_population, drug_x_parameters, Pumas.NaivePooled(),
    omegas = (:Ω,),
    optimize_fn = DefaultOptimizeFN(show_trace = false))
```
Parallelization of the optimization is supported for most estimation methods via the ensemble interface of DifferentialEquations.jl. Supported options are:
- `EnsembleThreads()`: This is the default. This will allow the fit to use multiple threads. Note that `Threads.nthreads()` must be greater than 1 to see improved performance. Threads cannot be added to a running Julia process; Julia must be started with the full number of threads to be used. This can be configured in VSCode.
- `EnsembleSerial()`: Use only a single thread while fitting. This can be useful if you are performing many model fits in parallel. In this case, you can specify `EnsembleSerial()` so that the fits are themselves single-threaded. More granular parallelism (e.g. from performing many fits in parallel) will generally perform better due to needing less synchronization and likely having rarer serial parts.
- `EnsembleDistributed()`: This is similar to `EnsembleThreads()`, but runs in parallel across separate processes. To use this, you must be `using Distributed`. `nworkers()` must be greater than `1` to see a performance improvement. Workers can be added to an existing Julia process via `addprocs(Sys.CPU_THREADS, exeflags="--project=$(Base.active_project())")`. The chief difference between `EnsembleThreads` and `EnsembleDistributed` is that the latter's parallelism comes from running separate processes that manage their memory separately. This has pros and cons. The primary benefit is that it can reduce the amount of GC time needed, as each process has its own GC (which runs in serial) managing a much smaller amount of memory specific only to that process. Thus, when seeing a large percentage of the runtime taken by the GC while using `EnsembleThreads()`, it may be worth trying `EnsembleDistributed()` instead. There are two primary disadvantages. The first is that communication between separate processes is more expensive than communication between threads within a process (as threads within a process share memory), so `EnsembleThreads()` is likely to be faster if GC time is not a major issue. The other disadvantage is that if you want to run code on separate processes, you need to ensure those processes have loaded the appropriate dependencies. After adding all the worker processes, you will likely have to run `@everywhere using Pumas, Distributions` to make `Pumas` and the probability distributions you are using available on all workers. Additionally, if you are using any functions inside your `PumasModel` that are not normally available in a fresh Julia session, e.g. from a package you have to load or a function you've defined, you will also have to `@everywhere` the necessary code (e.g. the package load or function definition) to make it available.
Example fit syntax for this is:

```julia
myfit = fit(mymodel, drug_x_population, drug_x_parameters, Pumas.FOCE(),
    ensemblealg = EnsembleSerial())
```
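For `EnsembleDistributed()`, a minimal sketch of the worker setup described above might look like the following (the model and data names are illustrative):

```julia
using Distributed

# Add one worker per CPU core, sharing the current project environment
addprocs(Sys.CPU_THREADS, exeflags = "--project=$(Base.active_project())")

# Load Pumas (and any other model dependencies) on all workers
@everywhere using Pumas

myfit = fit(mymodel, drug_x_population, drug_x_parameters, Pumas.FOCE(),
    ensemblealg = EnsembleDistributed())
```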
The `fit` function will check the gradient of the objective function and throw an exception if any of its elements are exactly zero, unless `checkidentification` is set to `false`. `checkidentification` currently requires using a `PumasModel`.
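For example, the check can be disabled like this (a sketch using the same illustrative names):

```julia
myfit = fit(mymodel, drug_x_population, drug_x_parameters, Pumas.FOCE(),
    checkidentification = false)
```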
Keyword arguments that control the differential equation solver can be passed via the `diffeq_options` argument. This accepts a `NamedTuple` with arguments to the differential equations solver such as `alg`, `abstol`, and `reltol`. The default values for these are `AutoVern7(Rodas5())`, `1e-12`, and `1e-8`, respectively. See the DifferentialEquations.jl documentation for more details.
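For instance, a sketch that picks a stiff solver and adjusts the tolerances (the specific solver choice and values are illustrative, not recommendations):

```julia
myfit = fit(mymodel, drug_x_population, drug_x_parameters, Pumas.FOCE(),
    diffeq_options = (; alg = Rodas5(), abstol = 1e-10, reltol = 1e-6))
```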
The return type of `fit` is a `FittedPumasModel`.
Marginal Likelihood Approximations
The following choices are available for the likelihood approximations:
- `FO()`: first order approximation.
- `FOCE()`: first order conditional estimation (with automatic detection for when to include the interaction terms).
- `LaplaceI()`: second order Laplace approximation with interaction.
Only `LaplaceI()` is supported when using a `PumasEMModel`.
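A fit of a `PumasEMModel` with this approximation would therefore look like the following sketch (names are illustrative):

```julia
myfit = fit(my_em_model, drug_x_population, drug_x_parameters, Pumas.LaplaceI())
```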
FittedPumasModel
The relevant fields of a `FittedPumasModel` are:

- `model`: the `model` used in the estimation process.
- `data`: the `Population` that was estimated.
- `optim`: the result returned by the optimizer.
- `approx`: the marginal likelihood approximation that was used.
- `param`: the optimal parameters.
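These fields can be accessed directly on the fit result, for instance (a sketch; `myfit` as in the earlier examples):

```julia
myfit.param    # optimal parameters as a NamedTuple
myfit.approx   # the likelihood approximation used, e.g. Pumas.FOCE()
myfit.optim    # the raw result returned by the optimizer
```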
Bayesian Estimation
Bayesian parameter estimation is performed by using the `fit` function as follows:
```julia
Distributions.fit(
    model::PumasModel,
    data::Population,
    param::NamedTuple,
    ::BayesMCMC;
    nadapts::Integer = 2000,
    nsamples::Integer = 10000,
    kwargs...
)
```
We use a `NUTS` sampler with the generalized no-U-turn termination criterion and multinomial sampling on a Hamiltonian system whose kinetic energy is specified with a diagonal metric (a diagonal matrix with positive diagonal entries). For numerical integration of the Hamiltonian system, we use the ordinary leapfrog integrator. The adaptation is done using Stan's windowed adaptation routine with a default target acceptance ratio of `0.8`.
The arguments are:
- `model`: a `PumasModel`, e.g. defined by the `@model` DSL.
- `data`: a `Population`.
- `param`: a named tuple of parameters. Used as the initial condition for the sampler.
- The `approx` must be `BayesMCMC()`.
- `nsamples` determines the number of samples taken along each chain.
- Extra `args` and `kwargs` are passed on to the internal `simobs` call and thus control the behavior of the differential equation solvers.
The result is a `BayesMCMCResults` type.
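For example, a sketch of a Bayesian fit using the signature above (the model and data names are illustrative):

```julia
bayes_result = fit(mymodel, drug_x_population, drug_x_parameters, Pumas.BayesMCMC();
    nadapts = 2000, nsamples = 10000)
```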
BayesMCMCResults
The MCMC chain is stored in the `chain` field of the returned `BayesMCMCResults`. Additionally, the result can be converted into a `Chains` object from MCMCChains.jl, allowing utilization of its diagnostics and visualization tooling. This is discussed further in the Bayesian Estimation tutorial.
The following functions help with querying common results on the Bayesian posterior:
- `param_mean(br)`: returns a named tuple of parameters which represents the mean of each parameter's posterior distribution.
- `param_var(br)`: returns a named tuple of parameters which represents the variance of each parameter's posterior distribution.
- `param_std(br)`: returns a named tuple of parameters which represents the standard deviation of each parameter's posterior distribution.
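For instance, continuing the sketch above with `bayes_result` as the fitted object:

```julia
post_mean = param_mean(bayes_result)  # NamedTuple of posterior means
post_var  = param_var(bayes_result)   # NamedTuple of posterior variances
post_std  = param_std(bayes_result)   # NamedTuple of posterior standard deviations
```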
Maximum A Posteriori (MAP) Estimation
MAP estimation is carried out using the `fit` function by wrapping the chosen likelihood approximation in the `MAP` type. The `fit` signature is the same as for maximum likelihood. If we have a model (`model`), a `Population` (`data`), and an initial parameter specification (`param`), we can carry out the MAP estimation with the FOCE approximation of the log-likelihood using:
```julia
result = fit(model, data, param, Pumas.MAP(Pumas.FOCE()))
```
The usual post-processing and diagnostics are available and are calculated according to the approximation wrapped in `MAP`.
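For example, the fitted parameters can be extracted with `coef`, just as for a maximum likelihood fit (a sketch; `result` as above):

```julia
map_estimates = coef(result)  # NamedTuple of MAP parameter estimates
```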