Estimating Parameters using SAEM

PumasEMModels can optionally be fitted using SAEM (stochastic approximation expectation-maximization). Here is an example PumasEMModel definition:

using Pumas
using PumasUtilities
using Random

covariate_saem_cov = @emmodel begin
    @random begin
        CL ~ 1 + logwt | LogNormal
        v ~ 1 | LogNormal
    end
    @covariance 2
    @covariates wt
    @pre begin
        Vc = wt * v
    end
    @dynamics Central1
    @post begin
        cp = Central / Vc
    end
    @error begin
        dv ~ ProportionalNormal(cp)
    end
end

PumasEMModel
  Parameters with random effects:
    CL ~ (1, :logwt) | LogNormal
    v ~ (1,) | LogNormal
  Covariates: wt
  Pre-dynamical variables: Vc
  Dynamical system variables: Central
  Post-dynamical variables: cp

See the documentation on the @emmodel macro interface for an explanation of the syntax. To fit this model, we'll simulate data using simobs.

sim_params_covariate_cov = (;
    CL = (4.0, 0.75),
    v = 70.0,
    Ω = (Pumas.@SMatrix([0.1 0.05; 0.05 0.1]),),
    σ = ((0.2,),),
)

obstimes = 0.0:100.0

dose = DosageRegimen(1_000; addl = 2, ii = 24)

function choose_covariates()
    wt = (55 + 25rand()) / 70
    return (; wt, logwt = log(wt))
end

pop = [Subject(; id = i, events = dose, covariates = choose_covariates()) for i = 1:72]
sims = simobs(
    covariate_saem_cov,
    pop,
    sim_params_covariate_cov;
    obstimes,
    ensemblealg = EnsembleSerial(),
)
reread_df = DataFrame(sims);

pop_covariate = read_pumas(reread_df; observations = [:dv], covariates = [:logwt, :wt])

Population
  Subjects: 72
  Covariates: logwt, wt
  Observations: dv
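
To make the simulated covariate effect concrete, here is a minimal sketch in plain Julia. It assumes that the line `CL ~ 1 + logwt | LogNormal` is linear on the log scale with the intercept reported on the natural scale, i.e. log(CL) = log(CL_base) + CL_logwt * logwt + η. That reading is consistent with the simulation values CL = (4.0, 0.75) but is an assumption here, and `typical_cl` is a hypothetical helper, not a Pumas function.

```julia
# Hypothetical helper (not part of Pumas): typical clearance with η = 0,
# assuming log(CL) = log(cl_base) + cl_logwt * log(wt).
typical_cl(cl_base, cl_logwt, wt) = cl_base * wt^cl_logwt

# Simulation values from above: CL = (4.0, 0.75), v = 70.0.
# `wt` is body weight divided by 70, as in choose_covariates().
for kg in (55.0, 70.0, 80.0)
    wt = kg / 70
    cl = typical_cl(4.0, 0.75, wt)
    vc = wt * 70.0  # Vc = wt * v from the @pre block
    println(kg, " kg: typical CL = ", round(cl; digits = 3), ", Vc = ", round(vc; digits = 1))
end
```

Under this assumption a subject at the reference weight (wt = 1) has a typical clearance equal to the intercept, 4.0, and lighter or heavier subjects scale with exponent 0.75.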

When specifying inits, it is not necessary to specify any variance parameters for the random effects or error models.

init_covariate = (; CL = (2.0, 2 / 3), v = 50.0)

(CL = (2.0, 0.6666666666666666),
 v = 50.0,)

If left unspecified, variances are initialized to 1.0 and covariance matrices to the identity matrix. SAEM's stochastic exploration phase is most effective when these initial values are much larger than the true variances. Thus, if the true variances are believed to be around 1.0 or larger, it is recommended to specify larger initial values manually.
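
As a sketch of what manually specified variance inits might look like, the NamedTuple below reuses the Ω and σ layout of `sim_params_covariate_cov`. That `fit` accepts initial values in this same layout is an assumption, the name `init_covariate_wide` is hypothetical, and the numbers are arbitrary placeholders chosen only to be larger than the defaults.

```julia
using Pumas

# Assumption: inits for Ω and σ use the same layout as the simulation
# parameters above (a tuple holding one 2×2 block, and a nested tuple for σ).
init_covariate_wide = (;
    CL = (2.0, 2 / 3),
    v = 50.0,
    Ω = (Pumas.@SMatrix([4.0 0.0; 0.0 4.0]),),  # placeholder, larger than the identity default
    σ = ((2.0,),),                               # placeholder, larger than the 1.0 default
)
```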

It is possible to pass a vector of random number generators to fit through the rng keyword argument. If so, the fit will use one thread per RNG. By specifying the seed of each RNG, a SAEM() fit can be made fully reproducible:

rngv = [MersenneTwister(1941964947i + 1) for i ∈ 1:Threads.nthreads()];

fit_covariate_cov1 = fit(
    covariate_saem_cov,
    pop_covariate,
    init_covariate,
    SAEM();
    ensemblealg = EnsembleThreads(),
    rng = rngv,
)

rngv = [MersenneTwister(1941964947i + 1) for i ∈ 1:Threads.nthreads()];

fit_covariate_cov2 = fit(
    covariate_saem_cov,
    pop_covariate,
    init_covariate,
    SAEM();
    ensemblealg = EnsembleThreads(),
    rng = rngv,
)

coef(fit_covariate_cov1) == coef(fit_covariate_cov2) # true

true
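
Note that rngv is rebuilt before the second fit. A generator is mutable and its state advances as numbers are drawn, so reusing the vector from the first fit would start the second fit from a different state. The following plain-Julia sketch illustrates this with the same seeding scheme; `make_rngs` is a hypothetical helper introduced only for this illustration.

```julia
using Random

# Hypothetical helper: build n generators with the seeding scheme used above.
make_rngs(n) = [MersenneTwister(1941964947i + 1) for i in 1:n]

a = make_rngs(2)
b = make_rngs(2)

# Freshly seeded generators produce identical streams.
draws_a = [rand(r, 3) for r in a]
draws_b = [rand(r, 3) for r in b]
same_seed_match = draws_a == draws_b

# Drawing again from `a` continues from the advanced state,
# so the numbers differ from the first batch.
draws_a_again = [rand(r, 3) for r in a]
reused_match = draws_a_again == draws_a

println((; same_seed_match, reused_match))
```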

One can also specify the number of iterations for each of the three phases (rapid exploration, convergence, smoothing) through the iters keyword of SAEM:

fit_covariate_cov =
    fit(covariate_saem_cov, pop_covariate, init_covariate, SAEM(; iters = (1000, 500, 500)))

FittedPumasEMModel

Dynamical system type:                 Closed form

Number of subjects:                             72

Observation records:         Active        Missing
    dv:                        7272              0
    Total:                     7272              0

Number of parameters:      Constant      Optimized
                                  0              7

Likelihood approximation:                     SAEM

Log-likelihood value:                   -11690.278

-------------------
         Estimate
-------------------
CL₁       3.9711
CL₂       0.83857
v        68.6
Ω₁,₁      0.084356
Ω₂,₁      0.048728
Ω₂,₂      0.093657
σ         0.19837
-------------------
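
For orientation, a textbook SAEM iteration can be written as below. This is the generic algorithm, not a description of Pumas internals, and all symbols are introduced here for illustration: y are the observations, ψ the individual parameters, θ the population parameters, S the sufficient statistics of the complete-data model, ℓ the complete-data log-likelihood written in terms of those statistics, and γ_k the step size at iteration k.

```latex
\begin{aligned}
\text{Simulation:}\quad & \psi^{(k)} \sim p\left(\psi \mid y;\ \theta_{k-1}\right) \\
\text{Stochastic approximation:}\quad & s_k = s_{k-1} + \gamma_k \left( S\left(y, \psi^{(k)}\right) - s_{k-1} \right) \\
\text{Maximization:}\quad & \theta_k = \arg\max_{\theta}\ \ell\left(s_k;\ \theta\right)
\end{aligned}
```

With γ_k = 1 the statistics s_k carry no memory of earlier iterations, which suits exploration. A step size that decreases toward zero averages over iterations and smooths the estimates. How Pumas maps its three phases onto a step-size schedule is not stated here.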

The results of a SAEM fit can be analyzed with the usual post-processing functions, such as infer, coeftable, and inspect:

infer_cov = infer(fit_covariate_cov)

Asymptotic inference results using sandwich estimator

Dynamical system type:                 Closed form

Number of subjects:                             72

Observation records:         Active        Missing
    dv:                        7272              0
    Total:                     7272              0

Number of parameters:      Constant      Optimized
                                  0              7

Likelihood approximation:                     SAEM

Log-likelihood value:                   -11690.278

--------------------------------------------------------
           Estimate   SE         95.0% C.I.
--------------------------------------------------------
CL_base     3.9711    0.13585    [  3.7048  ;  4.2373 ]
CL_logwt    0.83857   0.24152    [  0.36519 ;  1.312  ]
v_base     68.6       2.4805     [ 63.738   ; 73.462  ]
ω_1₁,₁      0.29044   0.032366   [  0.227   ;  0.35388]
ω_1₂,₁      0.16777   0.045844   [  0.077921;  0.25763]
ω_1₂,₂      0.25595   0.022163   [  0.21251 ;  0.29939]
σ_0         0.19837   0.001709   [  0.19502 ;  0.20172]
--------------------------------------------------------
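
The fit output reports Ω entries while the inference table reports ω entries. Numerically, the ω values printed above match the lower-triangular Cholesky factor of the fitted Ω block. This is an observation from the displayed numbers, not a documented statement, and can be checked with the standard library:

```julia
using LinearAlgebra

# Ω block as printed by the fit (rounded as displayed).
Ω = [0.084356 0.048728; 0.048728 0.093657]

# ω entries as printed by infer, arranged as a lower-triangular matrix.
ω = [0.29044 0.0; 0.16777 0.25595]

# If ω is the lower Cholesky factor, ω * ω' reproduces Ω up to display rounding.
reconstructs_Ω = isapprox(ω * ω', Ω; atol = 1e-5)

# The same check in the other direction, via the Cholesky factorization of Ω.
matches_factor = isapprox(Matrix(cholesky(Symmetric(Ω)).L), ω; atol = 1e-4)

println((; reconstructs_Ω, matches_factor))
```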

coeftable(infer_cov)

7×7 DataFrame
 Row │ parameter  constant  estimate  se          relative_se  ci_lower  ci_upper
     │ String     Bool      Float64   Float64     Float64      Float64   Float64
─────┼────────────────────────────────────────────────────────────────────────────
   1 │ CL_base    false     3.97105   0.135846    0.0342092    3.7048    4.23731
   2 │ CL_logwt   false     0.838571  0.241525    0.28802      0.365191  1.31195
   3 │ v_base     false     68.6      2.48048     0.0361585    63.7384   73.4617
   4 │ ω_1₁,₁     false     0.29044   0.0323663   0.111439     0.227004  0.353877
   5 │ ω_1₂,₁     false     0.167774  0.0458444   0.27325      0.077921  0.257628
   6 │ ω_1₂,₂     false     0.255946  0.0221634   0.0865943    0.212506  0.299385
   7 │ σ_0        false     0.198373  0.00170898  0.00861496   0.195024  0.201723
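
The derived columns of this table can be reproduced by hand. The sketch below assumes that relative_se is se / estimate and that the interval is a symmetric normal-based (Wald) interval, estimate ± z * se. Both assumptions are inferred from the printed numbers rather than taken from the Pumas documentation.

```julia
# Values for CL_base copied from the coeftable output.
estimate = 3.97105
se = 0.135846

z = 1.959964  # 97.5% quantile of the standard normal, for a 95% interval

relative_se = se / estimate
ci_lower = estimate - z * se
ci_upper = estimate + z * se

println((; relative_se, ci_lower, ci_upper))
```

Up to rounding, these agree with the CL_base row (0.0342092, 3.7048, 4.23731).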

inspect_cov = inspect(fit_covariate_cov)

[ Info: Calculating predictions.
[ Info: Calculating weighted residuals.
[ Info: Calculating empirical bayes.
[ Info: Evaluating dose control parameters.
[ Info: Evaluating individual parameters.
[ Info: Done.