Pumas
Docstrings
DataFrames.DataFrame
— MethodDataFrame(
pop::Population;
include_covariates = true,
include_observations = true,
include_events = true,
)
Create a DataFrame
with the Population
information. The data follows Pumas Data Format ready to be imported with read_pumas
.
Examples:
julia> events = DosageRegimen(1.0; time = 1.0)
DosageRegimen
Row │ time cmt amt evid ii addl rate duration ss route
│ Float64 Int64 Float64 Int8 Float64 Int64 Float64 Float64 Int8 NCA.Route
─────┼───────────────────────────────────────────────────────────────────────────────────
1 │ 1.0 1 1.0 1 0.0 0 0.0 0.0 0 NullRoute
julia> pop = [Subject(; id = 1, events), Subject(; id = 2, events)]
Population
Subjects: 2
Observations:
julia> DataFrame(pop)
2×12 DataFrame
Row │ id time amt evid cmt rate duration ss ii route tad dosenum
│ String Float64 Float64 Int8 Int64 Float64 Float64 Int8 Float64 NCA.Route Float64 Int64
─────┼──────────────────────────────────────────────────────────────────────────────────────────────────────
1 │ 1 1.0 1.0 1 1 0.0 0.0 0 0.0 NullRoute 0.0 1
2 │ 2 1.0 1.0 1 1 0.0 0.0 0 0.0 NullRoute 0.0 1
DataFrames.DataFrame
— MethodDataFrame(dr::DosageRegimen)
Create a DataFrame
with the information in the dosage regimen. The data follows Pumas Data Format ready to be imported with read_pumas
.
Examples:
julia> DataFrame(DosageRegimen(100))
1×10 DataFrame
Row │ time cmt amt evid ii addl rate duration ss route
│ Float64 Int64 Float64 Int8 Float64 Int64 Float64 Float64 Int8 NCA.Route
─────┼───────────────────────────────────────────────────────────────────────────────────
1 │ 0.0 1 100.0 1 0.0 0 0.0 0.0 0 NullRoute
julia> DataFrame(DosageRegimen(100; ii = 24, addl = 2))
1×10 DataFrame
Row │ time cmt amt evid ii addl rate duration ss route
│ Float64 Int64 Float64 Int8 Float64 Int64 Float64 Float64 Int8 NCA.Route
─────┼───────────────────────────────────────────────────────────────────────────────────
1 │ 0.0 1 100.0 1 24.0 2 0.0 0.0 0 NullRoute
DataFrames.DataFrame
— MethodDataFrame(subject::AbstractSubject)
Create a DataFrame
with the Subject
information. The data follows Pumas Data Format ready to be imported with read_pumas
.
Examples:
julia> DataFrame(Subject(; events = DosageRegimen(1.0; time = 1.0)))
1×12 DataFrame
Row │ id time amt evid cmt rate duration ss ii route tad dosenum
│ String Float64 Float64 Int8 Int64 Float64 Float64 Int8 Float64 NCA.Route Float64 Int64
─────┼──────────────────────────────────────────────────────────────────────────────────────────────────────
1 │ 1 1.0 1.0 1 1 0.0 0.0 0 0.0 NullRoute 0.0 1
julia> DataFrame(
Subject(;
id = "AKJ491",
events = DosageRegimen(1.0; time = 1.0),
observations = (; glucose = [8.31, 7.709, 5.19]),
time = [0.3, 0.9, 3.0],
covariates = (; bloodpressure = [(143, 95.01), (141.3, 94.8), (130.4, 85.1)]),
covariates_time = [0.0, 0.5, 5.0],
),
)
7×14 DataFrame
Row │ id time evid glucose amt cmt rate duration ss ii route b ⋯
│ String Float64 Int8 Float64? Float64? Int64? Float64? Float64? Int8? Float64? NCA.Route? T ⋯
─────┼──────────────────────────────────────────────────────────────────────────────────────────────────────────────────
1 │ AKJ491 0.0 2 missing missing missing missing missing missing missing missing ( ⋯
2 │ AKJ491 0.3 0 8.31 0.0 missing 0.0 0.0 0 0.0 NullRoute (
3 │ AKJ491 0.5 2 missing missing missing missing missing missing missing missing (
4 │ AKJ491 0.9 0 7.709 0.0 missing 0.0 0.0 0 0.0 NullRoute (
5 │ AKJ491 1.0 1 missing 1.0 1 0.0 0.0 0 0.0 NullRoute ( ⋯
6 │ AKJ491 3.0 0 5.19 0.0 missing 0.0 0.0 0 0.0 NullRoute (
7 │ AKJ491 5.0 2 missing missing missing missing missing missing missing missing (
3 columns omitted
DataFrames.DataFrame
— MethodDataFrame(
fpmi::FittedPumasModelInspection;
include_covariates = true,
include_observations = true,
include_events = true,
include_icoef = true,
include_dosecontrol = true,
include_ebes = true,
include_dynamics = true,
)
Create a DataFrame
with the information in an inspect
result fpmi
. The data follows Pumas Data Format ready to be imported with read_pumas
.
DataFrames.DataFrame
— MethodDataFrame(
obs::SimulatedObservations;
include_events = true,
include_covariates = true,
)
Create a DataFrame
with the information in a simobs
result obs
. The data follows Pumas Data Format ready to be imported with read_pumas
.
DataFrames.DataFrame
— MethodDataFrame(pred::SubjectPrediction)
Create a DataFrame
with the information in the subject's prediction pred
. The data follows Pumas Data Format ready to be imported with read_pumas
.
DataFrames.DataFrame
— MethodDataFrame(bts::Union{SimulatedInference, Bootstraps})
Returns a data frame of the parameter estimates.
DataFrames.DataFrame
— MethodDataFrame(vpred::Vector{<:SubjectPrediction})
Create a DataFrame
with the information in a population prediction vpred
. The data follows Pumas Data Format ready to be imported with read_pumas
.
DataFrames.DataFrame
— MethodDataFrame(vresid::Vector{<:SubjectResidual})
Create a DataFrame
with the information in a population weighted residuals vresid
. The data follows Pumas Data Format ready to be imported with read_pumas
.
Distributions.MvNormal
— MethodMvNormal(pmi::FittedPumasModelInference)
Returns an MvNormal
with a mean equal to the optimal parameter values in the fitted model and a covariance equal to the variance-covariance estimate.
Distributions.MvNormal
— MethodMvNormal(model::PumasModel, means::NamedTuple, vcov::NamedTuple)
Returns an MvNormal
with mean equal to the vectorized means
argument and a covariance equal to the block diagonal matrix of the covariance matrices in the vcov
argument. The mean
argument should be a named tuple of values in their natural domain. vcov
should be a named tuple of numbers (for real parameters) or matrices for vector and matrix parameters. For parameters that are themselves symmetric matrices, e.g:
Ω = [
Ω₁₁ Ω₁₂ Ω₁₃
Ω₂₁ Ω₂₂ Ω₂₃
Ω₃₁ Ω₃₂ Ω₃₃
]
the covariance matrix should be the covariance of the vector: [Ω₁₁, Ω₁₂, Ω₂₂, Ω₁₃, Ω₂₃, Ω₃₃]
. Note that because the matrix is symmetric, only the upper triangle is used iterating over the columns first in an outer loop then the over the rows of each column in an inner loop.
Example:
means = (θ1 = 1.0, θ2 = [1.0, 2.0], θ3 = [1.0 0.1; 0.1 1.0])
vcov =
(θ1 = 0.1, θ2 = [0.5 -0.2; -0.2 0.6], θ3 = [0.5 -0.2 0.1; -0.2 0.6 -0.3; 0.1 -0.3 0.45])
dist = MvNormal(model, means, vcov)
MCMCChains.Chains
— MethodChains(b::BayesMCMCResults; subject::Union{Nothing, Int} = nothing)
Constructs an MCMCChains.Chains
object for the samples of the population or subject-specific parameters. If subject
is nothing
(the default value) a Chains
object of the population parameters' samples is returned. If subject
is an integer, the samples of the subject-specific parameters corresponding to the subject with the given index will be returned instead.
NCA.NCASubject
— MethodNCASubject(subj; group, observations, [kwargs...])
Return an NCASubject constructed from a Subject. The main keyword arguments are
group
, a vector of symbols for covariates to use as grouping variables for the NCA calculations.observations
, refers to the variable to calculate NCA parameters for.
For other keywords arguments to pass to the NCASubject
constructor, see the docstring for the NCASubject
constructor that does not take in a Subject
but rather builds the NCASubject
from time series directly.
Pumas.AnalyticalPKPDProblem
— TypeAnalyticalPKPDProblem(
f,
u0,
tspan,
events,
time,
p,
bioav)
An analytical PK(PD) problem.
Fields:
f
: callable that returns the state of the dynamic systemu0
: initial conditionstspan
: time points that define the intervals in which the problem is definedevents
: events such as dose, reset events, etctime
: event timesp
: a function that returns pre block evaluated at a time point
Pumas.AnalyticalPKProblem
— TypeAnalyticalPKProblem(pkprob, prob2, sys2)
A problem that is partially an analytical problem that can be evaluated independently of the rest of the system.
Fields:
pkprob
: the analytical part of the problemprob2
: a problem that represents the rest of the systemsys2
(optional): the system ofprob2
- used for latexification.
Pumas.BayesMCMC
— TypeBayesMCMC
An instance of the Hamiltonian Monte Carlo (HMC) based No-U-Turn Sampler (NUTS) algorithm. All the options of the algorithm can be passed to the constructor of BayesMCMC
as keyword arguments. Example:
alg = BayesMCMC(; nsamples = 2000, nadapts = 1000)
The main options that can be set are:
target_accept
: (default is0.8
) target acceptance ratio for the NUTS algorithm.nsamples
: (default is2_000
) number of Markov Chain Monte Carlo (MCMC) samples to generate including adaptation samples.nadapts
: (default isnsamples ÷ 2
) number of adaptation steps in the NUTS algorithm.nchains
: (default is4
) number of MCMC chains to sample.ess_per_chain
: (default is∞
) target effective sample size (ESS) per chain, sampling terminates if the target is reached.check_every
: (default isess_per_chain ÷ 5
) the number of samples after which the ESS per chain is checked.time_limit
: (default is∞
) a time limit for sampling in seconds, sampling terminates if the time limit is reached.ensemblealg
: (default isEnsembleThreads()
) can be set toEnsembleSerial()
for serial sampling,EnsembleThreads()
for multi-threaded sampling,EnsembleDistributed()
for multi-processing (aka distributed parallelism) sampling, orEnsembleSplitThreads()
for multi-processing over chains, and multi-threading over subjects.parallel_chains
: (default istrue
if enough threads/workers exist) can be set totrue
orfalse
, if set totrue
the chains will be sampled in parallel using either multi-threading or multi-processing depending on the value ofensemblealg
.parallel_subjects
: (default istrue
if enough threads/workers exist) can be set totrue
orfalse
, if set totrue
the log probability computation will be parallelized over the subjects using either multi-threading or multi-processing depending on the value ofensemblealg
.rng
: the random number generator used.diffeq_options
: (default is an emptyNamedTuple
(;)
, which uses the downstream defaults) aNamedTuple
of all the differential equations solver's options.constantcoef
: (default is an emptyNamedTuple
(;)
) aNamedTuple
of the parameters to be fixed during sampling. This can be used to sample from conditional posteriors fixing some parameters to specific values, e.g.constantcoef = (σ = 0.1,)
fixes theσ
parameter to 0.1 and samples from the posterior of the remaining parameters conditional onσ
.check_initial
: (default istrue
) aBool
to check if the model's initial parameter values have a valid log-likelihood before begin fitting.
Additionally, there are options to fine-tune the NUTS sampler behavior:
max_chunk_size
: (default is8
) Forward-mode autodiff (currently the only autodiff method available) maximal chunks to use as chunk size. Forward-mode autodiff performs partial derivative evaluation on one "chunk" of the input vector at a time. Each differentiation of a chunk requires a call to the target function as well as additional memory proportional to the square of the chunk's size. Thus, a smaller chunk size makes better use of memory bandwidth at the cost of more calls to the target function, while a larger chunk size reduces calls to the target function at the cost of more memory bandwidth.adapt_init_buffer
: (default is0
) width of initial fast adaptation interval for the NUTS mass matrix adaptation.adapt_term_buffer
: (default is the maximum value betweennadapts ÷ 20
and50
) width of final fast adaptation interval for the NUTS mass matrix adaptation.adapt_window_size
: (default is5
) initial width of slow adaptation interval for the NUTS mass matrix adaptation.
For the NUTS mass matrix adaptation, if you want the same behavior as Stan, set the following values:
BayesMCMC(; ..., adapt_init_buffer = 75, adapt_term_buffer = 50, adapt_window_size = 25)
Pumas.Bootstrap
— MethodBootstrap(; rng=Random.default_rng, samples=200, stratify_by=nothing, ensemblealg=EnsembleThreads())
Constructs a Bootstrap object that allows the user to use bootstrapping with the supplies setting in infer
function calls. See ?infer
and https://docs.pumas.ai/stable/analysis/inference/ for more information about obtaining inference results.
The keyword arguments are as follows:
- rng: used to select a pseudo-random number generator to beused for the random resamples of the datasets.
- samples: controls the number of resampled
Population
s. - resamples: the number of model parameter samples from the original sample distribution to re-sample (by weighed re-sampling without replacement). The number of resamples must be less than that of the original sample. An original sample-to-resample ratio of at least 5:1 is suggested.
stratify_by
control the stratification used in the re-sampling scheme. Should be a set to aSymbol
with the name of a covariate it is possible to stratify by a covariate with a finite number of possible values (e.g:study
for a covariate namedstudy
).ensemblealg
controls the parallelization acrossfit
calls. Each fit is fitted using theensemblealg
that was used in the originalfit
.
Pumas.ByObservation
— TypeByObservation(; allsubjects = true)
Using this method, each observation or collection of observations is treated as a single data point. When computing predictive loglikelihoods using this method, the predictive loglikelihood computed is the conditional loglikelihood of one or more observations for one or more subjects.
This method has one option, allsubjects
. If allsubjects
is set to true
(the default value), the i
th observation of each subject are all grouped together into a single data point. If allsubjects
is set to false
, then each observation per subject is its individual data point.
Examples:
Assume there are 2 subjects and 3 observations per subject. When using LeaveK(K = 1)
as the splitting method together with ByObservation(allsubjects = false)
, the training and validation splittings are:
Training subset | Validation subset |
---|---|
Sub 1 (obs 1, 2, 3), sub 2 (obs 1, 2) | Sub 2 (obs 3) |
Sub 1 (obs 1, 2, 3), sub 2 (obs 1, 3) | Sub 2 (obs 2) |
Sub 1 (obs 1, 2, 3), sub 2 (obs 2, 3) | Sub 2 (obs 1) |
Sub 1 (obs 1, 2), sub 2 (obs 1, 2, 3) | Sub 1 (obs 3) |
Sub 1 (obs 1, 3), sub 2 (obs 1, 2, 3) | Sub 1 (obs 2) |
Sub 1 (obs 2, 3), sub 2 (obs 1, 2, 3) | Sub 1 (obs 1) |
On the other hand if allsubjects
is set to true
, the training and validation splittings are:
Training subset | Validation subset |
---|---|
Sub 1 (obs 1, 2), sub 2 (obs 1, 2) | Sub 1 (obs 3) and sub 2 (obs 3) |
Sub 1 (obs 1, 3), sub 2 (obs 1, 3) | Sub 1 (obs 2) and sub 2 (obs 2) |
Sub 1 (obs 2, 3), sub 2 (obs 2, 3) | Sub 1 (obs 1) and sub 2 (obs 1) |
Pumas.BySubject
— TypeBySubject(; marginal = LaplaceI())
Using this method, each subject is treated as a single data point. The predictive loglikelihood computed for each subject can be either the marginal loglikelihood or conditional loglikelihood.
This method has one option, marginal
. If marginal
is set to nothing
, the predictive loglikelihood computed for each subject is the conditional loglikelihood using 0s for the subject-specific parameters. Otherwise, the predictive loglikelihood computed for each subject is the marginal loglikelihood using marginal
as the marginalization algorithm.
The default value of marginal
is LaplaceI()
which uses the Laplace method to integrate out the subject-specific parameters. Other options include: FOCE()
, FO()
and Pumas.LLQuad()
. Pumas.LLQuad()
is a quadrature integration algorithm. To learn more about the options of LLQuad
, you can type ?LLQuad
in the REPL.
Pumas.Central1
— TypeCentral1()
An analytical model for a one compartment model with dosing into Central
. Equivalent to
Central' = -CL/Vc*Central
where clearance, CL
, and volume, Vc
, are required to be defined in the @pre
block.
Pumas.Central1Meta1
— TypeCentral1Meta1()
An analytical model for a model with a central compartment, Central
, and a metabolite, Metabolite
. Equivalent to Central' = -(CL+CLfm)/VcCentral Metabolite' = CLfm/VcCentral - CLm/Vm*Metabolite where clearances (CL
and CLm
) and volumes (Vc
and Vm
) and formation clearance of metabolite CLfm
are required to be defined in the @pre
block.
Pumas.Central1Periph1
— TypeCentral1Periph1()
An analytical model for a two-compartment model with a central compartment, Central
and a peripheral compartment, Peripheral
. Equivalent to
Central' = -(CL+Q)/Vc*Central + Q/Vp*Peripheral
Peripheral' = Q/Vc*Central - Q/Vp*Peripheral
where clearance, CL
, and volumes, Vc
and Vp
, and distribution clearance, Q
, are required to be defined in the @pre
block.
Pumas.Central1Periph1Meta1
— TypeCentral1Periph1Meta1()
An analytical model for a two compartment model with a central compartment, Central
, with a peripheral compartment, Peripheral
, and a metabolite compartment, Metabolite
. Equivalent to
Central' = -(CL+Q+CLfm)/Vc*Central + Q/Vp*CPeripheral
CPeripheral' = Q/Vc*Central - Q/Vp*CPeripheral
Metabolite' = -CLm/Vm*Metabolite + CLfm/Vc*Central
where clearances (CL
and CLm
) and volumes (Vc
, Vp
and Vm
), distribution clearance (Q
), and formation clearance of metabolite CLfm
are required to be defined in the @pre
block.
Pumas.Central1Periph1Meta1Periph1
— TypeCentral1Periph1Meta1Periph1()
An analytical model for a two compartment model with a central compartment, Central
, with a peripheral compartment, Peripheral
, and a metabolite compartment, Metabolite
, with a peripheral compartment, MPeripheral
. Equivalent to
Central' = -(CL+Q+CLfm)/Vc*Central + Q/Vp*CPeripheral
CPeripheral' = Q/Vc*Central - Q/Vp*CPeripheral
Metabolite' = -(CLm+Qm)/Vm*Metabolite + Qm/Vmp*MPeripheral + CLfm/Vc*Central
MPeripheral' = Qm/Vm*Metabolite - Qm/Vmp*MPeripheral
where clearances (CL
and CLm
) and volumes (Vc
, Vp
, Vm
and Vmp
), distribution clearances (Q
and Qm
) and formation clearance of metabolite CLfm
are required to be defined in the @pre
block.
Pumas.ConstDomain
— Type@param x = val
@param x ∈ ConstDomain(val)
Specifies a parameter as a constant.
Pumas.Constrained
— TypeConstrained
Constrain a Distribution
within a Domain
. The most common case is an MvNormal
constrained within a VectorDomain
. The only supported method for Constrained
is logpdf
. Notice that the result does not represent a probability distribution since the remaining probability mass is not scaled by the mass excluded by the constraints.
Example
julia> d = Constrained(MvNormal(fill(1.0, 1, 1)), lower = -1, upper = 1)
Constrained{ZeroMeanFullNormal{Tuple{Base.OneTo{Int64}}}, VectorDomain{Vector{Int64}, Vector{Int64}, Vector{Float64}}}(ZeroMeanFullNormal(
dim: 1
μ: Zeros(1)
Σ: [1.0;;]
)
, VectorDomain{Vector{Int64}, Vector{Int64}, Vector{Float64}}([-1], [1], [0.0]))
julia> logpdf(d, [0])
-0.9189385332046728
julia> logpdf(d, [-2])
-Inf
Pumas.CorrDomain
— TypeCorrDomain(template)
Return a correlation domain.
template
provides both the shape and initial values for the resulting correlation parameter.
- if
template
is anInt
then an identity matrix of sizetemplate
is returned. - if
template
is a squareMatrix
then theCorrDomain
will match its size and will have initial values according to thetemplate
elements.
Pumas.Depots1Central1
— TypeDepots1Central1()
An analytical model for a one compartment model with a central compartment, Central
, and a depot, Depot
. Equivalent to
Depot' = -Ka*Depot
Central' = Ka*Depot - CL/Vc*Central
where absoption rate, Ka
, clearance, CL
, and volume, Vc
, are required to be defined in the @pre
block.
Pumas.Depots1Central1Periph1
— TypeDepots1Central1Periph1()
An analytical model for a two-compartment model with a central compartment, Central
, a peripheral compartment, Peripheral
, and a depot Depot
. Equivalent to
Depot' = -Ka*Depot
Central' = Ka*Depot - (CL+Q)/Vc*Central + Q/Vp*Peripheral
Peripheral' = Q/Vc*Central - Q/Vp*Peripheral
where absorption rate, Ka
, clearance, CL
, and volumes, Vc
and Vp
, and distribution clearance, Q
, are required to be defined in the @pre
block.
Pumas.Depots2Central1
— TypeDepots2Central1()
An analytical model for a one compartment model with a central compartment, Central
, and two depots, Depot1
and Depot2
. Equivalent to
Depot1' = -Ka1*Depot1
Depot2' = -Ka2*Depot2
Central' = Ka1*Depot1 + Ka2*Depot2 - CL/Vc*Central
where absorption rates, Ka1
and Ka2
, clearance, CL
, and volume, Vc
, are required to be defined in the @pre
block.
When using this model during simulation or estimation, it is preferred to have 2 dosing rows for each subject in the dataset, where the first dose goes into cmt =1
(or cmt = Depot1
) and the second dose goes into cmt=2
(or cmt=Depot2
). Central compartment gets cmt=3
or (cmt = Central
). e.g.
ev = DosageRegimen([100,100],cmt=[1,2]) s1 = Subject(id=1, events=ev)
Pumas.DiffEqWrapper
— TypeDiffEqWrapper
Modifies the diffeq system f
to add allow for modifying rates
f
: original diffeq systemparams
: ODE parametersrates_on
: should rates be modified?rates
: amount to be added to rates
Pumas.DosageRegimen
— TypeDosageRegimen(
amt::Numeric;
time::Numeric = 0,
cmt::Union{Numeric,Symbol} = 1,
evid::Numeric = 1,
ii::Numeric = zero.(time),
addl::Numeric = 0,
rate::Numeric = zero.(amt)./oneunit.(time),
duration::Numeric = zero(amt)./oneunit.(time),
ss::Numeric = 0,
route::NCA.Route = NCA.NullRoute,
)
Lazy representation of a series of events.
Examples
julia> DosageRegimen(100; ii = 24, addl = 6)
DosageRegimen
Row │ time cmt amt evid ii addl rate duration ss route
│ Float64 Int64 Float64 Int8 Float64 Int64 Float64 Float64 Int8 NCA.Route
─────┼───────────────────────────────────────────────────────────────────────────────────
1 │ 0.0 1 100.0 1 24.0 6 0.0 0.0 0 NullRoute
julia> DosageRegimen(50; ii = 12, addl = 13)
DosageRegimen
Row │ time cmt amt evid ii addl rate duration ss route
│ Float64 Int64 Float64 Int8 Float64 Int64 Float64 Float64 Int8 NCA.Route
─────┼───────────────────────────────────────────────────────────────────────────────────
1 │ 0.0 1 50.0 1 12.0 13 0.0 0.0 0 NullRoute
julia> DosageRegimen(200; ii = 24, addl = 2)
DosageRegimen
Row │ time cmt amt evid ii addl rate duration ss route
│ Float64 Int64 Float64 Int8 Float64 Int64 Float64 Float64 Int8 NCA.Route
─────┼───────────────────────────────────────────────────────────────────────────────────
1 │ 0.0 1 200.0 1 24.0 2 0.0 0.0 0 NullRoute
julia> DosageRegimen(200; ii = 24, addl = 2, route = NCA.IVBolus)
DosageRegimen
Row │ time cmt amt evid ii addl rate duration ss route
│ Float64 Int64 Float64 Int8 Float64 Int64 Float64 Float64 Int8 NCA.Route
─────┼───────────────────────────────────────────────────────────────────────────────────
1 │ 0.0 1 200.0 1 24.0 2 0.0 0.0 0 IVBolus
You can also create a new DosageRegimen
from various existing DosageRegimen
s:
evs = DosageRegimen(
regimen1::DosageRegimen,
regimen2::DosageRegimen;
offset = nothing
)
offset
specifies if regimen2
should start after an offset following the end of the last event in regimen1
.
Examples
julia> e1 = DosageRegimen(100; ii = 24, addl = 6)
DosageRegimen
Row │ time cmt amt evid ii addl rate duration ss route
│ Float64 Int64 Float64 Int8 Float64 Int64 Float64 Float64 Int8 NCA.Route
─────┼───────────────────────────────────────────────────────────────────────────────────
1 │ 0.0 1 100.0 1 24.0 6 0.0 0.0 0 NullRoute
julia> e2 = DosageRegimen(50; ii = 12, addl = 13)
DosageRegimen
Row │ time cmt amt evid ii addl rate duration ss route
│ Float64 Int64 Float64 Int8 Float64 Int64 Float64 Float64 Int8 NCA.Route
─────┼───────────────────────────────────────────────────────────────────────────────────
1 │ 0.0 1 50.0 1 12.0 13 0.0 0.0 0 NullRoute
julia> evs = DosageRegimen(e1, e2)
DosageRegimen
Row │ time cmt amt evid ii addl rate duration ss route
│ Float64 Int64 Float64 Int8 Float64 Int64 Float64 Float64 Int8 NCA.Route
─────┼───────────────────────────────────────────────────────────────────────────────────
1 │ 0.0 1 100.0 1 24.0 6 0.0 0.0 0 NullRoute
2 │ 0.0 1 50.0 1 12.0 13 0.0 0.0 0 NullRoute
julia> DosageRegimen(e1, e2; offset = 10)
DosageRegimen
Row │ time cmt amt evid ii addl rate duration ss route
│ Float64 Int64 Float64 Int8 Float64 Int64 Float64 Float64 Int8 NCA.Route
─────┼───────────────────────────────────────────────────────────────────────────────────
1 │ 0.0 1 100.0 1 24.0 6 0.0 0.0 0 NullRoute
2 │ 178.0 1 50.0 1 12.0 13 0.0 0.0 0 NullRoute
Pumas.ExactCrossvalidation
— TypeExactCrossvalidation(; split_method, split_by, ensemblealg)
Defines an instance of the ExactCrossvalidation
algorithm for crossvalidation. In this algorithm, the fitting is re-run while leaving out a subset of the data each time.
The way by which the data is split between training and validation sets is determined using the keyword arguments split_method
and split_by
. The split_method
argument can be of any of the following types:
LeaveK
for leave-K-out cross-validationKFold
for K-fold cross-validationLeaveFutureK
for leaving K future points at a time
while split_by
can be of any of the following types:
BySubject
for leaving out subjectsByObservation
for leaving out individual observations per subject
Please refer to the documentation of each of the above types for the parameters of each split method, e.g. typing ?LeaveK
or ?BySubject
.
Examples:
Assume there are 5 subjects and 10 observations per subject and that ft
is the result of the fit
function. The following are some of the combinations in which the above inputs can be used:
- Leave-1-observation-out cross-validation, leaving 1 observation for all the subjects at a time.
allsubjects = true
means that the same observation index is removed for all the subjects, e.g. the 10th observation for all the subjects is used for validation in the first run, then the 9th observation is used for validation in the second run, etc.
split_method = LeaveK(K = 1)
split_by = ByObservation(allsubjects = true)
cv_method = ExactCrossvalidation(; split_method = split_method, split_by = split_by, ensemblealg = EnsembleThreads())
r = crossvalidate(ft, cv_method)
- Leave-2-future-observations-out cross-validation, leaving 2 future observations per subject such that no less than 4 observations are used in training.
allsubjects = false
means that only the data of one subject at a time will be split. So in the first run, only observations 1 to 4 of subject 1 are used for training and observations 5 and 6 of subject 1 are used for validation. In the second run, observations 1 to 6 of subject 1 are used for training and observations 7 and 8 are used for validation. In the third run, observations 1 to 8 of subject 1 get used for training and observations 9 and 10 are used for validation. For all 3 runs, all the observations of subjects 2 to 5 are used in training. Then in the fourth to sixth runs, subject 2's data gets split. In the seventh to ninth runs, subject 3's data gets split, etc.
split_method = LeaveFutureK(K = 2, minimum = 4)
split_by = ByObservation(allsubjects = false)
cv_method = ExactCrossvalidation(; split_method = split_method, split_by = split_by, ensemblealg = EnsembleThreads())
r = crossvalidate(ft, cv_method)
- Leave-1-subject-out cross-validation, marginalizing the subject-specific parameters using quadrature integration when computing predictive likelihoods. Using this method, subjects {1, 2, 3, 4} are used for training in the first run and subject {5} is used for validation. In the second run, subjects {1, 2, 3, 5} are used for training and subject {4} is used for validation, etc. The predictive likelihoods computed are the marginal likelihood of the subject computed using Gaussian quadrature with a maximum of 1000 integration points. Refer to the documentation of
LLQuad
by typing?LLQuad
for more details on the parameters of the quadrature algorithm.FO()
,FOCE()
andLaplaceI()
can also be used instead.
split_method = LeaveK(K = 1)
split_by = BySubject(marginal = LLQuad(imaxiters=1000))
cv_method = ExactCrossvalidation(; split_method = split_method, split_by = split_by, ensemblealg = EnsembleThreads())
r = crossvalidate(ft, cv_method)
- Leave-1-future-subject-out cross-validation, where the predictive likelihoods are conditional likelihoods computed using
0
s for the subject-specific parameters instead of marginalizing. The minimum number of subjects is 3 so there are only 2 runs. In the first run, the training subjects are {1, 2, 3} and the validation subject is {4}. In the second run, the training subjects are {1, 2, 3, 4} and the validation subject is {5}.
split_method = LeaveFutureK(K = 1, minimum = 3)
split_by = BySubject(marginal = nothing)
cv_method = ExactCrossvalidation(; split_method = split_method, split_by = split_by, ensemblealg = EnsembleThreads())
r = crossvalidate(ft, cv_method)
Pumas.FittedPumasEMModel
— TypeFittedPumasEMModel
Contains the results of fitting a PumasEMModel with SAEM.
Fields:
- model: The PumasEMModel that was fit.
- data: The population that was fit.
- μ: Final estimates of
@param
and@random
parameters. - ω: Final estimates of
ω
parameters. - σ: Final estimated standard deviations of residuals.
- opthistory: If
approx
isSAEM
, contains the full history of unconstrained-μ
,ω
, and,σ
estimates across iterations. Otherwise, contains the optimization results. - approx: Fitting method used.
- η: Random effects from the fit. If
approx
isSAEM
, contains alength(pop) × nη × iteration
array ofη
samples from the smoothing phase. Otherwise, contains orthogonal random effects. - kwargs: Input keywords stored for use in diagnostics. Includes
ensemblealg
anddiffeq_options
.
Pumas.Formula
— TypeO ~ C | D
For example: CL ~ 1 + wt | Normal -> Formula{:CL,(1,:wt),:Normal}()
Pumas.IDR
— MethodIDR(; direct=true, mode=:stimulation_sigmoid)
Get an indirect response model (IDR) for use as the pharmacodynamic part of model macro generation.
- direct::Bool - toggle whether the drug acts on production (true) or inactivation (false)
- mode::Symbol - pick the mode of drug action. Valid options are: :stimulationlinear, :stimulationemax, :stimulationsigmoid, :stimulationinfinite, :stimulationlogarithmic, :inhibitionemax, :inhibitionsigmoid, :inhibitionhill,
Pumas.JointMAP
— TypeJointMAP
An instance of the maximum a-posteriori (MAP) algorithm where the mode of the joint posterior of the population and subject-specific parameters is found using the BFGS optimization algorithm. All the options of the algorithm can be passed to the constructor of JointMAP
as keyword arguments. Example:
alg = JointMAP(ensemblealg = EnsembleThreads())
The main options that can be set are:
ensemblealg
: can be set toEnsembleSerial()
for serial sampling,EnsembleThreads()
for multi-threaded sampling orEnsembleDistributed()
for multi-processing (aka distributed parallelism) samplingdiffeq_options
: aNamedTuple
of all the differential equations solver's options
Pumas.KFold
— TypeKFold(; K = 5, shuffle = false, rng = nothing)
K-fold splitting method. In this method, the data is split K
times into 2 disjoint groups, each time starting from the full data set. The 2 groups are typically called training and validation subsets, where the validation subset has floor(N / K)
data points, N
being the total number of data points. In the next iteration, the whole data set is then re-split using another disjoint validation subset of floor(N / K)
different points, disjoint from the the previous validation subsets. This process is done iteratively until almost each data point has shown up in 1 and only 1 validation subset. If N
is divisible by K
, each point will show up in 1 and only 1 validation subset. Otherwise, the remainder points will be part of the training subset for all the splittings and will not show up in any validation subset. This method can be used for computing predictive loglikelihoods, in-sample loglikelihoods and doing cross-validation using subsets of out-of-sample data.
The data is typically a vector of some sort, e.g. observations or subjects, and the splittings are order-dependent. Before performing the splittings, you can randomly shuffle the data vector by setting the shuffle
keyword argument to true
(default is false
) getting rid of the sensitivity to the original order of the data. You can additionally pass an optional pseudorandom number generator rng
to control the pseudo-randomness for maximum reproducibility.
Example:
Assume the original data is ["A", "B", "C", "D"]
. 4-fold splitting without shuffling results in the following splittings of the data:
Training subset | Validation subset |
---|---|
["A", "B", "C"] | ["D"] |
["A", "B", "D"] | ["C"] |
["A", "C", "D"] | ["B"] |
["B", "C", "D"] | ["A"] |
where each data point showed once and only once in a validation subset.
2-fold splitting without shuffling results in the following splittings:
Training subset | Validation subset |
---|---|
["A", "B"] | ["C", "D"] |
["C", "D"] | ["A", "B"] |
Pumas.LLQuad
— TypeLLQuad(; iabstol = 0.0, ireltol = 1e-8, imaxiters = 10^10)
Gaussian quadrature integration approximation method for computing the marginal loglikelihood. The integration volume is adaptively subdivided, using a cubature rule due to Genz and Malik (1980), until the estimated error E
satisfies E ≤ max(ireltol*norm(I), iabstol)
where I
is the approximate integrand, i.e. ireltol
and iabstol
are the relative and absolute tolerances requested, respectively. However, the number of conditional loglikelihood evaluations is not allowed to exceed imaxiters
. The default values of iabstol
, ireltol
and imaxiters
are 0
, 1e-8
and 10^10
respectively.
Example:
alg = LLQuad(ireltol = 1e-6, imaxiters = 10_000)
Pumas.marginal_nll(model, subject, init_params(model), alg)
Pumas.LeaveFutureK
— TypeLeaveFutureK(; K = 1, minimum = 2)
Leave-future-K-out splitting method. In this method, the data is assumed to be a timeseries. The goal is to split the data into "past" and "future". Using this method, the data is split multiple times into 3 disjoint groups where the third group is discarded, each time starting from the full data set. The first 2 groups are typically called the past/training subset and the future/validation subset, where the future validation subset has K
future data points. In the next iteration, the whole data set is then re-split using another disjoint subset of K
data points as the future validation subset. This process is done iteratively until the training set has no less than minimum
number of points remaining. Using this method, each data point can show up in at most 1 future validation subset. This method can be used for computing predictive loglikelihoods, in-sample loglikelihoods and doing cross-validation using subsets of future out-of-sample data. The default values of K
and minimum
are 1 and 2 respectively.
Example:
Assume the original data is ["A", "B", "C", "D", "E", "F"]
. Leave-1-future-out splitting with minimum = 2
results in the following splittings of the data where the third column is discarded:
Past training subset | Future validation subset | Discarded subset |
---|---|---|
["A", "B", "C", "D", "E"] | ["F"] | [] |
["A", "B", "C", "D"] | ["E"] | ["F"] |
["A", "B", "C"] | ["D"] | ["E", "F"] |
["A", "B"] | ["C"] | ["D", "E", "F"] |
Leave-2-future-out splitting with minimum = 2
results in the following splittings:
Past training subset | Future validation subset | Discarded subset |
---|---|---|
["A", "B", "C", "D"] | ["E", "F"] | [] |
["A", "B"] | ["C", "D"] | ["E", "F"] |
Pumas.LeaveK
— TypeLeaveK(; K = 5, shuffle = false, rng = nothing)
Leave-K-out splitting method. In this method, the data is split multiple times into 2 disjoint groups, each time starting from the full data set. The 2 groups are typically called training and validation subsets, where the validation subset has K
data points. In the next iteration, the whole data set is then re-split using another disjoint subset of K
data points as the validation subset. This process is done iteratively until almost each data point has shown up in 1 and only 1 validation subset. This method can be used for computing predictive loglikelihoods, in-sample loglikelihoods and doing cross-validation using subsets of out-of-sample data.
The data is typically a vector of some sort, e.g. observations or subjects, and the splittings are order-dependent. Before performing the splittings, you can randomly shuffle the data vector by setting the shuffle
keyword argument to true
(default is false
) getting rid of the sensitivity to the original order of the data. You can additionally pass an optional pseudorandom number generator rng
to control the pseudo-randomness for maximum reproducibility.
Example:
Assume the original data is ["A", "B", "C", "D"]
. Leave-1-out splitting without shuffling results in the following splittings of the data:
Training subset | Validation subset |
---|---|
["A", "B", "C"] | ["D"] |
["A", "B", "D"] | ["C"] |
["A", "C", "D"] | ["B"] |
["B", "C", "D"] | ["A"] |
where each data point showed once and only once in a validation subset.
Leave-2-out splitting without shuffling results in the following splittings:
Training subset | Validation subset |
---|---|
["A", "B"] | ["C", "D"] |
["C", "D"] | ["A", "B"] |
Pumas.MarginalMCMC
— TypeMarginalMCMC
An instance of the marginal Hamiltonian Monte Carlo (HMC) based No-U-Turn Sampler (NUTS) algorithm where the subject-specific parameters are marginalized and only the population parameters are sampled. All the options of the algorithm can be passed to the constructor of MarginalMCMC
as keyword arguments. Example:
alg = MarginalMCMC(marginal_alg = LaplaceI(), nsamples = 2000, nadapts = 1000)
The main options that can be set are:
marginal_alg
: (default isLaplaceI()
) the algorithm used to marginalize out the subject-specific parameters, defaults toLaplaceI()
but can also beFO()
orLaplaceI()
target_accept
: (default is 0.8) target acceptance ratio for the NUTS algorithmnsamples
: (default is 2000) number of Markov Chain Monte Carlo (MCMC) samples to generate including adaptation samplesnadapts
: (default isnsamples ÷ 2
) number of adaptation steps in the NUTS algorithmnchains
: (default is 4) number of MCMC chains to sample, default value is 4ess_per_chain
: (default is infinity) target effective sample size (ESS) per chain, sampling terminates if the target is reachedcheck_every
: (default isess_per_chain ÷ 5
) the number of samples after which the ESS per chain is checkedtime_limit
: (default is infinity) a time limit for sampling in seconds, sampling terminates if the time limit is reachedensemblealg
: (default isEnsembleThreads()
) can be set toEnsembleSerial()
for serial sampling,EnsembleThreads()
for multi-threaded sampling orEnsembleDistributed()
for multi-processing (aka distributed parallelism) samplingparallel_chains
: (default is true if enough threads/workers exist) can be set totrue
orfalse
, if set totrue
the chains will be sampled in parallel using either multi-threading or multi-processing depending on the value ofensemblealg
parallel_subjects
: (default is true if enough threads/workers exist) can be set totrue
orfalse
, if set totrue
the log probability computation will be parallelized over the subjects using either multi-threading or multi-processing depending on the value ofensemblealg
rng
: the random number generator useddiffeq_options
: (default is an empty named tuple(;)
, which uses the downstream defaults) aNamedTuple
of all the differential equations solver's optionsconstantcoef
: (default is an empty named tuple(;)
) aNamedTuple
of the parameters to be fixed during sampling. This can be used to sample from conditional posteriors fixing some parameters to specific values, e.g.constantcoef = (σ = 0.1,)
fixes theσ
parameter to 0.1 and samples from the posterior of the remaining parameters conditional onσ
.
Pumas.NL
— TypeNL(pkmodel; hill=false)
Specify nonlinear clearance of a pkmodel
for model generation.
pkmodel::Pumas.Explicitmodel - The pharmacokinetic model, e.g. Depots1Central1()
. hill::Bool - toggle the use of a Hill coefficient.
Pumas.ObsAdditive
— TypeObsAdditive(obsname::Symbol)
Get an additive error model for automatic model macro generation.
- obsname::Symbol - choose what the output should be called. If you'll be fitting the model towards data then this symbol should match the identifier of the
observation
in your data.
Pumas.ObsCombined
— TypeObsCombined(obsname::Symbol)
Get a combined (proportional and additive) error model for automatic model macro generation.
- obsname::Symbol - choose what the output should be called. If you'll be fitting the model towards data then this symbol should match the identifier of the
observation
in your data.
Pumas.ObsProportional
— TypeObsProportional(obsname::Symbol)
Get a proportional error model for automatic model macro generation.
- obsname::Symbol - choose what the output should be called. If you'll be fitting the model towards data then this symbol should match the identifier of the
observation
in your data.
Pumas.PDiagDomain
— Type@param x ∈ PDiagDomain(n::Int)
@param x ∈ PDiagDomain(; init::AbstractVector{<:Real})
Specify a parameter as an n
-by-n
positive semi-definite diagonal matrix by providing either the number n
of rows and columns or the initial diagonal (variance) elements init
.
If the number n
of rows and columns in the matrix is specified, the initial diagonal elements are all set to 1.
Pumas.PSDDomain
— Type@param x ∈ PSDDomain(n::Int)
@param x ∈ PSDDomain(; init=Matrix{Float64}(I, n, n))
Specifies a parameter as a symmetric n
-by-n
positive semi-definite matrix. The user must either specify n
as the number of rows and columns in the matrix or init
as the initial value of the parameter. init
should be a positive semi- definite Matrix
of Float64
s.
Pumas.PSISCrossvalidation
— TypePSISCrossvalidation(; split_method, split_by, filter_nans = true, pareto_shape_threshold = 0.7)
Defines an instance of the PSISCrossvalidation
algorithm for crossvalidation. In this algorithm, the fitting is not re-run but the MCMC samples are re-weighted using importance sampling (IS) while leaving out a subset of the data each time. The weights are then smoothed using a generalized Pareto distribution. The smoothed weights are then used to estimate the predictive likelihood of the out-of-sample data. This algorithm is known as the Pareto-smoothed importance sampling, leave-one-out (PSIS-LOO) crossvalidation method. In Pumas
, we generalize the approach to handle any training-validation splitting of the data.
The shape parameters of the Pareto distribution fitted to the importance weights provide a diagnostic for the reliability of estimates of the predictive likelihoods. This happens typically when the samples are too sensitive to the a certain data point or subset removed and used for validation. In some cases NaN
s can even result. The filter_nans
set to true
by default can be used to ignore the cases where the smoothing fails eliminating this run from the cross-validation result. Additionally, any run leading to a shape parameter higher than pareto_shape_threshold
(default is 0.7) will also be removed from the cross-validation result. So if there are 100 different training-validation splittings and 9 of those lead to NaN
or a shape parameter higher than pareto_shape_threshold
, only the remaining 91 will be used for estimating the predictive loglikelihoods.
The way by which the data is split between training and validation sets is determined using the keyword arguments split_method
and split_by
. The split_method
argument can be of any of the following types:
LeaveK
for leave-K-out cross-validationKFold
for K-fold cross-validationLeaveFutureK
for leaving K future points at a time
while split_by
can be of any of the following types:
BySubject
for leaving out subjectsByObservation
for leaving out individual observations per subject
Please refer to the documentation of each of the above types for the parameters of each split method, e.g. typing ?LeaveK
or ?BySubject
.
Examples:
Assume there are 5 subjects and 10 observations per subject and that ft
is the result of the fit
function. The following are some of the combinations in which the above inputs can be used:
- Leave-1-observation-out cross-validation, leaving 1 observation for all the subjects at a time.
allsubjects = true
means that the same observation index is removed for all the subjects, e.g. the 10th observation for all the subjects is used for validation in the first run, then the 9th observation is used for validation in the second run, etc.
split_method = LeaveK(K = 1)
split_by = ByObservation(allsubjects = true)
cv_method = PSISCrossvalidation(;
split_method = split_method, split_by = split_by,
filter_nans = true, pareto_shape_threshold = 0.7,
)
r = crossvalidate(ft, cv_method)
- Leave-2-future-observations-out cross-validation, leaving 2 future observations per subject such that no less than 4 observations are used in training.
allsubjects = false
means that only the data of one subject at a time will be split. So in the first run, only observations 1 to 4 of subject 1 are used for training and observations 5 and 6 of subject 1 are used for validation. In the second run, observations 1 to 6 of subject 1 are used for training and observations 7 and 8 are used for validation. In the third run, observations 1 to 8 of subject 1 get used for training and observations 9 and 10 are used for validation. For all 3 runs, all the observations of subjects 2 to 5 are used in training. Then in the fourth to sixth runs, subject 2's data gets split. In the seventh to ninth runs, subject 3's data gets split, etc.
split_method = LeaveFutureK(K = 2, minimum = 4)
split_by = ByObservation(allsubjects = false)
cv_method = PSISCrossvalidation(;
split_method = split_method, split_by = split_by,
filter_nans = true, pareto_shape_threshold = 0.7,
)
r = crossvalidate(ft, cv_method)
- Leave-1-subject-out cross-validation, marginalizing the subject-specific parameters using quadrature integration when computing predictive likelihoods. Using this method, subjects {1, 2, 3, 4} are used for training in the first run and subject {5} is used for validation. In the second run, subjects {1, 2, 3, 5} are used for training and subject {4} is used for validation, etc. The predictive likelihoods computed are the marginal likelihood of the subject computed using Gaussian quadrature with a maximum of 1000 integration points. Refer to the documentation of
LLQuad
by typing?LLQuad
for more details on the parameters of the quadrature algorithm.FO()
,FOCE()
andLaplaceI()
can also be used instead.
split_method = LeaveK(K = 1)
split_by = BySubject(marginal = LLQuad(imaxiters=1000))
cv_method = PSISCrossvalidation(;
split_method = split_method, split_by = split_by,
filter_nans = true, pareto_shape_threshold = 0.7,
)
r = crossvalidate(ft, cv_method)
- Leave-1-future-subject-out cross-validation, where the predictive likelihoods are conditional likelihoods computed using
0
s for the subject-specific parameters instead of marginalizing. The minimum number of subjects is 3 so there are only 2 runs. In the first run, the training subjects are {1, 2, 3} and the validation subject is {4}. In the second run, the training subjects are {1, 2, 3, 4} and the validation subject is {5}.
split_method = LeaveFutureK(K = 1, minimum = 3)
split_by = BySubject(marginal = nothing)
cv_method = PSISCrossvalidation(;
split_method = split_method, split_by = split_by,
filter_nans = true, pareto_shape_threshold = 0.7,
)
r = crossvalidate(ft, cv_method)
Pumas.ParamSet
— TypeParamSet(params::NamedTuple)
Construct a Pumas.jl parameter set.
params
should be a NamedTuple
that maps a parameter name to a Domain
or a Distribution
.
Example
ParamSet((;
tvKa = RealDomain(lower = 0.0),
tvCL = RealDomain(lower = 0.0),
tvω = PDiagDomain(2),
))
Pumas.Population
— TypeA Population
is an AbstractVector
of Subject
s.
Pumas.RealDomain
— Type@param x ∈ RealDomain(;lower=-∞,upper=∞,init=0)
Specifies a parameter as a real value. lower
and upper
are the respective bounds, init
sets the initial value of the parameter and is used for init_params
.
Pumas.SAEM
— TypeSAEM(; iters::Tuple{Int,Int,Int} = (200,100,100), α::Float64 = 0.7, λ::Float64 = 0.975, repeat::Int = 1, verbosity::Int = 0)
iters
: A tuple indicating the number of iterations to run in the exploratory, middle, and smoothing phases of SAEM.α
: Thelearning rate
for the middle phase. On thek
th update,α^k
weight is placed on thek
th iteration, and1 - α^k
on the previous.λ
: Maximum standard deviation change rate. A value of0.95
means thatω_new = clamp(ω_new, ω_old * 0.95, ω_old / 0.95)
on each iteration. Slowing the decliine inω
s andσ
s increases exploration, while limiting the growth prevents excessive flattening of the log likelihood.repeat
: Indicates how many times to repeatiters
. Defaults to1
.verbosity
: How verbose should it be?0
: silent,1
: print on eachrepeat
,2
: print once per iteration.
Pumas.SIR
— MethodSIR(; rng=Random.default_rng, samples, resamples)
Constructs an SIR object that allows the user to use sampling importance re-sampling (SIR) with the supplies settings in infer
function calls. See ?infer
and https://docs.pumas.ai/stable/analysis/inference/ for more information about obtaining inference results.
The keyword arguments are as follows:
- rng: used to select a pseudo-random number generator to beused within the SIR procedure. Requires loading Random package (using Random)
- samples: the number of model parameter samples simulated from a truncated multivariate normal distribution (proposal distribution) based on the maximum-likelihood estimates and robust variance-covariance matrix. This constitutes the original sample (uncertainty) distribution.
- resamples: the number of model parameter samples from the original sample distribution to re-sample (by weighed re-sampling without replacement). The number of resamples must be less than that of the original sample. An original sample-to-resample ratio of at least 5:1 is suggested.
Both samples and resamples must be supplied, as there are no defaults.
Note that the proposal distribution for the SIR procedure is based on the robust variance-covariance matrix and will therefore fail whenever the computation of the robust variance-covariance matrix fails. Confidence intervals based on SIR respect bounds on the parameters (e.g. the confidence interval for a strictly positive parameter cannot contain negative values).
Pumas.Subject
— TypeSubject
The data corresponding to a single subject:
Fields:
id
: identifierobservations
: a named tuple of the dependent variablescovariates
: a named tuple containing the covariates, ornothing
.events
: aDosageRegimen
, ornothing
.time
: a vector of time stamps for the observations
When there are time varying covariates, each covariate is interpolated with a common covariate time support. The interpolated values are then used to build a multi-valued interpolant for the complete time support.
From the multi-valued interpolant, certain discontinuities are flagged in order to use that information for the differential equation solvers and to correctly apply the analytical solution per region as applicable.
Constructor
Subject(;id = "1",
observations = NamedTuple(),
events = nothing,
time = observations isa AbstractDataFrame ? observations.time : nothing,
covariates::Union{Nothing, NamedTuple} = nothing,
covariates_time = observations isa AbstractDataFrame ? observations.time : nothing,
covariates_direction = :left)
Subject
may be constructed from an <:AbstractDataFrame
with the appropriate schema or by providing the arguments directly through separate DataFrames
/ structures.
Examples:
julia> Subject()
Subject
ID: 1
julia> Subject(
id = 20,
events = DosageRegimen(200, ii = 24, addl = 2),
covariates = (WT = 14.2, HT = 5.2),
)
Subject
ID: 20
Events: 3
Covariates: WT, HT
julia> Subject(covariates = (WT = [14.2, 14.7], HT = fill(5.2, 2)), covariates_time = [0, 3])
Subject
ID: 1
Covariates: WT, HT
Pumas.Subject
— MethodSubject
Constructor
Subject(simsubject::SimulatedObservations)
Roundtrip the result of simobs
, i.e. SimulatedObservations to a Subject
/Population
Example:
sims = simobs(model, pop, params)
To convert sims
to a Population
, broadcast as below
Subject.(sims)
Pumas.Subject
— MethodSubject(subject::Subject; id::AbstractString, events)
Construct a Subject
from an existing subject
while replacing information according to the keyword arguments. The possible keyword arguments are
id
, sets the subject identifier to the stringid
. Defaults to the identifier already insubject
.events
, sets the subject events to the inputDosageRegimen
. Defaults to the events already present insubject
.covariates
andcovariates_times
, used as input to add covariate information
Examples:
julia> s1 = Subject(; id = "AKJ491", events = DosageRegimen(1.0; time = 1.0))
Subject
ID: AKJ491
Events: 1
julia> Subject(s1; id = "AKJ492")
Subject
ID: AKJ492
Events: 1
Pumas.TimeToEvent
— TypeTimeToEvent{T}
Distribution-like struct to store the hazard and the cumulative hazard in a time-to-event model. The dependent variable in a model that uses TimeToEvent
should be a censoring variable that is one when the variable is observed (not censored) zero when the variable is right censored. Currently, no other censoring types are supported.
Example
...
@pre begin
θeff = θ*DOSE
λ = λ₀*exp(θeff)
end
@dynamics begin
Λ' = λ
end
@derived begin
DV ~ @. TimeToEvent(λ, Λ)
end
...
Pumas.VectorDomain
— Type@param x ∈ VectorDomain(n::Int; lower=-∞,upper=∞,init=0)
Specifies a parameter as a real vector of length n
. lower
and upper
are the respective bounds, init
sets the initial value of the parameter and is used for init_params
. The keyword arguments can all take either numbers or vectors. If numbers then the same value will be applied to all n
vector elements. If a you specify a vector (of length n
) then you can adjust the parameters of the VectorDomain
elements individually.
Pumas.ZeroSumDomain
— TypeZeroSumDomain(n; init = zeros(n))
ZeroSumDomain(init)
ZeroSumDomain(; init)
Return a zero-sum domain of dimension n
. Any n-dimensional vector with a sum of 0 is in this domain. Instead of passing n
, an initial value init
can be passed instead either as a positional or keyword argument.
Base.isvalid
— Methodisvalid(s::SimulatedObservations) -> Bool
Check if ODE solution of a simulation was successful.
Examples
mdl = @model begin
@options begin
checklinear = false # Force numerical integration of ODE
end
@param begin
θkₑ ∈ RealDomain(lower = 0.0)
end
@pre begin
kₑ = θkₑ
cᵥ = 0.1
V = 1.0
end
@dynamics begin
Central' = -kₑ * Central
end
@derived begin
μ = Central ./ V
y ~ @. Gamma(1 / cᵥ^2, abs(μ) * cᵥ^2)
end
end
sim = simobs(
mdl,
Subject(events = DosageRegimen(100), time = range(1.0, 40.0, length = 12)),
(; θkₑ = 1e20),
)
julia> isvalid(sim)
false
Base.truncate
— MethodPumas.truncate(b::BayesMCMCResults; burnin = 0, ratio = 1.0)
Removes burnin
samples from the beginning of each chain and then keeps only ratio
* 100% of the remaining samples in each chain (i.e. thinning).
CommonSolve.solve
— Functionsol = solve(
model::AbstractPumasModel,
population::Union{Population, Subject},
param::NamedTuple,
randeffs::NamedTuple=sample_randeffs(rng, model, param, population);
ensemblealg=EnsembleSerial(),
diffeq_options=NamedTuple(),
rng=default_rng()
)
Solve the model
applied to the Subject
(s) within population
using parameters param
and random effects randeffs
. By default, the times at which to save the solution are taken to be the observation times for each Subject
within population
. This can be overridden by supplying a vector or range as saveat
to diffeq_options
- e.g. diffeq_options=(; saveat=[0., 0.5, 1.])
.
Arguments
model
may either be aPumasModel
or aPumasEMModel
.population
may either be aPopulation
ofSubject
s or a singleSubject
.param
is parameter set in the form of aNamedTuple
, e.g.(; tvCL=1., tvKa=2., tvV=1.)
.randeffs
is an optional argument that, if used, should specify the random effects for each subject inpopulation
. Such random effects are specified byNamedTuple
s forPumasModels
(e.g.(; tvCL=1., tvKa=2.)
) and byTuples
forPumasEMModel
s (e.g.(1., 2.)
). Ifpopulation
is a singleSubject
(without being enclosed in a vector) then a single such random effect specifier should be passed. If, however,population
is aPopulation
of multipleSubject
s thenrandeffs
should be a vector containing one such specifier for eachSubject
. The functionsinit_randeffs
,zero_randeffs
, andsample_randeffs
are sometimes convenient for generatingrandeffs
:
randeffs = zero_randeffs(model, param, population)
solve(model, population, param, randeffs)
If no randeffs
is provided, then random ones are generated according to the distribution in the model.
ensemblealg
is a keyword argument that allows you to choose between different modes of parallelization. Options includeEnsembleSerial()
,EnsembleThreads()
andEnsembleDistributed()
.diffeq_options
is a keyword argument that takes aNamedTuple
of options to pass on to the differential equation solver.steadystate_options
is a keyword argument that takes aNamedTuple
of options to pass on to the steady state solver.rng
is a keyword argument where you can specify which random number generator to use.
Distributions.logpdf
— Methodlogpdf(d::Pumas.Constrained, x)
Evaluate the logarithm of the probability density of the constrained distribution, d
, at x
.
- If
d
is a constrained univariate distribution thenx
should be a scalar. - If
d
is a constrained multivariate distribution thenx
should be a vector.
Evaluations of x
outside of d
's constraints returns -Inf
.
Note that d
itself is not a true distribution since its probability mass is not rescaled to 1.
GlobalSensitivity.gsa
— Functionfunction GlobalSensitivity.gsa(
model::PumasModel,
population::Union{Population,Subject},
params::NamedTuple,
method::GlobalSensitivity.GSAMethod,
vars = [:dv],
p_range_low::NamedTuple = NamedTuple(),
p_range_high::NamedTuple = NamedTuple();
constantcoef::Tuple = (),
rng::AbstractRNG = Random.default_rng(),
diffeq_options::NamedTuple = NamedTuple(),
ensemblealg = EnsembleSerial(),
samples = 1000,
kwargs...,
)
Perform a global sensitivity analysis (GSA) of the population mean of the derived variables vars
on the parameters of the Pumas model
.
Arguments
model
: aPumasModel
.population
: aPopulation
ofSubject
s or a singleSubject
.params
: a named tuple of parameters.method
: one of the GSA methods from GlobalSensitivity.jl such asSobol()
,Morris()
,eFAST()
, orRegressionGSA()
.vars
: a list of the observed variables (defined in the@observed
block) to run GSA on.p_range_low
&p_range_high
: the lower and upper bounds for the parameters you want to run the GSA on. For parameters without lower or upper bounds inp_range_low
orp_range_high
, respectively, bound(s) in the@param
block are used. If bounds for a parameter are neither specified as arguments nor in the@param
block, an error is thrown.
Keyword arguments
constantcoef
: A tuple of the names of the parameters that are not to be varied in the GSA.rng
: The random number generator to use.samples
: The number of samples to use for the GSA.diffeq_options
: The options of the differential equation solver, passed on tosimobs
when computing the derived variables of interest.ensemblealg
: The mode of parallelization of thesimobs
calls when computing the derived variables of interest.kwargs...
: The remaining keyword arguments are passed on toGlobalSensitivity.gsa
, as documented in GlobalSensitivity.jl. In particular, evaluation of GSA samples can be parallelized by settingbatch = true
(default:batch = false
).
References
For more details regarding the GSA methods, their options, and keyword arguments please consult the GlobalSensitivity.jl documentation.
LinearAlgebra.cond
— Methodcond(pmi::FittedPumasModelInference; correlation = true)
Return the condition number of the correlation matrix (or variance-covariance matrix with correlation = false) stored in pmi
. Throws an error if pmi
is the result of a call to infer
with Pumas.Bootstrap
or if the variance-covariance calculation failed.
MCMCDiagnosticTools.gewekediag
— Methodgewekediag(b::BayesMCMCResults; subject::Union{Nothing,Int} = nothing, first::Real = 0.1, last::Real = 0.5)
Computes the Geweke diagnostic [Geweke1991] for each chain outputting a p-value per parameter. If subject
is nothing
(default), the chains diagnosed are those of the population parameters. If subject
is set to an integer index, the chains diagnosed are those of the subject-specific parameters corresponding to the subject with the input index.
The Geweke diagnostic compares the sample means of two disjoint sub-chains X₁
and X₂
of the entire chain using a normal difference of means hypothesis test where the null and alternative hypotheses are defined as:
H₀: μ₁ = μ₂
H₁: μ₁ ≂̸ μ₂
whereμ₁
andμ₂
are the population means. The first sub-chainX₁
is taken as the first(first * 100)
% of the samples in the chain, wherefirst
is a keyword argument defaulting to0.1
. The second sub-chainX₂
is taken as the last(last * 100)
% of the samples in the chain, wherelast
is a keyword argument defaulting to0.5
.
The test statistic used is: z₀ = x̅₁ - x̅₂ / √(s₁² + s₂²)
where x̅₁
and x̅₂
are the sample means of X₁
and X₂
respectively, and s₁
and s₂
are the Markov Chain standard error (MCSE) estimates of X₁
and X₂
respectively. Auto-correlation is assumed within the samples of each individual sub-chain, but the samples in X₁
are assumed to be independent of the samples in X₂
. The p-value output is an estimate of P(|z| > |z₀|)
, where z
is a standard normally distributed random variable.
Low p-values indicate one of the following:
- The first and last parts of the chain are sampled from distributions with different means, i.e. non-convergence,
- The need to discard some initial samples as burn-in, or
- The need to run the sampling for longer due to lack of samples or high auto-correlation.
High p-values indicate the inability to conclude that the means of the first and last parts of the chain are different with statistical significance. However, this alone does not guarantee convergence to a fixed posterior distribution because:
- Either the standard deviations or higher moments of
X₁
andX₂
may be different, or - The independence assumption between
X₁
andX₂
may not be satisfied when high auto-correlation exists.
MCMCDiagnosticTools.heideldiag
— Methodheideldiag(b::BayesMCMCResults; subject::Union{Nothing,Int} = nothing, alpha::Real = 0.05, eps::Real = 0.1, start::Integer = 1)
Compute the Heidelberger and Welch diagnostic [Heidelberger1983] for each chain. If subject
is nothing
(default), the chains diagnosed are those of the population parameters. If subject
is set to an integer index, the chains diagnosed are those of the subject-specific parameters corresponding to the subject with the input index.
The Heidelberger diagnostic attempts to:
- Identify a cutoff point for the initial transient phase for each parameter, after which the samples can be assumed to come from a steady-state posterior distribution. The initial transient phase can be removed as burn-in. The cutoff point for each parameter is given in the
burnin
column of the output dataframe. - Estimate the relative confidence interval (confidence interval divided by the mean) for the mean of the steady-state posterior distribution of each parameter, assuming such steady-state distribution exists in the samples. A large confidence interval implies either the lack of convergence to a stationary distribution or lack of samples. Half the relative confidence interval is given in the
halfwidth
column of the output dataframe. Thetest
column will betrue
(1) if thehalfwidth
is less than the input targeteps
(default is 0.1) andfalse
(0) otherwise. Note that parameters with mean value close to 0 can have erroneous relative confidence intervals because of the division by the mean. Thetest
value can therefore be expected to befalse
(0) for those parameters without concluding lack of convergence. - Quantify the extent to which the distribution of the samples is stationary using statistical testing. The returned p-value, shown in the
pvalue
column of the output dataframe, can be considered a measure of mean stationarity. A p-value lower than the input thresholdalpha
(default is 0.05) implies lack of stationarity of the mean, i.e. the posterior samples did not converge to a steady-state distribution with a fixed mean.
The Heidelberger diagnostic only tests for the mean of the distribution. Therefore, it can only be used to detect lack of convergence and not to prove convergence. In other words, even if all the numbers seem normal, one cannot conclude that the chain converged to a stationary distribution or that it converged to the true posterior.
MCMCDiagnosticTools.rafterydiag
— Methodrafterydiag(b::BayesMCMCResults; subject::Union{Nothing,Int} = nothing, q = 0.025, r = 0.005, s = 0.95, eps = 0.001)
Compute the Raftery and Lewis diagnostic [Raftery1992]. This diagnostic is used to determine the number of iterations required to estimate a specified quantile q
within a desired degree of accuracy. The diagnostic is designed to determine the number of autocorrelated samples required to estimate a specified quantile $\theta_q$, such that $\Pr(\theta \le \theta_q) = q$, within a desired degree of accuracy. In particular, if $\hat{\theta}_q$ is the estimand and $\Pr(\theta \le \hat{\theta}_q) = \hat{P}_q$ the estimated cumulative probability, then accuracy is specified in terms of r
and s
, where $\Pr(q - r < \hat{P}_q < q + r) = s$. Thinning may be employed in the calculation of the diagnostic to satisfy its underlying assumptions. However, users may not want to apply the same (or any) thinning when estimating posterior summary statistics because doing so results in a loss of information. Accordingly, sample sizes estimated by the diagnostic tend to be conservative (too large).
Furthermore, the argument r
specifies the margin of error for estimated cumulative probabilities and s
the probability for the margin of error. eps
specifies the tolerance within which the probabilities of transitioning from initial to retained iterations are within the equilibrium probabilities for the chain. This argument determines the number of samples to discard as a burn-in sequence and is typically left at its default value.
Pumas.Censored
— MethodCensored(distribution::UnivariateDistribution, lower::Real, upper::Real)
Construct a censored distribution based on distribution
and the censoring limits lower
and upper
.
Pumas._convergencedata
— Method_convergencedata(obj; metakey="x")
Returns the "timeseries" of optimization as a matrix, with series as columns.
This must return parameter data in the same order that _paramnames
returns names.
Pumas._lpdf
— Method_lpdf(d,x)
The log pdf: this differs from Distributions.logdpf
definition in a couple of ways:
- if
d
is a non-distribution it assumes the Dirac distribution. - if
x
isNaN
orMissing
, it returns 0. - if
d
is aNamedTuple
of distributions, andx
is aNamedTuple
of observations, it computes the sum of the observed variables.
Pumas._objectivefunctionvalues
— Method_objectivefunctionvalues(obj)
Returns the objective function values during optimization. Must return a Vector{Number}
.
Pumas._paramnames
— Method_paramnames(obj)
Returns the names of the parameters which convergence is being checked for.
This must return parameter names in the same order that _convergencedata
returns data.
Pumas.add_t
— Methodadd_t(ex, vars)
Append (t)
to all the vars
symbols in the expression ex
.
Pumas.apply_formatting
— Methodapplyformatting(inputex, vars; show_t=true, italicize=true)
Apply formatting to the vars of an input expression before latexification.
Pumas.apply_stat
— Methodapply_stat(stat, v::AbstractVector)
Apply the summarizing function stat
to the contents of the vector in a way that is equivalent to calling stat
on a vector of all the elements that are in the i
th position of each element of v
. For example, this can be used to create a NamedTuple
of the mean across a vector of NamedTuple
s such that the key k
in the output is the mean of all the values belonging to the key k
in each element of the vector v
.
Example
julia> v = [(a = 3, b = 4), (a = 1, b = 9), (a = 6, b = 0)]
3-element Vector{NamedTuple{(:a, :b), Tuple{Int64, Int64}}}:
(a = 3, b = 4)
(a = 1, b = 9)
(a = 6, b = 0)
julia> apply_stat(mean, v)
(a = 3.3333333333333335,
b = 4.333333333333333,)
julia> apply_stat(var, v)
(a = 6.333333333333334,
b = 20.333333333333336,)
julia> apply_stat(std, v)
(a = 2.5166114784235836,
b = 4.509249752822894,)
Pumas.build_model_str
— Methodbuild_model_str(pk, pd, obs, dcp)
Generate a string that fully specifies an @model
macro call.
- pk - a pharmacokinetic model, for example
Depots1Central1()
- pd::Union{Nothing, AbstractPDModule} - a pharmacodynamic model, for example
IDR(; direct=true, mode=:stimulation_emax)
- obs::AbstractObservationModule - an observational error model, for example
ObsCombined(:MyOutputName)
- dcp::DoseControl - a dose control specification, for example
DoseControl(; bioav=true)
Pumas.censored_latent
— Methodcensored_latent(
distribution::UnivariateDistribution,
lower::Union{Real,Nothing},
upper::Union{Real,Nothing},
)
censored_latent(
distribution::UnivariateDistribution;
lower::Union{Real,Nothing} = nothing,
upper::Union{Real,Nothing} = nothing,
)
Construct a censored distribution based on distribution
and the censoring limits lower
and upper
. When fitting a model with a censored_latent
dependent variable, the log probability of the censored distribution is used in the likelihood calculation. However, when calculating the mean or other summary statistics, the underlying uncensored distribution is used.
Pumas.conditional_nll
— Methodconditional_nll(m::AbstractPumasModel, subject::Subject, param, randeffs; diffeq_options)
Compute the conditional negative log-likelihood of model m
for subject
with parameters param
and random effects randeffs
. diffeq_options
is a NamedTuple
of options passed to the ODE solver. Requires that the derived produces distributions.
Pumas.cor2cov
— Method`cor2cov(ρ::AbstractVector, ω::AbstractVector)`
`cor2cov(ω::AbstractMatrix)`
`cor2cov(ω::AbstractVector)`
Converts correlations and standard deviations to a covariance matrix. There are 3 ways to use the cor2cov
function.
The first method has 2 inputs:
ρ
: a vector of correlations which is a vector of the strict lower triangular elements of the correlation matrix of the random variables.ω
: a vector of standard deviations of the random variables Example:
ω11, ρ12, ω22 = 0.2, 0.1, 0.3
ρ = [ρ12]
ω = [ω11, ω22]
Pumas.cor2cov(ρ, ω)
gives the covariance matrix:
[ ω11^2 ρ12*ω11*ω22;
ρ12*ω11*ω22 ω22^2 ]
which is equal to:
2×2 Matrix{Float64}:
0.04 0.006
0.006 0.09
The second method has a single input matrix ω
which is a symmetric matrix whose diagonal is the standard deviations of the random variables and off-diagonals are their correlations. Example:
ω = [
ω11 ρ12
ρ12 ω22
]
Pumas.cor2cov(ω)
gives the covariance matrix:
[
ω11^2 ρ12*ω11*ω22
ρ12*ω11*ω22 ω22^2
]
which is equal to:
2×2 Matrix{Float64}:
0.04 0.006
0.006 0.09
The third method has a single input vector ω
which is a vector of the lower triangular elements of the symmetric matrix whose diagonal is the standard deviations of the random variables and off-diagonals are their correlations.
Example:
ω = [ω11, ρ12, ω22]
Pumas.cor2cov(ω)
gives the covariance matrix:
[
ω11^2 ρ12*ω11*ω22
ρ12*ω11*ω22 ω22^2
]
which is equal to:
2×2 Matrix{Float64}:
0.04 0.006
0.006 0.09
Pumas.correlation_diagnostic
— Methodcorrelation_diagnostic(
pmi::FittedPumasModelInference;
high_cor_threshold = 0.7,
medium_cor_threshold = high_cor_threshold / 2,
)
A function which accepts the output of the infer
function as its argument and prints the lists of parameter pairs with high, medium or low correlations. Parameter pairs with high correlation are those with a correlation higher than high_cor_threshold
. Parameter pairs with a low correlation are those with a correlation lower than medium_cor_threshold
. The remaining parameter pairs have a medium correlation. Assume fpm
is the output of fit
. The following is an example of how to use correlation_diagnostic
:
fpmi = infer(fpm)
correlation_diagnostic(fpmi)
Pumas.covariate_select
— Methodfunction covariate_select(
model::AbstractPumasModel,
population::Population,
param::NamedTuple,
approx::LikelihoodApproximation;
control_param::Union{Tuple,Nothing},
criterion = aic,
method::CovariateSelection.T = CovariateSelection.Forward,
fit_options::NamedTuple = (;),
verbose::Bool = false,
) -> CovariateSelectionResults
Use a covariate selection procedure to find the best fit of model
to population
with starting values param
and estimation method/likelihood approximation approx
according to criterion
by restricting parameters to zero for subsets of control_param
traversed according to method
.
The model
should be written such that it includes all relevant covariates, possibly included with multiple different functional forms, such that that restricting on a parameter listed in control_param
to zero eliminates the effect of the covariate (see example below).
Currently, the two methods CovariateSelection.Forward
(the default) and CovariateSelection.Backward
are supported. The method CovariateSelection.Forward
starts from the models where all parameters in control_param
are restricted to zero. In the next step, a model is fit where all parameters in control_param
but one are restricted to zero. If none of these models attain a smaller value of criterion
then the model that restricts all parameters to zero is chosen. Otherwise, the model with the lowest criterion
is chosen and the procedure continues by fitting all models with two estimated parameters where one of the estimated parameters must be the one corresponding to the best fit in the previous step. New steps are taken as long as the criterion is reduced. If the CovariateSelection.Backward
method is chosen then the procedure starts with all parameters in control_param
unrestricted and then one parameter is restricted at a time until no there is no further improvement.
Any callable object that returns a real value when a fitted model is passed can be used as criterion
. It is always assumed that a smaller value is better. The default value is the Akaike Information Criterion aic
but another commonly used criterion the Bayesian Information Criterion bic
which puts a higher penalty on the number of parameters compared to aic
.
To avoid that logging messages from the calls to fit
fills up the output window or default log, all logging messages produced by fit
are by default suppressed. This might hide important issues with the model fitting so for it is possible to enable complete logging by passing verbose = true
.
Example
model = @model begin
@param begin
β0 ∈ RealDomain()
β1 ∈ RealDomain()
β2 ∈ RealDomain()
β3 ∈ RealDomain()
β4 ∈ RealDomain()
end
@covariates x1 x2 x3 x4
@pre μ = β0 + β1 * x1 + β2 * x2 + β3 * x3 + β4 * x4
@derived begin
y ~ Normal.(μ, 0.1)
end
end
selection = covariate_select(
model,
population,
init_params(model),
NaivePooled();
control_param = (:β1, :β2, :β3, :β4),
method = CovariateSelection.Forward,
criterion = bic,
)
Pumas.covariates_interpolant
— Methodcovariates_interpolant(
covariates_nt::NamedTuple,
covariates_times::NamedTuple,
id::AbstractString;
[covariates_direction::Symbol]) -> time_grid, interpolant
Creates an interpolation of the elements of covariates_nt
and covariates_times
using the a piecewise constant interpolation scheme. The two NamedTuples should have the same keys. The values of covariates_nt
and covariates_times
should contain the interpolation values and time point respectively. The argument id
is a string indicating the subject ID and the argument is only used for providing useful error messages.
Returns a callable interpolation struct as well as the common time grid for the covariate observations. This is safe for values which are not time-varying and it also allows for mixing subjects with multiple measurements and subjects with a single measurement. The default is left-sided constant interpolation.
Pumas.covariates_interpolant
— Methodcovariates_interpolant(
covariates::Vector{Symbol},
data::AbstractDataFrame,
time::Symbol,
id::AbstractString;
[covariates_direction::Symbol]) -> time_grid, interpolant
Creates an interpolation of the columns of data
listed in the covariates
vector at the time points in the time
column of data
using the a piecewise constant interpolation scheme. The argument id
is a string indicating the subject ID and the argument is only used for providing useful error messages.
Returns a callable interpolation struct as well as the common time grid for the covariate observations. This is safe for values which are not time-varying and it also allows for mixing subjects with multiple measurements and subjects with a single measurement. The default is left-sided constant interpolation.
Pumas.crossvalidate
— Methodcrossvalidate(b::BayesMCMCResults, cv_method::ExactCrossvalidation)
Runs cross-validation by re-running the MCMC sampling while leaving out some data points each time. The first argument b
is the output of the MCMC fit. The choice of how the data is split in each run is specified in the cv_method
argument which is an instance of ExactCrossvalidation
. Please refer to the documentation of ExactCrossvalidation
by typing ?ExactCrossvalidation
for details on the parameters of the method.
Pumas.crossvalidate
— Methodcrossvalidate(b::BayesMCMCResults, cv_method::PSISCrossvalidation)
Performs cross-validation using the Pareto smoothed importance sampling (PSIS) approach by re-weighing the samples while leaving out some data points each time. The first argument b
is the output of the MCMC fit. The choice of how the data is split in each run is specified in the cv_method
argument which is an instance of PSISCrossvalidation
.
Pumas.discard
— Methoddiscard(b::BayesMCMCResults; burnin = 0, ratio = 1.0)
Removes burnin
samples from the beginning of each chain and then keeps only ratio
* 100% of the remaining samples in each chain (i.e. thinning).
Pumas.dosecontrol
— Methoddosecontrol(fpm::AbstractFittedPumasModel)::Vector{ConstantInterpolationStructArray}
Return the dosecontrol parameters from fpm
. The dosecontrol parameters are the variables bioav
, lags
, duration
, rate
if they're defined in the @dosecontrol
block in the @model
macro. The function returns a vector of covariate interpolation objects. Each of them can be evaluated at an arbitrary time point, but the interpolant is piecewise constant between the event times. Note, that in the model solution, subject simulation, and parameter fitting the dose control parameters can be time-varying, but they only affect the dosing according to their value at the beginning of each dose event. This means that the dose control parameters are only relevevant at these time points. Each of the covariate interpolation objects can be converted to a DataFrame
by calling the DataFrame
constructor on each element, or a DataFrame
with all subjects can be obtained by calling the DataFrame
constructor on the vector output by this function. This DataFrame
will only contain the times that correspond to the start of each dose event for the reason discussed above.
Pumas.dosenum
— Methoddosenum(t, events) -> # of doses for each time
Creates an array that matches t
in length which denotes the number of events that have occurred prior to the current point in time. If t
is a scalar, outputs the number of events before that time point. Only dose events (evid 1 and 4) are counted.
Pumas.eiwres
— Methodeiwres(model::PumasModel, subject::Subject, param::NamedTuple, nsim::Integer)
Calculate the Expected Simulation based Individual Weighted Residuals (EIWRES).
Pumas.elpd
— Methodelpd(r::CVResult)
Computes an estimate of the expected log predictive density (ELPD) for an out-of-sample data point/set using the result r
of crossvalidate
. The output is a scalar value estimating the model's mean predictive accuracy on out-of-sample data.
Pumas.em_to_m_params
— Methodem_to_m_params(m::PumasEMModel, params::NamedTuple)
Given that params
is a NamedTuple
of coefficients used as inits for fitting m
or returned by coef(fpm::FittedPumasEMModel)
, then em_to_m_params
will return a NamedTuple
of coefficients usable as inits for fitting or performing diagnostics on a PumasModel
created by calling convert(PumasModel, m::PumasEMModel)
.
Pumas.empirical_bayes
— Methodempirical_bayes(
fpm::FittedPumasModel,
data::Union{Subject, Population};
diffeq_options=NamedTuple()
)
Return empirical bayes estimates for a fitted model, fpm
, when applied to data
.
Pumas.empirical_bayes
— Methodempirical_bayes(fpm::Pumas.FittedPumasModel)
empirical_bayes(fpm::Pumas.FittedPumasEMModel)
empirical_bayes(insp::Pumas.FittedPumasModelInspection)
Return sampled random effects or empirical bayes estimates from a fit or model inspection. If the model was estimated with the FO
likelihood approximation methods the empirical bayes estimates will be obtained using the LaplaceI
approximation. If either FOCE
or LaplaceI
was used the final empirical bayes estimates will be returned. If Pumas.SAEM
was used to fit the empirical bayes estimates are obtained using the LaplaceI
approximation.
Pumas.empirical_bayes
— Methodempirical_bayes(
model::PumasModel,
data::Union{AbstractSubject, Population},
param::NamedTuple,
approx::LikelihoodApproximation;
diffeq_options=NamedTuple()
)
Return sampled random effects or empirical bayes estimates. The empirical bayes estimates will be obtained using the LaplaceI
approximation regardless of the value of the approx
argument except for NaivePooled
where an empty NamedTuple
is returned.
Pumas.empirical_bayes_dist
— Methodempirical_bayes_dist(fpm::AbstractFittedPumasModel)
Estimate the distribution of random effects (typically a Normal approximation of the empirical Bayes posterior) for the subjects in a FittedPumasModel
. The result is returned as a vector of MvNormal
s.
Pumas.epredict
— Methodepredict(fpm::AbstractFittedPumasModel; nsim::Integer, rng::AbstractRNG=default_rng(), ensemblealg::::SciMLBase.EnsembleAlgorithm=EnsembleThreads())
Calculate the expected simulation based population predictions.
The keyword nsim
controls the number of times a subject is simulated and the keyword rng
can be used to provide a custom random number generator. The parallelization can be controlled using the ensemblealg
keyword that defaults to EnsembleThreads()
.
Pumas.eventnum
— Methodeventnum(t, events) -> # of doses for each time
Creates an array that matches t
in length which denotes the number of events that have occurred prior to the current point in time. If t
is a scalar, outputs the number of events before that time point. The following events are counted: 1 (dose), 3 (reset), 4 (reset and dose).
Pumas.expectation
— Methodexpectation(g, ::MonteCarloExpectation, model::PumasModel, subject::Subject, dist::MvNormal; imaxiters = 10000, diffeq_options::NamedTuple = NamedTuple(), rng::AbstractRNG = Pumas.default_rng())
Computes the expected predictions for subject
with respect to the population parameters and the subject-specific random effects using Monte Carlo integration. dist
is a multivariate normal distribution used to sample the population parameters e.g. dist = MvNormal(pmi)
where pmi
is the output of infer
. Sampling is done using rejection sampling to ensure the parameter values are in the parameters' domains. imaxiters
is the number of Monte Carlo samples used. diffeq_options
can be used to customize the options of the differential equation solver used. rng
is the random number generator used during the sampling, which defaults to Pumas.default_rng()
.
Pumas.expectation
— Methodexpectation(g, ::MonteCarloExpectation, model::PumasModel, subject::Subject, param_dists::NamedTuple; imaxiters = 10000, diffeq_options::NamedTuple = NamedTuple(), rng::AbstractRNG = Pumas.default_rng())
Computes the expected predictions for subject
with respect to the population parameters and the subject-specific random effects using Monte Carlo integration. param_dists
should be a named tuple of distributions for the population parameters. imaxiters
is the number of Monte Carlo samples to use. diffeq_options
can be used to customize the options of the differential equation solver used. rng
is the random number generator used in the sampling, which defaults to Pumas.default_rng()
.
Pumas.expectation
— Methodexpectation(g, quant::QuadratureExpectation, model::PumasModel, subject::Subject, param_dists::NamedTuple; ireltol=1e-3, iabstol=1e-3, imaxiters=10000, diffeq_options::NamedTuple = NamedTuple())
Computes the expected predictions for subject
with respect to the population parameters using quadrature methods. param_dists
should be a named tuple of distributions for the population parameters. Currently, random effects are not supported. ireltol
and iabstol
are the tolerances for the integration. The integration algorithm will terminate if the tolerance or imaxiters
is reached, whichever is first. diffeq_options
can be used to customize the options of the differential equation solver.
Pumas.findinfluential
— Methodfindinfluential(m::AbstractPumasModel,
data::AbstractPopulation,
param::NamedTuple,
approx::LikelihoodApproximation;
diffeq_options::NamedTuple = NamedTuple())
findinfluential(fpm::AbstractFittedPumasModel,
diffeq_options::NamedTuple = NamedTuple())
Return a vector of the negative log-likelihood values.
It can be used before fit
ting a model in order to find the subjects that contributes the most the log-likelihood value or to identify subjects for which the evaluation of the log-likelihood fails, e.g. due to incorrect data. The latter will show up as NaN
s.
It can also be used on a fitted model to identify the subjects with maximum contribution to the log-likelihood.
A DataFrame
method is defined on the resulting object for easy viewing and plotting of the results.
fi = findinfluential(model, pop, parameters, FOCE())
fi_df = DataFrame(fi)
If the above results contain NaNs
the id column can be useful in finding problematic data points. To visualize the negative log-likelihood values across subjects a bar plot can be produced as follows
using AlgebraOfGraphics
draw(
data(fi_df) * mapping(:nll) * visual(BarPlot);
axis = (;
xlabel = "id",
ylabel = "negative log-likelihood",
xticks = 1:length(pop),
xtickformat = x -> fi_df.id,
),
)
or a histogram to see the distribution:
using AlgebraOfGraphics
draw(
data(fi_df) *
mapping(:nll) *
histogram() *
visual(BarPlot);
axis = (; xlabel = "negative loglikelihood"),
)
or a density plot if preferred over a histogram:
using AlgebraOfGraphics
draw(
data(fi_df) *
mapping(:nll) *
density() *
visual(BarPlot);
axis = (; xlabel = "negative loglikelihood"),
)
Pumas.get_model_combinations
— Methodget_model_combinations()
Get all valid arguments to build_model_str
or build_model_expr
.
Returns a tuple (pks, pds, obs, dcps)
where pks is a vector of all possible PK specifications, and so on, for the model generation.
Example usage:
for args in Iterators.product(get_model_combinations())
obs = make_obs(args[3], args[1], args[2])
build_model_str(args[1], args[2], obs, args[4])
end
to create all model strings (~15k different ones, 1M lines of code).
Useful for testing and possibly for automatic building, etc.
Pumas.getblock
— Methodgetblock(blockname, model::AbstractPumasModel) -> array of expressions for a PumasModel macro block.
The blockname
should match the name of the blocks used for model creation with the @model
or @emmodel
macro. Only the blocks that were explicitly used in model creation can be used. Valid such blocknames
are :covariates
, :derived
, :dynamics
, :dosecontrol
, :init
, :observations
, :options
, :param
, :post
, :pre
, :random
, :reactions
and :vars
. Only models that are defined using the @model
or @emmodel
macro are supported.
Pumas.icoef
— Functionicoef(model::PumasModel, subject::Subject, param::NamedTuple, approx::LikelihoodApproximation = LaplaceI(); obstimes, diffeq_options::NamedTuple = NamedTuple())::Vector{ConstantInterpolationStructArray}
Return the individual coefficients for the input arguments. The individual coefficients are the variables defined in the @pre
block in the @model
macro. The empirical bayes estimates given the input param
will be calculated if the input subjects have observations. If any subject does not have any observations, the mode of the domain-transformed prior will be used. The function returns a vector of covariate interpolation objects defined on obstimes
. If no obstimes
are given, the time points used to define the covariates will be used. Each of them can be evaluated at any time point, but the interpolant is piecewise constant between the obstimes
. Each of the covariate interpolation objects can be converted to a DataFrame
by calling the DataFrame
constructor on each element, or a DataFrame
with all subjects can be obtained by calling the DataFrame
constructor on the vector output by this function.
Pumas.icoef
— Methodicoef(fpm::AbstractFittedPumasModel; obstimes)::Vector{ConstantInterpolationStructArray}
Return the individual coefficients from fpm
. The individual coefficients are the variables defined in the @pre
block in the @model
macro. The function returns a vector of covariate interpolation objects defined on obstimes
. If no obstimes
are given, the time points used to define the covariates will be used. Each of them can be evaluated at any time point, but the interpolant is piecewise constant between the obstimes
. Each of the covariate interpolation objects can be converted to a DataFrame
by calling the DataFrame
constructor on each element, or a DataFrame
with all subjects can be obtained by calling the DataFrame
constructor on the vector output by this function.
Pumas.icoef
— Methodicoef(model::PumasModel, population::Population, param::NamedTuple; kwargs...)::Vector{ConstantInterpolationStructArray}
Return the individual coefficients for the input arguments. The individual coefficients are the variables defined in the @pre
block in the @model
macro. The empirical bayes estimates given the input param
will be calculated if the input subjects have observations. If any subject does not have any observations, the mode of the domain-transformed prior will be used. The function returns a vector of covariate interpolation objects defined on obstimes
. If no obstimes
are given, the time points used to define the covariates will be used. Each of them can be evaluated at any time point, but the interpolant is piecewise constant between the obstimes
. Each of the covariate interpolation objects can be converted to a DataFrame
by calling the DataFrame
constructor on each element, or a DataFrame
with all subjects can be obtained by calling the DataFrame
constructor on the vector output by this function.
Pumas.infer
— Methodinfer(fpm::FittedPumasModel; level=0.95, rethrow_error::Bool=false, sandwich_estimator::Bool=true) -> FittedPumasModelInference
Compute the vcov
matrix and return a struct used for inference based on the fitted model fpm
. The confidence intervals are calculated as the (1-level)/2
and (1+level)/2
quantiles of the estimated parameters. sandwich_estimator
is a boolean that switches on or off the sandwich estimator. If rethrow_error
is false
(the default value), no error will be thrown if the variance-covariance matrix estimator fails. If it is true
, an error will be thrown if the estimator fails.
Pumas.infer
— Methodinfer(fpm::FittedPumasModel, bts::Pumas.Bootstrap; level=0.95)
Perform bootstrapping by resampling the Subject
s from the Population
stored in fpm
. The confidence intervals are calculated as the (1-level)/2
and (1+level)/2
quantiles of the estimated parameters. The number of samples
used in the bootstrapping is bts.samples
. bts.ensemblealg
specifies the ensemblealg
used here. If ensemblealg
is EnsembleSerial()
, a single thread will be used. If it is EnsembleThreads()
(the default value), multiple threads will be used. See the documentation of Bootstrap for more details on constructing an instance of Bootstrap
.
Pumas.infer
— Methodinfer(fpm::FittedPumasModel, alg::MarginalMCMC; level=0.95)
Infer the uncertainty of the parameters of the fitted model fpm
by sampling from the marginal likelihood with the MCMC algorithm alg
.
The confidence intervals are calculated as the (1-level)/2
and (1+level)/2` quantiles of the sampled parameters.
Pumas.infer
— Methodinfer(fpm::FittedPumasModel, sir::SIR; level=0.95, ensemblealg = EnsembleThreads()) -> FittedPumasModelInference
Perform sampling importance re-sampling for the Subject
s from the Population
stored in fpm
. The confidence intervals are calculated as the (1-level)/2
and (1+level)/2quantiles of the estimated parameters.
ensemblealgcan be
EnsembleThreads()(the default value) to use multi-threading or
EnsembleSerial()` to use a single thread.
Pumas.init_params
— Functioninit_params(model::PumasModel)
Create a parameter set populated with the initial parameters of PumasModel
.
Pumas.init_randeffs
— Functioninit_randeffs(model::AbstractPumasModel, param::NamedTuple, [, pop::Population])
Create an object with random effects locked to their mean values.
The optional argument pop
takes a Population
and changes the output to a vector of such random effects with one element for each subject within the population.
Pumas.inspect
— Methodinspect(fpm::AbstractFittedPumasModel; wres_approx::LikelihoodApproximation, nsim::Int, rng)
Output a summary of the model predictions, residuals, Empirical Bayes estimates, and NPDEs (when requested).
Called on a fit
output and allows the keyword argument wres_approx
for approximation method to be used in residual calculation. The default value is the approximation method used for the marginal likelihood calculation in the fit
that produced fpm
. The keyword nsim
controls the number of times each subject is simulated for npde
computation. A FittedPumasModelInspection
object with pred
, wres
, ebes
, ewres
, and npdes
is output.
Pumas.lrtest
— Methodlrtest(fpm_0::AbstractFittedPumasModel, fpm_A::AbstractFittedPumasModel)::LikelihoodRatioTest
Compute the likelihood ratio test statistic of the null hypothesis defined by fpm_0
against the the alternative hypothesis defined by fpm_A
. The pvalue
function can be used for extracting the p-value based on the asymptotic Χ²(k)
distribution of the test statistic.
Pumas.lrtest
— Methodlrtest(loglikelihood_0::Real, loglikelihood_A::Real, dof_0::Integer, dof_A::Integer)::LikelihoodRatioTest
Compute the likelihood ratio test using number.
Pumas.marginal_nll
— Functionmarginal_nll(model, subject, param, approx, ...)
marginal_nll(model, population, param, approx, ...)
Compute the marginal negative loglikelihood of a subject or dataset, using the integral approximation approx
.
See also loglikelihood
.
Pumas.npde
— Methodnpde(fpm::AbstractFittedPumasModel; nsim::Integer)
Calculate the normalised prediction distribution errors (NPDE).
The keyword nsim
controls the number of times each subject is simulated.
Pumas.penalized_conditional_nll
— Methodpenalized_conditional_nll(m::AbstractPumasModel, subject::Subject, param::NamedTuple, randeffs::NamedTuple; diffeq_options)
Compute the penalized conditional negative log-likelihood. This is the same as conditional_nll
, except that it incorporates the penalty from the prior distribution of the random effects. This penalty is equal up to a constant to the log-probability-density of the given randeffs
under the prior distribution.
Here param
can be either a NamedTuple
or a vector (representing a transformation of the random effects to Cartesian space).
Pumas.postprocess
— Methodpostprocess(obs::SimulatedObservations)
postprocess(func::Function, obs::SimulatedObservations)
postprocess(obs::SimulatedPopulation; stat = identity)
postprocess(func::Function, obs::SimulatedPopulation; stat = identity)
Post-processes the output of simobs
. The output of a simobs
call stores the simulated observations but also all the intermediate values computed such as the parameters used, individual coefficients, dose control parameters, covariates, differential equation solution, etc. There are a number of post-processing operations you can do on the simulation output to compute various queries and summary statistics based on the simulations.
The postprocess
function is a powerful tool that allows you to make various queries using the simulation results. There are multiple ways to use the postprocess
function. The first way to use the postprocess
function is to extract all the information stored in the simulation result in the form of a vector of named tuples. Each named tuple has all the intermediate values evaluated when simulating 1 run. Let sims
be the output of any simobs
operation.
generated = Pumas.postprocess(sims)
generated
is a vector of simulation runs. Hence, generated[i]
is the result from the i
th simulation run. Time-dependent variables are stored as a vector with 1 element for each time point where there is an observation. Alternatively, if obstimes
was set instead when calling simobs
, the time-dependent variables will be evaluated at the time points in obstimes
instead.
The second way to use postprocess
is by passing in a post-processing function. The post-processing function can be used to:
- Transform the simulated quantities, or
- Compare the simulated quantities to the observations.
We use the do
syntax here which is short for passing in a function as the first argument to postprocess
. The do
syntax to pass in a post-processing function is:
Pumas.postprocess(sims) do gen, obs
...
end
where gen
is the named tuple of all generated quantities from 1 simulation run and obs
is the named tuple of observations. For instance to query the ratio of simulated observations dv
that are higher than the observed quantity dv
at the respective time points, you can use:
Pumas.postprocess(sims) do gen, obs
sum(gen.dv .> obs.dv) / length(gen.dv)
end
gen.dv
is the simulated vector of dv
whose length is the same as the number of observation time points. obs.dv
is the observation vector dv
. gen.dv .> obs.dv
returns a vector of true
/false
, with one element for each time point. The sum of this vector gives the number of time points where the simulation was higher than the observation. Dividing by the number of time points gives the ratio. When using postprocess
in this way, the output is always a vector of the query results. In the query function body, you can choose to use only gen
or only obs
but the header must always have both gen
and obs
:
Pumas.postprocess(sims) do gen, obs
...
end
The third way to use the postprocess
function is to compute summary statistics of the simulated quantities or of functions of the simulated quantities. Summary statistics can be computed by passing a statistic function as the stat
keyword argument. For example in order to estimate the probability that a simulated value is higher than an observation, you can use:
Pumas.postprocess(sims, stat = mean) do gen, obs
gen.dv .> obs.dv
end
This function will do 2 things:
- Concatenate the query results (e.g.
gen.dv .> obs.dv
) from all the simulation runs into a single vector. - Compute the mean value of the combined vector.
Alternatively, you can use the mean
function to do the same thing without using the keyword argument. The following will call the postprocess
function under the hood.
mean(sims) do gen, obs
gen.dv .> obs.dv
end
The result of this operation will be a scalar equal to the mean value of the concatenated vector of queries.
In order to get the probability that the simulated quantity is higher than the observation for each time point, you can call the mean
function externally as such:
generated = Pumas.postprocess(sims) do gen, obs
gen.dv .> obs.dv
end
mean(generated)
This returns a vector of probabilities of the same length as the number of time points without concatenating all the queries together.
To compute a summary statistic of all the generated quantity, you can also call:
Pumas.postprocess(sims, stat = mean)
without specifying a post-processing function or for short:
mean(sims)
Besides mean
, you can also use any of the following summary statistic functions in the same way:
std
for element-wise standard deviationvar
for element-wise variancecor
for correlation between multiple quantitiescov
for covariance between multiple quantities
These functions can be passed in as the stat
keyword argument to postprocess
or they can be used in the short form, e.g.:
generated = Pumas.postprocess(sims, stat = std) do gen, obs
...
end
std(sims) do gen, obs
...
end
std(sims)
generated = Pumas.postprocess(sims, stat = var) do gen, obs
...
end
var(sims) do gen, obs
...
end
var(sims)
The cor
and cov
statistics are unique in that they require a post-processing function that outputs a vector. For example, to estimate the correlation between the CL
and Vc
parameters in a 1-compartment model, you can use any of the following:
Pumas.postprocess(s, stat = cor) do gen, obs
[gen.CL[1], gen.Vc[1]]
end
cor(s) do gen, obs
[gen.CL[1], gen.Vc[1]]
end
Note that gen.CL
is a vector of simulated CL
values for all the time points. But since the value is constant across time, we can use the first element gen.CL[1]
only. cov
can be used instead of cor
to compute the covariance matrix. The output of this operation is either a correlation or a covariance matrix.
Pumas.probstable
— Methodprobstable(fpm::FittedPumasModel)
Return a DataFrame with outcome probabilities of all discrete dependent variables.
Pumas.pvalue
— Methodpvalue(t::LikelihoodRatioTest)::Real
Compute the p-value of the likelihood ratio test t
based on the asymptotic Χ²(k)
distribution of the test statistic.
Pumas.qradd!
— Methodqradd!(Q::AbstractMatrix, R::UpperTriangular, v::AbstractVector)
Replace the right-most column of F = Q * R
with v
by updating Q
and R
, and return the updated entry R[end, end]
indicating the norm of the projection of the new column to the new orthogonal vector Q[:, end]
.
If v
is mutable, this implementation modifies vector v
as well. Only Q
and R
are valid on exit.
Pumas.qrdelete!
— Methodqrdelete!(Q, R)
Delete the left-most column of F = Q * R
by updating Q
and R
.
Only Q[:, begin:(end-1)]
and R[begin:(end-1), begin:(end-1)]
are valid on exit.
Pumas.read_pumas
— Methodread_pumas(filepath::AbstractString; missingstring = ["", ".", "NA"], kwargs...)
read_pumas(df::AbstractDataFrame; kwargs...)
Import NMTRAN-formatted data. You can either pass a CSV file path or a data frame as the first and only positional argument.
Keyword Arguments
observations
(default:[:dv]
): a vector of column names of dependent variables.covariates
(default:Symbol[]
): a vector of column names of covariates.id::Symbol
(default::id
): the name of the column with the IDs of the individuals. Each individual should have a unique integer or string.time::Symbol
(default::time
): the name of the column with the time corresponding to the row. Time should be unique per ID, i.e., there should be no duplicate time values for a given subject.evid::Union{Symbol,Nothing}
(default:nothing
): the name of the column with event IDs, ornothing
. Possible event IDs are:0
: observation1
: dose event2
: other type event3
: reset event (the amounts in each compartment are reset to zero and the on/off status of each compartment is reset to its initial status)4
: reset and dose event
The event ID defaults to 0 if the dose amount is 0 or missing, and 1 otherwise.
amt::Symbol
(default::amt
): the name of the column of dose amounts. If the event ID is specified and non-zero, the dose amount should be non-zero. The default dose amount is 0.addl::Symbol
(default::addl
): the name of the column that indicates the number of repeated dose events. The number of additional doses defaults to 0.ii
(default::ii
): the name of the column of inter-dose intervals. When the number of additional doses is specified and non-zero, this is the time to the next dose. For steady-state events with multiple infusions or bolus doses, this is the time between implied doses. The default inter-dose interval is 0. It is required to be non-zero for steady-state events with multiple infusions or bolus doses, and it is required to be zero for steady-state events with constant infusion.cmt::Symbol
(default::cmt
): the name of the column with the compartment to be dosed. Compartments can be specified by integers, strings andSymbol
s. The default compartment is 1.rate::Symbol
(default::rate
): the name of the column with the rate of administration. A rate=-2 allows the rate to be determined by Dose Control Parameters (DCP). Defaults to 0. Possible values are:0
: instantaneous bolus dose>0
: infusion dose administered at a constant rate for a duration equal toamt/rate
-2
: infusion rate or duration specified by the dose control parameters (see@dosecontrol
)
ss::Symbol
(default::ss
): the name of the column that indicates whether a dose is a steady-state dose. A steady-state dose is defined as the result of having applied the same dose from the infinite past. Possible values of the steady-state indicator are:0
: dose is not a steady state dose.1
: dose is a steady state dose, and the compartment amounts are to be reset to the steady-state amounts resulting from the given dose. Compartment amounts resulting from prior dose event records are "zeroed out", and infusions in progress or pending additional doses are cancelled.2
: dose is a steady state dose and the compartment amounts are to be set to the sum of the steady-state amounts resulting from the given dose plus the amounts at the event time were the steady-state dose not given.
The default value is 0.
route::Symbol
: the name of the column that specifies the route of administration.mdv::Union{Symbol,Nothing}
(default:nothing
): the name of the column that indicates if observations aremissing
, ornothing
.event_data::Bool
(default:true
): toggles assertions applicable to event data. More specifically, checks if the following columns are present in theDataFrame
, either as the default values or as user-defined values::id
,:time
, and:amt
. If no:evid
column is present, then a warning will be thrown, and:evid
is set to1
when:amt
values are>0
or notmissing
, or:evid
is set to0
when:amt
values aremissing
and observations are notmissing
. Otherwise,read_pumas
will throw an error.covariates_direction::Symbol
(default::left
): the direction of covariate interpolation. Either:left
(Last Observation Carried Forward, LOCF) (default), or:right
(Next Observation Carried Backward, NOCB). Notice, that for models with occasion variables it is important to use:left
for the correct behavior of the interpolation.check::Bool
(default:event_data
): toggles NMTRAN compliance check of the input data. More specifically, checks if the following columns are present in theDataFrame
, either as the default values or as user-defined values::id
,:time
,:amt
,:cmt
,:evid
,:addl
,:ii
,:ss
, and:route
. Additional checks are:- all variables in
observations
are numeric, i.e.Integer
orAbstractFloat
. :amt
column is numeric, i.e.Integer
orAbstractFloat
.:cmt
column is either a positiveInteger
, anAbstractString
, or aSymbol
.:amt
values must bemissing
or0
whenevid = 0
; or>=0
otherwise.- all variables in
observation
aremissing
whenevid = 1
. :ii
column must be present if:ss
is specified or present.:ii
values must bemissing
or0
whenevid = 0
.:ii
column must be>0
if:addl
values are>0
, and vice-versa.:addl
column, if present, must be>=0
whenevid = 1
.:evid
values must be!=0
when:amt
values are>0
, or:addl
and:ii
values are>0
.
- all variables in
adjust_evid34::Bool
(default:true
): toggles adjustment of time vector for reset events (evid = 3
andevid = 4
). Iftrue
(the default) then the time of the previous event is added to the time on record to ensure that the time vector is monotonically increasing.
If a column does not exist, its values are imputed to be the default values.
Pumas.remove_italics
— Methodremove_italics(ex, vars)
Ensure that the variables specified in vars
are not italicized during latexification of ex
.
Pumas.sample_params
— Methodsample_params(model::PumasModel; rng = Pumas.default_rng())
Create a parameter set populated with values for the parameters of a PumasModel
. Random values will be drawn from the prior distributions for parameters that have a prior. Otherwise, a deterministicitic value will be assigned to the parameter similar to init_params
.
Pumas.sample_randeffs
— Functionsample_randeffs([rng::AbstractRNG=Random.default_rng(),] model::AbstractPumasModel, param::NamedTuple [, pop::Population])
Generate a random set of random effects for model model
, using parameters param
. Optionally, a random number generator object rng
can be passed as the first argument.
The optional argument pop
takes a Population
and changes the output to a vector of such random effects with one element for each subject within the population.
Pumas.simobs
— Functionsimobs(pmi::FittedPumasModelInference, population::Population = pmi.fpm.data, randeffs::Union{Nothing, AbstractVector{<:NamedTuple}} = nothing; samples::Int, rng = default_rng(), kwargs...,)
Simulates observations from the fitted model using a truncated multivariate normal distribution for the parameter values when a covariance estimate is available and using non-parametric sampling for simulated inference and bootstrapping. When a covariance estimate is available, the optimal parameter values are used as the mean and the variance-covariance estimate is used as the covariance matrix. Rejection sampling is used to avoid parameter values outside the parameter domains. Each sample uses a different parameter value. samples
is the number of samples to sample. population
is the population of subjects which defaults to the population associated the fitted model. randeffs
can be set to a vector of named tuples, one for each sample. If randeffs
is not specified (the default behaviour), it will be sampled from its distribution.
Pumas.simobs
— Functionsimobs(fpm::FittedPumasModel, [population::Population,] vcov::AbstractMatrix, randeffs::Union{Nothing, AbstractVector{<:NamedTuple}} = nothing; samples::Int, rng = default_rng(), kwargs...)
Simulates observations from the fitted model using a truncated multi-variate normal distribution for the parameter values. The optimal parameter values are used for the mean and the user supplied variance-covariance (vcov
) is used as the covariance matrix. Rejection sampling is used to avoid parameter values outside the parameter domains. Each sample uses a different parameter value. samples
is the number of samples to sample. population
is the population of subjects which defaults to the population associated the fitted model so it's optional to pass. randeffs
can be set to a vector of named tuples, one for each sample. If randeffs
is not specified (the default behaviour), it will be sampled from its distribution.
Pumas.simobs
— Functionsimobs(
model::AbstractPumasModel,
population::Union{Subject, Population},
param,
randeffs=sample_randeffs(model, param, population);
obstimes=nothing,
ensemblealg=EnsembleSerial(),
diffeq_options=NamedTuple(),
rng=Random.default_rng(),
simulate_error=Val(true),
)
Simulate random observations from model
for population
with parameters param
at obstimes
(by default, use the times of the existing observations for each subject). If no randeffs
is provided, then random ones are generated according to the distribution in the model.
Arguments
model
may either be aPumasModel
or aPumasEMModel
.population
may either be aPopulation
ofSubject
s or a singleSubject
.param
should be either a single parameter set, in the form of aNamedTuple
, or a vector of such parameter sets. If a vector then each of the parameter sets in that vector will be applied in turn. Example:(; tvCL=1., tvKa=2., tvV=1.)
randeffs
is an optional argument that, if used, should specify the random effects for each subject inpopulation
. Such random effects are specified byNamedTuple
s forPumasModels
(e.g.(; tvCL=1., tvKa=2.)
) and byTuples
forPumasEMModel
s (e.g.(1., 2.)
). Ifpopulation
is a singleSubject
(without being enclosed in a vector) then a single such random effect specifier should be passed. If, however,population
is aPopulation
of multipleSubject
s thenrandeffs
should be a vector containing one such specifier for eachSubject
. The functionsinit_randeffs
,zero_randeffs
, andsample_randeffs
are sometimes convenient for generatingrandeffs
:
randeffs = zero_randeffs(model, param, population)
solve(model, population, param, randeffs)
If no randeffs
is provided, then random ones are generated according to the distribution in the model.
obstimes
is a keyword argument where you can pass a vector of times at which to simulate observations. The default,nothing
, ensures the use of the existing observation times for eachSubject
.ensemblealg
is a keyword argument that allows you to choose between different modes of parallelization. Options includeEnsembleSerial()
,EnsembleThreads()
andEnsembleDistributed()
.diffeq_options
is a keyword argument that takes aNamedTuple
of options to pass on to the differential equation solver.rng
is a keyword argument where you can specify which random number generator to use.simulate_error
is a keyword argument that allows you to choose whether a sample (Val(true)
ortrue
) or the mean (Val(false)
orfalse
) of the predictive distribution's error model will be returned for each subject.
Pumas.simobs
— Methodsimobs(b::BayesMCMCResults; subject = nothing, samples::Int = 10, simulate_error = false, rng::AbstractRNG = default_rng())
Simulates model predictions using the posterior samples for the population and subject-specific parameters. The predictions are either for the subject with index subject
or the whole population if no index is input. samples
is the number of simulated predictions returned per subject. If simulate_error
is false
, the mean of the predictive distribution's error model will be returned, otherwise a sample from the predictive distribution's error model will be returned for each subject. rng
is the pseudo-random number generator used in the sampling.
Pumas.simobs
— Methodsimobs(model::PumasModel, population::Population; samples::Int = 10, simulate_error = true, rng::AbstractRNG = default_rng())
Simulates model predictions for population
using the prior distributions for the population and subject-specific parameters. samples
is the number of simulated predictions returned per subject. If simulate_error
is false
, the mean of the predictive distribution's error model will be returned, otherwise a sample from the predictive distribution's error model will be returned for each subject. rng
is the pseudo-random number generator used in the sampling.
Pumas.simobs
— Methodsimobs(model::PumasModel, subject::Subject; samples::Int = 10, simulate_error = true, rng::AbstractRNG = default_rng())
Simulates model predictions for subject
using the prior distributions for the population and subject-specific parameters. samples
is the number of simulated predictions returned. If simulate_error
is false
, the mean of the predictive distribution's error model will be returned, otherwise a sample from the predictive distribution's error model will be returned for each subject. rng
is the pseudo-random number generator used in the sampling.
Pumas.simobstte
— Functionsimobstte(
model::PumasModel,
subject::Union{Subject,Population},
param::NamedTuple,
randeffs::Union{Vector{<:NamedTuple}, NamedTuple, Nothing}=nothing;
minT::Real=0.0,
maxT::Real,
nT::Integer=10,
repeated::Bool=false,
rng::AbstractRNG = default_rng())
Simulate observations from a time-to-event model, i.e. one with a TimeToEvent
dependent variable and return either a new Subject
or Population
with random event time stamps in the time
vector. The observations vector contains the censoring information (see TimeToEvent
for more details on the data coding).
The simobstte
function first computes nT
values of the survival probability from t=minT
to maxT
and then interpolates with a cubic spline to get a smooth survival function. The user must supply maxT
. Given a survival function, it is possible to simulate from the distribution by using inverse cdf sampling. Instead of sampling a uniform variate to be compared with the survival function, we sample an exponential variate and compare with cumulative hazard. The two approaches are equivalent. The Roots package is then used for computing the root. If repeated=true
then new event times are drawn until maxT
has been reached.
Pumas.strip_t
— Methodstrip_t(ex, vars)
Strip the (t)
from the vars
in the expression ex
.
Works both when (t)
is appended using expressions, :(x(t))
, or symbols, Symbol("x(t)")
.
Pumas.successful_fits
— Methodsuccessful_fits(vfpm::Vector{<:AbstractFittedPumasModel)
successful_fits(fpm::FittedPumasModelInference{<:Any, <:Bootstraps})
successful_fits(fpm::Bootstraps)
Returns a vector of all successful fits stored in a vector of fits or the results of Bootstrap
inference. Can be compared to the input number of samples or length of vector to calculate the fraction of successful fits.
Pumas.tad
— Methodtad(t, events) -> time after most recent dose
Converts absolute time t
(scalar or array) to relative time after the most recent dose. If t
is earlier than all dose times, then the (negative) difference between t
and the first dose time is returned instead. If no dose events exist, t
is returned unmodified.
Pumas.truncated_latent
— Methodtruncated_latent(
distribution::UnivariateDistribution,
lower::Union{Real,Nothing},
upper::Union{Real,Nothing},
)
truncated_latent(
distribution::UnivariateDistribution;
lower::Union{Real,Nothing} = nothing,
upper::Union{Real,Nothing} = nothing,
)
Construct a truncated distribution based on distribution
and the truncation limits lower
and upper
. When fitting a model with a truncated_latent
dependent variable, the log probability of the truncated distribution is used in the likelihood calculation. However, when calculating the mean or other summary statistics, the underlying untruncated distribution is used.
Pumas.variables
— Functionvariables(dynobj)
Get a list of variable names for a model or dynamics object.
Pumas.vech
— Methodvech(A; diag = true)
Converts the symmetric matrix A
to a vector of lower triangular elements. If diag
is set to true, the vector will include the diagonal elements of A
. If diag
is false, the vector will only include the strict lower triangle and the diagonal elements of A
will be ignored. For example when diag
is true:
A = [
1.0 0.1
0.1 2.0
]
Pumas.vech(A, diag = true)
gives
3-element Vector{Float64}:
1.0
0.1
2.0
And when diag
is false:
Pumas.vech(A, diag = false)
gives
1-element Vector{Float64}:
0.1
Pumas.vechinv
— Methodvechinv(x; diag = nothing, lower = nothing, lower = true)
Converts a vector of lower or upper triangular elements (traversed column by column) to a symmetric matrix. If diag
is set to nothing
, the vector is assumed to include diagonal elements. Otherwise, the vector is assumed to only include the strict triangle and the diagonal elements of the output matrix will be set to the value of diag
. The keyword argument lower
can be used to specify whether the vector is that of lower triangular elements or upper triangular elements. For example when lower
is true and diag
is not set:
Pumas.vechinv([1.0, -0.1, 2.0], lower = true)
gives
2×2 Matrix{Float64}:
1.0 -0.1
-0.1 2.0
And when diag
is 1:
Pumas.vechinv([-0.1], diag = 1)
gives
2×2 Matrix{Float64}:
1.0 -0.1
-0.1 1.0
Pumas.vpc
— Methodvpc(
sims;
qreg_method = IP(),
observations::Array{Symbol}, # defaults to all observations
stratify_by::Array{Symbol} = Symbol[],
quantiles::NTuple{3,Float64} = (0.1, 0.5, 0.9),
level::Real = 0.95,
ensemblealg = EnsembleThreads(),
ensemblealg = EnsembleSerial(),
bandwidth::Real = 2,
maxnumstrats::Array{<:Integer} = [6 for i = 1:length(stratify_by)],
covariates::Array{Symbol} = [:time],
smooth::Bool = true,
)
Pumas.vpc
— Methodvpc(
fpm::AbstractFittedPumasModel;
samples::Integer = 499
qreg_method = IP(),
observations::Vector{Symbol} = [keys(fpm.data[1].observations)[1]],
stratify_by::Vector{Symbol} = Symbol[],
quantiles::NTuple{3,Float64}=(0.1, 0.5, 0.9),
level::Real=0.95,
ensemblealg=EnsembleThreads(),
bandwidth=nothing,
maxnumstrats=[6 for i in 1:length(stratify_by)],
covariates::Vector{Symbol} = [:time],
smooth::Bool = true,
rng::AbstractRNG=default_rng(),
nodes::Dict = Dict{Tuple{},Vector{Float64}}(),
nnodes::Integer = 0,
prediction_correction = false,
)
Computes the quantiles for VPC for a FittedPumasModel
or FittedPumasEMModel
with simulated confidence intervals around the empirical quantiles based on samples
simulated populations.
The following keyword arguments are supported:
samples
: The number of simulated populations to generate, defaults to499
quantiles::NTuple{3,Float64}
: A three-tuple of the quantiles for which the quantiles will be computed. The default is(0.1, 0.5, 0.9)
which computes the 10th, 50th and 90th percentile.level::Real
: Probability level to use for the simulated confidence intervals. The default is0.95
.observations::Vector{Symbol}
: The name of the dependent variable to use for the VPCs. The default is the first dependent variable in thePopulation
.stratify_by
: The covariates to be used for stratification. Takes anVector
of theSymbol
s of the stratification covariates.ensemblealg
: This is the parallelization choice.bandwidth
: The kernel bandwidth in the quantile regression. If you are seeingNaN
s or an error, increasing the bandwidth should help in most cases. With higher values of thebandwidth
you will get more smoothened plots of the quantiles so it's a good idea to check with your data to pick the rightbandwidth
. For a DiscreteVPC, the default value is nothing to indicate automatic bandwidth tuning. For a CensoredVPC, the bandwidth should be given as a tuple where the first value applies to the ContinuousVPC and the second to the DiscreteVPC for the censoring variable. For example,(4.0, nothing)
applies a bandwidth of4.0
to the ContinuousVPC and automatically tunes the DiscreteVPC. Automatic tuning is only applicable to DiscreteVPCs at this moment.maxnumstrats
: The maximum number of strata for each covariate. Takes an Vector with the number of strata for the corresponding covariate, passed instratify_by
. It defaults to6
for each of the covariates.covariates
: The independent variable for VPC, defaults to[:time]
.smooth
: In case of discrete VPC used to determine whether to interpolate the dependent variable if independent variable is continuous, defaults totrue
.rng
: A random number generator, uses thedefault_rng
fromRandom
as default.nodes
: A Dict specifying the local regression nodes. Ifstratify_by
argument has been specified, the dict should have tuples of values as keys corresponding to each stratum. Otherwise, the key should be an empty tuple. The values of the Dict should be the nodes of the local regressions. If thenodes
argument is not passed, at mostnnodes
quantiles is used from each stratum.nnodes
: An integer specifying the number of points to use for the local regressions for each stratum. The default value is 11 but the actual number of nodes will not exceed the number of unique covariate values in the stratum.qreg_method
: Defaults toIP()
. For most users the method used in quantile regression is not going to be of concern, but if you see large run times switchingqreg_method
toIP(true)
should help in improving the performance with a tradeoff in the accuracy of the fitting.prediction_correction
: Default tofalse
. Set totrue
to enable prediction correction that multiplies all observations with the ratio between the mean prediction and each individuals population prediction.
Pumas.wresiduals
— Functionwresiduals(fpm::AbstractFittedPumasModel, approx::LikelihoodApproximation; nsim=nothing)
Calculate the individual and population weighted residual.
Takes a fit
result, an approximation method for the marginal likelihood calculation which defaults to the method used in the fit
and the number of simulations with the keyword argument nsim
. If nsim
is specified only the Expected Simulation based Individual Weighted Residuals (EIWRES) is included in the output as individual residual and population residual is not computed. Using the FO
approximation method corresponds to the WRES and while FOCE(I)
corresponds to CWRES. The output is a SubjectResidual
object that stores the population (wres
) and individual (iwres
) residuals along with the subject
and approximation method (approx
).
Pumas.zero_randeffs
— Functionzero_randeffs(model::AbstractPumasModel, param::NamedTuple)
zero_randeffs(model::AbstractPumasModel, pop::Population, param::NamedTuple)
Create an object to signify that the random effects should be zero.
The optional argument pop
takes a Population
and changes the output to a vector of such random effects with one element for each subject within the population.
Pumas.ηshrinkage
— Methodηshrinkage(fpm::AbstractFittedPumasModel)
Calculate the η-shrinkage.
Takes the result of a fit
as the only input argument. A named tuple of the random effects and corresponding η-shrinkage values is output.
Pumas.ϵshrinkage
— Methodϵshrinkage(fpm::AbstractFittedPumasModel)
Calculate the ϵ-shrinkage.
Takes the result of a fit
as the only input argument. A named tuple of derived variables and corresponding ϵ-shrinkage values is output.
StatsAPI.aic
— Methodaic(fpm::AbstractFittedPumasModel)
Calculate the Akaike information criterion (AIC) of the fitted Pumas model fpm
.
StatsAPI.bic
— Methodbic(fpm::AbstractFittedPumasModel)
Calculate the Bayesian information criterion (BIC) of the fitted Pumas model fpm
.
StatsAPI.coeftable
— Methodcoeftable(pmi::FittedPumasModelInference; level = pmi.level) -> DataFrame
Construct a DataFrame of parameter names, estimates, standard error, and confidence interval with confidence level level
. The default value of level
is the one that was passed in infer
.
StatsAPI.coeftable
— Methodcoeftable(fpm::FittedPumasModel) -> DataFrame
Construct a DataFrame of parameter names and estimates from fpm
.
StatsAPI.coeftable
— Methodcoeftable(vfpm::Vector{<:FittedPumasModel}) -> DataFrame
Construct a DataFrame of parameter names and estimates and their standard deviation from vector of fitted single-subject models vfpm
.
StatsAPI.fit
— Methodfit(fpm::AbstractFittedPumasModel; kwargs...)
Convenience method to start a new fit where an old one left off.
This uses both the fitted parameters and the fitted random effects from fpm
as the starting points for the new fit
.
StatsAPI.fit
— MethodDistributions.fit(
model::PumasModel,
data::Population,
param::NamedTuple,
alg::BayesMCMC,
)
Samples from the posterior distribution of the model
's population and subject-specific parameters given the observations in data
using the Hamiltonian Monte Carlo (HMC) based No-U-Turn Sampler (NUTS) algorithm. The MCMC sampler is initialized starting from param
for the population parameters. The subject-specific parameters are also apprioriately initialized. The Bayesian inference algorithm's options are passed in the alg
struct. Run ?BayesMCMC
for more details on the options that can be set.
StatsAPI.fit
— Methodfit(
model::PumasModel,
population::Population,
param::NamedTuple,
approx::Union{LikelihoodApproximation, MAP};
optim_alg = Optim.BFGS(linesearch = Optim.LineSearches.BackTracking(), initial_stepnorm = 1.0),
optim_options::NamedTuple = NamedTuple(),
optimize_fn = nothing,
constantcoef::Tuple = (),
ensemblealg::SciMLBase.EnsembleAlgorithm = EnsembleThreads(),
checkidentification = true,
diffeq_options = NamedTuple(),
verbose = true,
ignore_numerical_error = true,
)
Fit the Pumas model model
to the dataset population
with starting values param
using the estimation method approx
. Currently supported values for the approx
argument are FO
, FOCE
, LaplaceI
, NaivePooled
, and BayesMCMC
. See the online documentation for more details about the different methods.
To control the optimization procedure for finding population parameters (fixed effects), use the optim_alg
and optim_options
keyword arguments. In previous versions of Pumas the argument optimize_fn
was used, but is now discouraged and will be removed in a later version of Pumas. These options control the optimization all approx
methods except BayesMCMC
. The default optimization function uses the quasi-Newton routine BFGS
method from the Optim
package. It can be changed by setting the optim_alg
to an algorithm implemented in Optim.jl as long as it does not use second order derivatives. Optimization specific options can be passed in using the optim_options
keyword and has to be a NamedTuple with names and values that match the Optim.Options
type. For example, the optimization trace can be disabled and the algorithm can be changed to L-BFGS by passing optim_alg=Optim.LBFGS(), optim_options = ;(show_trace=false)
to fit
. See Optim
for more defails.
It is possible to fix one or more parameters of the fit by passing a Tuple of Symbols as the constantcoef
argument with elements corresponding to the names of the fixed parameters, e.g. constantcoef=(:σ,)
.
When models include an @random
block and fitting with NaivePooled
is requested, it is required that the user sets all variability parameters to zero with constantcoef
such that these can be ignored in the optimization, e.g. constantcoef = (:Ω,)
while overwriting the corresponding values in param
with (; init_params..., Ω = zeros(3, 3))
.
Parallelization of the optimization is supported for most estimation methods via the ensemble interface of DifferentialEquations.jl. Currently, the only supported options are:
EnsembleThreads()
: the default. Accelerate fits by using multiple threads.EnsembleSerial()
: fit using a single thread.EnsembleDistributed()
: fit by using multiple worker processes.
The fit
function will check if any gradients and throw an exception if any of the elements are exactly zero unless checkidentification
is set to false
.
Further keyword arguments can be passed via the diffeq_options
argument. This allows for passing arguments to the differential equations solver such as alg
, abstol
, and reltol
. The default values for these are AutoVern7(Rodas5P(autodiff=true))
, 1e-12
, and 1e-8
respectively. See the DifferentialEquations.jl documentation for more details.
The keyword verbose
controls if info statements about initial evaluations of the loglikelihood function and gradient should be printed or not. Defaults to true.
Since the numerical optimization routine can sometimes visit extreme regions of the parameter space during it's exploration we automatically handle the situation where the input parameters result in errors when solving the dynamical system. This allows the algorithm to recover and continue in many situations that would have otherwise stalled early. Sometimes, it is useful to turn this error handling off when debugging a model fit, and this can be done by setting ignore_numerical_error = false
.
StatsAPI.fit
— Methodfit(
model::PumasEMModel,
population::Population,
param::NamedTuple,
approx::Union{SAEM,LaplaceI};
ensemblealg = EnsembleThreads(),
rng = default_rng()
)
Fit the PumasEMModel
to the dataset population
using the initial values specified in param
. The supported methods for the approx
argument are currently SAEM
and LaplaceI
. See the online documentation for more details about these two methods.
When fitting with SAEM
the variance terms of the random effects (Ω
) and the dispersion parameters for the error model (σ
) are initialized to the identity matrix or 1
as appropriate. They may also be specified. With SAEM, it is recommended to choose an init larger than the true values to facilitate exploration of the parameter space and avoid getting trapped in local optima early. Currently LaplaceI
fits with a PumasEMModel
require Ω
to be diagonal.
Options for ensemblealg
are EnsembleSerial()
and EnsembleThreads()
(the default). The fit will be parallel if ensemblealg == EnsembleThreads()
. SAEM imposes the additional requirement for threaded fits that either the rng
used supports Future.randjump
or that rng
be a vector of rngs, one per thread. The default_rng()
supports randjump()
.
StatsAPI.loglikelihood
— Methodloglikelihood(fpm::AbstractFittedPumasModel)
Compute the loglikelihood of a fitted Pumas model.
StatsAPI.loglikelihood
— Methodloglikelihood(b::BayesMCMCResults; split_method, split_by)
Computes a 3D array of the in-sample loglikelihoods of all the data points/subsets using the output of the MCMC fit
function. The output is a 3D array of size (ndata, nsamples, nchains)
where ndata
is the number of data points, nsamples
is the number of MCMC samples per chain and nchains
is the number of chains.
The way by which the data is split into points/subsets is determined using the split_method
and the split_by
keyword arguments. The split_method
argument can be of any of the following types:
LeaveK
for a leave-K-out splittingKFold
for a K-fold splittingLeaveFutureK
for leaving K future points at a time whilesplit_by
can be of any of the following types:BySubject
for leaving out subjectsByObservation
for leaving out individual observations per subject
Please refer to the documentation of each of the above types for the parameters of each split method, e.g. by typing ?LeaveK
or ?BySubject
.
Examples:
Assume there are 5 subjects and 10 observations per subject. The following are some of the combinations in which the above inputs can be used:
- Leave-1-observation-out splitting, splitting 1 observation for all the subjects at a time.
allsubjects = true
means that the same observation index is removed for all the subjects, e.g. the 10th observation for all the subjects is used as a single subset, then the 9th observation is used for all the subjects is used as another subset, etc.
split_method = LeaveK(K = 1)
split_by = ByObservation(allsubjects = true)
loglikelihood(b; split_method = split_method, split_by = split_by)
- Leave-2-future-observations-out splitting, splitting 2 future observations per subject such that no less than 4 past observations are remaining.
allsubjects = false
means that only the data of one subject at a time will be split. So observations 5 and 6 of subject 1 are the first data subset. The second subset is observations 7 and 8 of subject 1. The third subset is observations 9 and 10 of subject 1, etc. Then in the fourth to sixth runs, subject 2's data gets split. In the seventh to ninth runs, subject 3's data gets split, etc.
split_method = LeaveFutureK(K = 2, minimum = 4)
split_by = ByObservation(allsubjects = false)
loglikelihood(b; split_method = split_method, split_by = split_by)
- Leave-1-subject-out splitting, marginalizing the subject-specific parameters using quadrature integration when computing the in-sample likelihoods. Using this method, each subject is a data point. The in-sample likelihoods computed are the marginal likelihoods of the subjects computed using Gaussian quadrature with a maximum of 1000 integration points. Refer to the documentation of
LLQuad
by typing?LLQuad
for more details on the parameters of the quadrature algorithm.FO()
,FOCE()
andLaplaceI()
can also be used instead.
split_method = LeaveK(K = 1)
split_by = BySubject(marginal = LLQuad(imaxiters = 1000))
loglikelihood(b; split_method = split_method, split_by = split_by)
- Leave-1-future-subject-out splitting, where the in-sample likelihoods are conditional likelihoods computed using
0
s for the subject-specific parameters instead of marginalizing. In this method, each subject after the third one is considered a data point, i.e. the 4th and 5th subjects are the only 2 points for which the in-sample likelihood is computed.
split_method = LeaveFutureK(K = 1, minimum = 3)
split_by = BySubject(marginal = nothing)
loglikelihood(b; split_method = split_method, split_by = split_by)
StatsAPI.loglikelihood
— Methodloglikelihood(r::BayesianCVResult)
Computes the out-of-sample loglikelihoods from the result r
of crossvalidate
. The output is vector of matrices, e.g. loglik
where loglik[i]
is the loglikelihoods matrix of data point/set i
which was not used in training. loglik[i][j, k]
is the loglikelihood using the parameter values from the j
th sample of the k
th chain.
StatsAPI.predict
— Functionpredict(
model::AbstractPumasModel,
subject::Subject,
param::NamedTuple,
[randeffs,];
[obstimes::AbstractVector,
diffeq_options::NamedTuple]
)::Union{SubjectPrediction,Vector{SubjectPrediction}}
Compute population and individual predictions for the single subject
based on model
and the population parameters param
. A NamedTuple
of random effects, randeffs
, can be omitted or provided by the user. If they are omitted, they will be estimated from the data in the subject
.
If the optional obstimes
argument is passed then the time points in obstimes
are used for the predictions. Otherwise, the time points of the observations of the subject
are used for the predictions.
The function allows for extra keyword arguments to be passed on to the differential equations solver through the diffeq_options
keyword. See the online documentation for more details.
StatsAPI.predict
— Methodpredict(
fpm::AbstractFittedPumasModel,
[population::Union{Subject,Population}];
[obstimes::AbstractVector]
)::Union{SubjectPrediction,Vector{SubjectPrediction}}
Compute population and individual predictions for the fitted model fpm
. By default, the predictions are computed for the estimation data but the predictions can also be computed for user supplied data by passing either a single subject or a vector of subjects (Population
) as the population
argument.
If the optional obstimes
argument is passed then the time points in obstimes
are used for the predictions. Otherwise, the time points of the observations for each subject in the population
are used for the predictions.
Any optional keyword arguments used when fitting fpm
are reused when computing the predictions.
StatsAPI.stderror
— Methodstderror(f::AbstractFittedPumasModel) -> NamedTuple
Compute the standard errors of the population parameters and return the result as a NamedTuple
matching the NamedTuple
of population parameters.
StatsAPI.vcov
— Methodvcov(f::AbstractFittedPumasModel) -> AbstractMatrix
Compute the covariance matrix of the population parameters
Pumas.@cache
— Macro@cache
Defines the model's cache variables. Must be used in an @model
block. For example:
@model begin
@cache begin
onoff = 1.0
end
end
Pumas.@covariance
— Macro@covariance
Allows specifying a block-diagonal covariance structure for the Ω
parameters in an @emmodel
. The default is fully diagonal.
For example, to specify that the Ω
is block diagonal with 1x1
, 2x2
, and 2x2
blocks:
@emmodel begin
@random begin
Ka ~ 1 | LogNormal
CL ~ 1 | LogNormal
Vc ~ 1 | LogNormal
Q ~ 1 | LogNormal
Vp ~ 1 | LogNormal
end
@covariance 1 2 2
end
This would mean that both CL
and Vc
may be correlated with each other, and Q
and Vp
may also be, but all other pairwise correlations are zero. @covariance 5
would result in a dense matrix.
Only diagonal is currently supported when a PumasEMModel
with LaplaceI
.
Pumas.@covariates
— Macro@covariates
Defines the model's covariates, e.g. wt
. Must be used in an @model
block. For example:
@model begin
@covariates wt
end
Pumas.@delay
— Macro@delay(
distribution::Distribution,
cmt,
) -> Real
Compute the distributed delay contribution for distribution
for all dose amounts in administered to cmt
up until the current time point. This macro should be used in the @pre
block to compute an absorption rate explicitly added to the ODE in the @dynamics
block.
Examples
Weibull delay
The model below implements the Weibull absorption model of Hartmann, D., Gysel, D., Dubach, U. C., & Forgo, I. (1990)
mdl_weibull = @model begin
@param begin
θCb ∈ RealDomain(lower = 0.0)
θVc ∈ RealDomain(lower = 0.0)
θVm ∈ RealDomain(lower = 10.0)
θKm ∈ RealDomain(lower = 0.0)
θA ∈ RealDomain(lower = 0.0)
θB ∈ RealDomain(lower = 0.0)
θf ∈ RealDomain(lower = 0.0)
σ ∈ RealDomain(lower = 0.0)
end
@pre begin
Baseline = θCb * θVc
Vc = θVc
VVm = Vc * θVm
VKm = Vc * θKm
λ = θA^(-1 / θB)
B = θB
f = θf
dose_density = @delay(Weibull(B, λ), Central)
end
@dosecontrol begin
bioav = (; Central = 0.0)
end
@init begin
Central = Baseline
end
@dynamics begin
Central' =
VVm * (Baseline / (VKm + Baseline) - Central / (VKm + Central)) +
f * Vc * dose_density
end
@derived begin
μ := @. Central ./ Vc
y ~ @. Gamma(1 / σ^2, μ * σ^2)
end
end
Gamma delay
The model below below uses the the delay
function to implement a gamma distributed delay demonstrated in Savic, Jonker, Kerbusch, and Karlsson (2007).
mdl_gamma_delay = @model begin
@param begin
θn ∈ RealDomain(lower = 0.0)
θmat ∈ RealDomain(lower = 0.0)
θCL ∈ RealDomain(lower = 0.0)
θVc ∈ RealDomain(lower = 0.0)
σ ∈ RealDomain(lower = 0.0)
end
@pre begin
kₑ = θCL / θVc
Vc = θVc
dose_density = @delay(Gamma(θn, θmat / θn), Central)
end
@dosecontrol begin
bioav = (; Central = 0.0)
end
@dynamics begin
Central' = dose_density - kₑ * Central
end
@derived begin
μ := Central ./ Vc
y ~ @. Gamma(1 / σ^2, μ * σ^2)
end
end
Pumas.@derived
— Macro@derived
Defines the error model of the dependent variables. Must be used in an @model
block. For example:
@model begin
@derived begin
dv ~ @. Normal(1000 * Central / Vc, 1000 * Central / Vc * σ)
end
end
Pumas.@dosecontrol
— Macro@dosecontrol
Define dose control parameters as a function of fixed and random effects. Options include bioav
, duration
, lags
, and rate
. Must be used in an @model
block. For example:
@model begin
@dosecontrol begin
bioav = (Depot1 = max(0, θ[5]), Depot2 = clamp(1 - θ[5], 0.0, 1.0), Central = 1)
lags = (Depot1 = 0, Depot2 = max(0.1, θ[6]), Central = 0)
end
end
Pumas.@dynamics
— Macro@dynamics
Defines the dynamic system by specifying differential equations. Must be used in an @model
block.
A @model
block with a @dynamics
block must not contain a @reactions
block.
Examples
A @dynamics
block with a system of differential equations:
@model begin
@dynamics begin
Depot1' = -ka * Depot1
Depot2' = ka * Depot1 - ka * Depot2
Central' = ka * Depot2 - CL / Vc * Central
end
end
A @dynamics
block with a differential equation model:
@model begin
@dynamics Depots1Central1
end
Pumas.@emmodel
— Macro@emmodel
Define a PumasEMModel
. It may have the following blocks:
@param
and@random
: Defines fixed and random effects, e.g.
@random begin
CL ~ 1 + wt | LogNormal
θbioav ~ 1 | LogitNormal
end
Distributions specify the parameter's support. Only Normal
(-∞,∞) LogNormal
(0,∞), and LogitNormal
(0,1) are currently supported.
These define 1 unconstrained parameter per term in the formula, for example CL = exp(μ + wt * β + η)
where μ, β ∈ (-∞,∞)
and wt
is a covariate constant with respect to time. η
is a random effect. The @param
block is equivalent, with the exception that there is no random effect η
. These variables are accessible in @pre
, @dosecontrol
, @dynamics
, @post
, and @error
blocks.
@covariates
: Covariates available in the@pre
,@dosecontrol
,@dynamics
, and@post
blocks.@pre
: Block evaluated before the@dynamics
.@dosecontrol
: Block specifying dose control parameters. Function of@param
and@random
variables only.@init
: return initial conditions for dynamical variables.@dynamics
: Dynamics, equivalent to the functionality in the@model
macro.@post
: Block evaluated after the@dynamics
and before the@error
.@error
: Block describing the error model. Dispersion parameters are implicit.Y ~ ProportionalNormal(μ)
indicates thatY
has aNormal
distribution with meanμ
.Y
must be observed subject data, whileμ
can can be defined in the@param
,@random
,@pre
,@dynamics
, or@post
blocks.
Pumas.@error
— Macro@error
Defines the error model of the dependent variables in a PumasEMModel
. Distributions are parameterized by the mean only, the dispersion parameter(s), if any, are left implicit. See the documentation for details on the available error models. The mean parameter(s) must be (a) symbol(s). Use an earlier block, e.g. @post
to define any necessary transformations.
Must be used in an @emmodel
block. For example:
@emmodel begin
@random begin
CL ~ 1 | LogNormal
Vc ~ 1 | LogNormal
end
@dynamics Central1
@post begin
cp = Central / Vc # cp is calculated here
end
@error begin
dv ~ CombinedNormal(cp) # cp is the mean of the error distribution
end
end
Pumas.@init
— Macro@init
Defines initial conditions of the dynamic system. Must be used in an @model
block. For example:
@model begin
@init begin
Depot1 = 0.0
Depot2 = 0.0
Central = 0.0
end
end
Pumas.@metadata
— Macro@metadata
Allows including additional details of a model as metadata. These take the form of key/value pairs defined with =
as follows, where the values can take on <:Any
value:
@model begin
@metadata begin
desc = "A short description of the model."
timeu = u"hr"
end
end
In the above example, a textual description of the model is provided by the desc
key. It must be written in plaintext and should preferably not take up more than a single line. The timeu
key is assigned a Unitful
value to describe the units used in the model. At present desc
and timeu
are the only metadata keys that are used.
Pumas.@model
— Macro@model
Defines a Pumas model, with the following possible blocks:
@param
: defines the model's fixed effects and corresponding domains, e.g.tvcl ∈ RealDomain(lower = 0)
.@random
: defines the model's random effects and corresponding distribution, e.g.η ~ MvNormal(Ω)
.@covariates
: Names of covariates used in the model.@pre
: pre-processes variables for the dynamic system and statistical specification.@dosecontrol
: defines dose control parameters as a function of fixed and random effects.@vars
: define variables usable in other blocks. It is recommended to define them in@pre
instead if not a function of dynamic variables.@init
: defines initial conditions of the dynamic system.@dynamics
: defines the dynamic system.@derived
: defines the statistical model of the dependent variables.@observed
: lists model information to be stored in the solution.@options
specifies model options, includingchecklinear
,inplace
, andsubject_t0
.
Pumas.@nca
— Macro @nca
Defines a macro that can be used in the observed
block to compute NCA metrics of the computed model solutions.
Example
...
@derived begin
cp := Central/Vc
DV ~ @. Normal(cp, σ*cp)
end
@observed begin
nca = @nca cp
auc = NCA.auc(nca, interval = (0,4))
cmax = NCA.cmax(nca)
end
...
Pumas.@observed
— Macro@observed
Lists model information to be stored in the solution. Must be used in an @model
block. For example:
@model begin
@observed begin
conc = @. 1000 * Central / Vc
end
end
Pumas.@options
— Macro@options
Specifies model options, including the following:
checklinear::Bool
: determines whether the solver should check if the system defined in the@dynamics
block is linear. If the system is linear, setting this option totrue
(default) enables calculating the solution through matrix exponentials. If it is set tofalse
or time (t
) appears in the@pre
block, this optimization is disabled. This option can be useful when the matrix exponential solver is not superior to general numerical integrators or for debugging purposes.inplace::Bool
: controls whether the solver uses mutable operations onArray
s (true
), which can improve performance for large systems. By default, it is set totrue
for systems containing more than 5 dynamic variables, and set tofalse
otherwise.subject_t0::Bool
: determines the starting time point from which the solver integrates the system. By default, all systems are solved from time 0 to the last time point in the subject, makingsubject_t0
false
. Ifsubject_t0
is set totrue
, the solver will begin integration from the first time point found within the subject.
Must be used in an @model
block. For example (defaults are shown below):
@model begin
@options begin
checklinear = true
inplace = length(odevars) < 5
subject_t0 = false
end
end
Pumas.@param
— Macro@param
Defines the model's fixed effects and corresponding domains, e.g. tvcl ∈ RealDomain(lower = 0)
. Must be used in an @model
block. For example:
@model begin
@param begin
tvcl ∈ RealDomain(lower = 0)
tvv ∈ RealDomain(lower = 0)
Ω ∈ PDiagDomain(2)
σ_prop ∈ RealDomain(lower = 0, init = 0.04)
end
end
For PumasEMModel
s, the @param
block uses a formula syntax for covariate relationships. A distribution from the normal family is used to indicate domain (currently only Normal
, LogNormal
, and LogitNormal
are supported):
@emmodel begin
@param begin
CL ~ 1 + wt | LogNormal
Vc ~ 1 | Normal
end
end
this gives CL = exp(log(θ.CL[1]) + θ.CL[2] * wt)
and VC = θ.VC
, where θ
is the parameter named tuple. If we only have 1
in the formula, then it is a scalar. If we have covariates, we have a tuple of parameters for variable.
Pumas.@post
— Macro@post
Allows for post-processing dynamical variables, to define any variables needed in the @error
block.
Must be used in an @emmodel
block. For example:
@emmodel begin
@random begin
CL ~ 1 | LogNormal
Vc ~ 1 | LogNormal
end
@dynamics Central1
@post begin
cp = Central / Vc
end
@error begin
dv ~ CombinedNormal(cp)
end
end
Pumas.@pre
— Macro@pre
Pre-processes variables for the dynamic system and statistical specification. Must be used in an @model
block. For example:
@model begin
@params begin
tvcl ∈ RealDomain(lower = 0)
tvv ∈ RealDomain(lower = 0)
ωCL ∈ RealDomain(lower = 0)
ωV ∈ RealDomain(lower = 0)
σ_prop ∈ RealDomain(lower = 0, init = 0.04)
end
@random begin
ηCL ~ Normal(ωCL)
ηV ~ Normal(ωV)
end
@covariates wt
@pre begin
CL = tvcl * (wt / 70)^0.75 * exp(ηCL)
Vc = tvv * (wt / 70) * exp(ηV)
ka = CL / Vc
Q = Vc
end
end
Pumas.@random
— Macro@random
defines the model's random effects and corresponding distribution, e.g. η ~ MvNormal(Ω)
. Must be used in an @model
block. For example:
@model begin
@random begin
η ~ MvNormal(Ω)
end
end
For PumasEMModel
s, the @random
block is equivalent to the @param
s block, except that each variable defined here has an additional random effect. For example:
@emmodel begin
@random begin
CL ~ 1 + wt | LogNormal
Vc ~ 1 | Normal
end
end
this gives CL = exp(log(θ.CL[1]) + θ.CL[2] * wt + η_CL)
and VC = θ.VC + η_VC
, where θ
is the parameter named tuple. If we only have 1
in the formula, then it is a scalar. If we have covariates, we have a tuple of parameters for variable.
Pumas.@reactions
— Macro@reactions
Defines the dynamic system by specifying chemical reactions with the Catalyst.jl DSL. Must be used in an @model
block.
A @model
block with a @reactions
block must not contain a @dynamics
block.
Example
@model begin
@reactions begin
ka, Depot1 --> Depot2
ka, Depot2 --> Central
CL / Vc, Central --> 0
end
end
This is equivalent to
@model begin
@dynamics begin
Depot1' = -ka * Depot1
Depot2' = ka * Depot1 - ka * Depot2
Central' = ka * Depot2 - CL / Vc * Central
end
end
Pumas.@vars
— Macro@vars
Define variables usable in other blocks. It is recommended to define them in @pre
instead if not a function of dynamic variables. Must be used in an @model
block. For example:
@model begin
@vars begin
conc = Central / Vc
end
end
- Geweke1991Geweke, J. F. (1991). Evaluating the accuracy of sampling-based approaches to the calculation of posterior moments (No. 148). Federal Reserve Bank of Minneapolis.
- Heidelberger1983Heidelberger, P., & Welch, P. D. (1983). Simulation run length control in the presence of an initial transient. Operations Research, 31(6), 1109-1144.
- Raftery1992A L Raftery and S Lewis. Bayesian Statistics, chapter How Many Iterations in the Gibbs Sampler? Volume 4. Oxford University Press, New York, 1992.