Pumas Docstrings
DataFrames.DataFrame — TypeDataFrame(events::DosageRegimen, expand::Bool = false)Create a DataFrame with the information in the dosage regimen. If expand, create a DataFrame with the information in the event list (expanded form).
DataFrames.DataFrame — MethodDataFrame(bts::Union{SimulatedInference, Bootstraps})Returns a data frame of the parameter estimates.
Distributions.MvNormal — MethodMvNormal(pmi::FittedPumasModelInference)Returns an MvNormal with a mean equal to the optimal parameter values in the fitted model and a covariance equal to the variance-covariance estimate.
Distributions.MvNormal — MethodMvNormal(model::PumasModel, means::NamedTuple, vcov::NamedTuple)Returns an MvNormal with mean equal to the vectorized means argument and a covariance equal to the block diagonal matrix of the covariance matrices in the vcov argument. The mean argument should be a named tuple of values in their natural domain. vcov should be a named tuple of numbers (for real parameters) or matrices for vector and matrix parameters. For parameters that are themselves symmetric matrices, e.g:
Ω = [
Ω₁₁ Ω₁₂ Ω₁₃;
Ω₂₁ Ω₂₂ Ω₂₃;
Ω₃₁ Ω₃₂ Ω₃₃;
]the covariance matrix should be the covariance of the vector: [Ω₁₁, Ω₁₂, Ω₂₂, Ω₁₃, Ω₂₃, Ω₃₃]. Note that because the matrix is symmetric, only the upper triangle is used iterating over the columns first in an outer loop then the over the rows of each column in an inner loop.
Example:
means = (θ1 = 1.0, θ2 = [1.0, 2.0], θ3 = [1.0 0.1; 0.1 1.0])
vcov = (
θ1 = 0.1,
θ2 = [0.5 -0.2; -0.2 0.6],
θ3 = [0.5 -0.2 0.1; -0.2 0.6 -0.3; 0.1 -0.3 0.45],
)
dist = MvNormal(model, means, vcov)MCMCChains.Chains — MethodChains(b::BayesMCMCResults; subject::Union{Nothing, Int} = nothing)Constructs an MCMCChains.Chains object for the samples of the population or subject-specific parameters. If subject is nothing (the default value) a Chains object of the population parameters' samples is returned. If subject is an integer, the samples of the subject-specific parameters corresponding to the subject with the given index will be returned instead.
Pumas.AnalyticalPKPDProblem — TypeAnalyticalPKPDProblem(
f,
u0,
tspan,
events,
time,
p,
bioav)An analytical PK(PD) problem.
Fields:
f: callable that returns the state of the dynamic systemu0: initial conditionstspan: time points that define the intervals in which the problem is definedevents: events such as dose, reset events, etctime: event timesp: a function that returns pre block evaluated at a time point
Pumas.AnalyticalPKProblem — TypeAnalyticalPKProblem(pkprob, prob2, sys2)
A problem that is partially an analytical problem that can be evaluated independently of the rest of the system.
Fields:
pkprob: the analytical part of the problemprob2: a problem that represents the rest of the systemsys2(optional): the system ofprob2- used for latexification.
Pumas.BayesMCMC — TypeBayesMCMC
An instance of the Hamiltonian Monte Carlo (HMC) based No-U-Turn Sampler (NUTS) algorithm. All the options of the algorithm can be passed to the constructor of BayesMCMC as keyword arguments. Example:
alg = BayesMCMC(nsamples = 1000)The main options that can be set are:
target_accept: target acceptance ratio for the NUTS algorithmnsamples: number of Markov Chain Monte Carlo (MCMC) samples to generatenadapts: number of adaptation steps in the NUTS algorithmnchains: number of MCMC chains to sampleess_per_chain: target effective sample size (ESS) per chain, sampling terminates if the target is reachedcheck_every: the number of samples after which the ESS per chain is checkedtime_limit: a time limit for sampling in seconds, sampling terminates if the time limit is reachedensemblealg: can be set toEnsembleSerial()for serial sampling,EnsembleThreads()for multi-threaded sampling orEnsembleDistributed()for multi-processing (aka distributed parallelism) samplingparallel_chains: can be set totrueorfalse, if set totruethe chains will be sampled in parallel using either multi-threading or multi-processing depending on the value ofensemblealgparallel_subjects: can be set totrueorfalse, if set totruethe log probability computation will be parallelized over the subjects using either multi-threading or multi-processing depending on the value ofensemblealgrng: the random number generator useddiffeq_options: aNamedTupleof all the differential equations solver's optionsconstantcoef: aNamedTupleof the parameters to be fixed during sampling. This can be used to sample from conditional posteriors fixing some parameters to specific values, e.g.constantcoef = (σ = 0.1,)fixes theσparameter to 0.1 and samples from the posterior of the remaining parameters conditional onσ.
Pumas.Bootstrap — MethodBootstrap(; rng=Random.default_rng, samples=200, stratify_by=nothing, ensemblealg=EnsembleThreads())The keyword rng can be used to set the seed for the random resamples of the datasets, samples controls the number of resampled Populations, stratify_by control the number of resampled datasets, and by specifying keyword stratify_by to be a Symbol with the name of a covariate it is possible to stratify by a covariate with a finite number of possible values, ensemblealg controls the parallization of each fit call.
Pumas.ByObservation — TypeByObservation(; allsubjects = true)Using this method, each observation or collection of observations is treated as a single data point. When computing predictive loglikelihoods using this method, the predictive loglikelihood computed is the conditional loglikelihood of one or more observations for one or more subjects.
This method has one option, allsubjects. If allsubjects is set to true (the default value), the ith observation of each subject are all grouped together into a single data point. If allsubjects is set to false, then each observation per subject is its individual data point.
Examples:
Assume there are 2 subjects and 3 observations per subject. When using LeaveK(K = 1) as the splitting method together with ByObservation(allsubjects = false), the training and validation splittings are:
| Training subset | Validation subset |
|---|---|
Sub 1 (obs 1, 2, 3), sub 2 (obs 1, 2) | Sub 2 (obs 3) |
Sub 1 (obs 1, 2, 3), sub 2 (obs 1, 3) | Sub 2 (obs 2) |
Sub 1 (obs 1, 2, 3), sub 2 (obs 2, 3) | Sub 2 (obs 1) |
Sub 1 (obs 1, 2), sub 2 (obs 1, 2, 3) | Sub 1 (obs 3) |
Sub 1 (obs 1, 3), sub 2 (obs 1, 2, 3) | Sub 1 (obs 2) |
Sub 1 (obs 2, 3), sub 2 (obs 1, 2, 3) | Sub 1 (obs 1) |
On the other hand if allsubjects is set to true, the training and validation splittings are:
| Training subset | Validation subset |
|---|---|
Sub 1 (obs 1, 2), sub 2 (obs 1, 2) | Sub 1 (obs 3) and sub 2 (obs 3) |
Sub 1 (obs 1, 3), sub 2 (obs 1, 3) | Sub 1 (obs 2) and sub 2 (obs 2) |
Sub 1 (obs 2, 3), sub 2 (obs 2, 3) | Sub 1 (obs 1) and sub 2 (obs 1) |
Pumas.BySubject — TypeBySubject(; marginal = LaplaceI())Using this method, each subject is treated as a single data point. The predictive loglikelihood computed for each subject can be either the marginal loglikelihood or conditional loglikelihood.
This method has one option, marginal. If marginal is set to nothing, the predictive loglikelihood computed for each subject is the conditional loglikelihood using 0s for the subject-specific parameters. Otherwise, the predictive loglikelihood computed for each subject is the marginal loglikelihood using marginal as the marginalization algorithm.
The default value of marginal is LaplaceI() which uses the Laplace method to integrate out the subject-specific parameters. Other options include: FOCE(), FO() and Pumas.LLQuad(). Pumas.LLQuad() is a quadrature integration algorithm. To learn more about the options of LLQuad, you can type ?LLQuad in the REPL.
Pumas.Censored — TypeCensored(distribution::Distribution, lower::Real, upper::Real)::CensoredConstruct a censored distribution based on distribution and the censoring limits lower and upper.
Pumas.Central1 — TypeCentral1()An analytical model for a one compartment model with dosing into Central. Equivalent to
Central' = -CL/Vc*Centralwhere clearance, CL, and volume, Vc, are required to be defined in the @pre block.
Pumas.Central1Meta1 — TypeCentral1Meta1()An analytical model for a model with a central compartment, Central, and a metabolite, Metabolite. Equivalent to Central' = -(CL+CLfm)/VcCentral Metabolite' = CLfm/VcCentral - CLm/Vm*Metabolite where clearances (CL and CLm) and volumes (Vc and Vm) and formation clearance of metabolite CLfm are required to be defined in the @pre block.
Pumas.Central1Periph1 — TypeCentral1Periph1()An analytical model for a two-compartment model with a central compartment, Central and a peripheral compartment, Peripheral. Equivalent to
Central' = -(CL+Q)/Vc*Central + Q/Vp*Peripheral
Peripheral' = Q/Vc*Central - Q/Vp*Peripheralwhere clearance, CL, and volumes, Vc and Vp, and distribution clearance, Q, are required to be defined in the @pre block.
Pumas.Central1Periph1Meta1 — TypeCentral1Periph1Meta1()An analytical model for a two compartment model with a central compartment, Central, with a peripheral compartment, Peripheral, and a metabolite compartment, Metabolite. Equivalent to
Central' = -(CL+Q+CLfm)/Vc*Central + Q/Vp*CPeripheral
CPeripheral' = Q/Vc*Central - Q/Vp*CPeripheral
Metabolite' = -CLm/Vm*Metabolite + CLfm/Vc*Centralwhere clearances (CL and CLm) and volumes (Vc, Vp and Vm), distribution clearance (Q), and formation clearance of metabolite CLfm are required to be defined in the @pre block.
Pumas.Central1Periph1Meta1Periph1 — TypeCentral1Periph1Meta1Periph1()An analytical model for a two compartment model with a central compartment, Central, with a peripheral compartment, Peripheral, and a metabolite compartment, Metabolite, with a peripheral compartment, MPeripheral. Equivalent to
Central' = -(CL+Q+CLfm)/Vc*Central + Q/Vp*CPeripheral
CPeripheral' = Q/Vc*Central - Q/Vp*CPeripheral
Metabolite' = -(CLm+Qm)/Vm*Metabolite + Qm/Vmp*MPeripheral + CLfm/Vc*Central
MPeripheral' = Qm/Vm*Metabolite - Qm/Vmp*MPeripheralwhere clearances (CL and CLm) and volumes (Vc, Vp, Vm and Vmp), distribution clearances (Q and Qm) and formation clearance of metabolite CLfm are required to be defined in the @pre block.
Pumas.ConstDomain — Type@param x = val
@param x ∈ ConstDomain(val)Specifies a parameter as a constant.
Pumas.Constrained — TypeConstrainedConstrain a Distribution within a Domain. The most common case is an MvNormal constrained within a VectorDomain. The only supported method for Constrained is logpdf. Notice that the result does not represent a probability distribution since the remaining probability mass is not scaled by the mass excluded by the constraints.
Example
julia> d = Constrained(MvNormal(fill(1.0, 1, 1)), lower=-1, upper=1)
Constrained{ZeroMeanFullNormal{Tuple{Base.OneTo{Int64}}}, VectorDomain{Vector{Int64}, Vector{Int64}, Vector{Float64}}}(ZeroMeanFullNormal(
dim: 1
μ: 1-element Zeros{Float64}
Σ: [1.0;;]
)
, VectorDomain{Vector{Int64}, Vector{Int64}, Vector{Float64}}([-1], [1], [0.0]))
julia> logpdf(d, [ 0])
-0.9189385332046728
julia> logpdf(d, [-2])
-InfPumas.CorrDomain — TypeCorrDomain(template)Return a correlation domain.
template provides both the shape and initial values for the resulting correlation parameter.
- if
templateis anIntthen an identity matrix of sizetemplateis returned. - if
templateis a squareMatrixthen theCorrDomainwill match its size and will
have initial values according to the template elements.
Pumas.Depots1Central1 — TypeDepots1Central1()An analytical model for a one compartment model with a central compartment, Central, and a depot, Depot. Equivalent to
Depot' = -Ka*Depot
Central' = Ka*Depot - CL/Vc*Centralwhere absoption rate, Ka, clearance, CL, and volume, Vc, are required to be defined in the @pre block.
Pumas.Depots1Central1Periph1 — TypeDepots1Central1Periph1()An analytical model for a two-compartment model with a central compartment, Central, a peripheral compartment, Peripheral, and a depot Depot. Equivalent to
Depot' = -Ka*Depot
Central' = Ka*Depot - (CL+Q)/Vc*Central + Q/Vp*Peripheral
Peripheral' = Q/Vc*Central - Q/Vp*Peripheralwhere absorption rate, Ka, clearance, CL, and volumes, Vc and Vp, and distribution clearance, Q, are required to be defined in the @pre block.
Pumas.Depots2Central1 — TypeDepots2Central1()An analytical model for a one compartment model with a central compartment, Central, and two depots, Depot1 and Depot2. Equivalent to
Depot1' = -Ka1*Depot1
Depot2' = -Ka2*Depot2
Central' = Ka1*Depot1 + Ka2*Depot2 - CL/Vc*Centralwhere absorption rates, Ka1 and Ka2, clearance, CL, and volume, Vc, are required to be defined in the @pre block.
When using this model during simulation or estimation, it is preferred to have 2 dosing rows for each subject in the dataset, where the first dose goes into cmt =1 (or cmt = Depot1) and the second dose goes into cmt=2 (or cmt=Depot2). Central compartment gets cmt=3 or (cmt = Central). e.g.
ev = DosageRegimen([100,100],cmt=[1,2]) s1 = Subject(id=1, events=ev)
Pumas.DiffEqWrapper — TypeDiffEqWrapperModifies the diffeq system f to add allow for modifying rates
f: original diffeq systemparams: ODE parametersrates_on: should rates be modified?rates: amount to be added to rates
Pumas.DosageRegimen — TypeDosageRegimenLazy representation of a series of Events.
Fields
data::DataFrame: The tabular representation of a series ofEvents.Signature
evts = DosageRegimen(amt::Numeric;
time::Numeric = 0,
cmt::Union{Numeric,Symbol} = 1,
evid::Numeric = 1,
ii::Numeric = zero.(time),
addl::Numeric = 0,
rate::Numeric = zero.(amt)./oneunit.(time),
duration::Numeric = zero(amt)./oneunit.(time),
ss::Numeric = 0,
route::NCA.Route)- Examples
julia> DosageRegimen(100, ii = 24, addl = 6)
DosageRegimen
Row │ time cmt amt evid ii addl rate duration ss route
│ Float64 Int64 Float64 Int8 Float64 Int64 Float64 Float64 Int8 NCA.Route
─────┼───────────────────────────────────────────────────────────────────────────────────
1 │ 0.0 1 100.0 1 24.0 6 0.0 0.0 0 NullRoute
julia> DosageRegimen(50, ii = 12, addl = 13)
DosageRegimen
Row │ time cmt amt evid ii addl rate duration ss route
│ Float64 Int64 Float64 Int8 Float64 Int64 Float64 Float64 Int8 NCA.Route
─────┼───────────────────────────────────────────────────────────────────────────────────
1 │ 0.0 1 50.0 1 12.0 13 0.0 0.0 0 NullRoute
julia> DosageRegimen(200, ii = 24, addl = 2)
DosageRegimen
Row │ time cmt amt evid ii addl rate duration ss route
│ Float64 Int64 Float64 Int8 Float64 Int64 Float64 Float64 Int8 NCA.Route
─────┼───────────────────────────────────────────────────────────────────────────────────
1 │ 0.0 1 200.0 1 24.0 2 0.0 0.0 0 NullRoute
julia> DosageRegimen(200, ii = 24, addl = 2, route = NCA.IVBolus)
DosageRegimen
Row │ time cmt amt evid ii addl rate duration ss route
│ Float64 Int64 Float64 Int8 Float64 Int64 Float64 Float64 Int8 NCA.Route
─────┼───────────────────────────────────────────────────────────────────────────────────
1 │ 0.0 1 200.0 1 24.0 2 0.0 0.0 0 IVBolusFrom various DosageRegimens
- Signature
evs = DosageRegimen(regimen1::DosageRegimen, regimen2::DosageRegimen; offset = nothing)
offset specifies if regimen2 should start after an offset following the end of the last event in regimen1.
- Examples
julia> e1 = DosageRegimen(100, ii = 24, addl = 6)
DosageRegimen
Row │ time cmt amt evid ii addl rate duration ss route
│ Float64 Int64 Float64 Int8 Float64 Int64 Float64 Float64 Int8 NCA.Route
─────┼───────────────────────────────────────────────────────────────────────────────────
1 │ 0.0 1 100.0 1 24.0 6 0.0 0.0 0 NullRoute
julia> e2 = DosageRegimen(50, ii = 12, addl = 13)
DosageRegimen
Row │ time cmt amt evid ii addl rate duration ss route
│ Float64 Int64 Float64 Int8 Float64 Int64 Float64 Float64 Int8 NCA.Route
─────┼───────────────────────────────────────────────────────────────────────────────────
1 │ 0.0 1 50.0 1 12.0 13 0.0 0.0 0 NullRoute
julia> evs = DosageRegimen(e1, e2)
DosageRegimen
Row │ time cmt amt evid ii addl rate duration ss route
│ Float64 Int64 Float64 Int8 Float64 Int64 Float64 Float64 Int8 NCA.Route
─────┼───────────────────────────────────────────────────────────────────────────────────
1 │ 0.0 1 100.0 1 24.0 6 0.0 0.0 0 NullRoute
2 │ 0.0 1 50.0 1 12.0 13 0.0 0.0 0 NullRoute
julia> DosageRegimen(e1, e2, offset = 10)
DosageRegimen
Row │ time cmt amt evid ii addl rate duration ss route
│ Float64 Int64 Float64 Int8 Float64 Int64 Float64 Float64 Int8 NCA.Route
─────┼───────────────────────────────────────────────────────────────────────────────────
1 │ 0.0 1 100.0 1 24.0 6 0.0 0.0 0 NullRoute
2 │ 178.0 1 50.0 1 12.0 13 0.0 0.0 0 NullRoute
Pumas.Event — TypeEventA single event
Fields:
amt: Dose amount (mass units, e.g. g, mg, etc.)time: Time of dose (time unit, e.g. hours, days, etc.)evid: Event identifier, possible values are-1: End of an infusion dose0: observation (these should have already been removed)1: dose event2: other type event3: reset event (the amounts in each compartment are reset to zero, the on/off status of each compartment is reset to its initial status)4: reset and dose event
cmt: Compartment (an Int or Symbol)rate: The dosing rate:0: instantaneous bolus dose>0: infusion dose (mass/time units, e.g. mg/hr)-1: infusion rate specifed by model-2: infusion duration specified by model
duration: duration of dose, ifdose > 0.ss: steady state status0: dose is not a steady state dose.1: dose is a steady state dose, and that the compartment amounts are to be reset to the steady-state amounts resulting from the given dose. Compartment amounts resulting from prior dose event records are "zeroed out," and infusions in progress or pending additional doses are cancelled.2: dose is a steady state dose and that the compartment amounts are to be set to the sum of the steady-state amounts resulting from the given dose plus whatever those amounts would have been at the event time were the steady-state dose not given.
ii: interdose interval (units: time). Time between implied or additional doses.base_time: time at which infusion startedrate_dir: direction of rate-1for end-of-infusion+1for any other doses
route: route of administration to be used in NCA analysis
Pumas.ExactCrossvalidation — TypeExactCrossvalidation(; split_method, split_by, ensemblealg)Defines an instance of the ExactCrossvalidation algorithm for crossvalidation. In this algorithm, the fitting is re-run while leaving out a subset of the data each time.
The way by which the data is split between training and validation sets is determined using the keyword arguments split_method and split_by. The split_method argument can be of any of the following types:
LeaveKfor leave-K-out cross-validationKFoldfor K-fold cross-validationLeaveFutureKfor leaving K future points at a time
while split_by can be of any of the following types:
BySubjectfor leaving out subjectsByObservationfor leaving out individual observations per subject
Please refer to the documentation of each of the above types for the parameters of each split method, e.g. typing ?LeaveK or ?BySubject.
Examples:
Assume there are 5 subjects and 10 observations per subject and that ft is the result of the fit function. The following are some of the combinations in which the above inputs can be used:
- Leave-1-observation-out cross-validation, leaving 1 observation for all the subjects at a time.
allsubjects = truemeans that the same observation index is removed for all the subjects, e.g. the 10th observation for all the subjects is used for validation in the first run, then the 9th observation is used for validation in the second run, etc.
split_method = LeaveK(K = 1)
split_by = ByObservation(allsubjects = true)
cv_method = ExactCrossvalidation(; split_method = split_method, split_by = split_by, ensemblealg = EnsembleThreads())
r = crossvalidate(ft, cv_method)
- Leave-2-future-observations-out cross-validation, leaving 2 future observations per subject such that no less than 4 observations are used in training.
allsubjects = falsemeans that only the data of one subject at a time will be split. So in the first run, only observations 1 to 4 of subject 1 are used for training and observations 5 and 6 of subject 1 are used for validation. In the second run, observations 1 to 6 of subject 1 are used for training and observations 7 and 8 are used for validation. In the third run, observations 1 to 8 of subject 1 get used for training and observations 9 and 10 are used for validation. For all 3 runs, all the observations of subjects 2 to 5 are used in training. Then in the fourth to sixth runs, subject 2's data gets split. In the seventh to nineth runs, subject 3's data gets split, etc.
split_method = LeaveFutureK(K = 2, minimum = 4)
split_by = ByObservation(allsubjects = false)
cv_method = ExactCrossvalidation(; split_method = split_method, split_by = split_by, ensemblealg = EnsembleThreads())
r = crossvalidate(ft, cv_method)
- Leave-1-subject-out cross-validation, marginalizing the subject-specific parameters using quadrature integration when computing predictive likelihoods. Using this method, subjects {1, 2, 3, 4} are used for training in the first run and subject {5} is used for validation. In the second run, subjects {1, 2, 3, 5} are used for training and subject {4} is used for validation, etc. The predictive likelihoods computed are the marginal likelihood of the subject computed using Gaussian quadrature with a maximum of 1000 integration points. Refer to the documentation of
LLQuadby typing?LLQuadfor more details on the parameters of the quadrature algorithm.FO(),FOCE()andLaplaceI()can also be used instead.
split_method = LeaveK(K = 1)
split_by = BySubject(marginal = LLQuad(imaxiters=1000))
cv_method = ExactCrossvalidation(; split_method = split_method, split_by = split_by, ensemblealg = EnsembleThreads())
r = crossvalidate(ft, cv_method)
- Leave-1-future-subject-out cross-validation, where the predictive likelihoods are conditional likelihoods computed using
0s for the subject-specific parameters instead of marginalizing. The minimum number of subjects is 3 so there are only 2 runs. In the first run, the training subjects are {1, 2, 3} and the validation subject is {4}. In the second run, the training subjects are {1, 2, 3, 4} and the validation subject is {5}.
split_method = LeaveFutureK(K = 1, minimum = 3)
split_by = BySubject(marginal = nothing)
cv_method = ExactCrossvalidation(; split_method = split_method, split_by = split_by, ensemblealg = EnsembleThreads())
r = crossvalidate(ft, cv_method)
Pumas.ExplicitModel — Method(m::ExplicitModel)(
Δt::Real,
amounts::AbstractVector,
doses::AbstractVector,
pre,
rates::AbstractVector)Solve the model for a given time step Δt.
The initial/previous time point is associated with the initial system state amounts, bolus doses and the start of an infusion with a rates going into compartments given by rates. For each evaluation, pre will be evaluated at the current time point.
Pumas.FittedPumasEMModel — TypeFittedPumasEMModel
Contains the results of fitting a PumasEMModel with SAEM.
Fields:
- model: The PumasEMModel that was fit.
- data: The population that was fit.
- μ: Final estimates of
@paramand@randomparameters. - ω: Final estimates of
ωparameters. - σ: Final estimated standard deviations of residuals.
- opthistory: If
approxisSAEM, contains the full history of unconstrained-μ,ω, and,σestimates across iterations. Otherwise, contains the optimization results. - approx: Fitting method used.
- η: Random effects from the fit. If
approxisSAEM, contains alength(pop) × nη × iterationarray ofηsamples from the smoothing phase. Otherwise, contains orthogonal random effects. - kwargs: Input keywords stored for use in diagnostics. Includes
ensemblealganddiffeq_options.
Pumas.Formula — TypeO ~ C | D
For example: CL ~ 1 + wt | Normal -> Formula{:CL,(1,:wt),:Normal}()
Pumas.IDR — MethodIDR(; direct=true, mode=:stimulation_sigmoid)Get an indirect response model (IDR) for use as the pharmacodynamic part of model macro generation.
- direct::Bool - toggle whether the drug acts on production (true) or inactivation (false)
- mode::Symbol - pick the mode of drug action. Valid options are: :stimulationlinear, :stimulationemax, :stimulationsigmoid, :stimulationinfinite, :stimulationlogarithmic, :inhibitionemax, :inhibitionsigmoid, :inhibitionhill,
Pumas.JointMAP — TypeJointMAP
An instance of the maximum a-posteriori (MAP) algorithm where the mode of the joint posterior of the population and subject-specific parameters is found using the BFGS optimization algorithm. All the options of the algorithm can be passed to the constructor of JointMAP as keyword arguments. Example:
alg = JointMAP(ensemblealg = EnsembleThreads())The main options that can be set are:
ensemblealg: can be set toEnsembleSerial()for serial sampling,EnsembleThreads()for multi-threaded sampling orEnsembleDistributed()for multi-processing (aka distributed parallelism) samplingdiffeq_options: aNamedTupleof all the differential equations solver's options
Pumas.KFold — TypeKFold(; K = 5, shuffle = false, rng = nothing)K-fold splitting method. In this method, the data is split K times into 2 disjoint groups, each time starting from the full data set. The 2 groups are typically called training and validation subsets, where the validation subset has floor(N / K) data points, N being the total number of data points. In the next iteration, the whole data set is then re-split using another disjoint validation subset of floor(N / K) different points, disjoint from the the previous validation subsets. This process is done iteratively until almost each data point has shown up in 1 and only 1 validation subset. If N is divisible by K, each point will show up in 1 and only 1 validation subset. Otherwise, the remainder points will be part of the training subset for all the splittings and will not show up in any validation subset. This method can be used for computing predictive loglikelihoods, in-sample loglikelihoods and doing cross-validation using subsets of out-of-sample data.
The data is typically a vector of some sort, e.g. observations or subjects, and the splittings are order-dependent. Before performing the splittings, you can randomly shuffle the data vector by setting the shuffle keyword argument to true (default is false) getting rid of the sensitivity to the original order of the data. You can additionally pass an optional pseudorandom number generator rng to control the pseudo-randomness for maximum reproducibility.
Example:
Assume the original data is ["A", "B", "C", "D"]. 4-fold splitting without shuffling results in the following splittings of the data:
| Training subset | Validation subset |
|---|---|
["A", "B", "C"] | ["D"] |
["A", "B", "D"] | ["C"] |
["A", "C", "D"] | ["B"] |
["B", "C", "D"] | ["A"] |
where each data point showed once and only once in a validation subset.
2-fold splitting without shuffling results in the following splittings:
| Training subset | Validation subset |
|---|---|
["A", "B"] | ["C", "D"] |
["C", "D"] | ["A", "B"] |
Pumas.LLQuad — TypeLLQuad(; iabstol = 0.0, ireltol = 1e-8, imaxiters = 10^10)Gaussian quadrature integration approximation method for computing the marginal loglikelihood. The integration volume is adaptively subdivided, using a cubature rule due to Genz and Malik (1980), until the estimated error E satisfies E ≤ max(ireltol*norm(I), iabstol) where I is the approximate integrand, i.e. ireltol and iabstol are the relative and absolute tolerances requested, respectively. However, the number of conditional loglikelihood evaluations is not allowed to exceed imaxiters. The default values of iabstol, ireltol and imaxiters are 0, 1e-8 and 10^10 respectively.
Example:
alg = LLQuad(ireltol = 1e-6, imaxiters = 10_000)
Pumas.marginal_nll(model, subject, init_params(model), alg)Pumas.LeaveFutureK — TypeLeaveFutureK(; K = 1, minimum = 2)Leave-future-K-out splitting method. In this method, the data is assumed to be a timeseries. The goal is to split the data into "past" and "future". Using this method, the data is split multiple times into 3 disjoint groups where the third group is discarded, each time starting from the full data set. The first 2 groups are typically called the past/training subset and the future/validation subset, where the future validation subset has K future data points. In the next iteration, the whole data set is then re-split using another disjoint subset of K data points as the future validation subset. This process is done iteratively until the training set has no less than minimum number of points remaining. Using this method, each data point can show up in at most 1 future validation subset. This method can be used for computing predictive loglikelihoods, in-sample loglikelihoods and doing cross-validation using subsets of future out-of-sample data. The default values of K and minimum are 1 and 2 respectively.
Example:
Assume the original data is ["A", "B", "C", "D", "E", "F"]. Leave-1-future-out splitting with minimum = 2 results in the following splittings of the data where the third column is discarded:
| Past training subset | Future validation subset | Discarded subset |
|---|---|---|
["A", "B", "C", "D", "E"] | ["F"] | [] |
["A", "B", "C", "D"] | ["E"] | ["F"] |
["A", "B", "C"] | ["D"] | ["E", "F"] |
["A", "B"] | ["C"] | ["D", "E", "F"] |
Leave-2-future-out splitting with minimum = 2 results in the following splittings:
| Past training subset | Future validation subset | Discarded subset |
|---|---|---|
["A", "B", "C", "D"] | ["E", "F"] | [] |
["A", "B"] | ["C", "D"] | ["E", "F"] |
Pumas.LeaveK — TypeLeaveK(; K = 5, shuffle = false, rng = nothing)Leave-K-out splitting method. In this method, the data is split multiple times into 2 disjoint groups, each time starting from the full data set. The 2 groups are typically called training and validation subsets, where the validation subset has K data points. In the next iteration, the whole data set is then re-split using another disjoint subset of K data points as the validation subset. This process is done iteratively until almost each data point has shown up in 1 and only 1 validation subset. This method can be used for computing predictive loglikelihoods, in-sample loglikelihoods and doing cross-validation using subsets of out-of-sample data.
The data is typically a vector of some sort, e.g. observations or subjects, and the splittings are order-dependent. Before performing the splittings, you can randomly shuffle the data vector by setting the shuffle keyword argument to true (default is false) getting rid of the sensitivity to the original order of the data. You can additionally pass an optional pseudorandom number generator rng to control the pseudo-randomness for maximum reproducibility.
Example:
Assume the original data is ["A", "B", "C", "D"]. Leave-1-out splitting without shuffling results in the following splittings of the data:
| Training subset | Validation subset |
|---|---|
["A", "B", "C"] | ["D"] |
["A", "B", "D"] | ["C"] |
["A", "C", "D"] | ["B"] |
["B", "C", "D"] | ["A"] |
where each data point showed once and only once in a validation subset.
Leave-2-out splitting without shuffling results in the following splittings:
| Training subset | Validation subset |
|---|---|
["A", "B"] | ["C", "D"] |
["C", "D"] | ["A", "B"] |
Pumas.MarginalMCMC — TypeMarginalMCMC
An instance of the marginal Hamiltonian Monte Carlo (HMC) based No-U-Turn Sampler (NUTS) algorithm where the subject-specific parameters are marginalized and only the population parameters are sampled. All the options of the algorithm can be passed to the constructor of MarginalMCMC as keyword arguments. Example:
alg = MarginalMCMC(marginal_alg = LaplaceI(), nsamples = 1000)The main options that can be set are:
marginal_alg: the algorithm used to marginalize out the subject-specific parameters, defaults toLaplaceI()but can also beFO()orLaplaceI()target_accept: target acceptance ratio for the NUTS algorithmnsamples: number of Markov Chain Monte Carlo (MCMC) samples to generatenadapts: number of adaptation steps in the NUTS algorithmnchains: number of MCMC chains to sampleess_per_chain: target effective sample size (ESS) per chain, sampling terminates if the target is reachedcheck_every: the number of samples after which the ESS per chain is checkedtime_limit: a time limit for sampling in seconds, sampling terminates if the time limit is reachedensemblealg: can be set toEnsembleSerial()for serial sampling,EnsembleThreads()for multi-threaded sampling orEnsembleDistributed()for multi-processing (aka distributed parallelism) samplingparallel_chains: can be set totrueorfalse, if set totruethe chains will be sampled in parallel using either multi-threading or multi-processing depending on the value ofensemblealgparallel_subjects: can be set totrueorfalse, if set totruethe log probability computation will be parallelized over the subjects using either multi-threading or multi-processing depending on the value ofensemblealgrng: the random number generator useddiffeq_options: aNamedTupleof all the differential equations solver's optionsconstantcoef: aNamedTupleof the parameters to be fixed during sampling. This can be used to sample from conditional posteriors fixing some parameters to specific values, e.g.constantcoef = (σ = 0.1,)fixes theσparameter to 0.1 and samples from the posterior of the remaining parameters conditional onσ.
Pumas.NL — TypeNL(pkmodel; hill=false)Specify nonlinear clearance of a pkmodel for model generation.
pkmodel::Pumas.Explicitmodel - The pharmacokinetic model, e.g. Depots1Central1(). hill::Bool - toggle the use of a Hill coefficient.
Pumas.ObsAdditive — TypeObsAdditive(obsname::Symbol)Get an additive error model for automatic model macro generation.
- obsname::Symbol - choose what the output should be called. If you'll be fitting the model towards data then this symbol should match the identifier of the
observationin your data.
Pumas.ObsCombined — TypeObsCombined(obsname::Symbol)Get a combined (proportional and additive) error model for automatic model macro generation.
- obsname::Symbol - choose what the output should be called. If you'll be fitting the model towards data then this symbol should match the identifier of the
observationin your data.
Pumas.ObsProportional — TypeObsProportional(obsname::Symbol)Get a proportional error model for automatic model macro generation.
- obsname::Symbol - choose what the output should be called. If you'll be fitting the model towards data then this symbol should match the identifier of the
observationin your data.
Pumas.PDiagDomain — Type@param x ∈ PDiagDomain(n::Int; init=ones(n))Specifies a parameter as a positive diagonal matrix, with initial diagonal elements specified by init.
Pumas.PSDDomain — Type@param x ∈ PSDDomain(n::Int; init=Matrix{Float64}(I, n, n))Specifies a parameter as a symmetric n-by-n positive semi-definite matrix. init sets the initial value of the parameter and should be a positive semi-definite Matrix of Float64s.
Pumas.PSISCrossvalidation — TypePSISCrossvalidation(; split_method, split_by, filter_nans = true, pareto_shape_threshold = 0.7)Defines an instance of the PSISCrossvalidation algorithm for crossvalidation. In this algorithm, the fitting is not re-run but the MCMC samples are re-weighted using importance sampling (IS) while leaving out a subset of the data each time. The weights are then smoothed using a generalized Pareto distribution. The smoothed weights are then used to estimate the predictive likelihood of the out-of-sample data. This algorithm is known as the Pareto-smoothed importance sampling, leave-one-out (PSIS-LOO) crossvalidation method. In Pumas, we generalize the approach to handle any training-validation splitting of the data.
The shape parameters of the Pareto distribution fitted to the importance weights provide a diagnostic for the reliability of estimates of the predictive likelihoods. This happens typically when the samples are too sensitive to the a certain data point or subset removed and used for validation. In some cases NaNs can even result. The filter_nans set to true by default can be used to ignore the cases where the smoothing fails eliminating this run from the cross-validation result. Additionally, any run leading to a shape parameter higher than pareto_shape_threshold (default is 0.7) will also be removed from the cross-validation result. So if there are 100 different training-validation splittings and 9 of those lead to NaN or a shape parameter higher than pareto_shape_threshold, only the remaining 91 will be used for estimating the predictive loglikelihoods.
The way by which the data is split between training and validation sets is determined using the keyword arguments split_method and split_by. The split_method argument can be of any of the following types:
LeaveKfor leave-K-out cross-validationKFoldfor K-fold cross-validationLeaveFutureKfor leaving K future points at a time
while split_by can be of any of the following types:
BySubjectfor leaving out subjectsByObservationfor leaving out individual observations per subject
Please refer to the documentation of each of the above types for the parameters of each split method, e.g. typing ?LeaveK or ?BySubject.
Examples:
Assume there are 5 subjects and 10 observations per subject and that ft is the result of the fit function. The following are some of the combinations in which the above inputs can be used:
- Leave-1-observation-out cross-validation, leaving 1 observation for all the subjects at a time.
allsubjects = truemeans that the same observation index is removed for all the subjects, e.g. the 10th observation for all the subjects is used for validation in the first run, then the 9th observation is used for validation in the second run, etc.
split_method = LeaveK(K = 1)
split_by = ByObservation(allsubjects = true)
cv_method = PSISCrossvalidation(;
split_method = split_method, split_by = split_by,
filter_nans = true, pareto_shape_threshold = 0.7,
)
r = crossvalidate(ft, cv_method)
- Leave-2-future-observations-out cross-validation, leaving 2 future observations per subject such that no less than 4 observations are used in training.
allsubjects = falsemeans that only the data of one subject at a time will be split. So in the first run, only observations 1 to 4 of subject 1 are used for training and observations 5 and 6 of subject 1 are used for validation. In the second run, observations 1 to 6 of subject 1 are used for training and observations 7 and 8 are used for validation. In the third run, observations 1 to 8 of subject 1 get used for training and observations 9 and 10 are used for validation. For all 3 runs, all the observations of subjects 2 to 5 are used in training. Then in the fourth to sixth runs, subject 2's data gets split. In the seventh to nineth runs, subject 3's data gets split, etc.
split_method = LeaveFutureK(K = 2, minimum = 4)
split_by = ByObservation(allsubjects = false)
cv_method = PSISCrossvalidation(;
split_method = split_method, split_by = split_by,
filter_nans = true, pareto_shape_threshold = 0.7,
)
r = crossvalidate(ft, cv_method)
- Leave-1-subject-out cross-validation, marginalizing the subject-specific parameters using quadrature integration when computing predictive likelihoods. Using this method, subjects {1, 2, 3, 4} are used for training in the first run and subject {5} is used for validation. In the second run, subjects {1, 2, 3, 5} are used for training and subject {4} is used for validation, etc. The predictive likelihoods computed are the marginal likelihood of the subject computed using Gaussian quadrature with a maximum of 1000 integration points. Refer to the documentation of
LLQuadby typing?LLQuadfor more details on the parameters of the quadrature algorithm.FO(),FOCE()andLaplaceI()can also be used instead.
split_method = LeaveK(K = 1)
split_by = BySubject(marginal = LLQuad(imaxiters=1000))
cv_method = PSISCrossvalidation(;
split_method = split_method, split_by = split_by,
filter_nans = true, pareto_shape_threshold = 0.7,
)
r = crossvalidate(ft, cv_method)
- Leave-1-future-subject-out cross-validation, where the predictive likelihoods are conditional likelihoods computed using
0s for the subject-specific parameters instead of marginalizing. The minimum number of subjects is 3 so there are only 2 runs. In the first run, the training subjects are {1, 2, 3} and the validation subject is {4}. In the second run, the training subjects are {1, 2, 3, 4} and the validation subject is {5}.
split_method = LeaveFutureK(K = 1, minimum = 3)
split_by = BySubject(marginal = nothing)
cv_method = PSISCrossvalidation(;
split_method = split_method, split_by = split_by,
filter_nans = true, pareto_shape_threshold = 0.7,
)
r = crossvalidate(ft, cv_method)
Pumas.ParamSet — TypeParamSet(params::NamedTuple)Construct a Pumas.jl parameter set.
params should be a NamedTuple that maps a parameter name to a Domain or a Distribution.
Example
ParamSet((;
tvKa = RealDomain(lower=0.),
tvCL = RealDomain(lower=0.),
tvω = PDiagDomain(2)
))Pumas.Population — TypeA Population is an AbstractVector of Subjects.
Pumas.PumasEMModel — TypePumasEMModel{Σ,NRE,C,PRR,DCR,DR,PR}Parameters: Σ - Covariance structure; (2,1) indicates a matrix with 2x2 and 1x1 diagonal blocks. NRE - The number of formulas without an associated random effect. C - Covariates defined in @covariates block PRR - Parameters defined in @pre block DCR - Parameters defined in @dosecontrol block DR - Parameters defined in @dynamics block POR - Parameters defined in @post block
Currently supported fitting methods are SAEM() and LaplaceI().
Pumas.PumasModel — TypePumasModelA model takes the following arguments (arg [= default]):
param: aParamSetdetailing the parameters and their domainrandom: a mapping from a named tuple of parameters ->ParamSetpre: a mapping from the (params, randeffs, subject) -> ODE parameters.dcp: a mapping from the (params, randeffs, subject) -> dose-control parameters.init: a mapping (col,t0) -> initial conditionsprob: a DEProblem describing the dynamics (either exact or analytical)derived: the derived variables and error distributions (param, randeffs, data, ode vals) -> sampling distobserved = (_, _, _, _, _, _, samples) -> samples: simulated values from the error model and post processing: (param, randeffs, data, ode vals, samples) -> valsoptions::PumasModelOptions = PumasModelOptions(; subject_t0=false): pass additional options, seePumasModelOptions.syms::PumasModelSymbols = PumasModelSymbols(): the symbols of all the variables defined inside the model, seePumasModelSymbols.sys = nothing: the ModelingToolkit.jlSystemused to generateprob- if applicable.macroexpr::Expr = :(): the expression passed to the@modelmacro - if applicable.desc::Dict{Symbol, String} = Dict{Symbol, String}(): descriptions of the symbols (variables, parameters, etc.) used in the model.metadata::Dict{Symbol, Any} = Dict{Symbol, Any}(): include aDictof arbitrary metadata.
Pumas.RealDomain — Type@param x ∈ RealDomain(;lower=-∞,upper=∞,init=0)Specifies a parameter as a real value. lower and upper are the respective bounds, init sets the initial value of the parameter and is used for init_params.
Pumas.SAEM — TypeSAEM(; iters::Tuple{Int,Int,Int} = (200,100,100), α::Float64 = 0.7, λ::Float64 = 0.975, repeat::Int = 1, verbosity::Int = 0)
iters: A tuple indicating the number of iterations to run in the exploratory, middle, and smoothing phases of SAEM.α: Thelearning ratefor the middle phase. On thekth update,α^kweight is placed on thekth iteration, and1 - α^kon the previous.λ: Maximum standard deviation change rate. A value of0.95means thatω_new = clamp(ω_new, ω_old * 0.95, ω_old / 0.95)on each iteration. Slowing the decliine inωs andσs increases exploration, while limiting the growth prevents excessive flattening of the log likelihood.repeat: Indicates how many times to repeatiters. Defaults to1.verbosity: How verbose should it be?0: silent,1: print on eachrepeat,2: print once per iteration.
Pumas.SIR — MethodSIR(; rng=Random.default_rng, samples, resamples)This is the sampling importance re-sampling (SIR) algorithm. In the SIR algorithm, a number of parameter values are sampled from a truncated multivariate normal distribution using rejection sampling. The mean of the multivariate normal distribution is equal to the maximum likelihood parameter values. The covariance is variance-covariance matrix estimator, and the distribution is truncated to the domains of the parameters. After sampling, the samples are weighed by their respective likelihood values, and a subset of these samples is then selected by weighted re-sampling without replacement. The keyword arguments samples and resamples are the number of samples and re-samples respectively. They do not have default values so they must be input by the user.
Pumas.Subject — TypeSubjectThe data corresponding to a single subject:
Fields:
id: identifierobservations: a named tuple of the dependent variablescovariates: a named tuple containing the covariates, ornothing.events: a vector ofEvents.time: a vector of time stamps for the observations
When there are time varying covariates, each covariate is interpolated with a common covariate time support. The interpolated values are then used to build a multi-valued interpolant for the complete time support.
From the multi-valued interpolant, certain discontinuities are flagged in order to use that information for the differential equation solvers and to correctly apply the analytical solution per region as applicable.
Constructor
Subject(;id = "1",
observations = NamedTuple(),
events = Event[],
time = observations isa AbstractDataFrame ? observations.time : nothing,
event_data = true,
covariates::Union{Nothing, NamedTuple} = nothing,
covariates_time = observations isa AbstractDataFrame ? observations.time : nothing,
covariates_direction = :right)Subject may be constructed from an <:AbstractDataFrame with the appropriate schema or by providing the arguments directly through separate DataFrames / structures.
Examples:
julia> Subject()
Subject
ID: 1
julia> Subject(id = 20, events = DosageRegimen(200, ii = 24, addl = 2), covariates = (WT = 14.2, HT = 5.2))
Subject
ID: 20
Events: 3
Covariates: WT, HT
julia> Subject(covariates = (WT = [14.2, 14.7], HT = fill(5.2, 2)), covariates_time = [0, 3])
Subject
ID: 1
Covariates: WT, HT
Pumas.Subject — MethodSubject
Constructor
Subject(simsubject::SimulatedObservations)Roundtrip the result of simobs, i.e. SimulatedObservations to a Subject/Population
Example:
sims = simobs(model, pop, params)To convert sims to a Population, broadcast as below
Subject.(sims)Pumas.Subject — MethodSubject(subject::Subject; id::String, events)
Construct a Subject from an existing Subject while replacing information according to the keyword arguments. The possible keyword arguments are
id, sets the subject identifier to the stringid. Defaults to the identifier already insubject.events, sets the subject events to the inputDosageRegimen. Defaults to the events already present insubject.
Pumas.TimeToEvent — TypeTimeToEvent{T}Distribution-like struct to store the hazard and the cumulative hazard in a time-to-event model. The dependent variable in a model that uses TimeToEvent should be a censoring variable that is one when the variable is observed (not censored) zero when the variable is right censored. Currently, no other censoring types are supported.
Example
...
@pre begin
θeff = θ*DOSE
λ = λ₀*exp(θeff)
end
@dynamics begin
Λ' = λ
end
@derived begin
DV ~ @. TimeToEvent(λ, Λ)
end
...Pumas.VectorDomain — Type@param x ∈ VectorDomain(n::Int; lower=-∞,upper=∞,init=0)Specifies a parameter as a real vector of length n. lower and upper are the respective bounds, init sets the initial value of the parameter and is used for init_params. The keyword arguments can all take either numbers or vectors. If numbers then the same value will be applied to all n vector elements. If a you specify a vector (of length n) then you can adjust the parameters of the VectorDomain elements individually.
Pumas.ZeroSumDomain — TypeZeroSumDomain(n; init = zeros(n))
ZeroSumDomain(init)
ZeroSumDomain(; init)Return a zero-sum domain of dimension n. Any n-dimensional vector with a sum of 0 is in this domain. Instead of passing n, an initial value init can be passed instead either as a positional or keyword argument.
Base.truncate — Methodtruncate(b::BayesMCMCResults; burnin = 0, ratio = 1.0)Removes burnin samples from the beginning of each chain and then keeps only ratio * 100% of the remaining samples in each chain (i.e. thinning).
CommonSolve.solve — Functionsol = solve(
model::AbstractPumasModel,
population::Union{Population, Subject},
param::NamedTuple,
randeffs::NamedTuple=sample_randeffs(rng, model, param, population);
ensemblealg=EnsembleSerial(),
diffeq_options=NamedTuple(),
rng=default_rng()
)Solve the model applied to the Subject(s) within population using parameters param and random effects randeffs. By default, the times at which to save the solution are taken to be the observation times for each Subject within population. This can be overriden by supplying a vector or range as saveat to diffeq_options - e.g. diffeq_options=(; saveat=[0., 0.5, 1.]).
Arguments
modelmay either be aPumasModelor aPumasEMModel.populationmay either be aPopulationofSubjects or a singleSubject.paramis parameter set in the form of aNamedTuple, e.g.(; tvCL=1., tvKa=2., tvV=1.).randeffsis an optional argument that, if used, should specify the random effects for each subject inpopulation. Such random effects are specified byNamedTuples forPumasModels(e.g.(; tvCL=1., tvKa=2.)) and byTuplesforPumasEMModels (e.g.(1., 2.)). Ifpopulationis a singleSubject(without being enclosed in a vector) then a single such random effect specifier should be passed. If, however,populationis aPopulationof multipleSubjects thenrandeffsshould be a vector containing one such specifier for eachSubject. The functionsinit_randeffs,zero_randeffs, andsample_randeffsare sometimes convenient for generatingrandeffs:
randeffs = zero_randeffs(model, param, population)
solve(model, population, param, randeffs)If no randeffs is provided, then random ones are generated according to the distribution in the model.
ensemblealgis a keyword argument that allows you to choose between different modes of parallelization. Options includeEnsembleSerial(),EnsembleThreads()andEnsembleDistributed().diffeq_optionsis a keyword argument that takes aNamedTupleof options to pass on to the differential equation solver.rngis a keyword argument where you can specify which random number generator to use.
Distributions.logpdf — Methodlogpdf(d::Pumas.Constrained, x)Evaluate the logarithm of the probability density of the constrained distribution, d, at x.
- If
dis a constrained univariate distribution thenxshould be a scalar. - If
dis a constrained multivariate distribution thenxshould be a vector.
Evaluations of x outside of d's constraints returns -Inf.
Note that d itself is not a true distribution since its probability mass is not rescaled to 1.
GlobalSensitivity.gsa — Functionfunction GlobalSensitivity.gsa( m::PumasModel, pop::Population, params::NamedTuple, method::GlobalSensitivity.GSAMethod, vars = [:dv], prangelow = NamedTuple(), prangehigh = NamedTuple(); constantcoef::NamedTuple = NamedTuple(), rng::AbstractRNG = defaultrng(), diffeqoptions::NamedTuple = NamedTuple(), N = nothing, n = nothing, samples = 1000, kwargs..., )
Function to perform global sensitivty analysis
The arguments are:
model: aPumasModel, either defined by the@modelDSL or the function-based interface.population: aPopulation.params: a named tuple of parameters.method: one of theGSAMethods from GlobalSensitivity.jl,Sobol(),Morris(),eFAST(),RegressionGSA().vars: a list of the derived variables to run GSA on.p_range_low&p_range_high: the lower and upper bounds for the parameters you want to run the GSA on, if not specified the bounds from@paramblock are used.
Keyword arguments are:
constantcoef: The values of the parameters that are not to be varied in the GSA, passed as aNamedTuple.rng: The random number generator to use.samples: The number of samples to use for the GSA.diffeq_options: The options to pass to the differential equation solver, passed on to thesimobscalled insidegsa.
For method specific arguments that are passed with the method constructor you can refer to the GlobalSensitivity.jl documentation.
LinearAlgebra.cond — Methodcond(pmi::FittedPumasModelInference)Return the condition number of the variance-covariance matrix stored in pmi. Throws an error if pmi is the result of a call to infer with Pumas.Bootstrap or if the
variance-covariance calculation failed.
MCMCDiagnosticTools.gewekediag — Methodgewekediag(b::BayesMCMCResults; subject::Union{Nothing,Int} = nothing, first::Real = 0.1, last::Real = 0.5)Computes the Geweke diagnostic [Geweke1991] for each chain outputting a p-value per parameter. If subject is nothing (default), the chains diagnosed are those of the population parameters. If subject is set to an integer index, the chains diagnosed are those of the subject-specific parameters corresponding to the subject with the input index.
The Geweke diagnostic compares the sample means of two disjoint sub-chains X₁ and X₂ of the entire chain using a normal difference of means hypothesis test where the null and alternative hypotheses are defined as:
H₀: μ₁ = μ₂H₁: μ₁ ≂̸ μ₂
where μ₁ and μ₂ are the population means. The first sub-chain X₁ is taken as the first (first * 100)% of the samples in the chain, where first is a keyword argument defaulting to 0.1. The second sub-chain X₂ is taken as the last (last * 100)% of the samples in the chain, where last is a keyword argument defaulting to 0.5.
The test statistic used is: z₀ = x̅₁ - x̅₂ / √(s₁² + s₂²) where x̅₁ and x̅₂ are the sample means of X₁ and X₂ respectively, and s₁ and s₂ are the Markov Chain standard error (MCSE) estimates of X₁ and X₂ respectively. Auto-correlation is assumed within the samples of each individual sub-chain, but the samples in X₁ are assumed to be independent of the samples in X₂. The p-value output is an estimate of P(|z| > |z₀|), where z is a standard normally distributed random variable.
Low p-values indicate one of the following:
- The first and last parts of the chain are sampled from distributions with different means, i.e. non-convergence,
- The need to discard some initial samples as burn-in, or
- The need to run the sampling for longer due to lack of samples or high auto-correlation.
High p-values indicate the inability to conclude that the means of the first and last parts of the chain are different with statistical significance. However, this alone does not guarantee convergence to a fixed posterior distribution because:
- Either the standard deviations or higher moments of
X₁andX₂may be different, or - The independence assumption between
X₁andX₂may not be satisfied when high auto-correlation exists.
MCMCDiagnosticTools.heideldiag — Methodheideldiag(b::BayesMCMCResults; subject::Union{Nothing,Int} = nothing, alpha::Real = 0.05, eps::Real = 0.1, start::Integer = 1)Compute the Heidelberger and Welch diagnostic [Heidelberger1983] for each chain. If subject is nothing (default), the chains diagnosed are those of the population parameters. If subject is set to an integer index, the chains diagnosed are those of the subject-specific parameters corresponding to the subject with the input index.
The Heidelberger diagnostic attempts to:
- Identify a cutoff point for the initial transient phase for each parameter, after which the samples can be assumed to come from a steady-state posterior distribution. The initial transient phase can be removed as burn-in. The cutoff point for each parameter is given in the
burnincolumn of the output dataframe. - Estimate the relative confidence interval (confidence interval divided by the mean) for the mean of the steady-state posterior distribution of each parameter, assuming such steady-state distribution exists in the samples. A large confidence interval implies either the lack of convergence to a stationary distribution or lack of samples. Half the relative confidence interval is given in the
halfwidthcolumn of the output dataframe. Thetestcolumn will betrue(1) if thehalfwidthis less than the input targeteps(default is 0.1) andfalse(0) otherwise. Note that parameters with mean value close to 0 can have erroneous relative confidence intervals because of the division by the mean. Thetestvalue can therefore be expected to befalse(0) for those parameters without concluding lack of convergence. - Quantify the extent to which the distribution of the samples is stationary using statistical testing. The returned p-value, shown in the
pvaluecolumn of the output dataframe, can be considered a measure of mean stationarity. A p-value lower than the input thresholdalpha(default is 0.05) implies lack of stationarity of the mean, i.e. the posterior samples did not converge to a steady-state distribution with a fixed mean.
The Heidelberger diagnostic only tests for the mean of the distribution. Therefore, it can only be used to detect lack of convergence and not to prove convergence. In other words, even if all the numbers seem normal, one cannot conclude that the chain converged to a stationary distribution or that it converged to the true posterior.
MCMCDiagnosticTools.rafterydiag — Methodrafterydiag(b::BayesMCMCResults; subject::Union{Nothing,Int} = nothing, q = 0.025, r = 0.005, s = 0.95, eps = 0.001)Compute the Raftery and Lewis diagnostic [Raftery1992]. This diagnostic is used to determine the number of iterations required to estimate a specified quantile q within a desired degree of accuracy. The diagnostic is designed to determine the number of autocorrelated samples required to estimate a specified quantile $\theta_q$, such that $\Pr(\theta \le \theta_q) = q$, within a desired degree of accuracy. In particular, if $\hat{\theta}_q$ is the estimand and $\Pr(\theta \le \hat{\theta}_q) = \hat{P}_q$ the estimated cumulative probability, then accuracy is specified in terms of r and s, where $\Pr(q - r < \hat{P}_q < q + r) = s$. Thinning may be employed in the calculation of the diagnostic to satisfy its underlying assumptions. However, users may not want to apply the same (or any) thinning when estimating posterior summary statistics because doing so results in a loss of information. Accordingly, sample sizes estimated by the diagnostic tend to be conservative (too large).
Furthermore, the argument r specifies the margin of error for estimated cumulative probabilities and s the probability for the margin of error. eps specifies the tolerance within which the probabilities of transitioning from initial to retained iterations are within the equilibrium probabilities for the chain. This argument determines the number of samples to discard as a burn-in sequence and is typically left at its default value.
Pumas._convergencedata — Method_convergencedata(obj; metakey="x")Returns the "timeseries" of optimization as a matrix, with series as columns.
This must return parameter data in the same order that _paramnames returns names.
Pumas._fixed_to_constant_paramset — Method_fixed_to_constant_paramset(paramset::ParamSet, param::NamedTuple, fixed::NamedTuple, omegas::Tuple)Replace individual parameter Domains in paramset with ConstantTranform if the parameter has an entry in fixed. Return a new parameter ParamSet with the values in fixed in place of the values in input param.
Pumas._lpdf — Method_lpdf(d,x)The log pdf: this differs from Distributions.logdpf definintion in a couple of ways:
- if
dis a non-distribution it assumes the Dirac distribution. - if
xisNaNorMissing, it returns 0. - if
dis aNamedTupleof distributions, andxis aNamedTupleof observations, it computes the sum of the observed variables.
Pumas._objectivefunctionvalues — Method_objectivefunctionvalues(obj)Returns the objective function values during optimization. Must return a Vector{Number}.
Pumas._paramnames — Method_paramnames(obj)Returns the names of the parameters which convergence is being checked for.
This must return parameter names in the same order that _convergencedata returns data.
Pumas.add_t — Methodadd_t(ex, vars)
Append (t) to all the vars symbols in the expression ex.
Pumas.apply_formatting — Methodapplyformatting(inputex, vars; show_t=true, italicize=true)
Apply formatting to the vars of an input expression before latexification.
Pumas.build_model_str — Methodbuild_model_str(pk, pd, obs, dcp)Generate a string that fully specifies an @model macro call.
- pk - a pharmacokinetic model, for example
Depots1Central1() - pd::Union{Nothing, AbstractPDModule} - a pharmacodynamic model, for example
IDR(; direct=true, mode=:stimulation_emax) - obs::AbstractObservationModule - an observational error model, for example
ObsCombined(:MyOutputName) - dcp::DoseControl - a dose control specification, for example
DoseControl(; bioav=true)
Pumas.conditional_nll — Methodconditional_nll(m::AbstractPumasModel, subject::Subject, param, randeffs; diffeq_options)Compute the conditional negative log-likelihood of model m for subject with parameters param and random effects randeffs. diffeq_options is a NamedTuple of options passed to the ODE solver. Requires that the derived produces distributions.
Pumas.cor2cov — Methodcor2cov(ρ::AbstractVector, ω::AbstractVector)
cor2cov(ω::AbstractMatrix)
cor2cov(ω::AbstractVector)
Converts correlations and standard deviations to a covariance matrix. There are 3 ways to use the cor2cov function.
The first method has 2 inputs:
ρ: a vector of correlations which is a vector of the strict lower triangular elements of the correlation matrix of the random variables.ω: a vector of standard deviations of the random variables
Example:
ω11, ρ12, ω22 = 0.2, 0.1, 0.3
ρ = [ρ12]
ω = [ω11, ω22]
Pumas.cor2cov(ρ, ω)gives the covariance matrix:
[ ω11^2 ρ12*ω11*ω22;
ρ12*ω11*ω22 ω22^2 ]which is equal to:
2×2 Matrix{Float64}:
0.04 0.006
0.006 0.09The second method has a single input matrix ω which is a symmetric matrix whose diagonal is the standard deviations of the random variables and off-diagonals are their correlations. Example:
ω = [
ω11 ρ12;
ρ12 ω22
]
Pumas.cor2cov(ω)gives the covariance matrix:
[ ω11^2 ρ12*ω11*ω22;
ρ12*ω11*ω22 ω22^2 ]which is equal to:
2×2 Matrix{Float64}:
0.04 0.006
0.006 0.09The third method has a single input vector ω which is a vector of the lower triangular elements of the symmetric matrix whose diagonal is the standard deviations of the random variables and off-diagonals are their correlations.
Example:
ω = [ω11, ρ12, ω22]
Pumas.cor2cov(ω)gives the covariance matrix:
[ ω11^2 ρ12*ω11*ω22;
ρ12*ω11*ω22 ω22^2 ]which is equal to:
2×2 Matrix{Float64}:
0.04 0.006
0.006 0.09Pumas.covariates_interpolant — Methodcovariates_interpolant(
covariates_nt::NamedTuple,
covariates_times::NamedTuple,
id::String;
[covariates_direction::Symbol]) -> time_grid, interpolantCreates an interpolation of the elements of covariates_nt and covariates_times using the a piecewise constant interpolation scheme. The two NamedTuples should have the same keys. The values of covariates_nt and covariates_times should contain the interpolation values and time point respectively. The argument id is a string indicating the subject ID and the argument is only used for providing useful error messages.
Returns a callable interpolation struct as well as the common time grid for the covariate observations. This is safe for values which are not time-varying and it also allows for mixing subjects with multiple measurements and subjects with a single measurement. The default is left-sided constant interpolation.
Pumas.covariates_interpolant — Methodcovariates_interpolant(
covariates::Vector{Symbol},
data::AbstractDataFrame,
time::Symbol,
id::String;
[covariates_direction::Symbol]) -> time_grid, interpolantCreates an interpolation of the columns of data listed in the covariates vector at the time points in the time column of data using the a piecewise constant interpolation scheme. The argument id is a string indicating the subject ID and the argument is only used for providing useful error messages.
Returns a callable interpolation struct as well as the common time grid for the covariate observations. This is safe for values which are not time-varying and it also allows for mixing subjects with multiple measurements and subjects with a single measurement. The default is left-sided constant interpolation.
Pumas.crossvalidate — Methodcrossvalidate(b::BayesMCMCResults, cv_method::ExactCrossvalidation)Runs cross-validation by re-running the MCMC sampling while leaving out some data points each time. The first argument b is the output of the MCMC fit. The choice of how the data is split in each run is specified in the cv_method argument which is an instance of ExactCrossvalidation. Please refer to the documentation of ExactCrossvalidation by typing ?ExactCrossvalidation for details on the parameters of the method.
Pumas.crossvalidate — Methodcrossvalidate(b::BayesMCMCResults, cv_method::PSISCrossvalidation)Performs cross-validation using the Pareto smoothed importance sampling (PSIS) approach by re-weighing the samples while leaving out some data points each time. The first argument b is the output of the MCMC fit. The choice of how the data is split in each run is specified in the cv_method argument which is an instance of PSISCrossvalidation.
Pumas.dosecontrol — Methoddosecontrol(fpm::AbstractFittedPumasModel)::Vector{ConstantInterpolationStructArray}Return the dosecontrol parameters from fpm. The dosecontrol parameters are the variables bioav, lags, duration, rate if they're defined in the @dosecontrol block in the @model macro. The function returns a vector of covariate interpolation objects. Each of them can be evaluated at an arbitrary time point, but the interpolant is piecewise constant between the event times. Note, that in the model solution, subject simulation, and parameter fitting the dose control parameters can be time-varying, but they only affect the dosing according to their value at the beginning of each dose event. This means that the dose control parameters are only relevevant at these time points. Each of the covariate interpolation objects can be converted to a DataFrame by calling the DataFrame constructor on each element, or a DataFrame with all subjects can be obtained by calling the DataFrame constructor on the vector output by this function. This DataFrame will only containt the times that correspond to the start of each dose event for the reason discussed above.
Pumas.eiwres — Methodeiwres(model::PumasModel, subject::Subject, param::NamedTuple, nsim::Integer)Calculate the Expected Simulation based Individual Weighted Residuals (EIWRES).
Pumas.elpd — Methodelpd(r::CVResult)Computes an estimate of the expected log predictive density (ELPD) for an out-of-sample data point/set using the result r of crossvalidate. The output is a scalar value estimating the model's mean predictive accuracy on out-of-sample data.
Pumas.em_to_m_params — Methodem_to_m_params(m::PumasEMModel, params::NamedTuple)Given that params is a NamedTuple of coefficients used as inits for fitting m or returned by coef(fpm::FittedPumasEMModel), then em_to_m_params will return a NamedTuple of coefficients usable as inits for fitting or performing diagnostics on a PumasModel created by calling convert(PumasModel, m::PumasEMModel).
Pumas.empirical_bayes — Methodempirical_bayes(fpm::Pumas.FittedPumasModel)
empirical_bayes(fpm::Pumas.FittedPumasEMModel)
empirical_bayes(insp::Pumas.FittedPumasModelInspection)Return sampled random effects or empirical bayes estimates from a fit or model inspection. If the model was estimated with the FO likelihood approximation methods the empirical bayes estimates will be obtained using the LaplaceI approximation. If either FOCE or LaplaceI was used the final empirical bayes estimates will be returned. If Pumas.SAEM was used to fit the empirical bayes estimates are obtained using the LaplaceI approximation.
Pumas.empirical_bayes — Methodempirical_bayes(
model::PumasModel,
subject::AbstractSubject,
param::NamedTuple,
approx::LikelihoodApproximation;
diffeq_options=NamedTuple()
)Return sampled random effects or empirical bayes estimates. The empirical bayes estimates will be obtained using the LaplaceI approximation regardless of the value of the approx argument except for NaivePooled where an empty NamedTuple is returned.
Pumas.empirical_bayes_dist — Methodempirical_bayes_dist(fpm::AbstractFittedPumasModel)Estimate the distribution of random effects (typically a Normal approximation of the empirical Bayes posterior) for the subjects in a FittedPumasModel. The result is returned as a vector of MvNormals.
Pumas.epredict — Methodepredict(fpm::AbstractFittedPumasModel; nsim::Integer, rng::AbstractRNG=default_rng())Calculate the expected simulation based population predictions.
The keyword nsim controls the number of times a subject is simulated and the keyword rng can be used to provide a custom random number generator.
Pumas.eventnum — Methodeventnum(t, events) -> # of doses for each timeCreates an array that matches t in length which denotes the number of events that have occurred prior to the current point in time. If t is a scalar, outputs the number of events before that time point.
Pumas.expectation — Methodexpectation(g, ::MonteCarloExpectation, model::PumasModel, subject::Subject, dist::MvNormal; imaxiters = 10000, diffeq_options::NamedTuple = NamedTuple(), rng::AbstractRNG = Pumas.default_rng())Computes the expected predictions for subject with respect to the population parameters and the subject-specific random effects using Monte Carlo integration. dist is a multivariate normal distribution used to sample the population parameters e.g. dist = MvNormal(pmi) where pmi is the output of infer. Sampling is done using rejection sampling to ensure the parameter values are in the parameters' domains. imaxiters is the number of Monte Carlo samples used. diffeq_options can be used to customize the options of the differential equation solver used. rng is the random number generator used during the sampling, which defaults to Pumas.default_rng().
Pumas.expectation — Methodexpectation(g, ::MonteCarloExpectation, model::PumasModel, subject::Subject, param_dists::NamedTuple; imaxiters = 10000, diffeq_options::NamedTuple = NamedTuple(), rng::AbstractRNG = Pumas.default_rng())Computes the expected predictions for subject with respect to the population parameters and the subject-specific random effects using Monte Carlo integration. param_dists should be a named tuple of distributions for the population parameters. imaxiters is the number of Monte Carlo samples to use. diffeq_options can be used to customize the options of the differential equation solver used. rng is the random number generator used in the sampling, which defaults to Pumas.default_rng().
Pumas.expectation — Methodexpectation(g, quant::QuadratureExpectation, model::PumasModel, subject::Subject, param_dists::NamedTuple; ireltol=1e-3, iabstol=1e-3, imaxiters=10000, diffeq_options::NamedTuple = NamedTuple())Computes the expected predictions for subject with respect to the population parameters using quadrature methods. param_dists should be a named tuple of distributions for the population parameters. Currently, random effects are not supported. ireltol and iabstol are the tolerances for the integration. The integration algorithm will terminate if the tolerance or imaxiters is reached, whichever is first. diffeq_options can be used to customize the options of the differential equation solver.
Pumas.findinfluential — Functionfindinfluential(m::AbstractPumasModel,
data::AbstractPopulation,
param::NamedTuple,
approx::LikelihoodApproximation;
k = 5,
diffeq_options::NamedTuple = NamedTuple())
findinfluential(fpm::AbstractFittedPumasModel,
k::Integer=5,
diffeq_options::NamedTuple = NamedTuple())Return a vector of the k most influential observations based on the value of (minus) the log-likelihood function. It can be before fitting a model in order to find the subjects that contributes the most the log-likelihood value or to identify subjects for which the evaluation of the log-likelihood fails, e.g. due to incorrect data. The latter will show up as NaNs.
It can also be used on a fitted model to identify the subjects with maximum contribution to the log-likelihood.
A DataFrame method is defined on the resulting object for easy viewing of the results.
fi = findinfluential(model, pop, parameters, FOCE(), k=20)
fi_df = DataFrame(fi)Pumas.get_model_combinations — Methodget_model_combinations()Get all valid arguments to build_model_str or build_model_expr.
Returns a tuple (pks, pds, obs, dcps) where pks is a vector of all possible PK specifications, and so on, for the model generation.
Example usage:
for args in Iterators.product(get_model_combinations())
obs = make_obs(args[3], args[1], args[2])
build_model_str(args[1], args[2], obs, args[4])
endto create all model strings (~15k different ones, 1M lines of code).
Useful for testing and possibly for automatic building, etc.
Pumas.getblock — Methodgetblock(blockname, model::AbstractPumasModel) -> array of expressions for a PumasModel macro block.The blockname should match the name of the blocks used for model creation with the @model or @emmodel macro. Only the blocks that were explicitly used in model creation can be used. Valid such blocknames are :covariates, :derived, :dynamics, :dosecontrol, :init, :observations, :options, :param, :post, :pre, :random, :reactions and :vars. Only models that are defined using the @model or @emmodel macro are supported.
Pumas.icoef — Methodicoef(fpm::AbstractFittedPumasModel; obstimes)::Vector{ConstantInterpolationStructArray}Return the individual coefficients from fpm. The individual coefficients are the variables defined in the @pre block in the @model macro. The function returns a vector of covariate interpolation objects defined on obstimes. If no obstimes are given, the time points used to define the covariates will be used. Each of them can be evaluated at any time point, but the interpolant is piecewise constant between the obstimes. Each of the covariate interpolation objects can be converted to a DataFrame by calling the DataFrame constructor on each element, or a DataFrame with all subjects can be obtained by calling the DataFrame constructor on the vector output by this function.
Pumas.infer — Methodinfer(fpm::FittedPumasModel; level=0.95, rethrow_error::Bool=false, sandwich_estimator::Bool=true) -> FittedPumasModelInferenceCompute the vcov matrix and return a struct used for inference based on the fitted model fpm. The confidence intervals are calculated as the (1-level)/2 and (1+level)/2 quantiles of the estimated parameters. sandwich_estimator is a boolean that switches on or off the sandwich estimator. If rethrow_error is false (the default value), no error will be thrown if the variance-covariance matrix estimator fails. If it is true, an error will be thrown if the estimator fails.
Pumas.infer — Methodinfer(fpm::FittedPumasModel, bts::Pumas.Bootstrap; level=0.95)
Perform bootstrapping by resampling the Subjects from the Population stored in fpm. The confidence intervals are calculated as the (1-level)/2 and (1+level)/2 quantiles of the estimated parameters. The number of samples used in the bootstrapping is bts.samples. bts.ensemblealg specifies the ensemblealg used here. If ensemblealg is EnsembleSerial(), a single thread will be used. If it is EnsembleThreads() (the default value), multiple threads will be used. See the documentation of Bootstrap for more details on constructing an instance of Bootstrap.
Pumas.infer — Methodinfer(fpm::FittedPumasModel, sir::SIR; level=0.95, ensemblealg = EnsembleThreads()) -> FittedPumasModelInferencePerform sampling importance re-sampling for the Subjects from the Population stored in fpm. The confidence intervals are calculated as the (1-level)/2 and (1+level)/2quantiles of the estimated parameters.ensemblealgcan beEnsembleThreads()(the default value) to use multi-threading orEnsembleSerial()` to use a single thread.
Pumas.init_params — Functioninit_params(model::PumasModel)Create a parameter set populated with the initial parameters of PumasModel.
Pumas.init_randeffs — Functioninit_randeffs(model::AbstractPumasModel, param::NamedTuple, [, pop::Population])Create an object with random effects locked to their mean values.
The optional argument pop takes a Population and changes the output to a vector of such random effects with one element for each subject within the population.
Pumas.inspect — Methodinspect(fpm::AbstractFittedPumasModel; wres_approx::LikelihoodApproximation, nsim::Int, rng)Output a summary of the model predictions, residuals, Empirical Bayes estimates, and NPDEs (when requested).
Called on a fit output and allows the keyword argument wres_approx for approximation method to be used in residual calculation. The default value is the approximation method used for the marginal likelihood calculation in the fit that produced fpm. The keyword nsim controls the number of times each subject is simulated for npde computation. A FittedPumasModelInspection object with pred, wres, ebes, and npdes is output.
Pumas.lrtest — Methodlrtest(fpm_0::AbstractFittedPumasModel, fpm_A::AbstractFittedPumasModel)::LikelihoodRatioTestCompute the likelihood ratio test statistic of the null hypothesis defined by fpm_0 against the the alternative hypothesis defined by fpm_A. The pvalue function can be used for extracting the p-value based on the asymptotic Χ²(k) distribution of the test statistic.
Pumas.marginal_nll — Functionmarginal_nll(model, subject, param, approx, ...)
marginal_nll(model, population, param, approx, ...)Compute the marginal negative loglikelihood of a subject or dataset, using the integral approximation approx.
See also loglikelihood.
Pumas.npde — Methodnpde(fpm::AbstractFittedPumasModel; nsim::Integer)Calculate the normalised prediction distribution errors (NPDE).
The keyword nsim controls the number of times each subject is simulated.
Pumas.penalized_conditional_nll — Methodpenalized_conditional_nll(m::AbstractPumasModel, subject::Subject, param::NamedTuple, randeffs::NamedTuple; diffeq_options)Compute the penalized conditional negative log-likelihood. This is the same as conditional_nll, except that it incorporates the penalty from the prior distribution of the random effects.
Here param can be either a NamedTuple or a vector (representing a transformation of the random effects to Cartesian space).
Pumas.postprocess — Methodpostprocess(obs::SimulatedObservations)
postprocess(func::Function, obs::SimulatedObservations)
postprocess(obs::SimulatedPopulation; stat = identity)
postprocess(func::Function, obs::SimulatedPopulation; stat = identity)Post-processes the output of simobs. The output of a simobs call stores the simulated observations but also all the intermediate values computed such as the parameters used, individual coefficients, dose control parameters, covariates, differential equation solution, etc. There are a number of post-processing operations you can do on the simulation output to compute various queries and summary statistics based on the simulations.
The postprocess function is a powerful tool that allows you to make various queries using the simulation results. There are multiple ways to use the postprocess function. The first way to use the postprocess function is to extract all the information stored in the simulation result in the form of a vector of named tuples. Each named tuple has all the intermediate values evaluated when simulating 1 run. Let sims be the output of any simobs operation.
generated = Pumas.postprocess(sims)generated is a vector of simulation runs. Hence, generated[i] is the result from the ith simulation run. Time-dependent variables are stored as a vector with 1 element for each time point where there is an observation. Alternatively, if obstimes was set instead when calling simobs, the time-dependent variables will be evaluated at the time points in obstimes instead.
The second way to use postprocess is by passing in a post-processing function. The post-processing function can be used to:
- Transform the simulated quantities, or
- Compare the simulated quantities to the observations.
We use the do syntax here which is short for passing in a function as the first argument to postprocess. The do syntax to pass in a post-processing function is:
Pumas.postprocess(sims) do gen, obs
...
endwhere gen is the named tuple of all generated quantities from 1 simulation run and obs is the named tuple of observations. For instance to query the ratio of simulated observations dv that are higher than the observed quantity dv at the respective time points, you can use:
Pumas.postprocess(sims) do gen, obs
sum(gen.dv .> obs.dv) / length(gen.dv)
endgen.dv is the simulated vector of dv whose length is the same as the number of observation time points. obs.dv is the observation vector dv. gen.dv .> obs.dv returns a vector of true/false, with one element for each time point. The sum of this vector gives the number of time points where the simulation was higher than the observation. Dividing by the number of time points gives the ratio. When using postprocess in this way, the output is always a vector of the query results. In the query function body, you can choose to use only gen or only obs but the header must always have both gen and obs:
Pumas.postprocess(sims) do gen, obs
...
endThe third way to use the postprocess function is to compute summary statistics of the simulated quantities or of functions of the simulated quantities. Summary statistics can be computed by passing a statistic function as the stat keyword argument. For example in order to estimate the probability that a simulated value is higher than an observation, you can use:
Pumas.postprocess(sims, stat = mean) do gen, obs
gen.dv .> obs.dv
endThis function will do 2 things:
- Concatenate the query results (e.g.
gen.dv .> obs.dv) from all the simulation runs into a single vector. - Compute the mean value of the combined vector.
Alternatively, you can use the mean function to do the same thing without using the keyword argument. The following will call the postprocess function under the hood.
mean(sims) do gen, obs
gen.dv .> obs.dv
endThe result of this operation will be a scalar equal to the mean value of the concatenated vector of queries.
In order to get the probability that the simulated quantity is higher than the observation for each time point, you can call the mean function externally as such:
generated = Pumas.postprocess(sims) do gen, obs
gen.dv .> obs.dv
end
mean(generated)This returns a vector of probabilities of the same length as the number of time points without concatenating all the queries together.
To compute a summary statistic of all the generated quantity, you can also call:
Pumas.postprocess(sims, stat = mean)without specifying a post-processing function or for short:
mean(sims)Besides mean, you can also use any of the following summary statistic functions in the same way:
stdfor element-wise standard deviationvarfor element-wise variancecorfor correlation between multiple quantitiescovfor covariance between multiple quantities
These functions can be passed in as the stat keyword argument to postprocess or they can be used in the short form, e.g.:
generated = Pumas.postprocess(sims, stat = std) do gen, obs
...
end
std(sims) do gen, obs
...
end
std(sims)
generated = Pumas.postprocess(sims, stat = var) do gen, obs
...
end
var(sims) do gen, obs
...
end
var(sims)The cor and cov statistics are unique in that they require a post-processing function that outputs a vector. For example, to estimate the correlation between the CL and Vc parameters in a 1-compartment model, you can use any of the following:
Pumas.postprocess(s, stat = cor) do gen, obs
[gen.CL[1], gen.Vc[1]]
end
cor(s) do gen, obs
[gen.CL[1], gen.Vc[1]]
endNote that gen.CL is a vector of simulated CL values for all the time points. But since the value is constant across time, we can use the first element gen.CL[1] only. cov can be used instead of cor to compute the covariance matrix. The output of this operation is either a correlation or a covariance matrix.
Pumas.probstable — Methodprobstable(fpm::FittedPumasModel)
Return a DataFrame with outcome probabilities of all discrete dependent variables.
Pumas.pvalue — Methodpvalue(t::LikelihoodRatioTest)::RealCompute the p-value of the likelihood ratio test t based on the asymptotic Χ²(k) distribution of the test statistic.
Pumas.read_pumas — Methodread_pumas(filepath::String, args...; kwargs...)
read_pumas(
df::AbstractDataFrame;
observations=Symbol[:dv],
covariates=Symbol[],
id::Symbol=:id,
time::Symbol=:time,
evid::Union{Nothing,Symbol}=nothing,
amt::Symbol=:amt,
addl::Symbol=:addl,
ii::Symbol=:ii,
cmt::Symbol=:cmt,
rate::Symbol=:rate,
ss::Symbol=:ss,
route::Symbol=:route,
mdv::Symbol=nothing,
event_data::Bool=true,
covariates_direction::Symbol=:right,
check::Bool=event_data,
adjust_evid34::Bool=true)Import NMTRAN-formatted data.
df:DataFramecontaning the data to be converted to aVector{<:Subject}observations: dependent variables specified by a vector of column namescovariates: covariates specified by a vector of column namesid: specifies the ID column of the dataframetime: specifies the time column of the dataframeevid: specifies the event ID column of the dataframe. See?Pumas.Eventfor more details.amt: specifies the dose amount column of the dataframe. See?Pumas.Eventfor more details.addl: specifies the column of the dataframe that indicated the number of repeated dose events. If not specified then the value is zero.ii: specifies the dose interval column of the dataframe. See?Pumas.Eventfor more details.cmt: specifies the compartment column of the dataframe. See?Pumas.Eventfor more details.rate: specifies the infusion rate column of the dataframe. See?Pumas.Eventfor more details.ss: specifies the steady state column of the dataframe. See?Pumas.Eventfor more details.route: specifies the route of administration column of the dataframe. See?Pumas.Eventfor more details.mdv: specifies the the column of the dataframe indicating if observations are missing.event_data: toggles assertions applicable to event datacovariates_direction: specifies direction of covariate interpolation. Either:leftor:right(default)check: toggles NMTRAN compliance check of the input dataadjust_evid34: toggles adjustment of time vector for reset events (evid=3andevid=4). If true (the default) then the time of the previous event is added to the time on record to ensure that the time vector is monotonically increasing.
Pumas.remove_italics — Methodremove_italics(ex, vars)
Ensure that the variables specified in vars are not italicized during latexification of ex.
Pumas.sample_params — Methodsample_params(model::PumasModel; rng = Pumas.default_rng())Create a parameter set populated with values for the parameters of a PumasModel. Random values will be drawn from the prior distributions for parameters that have a prior. Otherwise, a determinsitic value will be assigned to the parameter similar to init_params.
Pumas.sample_randeffs — Functionsample_randeffs([rng::AbstractRNG=Random.default_rng(),] model::AbstractPumasModel, param::NamedTuple [, pop::Population])Generate a random set of random effects for model model, using parameters param. Optionally, a random number generator object rng can be passed as the first argument.
The optional argument pop takes a Population and changes the output to a vector of such random effects with one element for each subject within the population.
Pumas.simobs — Functionsimobs(
model::AbstractPumasModel,
population::Union{Subject, Population}
param,
randeffs=sample_randeffs(model, param, population);
obstimes=nothing,
ensemblealg=EnsembleSerial(),
diffeq_options=NamedTuple(),
rng=Random.default_rng(),
)Simulate random observations from model for population with parameters param at obstimes (by default, use the times of the existing observations for each subject). If no randeffs is provided, then random ones are generated according to the distribution in the model.
Arguments
modelmay either be aPumasModelor aPumasEMModel.populationmay either be aPopulationofSubjects or a singleSubject.paramshould be either a single parameter set, in the form of aNamedTuple, or a vector of such parameter sets. If a vector then each of the parameter sets in that vector will be applied in turn. Example:(; tvCL=1., tvKa=2., tvV=1.)
randeffsis an optional argument that, if used, should specify the random effects for each subject inpopulation. Such random effects are specified byNamedTuples forPumasModels(e.g.(; tvCL=1., tvKa=2.)) and byTuplesforPumasEMModels (e.g.(1., 2.)). Ifpopulationis a singleSubject(without being enclosed in a vector) then a single such random effect specifier should be passed. If, however,populationis aPopulationof multipleSubjects thenrandeffsshould be a vector containing one such specifier for eachSubject. The functionsinit_randeffs,zero_randeffs, andsample_randeffsare sometimes convenient for generatingrandeffs:
randeffs = zero_randeffs(model, param, population)
solve(model, population, param, randeffs)If no randeffs is provided, then random ones are generated according to the distribution in the model.
obstimesis a keyword argument where you can pass a vector of times at which to simulate observations. The default,nothing, ensures the use of the existing observation times for eachSubject.ensemblealgis a keyword argument that allows you to choose between different modes of parallelization. Options includeEnsembleSerial(),EnsembleThreads()andEnsembleDistributed().diffeq_optionsis a keyword argument that takes aNamedTupleof options to pass on to the differential equation solver.rngis a keyword argument where you can specify which random number generator to use.
Pumas.simobs — Functionsimobs(pmi::FittedPumasModelInference, population::Population = pmi.fpm.data, randeffs::Union{Nothing, AbstractVector{<:NamedTuple}} = nothing; samples::Int, rng = default_rng(), kwargs...,)Simulates observations from the fitted model using a truncated multivariate normal distribution for the parameter values when a covariance estimate is available and using non-parametric sampling for simulated inference and bootstrapping. When a covariance estimate is available, the optimal parameter values are used as the mean and the variance-covariance estimate is used as the covariance matrix. Rejection sampling is used to avoid parameter values outside the parameter domains. Each sample uses a different parameter value. samples is the number of samples to sample. population is the population of subjects which defaults to the population associated the fitted model. randeffs can be set to a vector of named tuples, one for each sample. If randeffs is not specified (the default behaviour), it will be sampled from its distribution.
Pumas.simobs — Functionsimobs(fpm::FittedPumasModel, [population::Population,] vcov::AbstractMatrix, randeffs::Union{Nothing, AbstractVector{<:NamedTuple}} = nothing; samples::Int, rng = default_rng(), kwargs...)Simulates observations from the fitted model using a truncated multi-variate normal distribution for the parameter values. The optimal parameter values are used for the mean and the user supplied variance-covariance (vcov) is used as the covariance matrix. Rejection sampling is used to avoid parameter values outside the parameter domains. Each sample uses a different parameter value. samples is the number of samples to sample. population is the population of subjects which defaults to the population associated the fitted model so it's optional to pass. randeffs can be set to a vector of named tuples, one for each sample. If randeffs is not specified (the default behaviour), it will be sampled from its distribution.
Pumas.simobs — Methodsimobs(b::BayesMCMCResults; subject = nothing, samples::Int = 10, simulate_error = false, rng::AbstractRNG = default_rng())Simulates model predictions using the posterior samples for the population and subject-specific parameters. The predictions are either for the subject with index subject or the whole population if no index is input. samples is the number of simulated predictions returned per subject. If simulate_error is false, the mean of the predictive distribution's error model will be returned, otherwise a sample from the predictive distribution's error model will be returned for each subject. rng is the pseudo-random number generator used in the sampling.
Pumas.simobs — Methodsimobs(model::PumasModel, population::Population; samples::Int = 10, simulate_error = true, rng::AbstractRNG = default_rng())Simulates model predictions for population using the prior distributions for the population and subject-specific parameters. samples is the number of simulated predictions returned per subject. If simulate_error is false, the mean of the predictive distribution's error model will be returned, otherwise a sample from the predictive distribution's error model will be returned for each subject. rng is the pseudo-random number generator used in the sampling.
Pumas.simobs — Methodsimobs(model::PumasModel, subject::Subject; samples::Int = 10, simulate_error = true, rng::AbstractRNG = default_rng())Simulates model predictions for subject using the prior distributions for the population and subject-specific parameters. samples is the number of simulated predictions returned. If simulate_error is false, the mean of the predictive distribution's error model will be returned, otherwise a sample from the predictive distribution's error model will be returned for each subject. rng is the pseudo-random number generator used in the sampling.
Pumas.simobstte — Functionsimobstte(
model::PumasModel,
subject::Union{Subject,Population},
param::NamedTuple,
randeffs::Union{Vector{<:NamedTuple}, NamedTuple, Nothing}=nothing;
minT=0.0,
maxT=nothing,
nT=10,
repeated=false,
rng = default_rng())Simulate observations from a time-to-event model, i.e. one with a TimeToEvent dependent variable and return either a new Subject or Population with random event time stamps in the time vector. The observations vector contains the censoring information (see TimeToEvent for more details on the data coding).
The simobstte function first computes nT values of the survival probability from t=minT to maxT and then interpolates with a cubic spline to get a smooth survival funtion. Given a survival funtion, it is possible to simulate from the distribution by using inverse cdf sampling. Instead of sampling a uniform variate to be compared with the survival function, we sample an exponential variate and compare with cumulative hazard. The two approaches are equivalent. The Roots package is then used for computing the root. If repeated=true then new event times are drawn until maxT has been reached.
Pumas.strip_t — Methodstrip_t(ex, vars)
Strip the (t) from the vars in the expression ex.
Works both when (t) is appended using expressions, :(x(t)), or symbols, Symbol("x(t)").
Pumas.tad — Methodtad(t, events) -> time after most recent doseConverts absolute time t (scalar or array) to relative time after the most recent dose. If t is earlier than all dose times, then the (negative) difference between t and the first dose time is returned instead. If no dose events exist, t is returned unmodified.
Pumas.variables — Functionvariables(dynobj)
Get a list of variable names for a model or dynamics object.
Pumas.vech — Methodvech(A; diag = true)
Converts the symmetric matrix A to a vector of lower triangular elements. If diag is set to true, the vector will include the diagonal elements of A. If diag is false, the vector will only include the strict lower triangle and the diagonal elements of A will be ignored. For example when diag is true:
A = [
1.0 0.1;
0.1 2.0
]
Pumas.vech(A, diag = true)gives
3-element Vector{Float64}:
1.0
0.1
2.0And when diag is false:
Pumas.vech(A, diag = false)gives
1-element Vector{Float64}:
0.1Pumas.vechinv — Methodvechinv(x; diag = nothing, lower = nothing, lower = true)
Converts a vector of lower or upper triangular elements (travesed column by column) to a symmetric matrix. If diag is set to nothing, the vector is assumed to include diagonal elements. Otherwise, the vector is assumed to only include the strict triangle and the diagonal elements of the output matrix will be set to the value of diag. The keyword argument lower can be used to specify whether the vector is that of lower triangular elements or upper triangular elements. For example when lower is true and diag is not set:
Pumas.vechinv([1.0, -0.1, 2.0], lower = true)gives
2×2 Matrix{Float64}:
1.0 -0.1
-0.1 2.0And when diag is 1:
Pumas.vechinv([-0.1], diag = 1)gives
2×2 Matrix{Float64}:
1.0 -0.1
-0.1 1.0Pumas.vpc — Methodvpc(fpm::AbstractFittedPumasModel;
samples::Integer = 499
qreg_method = IP(),
observations::Vector{Symbol} = [keys(fpm.data[1].observations)[1]],
stratify_by::Vector{Symbol} = Symbol[],
quantiles::NTuple{3,Float64}=(0.1, 0.5, 0.9),
level::Real=0.95,
ensemblealg=EnsembleSerial(),
bandwidth=nothing,
maxnumstrats=[6 for i in 1:length(stratify_by)],
covariates::Vector{Symbol} = [:time],
smooth::Bool = true,
rng::AbstractRNG=default_rng(),
nodes::Dict = Dict{Tuple{},Vector{Float64}}(),
nnodes::Integer,
prediction_correction = false)Computes the quantiles for VPC for a FittedPumasModel or FittedPumasEMModel with simulated confidence intervals around the empirical quantiles based on samples simulated populations.
The following keyword arguments are supported:
samples: The number of simulated populations to generate, defaults to499quantiles::NTuple{3,Float64}: A three-tuple of the quantiles for which the quantiles will be computed. The default is(0.1, 0.5, 0.9)which computes the 10th, 50th and 90th percentile.level::Real: Probability level to use for the simulated confidence intervals. The default is0.95.observations::Vector{Symbol}: The name of the dependent variable to use for the VPCs. The default is the first dependent variable in thePopulation.stratify_by: The covariates to be used for stratification. Takes anVectorof theSymbols of the stratification covariates.ensemblealg: This is passed to thesimobscall for thesamplessimulations. For more description check the docs forsimobs.bandwidth: The kernel bandwidth in the quantile regression. If you are seeingNaNs or an error, increasing the bandwidth should help in most cases. With higher values of thebandwidthyou will get more smoothened plots of the quantiles so it's a good idea to check with your data to pick the rightbandwidth. For a DiscreteVPC, the default value is nothing to indicate automatic bandwidth tuning. For a CensoredVPC, the bandwidth should be given as a tuple where the first value applies to the ContinuousVPC and the second to the DiscreteVPC for the censoring variable. For example,(4.0, nothing)applies a bandwidth of4.0to the ContinuousVPC and automatically tunes the DiscreteVPC. Automatic tuning is only applicable to DiscreteVPCs at this moment.maxnumstrats: The maximum number of strata for each covariate. Takes an Vector with the number of strata for the corresponding covariate, passed instratify_by. It defaults to6for each of the covariates.covariates: The independent variable for VPC, defaults to[:time].smooth: In case of discrete VPC used to determine whether to interpolate the dependent variable if independent variable is continuous, defaults totrue.rng: A random number generator, uses thedefault_rngfromRandomas default.nodes: A Dict of covariate values to use for the local regression.nodes: An integer specifying the number of points to use for the local regressions for each stratum.qreg_method: Defaults toIP(). For most users the method used in quantile regression is not going to be of concern, but if you see large run times switchingqreg_methodtoIP(true)should help in improving the performance with a tradeoff in the accuracy of the fitting.prediction_correction: Default tofalse. Set totrueto enable prediction correction that multiplies all observations with the ratio between the mean prediction and each individuals population prediction.
Pumas.wresiduals — Functionwresiduals(fpm::AbstractFittedPumasModel, approx::LikelihoodApproximation; nsim=nothing)Calculate the individual and population weighted residual.
Takes a fit result, an approximation method for the marginal likelihood calculation which defaults to the method used in the fit and the number of simulations with the keyword argument nsim. If nsim is specified only the Expected Simulation based Individual Weighted Residuals (EIWRES) is included in the output as individual residual and population residual is not computed. Using the FO approximation method corresponds to the WRES and while FOCE(I) corresponds to CWRES. The output is a SubjectResidual object that stores the population (wres) and individual (iwres) residuals along with the subject and approximation method (approx).
Pumas.zero_randeffs — Functionzero_randeffs(model::AbstractPumasModel, param::NamedTuple)
zero_randeffs(model::AbstractPumasModel, pop::Population, param::NamedTuple)Create an object to signify that the random effects should be zero.
The optional argument pop takes a Population and changes the output to a vector of such random effects with one element for each subject within the population.
Pumas.ηshrinkage — Methodηshrinkage(fpm::AbstractFittedPumasModel)Calculate the η-shrinkage.
Takes the result of a fit as the only input argument. A named tuple of the random effects and corresponding η-shrinkage values is output.
Pumas.ϵshrinkage — Methodϵshrinkage(fpm::AbstractFittedPumasModel)Calculate the ϵ-shrinkage.
Takes the result of a fit as the only input argument. A named tuple of derived variables and corresponding ϵ-shrinkage values is output.
StatsAPI.aic — Methodaic(fpm::AbstractFittedPumasModel)Calculate the Akaike information criterion (AIC) of the fitted Pumas model fpm.
StatsAPI.bic — Methodbic(fpm::AbstractFittedPumasModel)Calculate the Bayesian information criterion (BIC) of the fitted Pumas model fpm.
StatsAPI.coeftable — Methodcoeftable(pmi::FittedPumasModelInference; level = pmi.level) -> DataFrameConstruct a DataFrame of parameter names, estimates, standard error, and confidence interval with confidence level level. The default value of level is the one that was passed in infer.
StatsAPI.coeftable — Methodcoeftable(fpm::FittedPumasModel) -> DataFrameConstruct a DataFrame of parameter names and estimates from fpm.
StatsAPI.coeftable — Methodcoeftable(cfpm::Vector{<:FittedPumasModel}) -> DataFrameConstruct a DataFrame of parameter names and estimates and their standard deviation from vector of fitted single-subject models vfpm.
StatsAPI.fit — Methodfit(fpm::AbstractFittedPumasModel; kwargs...)Convenience method to start a new fit where an old one left off.
This uses both the fitted parameters and the fitted random effects from fpm as the starting points for the new fit.
StatsAPI.fit — MethodDistributions.fit( model::PumasModel, data::Population, param::NamedTuple, alg::BayesMCMC, )
Samples from the posterior distribution of the model's population and subject-specific parameters given the observations in data using the Hamiltonian Monte Carlo (HMC) based No-U-Turn Sampler (NUTS) algorithm. The MCMC sampler is initialized starting from param for the population parameters. The subject-specific parameters are also apprioriately initialized. The Bayesian inference algorithm's options are passed in the alg struct. Run ?BayesMCMC for more details on the options that can be set.
StatsAPI.fit — Methodfit(
model::PumasModel,
population::Population,
param::NamedTuple,
approx::Union{LikelihoodApproximation, MAP};
optim_alg = BFGS(linesearch = Optim.LineSearches.BackTracking(), initial_stepnorm = 1.0),
optim_options::NamedTuple = NamedTuple(),
optimize_fn = nothing,
constantcoef::NamedTuple = NamedTuple(),
omegas::Tuple = tuple(),
ensemblealg::DiffEqBase.EnsembleAlgorithm = EnsembleThreads(),
checkidentification = true,
diffeq_options = NamedTuple())Fit the Pumas model model to the dataset population with starting values param using the estimation method approx. Currently supported values for the approx argument are FO, FOCE, LaplaceI, NaivePooled, and BayesMCMC. See the online documentation for more details about the different methods.
To control the optimization procedure for finding population parameters (fixed effects), use the optim_alg and optim_options keyword arguments. In previous versions of Pumas the argument optimize_fn was used, but is now discouraged and will be removed in a later version of Pumas. These options control the optimization all approx methods except BayesMCMC. The default optimization function uses the quasi-Newton routine BFGS method from the Optim package. It can be changed by setting the optim_alg to an algorithm implemented in Optim.jl as long as it does not use second order derivatives. Optimization specific options can be passed in using the optim_options keyword and has to be a NamedTuple with names and values that match the Optim.Options type. For example, the optimization trace can be disabled and the algorithm can be changed to L-BFGS by passing optim_alg=Optim.LBFGS(), optim_options = ;(show_trace=false) to fit. See Optim for more defails.
It is possible to fix one or more parameters of the fit by passing a NamedTuple as the constantcoef argument with keys and values corresponding to the names and values of the fixed parameters, e.g. constantcoef=(σ=0.1,).
When models include an @random block and fitting with NaivePooled is requested, it is required that the user supplies the names of the parameters of the random effects as the omegas argument such that these can be ignored in the optimization, e.g. omegas=(Ω,).
Parallelization of the optimization is supported for most estimation methods via the ensemble interface of DifferentialEquations.jl. Currently, the only supported options are:
EnsembleThreads(): the default. Accelerate fits by using multiple threads.EnsembleSerial(): fit using a single thread.EnsembleDistributed(): fit by using multiple worker processes.
The fit function will check if any gradients and throw an exception if any of the elements are exactly zero unless checkidentification is set to false.
Further keyword arguments can be passed via the diffeq_options argument. This allows for passing arguments to the differential equations solver such as alg, abstol, and reltol. The default values for these are AutoVern7(Rodas5(autodiff=true)), 1e-12, and 1e-8 respectively. See the DifferentialEquations.jl documentation for more details.
StatsAPI.fit — Methodfit(
model::PumasEMModel,
population::Population,
param::NamedTuple,
approx::Union{SAEM,LaplaceI};
ensemblealg = EnsembleThreads(),
rng = default_rng()
)Fit the PumasEMModel to the dataset population using the initial values specified in param. The supported methods for the approx argument are currently SAEM and LaplaceI. See the online documentation for more details about these two methods.
When fitting with SAEM the variance terms of the random effects (Ω) and the dispersion parameters for the error model (σ) are initialized to the identity matrix or 1 as appropriate. They may also be specified. With SAEM, it is reccomended to choose an init larger than the true values to facilitate exploration of the parameter space and avoid getting trapped in local optima early. Currently LaplaceI fits with a PumasEMModel require Ω to be diagonal.
Options for ensemblealg are EnsembleSerial() and EnsembleThreads() (the default). The fit will be parallel if ensemblealg == EnsembleThreads(). SAEM imposes the additional requirement for threaded fits that either the rng used supports Future.randjump or that rng be a vector of rngs, one per thread. The default_rng() supports randjump().
StatsAPI.loglikelihood — Methodloglikelihood(fpm::AbstractFittedPumasModel)Compute the loglikelihood of a fitted Pumas model.
StatsAPI.loglikelihood — Methodloglikelihood(b::BayesMCMCResults; split_method, split_by)Computes a 3D array of the in-sample loglikelihoods of all the data points/subsets using the output of the MCMC fit function. The output is a 3D array of size (ndata, nsamples, nchains) where ndata is the number of data points, nsamples is the number of MCMC samples per chain and nchains is the number of chains.
The way by which the data is split into points/subsets is determined using the split_method and the split_by keyword arguments. The split_method argument can be of any of the following types:
LeaveKfor a leave-K-out splittingKFoldfor a K-fold splittingLeaveFutureKfor leaving K future points at a time
while split_by can be of any of the following types:
BySubjectfor leaving out subjectsByObservationfor leaving out individual observations per subject
Please refer to the documentation of each of the above types for the parameters of each split method, e.g. by typing ?LeaveK or ?BySubject.
Examples:
Assume there are 5 subjects and 10 observations per subject. The following are some of the combinations in which the above inputs can be used:
- Leave-1-observation-out splitting, splitting 1 observation for all the subjects at a time.
allsubjects = truemeans that the same observation index is removed for all the subjects, e.g. the 10th observation for all the subjects is used as a single subset, then the 9th observation is used for all the subjects is used as another subset, etc.
split_method = LeaveK(K = 1)
split_by = ByObservation(allsubjects = true)
loglikelihood(b; split_method = split_method, split_by = split_by)- Leave-2-future-observations-out splitting, splitting 2 future observations per subject such that no less than 4 past observations are remaining.
allsubjects = falsemeans that only the data of one subject at a time will be split. So observations 5 and 6 of subject 1 are the first data subset. The second subset is observations 7 and 8 of subject 1. The third subset is observations 9 and 10 of subject 1, etc. Then in the fourth to sixth runs, subject 2's data gets split. In the seventh to nineth runs, subject 3's data gets split, etc.
split_method = LeaveFutureK(K = 2, minimum = 4)
split_by = ByObservation(allsubjects = false)
loglikelihood(b; split_method = split_method, split_by = split_by)- Leave-1-subject-out splitting, marginalizing the subject-specific parameters using quadrature integration when computing the in-sample likelihoods. Using this method, each subject is a data point. The in-sample likelihoods computed are the marginal likelihoods of the subjects computed using Gaussian quadrature with a maximum of 1000 integration points. Refer to the documentation of
LLQuadby typing?LLQuadfor more details on the parameters of the quadrature algorithm.FO(),FOCE()andLaplaceI()can also be used instead.
split_method = LeaveK(K = 1)
split_by = BySubject(marginal = LLQuad(imaxiters = 1000))
loglikelihood(b; split_method = split_method, split_by = split_by)- Leave-1-future-subject-out splitting, where the in-sample likelihoods are conditional likelihoods computed using
0s for the subject-specific parameters instead of marginalizing. In this method, each subject after the third one is considered a data point, i.e. the 4th and 5th subjects are the only 2 points for which the in-sample likelihood is computed.
split_method = LeaveFutureK(K = 1, minimum = 3)
split_by = BySubject(marginal = nothing)
loglikelihood(b; split_method = split_method, split_by = split_by)StatsAPI.loglikelihood — Methodloglikelihood(r::BayesianCVResult)Computes the out-of-sample loglikelihoods from the result r of crossvalidate. The output is vector of matrices, e.g. loglik where loglik[i] is the loglikelihoods matrix of data point/set i which was not used in training. loglik[i][j, k] is the loglikelihood using the parameter values from the jth sample of the kth chain.
StatsAPI.predict — Methodpredict(
fpm::AbstractFittedPumasModel,
[population::Union{Subject,Population};
[obstimes::AbstractVector]
)::Union{SubjectPrediction,Vector{SubjectPrediction}}Compute population and individual predictions for the fitted model fpm. By default, the predictions are computed for the estimation data but the predictions can also be computed for user supplied data by passing either a single subject or a vector of subjects (Population) as the population argument.
If the optional obstimes argument is passed then the time points in obstimes are used for the predictions. Otherwise, the time points of the observations for each subject in the population are used for the predictions.
Any optional keyword arguments used when fitting fpm are reused when computing the predictions.
StatsAPI.predict — Methodpredict(
model::AbstractPumasModel,
subject::Subject,
param::NamedTuple,
[randeffs,];
[obstimes::AbstractVector,
diffeq_options::NamedTuple]
)::Union{SubjectPrediction,Vector{SubjectPrediction}}Compute population and individual predictions for the single subject based on model and the population parameters param. A NamedTuple of random effects, randeffs, can be omitted or provided by the user. If they are omitted, they will be estimated from the data in the subject.
If the optional obstimes argument is passed then the time points in obstimes are used for the predictions. Otherwise, the time points of the observations of the subject are used for the predictions.
The function allows for extra keyword arguments to be passed on to the differential equations solver through the diffeq_options keyword. See the online documentation for more details.
StatsAPI.stderror — Methodstderror(f::AbstractFittedPumasModel) -> NamedTupleCompute the standard errors of the population parameters and return the result as a NamedTuple matching the NamedTuple of population parameters.
StatsAPI.vcov — Methodvcov(f::AbstractFittedPumasModel) -> MatrixCompute the covariance matrix of the population parameters
Pumas.@emmodel — Macro@emmodelDefine a PumasEMModel. It may have the following blocks:
@paramand@random: Defines fixed and random effects, e.g.
@random begin
CL ~ 1 + wt | LogNormal
θbioav ~ 1 | LogitNormal
endDistributions specify the parameter's support. Only Normal (-∞,∞) LogNormal (0,∞), and LogitNormal (0,1) are currently supported.
These define 1 unconstrained parameter per term in the formula, for example CL = exp(μ + wt * β + η) where μ, β ∈ (-∞,∞) and wt is a covariate constant with respect to time. η is a random effect. The @param block is equivalent, with the exception that there is no random effect η. These variables are accessible in @pre, @dosecontrol, @dynamics, @post, and @error blocks.
@covariates: Covariates available in the@pre,@dosecontrol,@dynamics, and@postblocks.@pre: Block evaluated before the@dynamics.@dosecontrol: Block specifying dose control parameters. Function of@paramand@randomvariables only.@init: return initial conditions for dynamical variables.@dynamics: Dynamics, equivalent to the functionality in the@modelmacro.@post: Block evalauted after the@dynamicsand before the@error.@error: Block describing the error model. Dispersion parameters are implicit.Y ~ ProportionalNormal(μ)indicates thatYhas aNormaldistribution with meanμ.Ymust be observed subject data, whileμcan can be defined in the@param,@random,@pre,@dynamics, or@postblocks.
Pumas.@model — Macro@modelDefines a Pumas model, with the following possible blocks:
@param: defines the model's fixed effects and corresponding domains, e.g.tvcl ∈ RealDomain(lower = 0).@random: defines the model's random effects and corresponding distribution, e.g.η ~ MvNormal(Ω).@covariates: Names of covariates used in the model.@pre: pre-processes variables for the dynamic system and statistical specification.@dosecontrol: defines dose control parameters as a function of fixed and random effects.@vars: define variables usable in other blocks. It is recomended to define them in@preinstead if not a function of dynamic variables.@init: defines initial conditions of the dynamic system.@dynamics: defines the dynamic system.@derived: defines the statistical model of the dependent variables.@observed: lists model information to be stored in the solution.@optionsspecifies model options, includingchecklinear,inplace, andsubject_t0.
Pumas.@nca — Macro @ncaDefines a macro that can be used in the observed block to compute NCA metrics of the computed model solutions.
Example
...
@derived begin
cp := Central/Vc
DV ~ @. Normal(cp, σ*cp)
end
@observed begin
nca = @nca cp
auc = NCA.auc(nca, interval = (0,4))
cmax = NCA.cmax(nca)
end
...- Geweke1991Geweke, J. F. (1991). Evaluating the accuracy of sampling-based approaches to the calculation of posterior moments (No. 148). Federal Reserve Bank of Minneapolis.
- Heidelberger1983Heidelberger, P., & Welch, P. D. (1983). Simulation run length control in the presence of an initial transient. Operations Research, 31(6), 1109-1144.
- Raftery1992A L Raftery and S Lewis. Bayesian Statistics, chapter How Many Iterations in the Gibbs Sampler? Volume 4. Oxford University Press, New York, 1992.