Model-based optimal design of experiments
In Pumas, users can do model-based optimal design of experiments. Currently, different variants of sample time optimization are supported and we will walk through them in this tutorial.
Loading the package
The first step is to load the OptimalDesign package using:
using OptimalDesignBy loading OptimalDesign, Pumas will be automatically loaded as well.
Define a model
Once OptimalDesign is loaded, you can define a normal Pumas model as follows.
model = @model begin
@param begin
θ ∈ VectorDomain(3; lower=zeros(3), init=ones(3))
Ω ∈ PSDDomain(2)
end
@random η ~ MvNormal(Ω)
@pre begin
Ka = θ[1]
CL = θ[2]*exp(η[1])
Vc = θ[3]*exp(η[2])
end
@dynamics Depots1Central1
@derived begin
cp := @. Central / Vc
dv ~ @. Normal(cp, 1e-2)
end
endDefine a subject
The next step is to define an observation-free subject.
subject = Subject(;
events=DosageRegimen(100),
observations=(; dv=nothing),
)Note that the observations field should be defined using a NamedTuple whose names match the observed variables' names in the @derived block. The value of the observations can be nothing but any other value can be used and will be ignored.
Specify the fixed effects
Currently, only local optimal design is available so the user must specify the values of the fixed effects to use.
params = (;
θ=[1.5, 1.0, 30.0],
Ω=[0.1 0.0; 0.0 0.1],
)Time window specification
Date and time API
In OptimalDesign, users can use the Dates standard Julia package to specify the time bounds for optimization using specific dates and times through the DateTime function. Regular floating point number bounds, i.e. Float64, are also supported but the Dates user interface allows a pain-free incoroporation of hospital schedules. If the time bounds are passed in as Float64, the optimal design will be in Float64. And if the time bounds are passed in using DateTime, the optimal design will be in DateTime.
t0 and tend below define the start and end times of a time window.
t0 = DateTime(2022, 1, 25, 9) # 25 Jan 2022 - 9:00 am
tend = DateTime(2022, 1, 25, 12) # 25 Jan 2022 - 12:00 pmThe DateTime function's inputs are of the form: (Year, Month, Day, Hour, Minute, Second, MilliSecond). Dropping arguments from the end assigns their value to 0, e.g. the minute, second and millisecond values in t0 and tend above are 0.
DateTime arithmetic
One can use arithmetic functions to increment dates and times as follows:
- Next second:
t0 + Second(1) - Next minute:
t0 + Minute(1) - Next hour:
t0 + Hour(1) - Next day:
t0 + Day(1) - Next week:
t0 + Day(7) - Next month:
t0 + Month(1) - Next year:
t0 + Year(1)
Multiplication with integers and subtraction are also supported.
Intervals
To define time bounds, OptimalDesign makes use of intervals. One can define a time window that starts at t0 and ends at tend using:
t0..tendArithmetic on intervals is also supported to increment the whole interval by 1 day or 1 hour. However, parentheses must be used carefully. The following is the correct syntax to increment the start and finish bounds by 1 day:
(t0..tend) + Day(1)The following syntax parses differently incrementing only the finish bound by a day.
t0..tend + Day(1)Decision definition
Fixed samples per time window
When doing sample time optimization, one way to specify the decision is using a fixed number of samples per time window per subject. The following dictionary can be used to associate a number of samples for each time window using the interval and arithmetic feature presented above:
bounds = Dict(
(t0..tend) => 10,
(t0..tend) + Day(1) => 10,
)After defining the time bounds and number of samples per time window, a decision can be constructed using:
dec = decision(
model,
subject,
params;
type=:observation_times,
bounds=bounds,
minimum_offset=Minute(15),
model_time_unit=Hour(1),
)for a single subject or for a whole population using:
population = [subject, subject]
dec = decision(
model,
population,
params;
type=:observation_times,
bounds=bounds,
minimum_offset=Minute(15),
model_time_unit=Hour(1),
)Since models in Pumas are unitless, when DateTime decisions are used, a model_time_unit must be passed in to specify the unit of time assumed in the model definition and dynamics model. The minimum_offset argument can be optionally specified to enforce a minimum duration between any 2 samples. The default value of minimum_offset is 1e-4.
Some additional options which can be specified here include:
rng: the random number generator used to generate random initial of sample times. The default value isRandom.GLOBAL_RNG.init: The initial sample times. The default value isnothing. Ifnothingis input, random sample times are used to initialize the optimization. The initial sample times for a subject is a vector ofDateTimeorFloat64. The initial sample times for a population is a vector of vectors ofDateTimeorFloat64.subject_multiple: The number of times each subject is counted. This is a number if only a single subject is passed in the second argument, or a vector if a population is used. The default value is 1 for an individual subject or a vector of ones for a population.time_multiples: a vector of integers used to replicate each sample multiple times. If not specified, each sample will be counted only once.
Total number of samples for all windows
The second way to specify a sample times decision in OptimalDesign is using a total number of observations to be allocated to different time windows. Mixed integer nonlinear programming will be used to find the optimal design. In this case, bounds should be a vector of intervals:
bounds = [(t0..tend), (t0..tend) + Day(1)]and an additional keyword argument N must be passed to indicate the total number of samples per subject. Example:
dec = decision(
model,
subject,
params;
type=:observation_times,
bounds=bounds,
N=20,
minimum_offset=Minute(5),
model_time_unit=Hour(1),
)Similarly, a population can be passed in as the second argument.
Some additional options which can be specified here include:
rng: the random number generator used to generate random initial of sample times. The default value isRandom.GLOBAL_RNG.init: The initial sample times. The default value isnothing. Ifnothingis input, random sample times are used to initialize the optimization. The initial sample times for a subject is a vector ofDateTimeorFloat64. The initial sample times for a population is a vector of vectors ofDateTimeorFloat64.subject_multiple: The number of times each subject is counted. This is a number if only a single subject is passed in the second argument, or a vector if a population is used. The default value is 1 for an individual subject or a vector of ones for a population.time_multiples: a vector of integers used to replicate each sample multiple times. If not specified, each sample will be counted only once.minimum_per_window: a vector of integers indicating the minimum number of samples to allocate to each time window per subject.maximum_per_window: a vector of integers indicating the maximum number of samples to allocate to each time window per subject.
Design
In OptimalDesign, a number of types of optimal design are supported. The optimality type can be specified using the keyword argument optimality which can take the following values:
:aoptimal: minimizes the trace of the inverse of the expected information matrix.:doptimal: maximizes the (log) determinant of the expected information matrix. In the optimization, negative the log determinant is minimized instead.:toptimal: maximizes the trace of the expected information matrix. In the optimization, negative the trace is minimized instead.
To perform the sample time optimization using the previously constructed decision dec, run:
result = design(dec; optimality=:doptimal)for example where :doptimal can be replaced with :aoptimal or :toptimal.
Additionally, the following options can be specified using keyword arguments:
verbose: can betrueorfalse. Iftrue, the optimization progress will be displayed. The default value isfalse.time_limit: time limit in seconds, default value is 1200 seconds (20 minutes).nlp_tol: tolerance for the optimization solver.auto_initialize: auto-intializes the optimizer to a feasible design, default value is true.processors: number of processors to use in the mixed integer optimizer.
Query the design
The output of design can be queried using:
optimaltimes(result): returns the optimal sample times for each subject. This will be a vector of vectors for a population.initvalue(result): returns the initial objective value.optimalvalue(result): returns the optimal objective value.defficiency(result): returns(det(FIM_2) / det(FIM_1))^(1/M)whereFIM_2is the Fisher information matrix (FIM) at the optimal design,FIM_1is the FIM at the initial design, andMis the number of model parameters.
Distributed mixed integer optimization on JuliaHub
If the decision is defined using multiple time windows and a shared total number of samples to be allocated, mixed integer optimization is performed to find the optimal allocation. In mixed integer optimization, you can use multiple processors in a distributed computing environment such as JuliaHub to speed up the computation. For that, you must make sure that OptimalDesign is visible and loaded on all the processors used. To do this:
- Start Julia in an environment with
OptimalDesignavailable. - Run the following to load
OptimalDesignon all the processors.
using Distributed
@everywhere using Pkg
@everywhere pkg"activate ."
@everywhere pkg"instantiate"
using OptimalDesign
@everywhere using OptimalDesign- Define the model, subject(s), parameters and decision as described in the above sections.
- Use the
processorsoption indesignto set the number of processors to use during the mixed integer optimization, e.g.processors = nprocs() - 1wherenprocs()returns the total number of processors available.
Full example
The following is an example of optimal design of sample times for a warfarin model.
using OptimalDesign, PharmaDatasets
datapath = dataset("warfarin.csv", String)
df = CSV.read(datapath, DataFrame; missingstring=[".", "", "NA"])
data = read_pumas(df)
model = @model begin
@param begin
θ₁ ∈ RealDomain(; lower=0.0, init=0.15)
θ₂ ∈ RealDomain(; lower=0.0, init=8.0)
θ₃ ∈ RealDomain(; lower=0.0, init=1.0)
Ω ∈ PSDDomain(3)
σ ∈ RealDomain(; lower=0.0001, init=sqrt(0.01))
end
@random begin
η ~ MvNormal(Ω)
end
@pre begin
Tvcl = θ₁
Tvv = θ₂
Tvka = θ₃
CL = Tvcl*exp(η[1])
Vc = Tvv*exp(η[2])
Ka = Tvka*exp(η[3])
end
@dynamics Depots1Central1
@vars begin
conc = Central / Vc
end
@derived begin
dv ~ @. Normal(log(conc), σ)
end
end
params = (;
θ₁=0.15,
θ₂=8.0,
θ₃=1.0,
Ω =[0.07 0.0 0.0; 0.0 0.02 0.0; 0.0 0.0 0.6],
σ =sqrt(0.01),
)
# 25 Jan 2021 - 9:00 am
t0 = DateTime(2021, 1, 25, 9)
# 25 Jan 2021 - 9:00 pm
tend = DateTime(2021, 1, 25, 21)
time_windows = [t0..tend]
# Decision variables
Nsubjects = length(data)
dec = decision(
model,
data,
params;
type=:observation_times,
bounds=time_windows,
N=15,
subject_multiples=fill(3, Nsubjects),
minimum_offset=Minute(30),
model_time_unit=Hour(1),
)
result = design(
dec;
optimality=:doptimal,
time_limit=400.0,
nlp_tol=1e-4,
verbose=true,
)
optimaltimes(result)