Dosage Regimens, Subjects, and Populations
In Pumas, subjects are represented by the Subject type, and collections of subjects are represented as Vectors of Subjects, aliased as a Population. Subjects are defined by their identifier, observations, covariates, and events. In this section we describe how to define Subjects programmatically or with the read_pumas function, which reads data that follows the Pumas NLME Data Format (PumasNDF). Before we look at Subjects, we will look at how to define events, which are represented by the DosageRegimen type.
Dosage Regimen Terminology
When subjects receive treatment, it is represented by an event in Pumas. Administration of a drug is represented by a DosageRegimen that describes the amount, type, frequency, and route. DosageRegimens can be constructed either programmatically with the DosageRegimen constructor or from a data source in the PumasNDF format using read_pumas. The names of the inputs are the same regardless of how the DosageRegimen is constructed. The values are defined as follows:
- amt: the amount of the dose. This is the only required value.
- time: the time at which the dose is given. Defaults to 0.
- evid: the event id. 1 specifies a normal event. 3 means it's a reset event, meaning that the value of the dynamical variable is reset to the amt at the dosing event. If 4, then the dynamical value and time are reset, and then a final dose is given. Defaults to 1.
- ii: the inter-dose interval. For steady state events, this is the length of time between successive doses. When addl is specified, this is the length of time between dose events. Defaults to 0.
- addl: the number of additional events of the same type, spaced by ii. Defaults to 0.
- rate: the rate of administration. If 0, then the dose is instantaneous. Otherwise the dose is administered at a constant rate for a duration equal to amt/rate.
- ss: an indicator for whether the dose is a steady state dose. A steady state dose is defined as the result of having applied the dose with the interval ii infinitely many successive times. 0 indicates that the dose is not a steady state dose. 1 indicates that the dose is a steady state dose. 2 indicates that it is a steady state dose that is added to the previous amount. The default is 0.
- route: route of administration to be used in NCA analysis if it is carried out with the integrated interface inside @model. Defaults to NullRoute, which means that no route is specified.
This specification leads to the following default constructor for the DosageRegimen type:
Pumas.DosageRegimen — Type
DosageRegimen(
amt::Numeric;
time::Numeric = 0,
cmt::Union{Numeric,Symbol} = 1,
evid::Numeric = 1,
ii::Numeric = zero.(time),
addl::Numeric = 0,
rate::Numeric = zero.(amt)./oneunit.(time),
duration::Numeric = zero(amt)./oneunit.(time),
ss::Numeric = 0,
route::NCA.Route = NCA.NullRoute,
)
Lazy representation of a series of events.
Examples
julia> DosageRegimen(100; ii = 24, addl = 6)
DosageRegimen
Row │ time cmt amt evid ii addl rate duration ss route
│ Float64 Int64 Float64 Int8 Float64 Int64 Float64 Float64 Int8 NCA.Route
─────┼───────────────────────────────────────────────────────────────────────────────────
1 │ 0.0 1 100.0 1 24.0 6 0.0 0.0 0 NullRoute
julia> DosageRegimen(50; ii = 12, addl = 13)
DosageRegimen
Row │ time cmt amt evid ii addl rate duration ss route
│ Float64 Int64 Float64 Int8 Float64 Int64 Float64 Float64 Int8 NCA.Route
─────┼───────────────────────────────────────────────────────────────────────────────────
1 │ 0.0 1 50.0 1 12.0 13 0.0 0.0 0 NullRoute
julia> DosageRegimen(200; ii = 24, addl = 2)
DosageRegimen
Row │ time cmt amt evid ii addl rate duration ss route
│ Float64 Int64 Float64 Int8 Float64 Int64 Float64 Float64 Int8 NCA.Route
─────┼───────────────────────────────────────────────────────────────────────────────────
1 │ 0.0 1 200.0 1 24.0 2 0.0 0.0 0 NullRoute
julia> DosageRegimen(200; ii = 24, addl = 2, route = NCA.IVBolus)
DosageRegimen
Row │ time cmt amt evid ii addl rate duration ss route
│ Float64 Int64 Float64 Int8 Float64 Int64 Float64 Float64 Int8 NCA.Route
─────┼───────────────────────────────────────────────────────────────────────────────────
1 │ 0.0 1 200.0 1 24.0 2 0.0 0.0 0 IVBolus
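Each row above is a lazy description: one dose at time plus addl further doses spaced by ii. As a rough illustration in plain Julia, the implied dose times can be listed by hand. The helper dose_times is hypothetical and not part of Pumas; it only mirrors the definitions of time, ii, and addl given earlier.

```julia
# Hypothetical helper (not a Pumas function): list the dose times implied by
# one regimen row with start time `time`, interval `ii`, and `addl` extra doses.
dose_times(time, ii, addl) = [time + k * ii for k in 0:addl]

# Values from DosageRegimen(200; ii = 24, addl = 2)
times = dose_times(0.0, 24.0, 2)
println(times)          # [0.0, 24.0, 48.0]
println(length(times))  # 3: the initial dose plus two additional doses
```

Keeping the regimen lazy means a long schedule stays a single row; the expansion above is only for reasoning about what the row represents.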
You can also create a new DosageRegimen from various existing DosageRegimens:
evs = DosageRegimen(
regimen1::DosageRegimen,
regimen2::DosageRegimen;
offset = nothing
)
offset specifies whether regimen2 should start after an offset following the end of the last event in regimen1.
Examples
julia> e1 = DosageRegimen(100; ii = 24, addl = 6)
DosageRegimen
Row │ time cmt amt evid ii addl rate duration ss route
│ Float64 Int64 Float64 Int8 Float64 Int64 Float64 Float64 Int8 NCA.Route
─────┼───────────────────────────────────────────────────────────────────────────────────
1 │ 0.0 1 100.0 1 24.0 6 0.0 0.0 0 NullRoute
julia> e2 = DosageRegimen(50; ii = 12, addl = 13)
DosageRegimen
Row │ time cmt amt evid ii addl rate duration ss route
│ Float64 Int64 Float64 Int8 Float64 Int64 Float64 Float64 Int8 NCA.Route
─────┼───────────────────────────────────────────────────────────────────────────────────
1 │ 0.0 1 50.0 1 12.0 13 0.0 0.0 0 NullRoute
julia> evs = DosageRegimen(e1, e2)
DosageRegimen
Row │ time cmt amt evid ii addl rate duration ss route
│ Float64 Int64 Float64 Int8 Float64 Int64 Float64 Float64 Int8 NCA.Route
─────┼───────────────────────────────────────────────────────────────────────────────────
1 │ 0.0 1 100.0 1 24.0 6 0.0 0.0 0 NullRoute
2 │ 0.0 1 50.0 1 12.0 13 0.0 0.0 0 NullRoute
julia> DosageRegimen(e1, e2; offset = 10)
DosageRegimen
Row │ time cmt amt evid ii addl rate duration ss route
│ Float64 Int64 Float64 Int8 Float64 Int64 Float64 Float64 Int8 NCA.Route
─────┼───────────────────────────────────────────────────────────────────────────────────
1 │ 0.0 1 100.0 1 24.0 6 0.0 0.0 0 NullRoute
2 │ 178.0 1 50.0 1 12.0 13 0.0 0.0 0 NullRoute
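The start time of 178.0 in the second row can be reproduced by hand. The sketch below is plain Julia arithmetic, not a Pumas call, and it assumes that the "end of the last event" in e1 means one inter-dose interval after its final dose; that reading is inferred from the output shown, not from a stated rule.

```julia
# Values taken from e1 = DosageRegimen(100; ii = 24, addl = 6)
time1, ii1, addl1 = 0.0, 24.0, 6
offset = 10.0

last_dose_e1 = time1 + addl1 * ii1   # 144.0, time of the final dose in e1
end_of_e1    = last_dose_e1 + ii1    # 168.0, assumed end of the last dosing interval
start_of_e2  = end_of_e1 + offset    # 178.0, the time shown for e2 in row 2

println(start_of_e2)
```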
Steady State Dosing
In addition to applying the specified dose, steady state dosing resets the compartments to a steady state (if ss = 1) or adds the steady state values to the current compartment values (if ss = 2).
Steady state dosing can only be used with models for which a non-trivial steady state exists. For instance, if the @dynamics block includes an equation for the AUC of a compartment, then the system can be in steady state only if the corresponding compartment is zero.
There are three classes of steady state doses:
Class | Amount (amt) | Rate (rate) |
---|---|---|
Constant infusion | 0 | >0 |
Multiple infusions | >0 | >0 |
Multiple bolus doses | >0 | 0 |
The steady state of the system can be viewed as the result of having applied the steady state dose from the infinite past, either as a constant infusion or as a sequence of doses, depending on the type of the dose. These so-called implied past doses do not affect the values of the system prior to the time point (time) of the dose record.
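To make the idea of implied past doses concrete, here is a small sketch in plain Julia for the "multiple bolus doses" class. It assumes a one-compartment model with first-order elimination, which is an illustrative choice and not something the text above prescribes; the rate constant k and the helper amount_after_dose are hypothetical. It is not a description of how Pumas computes steady states internally.

```julia
# Illustrative assumption: one compartment, first-order elimination.
amt = 100.0   # dose amount
ii  = 24.0    # inter-dose interval
k   = 0.1     # hypothetical elimination rate constant

# Amount just after a dose when the same dose was also given n_past times before,
# each earlier dose having decayed by exp(-k * ii) per interval.
amount_after_dose(n_past) = sum(amt * exp(-k * ii * n) for n in 0:n_past)

# Limit of the geometric series for infinitely many past doses.
steady_state = amt / (1 - exp(-k * ii))

println(amount_after_dose(0))    # a single dose, no history
println(amount_after_dose(50))   # many implied past doses
println(steady_state)            # closed-form limit
```

With enough past doses the partial sum becomes indistinguishable from the closed-form limit, which is the sense in which a steady state dose stands in for a dose applied from the infinite past.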
Constant Infusion
The time point (time) of a steady state dose record with constant infusion denotes the time point at which the infusion from the infinite past ends. Therefore, absorption lag (lags) does not apply to constant infusions and the dose event cannot be applied multiple times (ii and addl must be zero). Since the dose amount (amt) is zero, the other dose control parameters (bioav, rate, and duration) do not apply to steady state events with constant infusion either.
Multiple Doses
For steady state events with multiple infusions or bolus doses, the time point (time) of the dose record denotes the time point at which the infusion is started or the bolus is injected. Dose control parameters (lags, bioav, rate, and duration) at the time point (time) of the dose record apply to all doses, including the implied doses prior to the dose event.
The Subject Constructor
The dosage regimen is only a subset of what we need to fully specify a subject. As mentioned above, we use the Subject type to represent individuals in Pumas. As seen below, they can either be constructed from primitives (time, events, covariates, and observations) or read from a tabular format that will be mapped to these concepts. They can be constructed programmatically using the Subject constructor:
Pumas.Subject — Type
Subject
The data corresponding to a single subject:
Fields:
- id: identifier
- observations: a named tuple of the dependent variables
- covariates: a named tuple containing the covariates, or nothing.
- events: a DosageRegimen, or nothing.
- time: a vector of time stamps for the observations
When there are time varying covariates, each covariate is interpolated with a common covariate time support. The interpolated values are then used to build a multi-valued interpolant for the complete time support.
From the multi-valued interpolant, certain discontinuities are flagged in order to use that information for the differential equation solvers and to correctly apply the analytical solution per region as applicable.
Constructor
Subject(;id = "1",
observations = NamedTuple(),
events = nothing,
time = observations isa AbstractDataFrame ? observations.time : nothing,
covariates::Union{Nothing, NamedTuple} = nothing,
covariates_time = observations isa AbstractDataFrame ? observations.time : nothing,
covariates_direction = :left)
Subject may be constructed from an <:AbstractDataFrame with the appropriate schema or by providing the arguments directly through separate DataFrames / structures.
Examples:
julia> Subject()
Subject
ID: 1
julia> Subject(
id = 20,
events = DosageRegimen(200, ii = 24, addl = 2),
covariates = (WT = 14.2, HT = 5.2),
)
Subject
ID: 20
Events: 3
Covariates: WT, HT
julia> Subject(covariates = (WT = [14.2, 14.7], HT = fill(5.2, 2)), covariates_time = [0, 3])
Subject
ID: 1
Covariates: WT, HT
or using read_pumas from tabular data:
Pumas.read_pumas — Function
read_pumas(filepath::AbstractString; missingstring = ["", ".", "NA"], kwargs...)
read_pumas(df::AbstractDataFrame; kwargs...)
Import NMTRAN-formatted data. You can either pass a CSV file path or a data frame as the first and only positional argument.
Keyword Arguments
- observations (default: [:dv]): a vector of column names of dependent variables.
- covariates (default: Symbol[]): a vector of column names of covariates.
- id::Symbol (default: :id): the name of the column with the IDs of the individuals. Each individual should have a unique integer or string.
- time::Symbol (default: :time): the name of the column with the time corresponding to the row. Time should be unique per ID, i.e., there should be no duplicate time values for a given subject.
- evid::Union{Symbol,Nothing} (default: nothing): the name of the column with event IDs, or nothing. Possible event IDs are:
  - 0: observation
  - 1: dose event
  - 2: other type event
  - 3: reset event (the amounts in each compartment are reset to zero and the on/off status of each compartment is reset to its initial status)
  - 4: reset and dose event
  The event ID defaults to 0 if the dose amount is 0 or missing, and 1 otherwise.
- amt::Symbol (default: :amt): the name of the column of dose amounts. If the event ID is specified and non-zero, the dose amount should be non-zero. The default dose amount is 0.
- addl::Symbol (default: :addl): the name of the column that indicates the number of repeated dose events. The number of additional doses defaults to 0.
- ii (default: :ii): the name of the column of inter-dose intervals. When the number of additional doses is specified and non-zero, this is the time to the next dose. For steady-state events with multiple infusions or bolus doses, this is the time between implied doses. The default inter-dose interval is 0. It is required to be non-zero for steady-state events with multiple infusions or bolus doses, and it is required to be zero for steady-state events with constant infusion.
- cmt::Symbol (default: :cmt): the name of the column with the compartment to be dosed. Compartments can be specified by integers, strings and Symbols. The default compartment is 1.
- rate::Symbol (default: :rate): the name of the column with the rate of administration. A rate=-2 allows the rate to be determined by Dose Control Parameters (DCP). Defaults to 0. Possible values are:
  - 0: instantaneous bolus dose
  - >0: infusion dose administered at a constant rate for a duration equal to amt/rate
  - -2: infusion rate or duration specified by the dose control parameters (see @dosecontrol)
- ss::Symbol (default: :ss): the name of the column that indicates whether a dose is a steady-state dose. A steady-state dose is defined as the result of having applied the same dose from the infinite past. Possible values of the steady-state indicator are:
  - 0: dose is not a steady state dose.
  - 1: dose is a steady state dose, and the compartment amounts are to be reset to the steady-state amounts resulting from the given dose. Compartment amounts resulting from prior dose event records are "zeroed out", and infusions in progress or pending additional doses are cancelled.
  - 2: dose is a steady state dose and the compartment amounts are to be set to the sum of the steady-state amounts resulting from the given dose plus the amounts at the event time were the steady-state dose not given.
  The default value is 0.
- route::Symbol: the name of the column that specifies the route of administration.
- mdv::Union{Symbol,Nothing} (default: nothing): the name of the column that indicates if observations are missing, or nothing.
- event_data::Bool (default: true): toggles assertions applicable to event data. More specifically, checks if the following columns are present in the DataFrame, either as the default values or as user-defined values: :id, :time, and :amt. If no :evid column is present, then a warning will be thrown, and :evid is set to 1 when :amt values are >0 or not missing, or :evid is set to 0 when :amt values are missing and observations are not missing. Otherwise, read_pumas will throw an error.
- covariates_direction::Symbol (default: :left): the direction of covariate interpolation. Either :left (Last Observation Carried Forward, LOCF) (default), or :right (Next Observation Carried Backward, NOCB). Note that for models with occasion variables it is important to use :left for the correct behavior of the interpolation.
- check::Bool (default: event_data): toggles NMTRAN compliance check of the input data. More specifically, checks if the following columns are present in the DataFrame, either as the default values or as user-defined values: :id, :time, :amt, :cmt, :evid, :addl, :ii, :ss, and :route. Additional checks are:
  - all variables in observations are numeric, i.e. Integer or AbstractFloat.
  - :amt column is numeric, i.e. Integer or AbstractFloat.
  - :cmt column is either a positive Integer, an AbstractString, or a Symbol.
  - :amt values must be missing or 0 when evid = 0; or >=0 otherwise.
  - all variables in observations are missing when evid = 1.
  - :ii column must be present if :ss is specified or present.
  - :ii values must be missing or 0 when evid = 0.
  - :ii column must be >0 if :addl values are >0, and vice-versa.
  - :addl column, if present, must be >=0 when evid = 1.
  - :evid values must be !=0 when :amt values are >0, or :addl and :ii values are >0.
- adjust_evid34::Bool (default: true): toggles adjustment of time vector for reset events (evid = 3 and evid = 4). If true (the default) then the time of the previous event is added to the time on record to ensure that the time vector is monotonically increasing.
If a column does not exist, its values are imputed to be the default values.
You can also create Subjects from an existing Subject:
Pumas.Subject — Method
Subject(subject::Subject; id::AbstractString, events)
Construct a Subject from an existing subject while replacing information according to the keyword arguments. The possible keyword arguments are
- id: sets the subject identifier to the string id. Defaults to the identifier already in subject.
- events: sets the subject events to the input DosageRegimen. Defaults to the events already present in subject.
- covariates and covariates_time: used as input to add covariate information.
Examples:
julia> s1 = Subject(; id = "AKJ491", events = DosageRegimen(1.0; time = 1.0))
Subject
ID: AKJ491
Events: 1
julia> Subject(s1; id = "AKJ492")
Subject
ID: AKJ492
Events: 1
The simulated output of a PumasModel as a result of simobs is Pumas.SimulatedObservations. Broadcasting the Subject constructor over this simobs output results in a Population (an alias for a vector of Subjects) that is equivalent to the output of read_pumas. This is a convenient feature that allows one to simulate data and turn it back into a Population that can then be passed into a fit function:
Pumas.Subject — Method
Subject
Constructor
Subject(simsubject::SimulatedObservations)
Roundtrip the result of simobs, i.e. SimulatedObservations, to a Subject/Population.
Example:
sims = simobs(model, pop, params)
To convert sims to a Population, broadcast as below:
Subject.(sims)
Understanding the Subject
The Subject is a central concept when it comes to simulation and fitting in Pumas. To do either of the two we need a model and data. The model is built using the PumasModel or PumasEMModel but carries no information about events, covariate values, or observed values. These all come from the Subject. In Pumas material you will also see Population mentioned, and that is nothing but a collection of Subjects. Specifically, the collection type is a Vector such that if sub1, sub2, and sub3 are Subjects then [sub1, sub2, sub3] is a Population.
As mentioned, a model alone does not carry all the information needed to perform a simulation. In theory, we could divide the information into two categories:
- Observations
- Covariates
Both sets of information are always associated with a specific time point that refers to when the information was collected. In the case of dosing information, the information is typically not collected but taken from the protocol, even if we would ideally always want to know the actual dosing time rather than the protocol time in a fitting context.
Covariates can further be divided into two categories: events and actual covariates. Events are typically dosing records including information such as additional doses, whether to treat the dose as if it has been given from an infinite past (steady-state dosing), and so on. Actual covariates are information about the subject that does not directly affect events and that are not observations of derived variables in the statistical model. This could be demographics, formulation information that does not fit the dosing regimen, occasion indices, and so on. In Pumas, these two sets of covariate information fall into the two sub-categories events and covariates.
Observations
The data used for fitting model parameters and the data that will be simulated in simobs are specified in what we call observations:
Subject(id = "1", time = [...], observations = [...])
If we have data on record we will often use the read_pumas function that is designed to map tabular data to Subjects. This will be explained below. It is also possible to input data directly in the Subject constructor as follows:
Subject(
id = "1",
time = [0.0, 2.0, 4.0, 24.0],
observations = (; dv = [3.0, 1.2, 0.8, 0.001]),
)
This requires the time vector to match up with the vectors inside the NamedTuple passed to observations. If we had two dependent variables we would simply add the second one to the NamedTuple:
Subject(
id = "1",
time = [0.0, 2.0, 4.0, 24.0],
observations = (; dv = [3.0, 1.2, 0.8, 0.001], dv2 = [0.1, 0.6, 0.9, 1.05]),
)
Here each element of each vector in the observations input matches the time vector. Sometimes there will not be a complete overlap between the two sets of sampling time points, and in that case we need to add missing values. Let us say that dv2 is not sampled for the first two hours but is additionally sampled at 48 hours; then it might look like the following:
Subject(
id = "1",
time = [0.0, 2.0, 4.0, 24.0, 48.0],
observations = (;
dv = [3.0, 1.2, 0.8, 0.001, missing],
dv2 = [missing, missing, 0.9, 1.05, 0.2],
),
)
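When the two sampling schedules come from separate sources, the aligned vectors can be built rather than typed by hand. The following is a minimal sketch in plain Julia using the numbers from the example above; the dictionaries dv_samples and dv2_samples and the variable times are hypothetical names introduced for illustration only.

```julia
# Measured values keyed by sampling time (numbers from the example above).
dv_samples  = Dict(0.0 => 3.0, 2.0 => 1.2, 4.0 => 0.8, 24.0 => 0.001)
dv2_samples = Dict(4.0 => 0.9, 24.0 => 1.05, 48.0 => 0.2)

# Common, sorted time vector covering both schedules.
times = sort(unique(vcat(collect(keys(dv_samples)), collect(keys(dv2_samples)))))

# Fill in `missing` wherever a variable was not sampled at a given time.
dv  = [get(dv_samples, t, missing) for t in times]
dv2 = [get(dv2_samples, t, missing) for t in times]

# NamedTuple in the shape expected by the `observations` keyword.
observations = (; dv, dv2)
println(times)
println(observations)
```

The resulting times and observations could then be passed as the time and observations keyword arguments of the Subject constructor shown above.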
Notice that one dependent variable cannot have two observations at the same time point. Sometimes this can occur in data if pre-dose plasma samples are labelled as having the same time value as the dose and the post-dose sample. This is an example in the category mentioned above where the actual sample time is not reflected in the data set, as you cannot realistically take a pre-dose sample, dose, and take a post-dose sample within a nanosecond. If this occurs, you need to shift the time point of the pre-dose sample slightly. Below, we add a pre-dose sample 3 minutes prior to dosing by setting time to -3/60 = -0.05:
Subject(
id = "1",
time = [-0.05, 0.0, 2.0, 4.0, 24.0, 48.0],
observations = (;
dv = [0.0, 3.0, 1.2, 0.8, 0.001, missing],
dv2 = [missing, missing, missing, 0.9, 1.05, 0.2],
),
)
The same can be done in a DataFrame if using read_pumas.
Notice that when setting negative time points in the Subject, the reference time of the model is shifted from t0=0 to the earliest time point among the time of first observation, the time of first dose, and the time of the first observed covariate value.
As mentioned above, if we have data on record we more often use read_pumas. The direct construction of Subjects is often related to simulation exercises where we want to simulate data instead. When simulating data, we can either specify which variables to simulate and store in the simulated subject through the observations input, or we can choose to not input anything, in which case all variables returned from @derived will be treated as relevant. Consider the following @derived block:
@derived begin
cp1 := @. Central / Vc
dv1 ~ @. Normal(cp1, abs(cp1) * σ)
cp2 := @. Metabolite / Vm
dv2 ~ @. Normal(cp2, abs(cp2) * σ)
end
This is a parent-metabolite model with proportional error models. cp1 and cp2 are suppressed from the output through :=, so the relevant variables to consider for data simulation are parent concentration dv1 and metabolite concentration dv2. If we want to simulate data from this model we need to prepare our subjects in one of the following two ways:
Subject(id = "1-both-1", time = [0.0, 1.0, 3.0, 4.0])
or
Subject(id = "1-both-2", time = [0.0, 1.0, 3.0, 4.0], observations = (:dv1, :dv2))
They will both contain the same information in this case, because omitting observations means "simulate everything", and in the second case we include observations and mention every variable. The list of variables to simulate should be given as a Tuple of Symbols (the names of the dependent variables with a : in front).
However, if we wanted to simulate only parent or only metabolite we would have to do something like the following:
Subject(id = "1-parent", time = [0.0, 1.0, 3.0, 4.0], observations = (:dv1,))
Subject(id = "1-metabolite", time = [0.0, 1.0, 3.0, 4.0], observations = (:dv2,))
and then we could create a Subject from the output of simobs called with each subject by itself and we would get two different subjects back: one with only parent concentrations, and one with only metabolite concentrations.
Selective simulation of dependent variables was described above. The selective application of dependent variables works similarly in fitting. If you only have dv1 in your Subject but the model has both dv1 and dv2, the model will simply be fit under the assumption that dv2 does not add anything to the log-likelihood. This means that you can have a Population with only a subset of the dependent variables defined with ~ in your @derived block. A case where that could be useful is exactly the model above, where you might have parent-metabolite relationships to consider, but you want to model the parent in a first stage and a metabolite in a second stage.
PumasNDF
The PumasNDF is a specification for building a Population (an alias for a vector of Subjects) from tabular data. Typically this tabular data comes from a file such as a CSV. The CSV has columns described as follows:
- id: the ID of the individual. Each individual should have a unique integer or string.
- time: the time corresponding to the row. Should be unique per id, i.e. no duplicate time values for a given subject.
- evid: the event id. 1 specifies a normal dose event. 3 means it's a reset event, meaning that the value of the dynamical variable is reset to the amt at the dosing event. If 4, then the dynamical value and time are reset, and then a final dose is given. Defaults to 0 if amt is 0 or missing, and 1 otherwise.
- amt: the amount of the dose. If the evid column exists and is non-zero, this value should be non-zero. Defaults to 0.
- ii: the inter-dose interval. When addl is specified, this is the length of time to the next dose. For steady state events, this is the length of time between successive doses. Defaults to 0, and is required to be non-zero on rows where a steady-state event is specified.
- addl: the number of additional doses of the same amt to give. Defaults to 0.
- rate: the rate of administration. If 0, then the dose is instantaneous. Otherwise, the dose is administered at a constant rate for a duration equal to amt/rate. A rate=-2 allows the rate to be determined by Dose Control Parameters (DCP). Defaults to 0.
- ss: an indicator for whether the dose is a steady state dose. A steady state dose is defined as the result of having applied the dose with the interval ii infinitely many successive times. 0 indicates that the dose is not a steady state dose. 1 indicates that the dose is a steady state dose. 2 indicates that it is a steady state dose that is added to the previous amount. The default is 0.
- cmt: the compartment being dosed. Defaults to 1.
- duration: the duration of administration. If 0, then the dose is instantaneous. Otherwise, the dose is administered at a constant rate equal to amt/duration. Defaults to 0.
- Observation and covariate columns should be given as a time series of values of matching type. Constant covariates should be constant through the full column. Time points without a measurement should be denoted by a dot (.).
If a column does not exist, its values are imputed to be the defaults. Special notes:
- If rate and duration exist, then it is enforced that amt=rate*duration.
- All values and header names are interpreted as lower case.
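The relation between amt, rate, and duration can be checked by hand before a data set is read. A minimal sketch in plain Julia follows; consistent is a hypothetical helper, not a Pumas function, and the numbers are made up for illustration.

```julia
# Hypothetical check (not a Pumas function): does a dose row satisfy
# amt = rate * duration, up to floating point tolerance?
consistent(amt, rate, duration) = isapprox(amt, rate * duration)

amt, rate = 100.0, 25.0
duration = amt / rate                      # 4.0 time units for this infusion

println(consistent(amt, rate, duration))   # true
println(consistent(amt, rate, 5.0))        # false: 25 * 5 is not 100
```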
Given the information above, it is important to understand how to read a dataset using the CSV package. We recommend that all blanks (""), .'s, NA's and any other character elements in your dataset be passed to the missingstring keyword argument when reading the file as below:
using CSV, DataFrames
data = CSV.read(
joinpath("pathtomyfile", "mydata.csv"),
DataFrame;
missingstring = ["", ".", "NA", "BQL"],
)
For more information check out the CSV documentation.
PumasNDF Checks
The read_pumas function performs some general checks on the provided data and informs the user about inconsistencies in the data. It throws an error in case of invalid data, reporting the row number and column name causing the problem so that the user can resolve the issue.
The following checks are applied by the read_pumas function.
Necessary columns in case of event and non-event data
In case of event_data=true (the default), the dataset must contain id, time, amt, and observations columns.
df = DataFrame(
id = [1, 1],
time = [0, 1],
cmt = [1, 2],
dv = [missing, 8],
age = [45, 45],
sex = ["M", "M"],
evid = [1, 0],
)
read_pumas(df; observations = [:dv], covariates = [:age, :sex])
# output
[ Info: The input has keys: [:id, :time, :cmt, :dv, :age, :sex, :evid]
ERROR: PumasDataError: The input must have: `id, time, amt, and observations` when `event_data` is `true`
[...]
In case of event_data=false, only the id column is required.
df = DataFrame(
time = [0, 1],
cmt = [1, 2],
dv = [missing, 8],
age = [45, 45],
sex = ["M", "M"],
evid = [1, 0],
)
read_pumas(df; observations = [:dv], covariates = [:age, :sex], event_data = false)
# output
ERROR: ArgumentError: column name "id" not found in the data frame; existing most similar names are: "dv" and "evid"
[...]
Event data without evid column
In case of event_data=true (the default), a warning is displayed if the provided dataset doesn't have an evid column.
df = DataFrame(
id = [1, 1],
time = [0, 1],
amt = [10, 0],
cmt = [1, 2],
dv = [missing, 8],
age = [45, 45],
sex = ["M", "M"],
)
read_pumas(df; observations = [:dv], covariates = [:age, :sex])
# output
┌ Warning: Your dataset has dose event but it hasn't an evid column. We are adding 1 for dosing rows and 0 for others in evid column. If this is not the case, please add your evid column.
│
└ @ Pumas
Population
Subjects: 1
Covariates: age, sex
Observations: dv
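The rule described by the warning (1 for dosing rows, 0 for others) can be written out explicitly. The sketch below is plain Julia; impute_evid is a hypothetical helper that mirrors the documented default of evid being 0 when amt is 0 or missing and 1 otherwise. It is not the code read_pumas runs.

```julia
# Hypothetical helper (not a Pumas function) mirroring the documented default:
# evid is 0 when amt is 0 or missing, and 1 otherwise.
impute_evid(amt) = [ismissing(a) || a == 0 ? 0 : 1 for a in amt]

println(impute_evid([10, 0]))           # amt column of the data frame above
println(impute_evid([missing, 5, 0]))   # missing amounts count as observations
```

Adding an explicit evid column built this way, after verifying it matches the study design, avoids relying on the imputed values.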
Non-numeric/string entries in an observation column
If there are non-numeric or string entries in an observation column, read_pumas throws an error and reports the row(s) and column(s) having this issue.
df = DataFrame(
id = [1, 1],
time = [0, 1],
amt = [10, 0],
cmt = [1, 2],
dv = [missing, "k@"],
age = [45, 45],
sex = ["M", "M"],
evid = [1, 0],
)
read_pumas(df; observations = [:dv], covariates = [:age, :sex])
# output
ERROR: PumasDataError: [Subject id: [1], row = [2], col = dv] We expect the dv column to be of numeric type.
These are the unique non-numeric values present in the column dv: ("k@",)
[...]
Non-numeric/string entries in the amt column
Similarly, if there are non-numeric or string entries in the amt column, read_pumas throws an error and reports the row(s) and column(s) having this issue.
df = DataFrame(
id = [1, 1],
time = [0, 1],
amt = ["k8", 0],
cmt = [1, 2],
dv = [missing, 8],
age = [45, 45],
sex = ["M", "M"],
evid = [1, 0],
)
read_pumas(df; observations = [:dv], covariates = [:age, :sex])
# output
ERROR: PumasDataError: [Subject id: [1], row = [1], col = amt] We expect the amt column to be of numeric type.
These are the unique non-numeric values present in the column amt: ("k8",)
[...]
cmt must be a positive integer or valid string/symbol for non-zero evid data record
The cmt column should contain positive numbers or string/symbol identifiers of the compartment being dosed.
df = DataFrame(
id = [1, 1],
time = [0, 1],
amt = [10, 0],
cmt = [-1, 2],
dv = [missing, 8],
age = [45, 45],
sex = ["M", "M"],
evid = [1, 0],
)
read_pumas(df; observations = [:dv], covariates = [:age, :sex])
# output
ERROR: PumasDataError: [Subject id: 1, row = 1, col = cmt] column had invalid value: -1. Pumas does not currently support negative compartment values to turn off compartments.
[...]
amt can only be missing or zero when evid is zero
df = DataFrame(
id = [1, 1],
time = [0, 1],
amt = [10, 5],
cmt = [1, 2],
dv = [missing, 8],
age = [45, 45],
sex = ["M", "M"],
evid = [1, 0],
)
read_pumas(df; observations = [:dv], covariates = [:age, :sex])
# output
ERROR: PumasDataError: [Subject id: 1, row = 2, col = evid] amt can only be missing or 0 when evid is 0
[...]
amt can only be positive or zero when evid is 1
df = DataFrame(
id = [1, 1],
time = [0, 1],
amt = [-10, 0],
cmt = [1, 2],
evid = [1, 0],
dv = [10, 8],
age = [45, 45],
sex = ["M", "M"],
)
read_pumas(df; observations = [:dv], covariates = [:age, :sex])
# output
ERROR: PumasDataError: [Subject id: 1, row = 1, col = evid] amt can only be positive or zero when evid is 1
[...]
Observations at the time of dose
Observations should be missing at the time of dose (or when amt > 0).
df = DataFrame(
id = [1, 1],
time = [0, 1],
amt = [10, 0],
cmt = [1, 2],
evid = [1, 0],
dv = [10, 8],
age = [45, 45],
sex = ["M", "M"],
)
read_pumas(df; observations = [:dv], covariates = [:age, :sex])
# output
ERROR: PumasDataError: [Subject id: 1, row = 1, col = dv] an observation is present at the time of dose in column dv. A blank record (`missing`) is required at time of dosing, i.e. when `amt` is positive.
[...]
Steady-state dosing requires ii column
df = DataFrame(
id = [1, 1],
time = [0, 1],
amt = [10, 0],
ss = [1, 0],
cmt = [1, 2],
dv = [missing, 8],
age = [45, 45],
sex = ["M", "M"],
evid = [1, 0],
)
read_pumas(df; observations = [:dv], covariates = [:age, :sex])
# output
ERROR: PumasDataError: your dataset does not have ii which is a required column for steady state dosing.
[...]
Steady-state dosing with multiple infusions or bolus doses requires positive ii
For a steady-state event with multiple infusions or bolus doses (ss = 1 or ss = 2, and amt > 0), the value of the dose interval column ii must be non-zero. If the rate column is not provided, it is assumed to be zero.
df = DataFrame(
id = [1, 1],
time = [0, 1],
amt = [10, 0],
ss = [1, 0],
ii = [0, 0],
cmt = [1, 2],
dv = [missing, 8],
age = [45, 45],
sex = ["M", "M"],
evid = [1, 0],
)
read_pumas(df; observations = [:dv], covariates = [:age, :sex])
# output
ERROR: PumasDataError: [Subject id: 1, row = 1, col = ii] for steady-state dosing the value of the interval column ii must be non-zero but was 0
[...]
Steady-state dosing with constant infusion requires zero ii
For a steady-state event with constant infusion (ss = 1 or ss = 2, amt = 0, and rate > 0), the value of the dose interval column ii must be zero.
df = DataFrame(
id = [1, 1],
time = [0, 1],
amt = [0, 0],
ss = [1, 0],
rate = [2, 0],
ii = [1, 0],
cmt = [1, 2],
dv = [missing, 8],
age = [45, 45],
sex = ["M", "M"],
evid = [1, 0],
)
read_pumas(df; observations = [:dv], covariates = [:age, :sex])
# output
ERROR: PumasDataError: [Subject id: 1, row = 1, col = ii] for steady-state infusion the value of the interval column ii must be zero but was 1
[...]
Steady-state dosing with constant infusion requires zero addl
For a steady-state event with constant infusion (ss = 1 or ss = 2, amt = 0, and rate > 0), the value of the additional dose column addl must be zero.
df = DataFrame(
id = [1, 1],
time = [0, 1],
amt = [0, 0],
ss = [1, 0],
rate = [2, 0],
ii = [0, 0],
addl = [5, 0],
cmt = [1, 2],
dv = [missing, 8],
age = [45, 45],
sex = ["M", "M"],
evid = [1, 0],
)
read_pumas(df; observations = [:dv], covariates = [:age, :sex])
# output
ERROR: PumasDataError: [Subject id: 1, row = 1, col = addl] for steady-state infusion the value of the additional dose column addl must be zero but was 5
[...]
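The three steady-state rules above can be summarized in a small pre-check. This is a minimal sketch in plain Julia and not part of Pumas: the function name check_steady_state and its message strings are hypothetical, and it covers only the rules stated in this section (a non-zero ii when amt > 0, and zero ii and addl for a constant infusion).

```julia
# Hypothetical helper, not a Pumas function: checks one dosing record
# against the steady-state rules described above and returns the violations.
function check_steady_state(; ss, amt, rate = 0, ii = 0, addl = 0)
    problems = String[]
    # The rules only apply to steady-state records (ss = 1 or ss = 2).
    ss in (1, 2) || return problems
    if amt > 0
        # Multiple infusions or bolus doses: the dose interval must be non-zero.
        ii == 0 && push!(problems, "ii must be non-zero for steady-state dosing")
    elseif amt == 0 && rate > 0
        # Constant infusion: there is no dose interval and no additional doses.
        ii == 0 || push!(problems, "ii must be zero for steady-state infusion")
        addl == 0 || push!(problems, "addl must be zero for steady-state infusion")
    end
    return problems
end

# The three failing records from the examples above, followed by a valid one.
println(check_steady_state(ss = 1, amt = 10, ii = 0))
println(check_steady_state(ss = 1, amt = 0, rate = 2, ii = 1))
println(check_steady_state(ss = 1, amt = 0, rate = 2, addl = 5))
println(check_steady_state(ss = 1, amt = 10, ii = 12))
```

Running such a check before calling read_pumas can make it easier to find every offending record at once, since read_pumas stops at the first error it encounters.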
Non-zero addl requires ii column
df = DataFrame(
id = [1, 1],
time = [0, 1],
amt = [10, 0],
addl = [5, 0],
cmt = [1, 2],
evid = [1, 0],
dv = [missing, 8],
age = [45, 45],
sex = ["M", "M"],
)
read_pumas(df; observations = [:dv], covariates = [:age, :sex])
# output
ERROR: PumasDataError: The dataset has an `addl` column specified with non-zero entries but does not have an `ii` column specified. It is not possible to specify the additional doses without providing the interval of time between them. Please specify which column the `ii` information is present in in your `read_pumas` call.
[...]
Non-zero addl requires positive ii
df = DataFrame(
id = [1, 1],
time = [0, 1],
amt = [10, 0],
addl = [5, 0],
ii = [0, 0],
cmt = ["Depot", "Central"],
evid = [1, 0],
dv = [missing, 8],
age = [45, 45],
sex = ["M", "M"],
)
read_pumas(df; observations = [:dv], covariates = [:age, :sex])
# output
ERROR: PumasDataError: [Subject id: 1, row = 1, col = ii] ii must be positive for addl > 0
[...]
Positive ii requires non-zero addl
df = DataFrame(
id = [1, 1],
time = [0, 1],
amt = [10, 0],
addl = [0, 0],
ii = [12, 0],
cmt = ["Depot", "Central"],
evid = [1, 0],
dv = [missing, 8],
age = [45, 45],
sex = ["M", "M"],
)
read_pumas(df; observations = [:dv], covariates = [:age, :sex])
# output
ERROR: PumasDataError: [Subject id: 1, row = 1, col = addl] addl must be positive for ii > 0
[...]
ii can only be missing or zero when evid is zero
df = DataFrame(
id = [1, 1],
time = [0, 1],
amt = [10, 0],
addl = [5, 2],
ii = [12, 4],
cmt = ["Depot", "Central"],
evid = [1, 0],
dv = [missing, 8],
age = [45, 45],
sex = ["M", "M"],
)
read_pumas(df; observations = [:dv], covariates = [:age, :sex])
# output
ERROR: PumasDataError: [Subject id: 1, row = 2, col = evid] ii can only be missing or zero when evid is 0
[...]
addl can only be positive or zero when evid is 1
df = DataFrame(
id = [1, 1],
time = [0, 1],
amt = [10, 0],
addl = [-10, 0],
ii = [12, 0],
cmt = ["Depot", "Central"],
evid = [1, 0],
dv = [missing, 8],
age = [45, 45],
sex = ["M", "M"],
)
read_pumas(df; observations = [:dv], covariates = [:age, :sex])
# output
ERROR: PumasDataError: [Subject id: 1, row = 1, col = evid] addl can only be positive or zero when evid is 1
[...]
evid must be non-zero when amt is positive or addl and ii are positive
When amt is positive, evid must be non-zero, as evid = 0 indicates an observation record.
df = DataFrame(
id = [1, 1],
time = [0, 1],
amt = [10, 0],
addl = [5, 0],
ii = [12, 0],
cmt = ["Depot", "Central"],
evid = [0, 0],
dv = [missing, 8],
age = [45, 45],
sex = ["M", "M"],
)
read_pumas(df; observations = [:dv], covariates = [:age, :sex])
# output
ERROR: PumasDataError: [Subject id: 1, row = 1, col = evid] amt can only be missing or 0 when evid is 0
[...]
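Several of the record-level rules for amt, addl, ii, and evid shown above can be collected into one pre-check. The following is a minimal sketch in plain Julia, not part of Pumas: check_record is a hypothetical name, the records are ordinary NamedTuples rather than a DataFrame, and only a subset of the rules in this section is covered.

```julia
# Hypothetical helper, not a Pumas function: returns the rule violations
# found in a single record with fields evid, amt, addl and ii.
function check_record(row)
    problems = String[]
    # Treat `missing` entries as zero for the purpose of these checks.
    amt = coalesce(row.amt, 0)
    addl = coalesce(row.addl, 0)
    ii = coalesce(row.ii, 0)
    if row.evid == 0
        # Observation records must not carry dosing information.
        amt == 0 || push!(problems, "amt can only be missing or zero when evid is 0")
        ii == 0 || push!(problems, "ii can only be missing or zero when evid is 0")
    elseif row.evid == 1
        # Dose records must not have negative amounts or additional-dose counts.
        amt >= 0 || push!(problems, "amt can only be positive or zero when evid is 1")
        addl >= 0 || push!(problems, "addl can only be positive or zero when evid is 1")
    end
    # Additional doses need a positive interval between them.
    if addl > 0 && ii <= 0
        push!(problems, "ii must be positive for addl > 0")
    end
    return problems
end

records = [
    (evid = 1, amt = 10, addl = 5, ii = 12),                 # valid dose record
    (evid = 0, amt = missing, addl = missing, ii = missing), # valid observation record
    (evid = 0, amt = 10, addl = 5, ii = 12),                 # dose data on an observation record
    (evid = 1, amt = 10, addl = 5, ii = 0),                  # additional doses without an interval
]

for (i, row) in enumerate(records)
    println("row ", i, ": ", check_record(row))
end
```

Unlike read_pumas, which raises a PumasDataError on the first problem, this sketch returns every violation for a record, which can help when cleaning a dataset that breaks several rules at once.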