Data Format and parsing for NCA
Once source data is read into Pumas-NCA, the next step is to check that the data is correct for NCA analysis. This check requires that the data is presented to Pumas-NCA in a specific format, the Pumas-NCA data format (PumasNCADF), which is discussed next.
Pumas-NCA data format - PumasNCADF
PumasNCADF is a standardized format for tabular source data that is required by Pumas-NCA. The full set of columns is listed below, although not all of them are required for a given analysis.
- id: The numeric or string id of the subject.
- time: The time at which the observations were measured.
- observations: The observation (e.g. concentration) time series measurements. Values must be numbers or missing.
- amt: The amount of a dose. Must be the dosing amount at each dosing time and otherwise missing.
- route: The route of administration. Possible choices are iv for intravenous, ev for extravascular, and inf for infusion. These can be specified in lower, upper or mixed case, e.g. iv, IV or Ev are accepted.
- duration: The infusion duration. Should be the duration value or missing.
- blq: Below the lower Limit of Quantification (BLQ). Used to flag an observation as BLQ. The column takes the value 1 for a BLQ observation and 0 otherwise. Handling Missing and BLQ Data is described in a separate section.
- ii: The interdose interval, equivalent to tau. Used to specify the interval length for steady-state dosing. Defaults to the :ii column. If specified, and ss is true, then the analysis returns steady-state parameters, e.g. cminss, cavgss, cmaxss, by computing the accumulationindex.
- ss: The steady-state flag. Used to specify whether a dose is a steady-state dose; a steady-state dose takes the value 1, and 0 otherwise. Defaults to the :ss column. If ss is set to 1 for a subject, ii should be greater than 0.
For urine analysis, the format has the following columns:
- id: The numeric or string id of the subject.
- observations: The observation (e.g. urine concentration) time series measurements.
- volume: The collected urine volume.
- start_time: The beginning of the urine collection time.
- end_time: The end of the urine collection time.
- amt: The amount of a dose. Must be the dosing amount at each dosing time, and otherwise missing.
- route: The route of administration. Possible choices are iv for intravenous, ev for extravascular, and inf for infusion. These can be specified in lower, upper or mixed case, e.g. iv, IV or Ev are accepted.
- duration: The infusion duration. Should be the duration value or missing.
- blq: Below the lower Limit of Quantification (BLQ). Used to flag an observation as BLQ. The column takes the value 1 for a BLQ observation and 0 otherwise. Handling BLQ Data is described in a separate section.
- ii: The interdose interval. Used to specify the interval length for steady-state dosing. Defaults to the :ii column.
- ss: The steady-state flag. Used to specify whether a dose is a steady-state dose; a steady-state dose takes the value 1, and 0 otherwise. Defaults to the :ss column. If ss is set to 1 for a subject, ii should be greater than 0.
Any additional columns in the data may be chosen for grouping the output, as discussed under the group keyword argument below.
read_nca
The parsing function for the PumasNCADF is read_nca, which has the following signature:
read_nca(df;
    id = :id,
    time = :time,
    observations = :conc,
    start_time = :start_time,
    end_time = :end_time,
    volume = :volume,
    amt = :amt,
    route = :route,
    duration = :duration,
    blq = :blq,
    ii = :ii,
    ss = :ss,
    group = nothing,
    concu = true,
    timeu = true,
    amtu = true,
    volumeu = true,
    verbose = true,
    lambdazidxs = nothing,
    lambdazslopetimes = nothing,
    kwargs...)
These arguments are:
- df: The required positional argument. This is either a string giving the path to a CSV file, or a DataFrame of tabular data for use in the NCA.
If the column names differ from the default names discussed above, the id, time, observations, start_time, end_time, volume, amt, route, duration, blq, ii and ss keyword arguments can be used to pass in the name of the corresponding column.
In addition to the keyword arguments for column names, the following are supported as well:
- group: The columns to group the data by; splits the subjects based on the group information associated with them. Defaults to no grouping.
- llq: The Lower Limit of Quantification (LLQ). Defaults to nothing.
- concblq: The scheme for handling BLQ values. Defaults to the dictionary Dict(:first=>:keep, :middle=>:drop, :last=>:keep); further explanation is available in the Handling BLQ Data section.
- concu: The units for observations (e.g. concentration). Defaults to no units.
- amtu: The units for dosing amount. Defaults to no units.
- timeu: The units for time. Defaults to no units.
- volumeu: The units for volume. Defaults to no units.
- verbose: When true, warnings are thrown when the data does not match PumasNCADF. Defaults to true.
- lambdazidxs: The time points to use in the lambdaz calculation, given as the indices of the time points for each subject. It therefore needs an array of arrays, with one inner array per subject. To skip a subject, set the corresponding element to nothing.
- lambdazslopetimes: Similar to lambdazidxs, but takes the actual time points instead of their indices.
Examples
The examples below provide various patterns of using read_nca. In addition to showcasing correct usage, we also showcase the expected errors when the function is used incorrectly. This can serve as a quick reference in the event a user faces an error.
Standard dataframe with no errors
julia> df1 = DataFrame(id = [1,1,1,1,1,2,2,2,2,2], time = [0,1,2,3,4,0,1,2,3,4], amt=[10,0,0,0,0,10,0,0,0,0], conc=[missing,8,6,4,2,missing,8,6,4,2], route = ["iv","iv","iv","iv","iv","iv","iv","iv","iv","iv"])
10×5 DataFrame
 Row │ id     time   amt    conc     route
     │ Int64  Int64  Int64  Int64?   String
─────┼──────────────────────────────────────
   1 │     1      0     10  missing  iv
   2 │     1      1      0        8  iv
   3 │     1      2      0        6  iv
   4 │     1      3      0        4  iv
   5 │     1      4      0        2  iv
   6 │     2      0     10  missing  iv
   7 │     2      1      0        8  iv
   8 │     2      2      0        6  iv
   9 │     2      3      0        4  iv
  10 │     2      4      0        2  iv
julia> df1_r = read_nca(df1)
NCAPopulation (2 subjects):
  Number of missing observations: 2
  Number of blq observations: 0
We can use the other keyword arguments for more control over how read_nca creates the population. Let's pass units from Unitful.jl for concentration and time with concu and timeu, and also specify the slope times for lambdaz with lambdazslopetimes. The Unitful package must be installed first (run `import Pkg; Pkg.add("Unitful")`).
julia> using Unitful
julia> time_unit = u"hr"
julia> concentration_unit = u"mg/L"
julia> custom_slopetimes = [[3,4], [2,3]]
2-element Vector{Vector{Int64}}:
 [3, 4]
 [2, 3]
julia> df1_units = read_nca(df1, concu = concentration_unit, timeu = time_unit, lambdazslopetimes = custom_slopetimes)
One can also pass the indices of the time points instead of the actual time values, using the lambdazidxs keyword argument.
julia> custom_slopetimes_idxs = [[4,5], [3,4]]
2-element Vector{Vector{Int64}}:
 [4, 5]
 [3, 4]
julia> df1_idxs = read_nca(df1, concu = concentration_unit, timeu = time_unit, lambdazidxs = custom_slopetimes_idxs)
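The relationship between slope times and slope indices can be sketched in plain Julia. The helper name slopetimes_to_idxs is hypothetical and not part of Pumas-NCA, and the sketch assumes that indices count positions in the subject's full time vector, which is how the two examples above pair up (times 3 and 4 with positions 4 and 5).

```julia
# Observation times for one subject, laid out as in df1.
times = [0, 1, 2, 3, 4]

# Hypothetical helper (not a Pumas-NCA function): positions in `times`
# whose value is one of the requested slope times.
slopetimes_to_idxs(times, slopetimes) = findall(in(slopetimes), times)

idxs_subject1 = slopetimes_to_idxs(times, [3, 4])  # positions of times 3 and 4
idxs_subject2 = slopetimes_to_idxs(times, [2, 3])  # positions of times 2 and 3
```

A conversion like this can help keep the two keyword arguments consistent when a slope selection is first written down in terms of times.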
Missing required column route
julia> df2 = DataFrame(id = [1,1,1,1,1,2,2,2,2,2], time = [0,1,2,3,4,0,1,2,3,4], amt=[10,0,0,0,0,10,0,0,0,0], conc=[missing,8,6,4,2,missing,8,6,4,2])
10×4 DataFrame
 Row │ id     time   amt    conc
     │ Int64  Int64  Int64  Int64?
─────┼──────────────────────────────
   1 │     1      0     10  missing
   2 │     1      1      0        8
   3 │     1      2      0        6
   4 │     1      3      0        4
   5 │     1      4      0        2
   6 │     2      0     10  missing
   7 │     2      1      0        8
   8 │     2      2      0        6
   9 │     2      3      0        4
  10 │     2      4      0        2
julia> df2_r = read_nca(df2, observations = :conc)
┌ Warning: No dosage information has passed. If the dataset has dosage information, you can pass the column names by `amt=:amt, route=:route`.
└ @ NCA /builds/PumasAI/PumasDocs-jl/.julia/packages/NCA/9RagW/src/data_parsing.jl:78
┌ Warning: Dosage information requires the presence of both amt & route information. Looks like you only entered the amt and not the route. If your dataset does not have route, please add a column that specifies the route of administration and then pass both columns as `amt=:amt, route=:route.`
└ @ NCA /builds/PumasAI/PumasDocs-jl/.julia/packages/NCA/9RagW/src/data_parsing.jl:80
NCAPopulation (2 subjects):
  Number of missing observations: 2
  Number of blq observations: 0
The warning message above is verbose: it explains the consequence of not passing in the route column, and how to pass it in if it is missing from the source data.
amt can be missing at time of observations
julia> df3 = DataFrame(id = [1,1,1,1,1,2,2,2,2,2], time = [0,1,2,3,4,0,1,2,3,4], amt=[10,missing,missing,missing,missing,10,missing,missing,missing,missing], conc=[missing,8,6,4,2,missing,8,6,4,2], route = ["iv","iv","iv","iv","iv","iv","iv","iv","iv","iv"])
10×5 DataFrame
 Row │ id     time   amt      conc     route
     │ Int64  Int64  Int64?   Int64?   String
─────┼────────────────────────────────────────
   1 │     1      0       10  missing  iv
   2 │     1      1  missing        8  iv
   3 │     1      2  missing        6  iv
   4 │     1      3  missing        4  iv
   5 │     1      4  missing        2  iv
   6 │     2      0       10  missing  iv
   7 │     2      1  missing        8  iv
   8 │     2      2  missing        6  iv
   9 │     2      3  missing        4  iv
  10 │     2      4  missing        2  iv
julia> df3_r = read_nca(df3, observations = :conc)
NCAPopulation (2 subjects):
  Number of missing observations: 2
  Number of blq observations: 0
String (non-numeric) observations
julia> df4 = DataFrame(id = [1,1,1,1,1,2,2,2,2,2], time = [0,1,2,3,4,0,1,2,3,4], amt=[10,missing,missing,missing,missing,10,missing,missing,missing,missing], conc=[missing,8,6,4,"<LOQ",missing,8,6,4,2], route = ["iv","iv","iv","iv","iv","iv","iv","iv","iv","iv"])
10×5 DataFrame
 Row │ id     time   amt      conc     route
     │ Int64  Int64  Int64?   Any      String
─────┼────────────────────────────────────────
   1 │     1      0       10  missing  iv
   2 │     1      1  missing  8        iv
   3 │     1      2  missing  6        iv
   4 │     1      3  missing  4        iv
   5 │     1      4  missing  <LOQ     iv
   6 │     2      0       10  missing  iv
   7 │     2      1  missing  8        iv
   8 │     2      2  missing  6        iv
   9 │     2      3  missing  4        iv
  10 │     2      4  missing  2        iv
julia> df4_r = read_nca(df4, observations = :conc)
ERROR: ArgumentError: conc has non-numeric values at index=[5]. We expect the names column to be of numeric type. Please fix your input data before proceeding further.
The observations column can only be numeric. The above error message appears if the column has a string element, in this example <LOQ. One way to circumvent this error is to specify the missingstrings keyword in CSV.read, e.g. CSV.read("./pkdata", DataFrame, missingstrings=["<LOQ"]). In this way, all string elements that match that text are converted to missing.
Setting missingstrings in this way runs the risk of losing track of why each observation is missing. To avoid that, a two-step approach is recommended, such as the one below.
# Flag BLQ rows; isequal treats missing as an ordinary value, so the flag is always 0 or 1
df4.isBLQ = ifelse.(isequal.(df4.conc, "<LOQ"), 1, 0)
# Replace text entries by their numeric value, or by missing if they cannot be parsed
df4.conc = [x isa AbstractString ? something(tryparse(Float64, x), missing) : x for x in df4.conc]
Here the data is pre-processed and an extra column, isBLQ, keeps a tally of the <LOQ values. Then the observations column, conc, is converted to a numeric column.
When the data comes from a CSV file, the same steps are:
- read the data in without missingstrings in the CSV.read function.
- create an isBLQ column as before, by comparing each element of df4.conc with "<LOQ".
- convert the observations from Any to numeric, turning the "<LOQ" entries of df4.conc into missing.
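The BLQ bookkeeping described above can be tried out on plain vectors, without DataFrames or CSV. This is a minimal sketch; the names raw_conc, is_blq and to_numeric are illustrative and not part of Pumas-NCA.

```julia
# Raw observations as they might arrive: numbers, missing values and a "<LOQ" marker.
raw_conc = Any[missing, 8, 6, 4, "<LOQ"]

# Step 1: flag BLQ records (1 for BLQ, 0 otherwise).
# isequal(missing, "<LOQ") is false, so no missing leaks into the flag.
is_blq = ifelse.(isequal.(raw_conc, "<LOQ"), 1, 0)

# Step 2: turn text into numbers; text that cannot be parsed becomes missing.
to_numeric(x) = x isa AbstractString ? something(tryparse(Float64, x), missing) : x
conc = map(to_numeric, raw_conc)
```

After these two steps, conc holds only numbers and missing values, while is_blq records which of the missing values came from a BLQ marker.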
amt column can only be numeric
julia> df5 = DataFrame(id = [1,1,1,1,1,2,2,2,2,2], time = [0,1,2,3,4,0,1,2,3,4], amt=["10",missing,missing,missing,missing,"10",missing,missing,missing,missing], conc=[missing,8,6,4,2,missing,8,6,4,2], route = ["iv","iv","iv","iv","iv","iv","iv","iv","iv","iv"])
10×5 DataFrame
 Row │ id     time   amt      conc     route
     │ Int64  Int64  String?  Int64?   String
─────┼────────────────────────────────────────
   1 │     1      0  10       missing  iv
   2 │     1      1  missing        8  iv
   3 │     1      2  missing        6  iv
   4 │     1      3  missing        4  iv
   5 │     1      4  missing        2  iv
   6 │     2      0  10       missing  iv
   7 │     2      1  missing        8  iv
   8 │     2      2  missing        6  iv
   9 │     2      3  missing        4  iv
  10 │     2      4  missing        2  iv
julia> df5_r = read_nca(df5, observations = :conc)
ERROR: ArgumentError: amt has non-numeric values at index=[1, 6]. We expect the names column to be of numeric type. Please fix your input data before proceeding further.
Concentrations in dosing rows are not ignored
julia> df6 = DataFrame(id = [1,1,1,1,1,2,2,2,2,2], time = [0,1,2,3,4,0,1,2,3,4], amt=[10,missing,missing,missing,missing,10,missing,missing,missing,missing], conc=[10,8,6,4,2,10,8,6,4,2], route = ["iv","iv","iv","iv","iv","iv","iv","iv","iv","iv"])
10×5 DataFrame
 Row │ id     time   amt      conc   route
     │ Int64  Int64  Int64?   Int64  String
─────┼──────────────────────────────────────
   1 │     1      0       10     10  iv
   2 │     1      1  missing      8  iv
   3 │     1      2  missing      6  iv
   4 │     1      3  missing      4  iv
   5 │     1      4  missing      2  iv
   6 │     2      0       10     10  iv
   7 │     2      1  missing      8  iv
   8 │     2      2  missing      6  iv
   9 │     2      3  missing      4  iv
  10 │     2      4  missing      2  iv
julia> df6_r = read_nca(df6, observations = :conc)
NCAPopulation (2 subjects):
  Number of missing observations: 0
  Number of blq observations: 0
The example above emphasizes the fact that concentrations in dose rows are not ignored. We can compare this with df1_r in the example Standard dataframe with no errors, where the concentrations at the dosing times are missing. The computed auc values differ between the two, as can be seen below.
julia> NCA.auc(df1_r)
2×2 DataFrame
 Row │ id     auc
     │ Int64  Float64
─────┼────────────────
   1 │     1  27.9743
   2 │     2  27.9743
julia> NCA.auc(df6_r)
2×2 DataFrame
 Row │ id     auc
     │ Int64  Float64
─────┼────────────────
   1 │     1   27.641
   2 │     2   27.641
route can be in upper, lower or mixed case: ev, iv or inf
julia> df7 = DataFrame(id = [1,1,1,1,1,2,2,2,2,2], time = [0,1,2,3,4,0,1,2,3,4], amt=[10,missing,missing,missing,missing,10,missing,missing,missing,missing], conc=[10,8,6,4,2,10,8,6,4,2], route = ["Iv","Iv","Iv","Iv","Iv","IV","IV","IV","IV","IV"])
10×5 DataFrame
 Row │ id     time   amt      conc   route
     │ Int64  Int64  Int64?   Int64  String
─────┼──────────────────────────────────────
   1 │     1      0       10     10  Iv
   2 │     1      1  missing      8  Iv
   3 │     1      2  missing      6  Iv
   4 │     1      3  missing      4  Iv
   5 │     1      4  missing      2  Iv
   6 │     2      0       10     10  IV
   7 │     2      1  missing      8  IV
   8 │     2      2  missing      6  IV
   9 │     2      3  missing      4  IV
  10 │     2      4  missing      2  IV
julia> df7_r = read_nca(df7, observations = :conc)
NCAPopulation (2 subjects):
  Number of missing observations: 0
  Number of blq observations: 0
julia> df7_r[1].dose
NCADose:
  time: 0
  amt: 10
  duration: 0
  route: IVBolus
  ss: false
While mixed case is accommodated, it is recommended that users provide route information in one consistent case, preferably lower case.
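One way to follow this recommendation is to normalize the route values before calling read_nca. The snippet below is a plain-Julia sketch on a vector of route strings; the variable names are illustrative, and the list of allowed values is taken from the route description earlier on this page.

```julia
# Route values as they might appear in source data, in mixed case.
routes = ["Iv", "IV", "iv", "Ev", "INF"]

# Values accepted for the route column, in the recommended lower case.
allowed = ("iv", "ev", "inf")

# Normalize to lower case and collect anything that is still not recognized.
normalized = lowercase.(routes)
unknown = filter(r -> !(r in allowed), normalized)
```

An empty unknown vector indicates that every route is one of the accepted choices.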
Non-monotonic time is not allowed within an individual
julia> df8 = DataFrame(id = [1,1,1,1,1,2,2,2,2,2], time = [0,1,2,3,3,0,1,2,3,3], amt=[10,missing,missing,missing,missing,10,missing,missing,missing,missing], conc=[10,8,6,4,2,10,8,6,4,2], route = ["iv","iv","iv","iv","iv","iv","iv","iv","iv","iv"])
10×5 DataFrame
 Row │ id     time   amt      conc   route
     │ Int64  Int64  Int64?   Int64  String
─────┼──────────────────────────────────────
   1 │     1      0       10     10  iv
   2 │     1      1  missing      8  iv
   3 │     1      2  missing      6  iv
   4 │     1      3  missing      4  iv
   5 │     1      3  missing      2  iv
   6 │     2      0       10     10  iv
   7 │     2      1  missing      8  iv
   8 │     2      2  missing      6  iv
   9 │     2      3  missing      4  iv
  10 │     2      3  missing      2  iv
julia> df8_r = read_nca(df8, observations = :conc)
[ Info: ID 1 errored
ERROR: ArgumentError: Time must be monotonically increasing. Errored at `time=3` (index 4)
Users have to ensure that time is strictly increasing within a subject, unless a grouping variable is specified. As the example above shows, a repeated time point is enough to trigger the error.
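A pre-check along these lines can locate offending subjects before the data is parsed. This is a minimal sketch in plain Julia; nonmonotonic_ids is a hypothetical helper, not a Pumas-NCA function, and it assumes the time vector has no missing values and no grouping variable is in use.

```julia
# Hypothetical pre-check: return the ids whose times are not strictly increasing.
function nonmonotonic_ids(ids, times)
    bad = eltype(ids)[]
    for id in unique(ids)
        t = times[ids .== id]        # times of this subject, in row order
        if any(diff(t) .<= 0)        # a repeated or decreasing time point
            push!(bad, id)
        end
    end
    return bad
end

ids   = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2]
times = [0, 1, 2, 3, 3, 0, 1, 2, 3, 4]   # subject 1 repeats time 3
flagged = nonmonotonic_ids(ids, times)
```

Here only subject 1 is flagged, since its last two records share the same time.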
Missing Time is not allowed
julia> df9 = DataFrame(id = [1,1,1,1,1,2,2,2,2,2], time = [0,1,2,3,missing,0,1,2,3,missing], amt=[10,missing,missing,missing,missing,10,missing,missing,missing,missing], conc=[10,8,6,4,2,10,8,6,4,2], route = ["iv","iv","iv","iv","iv","iv","iv","iv","iv","iv"])
10×5 DataFrame
 Row │ id     time     amt      conc   route
     │ Int64  Int64?   Int64?   Int64  String
─────┼────────────────────────────────────────
   1 │     1        0       10     10  iv
   2 │     1        1  missing      8  iv
   3 │     1        2  missing      6  iv
   4 │     1        3  missing      4  iv
   5 │     1  missing  missing      2  iv
   6 │     2        0       10     10  iv
   7 │     2        1  missing      8  iv
   8 │     2        2  missing      6  iv
   9 │     2        3  missing      4  iv
  10 │     2  missing  missing      2  iv
julia> df9_r = read_nca(df9, observations = :conc)
[ Info: ID 1 errored
ERROR: ArgumentError: Time may not be missing (missing occured at index 5)
Multiple dose within a subject requires contiguous time
julia> df10 = DataFrame(id = [1,1,1,1,1,1,1,1,1,1], time = [0,1,2,3,4,5,6,7,8,9], amt=[10,0,0,0,0,10,0,0,0,0], conc=[missing,8,6,4,2,missing,8,6,4,2], route = ["iv","iv","iv","iv","iv","iv","iv","iv","iv","iv"], iii = [5,0,0,0,0,5,0,0,0,0])
10×6 DataFrame
 Row │ id     time   amt    conc     route   iii
     │ Int64  Int64  Int64  Int64?   String  Int64
─────┼─────────────────────────────────────────────
   1 │     1      0     10  missing  iv          5
   2 │     1      1      0        8  iv          0
   3 │     1      2      0        6  iv          0
   4 │     1      3      0        4  iv          0
   5 │     1      4      0        2  iv          0
   6 │     1      5     10  missing  iv          5
   7 │     1      6      0        8  iv          0
   8 │     1      7      0        6  iv          0
   9 │     1      8      0        4  iv          0
  10 │     1      9      0        2  iv          0
julia> df10_r = read_nca(df10, observations = :conc)
NCAPopulation (1 subjects):
  Number of missing observations: 1
  Number of blq observations: 0
julia> df10_r[1].dose
2-element Vector{NCADose{Int64, Int64}}:
 NCADose:
  time: 0
  amt: 10
  duration: 0
  route: IVBolus
  ss: false
 NCADose:
  time: 5
  amt: 10
  duration: 0
  route: IVBolus
  ss: false
Multiple dose with ii specified allows computation of steady-state values
julia> df11 = DataFrame(id = [1,1,1,1,1,1,1,1,1,1], time = [0,1,2,3,4,5,6,7,8,9], amt=[10,0,0,0,0,10,0,0,0,0], conc=[missing,8,6,4,2,missing,8,6,4,2], route = ["iv","iv","iv","iv","iv","iv","iv","iv","iv","iv"], iii = [5,0,0,0,0,5,0,0,0,0])
10×6 DataFrame
 Row │ id     time   amt    conc     route   iii
     │ Int64  Int64  Int64  Int64?   String  Int64
─────┼─────────────────────────────────────────────
   1 │     1      0     10  missing  iv          5
   2 │     1      1      0        8  iv          0
   3 │     1      2      0        6  iv          0
   4 │     1      3      0        4  iv          0
   5 │     1      4      0        2  iv          0
   6 │     1      5     10  missing  iv          5
   7 │     1      6      0        8  iv          0
   8 │     1      7      0        6  iv          0
   9 │     1      8      0        4  iv          0
  10 │     1      9      0        2  iv          0
julia> df11_r = read_nca(df11, observations=:conc, ii=:iii)
NCAPopulation (1 subjects):
  Number of missing observations: 1
  Number of blq observations: 0
julia> df11_r[1].dose
2-element Vector{NCADose{Int64, Int64}}:
 NCADose:
  time: 0
  amt: 10
  duration: 0
  route: IVBolus
  ss: false
 NCADose:
  time: 5
  amt: 10
  duration: 0
  route: IVBolus
  ss: false
As you can see, the doses of df10_r and df11_r are identical, even though the latter accepts the ii argument mapped to iii from the dataset. ii specifies tau, the dosing interval. This information allows Pumas-NCA to compute the steady-state parameters cmaxss, cminss, cavgss, accumulationindex and tau. We can confirm this by looking at the differences between the two: df10_r, which has no ii information, cannot compute the accumulationindex, whereas df11_r can.
julia> NCA.accumulationindex(df10_r)
2×2 DataFrame
 Row │ id     accumulationindex
     │ Int64  Missing
─────┼──────────────────────────
   1 │     1            missing
   2 │     1            missing
julia> NCA.accumulationindex(df11_r)
2×2 DataFrame
 Row │ id     accumulationindex
     │ Int64  Float64
─────┼──────────────────────────
   1 │     1            1.06855
   2 │     1            1.06855
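To see why tau is needed, the value above can be reproduced by hand with the textbook definition of the accumulation index, 1 / (1 - exp(-lambdaz * tau)). The sketch below is plain Julia and is not Pumas-NCA's implementation; it assumes that lambdaz is the negative log-linear slope through the last three observations of a dosing interval, an assumption that happens to reproduce the reported value for this dataset.

```julia
# Last three observations of a dosing interval in df11 (assumed slope points).
t    = [2.0, 3.0, 4.0]
conc = [6.0, 4.0, 2.0]

# Least-squares slope of log(conc) against time; lambdaz is its negative.
logc = log.(conc)
tbar = sum(t) / length(t)
ybar = sum(logc) / length(logc)
slope   = sum((t .- tbar) .* (logc .- ybar)) / sum((t .- tbar) .^ 2)
lambdaz = -slope

# Interdose interval taken from the iii column.
tau = 5.0

# Textbook accumulation index for a given lambdaz and tau.
accumulation = 1 / (1 - exp(-lambdaz * tau))
```

With these inputs, accumulation comes out at roughly 1.0685, in line with the table above. Without tau the expression cannot be evaluated, which is consistent with the missing result for df10_r.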
Subjects with dosing record only and no observations will result in missing results
julia> df12 = DataFrame(id = 1, time = 0, amt=10, conc=missing, route="iv")
1×5 DataFrame
 Row │ id     time   amt    conc     route
     │ Int64  Int64  Int64  Missing  String
─────┼──────────────────────────────────────
   1 │     1      0     10  missing  iv
julia> df12_r = read_nca(df12, observations=:conc)
┌ Warning: Subject 1: All concentration data is missing between times 0 and 0
└ @ NCA /builds/PumasAI/PumasDocs-jl/.julia/packages/NCA/9RagW/src/utils.jl:54
NCAPopulation (1 subjects):
  Number of missing observations: 1
  Number of blq observations: 0
julia> NCA.auc(df12_r)
1×2 DataFrame
 Row │ id     auc
     │ Int64  Missing
─────┼────────────────
   1 │     1  missing
Multiple dosing - all provided doses should have corresponding observations vectors
In the example below, for subject 1, the first dose has observations, but the second dose at time 5 has no associated observations, which results in the warning shown.
julia> df13 = DataFrame(id = [1,1,1,1,1,1], time = [0,1,2,3,4,5], amt=[10,0,0,0,0,10], conc=[missing,8,6,4,2,missing], route = ["iv","iv","iv","iv","iv","iv"])
6×5 DataFrame
 Row │ id     time   amt    conc     route
     │ Int64  Int64  Int64  Int64?   String
─────┼──────────────────────────────────────
   1 │     1      0     10  missing  iv
   2 │     1      1      0        8  iv
   3 │     1      2      0        6  iv
   4 │     1      3      0        4  iv
   5 │     1      4      0        2  iv
   6 │     1      5     10  missing  iv
julia> df13_r = read_nca(df13, observations=:conc)
┌ Warning: Subject 1: All concentration data is missing between times 5 and 5
└ @ NCA /builds/PumasAI/PumasDocs-jl/.julia/packages/NCA/9RagW/src/utils.jl:54
NCAPopulation (1 subjects):
  Number of missing observations: 1
  Number of blq observations: 0
steady-state flag ss requires ii>0
At the moment, ii and ss work interchangeably for the computation of steady-state parameters. The rules are as follows:
- When ii is specified, ss is not required, and the tau information from ii is used to compute parameters specific to multiple dosing.
- When ss is specified, ii is required, as most steady-state parameters require tau.
- When neither ii nor ss is specified for multiple-dose data, none of the steady-state parameters are computed.
julia> df14 = DataFrame(id = [1,1,1,1,1,1,1,1,1,1], time = [0,1,2,3,4,5,6,7,8,9], amt=[10,0,0,0,0,10,0,0,0,0], conc=[missing,8,6,4,2,missing,8,6,4,2], route = ["iv","iv","iv","iv","iv","iv","iv","iv","iv","iv"], iii = [5,0,0,0,0,5,0,0,0,0], sss = [1,0,0,0,0,1,0,0,0,0])
10×7 DataFrame
 Row │ id     time   amt    conc     route   iii    sss
     │ Int64  Int64  Int64  Int64?   String  Int64  Int64
─────┼────────────────────────────────────────────────────
   1 │     1      0     10  missing  iv          5      1
   2 │     1      1      0        8  iv          0      0
   3 │     1      2      0        6  iv          0      0
   4 │     1      3      0        4  iv          0      0
   5 │     1      4      0        2  iv          0      0
   6 │     1      5     10  missing  iv          5      1
   7 │     1      6      0        8  iv          0      0
   8 │     1      7      0        6  iv          0      0
   9 │     1      8      0        4  iv          0      0
  10 │     1      9      0        2  iv          0      0
julia> df14_r1 = read_nca(df14, observations=:conc)
NCAPopulation (1 subjects):
  Number of missing observations: 1
  Number of blq observations: 0
julia> NCA.accumulationindex(df14_r1)
2×2 DataFrame
 Row │ id     accumulationindex
     │ Int64  Missing
─────┼──────────────────────────
   1 │     1            missing
   2 │     1            missing
julia> df14_r2 = read_nca(df14, observations=:conc, ss=:sss, ii = :iii)
NCAPopulation (1 subjects):
  Number of missing observations: 1
  Number of blq observations: 0
julia> NCA.accumulationindex(df14_r2)
2×2 DataFrame
 Row │ id     accumulationindex
     │ Int64  Float64
─────┼──────────────────────────
   1 │     1            1.06855
   2 │     1            1.06855
julia> df14_r3 = read_nca(df14, observations=:conc, ii=:iii)
NCAPopulation (1 subjects):
  Number of missing observations: 1
  Number of blq observations: 0
julia> NCA.accumulationindex(df14_r3)
2×2 DataFrame
 Row │ id     accumulationindex
     │ Int64  Float64
─────┼──────────────────────────
   1 │     1            1.06855
   2 │     1            1.06855
julia> df14_r4 = read_nca(df14, observations=:conc, ss=:sss, ii=:iii)
NCAPopulation (1 subjects):
  Number of missing observations: 1
  Number of blq observations: 0
julia> NCA.accumulationindex(df14_r4)
2×2 DataFrame
 Row │ id     accumulationindex
     │ Int64  Float64
─────┼──────────────────────────
   1 │     1            1.06855
   2 │     1            1.06855
Specification of Groups
- NCA is always done on a per-NCASubject, per-dose event basis.
- Grouping during an NCA analysis can be at the
  - NCASubject level - e.g. after a single dose, a subject has observations of parent and metabolite, so grouping happens at the analyte level; after multiple doses (a single dose every day), a subject has measurements every day, so grouping happens per day; a subject has received single ascending doses, so grouping is per dose.
  - NCAPopulation level - e.g. the study population is divided into multiple dose groups, so grouping is done by dose; some subjects receive tablets and some subjects receive capsules, so grouping is done by formulation.
- Groups specified in read_nca via the group argument get carried forward into the result data frame, whether a complete report or the result of a single function.
- At the NCASubject level, specifying group allows Pumas-NCA to break down the subject's profile into multiple groups, which ensures that the requirement Non-monotonic time is not allowed within an individual is respected.
- At the NCAPopulation level, specifying group provides a convenient way to carry that variable forward into the result data frame.
- More than one group can be passed in via the group argument using the array of symbols syntax, e.g. group = [:dose, :day].
As of v2.0, Pumas-NCA accepts only one observations column at a time.
The example below emphasizes the grouping at the NCAPopulation level.
julia> df17 = DataFrame(id = [1,1,1,1,1,2,2,2,2,2], time = [0,1,2,3,4,0,1,2,3,4], amt=[10,0,0,0,0,10,0,0,0,0], conc=[missing,8,6,4,6,missing,8,6,4,2], route = ["iv","iv","iv","iv","iv","iv","iv","iv","iv","iv"], formulation = ["T","T", "T","T","T","R", "R", "R", "R", "R"])
10×6 DataFrame
 Row │ id     time   amt    conc     route   formulation
     │ Int64  Int64  Int64  Int64?   String  String
─────┼───────────────────────────────────────────────────
   1 │     1      0     10  missing  iv      T
   2 │     1      1      0        8  iv      T
   3 │     1      2      0        6  iv      T
   4 │     1      3      0        4  iv      T
   5 │     1      4      0        6  iv      T
   6 │     2      0     10  missing  iv      R
   7 │     2      1      0        8  iv      R
   8 │     2      2      0        6  iv      R
   9 │     2      3      0        4  iv      R
  10 │     2      4      0        2  iv      R
julia> df17_r = read_nca(df17, observations=:conc, group = [:formulation])
NCAPopulation (2 subjects):
  Group: ["formulation" => "R", "formulation" => "T"]
  Number of missing observations: 2
  Number of blq observations: 0