Data Format and parsing for NCA
Once source data is read into the Pumas NCA package, the next step is to ensure that the data is correct for NCA analysis. The correctness check requires that the data is presented to the Pumas NCA package in a specific format. This format is called the Pumas-NCA data format (PumasNCADF), which is discussed next.
Pumas-NCA data format - PumasNCADF
PumasNCADF is a standardized format for tabular source data that is required for Pumas NCA analyses. The requirements are listed in full below, in the docstring of the read_nca function:
NCA.read_nca - Function
read_nca(file::AbstractString; kwargs...)
read_nca(df_obs::AbstractDataFrame, df_dose::AbstractDataFrame; id = :id, time = :time, kwargs...)
read_nca(df::DataFrame; id=:id, time=:time, observations=:conc, nominal_time = :nominal_time,
start_time=:start_time, end_time=:end_time, volume=:volume,
amt=:amt, route=:route, duration=:duration, blq=:blq,
ii=:ii, ss=:ss, group=nothing, concu=true, timeu=true, amtu=true, volumeu=true,
verbose=true, sparse = false, kwargs...)
Parse a DataFrame object or a CSV file to NCAPopulation. NCAPopulation holds an array of NCASubjects which contain relevant data for the individual subjects.
Concentrations at dosing rows are NOT ignored in read_nca.
df: DataFrame containing the data for the analysis.
Two DataFrames (the observations DataFrame followed by the dosing DataFrame) can be passed to read_nca as well; the rest of the arguments stay the same in this case.
The following keyword arguments are used to specify column names in the df:
- id: The numeric or string id of the subject. Defaults to :id.
- time: The actual time at which the observations were measured. Defaults to :time.
- observations: The observation (e.g. concentration) time series measurements. Values must be numbers or missing. Defaults to :conc.
- amt: The amount of a dose. Can either be the dosing amount at each dosing time and otherwise missing, or the dosing amount present at each time; in the latter case the first time (for a subject in a subgroup) is considered as the dosing time. Defaults to :amt.
- route: The route of administration. Possible choices are iv for intravenous, ev for extravascular, and inf for infusion. These can be specified as lower, upper or mixed case, e.g. iv, IV or Ev are accepted. Defaults to :route.
- duration: The infusion duration. Should be the duration value or missing. Defaults to :duration.
- blq: Below the lower Limit of Quantification (BLQ). Used to specify that the observation is BLQ. The BLQ column can take a value of 1 for a BLQ observation and 0 otherwise. Defaults to :blq.
- ii: The interdose interval, equivalent to tau. Used to specify the interval length for steady-state dosing. Defaults to the :ii column. If specified, and ss is true, then the analysis returns steady-state parameters, e.g. cminss, cavgss, cmaxss, by computing the accumulationindex.
- ss: The steady-state. Used to specify whether a dose is steady-state; a steady-state dose takes the value 1 and 0 otherwise. It defaults to the :ss column. If ss is set to 1 for a subject, ii should be greater than 0.
- group: The columns to group the data by; splits the subjects based on the group information associated with them. Defaults to no grouping.
- llq: The Lower Limit of Quantification (LLQ). Defaults to nothing.
- concblq: The scheme for handling of BLQ values. Defaults to the dictionary Dict(:first=>:keep, :middle=>:drop, :last=>:keep); further explanation is available in the Handling BLQ Data section.
- concu: The units for observations (e.g. concentration). Defaults to no units.
- amtu: The units for dosing amount. Defaults to no units.
- timeu: The units for time. Defaults to no units.
- volumeu: The units for volume. Defaults to no units.
- verbose: When true, warnings will be thrown when the output does not match PumasNCADF. Defaults to true.
- nominal_time: The nominal time corresponding to the observations. Defaults to :nominal_time.
- sparse: Boolean flag to indicate if the dataset should be treated as a case of sparse sampling. Defaults to false.
Urine analysis requires the following columns, which are not used for plasma:
- start_time: The beginning of the urine collection time. Defaults to :start_time.
- end_time: The end of the urine collection time. Defaults to :end_time.
- volume: Collected urine volume. Defaults to :volume.
For details about the handling of concentration values below the lower limit of quantification, please check out the documentation of NCA.cleanblq. All the keyword arguments of NCA.cleanblq are applicable to read_nca, too.
Examples
The examples below provide various patterns of using read_nca. In addition to showcasing correct usage, we also showcase the expected errors when the function is used incorrectly. This can serve as a quick reference in the event a user faces an error.
Standard DataFrames with no errors
julia> df1 = DataFrame(; id = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2], time = [0, 1, 2, 3, 4, 0, 1, 2, 3, 4], amt = [10, 0, 0, 0, 0, 10, 0, 0, 0, 0], conc = [missing, 8, 6, 4, 2, missing, 8, 6, 4, 2], route = ["iv", "iv", "iv", "iv", "iv", "iv", "iv", "iv", "iv", "iv"], )
10×5 DataFrame
 Row │ id     time   amt    conc     route
     │ Int64  Int64  Int64  Int64?   String
─────┼──────────────────────────────────────
   1 │     1      0     10  missing  iv
   2 │     1      1      0        8  iv
   3 │     1      2      0        6  iv
   4 │     1      3      0        4  iv
   5 │     1      4      0        2  iv
   6 │     2      0     10  missing  iv
   7 │     2      1      0        8  iv
   8 │     2      2      0        6  iv
   9 │     2      3      0        4  iv
  10 │     2      4      0        2  iv
julia> df1_r = read_nca(df1)
NCAPopulation (2 subjects):
  Number of missing observations: 2
  Number of blq observations: 0
We can use the other keyword arguments for more control over how read_nca creates the population. Here we pass units from Unitful for concentration and time with concu and timeu, and specify the slope times for lambdaz with lambdazslopetimes:
julia> time_unit = u"hr"
hr
julia> concentration_unit = u"mg/L"
mg L^-1
julia> custom_slopetimes = [[3, 4], [2, 3]]
2-element Vector{Vector{Int64}}:
 [3, 4]
 [2, 3]
julia> df1_r = read_nca( df1; concu = concentration_unit, timeu = time_unit, lambdazslopetimes = custom_slopetimes, )
One can also pass the indices of the time points instead of the actual time values, using the lambdazidxs keyword argument:
julia> custom_slopetimes_idxs = [[4:5], [3:4]]
2-element Vector{Vector{UnitRange{Int64}}}:
 [4:5]
 [3:4]
julia> df1_r = read_nca( df1; concu = concentration_unit, timeu = time_unit, lambdazidxs = custom_slopetimes_idxs, )
Missing required column route
The warning message below is verbose: it explains the consequence of not passing in the route column, and how to pass it in if it is missing from the source data:
julia> df2 = DataFrame(; id = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2], time = [0, 1, 2, 3, 4, 0, 1, 2, 3, 4], amt = [10, 0, 0, 0, 0, 10, 0, 0, 0, 0], conc = [missing, 8, 6, 4, 2, missing, 8, 6, 4, 2], )
10×4 DataFrame
 Row │ id     time   amt    conc
     │ Int64  Int64  Int64  Int64?
─────┼──────────────────────────────
   1 │     1      0     10  missing
   2 │     1      1      0        8
   3 │     1      2      0        6
   4 │     1      3      0        4
   5 │     1      4      0        2
   6 │     2      0     10  missing
   7 │     2      1      0        8
   8 │     2      2      0        6
   9 │     2      3      0        4
  10 │     2      4      0        2
julia> df2_r = read_nca(df2; observations = :conc)
┌ Warning: No dosage information has passed. If the dataset has dosage information, you can pass the column names by `amt=:amt, route=:route`.
└ @ NCA ~/_work/PumasSystemImages/PumasSystemImages/julia_depot/packages/NCA/CJt5D/src/data_parsing.jl:178
┌ Warning: Dosage information requires the presence of both amt & route information. Looks like you only entered the amt and not the route. If your dataset does not have route, please add a column that specifies the route of administration and then pass both columns as `amt=:amt, route=:route.`
└ @ NCA ~/_work/PumasSystemImages/PumasSystemImages/julia_depot/packages/NCA/CJt5D/src/data_parsing.jl:180
NCAPopulation (2 subjects):
  Number of missing observations: 2
  Number of blq observations: 0
amt can be missing at time of observations
This is totally fine and won't error or emit warnings:
julia> df3 = DataFrame(; id = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2], time = [0, 1, 2, 3, 4, 0, 1, 2, 3, 4], amt = [10, missing, missing, missing, missing, 10, missing, missing, missing, missing], conc = [missing, 8, 6, 4, 2, missing, 8, 6, 4, 2], route = ["iv", "iv", "iv", "iv", "iv", "iv", "iv", "iv", "iv", "iv"], )
10×5 DataFrame
 Row │ id     time   amt      conc     route
     │ Int64  Int64  Int64?   Int64?   String
─────┼────────────────────────────────────────
   1 │     1      0       10  missing  iv
   2 │     1      1  missing        8  iv
   3 │     1      2  missing        6  iv
   4 │     1      3  missing        4  iv
   5 │     1      4  missing        2  iv
   6 │     2      0       10  missing  iv
   7 │     2      1  missing        8  iv
   8 │     2      2  missing        6  iv
   9 │     2      3  missing        4  iv
  10 │     2      4  missing        2  iv
julia> df3_r = read_nca(df3; observations = :conc)
NCAPopulation (2 subjects):
  Number of missing observations: 2
  Number of blq observations: 0
String (non-numeric) observations
The observations column can only be numeric. The error below is raised if the column contains a string element, <LOQ in this example:
julia> df4 = DataFrame(; id = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2], time = [0, 1, 2, 3, 4, 0, 1, 2, 3, 4], amt = [10, missing, missing, missing, missing, 10, missing, missing, missing, missing], conc = [missing, 8, 6, 4, "<LOQ", missing, 8, 6, 4, 2], route = ["iv", "iv", "iv", "iv", "iv", "iv", "iv", "iv", "iv", "iv"], )
10×5 DataFrame
 Row │ id     time   amt      conc     route
     │ Int64  Int64  Int64?   Any      String
─────┼────────────────────────────────────────
   1 │     1      0       10  missing  iv
   2 │     1      1  missing  8        iv
   3 │     1      2  missing  6        iv
   4 │     1      3  missing  4        iv
   5 │     1      4  missing  <LOQ     iv
   6 │     2      0       10  missing  iv
   7 │     2      1  missing  8        iv
   8 │     2      2  missing  6        iv
   9 │     2      3  missing  4        iv
  10 │     2      4  missing  2        iv
julia> df4_r = read_nca(df4; observations = :conc)
ERROR: ArgumentError: conc has non-numeric values at index=[5]. We expect the names column to be of numeric type. Please fix your input data before proceeding further.
One way to circumvent this error is to specify the missingstrings keyword in CSV.read, e.g. CSV.read("pkdata.csv", DataFrame; missingstrings = ["<LOQ"]). In this way, all string elements that match that text will be converted to missing.
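When the data is already in memory rather than read from a CSV file, the same conversion can be done by hand. The sketch below is a minimal illustration in plain Julia. The name clean_observations is a hypothetical helper, not part of Pumas or CSV.jl, and the 0/1 flag it returns merely follows the encoding described for the blq column in the docstring above.

```julia
# Hypothetical helper (not part of Pumas or CSV.jl): convert raw observation
# entries to numbers, mapping BLQ marker strings to `missing` and recording
# a 0/1 flag per row (1 where a marker was found).
function clean_observations(raw; markers = ["<LOQ"])
    vals = Union{Missing,Float64}[]
    blq = Int[]
    for x in raw
        if x isa AbstractString && strip(x) in markers
            push!(vals, missing)              # marker text becomes missing
            push!(blq, 1)                     # and the row is flagged
        elseif x isa AbstractString
            push!(vals, parse(Float64, x))    # numeric text such as "8"
            push!(blq, 0)
        else
            push!(vals, x)                    # numbers and missing pass through
            push!(blq, 0)
        end
    end
    return vals, blq
end

raw = [missing, 8, 6, 4, "<LOQ"]
conc, blq = clean_observations(raw)
```

Here conc holds only numbers and missing, so it satisfies the numeric requirement on the observations column. Whether to also supply the flag vector as a blq column depends on how BLQ values should be handled in the analysis.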
amt column can only be numeric
The amt column must be of a numeric type, otherwise read_nca will error:
julia> df5 = DataFrame(; id = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2], time = [0, 1, 2, 3, 4, 0, 1, 2, 3, 4], amt = [ "10", missing, missing, missing, missing, "10", missing, missing, missing, missing, ], conc = [missing, 8, 6, 4, 2, missing, 8, 6, 4, 2], route = ["iv", "iv", "iv", "iv", "iv", "iv", "iv", "iv", "iv", "iv"], )
10×5 DataFrame
 Row │ id     time   amt      conc     route
     │ Int64  Int64  String?  Int64?   String
─────┼────────────────────────────────────────
   1 │     1      0  10       missing  iv
   2 │     1      1  missing        8  iv
   3 │     1      2  missing        6  iv
   4 │     1      3  missing        4  iv
   5 │     1      4  missing        2  iv
   6 │     2      0  10       missing  iv
   7 │     2      1  missing        8  iv
   8 │     2      2  missing        6  iv
   9 │     2      3  missing        4  iv
  10 │     2      4  missing        2  iv
julia> df5_r = read_nca(df5; observations = :conc)
ERROR: ArgumentError: amt has non-numeric values at index=[1, 6]. We expect the names column to be of numeric type. Please fix your input data before proceeding further.
Concentrations at dosing rows are not ignored
The example below emphasizes the fact that concentrations in dose rows are not ignored:
julia> df6 = DataFrame(; id = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2], time = [0, 1, 2, 3, 4, 0, 1, 2, 3, 4], amt = [10, missing, missing, missing, missing, 10, missing, missing, missing, missing], conc = [10, 8, 6, 4, 2, 10, 8, 6, 4, 2], route = ["iv", "iv", "iv", "iv", "iv", "iv", "iv", "iv", "iv", "iv"], )
10×5 DataFrame
 Row │ id     time   amt      conc   route
     │ Int64  Int64  Int64?   Int64  String
─────┼──────────────────────────────────────
   1 │     1      0       10     10  iv
   2 │     1      1  missing      8  iv
   3 │     1      2  missing      6  iv
   4 │     1      3  missing      4  iv
   5 │     1      4  missing      2  iv
   6 │     2      0       10     10  iv
   7 │     2      1  missing      8  iv
   8 │     2      2  missing      6  iv
   9 │     2      3  missing      4  iv
  10 │     2      4  missing      2  iv
julia> df6_r = read_nca(df6; observations = :conc)
NCAPopulation (2 subjects):
  Number of missing observations: 0
  Number of blq observations: 0
route can be upper-case, lower-case or mixed-case: ev, iv or inf
While mixed case is accommodated, it is recommended that users provide route information in one consistent case, preferably lower-case.
julia> df7 = DataFrame(; id = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2], time = [0, 1, 2, 3, 4, 0, 1, 2, 3, 4], amt = [10, missing, missing, missing, missing, 10, missing, missing, missing, missing], conc = [10, 8, 6, 4, 2, 10, 8, 6, 4, 2], route = ["Iv", "Iv", "Iv", "Iv", "Iv", "IV", "IV", "IV", "IV", "IV"], )
10×5 DataFrame
 Row │ id     time   amt      conc   route
     │ Int64  Int64  Int64?   Int64  String
─────┼──────────────────────────────────────
   1 │     1      0       10     10  Iv
   2 │     1      1  missing      8  Iv
   3 │     1      2  missing      6  Iv
   4 │     1      3  missing      4  Iv
   5 │     1      4  missing      2  Iv
   6 │     2      0       10     10  IV
   7 │     2      1  missing      8  IV
   8 │     2      2  missing      6  IV
   9 │     2      3  missing      4  IV
  10 │     2      4  missing      2  IV
julia> df7_r = read_nca(df7; observations = :conc)
NCAPopulation (2 subjects):
  Number of missing observations: 0
  Number of blq observations: 0
julia> df7_r[1].dose
NCADose:
  time: 0
  amt: 10
  duration: 0
  route: IVBolus
  ss: false
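To follow the recommendation of a single consistent case, the route labels can be normalized before the data is passed to read_nca. The following is a minimal sketch in plain Julia. Both normalize_route and ALLOWED_ROUTES are hypothetical names introduced for illustration and are not part of Pumas.

```julia
# Hypothetical pre-processing helper (not part of Pumas): lower-case route
# labels and reject anything outside the three documented choices.
const ALLOWED_ROUTES = ("iv", "ev", "inf")

function normalize_route(route::AbstractString)
    r = lowercase(strip(route))
    r in ALLOWED_ROUTES || throw(ArgumentError("unknown route: $route"))
    return r
end

routes = ["Iv", "IV", "ev", "INF"]
normalized = normalize_route.(routes)   # broadcast over the whole column
```

Rejecting unknown labels early is a design choice of this sketch: it surfaces typos such as "oral" before parsing rather than relying on the parser to report them.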
Non-monotonic time is not allowed within an individual
Users have to ensure that time is monotonically increasing within a subject, unless there is a grouping variable that is specified:
julia> df8 = DataFrame(; id = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2], time = [0, 1, 2, 3, 3, 0, 1, 2, 3, 3], amt = [10, missing, missing, missing, missing, 10, missing, missing, missing, missing], conc = [10, 8, 6, 4, 2, 10, 8, 6, 4, 2], route = ["iv", "iv", "iv", "iv", "iv", "iv", "iv", "iv", "iv", "iv"], )
10×5 DataFrame
 Row │ id     time   amt      conc   route
     │ Int64  Int64  Int64?   Int64  String
─────┼──────────────────────────────────────
   1 │     1      0       10     10  iv
   2 │     1      1  missing      8  iv
   3 │     1      2  missing      6  iv
   4 │     1      3  missing      4  iv
   5 │     1      3  missing      2  iv
   6 │     2      0       10     10  iv
   7 │     2      1  missing      8  iv
   8 │     2      2  missing      6  iv
   9 │     2      3  missing      4  iv
  10 │     2      3  missing      2  iv
julia> df8_r = read_nca(df8; observations = :conc)
[ Info: ID 1 errored
ERROR: ArgumentError: Time must be monotonically increasing. Errored at `time=3` (index 4)
Missing time is not allowed
Values in the time column must not be missing:
julia> df9 = DataFrame(; id = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2], time = [0, 1, 2, 3, missing, 0, 1, 2, 3, missing], amt = [10, missing, missing, missing, missing, 10, missing, missing, missing, missing], conc = [10, 8, 6, 4, 2, 10, 8, 6, 4, 2], route = ["iv", "iv", "iv", "iv", "iv", "iv", "iv", "iv", "iv", "iv"], )
10×5 DataFrame
 Row │ id     time     amt      conc   route
     │ Int64  Int64?   Int64?   Int64  String
─────┼────────────────────────────────────────
   1 │     1        0       10     10  iv
   2 │     1        1  missing      8  iv
   3 │     1        2  missing      6  iv
   4 │     1        3  missing      4  iv
   5 │     1  missing  missing      2  iv
   6 │     2        0       10     10  iv
   7 │     2        1  missing      8  iv
   8 │     2        2  missing      6  iv
   9 │     2        3  missing      4  iv
  10 │     2  missing  missing      2  iv
julia> df9_r = read_nca(df9; observations = :conc)
[ Info: ID 1 errored
ERROR: ArgumentError: Time may not be missing (missing occured at index 5)
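Both time requirements (no missing values, strictly increasing within a subject) can be checked before calling read_nca. The sketch below is a plain-Julia illustration under the assumption that no grouping variable is used. The name first_time_problem is hypothetical, and the row numbers it reports refer to the full table, which need not match the index printed in the Pumas error messages.

```julia
# Hypothetical pre-check (not part of Pumas): for each subject, report the first
# row whose time is missing or not strictly greater than the previous time.
function first_time_problem(ids, times)
    problems = Dict{eltype(ids),Int}()
    last_time = Dict{eltype(ids),Any}()
    for (row, (id, t)) in enumerate(zip(ids, times))
        haskey(problems, id) && continue          # keep only the first problem
        if ismissing(t) || (haskey(last_time, id) && t <= last_time[id])
            problems[id] = row                    # row index in the full table
        else
            last_time[id] = t
        end
    end
    return problems
end

ids = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2]
times = [0, 1, 2, 3, 3, 0, 1, 2, 3, missing]
problems = first_time_problem(ids, times)
```

For this input, subject 1 is flagged at row 5 (the repeated time 3) and subject 2 at row 10 (the missing time). An empty dictionary means both requirements hold.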
Multiple doses within a subject require contiguous time
julia> df10 = DataFrame(; id = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1], time = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], amt = [10, 0, 0, 0, 0, 10, 0, 0, 0, 0], conc = [missing, 8, 6, 4, 2, missing, 8, 6, 4, 2], route = ["iv", "iv", "iv", "iv", "iv", "iv", "iv", "iv", "iv", "iv"], iii = [5, 0, 0, 0, 0, 5, 0, 0, 0, 0], )
10×6 DataFrame
 Row │ id     time   amt    conc     route   iii
     │ Int64  Int64  Int64  Int64?   String  Int64
─────┼─────────────────────────────────────────────
   1 │     1      0     10  missing  iv          5
   2 │     1      1      0        8  iv          0
   3 │     1      2      0        6  iv          0
   4 │     1      3      0        4  iv          0
   5 │     1      4      0        2  iv          0
   6 │     1      5     10  missing  iv          5
   7 │     1      6      0        8  iv          0
   8 │     1      7      0        6  iv          0
   9 │     1      8      0        4  iv          0
  10 │     1      9      0        2  iv          0
julia> df10_r = read_nca(df10; observations = :conc)
NCAPopulation (1 subjects):
  Number of missing observations: 1
  Number of blq observations: 0
julia> df10_r[1].dose
2-element Vector{NCADose{Int64, Int64}}:
 NCADose:
  time: 0
  amt: 10
  duration: 0
  route: IVBolus
  ss: false
 NCADose:
  time: 5
  amt: 10
  duration: 0
  route: IVBolus
  ss: false
Multiple doses with ii specified allow computation of steady-state values
julia> df11 = DataFrame(; id = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1], time = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], amt = [10, 0, 0, 0, 0, 10, 0, 0, 0, 0], conc = [missing, 8, 6, 4, 2, missing, 8, 6, 4, 2], route = ["iv", "iv", "iv", "iv", "iv", "iv", "iv", "iv", "iv", "iv"], iii = [5, 0, 0, 0, 0, 5, 0, 0, 0, 0], )
10×6 DataFrame
 Row │ id     time   amt    conc     route   iii
     │ Int64  Int64  Int64  Int64?   String  Int64
─────┼─────────────────────────────────────────────
   1 │     1      0     10  missing  iv          5
   2 │     1      1      0        8  iv          0
   3 │     1      2      0        6  iv          0
   4 │     1      3      0        4  iv          0
   5 │     1      4      0        2  iv          0
   6 │     1      5     10  missing  iv          5
   7 │     1      6      0        8  iv          0
   8 │     1      7      0        6  iv          0
   9 │     1      8      0        4  iv          0
  10 │     1      9      0        2  iv          0
julia> df11_r = read_nca(df11; observations = :conc, ii = :iii)
NCAPopulation (1 subjects):
  Number of missing observations: 1
  Number of blq observations: 0
julia> df11_r[1].dose
2-element Vector{NCADose{Int64, Int64}}:
 NCADose:
  time: 0
  amt: 10
  duration: 0
  route: IVBolus
  ss: false
 NCADose:
  time: 5
  amt: 10
  duration: 0
  route: IVBolus
  ss: false
As you can see, the printed results of df10_r and df11_r are identical, even though the latter accepts the ii argument mapped to iii from the dataset. ii specifies tau, the dosing interval. This information allows the Pumas NCA package to compute steady-state parameters such as cmaxss, cminss, cavgss, accumulationindex and tau. We can confirm this by looking at the differences between the two: df10_r, which has no ii information, cannot compute the accumulationindex, whereas df11_r can.
julia> NCA.accumulationindex(df10_r)
2×2 DataFrame
 Row │ id      accumulationindex
     │ String  Missing
─────┼───────────────────────────
   1 │ 1                 missing
   2 │ 1                 missing
julia> NCA.accumulationindex(df11_r)
2×2 DataFrame
 Row │ id      accumulationindex
     │ String  Float64
─────┼───────────────────────────
   1 │ 1                 1.06855
   2 │ 1                 1.06855
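The reported value can be reproduced by hand, which shows why tau is needed. The sketch below assumes the commonly used NCA definition of the accumulation index, 1 / (1 - exp(-lambdaz * tau)), and assumes that the terminal slope is estimated from the last three points of a dosing interval. Pumas selects the terminal points itself, so this choice is an assumption made here for illustration, and ls_slope is a hypothetical helper.

```julia
# Least-squares slope of y against x (plain Julia, no packages).
function ls_slope(x, y)
    xm = sum(x) / length(x)
    ym = sum(y) / length(y)
    return sum((x .- xm) .* (y .- ym)) / sum((x .- xm) .^ 2)
end

# Assumption: the terminal phase is the last three points of one dosing
# interval of df11 (times 2, 3, 4 with concentrations 6, 4, 2).
t = [2.0, 3.0, 4.0]
c = [6.0, 4.0, 2.0]
tau = 5.0                                  # interdose interval from the iii column

lambdaz = -ls_slope(t, log.(c))            # terminal slope on the log scale
accumulation = 1 / (1 - exp(-lambdaz * tau))
```

Under these assumptions lambdaz equals log(3) / 2 and accumulation evaluates to approximately 1.06855, in agreement with the value reported for df11_r. Without ii there is no tau to insert into the formula, which is consistent with the missing result for df10_r.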
Subjects with dosing record only and no observations will result in missing results
julia> df12 = DataFrame(; id = 1, time = 0, amt = 10, conc = missing, route = "iv")
1×5 DataFrame
 Row │ id     time   amt    conc     route
     │ Int64  Int64  Int64  Missing  String
─────┼──────────────────────────────────────
   1 │     1      0     10  missing  iv
read_nca will also give a warning when parsing:
julia> df12_r = read_nca(df12; observations = :conc)
[ Info: ID: 1. Dataset has the amt column amt populated for all rows hence the first time 0 is considered as dose time.
┌ Warning: Subject 1: All concentration data is missing between times 0 and 0
└ @ NCA ~/_work/PumasSystemImages/PumasSystemImages/julia_depot/packages/NCA/CJt5D/src/utils.jl:72
NCAPopulation (1 subjects):
  Number of missing observations: 1
  Number of blq observations: 0
julia> NCA.auc(df12_r)
1×2 DataFrame
 Row │ id      auc
     │ String  Missing
─────┼─────────────────
   1 │ 1       missing
Multiple dosing - all provided doses should have corresponding observations vectors
In the example below, for subject 1, the first dose has observations, but the second dose at time 5 has no associated observations, which results in the warning shown:
julia> df13 = DataFrame(; id = [1, 1, 1, 1, 1, 1], time = [0, 1, 2, 3, 4, 5], amt = [10, 0, 0, 0, 0, 10], conc = [missing, 8, 6, 4, 2, missing], route = ["iv", "iv", "iv", "iv", "iv", "iv"], )
6×5 DataFrame
 Row │ id     time   amt    conc     route
     │ Int64  Int64  Int64  Int64?   String
─────┼──────────────────────────────────────
   1 │     1      0     10  missing  iv
   2 │     1      1      0        8  iv
   3 │     1      2      0        6  iv
   4 │     1      3      0        4  iv
   5 │     1      4      0        2  iv
   6 │     1      5     10  missing  iv
julia> df13_r = read_nca(df13; observations = :conc)
┌ Warning: Subject 1: All concentration data is missing between times 5 and 5
└ @ NCA ~/_work/PumasSystemImages/PumasSystemImages/julia_depot/packages/NCA/CJt5D/src/utils.jl:72
NCAPopulation (1 subjects):
  Number of missing observations: 1
  Number of blq observations: 0
Steady-state flag ss requires ii > 0
At the moment, ii and ss work interchangeably for the computation of steady-state parameters. The rules are as follows:
- When ii is specified, ss is not required, and the information of tau from ii is used to compute parameters specific to multiple dose.
- When ss is specified, ii is required, as most steady-state parameters require tau as information.
- When neither ii nor ss is specified for multiple dose data, none of the steady-state parameters are computed.
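The requirement that a steady-state dose carries a positive ii can be verified on the raw columns before parsing. The following plain-Julia sketch illustrates one way to do that. The name invalid_ss_rows is hypothetical and not part of Pumas, and the iii vector is deliberately altered from the documented data so that one row violates the rule.

```julia
# Hypothetical validation helper (not part of Pumas): return the row indices
# where a dose is flagged steady-state (ss == 1) but ii is missing or not > 0.
function invalid_ss_rows(ss, ii)
    bad = Int[]
    for (row, (s, i)) in enumerate(zip(ss, ii))
        flagged = !ismissing(s) && s == 1
        if flagged && (ismissing(i) || i <= 0)
            push!(bad, row)
        end
    end
    return bad
end

sss = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0]
iii = [5, 0, 0, 0, 0, 0, 0, 0, 0, 0]   # second dose deliberately lacks an interval
bad_rows = invalid_ss_rows(sss, iii)
```

Here bad_rows contains row 6, the second dose, because its ss flag is 1 while its interval is 0.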
julia> df14 = DataFrame(; id = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1], time = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9], amt = [10, 0, 0, 0, 0, 10, 0, 0, 0, 0], conc = [missing, 8, 6, 4, 2, missing, 8, 6, 4, 2], route = ["iv", "iv", "iv", "iv", "iv", "iv", "iv", "iv", "iv", "iv"], iii = [5, 0, 0, 0, 0, 5, 0, 0, 0, 0], sss = [1, 0, 0, 0, 0, 1, 0, 0, 0, 0], )
10×7 DataFrame
 Row │ id     time   amt    conc     route   iii    sss
     │ Int64  Int64  Int64  Int64?   String  Int64  Int64
─────┼────────────────────────────────────────────────────
   1 │     1      0     10  missing  iv          5      1
   2 │     1      1      0        8  iv          0      0
   3 │     1      2      0        6  iv          0      0
   4 │     1      3      0        4  iv          0      0
   5 │     1      4      0        2  iv          0      0
   6 │     1      5     10  missing  iv          5      1
   7 │     1      6      0        8  iv          0      0
   8 │     1      7      0        6  iv          0      0
   9 │     1      8      0        4  iv          0      0
  10 │     1      9      0        2  iv          0      0
julia> df14_r1 = read_nca(df14; observations = :conc)
NCAPopulation (1 subjects):
  Number of missing observations: 1
  Number of blq observations: 0
julia> NCA.accumulationindex(df14_r1)
2×2 DataFrame
 Row │ id      accumulationindex
     │ String  Missing
─────┼───────────────────────────
   1 │ 1                 missing
   2 │ 1                 missing
julia> df14_r2 = read_nca(df14; observations = :conc, ss = :sss, ii = :iii)
NCAPopulation (1 subjects):
  Number of missing observations: 1
  Number of blq observations: 0
julia> NCA.accumulationindex(df14_r2)
2×2 DataFrame
 Row │ id      accumulationindex
     │ String  Float64
─────┼───────────────────────────
   1 │ 1                 1.06855
   2 │ 1                 1.06855
julia> df14_r3 = read_nca(df14; observations = :conc, ii = :iii)
NCAPopulation (1 subjects):
  Number of missing observations: 1
  Number of blq observations: 0
julia> NCA.accumulationindex(df14_r3)
2×2 DataFrame
 Row │ id      accumulationindex
     │ String  Float64
─────┼───────────────────────────
   1 │ 1                 1.06855
   2 │ 1                 1.06855
julia> df14_r4 = read_nca(df14; observations = :conc, ss = :sss, ii = :iii)
NCAPopulation (1 subjects):
  Number of missing observations: 1
  Number of blq observations: 0
julia> NCA.accumulationindex(df14_r4)
2×2 DataFrame
 Row │ id      accumulationindex
     │ String  Float64
─────┼───────────────────────────
   1 │ 1                 1.06855
   2 │ 1                 1.06855
Specification of Groups
- NCA is always done on a per-NCASubject, per-dose-event basis.
- Grouping during an NCA analysis can be at a
  - NCASubject level, e.g. after a single dose, a subject has observations of parent and metabolite, so grouping happens at the analyte level; after multiple doses (a single dose every day), a subject has measurements every day, so grouping happens per day; a subject has received single ascending doses, so grouping is per dose.
  - NCAPopulation level, e.g. the study population is divided into multiple dose groups, so grouping is done by dose; some subjects receive tablets and some subjects receive capsules, so grouping is done by formulation.
- Groups specified in read_nca via the group argument get carried forward into the result data frame, whether a complete report or the result of a single function.
- At a NCASubject level, specifying group allows the Pumas NCA package to break down the subject's profile into multiple groups, which ensures that the "Non-monotonic time is not allowed within an individual" requirement is respected.
- At a NCAPopulation level, specifying group provides a convenient way to carry that variable forward into the result data frame.
- More than one group can be passed in via the group argument using the array of symbols syntax, e.g. group=[:dose, :day]
The example below emphasizes the grouping at the NCAPopulation level:
julia> df17 = DataFrame(; id = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2], time = [0, 1, 2, 3, 4, 0, 1, 2, 3, 4], amt = [10, 0, 0, 0, 0, 10, 0, 0, 0, 0], conc = [missing, 8, 6, 4, 6, missing, 8, 6, 4, 2], route = ["iv", "iv", "iv", "iv", "iv", "iv", "iv", "iv", "iv", "iv"], formulation = ["T", "T", "T", "T", "T", "R", "R", "R", "R", "R"], )
10×6 DataFrame
 Row │ id     time   amt    conc     route   formulation
     │ Int64  Int64  Int64  Int64?   String  String
─────┼───────────────────────────────────────────────────
   1 │     1      0     10  missing  iv      T
   2 │     1      1      0        8  iv      T
   3 │     1      2      0        6  iv      T
   4 │     1      3      0        4  iv      T
   5 │     1      4      0        6  iv      T
   6 │     2      0     10  missing  iv      R
   7 │     2      1      0        8  iv      R
   8 │     2      2      0        6  iv      R
   9 │     2      3      0        4  iv      R
  10 │     2      4      0        2  iv      R
julia> df17_r = read_nca(df17; observations = :conc, group = [:formulation])
NCAPopulation (2 subjects):
  Group: [["formulation" => "R"], ["formulation" => "T"]]
  Number of missing observations: 2
  Number of blq observations: 0
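Grouping at the NCASubject level can be pictured as splitting one subject's records by the grouping key, so that time only has to increase within each group. The sketch below is a plain-Julia illustration of that idea and is not the Pumas implementation. The name split_by_group and the day column are hypothetical.

```julia
# Hypothetical illustration (not the Pumas implementation): split one subject's
# time values by a grouping key so each group forms its own profile.
function split_by_group(times, groups)
    profiles = Dict{eltype(groups),Vector{eltype(times)}}()
    for (t, g) in zip(times, groups)
        push!(get!(profiles, g, eltype(times)[]), t)
    end
    return profiles
end

# One subject sampled on two days; the clock restarts each day, so the
# combined time vector is not increasing.
times = [0, 1, 2, 0, 1, 2]
day = [1, 1, 1, 2, 2, 2]
profiles = split_by_group(times, day)

# Strictly increasing within every group, although not across the subject.
all_increasing = all(all(diff(p) .> 0) for p in values(profiles))
```

In this example the combined time vector would violate the monotonic-time requirement, while each per-day profile satisfies it. This is the situation in which a grouping column such as the assumed day would be passed through the group argument.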