ADaM Docstrings
ADaM.JoinColumnsKeywordError — Type
JoinColumnsKeywordError(keyword::Symbol, err::Exception)
Custom error type for join_columns keyword argument failures.
ADaM.basic_info_pc — Method
basic_info_pc(df::DataFrame)
Displays basic information about the pc (PK concentration) dataset in a dictionary containing:
- Studies involved
- Number of subjects (overall)
- Treatments
- Sample specimens
Example
julia> pc = PharmaDatasets.dataset("SDTM/CDISCPILOT01/pc")
3556×20 DataFrame
Row │ STUDYID DOMAIN USUBJID PCSEQ PCTESTCD PCTEST PCORRES PCORRESU PCSTRESC PCS ⋯
│ String15 String3 String15 Float64 String3 String15 String31 String7 String31 Flo ⋯
──────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
1 │ CDISCPILOT01 PC 01-701-1015 1.0 XAN XANOMELINE <BLQ ug/ml <BLQ ⋯
2 │ CDISCPILOT01 PC 01-701-1015 2.0 XAN XANOMELINE <BLQ ug/ml <BLQ mis
3 │ CDISCPILOT01 PC 01-701-1015 3.0 XAN XANOMELINE <BLQ ug/ml <BLQ mis
4 │ CDISCPILOT01 PC 01-701-1015 4.0 XAN XANOMELINE <BLQ ug/ml <BLQ mis
5 │ CDISCPILOT01 PC 01-701-1015 5.0 XAN XANOMELINE <BLQ ug/ml <BLQ mis ⋯
6 │ CDISCPILOT01 PC 01-701-1015 6.0 XAN XANOMELINE <BLQ ug/ml <BLQ mis
7 │ CDISCPILOT01 PC 01-701-1015 7.0 XAN XANOMELINE <BLQ ug/ml <BLQ mis
8 │ CDISCPILOT01 PC 01-701-1015 8.0 XAN XANOMELINE <BLQ ug/ml <BLQ mis
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋱
3550 │ CDISCPILOT01 PC 01-718-1427 8.0 XAN XANOMELINE 1.87286298246525 ug/ml 1.87286298246525 ⋯
3551 │ CDISCPILOT01 PC 01-718-1427 9.0 XAN XANOMELINE 1.8956805216499 ug/ml 1.8956805216499
3552 │ CDISCPILOT01 PC 01-718-1427 10.0 XAN XANOMELINE 0.575294228033741 ug/ml 0.575294228033741
3553 │ CDISCPILOT01 PC 01-718-1427 11.0 XAN XANOMELINE 0.173882563295603 ug/ml 0.173882563295603
3554 │ CDISCPILOT01 PC 01-718-1427 12.0 XAN XANOMELINE 0.015885031037154 ug/ml 0.015885031037154 ⋯
3555 │ CDISCPILOT01 PC 01-718-1427 13.0 XAN XANOMELINE <BLQ ug/ml <BLQ mis
3556 │ CDISCPILOT01 PC 01-718-1427 14.0 XAN XANOMELINE <BLQ ug/ml <BLQ mis
11 columns and 3541 rows omitted
julia> basic_info_pc(pc)
Dict{String, Any} with 4 entries:
"studies" => String15["CDISCPILOT01"]
"subjects" => 254
"treatments" => String15["XANOMELINE"]
"specimens" => String7["PLASMA"]
ADaM.bmi_summary — Method
bmi_summary(df::DataFrame; id, bmi)
Displays the count of each BMI category based on id. Category names follow the National Library of Medicine classification:
- Underweight (Below 18.5)
- Normal (18.5 to 24.9)
- Overweight (25.0 to 29.9)
- Obese (30.0 to 39.9)
- Extreme (Over 40)
id defaults to :USUBJID and bmi defaults to :BMIBL.
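The categorisation above can be sketched in plain Julia, without ADaM or DataFrames. The helper names bmi_category and count_categories are hypothetical, and the boundary handling (half-open intervals, with 40 and above counted as Extreme) is an assumption, since the listed ranges do not say where values such as 24.95 or exactly 40 fall.

```julia
# Assumed half-open boundaries: [18.5, 25) Normal, [25, 30) Overweight,
# [30, 40) Obese, >= 40 Extreme. The package may treat edge values differently.
function bmi_category(bmi)
    bmi < 18.5 && return "Underweight"
    bmi < 25.0 && return "Normal"
    bmi < 30.0 && return "Overweight"
    bmi < 40.0 && return "Obese"
    return "Extreme"
end

# Count how many values fall into each category.
function count_categories(values)
    counts = Dict{String, Int}()
    for v in values
        label = bmi_category(v)
        counts[label] = get(counts, label, 0) + 1
    end
    return counts
end

counts = count_categories([15, 42, 31, 25, 21, 46, 18, 19])
```

With the BMI values used in the example below, this sketch yields the same per-category counts as the documented bmi_summary output.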
Example
julia> df = DataFrame(USUBJID = [1,2,3,4,5,6,7,8], BMIBL = [15,42,31,25,21,46,18,19])
julia> bmi_summary(df)
5×2 DataFrame
Row │ BMIC count
│ String Int64
─────┼────────────────────
1 │ Underweight 2
2 │ Extreme 2
3 │ Obese 1
4 │ Overweight 1
5 │ Normal 2
julia> df = DataFrame(ID = [1,2,3,4,5,6,7,8], BMI = [15,42,31,25,21,46,18,19])
julia> bmi_summary(df, id="ID", bmi = "BMI")
5×2 DataFrame
Row │ BMIC count
│ String Int64
─────┼────────────────────
1 │ Underweight 2
2 │ Extreme 2
3 │ Obese 1
4 │ Overweight 1
5 │ Normal 2
ADaM.body_mass_index — Method
body_mass_index(weight::Number, height::Number; kwargs...)
body_mass_index(weight::Quantity, height::Quantity)
body_mass_index(df::DataFrame; kwargs...)
Calculates BMI from height and weight, which can be provided as Quantity values, scalars, or vectors via a DataFrame. Reference: BMI (Wikipedia).
weight_unit and height_unit can be passed explicitly as unit values or as a Vector. Units follow DynamicQuantities.jl.
Arguments
- weight: Weight value.
- height: Height value.
- weight_unit: Weight unit (default: "kg").
- height_unit: Height unit (default: "cm").
Default Columns
The following default column names are used for DataFrame input:
- weight = :WTBL
- height = :HTBL
- weight_unit = :WTBLU
- height_unit = :HTBLU
- col = :BMIBL
Output
BMI value in kg/m² as a scalar or DataFrame column with unit column.
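As a rough illustration of the underlying arithmetic (not the package's implementation, which also handles units through DynamicQuantities.jl), BMI is weight in kilograms divided by the square of height in metres. The helper name bmi_kg_m2 is hypothetical and assumes weight in kg and height in cm.

```julia
# BMI = weight (kg) / height (m)^2; height is supplied in cm and converted.
bmi_kg_m2(weight_kg, height_cm) = weight_kg / (height_cm / 100)^2

bmi = bmi_kg_m2(60, 160)   # 60 kg and 160 cm, roughly 23.44 kg/m²
```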
Examples
julia> bmi = body_mass_index(60, 160)
23.437499999999996 m⁻² kg
julia> value, unit = ustrip(bmi), dimension(bmi)
(23.437499999999996, m⁻² kg)
julia> body_mass_index(60, 1.6, height_unit = :m)
23.437499999999996 m⁻² kg
julia> body_mass_index(60000, 160, weight_unit = :g)
23.437499999999996 m⁻² kg
julia> body_mass_index(60u"kg", 160u"cm")
23.437499999999996 m⁻² kg
julia> df = DataFrame(
HTBL = [150, 160, 170, 180],
WTBL = [50, 60, 70, 80],
HTBLU = "cm",
WTBLU = "kg",
)
4×4 DataFrame
Row │ HTBL WTBL HTBLU WTBLU
│ Int64 Int64 String String
─────┼──────────────────────────────
1 │ 150 50 cm kg
2 │ 160 60 cm kg
3 │ 170 70 cm kg
4 │ 180 80 cm kg
julia> body_mass_index(df)
4×6 DataFrame
Row │ HTBL WTBL HTBLU WTBLU BMIBL BMIBLU
│ Int64 Int64 String String Float64 Symbolic…
─────┼──────────────────────────────────────────────────
1 │ 150 50 cm kg 22.2222 m⁻² kg
2 │ 160 60 cm kg 23.4375 m⁻² kg
3 │ 170 70 cm kg 24.2215 m⁻² kg
4 │ 180 80 cm kg 24.6914 m⁻² kg
julia> df = DataFrame(HT = [150, 160, 170, 180], WT = [50, 60, 70, 80], HTU = "cm", WTU = "kg")
4×4 DataFrame
Row │ HT WT HTU WTU
│ Int64 Int64 String String
─────┼──────────────────────────────
1 │ 150 50 cm kg
2 │ 160 60 cm kg
3 │ 170 70 cm kg
4 │ 180 80 cm kg
julia> body_mass_index(df, height = :HT, weight = :WT, height_unit = :HTU, weight_unit = :WTU, col = :BMI)
4×6 DataFrame
Row │ HT WT HTU WTU BMI BMIU
│ Int64 Int64 String String Float64 Symbolic…
─────┼──────────────────────────────────────────────────
1 │ 150 50 cm kg 22.2222 m⁻² kg
2 │ 160 60 cm kg 23.4375 m⁻² kg
3 │ 170 70 cm kg 24.2215 m⁻² kg
4 │ 180 80 cm kg 24.6914 m⁻² kg
ADaM.body_surface_area — Method
body_surface_area(height::Number, weight::Number; kwargs...)
body_surface_area(height::Quantity, weight::Quantity; kwargs...)
body_surface_area(df::DataFrame; kwargs...)
Calculates BSA from height and weight, which can be provided as Quantity values, scalars, or vectors via a DataFrame. Reference: BSA (Wikipedia).
BSA can be calculated using the following formulas:
- mosteller (default)
- dubois-dubois
- haycock
- gehan-george
- boyd
- fujimoto
- takahira
weight_unit and height_unit can be passed explicitly as unit values or as a Vector. Units follow DynamicQuantities.jl.
Arguments
- height: Height value.
- weight: Weight value.
- height_unit: Height unit (default: "cm").
- weight_unit: Weight unit (default: "kg").
- formula: BSA calculation formula (default: :mosteller).
Default Columns
The following default column names are used for DataFrame input:
- height = :HTBL
- weight = :WTBL
- height_unit = :HTBLU
- weight_unit = :WTBLU
- col = :BSABL
Output
BSA value in m² as a scalar or DataFrame column with unit column.
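For orientation, two of the listed formulas can be written out directly. This is a sketch under stated assumptions: height in cm, weight in kg, hypothetical helper names, and the commonly cited Du Bois coefficients, which may differ slightly from the constants the package uses internally.

```julia
# Mosteller: sqrt(height_cm * weight_kg / 3600), result in m².
bsa_mosteller(height_cm, weight_kg) = sqrt(height_cm * weight_kg / 3600)

# Du Bois and Du Bois (commonly cited form):
# 0.007184 * weight_kg^0.425 * height_cm^0.725, result in m².
bsa_dubois(height_cm, weight_kg) = 0.007184 * weight_kg^0.425 * height_cm^0.725

mosteller = bsa_mosteller(160, 60)   # roughly 1.633 m²
dubois = bsa_dubois(160, 60)         # roughly 1.622 m²
```

Both values agree with the documented outputs below to about three decimal places.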
Examples
julia> bsa = body_surface_area(160, 60) # default height(cm), weight(kg), formula(mosteller)
1.632993161855452 m²
julia> value, unit = ustrip(bsa), dimension(bsa)
(1.632993161855452, m²)
julia> bsa = body_surface_area(160, 60, formula="dubois-dubois")
1.6220414635466536 m²
julia> bsa = body_surface_area(160, 60, formula=:takahira)
1.6349324596500971 m²
julia> body_surface_area( 1.6, 60, height_unit = :m)
1.632993161855452 m²
julia> body_surface_area(160, 60000, weight_unit = :g)
1.632993161855452 m²
julia> body_surface_area(160u"cm", 60u"kg")
1.632993161855452 m²
julia> df = DataFrame(
HTBL = [150, 160, 170, 180],
WTBL = [50, 60, 70, 80],
HTBLU = "cm",
WTBLU = "kg",
)
4×4 DataFrame
Row │ HTBL WTBL HTBLU WTBLU
│ Int64 Int64 String String
─────┼──────────────────────────────
1 │ 150 50 cm kg
2 │ 160 60 cm kg
3 │ 170 70 cm kg
4 │ 180 80 cm kg
julia> body_surface_area(df)
4×6 DataFrame
Row │ HTBL WTBL HTBLU WTBLU BSABL BSABLU
│ Int64 Int64 String String Float64 Symbolic…
─────┼──────────────────────────────────────────────────
1 │ 150 50 cm kg 1.44338 m²
2 │ 160 60 cm kg 1.63299 m²
3 │ 170 70 cm kg 1.81812 m²
4 │ 180 80 cm kg 2.0 m²
julia> df = DataFrame(HT = [150, 160, 170, 180], WT = [50, 60, 70, 80], HTU = "cm", WTU = "kg")
4×4 DataFrame
Row │ HT WT HTU WTU
│ Int64 Int64 String String
─────┼──────────────────────────────
1 │ 150 50 cm kg
2 │ 160 60 cm kg
3 │ 170 70 cm kg
4 │ 180 80 cm kg
julia> body_surface_area(df, height = :HT, weight = :WT, height_unit = :HTU, weight_unit = :WTU, col = :BSA)
4×6 DataFrame
Row │ HT WT HTU WTU BSA BSAU
│ Int64 Int64 String String Float64 Symbolic…
─────┼──────────────────────────────────────────────────
1 │ 150 50 cm kg 1.44338 m²
2 │ 160 60 cm kg 1.63299 m²
3 │ 170 70 cm kg 1.81812 m²
4 │ 180 80 cm kg 2.0 m²
ADaM.compress_dose_events — Method
compress_dose_events(df::DataFrame; group, order, sampling_rows)
Compresses each sequence of dosing rows (EVID == 1) into a compact format based on the EVID column, creating ADDL (Additional Doses) and II (Inter-dose Interval) columns.
group and order variables (Vectors or Scalars) can be passed to customise the compression.
Compression can retain either one inter-sampling row (sampling_rows = :single) or two inter-sampling rows (sampling_rows = :double). Only the information from the first row of each sequence is retained.
Required columns for compression: EVID
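The core idea for the :single case can be sketched as a run-length compression over already grouped and ordered rows. This is an illustration only: compress_runs is a hypothetical helper, rows are plain NamedTuples rather than a DataFrame, ADDL is taken to be the number of follow-up doses folded into the first row (as the example outputs suggest), and the II column is not computed.

```julia
# Collapse each run of consecutive dosing rows (EVID == 1) into its first row,
# recording the number of collapsed follow-up doses in ADDL.
function compress_runs(rows)
    out = NamedTuple[]
    i = 1
    while i <= length(rows)
        if rows[i].EVID == 1
            j = i
            # Extend j to the last row of this run of dosing rows.
            while j < length(rows) && rows[j + 1].EVID == 1
                j += 1
            end
            push!(out, merge(rows[i], (ADDL = j - i,)))
            i = j + 1
        else
            # Non-dosing rows are kept as they are, with ADDL = 0.
            push!(out, merge(rows[i], (ADDL = 0,)))
            i += 1
        end
    end
    return out
end

rows = [(ID = 1, EVID = 0, AFRLT = 10), (ID = 1, EVID = 1, AFRLT = 20),
        (ID = 1, EVID = 1, AFRLT = 30), (ID = 1, EVID = 1, AFRLT = 40),
        (ID = 1, EVID = 0, AFRLT = 50)]
compressed = compress_runs(rows)
```

For subject 1, this keeps the rows at AFRLT 10, 20 and 50, with ADDL = 2 on the dosing row, which mirrors the grouped and ordered example below.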
Example
julia> df = DataFrame([
(1, 1, 40),
(2, 1, 90),
(1, 0, 10),
(2, 0, 60),
(1, 1, 20),
(2, 1, 70),
(1, 1, 30),
(2, 1, 80),
(1, 0, 50),
(2, 0, 100)
], [:ID, :EVID, :AFRLT]) # unordered and ungrouped dataset
10×3 DataFrame
Row │ ID EVID AFRLT
│ Int64 Int64 Int64
─────┼─────────────────────
1 │ 1 1 40
2 │ 2 1 90
3 │ 1 0 10
4 │ 2 0 60
5 │ 1 1 20
6 │ 2 1 70
7 │ 1 1 30
8 │ 2 1 80
9 │ 1 0 50
10 │ 2 0 100
julia> compress_dose_events(df) # compress without groupby
6×4 DataFrame
Row │ ID EVID AFRLT ADDL
│ Int64 Int64 Int64 Int64
─────┼────────────────────────────
1 │ 1 1 40 1
2 │ 1 0 10 0
3 │ 2 0 60 0
4 │ 1 1 20 3
5 │ 1 0 50 0
6 │ 2 0 100 0
julia> compress_dose_events(df, group = ["ID"])
8×4 DataFrame
Row │ ID EVID AFRLT ADDL
│ Int64 Int64 Int64 Int64
─────┼────────────────────────────
1 │ 1 1 40 0
2 │ 1 0 10 0
3 │ 1 1 20 1
4 │ 1 0 50 0
5 │ 2 1 90 0
6 │ 2 0 60 0
7 │ 2 1 70 1
8 │ 2 0 100 0
julia> compress_dose_events(df, group = [:ID], order = [:AFRLT])
6×4 DataFrame
Row │ ID EVID AFRLT ADDL
│ Int64 Int64 Int64 Int64
─────┼────────────────────────────
1 │ 1 0 10 0
2 │ 1 1 20 2
3 │ 1 0 50 0
4 │ 2 0 60 0
5 │ 2 1 70 2
6 │ 2 0 100 0
julia> compress_dose_events(df, group = "ID", order = "AFRLT", sampling_rows = :double)
8×4 DataFrame
Row │ ID EVID AFRLT ADDL
│ Int64 Int64 Int64 Int64
─────┼────────────────────────────
1 │ 1 0 10 0
2 │ 1 1 20 1
3 │ 1 1 40 0
4 │ 1 0 50 0
5 │ 2 0 60 0
6 │ 2 1 70 1
7 │ 2 1 90 0
8 │ 2 0 100 0
ADaM.convert_to_missing — Method
convert_to_missing(df::DataFrame, NaStr::Vector)
- Converts values to missing. The values that need to be converted to missing can be passed as a Vector.
- Example: [nothing, "", NaN, ".", "-"]
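The per-column replacement can be illustrated on a plain vector. The helper to_missing is hypothetical; it compares with isequal so that NaN entries in the list are matched as well, which is an assumption about the intended behaviour rather than a statement about the package.

```julia
# Replace every value that matches an entry of `na_values` with `missing`.
# `isequal` is used so that NaN matches NaN.
to_missing(col, na_values) = [any(isequal(x), na_values) ? missing : x for x in col]

col = to_missing(Any[1, ".", "", nothing], ["", nothing, ".", "-"])
nan_col = to_missing([1.0, NaN], [NaN])
```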
Example
julia> df = DataFrame(Col1 = [1, "."], Col2 = ["", 2], Col3 = [3, nothing], Col4 = ["-", 4])
2×4 DataFrame
Row │ Col1 Col2 Col3 Col4
│ Any Any Union… Any
─────┼──────────────────────────
1 │ 1 3 -
2 │ . 2 4
julia> convert_to_missing(df, ["", nothing, ".", "-"])
2×4 DataFrame
Row │ Col1 Col2 Col3 Col4
│ Int64? Int64? Int64? Int64?
─────┼────────────────────────────────────
1 │ 1 missing 3 missing
2 │ missing 2 missing 4
ADaM.creatinine_clearance — Method
creatinine_clearance(weight::Number, height::Number; kwargs...)
creatinine_clearance(weight::Quantity, height::Quantity)
creatinine_clearance(df::DataFrame; kwargs...)
Calculates creatinine clearance from age, weight, creatinine and sex using the Cockcroft–Gault formula. Inputs can be provided as Quantity values, scalars, or vectors via a DataFrame. Reference: Cockcroft–Gault formula (Wikipedia).
age_unit, weight_unit and creat_unit can be passed explicitly as unit values or as a Vector. Units follow DynamicQuantities.jl.
Default values for keyword arguments:
- age = :AGE
- weight = :WTBL
- creat = :CREATBL
- sex = :SEX
- age_unit = :AGEU
- weight_unit = :WTBLU
- creat_unit = :CREATBLU
- col = :CRCLBL
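The Cockcroft–Gault formula itself is short enough to write out. The sketch below assumes age in years, weight in kg and serum creatinine in mg/dL, and uses a hypothetical helper name; unit conversion (for example from umol/L) is left out.

```julia
# Cockcroft–Gault: (140 - age) * weight_kg / (72 * serum creatinine in mg/dL),
# multiplied by 0.85 for females. Result in mL/min.
function cockcroft_gault(age_yr, weight_kg, creat_mg_dl, sex)
    crcl = (140 - age_yr) * weight_kg / (72 * creat_mg_dl)
    return uppercase(sex) in ("F", "FEMALE") ? 0.85 * crcl : crcl
end

crcl_male = cockcroft_gault(53, 85, 1, "M")     # roughly 102.7 mL/min
crcl_female = cockcroft_gault(53, 85, 1, "F")   # 0.85 times the male value
```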
Example
julia> creatinine_clearance(53, 85, 90, "M", creat_unit = "umol/L")
100.88434438681963 min⁻¹ mL
julia> creatinine_clearance(53, 85, 1, "M", creat_unit = "mg/dL")
102.70833333333333 min⁻¹ mL
julia> creatinine_clearance(53us"yr", 85us"kg", 90us"umol/L", "M")
100.88434438681963 min⁻¹ mL
julia> creatinine_clearance(53u"yr", 85u"kg", 1us"mg/dL", "M")
102.70833333333333 min⁻¹ mL
julia> df = DataFrame(
AGE = [20, 30, 40, 53],
WTBL = [50, 60, 70, 85],
CREATBL = [60, 70, 80, 90],
SEX = ["M", "M", "F", "F"],
AGEU = :yr,
WTBLU = :kg,
CREATBLU = "umol/L",
)
4×7 DataFrame
Row │ AGE WTBL CREATBL SEX AGEU WTBLU CREATBLU
│ Int64 Int64 Int64 String Symbol Symbol String
─────┼─────────────────────────────────────────────────────────
1 │ 20 50 60 M yr kg umol/L
2 │ 30 60 70 M yr kg umol/L
3 │ 40 70 80 F yr kg umol/L
4 │ 53 85 90 F yr kg umol/L
julia> creatinine_clearance(df)
4×9 DataFrame
Row │ AGE WTBL CREATBL SEX AGEU WTBLU CREATBLU CRCLBL CRCLBLU
│ Int64 Int64 Int64 String Symbol Symbol String Float64 Symbolic…
─────┼──────────────────────────────────────────────────────────────────────────────
1 │ 20 50 60 M yr kg umol/L 122.78 min⁻¹ mL
2 │ 30 60 70 M yr kg umol/L 115.764 min⁻¹ mL
3 │ 40 70 80 F yr kg umol/L 91.3177 min⁻¹ mL
4 │ 53 85 90 F yr kg umol/L 85.7517 min⁻¹ mL
julia> df = DataFrame(
AGEYRS = [20, 30, 40, 53],
WEIGHT = [50, 60, 70, 85],
CREAT = [60, 70, 80, 90],
GENDER = ["M", "M", "F", "F"],
AGEUNI = :yr,
WTUNI = :kg,
CREATUNI = "umol/L",
)
4×7 DataFrame
Row │ AGEYRS WEIGHT CREAT GENDER AGEUNI WTUNI CREATUNI
│ Int64 Int64 Int64 String Symbol Symbol String
─────┼─────────────────────────────────────────────────────────
1 │ 20 50 60 M yr kg umol/L
2 │ 30 60 70 M yr kg umol/L
3 │ 40 70 80 F yr kg umol/L
4 │ 53 85 90 F yr kg umol/L
julia> creatinine_clearance(
df;
age = :AGEYRS,
weight = :WEIGHT,
creat = :CREAT,
sex = :GENDER,
age_unit = :AGEUNI,
weight_unit = :WTUNI,
creat_unit = :CREATUNI,
col = :CREATCL,
)
4×9 DataFrame
Row │ AGEYRS WEIGHT CREAT GENDER AGEUNI WTUNI CREATUNI CREATCL CREATCLU
│ Int64 Int64 Int64 String Symbol Symbol String Float64 Symbolic…
─────┼──────────────────────────────────────────────────────────────────────────────
1 │ 20 50 60 M yr kg umol/L 122.78 min⁻¹ mL
2 │ 30 60 70 M yr kg umol/L 115.764 min⁻¹ mL
3 │ 40 70 80 F yr kg umol/L 91.3177 min⁻¹ mL
4 │ 53 85 90 F yr kg umol/L 85.7517 min⁻¹ mL
ADaM.definition_table — Method
definition_table(table)
Creates a Table that gives a definition overview of the columns of table, intended to give a quick intuition of the dataset. Categorical and numeric columns (NUM, CD, N) are automatically mapped to each other in the Summary column. Custom comments can be passed for each column as additional information.
Keyword arguments
- max_categories = 10: Limit the number of categories listed individually for categorical columns; the rest will be lumped together.
- label_metadata_key = "label": Key to look up column label metadata with.
- map_dict: Helps map unique values of 2 columns.
- comment_dict: Helps display custom comments for columns.
ADaM.derive_body_covariates — Method
derive_body_covariates(df; kwargs...)
Derive standard body size (BMI, BSA) and renal function (CRCL, EGFR) covariates from a DataFrame.
This is a convenience function that internally calls:
- body_mass_index
- body_surface_area
- creatinine_clearance
- est_glomerular_filtration_rate
Positional Arguments
- df: Input DataFrame containing subject data.
Keyword Arguments
- bmi: Output column for Body Mass Index (default: :BMIBL)
- bsa: Output column for Body Surface Area (default: :BSABL)
- crcl: Output column for Creatinine Clearance (default: :CRCLBL)
- egfr: Output column for estimated GFR (default: :EGFRBL)
- weight: Input column for weight (default: :WTBL)
- height: Input column for height (default: :HTBL)
- weight_unit: Input column for weight units (default: :WTBLU)
- height_unit: Input column for height units (default: :HTBLU)
- age: Input column for age (default: :AGE)
- age_unit: Input column for age units (default: :AGEU)
- sex: Input column for sex/gender (default: :SEX)
- creat: Input column for serum creatinine (default: :CREATBL)
- creat_unit: Input column for creatinine units (default: :CREATBLU)
- bsa_formula: Formula for BSA calculation (default: :mosteller)
- egfr_formula: Formula for eGFR calculation (default: "ckd-epi-creat-2021")
Returns
A new DataFrame with additional columns for derived covariates, including BMI, BSA, CRCL, and EGFR column values and units.
Examples
julia> df = DataFrame(
HTBL = [150, 160, 170, 180],
WTBL = [50, 60, 70, 80],
HTBLU = "cm",
WTBLU = "kg",
)
4×4 DataFrame
Row │ HTBL WTBL HTBLU WTBLU
│ Int64 Int64 String String
─────┼──────────────────────────────
1 │ 150 50 cm kg
2 │ 160 60 cm kg
3 │ 170 70 cm kg
4 │ 180 80 cm kg
julia> derive_body_covariates(df)
4×8 DataFrame
Row │ HTBL WTBL HTBLU WTBLU BMIBL BMIBLU BSABL BSABLU
│ Int64 Int64 String String Float64 Symbolic… Float64 Symbolic…
─────┼──────────────────────────────────────────────────────────────────────
1 │ 150 50 cm kg 22.2222 m⁻² kg 1.44338 m²
2 │ 160 60 cm kg 23.4375 m⁻² kg 1.63299 m²
3 │ 170 70 cm kg 24.2215 m⁻² kg 1.81812 m²
4 │ 180 80 cm kg 24.6914 m⁻² kg 2.0 m²
julia> df = DataFrame(
HTBL = [150, 160, 170, 180],
WTBL = [50, 60, 70, 80],
HTBLU = "cm",
WTBLU = "kg",
AGE = [20, 30, 40, 53],
CREATBL = [60, 70, 80, 90],
SEX = ["M", "M", "F", "F"],
AGEU = :yr,
CREATBLU = "umol/L",
)
4×9 DataFrame
Row │ HTBL WTBL HTBLU WTBLU AGE CREATBL SEX AGEU CREATBLU
│ Int64 Int64 String String Int64 Int64 String Symbol String
─────┼────────────────────────────────────────────────────────────────────────
1 │ 150 50 cm kg 20 60 M yr umol/L
2 │ 160 60 cm kg 30 70 M yr umol/L
3 │ 170 70 cm kg 40 80 F yr umol/L
4 │ 180 80 cm kg 53 90 F yr umol/L
julia> derive_body_covariates(df)
4×17 DataFrame
Row │ HTBL WTBL HTBLU WTBLU AGE CREATBL SEX AGEU CREATBLU BMIBL BMIBLU BSABL BSABLU CRCLBL CRCLBLU EGFRBL EGFRBLU
│ Int64 Int64 String String Int64 Int64 String Symbol String Float64 Symbolic… Float64 Symbolic… Float64 Symbolic… Float64 String
─────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
1 │ 150 50 cm kg 20 60 M yr umol/L 22.2222 m⁻² kg 1.44338 m² 122.78 min⁻¹ mL 136.546 min⁻¹ mL/1.73m²
2 │ 160 60 cm kg 30 70 M yr umol/L 23.4375 m⁻² kg 1.63299 m² 115.764 min⁻¹ mL 122.476 min⁻¹ mL/1.73m²
3 │ 170 70 cm kg 40 80 F yr umol/L 24.2215 m⁻² kg 1.81812 m² 91.3177 min⁻¹ mL 82.3362 min⁻¹ mL/1.73m²
4 │ 180 80 cm kg 53 90 F yr umol/L 24.6914 m⁻² kg 2.0 m² 80.7075 min⁻¹ mL 65.9318 min⁻¹ mL/1.73m²
ADaM.dose_interval_sequence — Method
dose_interval_sequence(df::DataFrame; group, order)
Creates a column DINTSEQ from either a dose (EX) dataset or a combined dataset (EX and PC).
The DINTSEQ column contains a running sequence number of dose/washout events (EVID in [1, 4]), starting from missing (predose sample) or 1 (initial dose).
group and order variables (Vectors or Scalars) can be passed to customise the sequence.
Required columns for dose count: EVID
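The sequence can be thought of as a running count of dosing or washout events within each group. A minimal sketch on a plain EVID vector, assuming the rows are already grouped and ordered (interval_sequence is a hypothetical helper, not part of the package):

```julia
# Running count of dosing/washout events (EVID in [1, 4]);
# rows before the first such event get `missing`.
function interval_sequence(evid)
    n = 0
    seq = Union{Missing, Int}[]
    for e in evid
        if e in (1, 4)
            n += 1
        end
        push!(seq, n == 0 ? missing : n)
    end
    return seq
end

seq = interval_sequence([0, 1, 1, 1, 0])   # predose sample, three doses, one sample
```

For this input the result is missing, 1, 2, 3, 3, which matches subject 1 in the grouped and ordered example below.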
Example
julia> df = DataFrame([
(1, 1, 40),
(2, 1, 90),
(1, 0, 10),
(2, 0, 60),
(1, 1, 20),
(2, 1, 70),
(1, 1, 30),
(2, 1, 80),
(1, 0, 50),
(2, 0, 100)
], [:ID, :EVID, :AFRLT]) # unordered and ungrouped dataset
10×3 DataFrame
Row │ ID EVID AFRLT
│ Int64 Int64 Int64
─────┼─────────────────────
1 │ 1 1 40
2 │ 2 1 90
3 │ 1 0 10
4 │ 2 0 60
5 │ 1 1 20
6 │ 2 1 70
7 │ 1 1 30
8 │ 2 1 80
9 │ 1 0 50
10 │ 2 0 100
julia> dose_interval_sequence(df) # count without groupby
10×4 DataFrame
Row │ ID EVID AFRLT DINTSEQ
│ Int64 Int64 Int64 Int64
─────┼──────────────────────────────
1 │ 1 1 40 1
2 │ 2 1 90 2
3 │ 1 0 10 2
4 │ 2 0 60 2
5 │ 1 1 20 3
6 │ 2 1 70 4
7 │ 1 1 30 5
8 │ 2 1 80 6
9 │ 1 0 50 6
10 │ 2 0 100 6
julia> dose_interval_sequence(df, group = ["ID"])
10×4 DataFrame
Row │ ID EVID AFRLT DINTSEQ
│ Int64 Int64 Int64 Int64
─────┼──────────────────────────────
1 │ 1 1 40 1
2 │ 1 0 10 1
3 │ 1 1 20 2
4 │ 1 1 30 3
5 │ 1 0 50 3
6 │ 2 1 90 1
7 │ 2 0 60 1
8 │ 2 1 70 2
9 │ 2 1 80 3
10 │ 2 0 100 3
julia> dose_interval_sequence(df, group = :ID, order = :AFRLT)
10×4 DataFrame
Row │ ID EVID AFRLT DINTSEQ
│ Int64 Int64 Int64 Int64?
─────┼──────────────────────────────
1 │ 1 0 10 missing
2 │ 1 1 20 1
3 │ 1 1 30 2
4 │ 1 1 40 3
5 │ 1 0 50 3
6 │ 2 0 60 missing
7 │ 2 1 70 1
8 │ 2 1 80 2
9 │ 2 1 90 3
10 │ 2 0 100 3
ADaM.est_glomerular_filtration_rate — Method
est_glomerular_filtration_rate(age::Number, sex::Union{Symbol,AbstractString}; kwargs...)
est_glomerular_filtration_rate(age::Quantity, sex::Union{Symbol,AbstractString}; kwargs...)
est_glomerular_filtration_rate(df::DataFrame; kwargs...)
Calculates eGFR from age, creatinine and sex, which can be provided as Quantity values, scalars, or vectors via a DataFrame. Supports various formulas:
- CKD-EPI Creatinine 2021 Formula: ckd-epi-creat-2021 (default)
- CKD-EPI Cystatin C 2012 Formula: ckd-epi-cyst-2012
- MDRD Formula: mdrd (requires race parameter)
age_unit, creat_unit, and cyst_unit can be passed explicitly as unit values or as a Vector. Units follow DynamicQuantities.jl.
Arguments
- age: Age value.
- sex: Sex/gender (valid values: "M", "MALE", "F", "FEMALE", case-insensitive).
- creat: Serum creatinine value (default: 0).
- cyst: Cystatin C value (default: 0).
- race: Race (default: "U"). Required for MDRD formula.
- age_unit: Age unit (default: "yr").
- creat_unit: Creatinine unit (default: "mg/dL").
- cyst_unit: Cystatin C unit (default: "mg/L").
- formula: eGFR calculation formula (default: "ckd-epi-creat-2021").
Default Columns
The following default column names are used for DataFrame input:
- age = :AGE
- sex = :SEX
- creat = :CREATBL
- cyst = :CYSTBL
- race = :RACE
- age_unit = :AGEU
- creat_unit = :CREATBLU
- cyst_unit = :CYSTBLU
- col = :EGFRBL
Output
eGFR value in mL/min/1.73m² as a scalar or DataFrame column with unit column.
Notes
- Age must be between 1 and 140 years.
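For reference, the default CKD-EPI creatinine 2021 equation can be sketched as follows. This is an illustration of the published equation, not the package's code: the helper name is hypothetical, creatinine is assumed to be in mg/dL and age in years, and no unit conversion or age validation is performed.

```julia
# CKD-EPI creatinine 2021:
# 142 * min(Scr/kappa, 1)^alpha * max(Scr/kappa, 1)^(-1.200) * 0.9938^age,
# multiplied by 1.012 for females; kappa = 0.7 (F) or 0.9 (M),
# alpha = -0.241 (F) or -0.302 (M). Result in mL/min/1.73m².
function egfr_ckd_epi_2021(age_yr, sex, creat_mg_dl)
    female = uppercase(sex) in ("F", "FEMALE")
    kappa = female ? 0.7 : 0.9
    alpha = female ? -0.241 : -0.302
    ratio = creat_mg_dl / kappa
    egfr = 142 * min(ratio, 1)^alpha * max(ratio, 1)^(-1.2) * 0.9938^age_yr
    return female ? 1.012 * egfr : egfr
end

egfr = egfr_ckd_epi_2021(53, "M", 1)   # roughly 90 mL/min/1.73m²
```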
Examples
julia> est_glomerular_filtration_rate(53, "M", creat = 90, creat_unit = "umol/L")
88.08210123133497
julia> est_glomerular_filtration_rate(53, "M", creat = 1, creat_unit = "mg/dL")
89.99656911635697
julia> est_glomerular_filtration_rate(53us"yr", "M", creat = 90us"umol/L", formula="ckd-epi-creat-2021")
88.08210123133497
julia> est_glomerular_filtration_rate(53u"yr", "M", creat = 1us"mg/dL")
89.99656911635697
julia> est_glomerular_filtration_rate(53u"yr", "M", creat = 1us"mg/dL", race="BLACK OR AFRICAN AMERICAN", formula=:mdrd)
94.7354815425664
julia> est_glomerular_filtration_rate(53u"yr", "M", cyst = 0.6us"mg/L", formula="ckd-epi-cyst-2012")
124.1483657437901
julia> df = DataFrame(
AGE = [20, 30, 40, 53],
CREATBL = [60, 70, 80, 90],
SEX = ["M", "M", "F", "F"],
AGEU = :yr,
CREATBLU = "umol/L",
)
4×5 DataFrame
Row │ AGE CREATBL SEX AGEU CREATBLU
│ Int64 Int64 String Symbol String
─────┼──────────────────────────────────────────
1 │ 20 60 M yr umol/L
2 │ 30 70 M yr umol/L
3 │ 40 80 F yr umol/L
4 │ 53 90 F yr umol/L
julia> est_glomerular_filtration_rate(df)
4×7 DataFrame
Row │ AGE CREATBL SEX AGEU CREATBLU EGFRBL EGFRBLU
│ Int64 Int64 String Symbol String Float64 String
─────┼─────────────────────────────────────────────────────────────────────
1 │ 20 60 M yr umol/L 136.546 min⁻¹ mL/1.73m²
2 │ 30 70 M yr umol/L 122.476 min⁻¹ mL/1.73m²
3 │ 40 80 F yr umol/L 82.3362 min⁻¹ mL/1.73m²
4 │ 53 90 F yr umol/L 65.9318 min⁻¹ mL/1.73m²
julia> df = DataFrame(
AGEYRS = [20, 30, 40, 53],
CREAT = [60, 70, 80, 90],
GENDER = ["M", "M", "F", "F"],
AGEUNI = :yr,
RACEC = "BLACK OR AFRICAN AMERICAN",
CREATUNI = "umol/L"
)
4×6 DataFrame
Row │ AGEYRS CREAT GENDER AGEUNI RACEC CREATUNI
│ Int64 Int64 String Symbol String String
─────┼────────────────────────────────────────────────────────────────────
1 │ 20 60 M yr BLACK OR AFRICAN AMERICAN umol/L
2 │ 30 70 M yr BLACK OR AFRICAN AMERICAN umol/L
3 │ 40 80 F yr BLACK OR AFRICAN AMERICAN umol/L
4 │ 53 90 F yr BLACK OR AFRICAN AMERICAN umol/L
julia> est_glomerular_filtration_rate(
df;
age = :AGEYRS,
creat = :CREAT,
sex = :GENDER,
race = :RACEC,
age_unit = :AGEUNI,
creat_unit = :CREATUNI,
col = :EGFR,
formula=:mdrd
)
4×8 DataFrame
Row │ AGEYRS CREAT GENDER AGEUNI RACEC CREATUNI EGFR EGFRU
│ Int64 Int64 String Symbol String String Float64 String
─────┼───────────────────────────────────────────────────────────────────────────────────────────────
1 │ 20 60 M yr BLACK OR AFRICAN AMERICAN umol/L 180.576 min⁻¹ mL/1.73m²
2 │ 30 70 M yr BLACK OR AFRICAN AMERICAN umol/L 139.206 min⁻¹ mL/1.73m²
3 │ 40 80 F yr BLACK OR AFRICAN AMERICAN umol/L 83.5172 min⁻¹ mL/1.73m²
4 │ 53 90 F yr BLACK OR AFRICAN AMERICAN umol/L 68.8551 min⁻¹ mL/1.73m²
julia> df = DataFrame(
AGE = [20, 30, 40, 53],
CYSTBL = [0.6, 0.7, 0.8, 0.9],
SEX = ["M", "M", "F", "F"],
RACE = "ASIAN",
AGEU = :yr,
CYSTBLU = "mg/L",
)
julia> est_glomerular_filtration_rate(df, formula="ckd-epi-cyst-2012")
4×8 DataFrame
Row │ AGE CYSTBL SEX RACE AGEU CYSTBLU EGFRBL EGFRBLU
│ Int64 Float64 String String Symbol String Float64 String
─────┼───────────────────────────────────────────────────────────────────────────
1 │ 20 0.6 M ASIAN yr mg/L 141.704 min⁻¹ mL/1.73m²
2 │ 30 0.7 M ASIAN yr mg/L 126.058 min⁻¹ mL/1.73m²
3 │ 40 0.8 F ASIAN yr mg/L 105.594 min⁻¹ mL/1.73m²
4 │ 53 0.9 F ASIAN yr mg/L 85.72 min⁻¹ mL/1.73m²
ADaM.expand_dose_events — Method
expand_dose_events(df::DataFrame; start_dtm, end_dtm, trt_start_dtm, dose_freq)
- Returns the expanded dataset, with all columns, when an EX (dose) dataset is passed as input.
- Creates a unique row for every dose in a dosing period for a subject.
- Generates NFRLT (nominal time relative to first dose), which varies across rows based on dosing frequency.
- All other column values are duplicated on expansion.
Arguments
- start_dtm: Column name containing the start datetime for the dosing period
- end_dtm: Column name containing the end datetime for the dosing period
- trt_start_dtm: Column name containing the first analyte dose datetime (used as reference for NFRLT calculation)
- dose_freq: Column name containing the dosing frequency
Default Columns
The following default column names are used for DataFrame input:
- start_dtm = :ASTDTM
- end_dtm = :AENDTM
- trt_start_dtm = :FANLDTM
- dose_freq = :EXDOSFRQ
Notes
If the dosing frequency is 'ONCE' or the start date equals the end date, no expansion happens.
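The expansion of a single dosing period can be sketched with the Dates standard library. The helper expand_period and its frequency-to-hours mapping are hypothetical and cover only three frequency codes; the package's own mapping and edge-case handling may differ.

```julia
using Dates

# One dose time per interval from `start` through `stop` (inclusive),
# plus the nominal time in hours since the first dose `first_dose`.
function expand_period(start::DateTime, stop::DateTime, first_dose::DateTime, freq::String)
    # Hypothetical mapping from frequency code to hours between doses.
    interval_hours = Dict("QD" => 24, "BID" => 12, "EVERY WEEK" => 168)
    times = if freq == "ONCE" || start == stop
        [start]                                      # no expansion
    else
        collect(start:Hour(interval_hours[freq]):stop)
    end
    nfrlt = [Dates.value(t - first_dose) / 3_600_000 for t in times]  # ms to hours
    return times, nfrlt
end

times, nfrlt = expand_period(DateTime("2025-02-01"), DateTime("2025-02-28"),
                             DateTime("2025-01-01"), "EVERY WEEK")
```

For the February period with weekly dosing this produces four dose times (1, 8, 15 and 22 February) with nominal times of 744, 912, 1080 and 1248 hours, in line with the EVERY WEEK example below.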
Example
julia> df = DataFrame(USUBJID = "1",
EVID = 1,
ASTDTM = DateTime.(["2025-01-01", "2025-02-01", "2025-03-01"]),
AENDTM = DateTime.(["2025-01-31", "2025-02-28", "2025-03-31"]),
EXDOSFRQ = "QD",
FANLDTM = DateTime.("2025-01-01"))
3×6 DataFrame
Row │ USUBJID EVID ASTDTM AENDTM EXDOSFRQ FANLDTM
│ String Int64 DateTime DateTime String DateTime
─────┼─────────────────────────────────────────────────────────────────────────────────────────
1 │ 1 1 2025-01-01T00:00:00 2025-01-31T00:00:00 QD 2025-01-01T00:00:00
2 │ 1 1 2025-02-01T00:00:00 2025-02-28T00:00:00 QD 2025-01-01T00:00:00
3 │ 1 1 2025-03-01T00:00:00 2025-03-31T00:00:00 QD 2025-01-01T00:00:00
julia> expand_dose_events(df) # QD
90×7 DataFrame
Row │ USUBJID EVID ASTDTM AENDTM EXDOSFRQ FANLDTM NFRLT
│ String Int64 DateTime DateTime String DateTime Float64
─────┼──────────────────────────────────────────────────────────────────────────────────────────────────
1 │ 1 1 2025-01-01T00:00:00 2025-01-31T00:00:00 QD 2025-01-01T00:00:00 0.0
2 │ 1 1 2025-01-02T00:00:00 2025-01-31T00:00:00 QD 2025-01-01T00:00:00 24.0
3 │ 1 1 2025-01-03T00:00:00 2025-01-31T00:00:00 QD 2025-01-01T00:00:00 48.0
4 │ 1 1 2025-01-04T00:00:00 2025-01-31T00:00:00 QD 2025-01-01T00:00:00 72.0
5 │ 1 1 2025-01-05T00:00:00 2025-01-31T00:00:00 QD 2025-01-01T00:00:00 96.0
6 │ 1 1 2025-01-06T00:00:00 2025-01-31T00:00:00 QD 2025-01-01T00:00:00 120.0
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮
86 │ 1 1 2025-03-27T00:00:00 2025-03-31T00:00:00 QD 2025-01-01T00:00:00 2040.0
87 │ 1 1 2025-03-28T00:00:00 2025-03-31T00:00:00 QD 2025-01-01T00:00:00 2064.0
88 │ 1 1 2025-03-29T00:00:00 2025-03-31T00:00:00 QD 2025-01-01T00:00:00 2088.0
89 │ 1 1 2025-03-30T00:00:00 2025-03-31T00:00:00 QD 2025-01-01T00:00:00 2112.0
90 │ 1 1 2025-03-31T00:00:00 2025-03-31T00:00:00 QD 2025-01-01T00:00:00 2136.0
79 rows omitted
julia> df = DataFrame(USUBJID = "1",
EVID = 1,
ASTDTM = DateTime.(["2025-01-01", "2025-02-01", "2025-03-01"]),
AENDTM = DateTime.(["2025-01-31", "2025-02-28", "2025-03-31"]),
EXDOSFRQ = "EVERY WEEK",
FANLDTM = DateTime.("2025-01-01"))
3×6 DataFrame
Row │ USUBJID EVID ASTDTM AENDTM EXDOSFRQ FANLDTM
│ String Int64 DateTime DateTime String DateTime
─────┼───────────────────────────────────────────────────────────────────────────────────────────
1 │ 1 1 2025-01-01T00:00:00 2025-01-31T00:00:00 EVERY WEEK 2025-01-01T00:00:00
2 │ 1 1 2025-02-01T00:00:00 2025-02-28T00:00:00 EVERY WEEK 2025-01-01T00:00:00
3 │ 1 1 2025-03-01T00:00:00 2025-03-31T00:00:00 EVERY WEEK 2025-01-01T00:00:00
julia> expand_dose_events(df) # EVERY WEEK
14×7 DataFrame
Row │ USUBJID EVID ASTDTM AENDTM EXDOSFRQ FANLDTM NFRLT
│ String Int64 DateTime DateTime String DateTime Float64
─────┼────────────────────────────────────────────────────────────────────────────────────────────────────
1 │ 1 1 2025-01-01T00:00:00 2025-01-31T00:00:00 EVERY WEEK 2025-01-01T00:00:00 0.0
2 │ 1 1 2025-01-08T00:00:00 2025-01-31T00:00:00 EVERY WEEK 2025-01-01T00:00:00 168.0
3 │ 1 1 2025-01-15T00:00:00 2025-01-31T00:00:00 EVERY WEEK 2025-01-01T00:00:00 336.0
4 │ 1 1 2025-01-22T00:00:00 2025-01-31T00:00:00 EVERY WEEK 2025-01-01T00:00:00 504.0
5 │ 1 1 2025-01-29T00:00:00 2025-01-31T00:00:00 EVERY WEEK 2025-01-01T00:00:00 672.0
6 │ 1 1 2025-02-01T00:00:00 2025-02-28T00:00:00 EVERY WEEK 2025-01-01T00:00:00 744.0
7 │ 1 1 2025-02-08T00:00:00 2025-02-28T00:00:00 EVERY WEEK 2025-01-01T00:00:00 912.0
8 │ 1 1 2025-02-15T00:00:00 2025-02-28T00:00:00 EVERY WEEK 2025-01-01T00:00:00 1080.0
9 │ 1 1 2025-02-22T00:00:00 2025-02-28T00:00:00 EVERY WEEK 2025-01-01T00:00:00 1248.0
10 │ 1 1 2025-03-01T00:00:00 2025-03-31T00:00:00 EVERY WEEK 2025-01-01T00:00:00 1416.0
11 │ 1 1 2025-03-08T00:00:00 2025-03-31T00:00:00 EVERY WEEK 2025-01-01T00:00:00 1584.0
12 │ 1 1 2025-03-15T00:00:00 2025-03-31T00:00:00 EVERY WEEK 2025-01-01T00:00:00 1752.0
13 │ 1 1 2025-03-22T00:00:00 2025-03-31T00:00:00 EVERY WEEK 2025-01-01T00:00:00 1920.0
14 │ 1 1 2025-03-29T00:00:00 2025-03-31T00:00:00 EVERY WEEK 2025-01-01T00:00:00 2088.0
Example with custom column names using kwargs
julia> df_custom = DataFrame(USUBJID = "1",
EVID = 1,
START_TIME = DateTime.(["2025-01-01", "2025-01-08"]),
END_TIME = DateTime.(["2025-01-07", "2025-01-14"]),
FREQUENCY = ["QD", "BID"],
FIRST_DOSE = DateTime.("2025-01-01"))
2×6 DataFrame
Row │ USUBJID EVID START_TIME END_TIME FREQUENCY FIRST_DOSE
│ String Int64 DateTime DateTime String DateTime
─────┼─────────────────────────────────────────────────────────────────────────────────────────
1 │ 1 1 2025-01-01T00:00:00 2025-01-07T00:00:00 QD 2025-01-01T00:00:00
2 │ 1 1 2025-01-08T00:00:00 2025-01-14T00:00:00 BID 2025-01-01T00:00:00
julia> expand_dose_events(df_custom,
start_dtm = :START_TIME,
end_dtm = :END_TIME,
trt_start_dtm = :FIRST_DOSE,
dose_freq = :FREQUENCY)
20×7 DataFrame
Row │ USUBJID EVID START_TIME END_TIME FREQUENCY FIRST_DOSE NFRLT
│ String Int64 DateTime DateTime String DateTime Float64
─────┼───────────────────────────────────────────────────────────────────────────────────────────────────
1 │ 1 1 2025-01-01T00:00:00 2025-01-07T00:00:00 QD 2025-01-01T00:00:00 0.0
2 │ 1 1 2025-01-01T00:00:00 2025-01-07T00:00:00 QD 2025-01-01T00:00:00 24.0
3 │ 1 1 2025-01-01T00:00:00 2025-01-07T00:00:00 QD 2025-01-01T00:00:00 48.0
4 │ 1 1 2025-01-01T00:00:00 2025-01-07T00:00:00 QD 2025-01-01T00:00:00 72.0
5 │ 1 1 2025-01-01T00:00:00 2025-01-07T00:00:00 QD 2025-01-01T00:00:00 96.0
6 │ 1 1 2025-01-01T00:00:00 2025-01-07T00:00:00 QD 2025-01-01T00:00:00 120.0
7 │ 1 1 2025-01-01T00:00:00 2025-01-07T00:00:00 QD 2025-01-01T00:00:00 144.0
8 │ 1 1 2025-01-08T00:00:00 2025-01-14T00:00:00 BID 2025-01-01T00:00:00 168.0
9 │ 1 1 2025-01-08T00:00:00 2025-01-14T00:00:00 BID 2025-01-01T00:00:00 180.0
10 │ 1 1 2025-01-08T00:00:00 2025-01-14T00:00:00 BID 2025-01-01T00:00:00 192.0
11 │ 1 1 2025-01-08T00:00:00 2025-01-14T00:00:00 BID 2025-01-01T00:00:00 204.0
12 │ 1 1 2025-01-08T00:00:00 2025-01-14T00:00:00 BID 2025-01-01T00:00:00 216.0
13 │ 1 1 2025-01-08T00:00:00 2025-01-14T00:00:00 BID 2025-01-01T00:00:00 228.0
14 │ 1 1 2025-01-08T00:00:00 2025-01-14T00:00:00 BID 2025-01-01T00:00:00 240.0
15 │ 1 1 2025-01-08T00:00:00 2025-01-14T00:00:00 BID 2025-01-01T00:00:00 252.0
16 │ 1 1 2025-01-08T00:00:00 2025-01-14T00:00:00 BID 2025-01-01T00:00:00 264.0
17 │ 1 1 2025-01-08T00:00:00 2025-01-14T00:00:00 BID 2025-01-01T00:00:00 276.0
18 │ 1 1 2025-01-08T00:00:00 2025-01-14T00:00:00 BID 2025-01-01T00:00:00 288.0
19 │ 1 1 2025-01-08T00:00:00 2025-01-14T00:00:00 BID 2025-01-01T00:00:00 300.0
20 │ 1 1 2025-01-08T00:00:00 2025-01-14T00:00:00 BID 2025-01-01T00:00:00 312.0
Example with ONCE dosing (no expansion)
julia> df_once = DataFrame(USUBJID = "1",
EVID = 1,
ASTDTM = DateTime.("2025-01-01"),
AENDTM = DateTime.("2025-01-01"),
EXDOSFRQ = "ONCE",
FANLDTM = DateTime.("2025-01-01"))
1×6 DataFrame
Row │ USUBJID EVID ASTDTM AENDTM EXDOSFRQ FANLDTM
│ String Int64 DateTime DateTime String DateTime
─────┼─────────────────────────────────────────────────────────────────────────────────────────
1 │ 1 1 2025-01-01T00:00:00 2025-01-01T00:00:00 ONCE 2025-01-01T00:00:00
julia> expand_dose_events(df_once)
1×7 DataFrame
Row │ USUBJID EVID ASTDTM AENDTM EXDOSFRQ FANLDTM NFRLT
│ String Int64 DateTime DateTime String DateTime Float64
─────┼──────────────────────────────────────────────────────────────────────────────────────────────────
1 │ 1 1 2025-01-01T00:00:00 2025-01-01T00:00:00 ONCE 2025-01-01T00:00:00 0.0
Example with different dosing frequencies
julia> df_freq = DataFrame(USUBJID = ["1", "1", "1"],
EVID = [1, 1, 1],
ASTDTM = DateTime.(["2025-01-01", "2025-01-01", "2025-01-01"]),
AENDTM = DateTime.(["2025-01-03", "2025-01-03", "2025-01-02"]),
EXDOSFRQ = ["TID", "Q8H", "Q12H"],
FANLDTM = DateTime.(["2025-01-01", "2025-01-01", "2025-01-01"]))
3×6 DataFrame
Row │ USUBJID EVID ASTDTM AENDTM EXDOSFRQ FANLDTM
│ String Int64 DateTime DateTime String DateTime
─────┼─────────────────────────────────────────────────────────────────────────────────────────
1 │ 1 1 2025-01-01T00:00:00 2025-01-03T00:00:00 TID 2025-01-01T00:00:00
2 │ 1 1 2025-01-01T00:00:00 2025-01-03T00:00:00 Q8H 2025-01-01T00:00:00
3 │ 1 1 2025-01-01T00:00:00 2025-01-02T00:00:00 Q12H 2025-01-01T00:00:00
julia> expand_dose_events(df_freq)
17×7 DataFrame
Row │ USUBJID EVID ASTDTM AENDTM EXDOSFRQ FANLDTM NFRLT
│ String Int64 DateTime DateTime String DateTime Float64
─────┼──────────────────────────────────────────────────────────────────────────────────────────────────
1 │ 1 1 2025-01-01T00:00:00 2025-01-03T00:00:00 TID 2025-01-01T00:00:00 0.0
2 │ 1 1 2025-01-01T08:00:00 2025-01-03T00:00:00 TID 2025-01-01T00:00:00 8.0
3 │ 1 1 2025-01-01T16:00:00 2025-01-03T00:00:00 TID 2025-01-01T00:00:00 16.0
4 │ 1 1 2025-01-02T00:00:00 2025-01-03T00:00:00 TID 2025-01-01T00:00:00 24.0
5 │ 1 1 2025-01-02T08:00:00 2025-01-03T00:00:00 TID 2025-01-01T00:00:00 32.0
6 │ 1 1 2025-01-02T16:00:00 2025-01-03T00:00:00 TID 2025-01-01T00:00:00 40.0
7 │ 1 1 2025-01-03T00:00:00 2025-01-03T00:00:00 TID 2025-01-01T00:00:00 48.0
8 │ 1 1 2025-01-01T00:00:00 2025-01-03T00:00:00 Q8H 2025-01-01T00:00:00 0.0
9 │ 1 1 2025-01-01T08:00:00 2025-01-03T00:00:00 Q8H 2025-01-01T00:00:00 8.0
10 │ 1 1 2025-01-01T16:00:00 2025-01-03T00:00:00 Q8H 2025-01-01T00:00:00 16.0
11 │ 1 1 2025-01-02T00:00:00 2025-01-03T00:00:00 Q8H 2025-01-01T00:00:00 24.0
12 │ 1 1 2025-01-02T08:00:00 2025-01-03T00:00:00 Q8H 2025-01-01T00:00:00 32.0
13 │ 1 1 2025-01-02T16:00:00 2025-01-03T00:00:00 Q8H 2025-01-01T00:00:00 40.0
14 │ 1 1 2025-01-03T00:00:00 2025-01-03T00:00:00 Q8H 2025-01-01T00:00:00 48.0
15 │ 1 1 2025-01-01T00:00:00 2025-01-02T00:00:00 Q12H 2025-01-01T00:00:00 0.0
16 │ 1 1 2025-01-01T12:00:00 2025-01-02T00:00:00 Q12H 2025-01-01T00:00:00 12.0
17 │ 1 1 2025-01-02T00:00:00 2025-01-02T00:00:00 Q12H 2025-01-01T00:00:00 24.0
ADaM.interdose_interval — Method
interdose_interval(df::DataFrame)
Returns the inter-dose interval (II, in hours) when a valid dose frequency is passed. Dose frequencies are defined using CDISC SDTM Controlled Terminology. The assumptions made are:
- A day is 24 hours
- A week is 7 days
- A month is 30 days
- A year is 52 weeks
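The assumptions above fix the number of hours in each calendar unit. A minimal sketch of that arithmetic follows; the lookup table `II_HOURS` and the helper `sketch_interval` are hypothetical names used only for illustration, they are not part of ADaM, and only a few of the frequency codes are covered.

```julia
# Hour counts implied by the stated assumptions.
const HOURS_PER_DAY = 24
const HOURS_PER_WEEK = 7 * HOURS_PER_DAY      # a week is 7 days
const HOURS_PER_MONTH = 30 * HOURS_PER_DAY    # a month is 30 days
const HOURS_PER_YEAR = 52 * HOURS_PER_WEEK    # a year is 52 weeks

# Hypothetical lookup table for illustration only (not ADaM's implementation).
const II_HOURS = Dict(
    "QD" => HOURS_PER_DAY,
    "BID" => HOURS_PER_DAY / 2,
    "TID" => HOURS_PER_DAY / 3,
    "QY" => HOURS_PER_YEAR,
)

# Returns the interval in hours, or `missing` for a code not in the table.
sketch_interval(freq::AbstractString) = get(II_HOURS, freq, missing)
```

Under these assumptions a yearly frequency works out to 52 * 7 * 24 hours, which agrees with the `"QY"` value shown in the example.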
Example
julia> interdose_interval("QD")
24
julia> interdose_interval("EVERY AFTERNOON")
24
julia> interdose_interval("Q45MIN")
0.75
julia> interdose_interval("QY")
8736
julia> df = DataFrame(USUBJID = "1", EXDOSFRQ = ["QD", "BID", "QD", "TID"])
4×2 DataFrame
Row │ USUBJID EXDOSFRQ
│ String String
─────┼───────────────────
1 │ 1 QD
2 │ 1 BID
3 │ 1 QD
4 │ 1 TID
julia> @rtransform! df :II = interdose_interval(:EXDOSFRQ)
4×3 DataFrame
Row │ USUBJID EXDOSFRQ II
│ String String Real
─────┼─────────────────────────
1 │ 1 QD 24
2 │ 1 BID 12.0
3 │ 1 QD 24
4 │ 1 TID 8.0
ADaM.join_columns — Method
join_columns(target, reference; on, order, rev_order, keep, filter_ref, filter_join, mode)
Add columns from a reference DataFrame to a target DataFrame with flexible matching and filtering.
Positional Arguments
- target: Main DataFrame to join to
- reference: DataFrame to join columns from
Keyword Arguments
- on: Column(s) to match on (join keys)
- order: Column(s) to sort matches by (optional)
- rev_order: Boolean vector indicating descending sort for each order column (default: all ascending)
- keep: Vector of source => destination pairs for column mapping (default: all reference columns)
- filter_ref: Function to pre-filter reference rows (default: include all)
- filter_join: Function (target_row, ref_row) to filter matches (default: include all)
- mode: Take the :first or :last match after sorting (default: :first)
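The keyword arguments describe a match, filter, sort, and pick sequence. The sketch below illustrates that sequence for a single target row using plain NamedTuples; `sketch_pick_match` is a hypothetical helper written for this illustration and is not how ADaM implements `join_columns`.

```julia
using Dates

# Hypothetical helper: choose one reference row for a single target row.
function sketch_pick_match(target_row, ref_rows; on, order,
                           filter_join = (t, r) -> true, mode = :first)
    # Keep reference rows that share the join key and pass the row-pair filter.
    matches = filter(ref_rows) do r
        getproperty(r, on) == getproperty(target_row, on) && filter_join(target_row, r)
    end
    isempty(matches) && return missing
    # Sort the surviving matches, then take the first or the last one.
    sort!(matches; by = r -> getproperty(r, order))
    return mode == :first ? first(matches) : last(matches)
end

patient = (SUBJID = "001", VISIT_DATE = Date("2023-01-15"))
labs = [
    (SUBJID = "001", LAB_DATE = Date("2023-01-01"), LAB_VALUE = 100),
    (SUBJID = "001", LAB_DATE = Date("2023-01-10"), LAB_VALUE = 95),
    (SUBJID = "001", LAB_DATE = Date("2023-01-18"), LAB_VALUE = 88),
    (SUBJID = "002", LAB_DATE = Date("2023-01-05"), LAB_VALUE = 110),
]

# Most recent lab strictly before the visit date.
prior = sketch_pick_match(patient, labs; on = :SUBJID, order = :LAB_DATE,
                          filter_join = (t, r) -> r.LAB_DATE < t.VISIT_DATE,
                          mode = :last)
```

With `mode = :last` and an ascending sort on the lab date, the pick is the latest lab taken before the visit.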
Examples
julia> patients = DataFrame(SUBJID = ["001", "002", "003"],
VISIT_DATE = [Date("2023-01-15"), Date("2023-01-20"), Date("2023-01-25")])
3×2 DataFrame
Row │ SUBJID VISIT_DATE
│ String Date
─────┼────────────────────
1 │ 001 2023-01-15
2 │ 002 2023-01-20
3 │ 003 2023-01-25
julia> lab_data = DataFrame(SUBJID = ["001", "001", "001", "002", "002", "003"],
LAB_DATE = [Date("2023-01-01"), Date("2023-01-10"), Date("2023-01-18"),
Date("2023-01-05"), Date("2023-01-25"), Date("2023-01-20")],
LAB_VALUE = [100, 95, 88, 110, 105, 92])
6×3 DataFrame
Row │ SUBJID LAB_DATE LAB_VALUE
│ String Date Int64
─────┼───────────────────────────────
1 │ 001 2023-01-01 100
2 │ 001 2023-01-10 95
3 │ 001 2023-01-18 88
4 │ 002 2023-01-05 110
5 │ 002 2023-01-25 105
6 │ 003 2023-01-20 92
julia> join_columns(patients,
lab_data;
on = [:SUBJID],
order = [:LAB_DATE],
keep = [:LAB_VALUE => :PRIOR_LAB, :LAB_DATE => :PRIOR_LAB_DATE],
filter_join = (t, r) -> r.LAB_DATE < t.VISIT_DATE,
mode = :last)
3×4 DataFrame
Row │ SUBJID VISIT_DATE PRIOR_LAB PRIOR_LAB_DATE
│ String Date Int64? Date?
─────┼───────────────────────────────────────────────
1 │ 001 2023-01-15 95 2023-01-10
2 │ 002 2023-01-20 110 2023-01-05
3 │ 003 2023-01-25 92 2023-01-20
ADaM.make_dose_interruption — Method
make_dose_interruption(df::DataFrame)
- Creates the dose interruption columns INTRFL (Interruption Flag) and INTRDUR (Interruption Duration) on the EX (dose) dataset where doses are recorded as intervals.
- If there is a gap of more than a day between the end date AENDT of one dosing interval and the start date ASTDT of the next interval within a subject, the interruption is recorded as additional rows. If there is no interruption, these columns will be missing.
The EX dataset needs to be prepared with the following columns:
- ASTDT (Analysis Start Date)
- AENDT (Analysis End Date)
- FANLDTM (First Analyte dose DateTime)
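The gap rule can be illustrated with the Dates standard library alone. In this sketch, `sketch_interruptions` is a hypothetical helper that assumes the intervals belong to one subject and are already sorted by start date; it only reports the gaps and does not reproduce the full output of `make_dose_interruption`.

```julia
using Dates

# Hypothetical helper: list (gap_start, gap_end, duration_in_days) between
# consecutive dosing intervals of a single subject.
function sketch_interruptions(starts::Vector{Date}, ends::Vector{Date})
    gaps = Tuple{Date, Date, Int}[]
    for i in 1:length(starts) - 1
        gap_start = ends[i] + Day(1)        # first day without dosing
        gap_end = starts[i + 1] - Day(1)    # last day without dosing
        if gap_end >= gap_start             # at least one uncovered day
            push!(gaps, (gap_start, gap_end, Dates.value(gap_end - gap_start) + 1))
        end
    end
    return gaps
end

astdt = Date.(["2025-01-01", "2025-02-01", "2025-03-03"])
aendt = Date.(["2025-01-31", "2025-02-28", "2025-03-31"])
gaps = sketch_interruptions(astdt, aendt)
```

For these dates the only gap runs from 2025-03-01 to 2025-03-02, a duration of 2 days, which corresponds to the interruption row in the example output.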
Example
julia> df = DataFrame(USUBJID = "1", EXDOSFRQ = "QD", EVID = 1, ASTDT = Date.(["2025-01-01", "2025-02-01", "2025-03-03"]), AENDT = Date.(["2025-01-31", "2025-02-28", "2025-03-31"]), FANLDTM = Date.("2025-01-01"))
3×6 DataFrame
Row │ USUBJID EXDOSFRQ EVID ASTDT AENDT FANLDTM
│ String String Int64 Date Date Date
─────┼──────────────────────────────────────────────────────────────
1 │ 1 QD 1 2025-01-01 2025-01-31 2025-01-01
2 │ 1 QD 1 2025-02-01 2025-02-28 2025-01-01
3 │ 1 QD 1 2025-03-03 2025-03-31 2025-01-01
julia> make_dose_interruption(df)
4×10 DataFrame
Row │ USUBJID EXDOSFRQ EVID ASTDT AENDT FANLDTM ASTDY AENDY INTRFL INTRDUR
│ String String Int64 Date Date Date Int64 Int64 String? Int64?
─────┼──────────────────────────────────────────────────────────────────────────────────────────────
1 │ 1 QD 1 2025-01-01 2025-01-31 2025-01-01 1 31 missing missing
2 │ 1 QD 1 2025-02-01 2025-02-28 2025-01-01 32 59 missing missing
3 │ 1 QD 1 2025-03-01 2025-03-02 2025-01-01 60 61 Y 2
4 │ 1 QD 1 2025-03-03 2025-03-31 2025-01-01 62 90 missing missing
ADaM.make_dtm — Method
make_dtm(df::DataFrame, col; prefix, fmt)
- Derives a DateTime column from a DateTime-convertible String/Date column in a DataFrame.
- The DateTime column, named with a custom prefix (kwarg) and the suffix 'DTM', will be of the format yyyy-mm-ddTHH:MM:SS.
- Custom DateTime formats can be passed through the fmt keyword.
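The conversions described above map onto constructors in Julia's Dates standard library. The sketch below shows them for single values; `sketch_to_dtm` is a hypothetical name, and the way ADaM applies the conversion across a column is not shown here.

```julia
using Dates

# Hypothetical helpers, for illustration only.
# A Date is promoted to a DateTime at midnight.
sketch_to_dtm(x::Date) = DateTime(x)
# A String is parsed with an explicit DateFormat.
sketch_to_dtm(x::AbstractString, fmt::DateFormat) = DateTime(x, fmt)

from_date = sketch_to_dtm(Date(2024, 1, 1))
from_text = sketch_to_dtm("2024-May-21T01:10", dateformat"yyyy-u-ddTHH:MM")
```

The `u` code in the format string matches an abbreviated English month name such as `May`.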
Example
julia> df = DataFrame(DTC = ["2024", "2025", "2026"])
3×1 DataFrame
Row │ DTC
│ String
─────┼────────
1 │ 2024
2 │ 2025
3 │ 2026
julia> make_dtm(df, "DTC")
3×2 DataFrame
Row │ DTC ADTM
│ String DateTime
─────┼─────────────────────────────
1 │ 2024 2024-01-01T00:00:00
2 │ 2025 2025-01-01T00:00:00
3 │ 2026 2026-01-01T00:00:00
julia> df = DataFrame(DTC = Date.(["2024", "2025", "2026"]))
3×1 DataFrame
Row │ DTC
│ Date
─────┼────────────
1 │ 2024-01-01
2 │ 2025-01-01
3 │ 2026-01-01
julia> make_dtm(df, "DTC", prefix = "B")
3×2 DataFrame
Row │ DTC BDTM
│ Date DateTime
─────┼─────────────────────────────────
1 │ 2024-01-01 2024-01-01T00:00:00
2 │ 2025-01-01 2025-01-01T00:00:00
3 │ 2026-01-01 2026-01-01T00:00:00
Input col and prefix can be passed as either String or Symbol.
julia> df = DataFrame(DTC = ["2024", "2025", "2026"])
3×1 DataFrame
Row │ DTC
│ String
─────┼────────
1 │ 2024
2 │ 2025
3 │ 2026
julia> make_dtm(df, :DTC, prefix = :AST)
3×2 DataFrame
Row │ DTC ASTDTM
│ String DateTime
─────┼─────────────────────────────
1 │ 2024 2024-01-01T00:00:00
2 │ 2025 2025-01-01T00:00:00
3 │ 2026 2026-01-01T00:00:00
For a different DateTime format (see Valid Julia Date Formats):
julia> df = DataFrame(DTC = ["2024-May-21T01:10", "2024-Jun-22T02:20", "2024-Aug-23T03:30"])
3×1 DataFrame
Row │ DTC
│ String
─────┼───────────────────
1 │ 2024-May-21T01:10
2 │ 2024-Jun-22T02:20
3 │ 2024-Aug-23T03:30
julia> make_dtm(df, "DTC", prefix = "C", fmt = "yyyy-u-ddTHH:MM")
3×2 DataFrame
Row │ DTC CDTM
│ String DateTime
─────┼────────────────────────────────────────
1 │ 2024-May-21T01:10 2024-05-21T01:10:00
2 │ 2024-Jun-22T02:20 2024-06-22T02:20:00
3 │ 2024-Aug-23T03:30 2024-08-23T03:30:00
julia> make_dtm(df, :DTC, prefix = "C", fmt = dateformat"yyyy-u-ddTHH:MM") # dateformat object
3×2 DataFrame
Row │ DTC CDTM
│ String DateTime
─────┼────────────────────────────────────────
1 │ 2024-May-21T01:10 2024-05-21T01:10:00
2 │ 2024-Jun-22T02:20 2024-06-22T02:20:00
3 │ 2024-Aug-23T03:30 2024-08-23T03:30:00
ADaM.make_dtm_to_dt — Method
make_dtm_to_dt(df::DataFrame, col; prefix)
- Derives a Date column from a DateTime column in a DataFrame.
- The Date column with suffix 'DT' will be of the format yyyy-mm-dd.
- Input col and prefix can be passed as either String or Symbol.
Example
julia> df = DataFrame(ADTM = DateTime.(["2024-10-21T01:10", "2024-11-22T02:20", "2024-12-23T03:30"]))
3×1 DataFrame
Row │ ADTM
│ DateTime
─────┼─────────────────────
1 │ 2024-10-21T01:10:00
2 │ 2024-11-22T02:20:00
3 │ 2024-12-23T03:30:00
julia> make_dtm_to_dt(df, "ADTM", prefix = "A")
3×2 DataFrame
Row │ ADTM ADT
│ DateTime Date
─────┼─────────────────────────────────
1 │ 2024-10-21T01:10:00 2024-10-21
2 │ 2024-11-22T02:20:00 2024-11-22
3 │ 2024-12-23T03:30:00 2024-12-23
julia> make_dtm_to_dt(df, "ADTM")
3×2 DataFrame
Row │ ADTM ADT
│ DateTime Date
─────┼─────────────────────────────────
1 │ 2024-10-21T01:10:00 2024-10-21
2 │ 2024-11-22T02:20:00 2024-11-22
3 │ 2024-12-23T03:30:00 2024-12-23
ADaM.make_dtm_to_tm — Method
make_dtm_to_tm(df::DataFrame, col; prefix)
- Derives a Time column from a DateTime column in a DataFrame.
- The Time column with suffix 'TM' will be of the format hh:mm:ss.
- Input col and prefix can be passed as either String or Symbol.
Example
julia> df = DataFrame(ADTM = DateTime.(["2024-10-21T01:10", "2024-11-22T02:20", "2024-12-23T03:30"]))
3×1 DataFrame
Row │ ADTM
│ DateTime
─────┼─────────────────────
1 │ 2024-10-21T01:10:00
2 │ 2024-11-22T02:20:00
3 │ 2024-12-23T03:30:00
julia> make_dtm_to_tm(df, "ADTM", prefix = "A")
3×2 DataFrame
Row │ ADTM ATM
│ DateTime Time
─────┼───────────────────────────────
1 │ 2024-10-21T01:10:00 01:10:00
2 │ 2024-11-22T02:20:00 02:20:00
3 │ 2024-12-23T03:30:00 03:30:00
julia> make_dtm_to_tm(df, "ADTM")
3×2 DataFrame
Row │ ADTM ATM
│ DateTime Time
─────┼───────────────────────────────
1 │ 2024-10-21T01:10:00 01:10:00
2 │ 2024-11-22T02:20:00 02:20:00
3 │ 2024-12-23T03:30:00 03:30:00
ADaM.make_duration — Method
make_duration(; start_dtm, end_dtm, output_unit)
make_duration(df, col; start_dtm, end_dtm, output_unit)
Computes the duration between two dates or datetimes and returns the result in the specified unit. Dates can be passed as scalars, or as vectors via a DataFrame.
Arguments:
- start_dtm: The starting date or datetime.
- end_dtm: The ending date or datetime.
- output_unit: The unit for the output duration (e.g., :h for hours, :min for minutes). Default is :h.
Valid time units are those supported by DynamicQuantities.jl.
Returns:
- The DataFrame with new columns (col) for the duration, its unit, and the stripped value.
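The underlying arithmetic is a difference between two time points converted to the requested unit. The sketch below computes elapsed hours with the Dates standard library only; `sketch_hours` is a hypothetical helper, and it returns a plain number rather than the unit-carrying quantity that `make_duration` returns.

```julia
using Dates

# Hypothetical helper: elapsed hours between two Date or DateTime values.
function sketch_hours(start_dtm, end_dtm)
    # Subtracting DateTimes yields a Millisecond period.
    ms = Dates.value(DateTime(end_dtm) - DateTime(start_dtm))
    return ms / 3_600_000
end

two_hours = sketch_hours(DateTime(2024, 1, 1, 8), DateTime(2024, 1, 1, 10))
one_day = sketch_hours(Date(2024, 1, 1), Date(2024, 1, 2))
```

Dividing the hour count by 24 gives days, so the interval from 2024-01-01 to 2024-03-01 comes out as 60 days because 2024 is a leap year.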
Examples
Scalar Dates
julia> make_duration(start_dtm=DateTime(2024, 1, 1, 8), end_dtm=DateTime(2024, 1, 1, 10), output_unit=:h)
2.0 h
julia> make_duration(start_dtm=Date(2024, 1, 1), end_dtm=Date(2024, 1, 2), output_unit=:h)
24.0 h
Date columns
julia> df = DataFrame(ASTDT = [Date(2024,1,1), Date(2024,2,1)],
AENDT = [Date(2024,3,1), Date(2024,4,1)])
2×2 DataFrame
Row │ ASTDT AENDT
│ Date Date
─────┼────────────────────────
1 │ 2024-01-01 2024-03-01
2 │ 2024-02-01 2024-04-01
julia> make_duration(df, :ADUR; start_dtm=:ASTDT, end_dtm=:AENDT, output_unit=:day)
2×4 DataFrame
Row │ ASTDT AENDT ADUR ADURU
│ Date Date Float64 Symbolic…
─────┼────────────────────────────────────────────
1 │ 2024-01-01 2024-03-01 60.0 day
2 │ 2024-02-01 2024-04-01 60.0 day
DateTime columns
julia> df = DataFrame(ASTDTM = [DateTime(2024,1,1,8), DateTime(2024,1,1,9)],
AENDTM = [DateTime(2024,1,1,10), DateTime(2024,1,1,12)])
2×2 DataFrame
Row │ ASTDTM AENDTM
│ DateTime DateTime
─────┼──────────────────────────────────────────
1 │ 2024-01-01T08:00:00 2024-01-01T10:00:00
2 │ 2024-01-01T09:00:00 2024-01-01T12:00:00
julia> make_duration(df, :ADUR; start_dtm=:ASTDTM, end_dtm=:AENDTM, output_unit=:h)
2×4 DataFrame
Row │ ASTDTM AENDTM ADUR ADURU
│ DateTime DateTime Float64 Symbolic…
─────┼──────────────────────────────────────────────────────────────
1 │ 2024-01-01T08:00:00 2024-01-01T10:00:00 2.0 h
2 │ 2024-01-01T09:00:00 2024-01-01T12:00:00 3.0 h
julia> make_duration(df, :ADUR; start_dtm=:ASTDTM, end_dtm=:AENDTM, output_unit=:day)
2×4 DataFrame
Row │ ASTDTM AENDTM ADUR ADURU
│ DateTime DateTime Float64 Symbolic…
─────┼────────────────────────────────────────────────────────────────
1 │ 2024-01-01T08:00:00 2024-01-01T10:00:00 0.0833333 day
2 │ 2024-01-01T09:00:00 2024-01-01T12:00:00 0.125 day
ADaM.make_nominal_time — Method
make_nominal_time(pc, ex, domain; visitdy_col, atptn_col)
Derives the nominal time column NFRLT for PC and EX SDTM datasets.
Arguments
- pc: PC DataFrame (must contain ATPTN)
- ex: EX DataFrame
- domain: The output DataFrame (PC or EX)
- visitdy_col: Column name for VISITDY (Symbol or String, default = :VISITDY)
- atptn_col: Column name for ATPTN (Symbol or String, default = :ATPTN)
Output
- Returns a DataFrame (PC or EX) with a new NFRLT column representing nominal time in hours.
Notes
- For PC: NFRLT = (VISITDY - 1) * 24 + ATPTN
- For EX: NFRLT = (VISITDY - 1) * 24
- If VISITDY is missing in either dataset, it is joined from the other dataset using common keys (excluding ATPTN for PC).
- Throws an error if required columns are missing or if join keys cannot be determined.
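The two formulas in the notes can be checked by hand. A minimal sketch, using the hypothetical helpers `sketch_nfrlt_pc` and `sketch_nfrlt_ex` on scalar inputs (the join logic for a missing VISITDY is not covered):

```julia
# Hypothetical helpers that restate the two formulas from the notes.
sketch_nfrlt_pc(visitdy, atptn) = (visitdy - 1) * 24 + atptn   # PC: day offset plus time point
sketch_nfrlt_ex(visitdy) = (visitdy - 1) * 24                  # EX: day offset only

pc_hours = sketch_nfrlt_pc.([2, 3], [0, 24])
ex_hours = sketch_nfrlt_ex.([2, 3])
```

For VISITDY values 2 and 3 with ATPTN values 0 and 24, the PC formula gives 24 and 72 hours, and the EX formula gives 24 and 48 hours.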
Examples
# Example PC and EX DataFrames
julia> pc = DataFrame(DOMAIN = ["PC", "PC"], USUBJID = ["01", "01"], VISITDY = [2, 3], ATPTN = [0, 24])
2×4 DataFrame
Row │ DOMAIN USUBJID VISITDY ATPTN
│ String String Int64 Int64
─────┼─────────────────────────────────
1 │ PC 01 2 0
2 │ PC 01 3 24
julia> ex = DataFrame(DOMAIN = ["EX", "EX"], USUBJID = ["01", "01"], VISITDY = [2, 3])
2×3 DataFrame
Row │ DOMAIN USUBJID VISITDY
│ String String Int64
─────┼──────────────────────────
1 │ EX 01 2
2 │ EX 01 3
# Example 1: Standard usage for PC
julia> make_nominal_time(pc, ex, "PC")
2×5 DataFrame
Row │ DOMAIN USUBJID VISITDY ATPTN NFRLT
│ String String Int64 Int64 Int64
─────┼────────────────────────────────────────
1 │ PC 01 2 0 24
2 │ PC 01 3 24 72
# Example 1: Standard usage for EX
julia> make_nominal_time(pc, ex, "EX")
2×4 DataFrame
Row │ DOMAIN USUBJID VISITDY NFRLT
│ String String Int64 Int64
─────┼─────────────────────────────────
1 │ EX 01 2 24
2 │ EX 01 3 48
# Example 2: Custom column names
julia> rename!(pc, :VISITDY => :VISDY, :ATPTN => :TIMEPT)
2×4 DataFrame
Row │ DOMAIN USUBJID VISDY TIMEPT
│ String String Int64 Int64
─────┼────────────────────────────────
1 │ PC 01 2 0
2 │ PC 01 3 24
julia> make_nominal_time(pc, ex, "PC"; visitdy_col=:VISDY, atptn_col=:TIMEPT)
2×5 DataFrame
Row │ DOMAIN USUBJID VISDY TIMEPT NFRLT
│ String String Int64 Int64 Int64
─────┼───────────────────────────────────────
1 │ PC 01 2 0 24
2 │ PC 01 3 24 72
ADaM.make_time_fl — Method
make_time_fl(df::DataFrame, col; group, order)
Creates flag columns with the suffixes FN (Flag Numeric) and FC (Flag Comment), which flag time values based on the following criteria:
- Missing Times
- Negative Times
- Non-ascending Times
The points to be considered for data validation can be referred through this link: https://www.lexjansen.com/phuse-us/2020/as/AS06.pdf#page=3
order and group variables can be used while flagging.
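The three criteria can be sketched for a single ungrouped vector of times. `sketch_time_flags` is a hypothetical helper; the precedence (missing, then negative, then non-ascending) and the comparison against the previous non-missing value with a strict `<` are assumptions inferred from the example outputs, not confirmed details of `make_time_fl`.

```julia
# Hypothetical helper: returns (numeric flags, flag comments) for one vector of times.
function sketch_time_flags(times)
    flags = Int[]
    comments = String[]
    prev = nothing                      # last non-missing time seen so far
    for t in times
        if ismissing(t)
            push!(flags, 1); push!(comments, "Missing Time")
            continue                    # a missing value does not update `prev`
        end
        if t < 0
            push!(flags, 1); push!(comments, "Negative Time")
        elseif prev !== nothing && t < prev
            push!(flags, 1); push!(comments, "Non-ascending Time")
        else
            push!(flags, 0); push!(comments, "")
        end
        prev = t
    end
    return flags, comments
end

flags, comments = sketch_time_flags([0, -1, -2, 3, missing, 5, 4])
```

To mimic the `group` keyword, the same helper could be applied separately to each subject's times.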
Example
julia> df = DataFrame(USUBJID = 1, AFRLT = [0, -1, -2, 3, missing, 5, 4])
7×2 DataFrame
Row │ USUBJID AFRLT
│ Int64 Int64?
─────┼──────────────────
1 │ 1 0
2 │ 1 -1
3 │ 1 -2
4 │ 1 3
5 │ 1 missing
6 │ 1 5
7 │ 1 4
julia> make_time_fl(df, :AFRLT, group = [], order = []) # group variables can be kept empty for dataframe without group.
7×4 DataFrame
Row │ USUBJID AFRLT AFRLTFN AFRLTFC
│ Int64 Int64? Int64 String
─────┼───────────────────────────────────────────────
1 │ 1 0 0
2 │ 1 -1 1 Negative Time
3 │ 1 -2 1 Negative Time
4 │ 1 3 0
5 │ 1 missing 1 Missing Time
6 │ 1 5 0
7 │ 1 4 1 Non-ascending Time
julia> df1 = DataFrame(USUBJID = [1,1,1,2,2,2], NFRLT = [missing, 2, 1, 0, 1, -1])
6×2 DataFrame
Row │ USUBJID NFRLT
│ Int64 Int64?
─────┼──────────────────
1 │ 1 missing
2 │ 1 2
3 │ 1 1
4 │ 2 0
5 │ 2 1
6 │ 2 -1
julia> make_time_fl(df1, "NFRLT", group=["USUBJID"])
6×4 DataFrame
Row │ USUBJID NFRLT NFRLTFN NFRLTFC
│ Int64 Int64? Int64 String
─────┼───────────────────────────────────────────────
1 │ 1 missing 1 Missing Time
2 │ 1 2 0
3 │ 1 1 1 Non-ascending Time
4 │ 2 0 0
5 │ 2 1 0
6 │ 2 -1 1 Negative Time
ADaM.merge_columns — Method
merge_columns(df1, df2; filter_func, keep, kwargs...)
Left-joins new column(s) onto the primary dataset based on columns from a secondary dataset.
Positional Arguments
- df1: Main DataFrame to join to
- df2: DataFrame to leftjoin columns from
Keyword Arguments
- keep: Vector of source => destination pairs for column mapping (default: all reference columns)
- filter_func: Function to filter rows from the secondary dataset (default: include all)
- kwargs...: Keyword arguments of the leftjoin function can be passed, e.g. on, makeunique.
Example
julia> dm = DataFrame(USUBJID = ["01-701-1015", "01-701-1023", "01-701-1028"], SEX = ["M", "F", "F"], COUNTRY = ["USA", "USA", "NOR"])
3×3 DataFrame
Row │ USUBJID SEX COUNTRY
│ String String String
─────┼──────────────────────────────
1 │ 01-701-1015 M USA
2 │ 01-701-1023 F USA
3 │ 01-701-1028 F NOR
julia> vs = DataFrame(
[
("01-701-1015", "WEIGHT", 50, "kg"),
("01-701-1015", "HEIGHT", 150, "cm"),
("01-701-1023", "WEIGHT", 60, "kg"),
("01-701-1023", "HEIGHT", 160, "cm"),
("01-701-1028", "WEIGHT", 70, "kg"),
("01-701-1028", "HEIGHT", 170, "cm"),
],
[:USUBJID, :VSTESTCD, :VSSTRESN, :VSSTRESU],
)
6×4 DataFrame
Row │ USUBJID VSTESTCD VSSTRESN VSSTRESU
│ String String Int64 String
─────┼───────────────────────────────────────────
1 │ 01-701-1015 WEIGHT 50 kg
2 │ 01-701-1015 HEIGHT 150 cm
3 │ 01-701-1023 WEIGHT 60 kg
4 │ 01-701-1023 HEIGHT 160 cm
5 │ 01-701-1028 WEIGHT 70 kg
6 │ 01-701-1028 HEIGHT 170 cm
julia> merge_ht = merge_columns(
dm,
vs,
filter_func = r -> r.VSTESTCD == "HEIGHT",
keep = [:VSSTRESN => :HTBL, :VSSTRESU => :HTBLU],
on = [:USUBJID]
)
3×5 DataFrame
Row │ USUBJID SEX COUNTRY HTBL HTBLU
│ String String String Int64? String?
─────┼───────────────────────────────────────────────
1 │ 01-701-1015 M USA 150 cm
2 │ 01-701-1023 F USA 160 cm
3 │ 01-701-1028 F NOR 170 cm
julia> merge_ht_wt = merge_columns(
merge_ht,
vs,
filter_func = r -> r.VSTESTCD == "WEIGHT",
keep = ["VSSTRESN" => "WTBL", "VSSTRESU" => "WTBLU"],
on = ["USUBJID"]
)
3×7 DataFrame
Row │ USUBJID SEX COUNTRY HTBL HTBLU WTBL WTBLU
│ String String String Int64? String? Int64? String?
─────┼────────────────────────────────────────────────────────────────
1 │ 01-701-1015 M USA 150 cm 50 kg
2 │ 01-701-1023 F USA 160 cm 60 kg
3 │ 01-701-1028 F NOR 170 cm 70 kg
julia> merge_columns(
dm,
vs,
filter_func = r -> r.VSTESTCD == "HEIGHT",
on = [:USUBJID]
) # without specifying keep
3×6 DataFrame
Row │ USUBJID SEX COUNTRY VSTESTCD VSSTRESN VSSTRESU
│ String String String String? Int64? String?
─────┼────────────────────────────────────────────────────────────
1 │ 01-701-1015 M USA HEIGHT 150 cm
2 │ 01-701-1023 F USA HEIGHT 160 cm
3 │ 01-701-1028 F NOR HEIGHT 170 cm
julia> merge_columns(
dm,
vs,
on = [:USUBJID]
) # without specifying keep and filter_func
6×6 DataFrame
Row │ USUBJID SEX COUNTRY VSTESTCD VSSTRESN VSSTRESU
│ String String String String? Int64? String?
─────┼────────────────────────────────────────────────────────────
1 │ 01-701-1015 M USA WEIGHT 50 kg
2 │ 01-701-1015 M USA HEIGHT 150 cm
3 │ 01-701-1023 F USA WEIGHT 60 kg
4 │ 01-701-1023 F USA HEIGHT 160 cm
5 │ 01-701-1028 F NOR WEIGHT 70 kg
6 │ 01-701-1028 F NOR HEIGHT 170 cm
ADaM.merge_covariates — Method
merge_covariates(df1, df2; domain, covariates, filter_func, baseline=true, kwargs...)
- Iteratively merges covariate columns from a secondary DataFrame (df2) into a primary DataFrame (df1) using ADaM.merge_columns for each covariate in the covariates vector.
- This is designed for SDTM-style domains (e.g., LB, VS) where a test code column (e.g., LBTESTCD) identifies the covariate.
Positional Arguments
- df1: Main DataFrame to join to (primary dataset)
- df2: DataFrame to leftjoin columns from (secondary dataset)
Keyword Arguments
- domain: Domain prefix (String or Symbol) used to construct test code and value column names (e.g., LBTESTCD, LBSTRESN, LBSTRESU)
- covariates: Vector of covariate names to merge iteratively
- filter_func: Function to filter rows from the secondary dataset for each covariate.
- baseline: If true, appends "BL"/"BLU" to output column names (default: true)
- kwargs...: Keyword arguments of the leftjoin function can be passed, e.g. on, makeunique.
Examples
julia> dm = DataFrame(USUBJID = ["01-701-1015"], SEX = ["M"], COUNTRY = ["USA"])
1×3 DataFrame
Row │ USUBJID SEX COUNTRY
│ String String String
─────┼──────────────────────────────
1 │ 01-701-1015 M USA
julia> lb = DataFrame(USUBJID = fill("01-701-1015", 16),
LBTESTCD = repeat(["AST", "ALT", "CREAT", "BILI"], inner = 4),
LBSTRESN = [35, 30, 28, 33, 40, 32, 29, 35, 1, 0, 1, 1, 0, 1, 1, 0],
LBSTRESU = vcat(fill("U/L", 8), fill("mg/dL", 8)),
LBBLFL = repeat(["Y", "", "", ""], 4),
)
16×5 DataFrame
Row │ USUBJID LBTESTCD LBSTRESN LBSTRESU LBBLFL
│ String String Int64 String String
─────┼───────────────────────────────────────────────────
1 │ 01-701-1015 AST 35 U/L Y
2 │ 01-701-1015 AST 30 U/L
3 │ 01-701-1015 AST 28 U/L
4 │ 01-701-1015 AST 33 U/L
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮
13 │ 01-701-1015 BILI 0 mg/dL Y
14 │ 01-701-1015 BILI 1 mg/dL
15 │ 01-701-1015 BILI 1 mg/dL
16 │ 01-701-1015 BILI 0 mg/dL
8 rows omitted
julia> merge_covariates(
dm,
lb;
domain = "LB",
covariates = ["AST", "ALT", "CREAT", "BILI"],
filter_func = r -> coalesce(r.LBBLFL, "") == "Y",
on = [:USUBJID],
)
1×11 DataFrame
Row │ USUBJID SEX COUNTRY ASTBL ASTBLU ALTBL ALTBLU CREATBL CREATBLU BILIBL BILIBLU
│ String String String Int64? String? Int64? String? Int64? String? Int64? String?
─────┼────────────────────────────────────────────────────────────────────────────────────────────────────
1 │ 01-701-1015 M USA 35 U/L 40 U/L 1 mg/dL 0 mg/dL
The baseline filter is determined automatically if the appropriate columns are present (LBBLFL == "Y", LBLOBXFL == "Y", VISIT == "Screening").
julia> merge_covariates(
dm,
lb;
domain = "LB",
covariates = ["AST", "ALT", "CREAT", "BILI"],
on = [:USUBJID],
)
1×11 DataFrame
Row │ USUBJID SEX COUNTRY ASTBL ASTBLU ALTBL ALTBLU CREATBL CREATBLU BILIBL BILIBLU
│ String String String Int64? String? Int64? String? Int64? String? Int64? String?
─────┼────────────────────────────────────────────────────────────────────────────────────────────────────
1 │ 01-701-1015 M USA 35 U/L 40 U/L 1 mg/dL 0 mg/dL
ADaM.nci_liver_score — Method
nci_liver_score(bili::Number, ast::Number, uln_bili::Number, uln_ast::Number; kwargs...)
nci_liver_score(bili::Quantity, ast::Number, uln_bili::Quantity, uln_ast::Quantity; kwargs...)
nci_liver_score(df::DataFrame; kwargs...)
Computes the NCI (National Cancer Institute) hepatic impairment score (numeric and categorical) from bilirubin and AST values, which can be provided as Quantity values, scalars, or vectors via a DataFrame. Refer to the NCI Criteria.
Arguments
- bili: Bilirubin value.
- ast: AST value.
- uln_bili: Upper limit of normal for bilirubin.
- uln_ast: Upper limit of normal for AST.
- bili_unit: Bilirubin unit (default: "umol/L").
- ast_unit: Column name for AST unit (default: "U/L").
Default Columns
The following default column names are used for DataFrame input:
- bili = :BILIBL
- ast = :ASTBL
- uln_bili = :ULN_BILIBL
- uln_ast = :ULN_ASTBL
- bili_unit = :BILIBLU
- ast_unit = :ASTBLU
- col = :NCI
Output
- Numeric NCI liver score (1=Normal, 2=Mild, 3=Moderate, 4=Severe, or missing)
- Categorical NCI liver score ("Normal", "Mild", "Moderate", "Severe", or missing)
The result is returned as DataFrame columns or as a 2-element Tuple.
Notes
- The function implements the NCI hepatic impairment criteria:
  - Severe: bili > 3 * uln_bili
  - Moderate: 1.5 * uln_bili < bili <= 3 * uln_bili
  - Mild: (bili > uln_bili && bili <= 1.5 * uln_bili) || (ast > uln_ast)
  - Normal: bili <= uln_bili && ast <= uln_ast
- Returns missing for both outputs if any argument is missing or does not match a category.
- Output columns can be customized via the col keyword argument.
- The units of bili and uln_bili must match; otherwise, an error is thrown.
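The criteria listed in the notes can be written as a short cascade for unitless scalar inputs. `sketch_nci_score` is a hypothetical helper; evaluating the categories from most to least severe is an assumption made here so that each input lands in exactly one category, and unit handling is left out.

```julia
# Hypothetical helper: (numeric, categorical) score for scalar, unitless inputs.
function sketch_nci_score(bili, ast, uln_bili, uln_ast)
    any(ismissing, (bili, ast, uln_bili, uln_ast)) && return (missing, missing)
    if bili > 3 * uln_bili
        return (4, "Severe")
    elseif bili > 1.5 * uln_bili            # and bili <= 3 * uln_bili
        return (3, "Moderate")
    elseif bili > uln_bili || ast > uln_ast # mildly raised bilirubin or raised AST
        return (2, "Mild")
    else                                    # bili <= uln_bili && ast <= uln_ast
        return (1, "Normal")
    end
end

severe = sketch_nci_score(7.0, 20.0, 2.0, 30.0)
mild = sketch_nci_score(2.2, 20.0, 2.0, 15.0)
```

The same four input sets used in the examples can be passed to this helper to compare against the documented results.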
Examples
julia> nci_liver_score(7.0, 20.0, 2.0, 30.0, bili_unit="mg/dL", ast_unit="U/L")
(4, "Severe")
julia> nci_liver_score(4.0, 20.0, 2.0, 30.0)
(3, "Moderate")
julia> nci_liver_score(2.2, 20.0, 2.0, 15.0)
(2, "Mild")
julia> nci_liver_score(1.0, 10.0, 2.0, 15.0)
(1, "Normal")
julia> nci_liver_score(7us"mg/dL", 20, 2us"mg/dL", 30)
(4, "Severe")
julia> nci_liver_score(4us"mg/dL", 20, 2us"mg/dL", 30)
(3, "Moderate")
julia> nci_liver_score(2.2us"mg/dL", 20, 2us"mg/dL", 15)
(2, "Mild")
julia> nci_liver_score(1us"mg/dL", 10, 2us"mg/dL", 15)
(1, "Normal")
julia> df = DataFrame(BILIBL = [7.0, 4.0, 2.2, 1.0, missing],
ASTBL = [20.0, 20.0, 20.0, 10.0, 10.0],
ULN_BILIBL = [2.0, 2.0, 2.0, 2.0, 2.0],
ULN_ASTBL = [30.0, 30.0, 15.0, 15.0, 15.0],
BILIBLU = "mg/dL",
ASTBLU = "U/L")
5×6 DataFrame
Row │ BILIBL ASTBL ULN_BILIBL ULN_ASTBL BILIBLU ASTBLU
│ Float64? Float64 Float64 Float64 String String
─────┼────────────────────────────────────────────────────────────
1 │ 7.0 20.0 2.0 30.0 mg/dL U/L
2 │ 4.0 20.0 2.0 30.0 mg/dL U/L
3 │ 2.2 20.0 2.0 15.0 mg/dL U/L
4 │ 1.0 10.0 2.0 15.0 mg/dL U/L
5 │ missing 10.0 2.0 15.0 mg/dL U/L
julia> nci_liver_score(df)
5×8 DataFrame
Row │ BILIBL ASTBL ULN_BILIBL ULN_ASTBL BILIBLU ASTBLU NCIN NCIC
│ Float64? Float64 Float64 Float64 String String Int64? String?
─────┼───────────────────────────────────────────────────────────────────────────────
1 │ 7.0 20.0 2.0 30.0 mg/dL U/L 4 Severe
2 │ 4.0 20.0 2.0 30.0 mg/dL U/L 3 Moderate
3 │ 2.2 20.0 2.0 15.0 mg/dL U/L 2 Mild
4 │ 1.0 10.0 2.0 15.0 mg/dL U/L 1 Normal
5 │ missing 10.0 2.0 15.0 mg/dL U/L missing missing
julia> nci_liver_score(df, col=:MYNCI)
5×8 DataFrame
Row │ BILIBL ASTBL ULN_BILIBL ULN_ASTBL BILIBLU ASTBLU MYNCIN MYNCIC
│ Float64? Float64 Float64 Float64 String String Int64? String?
─────┼───────────────────────────────────────────────────────────────────────────────
1 │ 7.0 20.0 2.0 30.0 mg/dL U/L 4 Severe
2 │ 4.0 20.0 2.0 30.0 mg/dL U/L 3 Moderate
3 │ 2.2 20.0 2.0 15.0 mg/dL U/L 2 Mild
4 │ 1.0 10.0 2.0 15.0 mg/dL U/L 1 Normal
5 │ missing 10.0 2.0 15.0 mg/dL U/L missing missing
ADaM.round_columns — Method
round_columns(df::DataFrame, digit::Int64)
Rounds values across Float columns to a specific number of digits.
Example
julia> df = DataFrame(Col1 = [1, 1.0023], Col2 = [3.14159, 2], Col3 = [3, 1.7], Col4 = [1, 4])
2×4 DataFrame
Row │ Col1 Col2 Col3 Col4
│ Float64 Float64 Float64 Int64
─────┼──────────────────────────────────
1 │ 1.0 3.14159 3.0 1
2 │ 1.0023 2.0 1.7 4
julia> round_columns(df, 2)
2×4 DataFrame
Row │ Col1 Col2 Col3 Col4
│ Float64 Float64 Float64 Float64
─────┼────────────────────────────────────
1 │ 1.0 3.14 3.0 1.0
2 │ 1.0 2.0 1.7 4.0
ADaM.set_exclusion — Method
set_exclusion(df, excl_func, comment; group, order)
- Helps to set exclusions on DataFrame rows based on the input excl_func and group variables.
- The exclusion comment can be passed as a String or from a DataFrame column (Symbol).
Keyword Arguments
- excl_func: Function condition based on which rows of a dataframe are flagged as excluded.
- group: Grouping variable(s) for exclusion logic.
- order: Ordering variable(s) for sorting within groups.
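The group-wise flagging can be sketched with plain vectors. `sketch_exclusion` is a hypothetical helper that takes one grouping vector and one value vector instead of a DataFrame, and it returns only the numeric flag; the comment column and the `order` keyword are not covered.

```julia
# Hypothetical helper: flag every row of a group when `excl_func` holds for that group.
function sketch_exclusion(ids, values, excl_func)
    flags = zeros(Int, length(ids))
    for id in unique(ids)
        idx = findall(==(id), ids)      # rows belonging to this group
        if excl_func(values[idx])
            flags[idx] .= 1
        end
    end
    return flags
end

ids = [1, 1, 1, 2, 2, 2, 3, 3, 3]
conc = [missing, missing, missing, 20, 40, 60, 30, missing, 70]
all_missing_flags = sketch_exclusion(ids, conc, g -> all(ismissing, g))
```

Only the first subject has every concentration missing, so only its three rows are flagged.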
Example
julia> df = DataFrame(
ID = [1,1,1,2,2,2,3,3,3],
SEQ = [1,2,3,1,2,3,1,2,3],
EVID = [1,1,1,0,0,0,0,1,0],
STAT = ["NA","NA", "NA", missing, missing, missing, missing, "NA", missing],
CONC = [missing,missing,missing,20,40,60,30,missing,70],
)
9×5 DataFrame
Row │ ID SEQ EVID STAT CONC
│ Int64 Int64 Int64 String? Int64?
─────┼───────────────────────────────────────
1 │ 1 1 1 NA missing
2 │ 1 2 1 NA missing
3 │ 1 3 1 NA missing
4 │ 2 1 0 missing 20
5 │ 2 2 0 missing 40
6 │ 2 3 0 missing 60
7 │ 3 1 0 missing 30
8 │ 3 2 1 NA missing
9 │ 3 3 0 missing 70
# Exclusion 1: Subjects with missing conc. data
julia> set_exclusion(df, "All missing conc subs", excl_func = group -> all(ismissing, group.CONC), group = :ID)
9×7 DataFrame
Row │ ID SEQ EVID STAT CONC EXCLFCOM EXCLF
│ Int64 Int64 Int64 String? Int64? String? Int64
─────┼─────────────────────────────────────────────────────────────────────
1 │ 1 1 1 NA missing All missing conc subs 1
2 │ 1 2 1 NA missing All missing conc subs 1
3 │ 1 3 1 NA missing All missing conc subs 1
4 │ 2 1 0 missing 20 missing 0
5 │ 2 2 0 missing 40 missing 0
6 │ 2 3 0 missing 60 missing 0
7 │ 3 1 0 missing 30 missing 0
8 │ 3 2 1 NA missing missing 0
9 │ 3 3 0 missing 70 missing 0
# Exclusion 2: Subjects with no dosing data
julia> set_exclusion(df, "No dose subs", excl_func = group -> all(iszero, group.EVID), group = "ID")
9×7 DataFrame
Row │ ID SEQ EVID STAT CONC EXCLFCOM EXCLF
│ Int64 Int64 Int64 String? Int64? String? Int64
─────┼────────────────────────────────────────────────────────────
1 │ 1 1 1 NA missing missing 0
2 │ 1 2 1 NA missing missing 0
3 │ 1 3 1 NA missing missing 0
4 │ 2 1 0 missing 20 No dose subs 1
5 │ 2 2 0 missing 40 No dose subs 1
6 │ 2 3 0 missing 60 No dose subs 1
7 │ 3 1 0 missing 30 missing 0
8 │ 3 2 1 NA missing missing 0
9 │ 3 3 0 missing 70 missing 0
# Exclusion 3: Subjects with no conc. data
julia> set_exclusion(df, "No conc subs", excl_func = group -> all(isone, group.EVID), group = [:ID])
9×7 DataFrame
Row │ ID SEQ EVID STAT CONC EXCLFCOM EXCLF
│ Int64 Int64 Int64 String? Int64? String? Int64
─────┼────────────────────────────────────────────────────────────
1 │ 1 1 1 NA missing No conc subs 1
2 │ 1 2 1 NA missing No conc subs 1
3 │ 1 3 1 NA missing No conc subs 1
4 │ 2 1 0 missing 20 missing 0
5 │ 2 2 0 missing 40 missing 0
6 │ 2 3 0 missing 60 missing 0
7 │ 3 1 0 missing 30 missing 0
8 │ 3 2 1 NA missing missing 0
9 │ 3 3 0 missing 70 missing 0
julia> set_exclusion(df, :STAT, excl_func = group -> all(ismissing, group.CONC), group = [:ID])
9×7 DataFrame
Row │ ID SEQ EVID STAT CONC EXCLFCOM EXCLF
│ Int64 Int64 Int64 String? Int64? String? Int64
─────┼────────────────────────────────────────────────────────
1 │ 1 1 1 NA missing NA 1
2 │ 1 2 1 NA missing NA 1
3 │ 1 3 1 NA missing NA 1
4 │ 2 1 0 missing 20 missing 0
5 │ 2 2 0 missing 40 missing 0
6 │ 2 3 0 missing 60 missing 0
7 │ 3 1 0 missing 30 missing 0
8 │ 3 2 1 NA missing missing 0
9 │ 3 3 0 missing 70 missing 0
julia> set_exclusion(df, :STAT, excl_func = group -> all(ismissing, group.CONC), group = [:ID, :SEQ])
9×7 DataFrame
Row │ ID SEQ EVID STAT CONC EXCLFCOM EXCLF
│ Int64 Int64 Int64 String? Int64? String? Int64
─────┼────────────────────────────────────────────────────────
1 │ 1 1 1 NA missing NA 1
2 │ 1 2 1 NA missing NA 1
3 │ 1 3 1 NA missing NA 1
4 │ 2 1 0 missing 20 missing 0
5 │ 2 2 0 missing 40 missing 0
6 │ 2 3 0 missing 60 missing 0
7 │ 3 1 0 missing 30 missing 0
8 │ 3 2 1 NA missing NA 1
9 │ 3 3 0 missing 70 missing 0
ADaM.subject_trt_count — Method
subject_trt_count(df::DataFrame)
- Displays the subject count per treatment. Outputs a DataFrame with the columns DRUG, DRUGCD (drug code), and count.
- Valid datasets are: PC, EX, DM, ADPC, ADEX, ADSL
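As a rough illustration of what a per-treatment subject count involves, the sketch below counts distinct subjects per treatment using only Base Julia. The helper name `count_subjects_per_trt` and the toy records are hypothetical and not part of ADaM; the package's actual implementation may differ.

```julia
# Hypothetical helper (not part of ADaM): count distinct subjects per treatment.
function count_subjects_per_trt(subjects::AbstractVector, treatments::AbstractVector)
    length(subjects) == length(treatments) ||
        throw(ArgumentError("subjects and treatments must have the same length"))
    # Map each treatment to the set of subject identifiers seen with it.
    seen = Dict{eltype(treatments), Set{eltype(subjects)}}()
    for (subj, trt) in zip(subjects, treatments)
        push!(get!(() -> Set{eltype(subjects)}(), seen, trt), subj)
    end
    # A subject with several records under one treatment is counted once.
    return Dict(trt => length(ids) for (trt, ids) in seen)
end

# Toy records: one entry per dosing record, so subjects repeat.
usubjid = ["01-701-1015", "01-701-1015", "01-701-1023", "01-701-1028", "01-701-1028"]
extrt   = ["PLACEBO", "PLACEBO", "PLACEBO", "XANOMELINE", "XANOMELINE"]

counts = count_subjects_per_trt(usubjid, extrt)
println(counts)
```

Collecting identifiers into a `Set` per treatment is what keeps repeated records for the same subject from inflating the count.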
Example
julia> ex = PharmaDatasets.dataset("SDTM/CDISCPILOT01/ex")
591×17 DataFrame
Row │ STUDYID DOMAIN USUBJID EXSEQ EXTRT EXDOSE EXDOSU EXDOSFRM EXDOSFRQ EXROUTE VISITNUM V ⋯
│ String15 String3 String15 Float64 String15 Float64 String3 String7 String3 String15 Float64 S ⋯
─────┼──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
1 │ CDISCPILOT01 EX 01-701-1015 1.0 PLACEBO 0.0 mg PATCH QD TRANSDERMAL 3.0 B ⋯
2 │ CDISCPILOT01 EX 01-701-1015 2.0 PLACEBO 0.0 mg PATCH QD TRANSDERMAL 4.0 W
3 │ CDISCPILOT01 EX 01-701-1015 3.0 PLACEBO 0.0 mg PATCH QD TRANSDERMAL 12.0 W
4 │ CDISCPILOT01 EX 01-701-1023 1.0 PLACEBO 0.0 mg PATCH QD TRANSDERMAL 3.0 B
5 │ CDISCPILOT01 EX 01-701-1023 2.0 PLACEBO 0.0 mg PATCH QD TRANSDERMAL 4.0 W ⋯
6 │ CDISCPILOT01 EX 01-701-1028 1.0 XANOMELINE 54.0 mg PATCH QD TRANSDERMAL 3.0 B
7 │ CDISCPILOT01 EX 01-701-1028 2.0 XANOMELINE 81.0 mg PATCH QD TRANSDERMAL 4.0 W
8 │ CDISCPILOT01 EX 01-701-1028 3.0 XANOMELINE 54.0 mg PATCH QD TRANSDERMAL 12.0 W
⋮ │ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋮ ⋱
585 │ CDISCPILOT01 EX 01-718-1355 1.0 PLACEBO 0.0 mg PATCH QD TRANSDERMAL 3.0 B ⋯
586 │ CDISCPILOT01 EX 01-718-1355 2.0 PLACEBO 0.0 mg PATCH QD TRANSDERMAL 4.0 W
587 │ CDISCPILOT01 EX 01-718-1355 3.0 PLACEBO 0.0 mg PATCH QD TRANSDERMAL 12.0 W
588 │ CDISCPILOT01 EX 01-718-1371 1.0 XANOMELINE 54.0 mg PATCH QD TRANSDERMAL 3.0 B
589 │ CDISCPILOT01 EX 01-718-1371 2.0 XANOMELINE 81.0 mg PATCH QD TRANSDERMAL 4.0 W ⋯
590 │ CDISCPILOT01 EX 01-718-1427 1.0 XANOMELINE 54.0 mg PATCH QD TRANSDERMAL 3.0 B
591 │ CDISCPILOT01 EX 01-718-1427 2.0 XANOMELINE 81.0 mg PATCH QD TRANSDERMAL 4.0 W
6 columns and 576 rows omitted
julia> subject_trt_count(ex)
2×3 DataFrame
Row │ DRUG DRUGCD count
│ String String Int64
─────┼───────────────────────────
1 │ PLACEBO NA 86
2 │ XANOMELINE NA 168
ADaM.validate_column_structure — Method
validate_column_structure(df::DataFrame; ignore, silence_errors)
Validate a DataFrame's column structure according to ADaM data conventions (https://sastricks.com/cdisc/ADaMIG_v1.3.pdf#page=14).
Returns a DataFrame with all validation checks. Each row represents a validation check with columns:
- check: Description of the validation check
- tag: Check tag identifier (Symbol)
- column: The column name (if applicable)
- status: Check result (pass, fail, or missing for ignored checks)
- level: Severity level: error for failures, ignored for ignored checks
- rows: Affected row numbers (if applicable)
- details: Additional details about the finding
Arguments
- df: The DataFrame to validate
- ignore: Check tags to ignore. Can be provided as Symbols or Strings. Available tags (validation checks):
  - name_start - Column name must start with a letter
  - name_format - Column name can only contain letters, underscores, and numerals
  - name_length - Column name length must not exceed 8 characters
  - char_length - Character column value length must not exceed 200 characters
  - label_required - Column labels must be present
  - label_length - Column label length must not exceed 40 characters
- silence_errors: If true, errors are not thrown even if validation fails; instead, the findings DataFrame is returned with all errors and passes. Default is false, which throws an error if any validation fails.
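The three name-related checks listed above can be sketched in Base Julia as follows. This is only an illustration of the rules, not the package's implementation: the helpers `failed_name_checks` and `normalize_tags` are hypothetical, and the sketch assumes that "letters" means ASCII letters.

```julia
# Hypothetical helper (not part of ADaM): apply the three naming rules to one
# column name and return the tags of the rules it violates.
function failed_name_checks(name::AbstractString)
    failed = Symbol[]
    # name_start: first character must be a letter (ASCII assumed here)
    occursin(r"^[A-Za-z]", name) || push!(failed, :name_start)
    # name_format: only letters, numerals, and underscores are allowed
    occursin(r"^[A-Za-z0-9_]*$", name) || push!(failed, :name_format)
    # name_length: at most 8 characters
    length(name) <= 8 || push!(failed, :name_length)
    return failed
end

# Tags may arrive as Symbols or Strings; convert them all to Symbols.
normalize_tags(tags) = Symbol.(tags)

ignored = normalize_tags(["name_length", :name_start])
for col in ["USUBJID", "1DOSE", "DOSE-MG", "TREATMENT_ARM"]
    # Drop any violated rule whose tag the caller asked to ignore.
    failed = setdiff(failed_name_checks(col), ignored)
    println(col, " => ", failed)
end
```

Converting the ignore list to Symbols once up front lets the rest of the logic compare tags without caring how the caller spelled them.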
Returns
A DataFrame with all validation checks and their pass/fail status. Unless silence_errors is true, an error is thrown when any check fails.
Examples
# Basic validation
julia> df = DataFrame(ID = [1, 2], USUBJID = ["S001", "S002"], DV = [10.5, 20.3])
2×3 DataFrame
Row │ ID USUBJID DV
│ Int64 String Float64
─────┼─────────────────────────
1 │ 1 S001 10.5
2 │ 2 S002 20.3
julia> validate_column_structure(df)
ERROR: Column structure validation failed with 3 error(s):
1. "ID": Column label is missing (label_required)
2. "USUBJID": Column label is missing (label_required)
3. "DV": Column label is missing (label_required)
julia> validate_column_structure(df, silence_errors=true)
7×7 DataFrame
Row │ check tag column status level rows details
│ String Symbol? String? String? String? Array…? String?
─────┼──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
1 │ Column label is missing label_required ID fail error missing All columns must have a label
2 │ Column label is missing label_required USUBJID fail error missing All columns must have a label
3 │ Column label is missing label_required DV fail error missing All columns must have a label
4 │ All column names start with a le… name_start missing pass missing missing Checked: 3/3 column(s)
5 │ All column names are valid format name_format missing pass missing missing Checked: 3/3 column(s)
6 │ All column name lengths are valid name_length missing pass missing missing Checked: 3/3 column(s)
7 │ All character column lengths are… char_length missing pass missing missing Checked: 1/3 column(s)
julia> validate_column_structure(df, ignore = [:label_required])
5×7 DataFrame
Row │ check tag column status level rows details
│ String Symbol? String? String? String? Array…? String?
─────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────
1 │ Column labels must be present label_required missing missing ignored missing Check ignored by user
2 │ All column names start with a le… name_start missing pass missing missing Checked: 3/3 column(s)
3 │ All column names are valid format name_format missing pass missing missing Checked: 3/3 column(s)
4 │ All column name lengths are valid name_length missing pass missing missing Checked: 3/3 column(s)
5 │ All character column lengths are… char_length missing pass missing missing Checked: 1/3 column(s)