ADaM Docstrings

ADaM.JoinColumnsKeywordError - Type
JoinColumnsKeywordError(keyword::Symbol, err::Exception)

Custom error type for join_columns keyword argument failures.
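
As a rough illustration of how a wrapper error of this shape can behave, the sketch below defines a hypothetical KeywordArgError (not part of ADaM) that stores the offending keyword together with the underlying exception and customises its message through Base.showerror. The field names and message format are assumptions, not ADaM's implementation.

```julia
# Hypothetical stand-in for a keyword-wrapping error type; not ADaM's actual code.
struct KeywordArgError <: Exception
    keyword::Symbol
    err::Exception
end

# Prefix the wrapped error's own message with the keyword that failed.
function Base.showerror(io::IO, e::KeywordArgError)
    print(io, "KeywordArgError: keyword `", e.keyword, "` failed: ")
    showerror(io, e.err)
end

# Render the message as a string without throwing.
message = sprint(showerror, KeywordArgError(:on, ArgumentError("column not found")))
```

Keeping the original exception in a field means callers can still inspect or rethrow it after catching the wrapper.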

ADaM.basic_info_pc - Method
basic_info_pc(df::DataFrame)

Returns basic information about the PC (pharmacokinetic concentration) dataset as a dictionary containing:

  • Studies involved
  • Number of subjects (overall)
  • Treatments
  • Sample specimens

Example

julia> pc = PharmaDatasets.dataset("SDTM/CDISCPILOT01/pc")
3556×20 DataFrame
  Row │ STUDYID       DOMAIN   USUBJID      PCSEQ    PCTESTCD  PCTEST      PCORRES            PCORRESU  PCSTRESC           PCS ⋯
      │ String15      String3  String15     Float64  String3   String15    String31           String7   String31           Flo ⋯
──────┼─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
    1 │ CDISCPILOT01  PC       01-701-1015      1.0  XAN       XANOMELINE  <BLQ               ug/ml     <BLQ                   ⋯
    2 │ CDISCPILOT01  PC       01-701-1015      2.0  XAN       XANOMELINE  <BLQ               ug/ml     <BLQ               mis
    3 │ CDISCPILOT01  PC       01-701-1015      3.0  XAN       XANOMELINE  <BLQ               ug/ml     <BLQ               mis
    4 │ CDISCPILOT01  PC       01-701-1015      4.0  XAN       XANOMELINE  <BLQ               ug/ml     <BLQ               mis
    5 │ CDISCPILOT01  PC       01-701-1015      5.0  XAN       XANOMELINE  <BLQ               ug/ml     <BLQ               mis ⋯
    6 │ CDISCPILOT01  PC       01-701-1015      6.0  XAN       XANOMELINE  <BLQ               ug/ml     <BLQ               mis
    7 │ CDISCPILOT01  PC       01-701-1015      7.0  XAN       XANOMELINE  <BLQ               ug/ml     <BLQ               mis
    8 │ CDISCPILOT01  PC       01-701-1015      8.0  XAN       XANOMELINE  <BLQ               ug/ml     <BLQ               mis
  ⋮   │      ⋮           ⋮          ⋮          ⋮        ⋮          ⋮               ⋮             ⋮              ⋮              ⋱
 3550 │ CDISCPILOT01  PC       01-718-1427      8.0  XAN       XANOMELINE  1.87286298246525   ug/ml     1.87286298246525       ⋯
 3551 │ CDISCPILOT01  PC       01-718-1427      9.0  XAN       XANOMELINE  1.8956805216499    ug/ml     1.8956805216499
 3552 │ CDISCPILOT01  PC       01-718-1427     10.0  XAN       XANOMELINE  0.575294228033741  ug/ml     0.575294228033741
 3553 │ CDISCPILOT01  PC       01-718-1427     11.0  XAN       XANOMELINE  0.173882563295603  ug/ml     0.173882563295603
 3554 │ CDISCPILOT01  PC       01-718-1427     12.0  XAN       XANOMELINE  0.015885031037154  ug/ml     0.015885031037154      ⋯
 3555 │ CDISCPILOT01  PC       01-718-1427     13.0  XAN       XANOMELINE  <BLQ               ug/ml     <BLQ               mis
 3556 │ CDISCPILOT01  PC       01-718-1427     14.0  XAN       XANOMELINE  <BLQ               ug/ml     <BLQ               mis
                                                                                                11 columns and 3541 rows omitted

julia> basic_info_pc(pc)
Dict{String, Any} with 4 entries:
  "studies"    => String15["CDISCPILOT01"]
  "subjects"   => 254
  "treatments" => String15["XANOMELINE"]
  "specimens"  => String7["PLASMA"]

ADaM.bmi_summary - Method
bmi_summary(df::DataFrame; id, bmi)

Counts the subjects (identified by id) in each BMI category. Category names follow the National Library of Medicine:

  • Underweight (Below 18.5)
  • Normal (18.5 to 24.9)
  • Overweight (25.0 to 29.9)
  • Obese (30.0 to 39.9)
  • Extreme (Over 40)

id defaults to :USUBJID and bmi defaults to :BMIBL.
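
The category boundaries listed above can be sketched in plain Julia as follows. The helper names bmi_category and count_categories are hypothetical, and the handling of values that fall exactly on a boundary (for example 25.0 or 40.0) is an assumption rather than ADaM's documented behaviour.

```julia
# Hypothetical helper: map a BMI value to one of the five categories.
# Boundary handling (e.g. exactly 40.0 counted as "Extreme") is an assumption.
function bmi_category(bmi::Real)
    bmi < 18.5 && return "Underweight"
    bmi < 25.0 && return "Normal"
    bmi < 30.0 && return "Overweight"
    bmi < 40.0 && return "Obese"
    return "Extreme"
end

# Count how many values fall into each category.
function count_categories(values)
    counts = Dict{String,Int}()
    for v in values
        category = bmi_category(v)
        counts[category] = get(counts, category, 0) + 1
    end
    return counts
end

counts = count_categories([15, 42, 31, 25, 21, 46, 18, 19])
```

With the BMIBL values used in the example that follows, this gives two subjects each in Underweight, Normal and Extreme, and one each in Overweight and Obese.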

Example

julia> df = DataFrame(USUBJID = [1,2,3,4,5,6,7,8], BMIBL = [15,42,31,25,21,46,18,19])

julia> bmi_summary(df)
5×2 DataFrame
 Row │ BMIC         count 
     │ String       Int64 
─────┼────────────────────
   1 │ Underweight      2
   2 │ Extreme          2
   3 │ Obese            1
   4 │ Overweight       1
   5 │ Normal           2

julia> df = DataFrame(ID = [1,2,3,4,5,6,7,8], BMI = [15,42,31,25,21,46,18,19])

julia> bmi_summary(df, id="ID", bmi = "BMI")
5×2 DataFrame
 Row │ BMIC         count 
     │ String       Int64 
─────┼────────────────────
   1 │ Underweight      2
   2 │ Extreme          2
   3 │ Obese            1
   4 │ Overweight       1
   5 │ Normal           2

ADaM.body_mass_index - Method
body_mass_index(weight::Number, height::Number; kwargs...)
body_mass_index(weight::Quantity, height::Quantity)
body_mass_index(df::DataFrame; kwargs...)

Calculates BMI from weight and height, which can be provided as Quantities, scalars, or vectors via a DataFrame. Reference: BMI (Wikipedia).

The weight_unit and height_unit can be passed explicitly as unit values or as a Vector. Units follow DynamicQuantities.jl.

Arguments

  • weight: Weight value.
  • height: Height value.
  • weight_unit: Weight unit (default: "kg").
  • height_unit: Height unit (default: "cm").

Default Columns

The following default column names are used for DataFrame input:

  • weight = :WTBL
  • height = :HTBL
  • weight_unit = :WTBLU
  • height_unit = :HTBLU
  • col = :BMIBL

Output

BMI value in kg/m² as a scalar or DataFrame column with unit column.
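
The underlying arithmetic is weight in kilograms divided by the square of height in metres. A minimal sketch with the default units (kg and cm), using a hypothetical helper bmi_sketch that is not part of ADaM and carries no unit information:

```julia
# Hypothetical helper: BMI in kg/m² from weight in kg and height in cm.
bmi_sketch(weight_kg::Real, height_cm::Real) = weight_kg / (height_cm / 100)^2

bmi_value = bmi_sketch(60, 160)  # approximately 23.44
```

Other input units only change the conversion step: a weight in grams would be divided by 1000 first, and a height already in metres would skip the division by 100.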

Examples

julia> bmi = body_mass_index(60, 160)
23.437499999999996 m⁻² kg

julia> value, unit = ustrip(bmi), dimension(bmi)
(23.437499999999996, m⁻² kg)

julia> body_mass_index(60, 1.6, height_unit = :m)
23.437499999999996 m⁻² kg

julia> body_mass_index(60000, 160, weight_unit = :g)
23.437499999999996 m⁻² kg

julia> body_mass_index(60u"kg", 160u"cm")
23.437499999999996 m⁻² kg

julia> df = DataFrame(
        HTBL = [150, 160, 170, 180],
        WTBL = [50, 60, 70, 80],
        HTBLU = "cm",
        WTBLU = "kg",
    )
4×4 DataFrame
 Row │ HTBL   WTBL   HTBLU   WTBLU  
     │ Int64  Int64  String  String 
─────┼──────────────────────────────
   1 │   150     50  cm      kg
   2 │   160     60  cm      kg
   3 │   170     70  cm      kg
   4 │   180     80  cm      kg

julia> body_mass_index(df)
4×6 DataFrame
 Row │ HTBL   WTBL   HTBLU   WTBLU   BMIBL    BMIBLU    
     │ Int64  Int64  String  String  Float64  Symbolic… 
─────┼──────────────────────────────────────────────────
   1 │   150     50  cm      kg      22.2222  m⁻² kg
   2 │   160     60  cm      kg      23.4375  m⁻² kg
   3 │   170     70  cm      kg      24.2215  m⁻² kg
   4 │   180     80  cm      kg      24.6914  m⁻² kg

julia> df = DataFrame(HT = [150, 160, 170, 180], WT = [50, 60, 70, 80], HTU = "cm", WTU = "kg")
4×4 DataFrame
 Row │ HT     WT     HTU     WTU    
     │ Int64  Int64  String  String 
─────┼──────────────────────────────
   1 │   150     50  cm      kg
   2 │   160     60  cm      kg
   3 │   170     70  cm      kg
   4 │   180     80  cm      kg

julia> body_mass_index(df, height = :HT, weight = :WT, height_unit = :HTU, weight_unit = :WTU, col = :BMI)
4×6 DataFrame
 Row │ HT     WT     HTU     WTU     BMI      BMIU      
     │ Int64  Int64  String  String  Float64  Symbolic… 
─────┼──────────────────────────────────────────────────
   1 │   150     50  cm      kg      22.2222  m⁻² kg
   2 │   160     60  cm      kg      23.4375  m⁻² kg
   3 │   170     70  cm      kg      24.2215  m⁻² kg
   4 │   180     80  cm      kg      24.6914  m⁻² kg

ADaM.body_surface_area - Method
body_surface_area(height::Number, weight::Number; kwargs...)
body_surface_area(height::Quantity, weight::Quantity; kwargs...)
body_surface_area(df::DataFrame; kwargs...)

Calculates BSA from height and weight, which can be provided as Quantities, scalars, or vectors via a DataFrame. Reference: BSA (Wikipedia).

BSA can be calculated using the following formulas:

  • mosteller (default)
  • dubois-dubois
  • haycock
  • gehan-george
  • boyd
  • fujimoto
  • takahira

The weight_unit and height_unit can be passed explicitly as unit values or as a Vector. Units follow DynamicQuantities.jl.

Arguments

  • height: Height value.
  • weight: Weight value.
  • height_unit: Height unit (default: "cm").
  • weight_unit: Weight unit (default: "kg").
  • formula: BSA calculation formula (default: :mosteller).

Default Columns

The following default column names are used for DataFrame input:

  • height = :HTBL
  • weight = :WTBL
  • height_unit = :HTBLU
  • weight_unit = :WTBLU
  • col = :BSABL

Output

BSA value in m² as a scalar or DataFrame column with unit column.
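
For orientation, two of the listed formulas are sketched below in their commonly published forms, with height in cm and weight in kg. The helper names are hypothetical and the coefficients are the textbook ones; ADaM's own implementation may use an equivalent variant with slightly different rounding.

```julia
# Mosteller: sqrt(height[cm] * weight[kg] / 3600), result in m².
bsa_mosteller(height_cm::Real, weight_kg::Real) = sqrt(height_cm * weight_kg / 3600)

# Du Bois and Du Bois: 0.007184 * weight[kg]^0.425 * height[cm]^0.725, result in m².
bsa_dubois(height_cm::Real, weight_kg::Real) =
    0.007184 * weight_kg^0.425 * height_cm^0.725

mosteller = bsa_mosteller(160, 60)  # approximately 1.633
dubois = bsa_dubois(160, 60)        # approximately 1.622
```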

Examples

julia> bsa = body_surface_area(160, 60) # default height(cm), weight(kg), formula(mosteller)
1.632993161855452 m²

julia> value, unit = ustrip(bsa), dimension(bsa)
(1.632993161855452, m²)

julia> bsa = body_surface_area(160, 60, formula="dubois-dubois")
1.6220414635466536 m²

julia> bsa = body_surface_area(160, 60, formula=:takahira)
1.6349324596500971 m²

julia> body_surface_area(1.6, 60, height_unit = :m)
1.632993161855452 m²

julia> body_surface_area(160, 60000, weight_unit = :g)
1.632993161855452 m²

julia> body_surface_area(160u"cm", 60u"kg")
1.632993161855452 m²

julia> df = DataFrame(
        HTBL = [150, 160, 170, 180],
        WTBL = [50, 60, 70, 80],
        HTBLU = "cm",
        WTBLU = "kg",
    )
4×4 DataFrame
 Row │ HTBL   WTBL   HTBLU   WTBLU  
     │ Int64  Int64  String  String 
─────┼──────────────────────────────
   1 │   150     50  cm      kg
   2 │   160     60  cm      kg
   3 │   170     70  cm      kg
   4 │   180     80  cm      kg

julia> body_surface_area(df)
4×6 DataFrame
 Row │ HTBL   WTBL   HTBLU   WTBLU   BSABL    BSABLU    
     │ Int64  Int64  String  String  Float64  Symbolic… 
─────┼──────────────────────────────────────────────────
   1 │   150     50  cm      kg      1.44338  m²
   2 │   160     60  cm      kg      1.63299  m²
   3 │   170     70  cm      kg      1.81812  m²
   4 │   180     80  cm      kg      2.0      m²

julia> df = DataFrame(HT = [150, 160, 170, 180], WT = [50, 60, 70, 80], HTU = "cm", WTU = "kg")
4×4 DataFrame
 Row │ HT     WT     HTU     WTU    
     │ Int64  Int64  String  String 
─────┼──────────────────────────────
   1 │   150     50  cm      kg
   2 │   160     60  cm      kg
   3 │   170     70  cm      kg
   4 │   180     80  cm      kg

julia> body_surface_area(df, height = :HT, weight = :WT, height_unit = :HTU, weight_unit = :WTU, col = :BSA)
4×6 DataFrame
 Row │ HT     WT     HTU     WTU     BSA      BSAU      
     │ Int64  Int64  String  String  Float64  Symbolic… 
─────┼──────────────────────────────────────────────────
   1 │   150     50  cm      kg      1.44338  m²
   2 │   160     60  cm      kg      1.63299  m²
   3 │   170     70  cm      kg      1.81812  m²
   4 │   180     80  cm      kg      2.0      m²

ADaM.compress_dose_events - Method
compress_dose_events(df::DataFrame; group, order, sampling_rows)

Compresses each sequence of consecutive dosing rows (EVID == 1) into a single row, creating the ADDL (additional doses) and II (inter-dose interval) columns.

group and order variables (Vectors or scalars) can be passed to customise the compression.

Compression can retain either one dosing row between sampling rows (sampling_rows = :single) or two (sampling_rows = :double). Only the information from the first row of each sequence is retained.

Required columns for compression: EVID
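
The core idea, collapsing each run of consecutive dosing rows to its first row and recording how many doses were folded into it, can be sketched on a bare EVID vector. The helper compress_runs is hypothetical; it ignores grouping, ordering, the II column and the sampling_rows option, and is not ADaM's implementation.

```julia
# Collapse each run of consecutive dosing rows (EVID == 1) to its first row.
# Returns the indices of the rows kept and the ADDL value for each kept row.
function compress_runs(evid::AbstractVector{<:Integer})
    kept = Int[]
    addl = Int[]
    i = 1
    while i <= length(evid)
        if evid[i] == 1
            # Find the last row of the current run of doses.
            j = i
            while j < length(evid) && evid[j + 1] == 1
                j += 1
            end
            push!(kept, i)
            push!(addl, j - i)  # number of doses folded into the first row
            i = j + 1
        else
            # Non-dosing rows are kept unchanged.
            push!(kept, i)
            push!(addl, 0)
            i += 1
        end
    end
    return kept, addl
end

kept, addl = compress_runs([1, 1, 0, 0, 1, 1, 1, 1, 0, 0])
```

For the EVID column of the ungrouped example that follows, this keeps rows 1, 3, 4, 5, 9 and 10 with ADDL values 1, 0, 0, 3, 0 and 0.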

Example

julia> df = DataFrame([
    (1, 1, 40),
    (2, 1, 90),
    (1, 0, 10),
    (2, 0, 60),
    (1, 1, 20),
    (2, 1, 70),
    (1, 1, 30),
    (2, 1, 80),
    (1, 0, 50),
    (2, 0, 100)
], [:ID, :EVID, :AFRLT]) # unordered and ungrouped dataset
10×3 DataFrame
 Row │ ID     EVID   AFRLT 
     │ Int64  Int64  Int64 
─────┼─────────────────────
   1 │     1      1     40
   2 │     2      1     90
   3 │     1      0     10
   4 │     2      0     60
   5 │     1      1     20
   6 │     2      1     70
   7 │     1      1     30
   8 │     2      1     80
   9 │     1      0     50
  10 │     2      0    100

julia> compress_dose_events(df) # compress without groupby
6×4 DataFrame
 Row │ ID     EVID   AFRLT  ADDL  
     │ Int64  Int64  Int64  Int64 
─────┼────────────────────────────
   1 │     1      1     40      1
   2 │     1      0     10      0
   3 │     2      0     60      0
   4 │     1      1     20      3
   5 │     1      0     50      0
   6 │     2      0    100      0

julia> compress_dose_events(df, group = ["ID"])
8×4 DataFrame
 Row │ ID     EVID   AFRLT  ADDL  
     │ Int64  Int64  Int64  Int64 
─────┼────────────────────────────
   1 │     1      1     40      0
   2 │     1      0     10      0
   3 │     1      1     20      1
   4 │     1      0     50      0
   5 │     2      1     90      0
   6 │     2      0     60      0
   7 │     2      1     70      1
   8 │     2      0    100      0

julia> compress_dose_events(df, group = [:ID], order = [:AFRLT])
6×4 DataFrame
 Row │ ID     EVID   AFRLT  ADDL  
     │ Int64  Int64  Int64  Int64 
─────┼────────────────────────────
   1 │     1      0     10      0
   2 │     1      1     20      2
   3 │     1      0     50      0
   4 │     2      0     60      0
   5 │     2      1     70      2
   6 │     2      0    100      0

julia> compress_dose_events(df, group = "ID", order = "AFRLT", sampling_rows = :double)
8×4 DataFrame
 Row │ ID     EVID   AFRLT  ADDL  
     │ Int64  Int64  Int64  Int64 
─────┼────────────────────────────
   1 │     1      0     10      0
   2 │     1      1     20      1
   3 │     1      1     40      0
   4 │     1      0     50      0
   5 │     2      0     60      0
   6 │     2      1     70      1
   7 │     2      1     90      0
   8 │     2      0    100      0

ADaM.convert_to_missing - Method
convert_to_missing(df::DataFrame, NaStr::Vector)

Converts the given values to missing. The values to convert are passed as a Vector, for example [nothing, "", NaN, ".", "-"].
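
On a single column, the replacement can be sketched as below. The helper to_missing is hypothetical and works on a plain vector rather than a DataFrame; it compares with isequal so that NaN matches itself, which == would not do.

```julia
# Hypothetical helper: replace every element equal to one of the sentinels
# with missing. isequal is used so that NaN is matched as well.
to_missing(values, sentinels) =
    [any(s -> isequal(v, s), sentinels) ? missing : v for v in values]

cleaned = to_missing(Any[1, ".", NaN, nothing, "-"], [nothing, "", NaN, ".", "-"])
```

Only the first element survives here; the other four are replaced by missing.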

Example

julia> df = DataFrame(Col1 = [1, "."], Col2 = ["", 2], Col3 = [3, nothing], Col4 = ["-", 4])
2×4 DataFrame
 Row │ Col1  Col2  Col3    Col4 
     │ Any   Any   Union…  Any  
─────┼──────────────────────────
   1 │ 1           3       -
   2 │ .     2             4

julia> convert_to_missing(df, ["", nothing, ".", "-"])
2×4 DataFrame
 Row │ Col1     Col2     Col3     Col4
     │ Int64?   Int64?   Int64?   Int64?
─────┼────────────────────────────────────
   1 │       1  missing        3  missing
   2 │ missing        2  missing        4

ADaM.creatinine_clearance - Method
creatinine_clearance(age::Number, weight::Number, creat::Number, sex; kwargs...)
creatinine_clearance(age::Quantity, weight::Quantity, creat::Quantity, sex)
creatinine_clearance(df::DataFrame; kwargs...)

Calculates creatinine clearance from age, weight, creatinine and sex using the Cockcroft–Gault formula. Inputs can be provided as Quantities, scalars, or vectors via a DataFrame. Reference: Cockcroft–Gault formula (Wikipedia).

The age_unit, weight_unit and creat_unit can be passed explicitly as unit values or as a Vector. Units follow DynamicQuantities.jl.

Default values for keyword arguments:

  • age = :AGE
  • weight = :WTBL
  • creat = :CREATBL
  • sex = :SEX
  • age_unit = :AGEU
  • weight_unit = :WTBLU
  • creat_unit = :CREATBLU
  • col = :CRCLBL
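
The Cockcroft–Gault formula in its usual form is (140 - age) × weight / (72 × creatinine), multiplied by 0.85 for females, with age in years, weight in kg and creatinine in mg/dL. A minimal sketch with hypothetical helper names follows; the creatinine molar mass of 113.12 g/mol used for the umol/L conversion is an assumption.

```julia
# Cockcroft-Gault: CrCl [mL/min] = (140 - age) * weight / (72 * creat),
# times 0.85 for females. Age in years, weight in kg, creatinine in mg/dL.
function crcl_sketch(age::Real, weight_kg::Real, creat_mg_dl::Real, sex::AbstractString)
    crcl = (140 - age) * weight_kg / (72 * creat_mg_dl)
    return uppercase(sex) in ("F", "FEMALE") ? 0.85 * crcl : crcl
end

# Convert creatinine from umol/L to mg/dL (assumed molar mass: 113.12 g/mol).
umol_per_l_to_mg_per_dl(creat::Real) = creat * 113.12 / 10_000

crcl_mg = crcl_sketch(53, 85, 1.0, "M")                            # approximately 102.71
crcl_umol = crcl_sketch(53, 85, umol_per_l_to_mg_per_dl(90), "M")  # approximately 100.88
```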

Example

julia> creatinine_clearance(53, 85, 90, "M", creat_unit = "umol/L")
100.88434438681963 min⁻¹ mL

julia> creatinine_clearance(53, 85, 1, "M", creat_unit = "mg/dL")
102.70833333333333 min⁻¹ mL

julia> creatinine_clearance(53us"yr", 85us"kg", 90us"umol/L", "M")
100.88434438681963 min⁻¹ mL

julia> creatinine_clearance(53u"yr", 85u"kg", 1us"mg/dL", "M")
102.70833333333333 min⁻¹ mL

julia> df = DataFrame(
        AGE = [20, 30, 40, 53],
        WTBL = [50, 60, 70, 85],
        CREATBL = [60, 70, 80, 90],
        SEX = ["M", "M", "F", "F"],
        AGEU = :yr,
        WTBLU = :kg,
        CREATBLU = "umol/L",
    )
4×7 DataFrame
 Row │ AGE    WTBL   CREATBL  SEX     AGEU    WTBLU   CREATBLU 
     │ Int64  Int64  Int64    String  Symbol  Symbol  String   
─────┼─────────────────────────────────────────────────────────
   1 │    20     50       60  M       yr      kg      umol/L
   2 │    30     60       70  M       yr      kg      umol/L
   3 │    40     70       80  F       yr      kg      umol/L
   4 │    53     85       90  F       yr      kg      umol/L

julia> creatinine_clearance(df)
4×9 DataFrame
 Row │ AGE    WTBL   CREATBL  SEX     AGEU    WTBLU   CREATBLU  CRCLBL    CRCLBLU   
     │ Int64  Int64  Int64    String  Symbol  Symbol  String    Float64   Symbolic… 
─────┼──────────────────────────────────────────────────────────────────────────────
   1 │    20     50       60  M       yr      kg      umol/L    122.78    min⁻¹ mL
   2 │    30     60       70  M       yr      kg      umol/L    115.764   min⁻¹ mL
   3 │    40     70       80  F       yr      kg      umol/L     91.3177  min⁻¹ mL
   4 │    53     85       90  F       yr      kg      umol/L     85.7517  min⁻¹ mL

julia> df = DataFrame(
        AGEYRS = [20, 30, 40, 53],
        WEIGHT = [50, 60, 70, 85],
        CREAT = [60, 70, 80, 90],
        GENDER = ["M", "M", "F", "F"],
        AGEUNI = :yr,
        WTUNI = :kg,
        CREATUNI = "umol/L",
    )
4×7 DataFrame
 Row │ AGEYRS  WEIGHT  CREAT  GENDER  AGEUNI  WTUNI   CREATUNI 
     │ Int64   Int64   Int64  String  Symbol  Symbol  String   
─────┼─────────────────────────────────────────────────────────
   1 │     20      50     60  M       yr      kg      umol/L
   2 │     30      60     70  M       yr      kg      umol/L
   3 │     40      70     80  F       yr      kg      umol/L
   4 │     53      85     90  F       yr      kg      umol/L

julia> creatinine_clearance(
        df;
        age = :AGEYRS,
        weight = :WEIGHT,
        creat = :CREAT,
        sex = :GENDER,
        age_unit = :AGEUNI,
        weight_unit = :WTUNI,
        creat_unit = :CREATUNI,
        col = :CREATCL,
    )
4×9 DataFrame
 Row │ AGEYRS  WEIGHT  CREAT  GENDER  AGEUNI  WTUNI   CREATUNI  CREATCL   CREATCLU  
     │ Int64   Int64   Int64  String  Symbol  Symbol  String    Float64   Symbolic… 
─────┼──────────────────────────────────────────────────────────────────────────────
   1 │     20      50     60  M       yr      kg      umol/L    122.78    min⁻¹ mL
   2 │     30      60     70  M       yr      kg      umol/L    115.764   min⁻¹ mL
   3 │     40      70     80  F       yr      kg      umol/L     91.3177  min⁻¹ mL
   4 │     53      85     90  F       yr      kg      umol/L     85.7517  min⁻¹ mL

ADaM.definition_table - Method
definition_table(table)

Creates a table that gives a definition overview of the columns of table, intended to give a quick intuition of the dataset. Categoric and numeric columns (NUM, CD, N) are automatically mapped to each other in the Summary column. Custom comments can be passed for each column as additional information.

Keyword arguments

  • max_categories = 10: Limit the number of categories listed individually for categorical columns, the rest will be lumped together.
  • label_metadata_key = "label": Key to look up column label metadata with.
  • map_dict: Helps map the unique values of two columns.
  • comment_dict: Helps display custom comments for columns.

ADaM.derive_body_covariates - Method
derive_body_covariates(df; kwargs...)

Derive standard body size (BMI, BSA) and renal function (CRCL, EGFR) covariates from a DataFrame.

This is a convenience function that internally calls:

  • body_mass_index
  • body_surface_area
  • creatinine_clearance
  • est_glomerular_filtration_rate

Positional Arguments

  • df: Input DataFrame containing subject data.

Keyword Arguments

  • bmi: Output column for Body Mass Index (default: :BMIBL)
  • bsa: Output column for Body Surface Area (default: :BSABL)
  • crcl: Output column for Creatinine Clearance (default: :CRCLBL)
  • egfr: Output column for estimated GFR (default: :EGFRBL)
  • weight: Input column for weight (default: :WTBL)
  • height: Input column for height (default: :HTBL)
  • weight_unit: Input column for weight units (default: :WTBLU)
  • height_unit: Input column for height units (default: :HTBLU)
  • age: Input column for age (default: :AGE)
  • age_unit: Input column for age units (default: :AGEU)
  • sex: Input column for sex/gender (default: :SEX)
  • creat: Input column for serum creatinine (default: :CREATBL)
  • creat_unit: Input column for creatinine units (default: :CREATBLU)
  • bsa_formula: Formula for BSA calculation (default: :mosteller)
  • egfr_formula: Formula for eGFR calculation (default: "ckd-epi-creat-2021")

Returns

A new DataFrame with additional columns for derived covariates, including BMI, BSA, CRCL, and EGFR column values and units.

Examples

julia> df = DataFrame(
        HTBL = [150, 160, 170, 180],
        WTBL = [50, 60, 70, 80],
        HTBLU = "cm",
        WTBLU = "kg",
    )
4×4 DataFrame
 Row │ HTBL   WTBL   HTBLU   WTBLU  
     │ Int64  Int64  String  String 
─────┼──────────────────────────────
   1 │   150     50  cm      kg
   2 │   160     60  cm      kg
   3 │   170     70  cm      kg
   4 │   180     80  cm      kg

julia> derive_body_covariates(df)
4×8 DataFrame
 Row │ HTBL   WTBL   HTBLU   WTBLU   BMIBL    BMIBLU     BSABL    BSABLU    
     │ Int64  Int64  String  String  Float64  Symbolic…  Float64  Symbolic… 
─────┼──────────────────────────────────────────────────────────────────────
   1 │   150     50  cm      kg      22.2222  m⁻² kg     1.44338  m²
   2 │   160     60  cm      kg      23.4375  m⁻² kg     1.63299  m²
   3 │   170     70  cm      kg      24.2215  m⁻² kg     1.81812  m²
   4 │   180     80  cm      kg      24.6914  m⁻² kg     2.0      m²

julia> df = DataFrame(
        HTBL = [150, 160, 170, 180],
        WTBL = [50, 60, 70, 80],
        HTBLU = "cm",
        WTBLU = "kg",
        AGE = [20, 30, 40, 53],
        CREATBL = [60, 70, 80, 90],
        SEX = ["M", "M", "F", "F"],
        AGEU = :yr,
        CREATBLU = "umol/L",
    )
4×9 DataFrame
 Row │ HTBL   WTBL   HTBLU   WTBLU   AGE    CREATBL  SEX     AGEU    CREATBLU 
     │ Int64  Int64  String  String  Int64  Int64    String  Symbol  String   
─────┼────────────────────────────────────────────────────────────────────────
   1 │   150     50  cm      kg         20       60  M       yr      umol/L
   2 │   160     60  cm      kg         30       70  M       yr      umol/L
   3 │   170     70  cm      kg         40       80  F       yr      umol/L
   4 │   180     80  cm      kg         53       90  F       yr      umol/L

julia> derive_body_covariates(df)
4×17 DataFrame
 Row │ HTBL   WTBL   HTBLU   WTBLU   AGE    CREATBL  SEX     AGEU    CREATBLU  BMIBL    BMIBLU     BSABL    BSABLU     CRCLBL    CRCLBLU    EGFRBL    EGFRBLU         
     │ Int64  Int64  String  String  Int64  Int64    String  Symbol  String    Float64  Symbolic…  Float64  Symbolic…  Float64   Symbolic…  Float64   String          
─────┼────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │   150     50  cm      kg         20       60  M       yr      umol/L    22.2222  m⁻² kg     1.44338  m²         122.78    min⁻¹ mL   136.546   min⁻¹ mL/1.73m²
   2 │   160     60  cm      kg         30       70  M       yr      umol/L    23.4375  m⁻² kg     1.63299  m²         115.764   min⁻¹ mL   122.476   min⁻¹ mL/1.73m²
   3 │   170     70  cm      kg         40       80  F       yr      umol/L    24.2215  m⁻² kg     1.81812  m²          91.3177  min⁻¹ mL    82.3362  min⁻¹ mL/1.73m²
   4 │   180     80  cm      kg         53       90  F       yr      umol/L    24.6914  m⁻² kg     2.0      m²          80.7075  min⁻¹ mL    65.9318  min⁻¹ mL/1.73m²

ADaM.dose_interval_sequence - Method
dose_interval_sequence(df::DataFrame; group, order)

Creates a DINTSEQ column from either a dose (EX) dataset or a combined dataset (EX and PC).

The DINTSEQ column contains the running sequence number of dose/washout events (EVID in [1, 4]), starting from missing (predose sample) or 1 (initial dose).

group and order variables (Vectors or scalars) can be passed to customise the sequence.

Required columns for dose count: EVID
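
The numbering described above amounts to a running count of dosing events that stays missing until the first dose. A minimal sketch on a bare EVID vector, using a hypothetical helper dose_sequence that ignores grouping and ordering:

```julia
# Running count of dose/washout events (EVID 1 or 4).
# Rows before the first dose stay missing.
function dose_sequence(evid::AbstractVector{<:Integer})
    seq = Vector{Union{Missing,Int}}(missing, length(evid))
    n = 0
    for (i, e) in enumerate(evid)
        e in (1, 4) && (n += 1)  # a new dose starts a new interval
        n > 0 && (seq[i] = n)    # rows after a dose inherit its number
    end
    return seq
end

seq = dose_sequence([0, 1, 1, 1, 0])
```

For a predose sample followed by three doses and a final sample, this gives missing, 1, 2, 3, 3.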

Example

julia> df = DataFrame([
    (1, 1, 40),
    (2, 1, 90),
    (1, 0, 10),
    (2, 0, 60),
    (1, 1, 20),
    (2, 1, 70),
    (1, 1, 30),
    (2, 1, 80),
    (1, 0, 50),
    (2, 0, 100)
], [:ID, :EVID, :AFRLT]) # unordered and ungrouped dataset
10×3 DataFrame
 Row │ ID     EVID   AFRLT 
     │ Int64  Int64  Int64 
─────┼─────────────────────
   1 │     1      1     40
   2 │     2      1     90
   3 │     1      0     10
   4 │     2      0     60
   5 │     1      1     20
   6 │     2      1     70
   7 │     1      1     30
   8 │     2      1     80
   9 │     1      0     50
  10 │     2      0    100

julia> dose_interval_sequence(df) # count without groupby
10×4 DataFrame
 Row │ ID     EVID   AFRLT  DINTSEQ 
     │ Int64  Int64  Int64  Int64   
─────┼──────────────────────────────
   1 │     1      1     40        1
   2 │     2      1     90        2
   3 │     1      0     10        2
   4 │     2      0     60        2
   5 │     1      1     20        3
   6 │     2      1     70        4
   7 │     1      1     30        5
   8 │     2      1     80        6
   9 │     1      0     50        6
  10 │     2      0    100        6

julia> dose_interval_sequence(df, group = ["ID"])
10×4 DataFrame
 Row │ ID     EVID   AFRLT  DINTSEQ 
     │ Int64  Int64  Int64  Int64   
─────┼──────────────────────────────
   1 │     1      1     40        1
   2 │     1      0     10        1
   3 │     1      1     20        2
   4 │     1      1     30        3
   5 │     1      0     50        3
   6 │     2      1     90        1
   7 │     2      0     60        1
   8 │     2      1     70        2
   9 │     2      1     80        3
  10 │     2      0    100        3

julia> dose_interval_sequence(df, group = :ID, order = :AFRLT)
10×4 DataFrame
 Row │ ID     EVID   AFRLT  DINTSEQ 
     │ Int64  Int64  Int64  Int64?  
─────┼──────────────────────────────
   1 │     1      0     10  missing 
   2 │     1      1     20        1
   3 │     1      1     30        2
   4 │     1      1     40        3
   5 │     1      0     50        3
   6 │     2      0     60  missing 
   7 │     2      1     70        1
   8 │     2      1     80        2
   9 │     2      1     90        3
  10 │     2      0    100        3

ADaM.est_glomerular_filtration_rate - Method
est_glomerular_filtration_rate(age::Number, sex::Union{Symbol,AbstractString}; kwargs...)
est_glomerular_filtration_rate(age::Quantity, sex::Union{Symbol,AbstractString}; kwargs...)
est_glomerular_filtration_rate(df::DataFrame; kwargs...)

Calculates eGFR from age, creatinine and sex, which can be provided as Quantities, scalars, or vectors via a DataFrame. The following formulas are supported:

  • CKD-EPI Creatinine 2021 Formula - ckd-epi-creat-2021 (default)
  • CKD-EPI Cystatin C 2012 Formula - ckd-epi-cyst-2012
  • MDRD Formula - mdrd (requires race parameter)

The age_unit, creat_unit, and cyst_unit can be passed explicitly as unit values or as a Vector. Units follow DynamicQuantities.jl.
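
As a point of reference, the default CKD-EPI Creatinine 2021 equation can be sketched with its published coefficients, taking creatinine in mg/dL and age in years. The helper egfr_ckd_epi_2021 is hypothetical and skips the unit handling and input validation that the package performs.

```julia
# CKD-EPI Creatinine 2021 (published form), result in mL/min/1.73m²:
# 142 * min(Scr/k, 1)^a * max(Scr/k, 1)^-1.200 * 0.9938^age * (1.012 if female)
# with k = 0.7 (female) or 0.9 (male) and a = -0.241 (female) or -0.302 (male).
function egfr_ckd_epi_2021(age::Real, sex::AbstractString, creat_mg_dl::Real)
    female = uppercase(sex) in ("F", "FEMALE")
    kappa = female ? 0.7 : 0.9
    alpha = female ? -0.241 : -0.302
    ratio = creat_mg_dl / kappa
    egfr = 142 * min(ratio, 1)^alpha * max(ratio, 1)^(-1.200) * 0.9938^age
    return female ? 1.012 * egfr : egfr
end

egfr_male = egfr_ckd_epi_2021(53, "M", 1.0)  # approximately 90.0
```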

Arguments

  • age: Age value.
  • sex: Sex/gender (valid values: "M", "MALE", "F", "FEMALE", case-insensitive).
  • creat: Serum creatinine value (default: 0).
  • cyst: Cystatin C value (default: 0).
  • race: Race (default: "U"). Required for MDRD formula.
  • age_unit: Age unit (default: "yr").
  • creat_unit: Creatinine unit (default: "mg/dL").
  • cyst_unit: Cystatin C unit (default: "mg/L").
  • formula: eGFR calculation formula (default: "ckd-epi-creat-2021").

Default Columns

The following default column names are used for DataFrame input:

  • age = :AGE
  • sex = :SEX
  • creat = :CREATBL
  • cyst = :CYSTBL
  • race = :RACE
  • age_unit = :AGEU
  • creat_unit = :CREATBLU
  • cyst_unit = :CYSTBLU
  • col = :EGFRBL

Output

eGFR value in mL/min/1.73m² as a scalar or DataFrame column with unit column.

Notes

  • Age must be between 1 and 140 years.

Examples

julia> est_glomerular_filtration_rate(53, "M", creat = 90, creat_unit = "umol/L")
88.08210123133497

julia> est_glomerular_filtration_rate(53, "M", creat = 1, creat_unit = "mg/dL")
89.99656911635697

julia> est_glomerular_filtration_rate(53us"yr", "M", creat = 90us"umol/L", formula = "ckd-epi-creat-2021")
88.08210123133497

julia> est_glomerular_filtration_rate(53u"yr", "M", creat = 1us"mg/dL")
89.99656911635697

julia> est_glomerular_filtration_rate(53u"yr", "M", creat = 1us"mg/dL", race="BLACK OR AFRICAN AMERICAN", formula=:mdrd)
94.7354815425664

julia> est_glomerular_filtration_rate(53u"yr", "M", cyst = 0.6us"mg/L", formula="ckd-epi-cyst-2012")
124.1483657437901

julia> df = DataFrame(
        AGE = [20, 30, 40, 53],
        CREATBL = [60, 70, 80, 90],
        SEX = ["M", "M", "F", "F"],
        AGEU = :yr,
        CREATBLU = "umol/L",
    )
4×5 DataFrame
 Row │ AGE    CREATBL  SEX     AGEU    CREATBLU 
     │ Int64  Int64    String  Symbol  String   
─────┼──────────────────────────────────────────
   1 │    20       60  M       yr      umol/L
   2 │    30       70  M       yr      umol/L
   3 │    40       80  F       yr      umol/L
   4 │    53       90  F       yr      umol/L

julia> est_glomerular_filtration_rate(df)
4×7 DataFrame
 Row │ AGE    CREATBL  SEX     AGEU    CREATBLU  EGFRBL    EGFRBLU         
     │ Int64  Int64    String  Symbol  String    Float64   String          
─────┼─────────────────────────────────────────────────────────────────────
   1 │    20       60  M       yr      umol/L    136.546   min⁻¹ mL/1.73m²
   2 │    30       70  M       yr      umol/L    122.476   min⁻¹ mL/1.73m²
   3 │    40       80  F       yr      umol/L     82.3362  min⁻¹ mL/1.73m²
   4 │    53       90  F       yr      umol/L     65.9318  min⁻¹ mL/1.73m²

julia> df = DataFrame(
        AGEYRS = [20, 30, 40, 53],
        CREAT = [60, 70, 80, 90],
        GENDER = ["M", "M", "F", "F"],
        AGEUNI = :yr,
        RACEC = "BLACK OR AFRICAN AMERICAN",
        CREATUNI = "umol/L"
    )
4×6 DataFrame
 Row │ AGEYRS  CREAT  GENDER  AGEUNI  RACEC                      CREATUNI 
     │ Int64   Int64  String  Symbol  String                     String   
─────┼────────────────────────────────────────────────────────────────────
   1 │     20     60  M       yr      BLACK OR AFRICAN AMERICAN  umol/L
   2 │     30     70  M       yr      BLACK OR AFRICAN AMERICAN  umol/L
   3 │     40     80  F       yr      BLACK OR AFRICAN AMERICAN  umol/L
   4 │     53     90  F       yr      BLACK OR AFRICAN AMERICAN  umol/L

julia> est_glomerular_filtration_rate(
        df;
        age = :AGEYRS,
        creat = :CREAT,
        sex = :GENDER,
        race = :RACEC,
        age_unit = :AGEUNI,
        creat_unit = :CREATUNI,
        col = :EGFR,
        formula=:mdrd
    )
4×8 DataFrame
 Row │ AGEYRS  CREAT  GENDER  AGEUNI  RACEC                      CREATUNI  EGFR      EGFRU           
     │ Int64   Int64  String  Symbol  String                     String    Float64   String          
─────┼───────────────────────────────────────────────────────────────────────────────────────────────
   1 │     20     60  M       yr      BLACK OR AFRICAN AMERICAN  umol/L    180.576   min⁻¹ mL/1.73m²
   2 │     30     70  M       yr      BLACK OR AFRICAN AMERICAN  umol/L    139.206   min⁻¹ mL/1.73m²
   3 │     40     80  F       yr      BLACK OR AFRICAN AMERICAN  umol/L     83.5172  min⁻¹ mL/1.73m²
   4 │     53     90  F       yr      BLACK OR AFRICAN AMERICAN  umol/L     68.8551  min⁻¹ mL/1.73m²

julia> df = DataFrame(
        AGE = [20, 30, 40, 53],
        CYSTBL = [0.6, 0.7, 0.8, 0.9],
        SEX = ["M", "M", "F", "F"],
        RACE = "ASIAN",
        AGEU = :yr,
        CYSTBLU = "mg/L",
    )

julia> est_glomerular_filtration_rate(df, formula="ckd-epi-cyst-2012")
4×8 DataFrame
 Row │ AGE    CYSTBL   SEX     RACE    AGEU    CYSTBLU  EGFRBL   EGFRBLU         
     │ Int64  Float64  String  String  Symbol  String   Float64  String          
─────┼───────────────────────────────────────────────────────────────────────────
   1 │    20      0.6  M       ASIAN   yr      mg/L     141.704  min⁻¹ mL/1.73m²
   2 │    30      0.7  M       ASIAN   yr      mg/L     126.058  min⁻¹ mL/1.73m²
   3 │    40      0.8  F       ASIAN   yr      mg/L     105.594  min⁻¹ mL/1.73m²
   4 │    53      0.9  F       ASIAN   yr      mg/L      85.72   min⁻¹ mL/1.73m²
ADaM.expand_dose_eventsMethod
expand_dose_events(df::DataFrame; start_dtm, end_dtm, trt_start_dtm, dose_freq)
  • Returns an expanded dataset when the EX (dose) dataset is passed as input with all of its columns.
  • Creates one row for every dose in each dosing period of a subject.
  • Generates NFRLT (nominal time relative to first dose), which varies from row to row according to the dosing frequency.
  • All other column values are duplicated on expansion.

Arguments

  • start_dtm: Column name containing the start datetime for the dosing period
  • end_dtm: Column name containing the end datetime for the dosing period
  • trt_start_dtm: Column name containing the first analyte dose datetime (used as reference for NFRLT calculation)
  • dose_freq: Column name containing the dosing frequency

Default Columns

The following default column names are used for DataFrame input:

  • start_dtm = :ASTDTM
  • end_dtm = :AENDTM
  • trt_start_dtm = :FANLDTM
  • dose_freq = :EXDOSFRQ

Notes

If the dosing frequency is 'ONCE' or the start date equals the end date, no expansion happens.
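
The expansion rule can be pictured with a minimal sketch that uses only the Dates standard library. It assumes the interval between doses is already known in hours (24 for QD, 8 for TID) and that it is a whole number of minutes. The helper name expand_period is hypothetical and is not part of ADaM; the package's actual implementation may differ.

```julia
using Dates

# Hypothetical helper: list every dose time in one dosing period together with
# its nominal time (in hours) relative to the first dose of the treatment.
function expand_period(start_dtm::DateTime, end_dtm::DateTime,
                       trt_start_dtm::DateTime, ii_hours::Real)
    # Assumption: the inter-dose interval is a whole number of minutes.
    step = Minute(round(Int, ii_hours * 60))
    dose_times = collect(start_dtm:step:end_dtm)
    # DateTime differences are in milliseconds; convert them to hours.
    nfrlt = [Dates.value(t - trt_start_dtm) / 3_600_000 for t in dose_times]
    return dose_times, nfrlt
end

times, nfrlt = expand_period(DateTime("2025-01-01"), DateTime("2025-01-03"),
                             DateTime("2025-01-01"), 24)
```

With a single row whose start and end are identical, the range contains one element, which is consistent with the no-expansion note above.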

Example

julia> df = DataFrame(USUBJID = "1", 
                      EVID = 1, 
                      ASTDTM = DateTime.(["2025-01-01", "2025-02-01", "2025-03-01"]), 
                      AENDTM = DateTime.(["2025-01-31", "2025-02-28", "2025-03-31"]), 
                      EXDOSFRQ = "QD", 
                      FANLDTM = DateTime.("2025-01-01"))
3×6 DataFrame
 Row │ USUBJID  EVID   ASTDTM               AENDTM               EXDOSFRQ  FANLDTM             
     │ String   Int64  DateTime             DateTime             String    DateTime            
─────┼─────────────────────────────────────────────────────────────────────────────────────────
   1 │ 1            1  2025-01-01T00:00:00  2025-01-31T00:00:00  QD        2025-01-01T00:00:00
   2 │ 1            1  2025-02-01T00:00:00  2025-02-28T00:00:00  QD        2025-01-01T00:00:00
   3 │ 1            1  2025-03-01T00:00:00  2025-03-31T00:00:00  QD        2025-01-01T00:00:00

julia> expand_dose_events(df) # QD
90×7 DataFrame
 Row │ USUBJID  EVID   ASTDTM               AENDTM               EXDOSFRQ  FANLDTM              NFRLT   
     │ String   Int64  DateTime             DateTime             String    DateTime             Float64 
─────┼──────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ 1            1  2025-01-01T00:00:00  2025-01-31T00:00:00  QD        2025-01-01T00:00:00      0.0
   2 │ 1            1  2025-01-02T00:00:00  2025-01-31T00:00:00  QD        2025-01-01T00:00:00     24.0
   3 │ 1            1  2025-01-03T00:00:00  2025-01-31T00:00:00  QD        2025-01-01T00:00:00     48.0
   4 │ 1            1  2025-01-04T00:00:00  2025-01-31T00:00:00  QD        2025-01-01T00:00:00     72.0
   5 │ 1            1  2025-01-05T00:00:00  2025-01-31T00:00:00  QD        2025-01-01T00:00:00     96.0
   6 │ 1            1  2025-01-06T00:00:00  2025-01-31T00:00:00  QD        2025-01-01T00:00:00    120.0
  ⋮  │    ⋮       ⋮             ⋮                    ⋮              ⋮               ⋮              ⋮
  86 │ 1            1  2025-03-27T00:00:00  2025-03-31T00:00:00  QD        2025-01-01T00:00:00   2040.0
  87 │ 1            1  2025-03-28T00:00:00  2025-03-31T00:00:00  QD        2025-01-01T00:00:00   2064.0
  88 │ 1            1  2025-03-29T00:00:00  2025-03-31T00:00:00  QD        2025-01-01T00:00:00   2088.0
  89 │ 1            1  2025-03-30T00:00:00  2025-03-31T00:00:00  QD        2025-01-01T00:00:00   2112.0
  90 │ 1            1  2025-03-31T00:00:00  2025-03-31T00:00:00  QD        2025-01-01T00:00:00   2136.0
                                                                                         79 rows omitted

julia> df = DataFrame(USUBJID = "1", 
                      EVID = 1, 
                      ASTDTM = DateTime.(["2025-01-01", "2025-02-01", "2025-03-01"]), 
                      AENDTM = DateTime.(["2025-01-31", "2025-02-28", "2025-03-31"]), 
                      EXDOSFRQ = "EVERY WEEK", 
                      FANLDTM = DateTime.("2025-01-01"))
3×6 DataFrame
 Row │ USUBJID  EVID   ASTDTM               AENDTM               EXDOSFRQ    FANLDTM             
     │ String   Int64  DateTime             DateTime             String      DateTime            
─────┼───────────────────────────────────────────────────────────────────────────────────────────
   1 │ 1            1  2025-01-01T00:00:00  2025-01-31T00:00:00  EVERY WEEK  2025-01-01T00:00:00
   2 │ 1            1  2025-02-01T00:00:00  2025-02-28T00:00:00  EVERY WEEK  2025-01-01T00:00:00
   3 │ 1            1  2025-03-01T00:00:00  2025-03-31T00:00:00  EVERY WEEK  2025-01-01T00:00:00

julia> expand_dose_events(df) # EVERY WEEK
14×7 DataFrame
 Row │ USUBJID  EVID   ASTDTM               AENDTM               EXDOSFRQ    FANLDTM              NFRLT   
     │ String   Int64  DateTime             DateTime             String      DateTime             Float64 
─────┼────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ 1            1  2025-01-01T00:00:00  2025-01-31T00:00:00  EVERY WEEK  2025-01-01T00:00:00      0.0
   2 │ 1            1  2025-01-08T00:00:00  2025-01-31T00:00:00  EVERY WEEK  2025-01-01T00:00:00    168.0
   3 │ 1            1  2025-01-15T00:00:00  2025-01-31T00:00:00  EVERY WEEK  2025-01-01T00:00:00    336.0
   4 │ 1            1  2025-01-22T00:00:00  2025-01-31T00:00:00  EVERY WEEK  2025-01-01T00:00:00    504.0
   5 │ 1            1  2025-01-29T00:00:00  2025-01-31T00:00:00  EVERY WEEK  2025-01-01T00:00:00    672.0
   6 │ 1            1  2025-02-01T00:00:00  2025-02-28T00:00:00  EVERY WEEK  2025-01-01T00:00:00    744.0
   7 │ 1            1  2025-02-08T00:00:00  2025-02-28T00:00:00  EVERY WEEK  2025-01-01T00:00:00    912.0
   8 │ 1            1  2025-02-15T00:00:00  2025-02-28T00:00:00  EVERY WEEK  2025-01-01T00:00:00   1080.0
   9 │ 1            1  2025-02-22T00:00:00  2025-02-28T00:00:00  EVERY WEEK  2025-01-01T00:00:00   1248.0
  10 │ 1            1  2025-03-01T00:00:00  2025-03-31T00:00:00  EVERY WEEK  2025-01-01T00:00:00   1416.0
  11 │ 1            1  2025-03-08T00:00:00  2025-03-31T00:00:00  EVERY WEEK  2025-01-01T00:00:00   1584.0
  12 │ 1            1  2025-03-15T00:00:00  2025-03-31T00:00:00  EVERY WEEK  2025-01-01T00:00:00   1752.0
  13 │ 1            1  2025-03-22T00:00:00  2025-03-31T00:00:00  EVERY WEEK  2025-01-01T00:00:00   1920.0
  14 │ 1            1  2025-03-29T00:00:00  2025-03-31T00:00:00  EVERY WEEK  2025-01-01T00:00:00   2088.0

Example with custom column names using kwargs

julia> df_custom = DataFrame(USUBJID = "1", 
                             EVID = 1, 
                             START_TIME = DateTime.(["2025-01-01", "2025-01-08"]), 
                             END_TIME = DateTime.(["2025-01-07", "2025-01-14"]), 
                             FREQUENCY = ["QD", "BID"], 
                             FIRST_DOSE = DateTime.("2025-01-01"))
2×6 DataFrame
 Row │ USUBJID  EVID   START_TIME           END_TIME             FREQUENCY  FIRST_DOSE          
     │ String   Int64  DateTime             DateTime             String     DateTime            
─────┼─────────────────────────────────────────────────────────────────────────────────────────
   1 │ 1            1  2025-01-01T00:00:00  2025-01-07T00:00:00  QD         2025-01-01T00:00:00
   2 │ 1            1  2025-01-08T00:00:00  2025-01-14T00:00:00  BID        2025-01-01T00:00:00

julia> expand_dose_events(df_custom, 
                          start_dtm = :START_TIME,
                          end_dtm = :END_TIME, 
                          trt_start_dtm = :FIRST_DOSE, 
                          dose_freq = :FREQUENCY)
20×7 DataFrame
 Row │ USUBJID  EVID   START_TIME           END_TIME             FREQUENCY  FIRST_DOSE           NFRLT   
     │ String   Int64  DateTime             DateTime             String     DateTime             Float64 
─────┼───────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ 1            1  2025-01-01T00:00:00  2025-01-07T00:00:00  QD         2025-01-01T00:00:00      0.0
   2 │ 1            1  2025-01-01T00:00:00  2025-01-07T00:00:00  QD         2025-01-01T00:00:00     24.0
   3 │ 1            1  2025-01-01T00:00:00  2025-01-07T00:00:00  QD         2025-01-01T00:00:00     48.0
   4 │ 1            1  2025-01-01T00:00:00  2025-01-07T00:00:00  QD         2025-01-01T00:00:00     72.0
   5 │ 1            1  2025-01-01T00:00:00  2025-01-07T00:00:00  QD         2025-01-01T00:00:00     96.0
   6 │ 1            1  2025-01-01T00:00:00  2025-01-07T00:00:00  QD         2025-01-01T00:00:00    120.0
   7 │ 1            1  2025-01-01T00:00:00  2025-01-07T00:00:00  QD         2025-01-01T00:00:00    144.0
   8 │ 1            1  2025-01-08T00:00:00  2025-01-14T00:00:00  BID        2025-01-01T00:00:00    168.0
   9 │ 1            1  2025-01-08T00:00:00  2025-01-14T00:00:00  BID        2025-01-01T00:00:00    180.0
  10 │ 1            1  2025-01-08T00:00:00  2025-01-14T00:00:00  BID        2025-01-01T00:00:00    192.0
  11 │ 1            1  2025-01-08T00:00:00  2025-01-14T00:00:00  BID        2025-01-01T00:00:00    204.0
  12 │ 1            1  2025-01-08T00:00:00  2025-01-14T00:00:00  BID        2025-01-01T00:00:00    216.0
  13 │ 1            1  2025-01-08T00:00:00  2025-01-14T00:00:00  BID        2025-01-01T00:00:00    228.0
  14 │ 1            1  2025-01-08T00:00:00  2025-01-14T00:00:00  BID        2025-01-01T00:00:00    240.0
  15 │ 1            1  2025-01-08T00:00:00  2025-01-14T00:00:00  BID        2025-01-01T00:00:00    252.0
  16 │ 1            1  2025-01-08T00:00:00  2025-01-14T00:00:00  BID        2025-01-01T00:00:00    264.0
  17 │ 1            1  2025-01-08T00:00:00  2025-01-14T00:00:00  BID        2025-01-01T00:00:00    276.0
  18 │ 1            1  2025-01-08T00:00:00  2025-01-14T00:00:00  BID        2025-01-01T00:00:00    288.0
  19 │ 1            1  2025-01-08T00:00:00  2025-01-14T00:00:00  BID        2025-01-01T00:00:00    300.0
  20 │ 1            1  2025-01-08T00:00:00  2025-01-14T00:00:00  BID        2025-01-01T00:00:00    312.0

Example with ONCE dosing (no expansion)

julia> df_once = DataFrame(USUBJID = "1", 
                           EVID = 1, 
                           ASTDTM = DateTime.("2025-01-01"), 
                           AENDTM = DateTime.("2025-01-01"), 
                           EXDOSFRQ = "ONCE", 
                           FANLDTM = DateTime.("2025-01-01"))
1×6 DataFrame
 Row │ USUBJID  EVID   ASTDTM               AENDTM               EXDOSFRQ  FANLDTM             
     │ String   Int64  DateTime             DateTime             String    DateTime            
─────┼─────────────────────────────────────────────────────────────────────────────────────────
   1 │ 1            1  2025-01-01T00:00:00  2025-01-01T00:00:00  ONCE      2025-01-01T00:00:00

julia> expand_dose_events(df_once)
1×7 DataFrame
 Row │ USUBJID  EVID   ASTDTM               AENDTM               EXDOSFRQ  FANLDTM              NFRLT   
     │ String   Int64  DateTime             DateTime             String    DateTime             Float64 
─────┼──────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ 1            1  2025-01-01T00:00:00  2025-01-01T00:00:00  ONCE      2025-01-01T00:00:00      0.0

Example with different dosing frequencies

julia> df_freq = DataFrame(USUBJID = ["1", "1", "1"], 
                           EVID = [1, 1, 1], 
                           ASTDTM = DateTime.(["2025-01-01", "2025-01-01", "2025-01-01"]), 
                           AENDTM = DateTime.(["2025-01-03", "2025-01-03", "2025-01-02"]),
                           EXDOSFRQ = ["TID", "Q8H", "Q12H"], 
                           FANLDTM = DateTime.(["2025-01-01", "2025-01-01", "2025-01-01"]))
3×6 DataFrame
 Row │ USUBJID  EVID   ASTDTM               AENDTM               EXDOSFRQ  FANLDTM             
     │ String   Int64  DateTime             DateTime             String    DateTime            
─────┼─────────────────────────────────────────────────────────────────────────────────────────
   1 │ 1            1  2025-01-01T00:00:00  2025-01-03T00:00:00  TID       2025-01-01T00:00:00
   2 │ 1            1  2025-01-01T00:00:00  2025-01-03T00:00:00  Q8H       2025-01-01T00:00:00
   3 │ 1            1  2025-01-01T00:00:00  2025-01-02T00:00:00  Q12H      2025-01-01T00:00:00

julia> expand_dose_events(df_freq)
17×7 DataFrame
 Row │ USUBJID  EVID   ASTDTM               AENDTM               EXDOSFRQ  FANLDTM              NFRLT   
     │ String   Int64  DateTime             DateTime             String    DateTime             Float64 
─────┼──────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ 1            1  2025-01-01T00:00:00  2025-01-03T00:00:00  TID       2025-01-01T00:00:00      0.0
   2 │ 1            1  2025-01-01T08:00:00  2025-01-03T00:00:00  TID       2025-01-01T00:00:00      8.0
   3 │ 1            1  2025-01-01T16:00:00  2025-01-03T00:00:00  TID       2025-01-01T00:00:00     16.0
   4 │ 1            1  2025-01-02T00:00:00  2025-01-03T00:00:00  TID       2025-01-01T00:00:00     24.0
   5 │ 1            1  2025-01-02T08:00:00  2025-01-03T00:00:00  TID       2025-01-01T00:00:00     32.0
   6 │ 1            1  2025-01-02T16:00:00  2025-01-03T00:00:00  TID       2025-01-01T00:00:00     40.0
   7 │ 1            1  2025-01-03T00:00:00  2025-01-03T00:00:00  TID       2025-01-01T00:00:00     48.0
   8 │ 1            1  2025-01-01T00:00:00  2025-01-03T00:00:00  Q8H       2025-01-01T00:00:00      0.0
   9 │ 1            1  2025-01-01T08:00:00  2025-01-03T00:00:00  Q8H       2025-01-01T00:00:00      8.0
  10 │ 1            1  2025-01-01T16:00:00  2025-01-03T00:00:00  Q8H       2025-01-01T00:00:00     16.0
  11 │ 1            1  2025-01-02T00:00:00  2025-01-03T00:00:00  Q8H       2025-01-01T00:00:00     24.0
  12 │ 1            1  2025-01-02T08:00:00  2025-01-03T00:00:00  Q8H       2025-01-01T00:00:00     32.0
  13 │ 1            1  2025-01-02T16:00:00  2025-01-03T00:00:00  Q8H       2025-01-01T00:00:00     40.0
  14 │ 1            1  2025-01-03T00:00:00  2025-01-03T00:00:00  Q8H       2025-01-01T00:00:00     48.0
  15 │ 1            1  2025-01-01T00:00:00  2025-01-02T00:00:00  Q12H      2025-01-01T00:00:00      0.0
  16 │ 1            1  2025-01-01T12:00:00  2025-01-02T00:00:00  Q12H      2025-01-01T00:00:00     12.0
  17 │ 1            1  2025-01-02T00:00:00  2025-01-02T00:00:00  Q12H      2025-01-01T00:00:00     24.0
ADaM.interdose_intervalMethod
interdose_interval(df::DataFrame)

Returns the inter-dose interval (II, in hours) when a valid dose frequency is passed. Dose frequencies are defined using CDISC SDTM Controlled Terminology. The assumptions made are:

  • A day is 24 hours
  • A week is 7 days
  • A month is 30 days
  • A year is 52 weeks
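
The arithmetic implied by these assumptions can be written out as a short sketch. The variable and function names below are illustrative only and are not part of ADaM. The sketch covers the unit conversions, not the lookup of CDISC frequency codes.

```julia
# Hours per calendar unit under the stated assumptions.
hours_per_day   = 24
hours_per_week  = 7 * hours_per_day     # 168
hours_per_month = 30 * hours_per_day    # 720
hours_per_year  = 52 * hours_per_week   # 8736

# "n times per day" frequencies (for example BID = 2, TID = 3) split a day evenly.
times_per_day_interval(n) = 24 / n

# "every n minutes" frequencies (for example Q45MIN) convert minutes to hours.
every_n_minutes_interval(n) = n / 60
```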

Example

julia> interdose_interval("QD")
24

julia> interdose_interval("EVERY AFTERNOON")
24

julia> interdose_interval("Q45MIN")
0.75

julia> interdose_interval("QY")
8736

julia> df = DataFrame(USUBJID = "1", EXDOSFRQ = ["QD", "BID", "QD", "TID"])
4×2 DataFrame
 Row │ USUBJID  EXDOSFRQ 
     │ String   String   
─────┼───────────────────
   1 │ 1        QD
   2 │ 1        BID
   3 │ 1        QD
   4 │ 1        TID

julia> @rtransform! df :II = interdose_interval(:EXDOSFRQ)
4×3 DataFrame
 Row │ USUBJID  EXDOSFRQ  II   
     │ String   String    Real 
─────┼─────────────────────────
   1 │ 1        QD        24
   2 │ 1        BID       12.0
   3 │ 1        QD        24
   4 │ 1        TID        8.0
ADaM.join_columnsMethod
join_columns(target, reference; on, order, rev_order, keep, filter_ref, filter_join, mode)

Add columns from a reference DataFrame to a target DataFrame with flexible matching and filtering.

Positional Arguments

  • target: Main DataFrame to join to
  • reference: DataFrame to join columns from

Keyword Arguments

  • on: Column(s) to match on (join keys)
  • order: Column(s) to sort matches by (optional)
  • rev_order: Boolean vector indicating descending sort for each order column (default: all ascending)
  • keep: Vector of source => destination pairs for column mapping (default: all reference columns)
  • filter_ref: Function to pre-filter reference rows (default: include all)
  • filter_join: Function (target_row, ref_row) to filter matches (default: include all)
  • mode: Take :first or :last match after sorting (default: :first)
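
How on, filter_join, order and mode interact can be sketched for a single target row using plain NamedTuples. The helper pick_match is hypothetical and handles only one key column and one order column. Returning missing when nothing matches is an assumption made for this sketch.

```julia
using Dates

# Hypothetical helper mirroring the described steps for one target row:
# match on a key, filter the matches, sort them, then take the first or last.
function pick_match(target_row, ref_rows; on, order,
                    filter_join = (t, r) -> true, mode = :first)
    is_match(r) = getproperty(r, on) == getproperty(target_row, on) &&
                  filter_join(target_row, r)
    matches = filter(is_match, ref_rows)
    isempty(matches) && return missing   # assumption: no match yields missing
    sort!(matches; by = r -> getproperty(r, order))
    return mode === :first ? first(matches) : last(matches)
end

labs = [(SUBJID = "001", LAB_DATE = Date("2023-01-01"), LAB_VALUE = 100),
        (SUBJID = "001", LAB_DATE = Date("2023-01-10"), LAB_VALUE = 95),
        (SUBJID = "001", LAB_DATE = Date("2023-01-18"), LAB_VALUE = 88)]
visit = (SUBJID = "001", VISIT_DATE = Date("2023-01-15"))

# Latest lab result strictly before the visit date.
prior = pick_match(visit, labs; on = :SUBJID, order = :LAB_DATE,
                   filter_join = (t, r) -> r.LAB_DATE < t.VISIT_DATE,
                   mode = :last)
```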

Examples

julia> patients = DataFrame(SUBJID = ["001", "002", "003"], 
                            VISIT_DATE = [Date("2023-01-15"), Date("2023-01-20"), Date("2023-01-25")])
3×2 DataFrame
 Row │ SUBJID  VISIT_DATE 
     │ String  Date       
─────┼────────────────────
   1 │ 001     2023-01-15
   2 │ 002     2023-01-20
   3 │ 003     2023-01-25

julia> lab_data = DataFrame(SUBJID = ["001", "001", "001", "002", "002", "003"],
                            LAB_DATE = [Date("2023-01-01"), Date("2023-01-10"), Date("2023-01-18"), 
                                        Date("2023-01-05"), Date("2023-01-25"), Date("2023-01-20")],
                            LAB_VALUE = [100, 95, 88, 110, 105, 92])
6×3 DataFrame
 Row │ SUBJID  LAB_DATE    LAB_VALUE 
     │ String  Date        Int64     
─────┼───────────────────────────────
   1 │ 001     2023-01-01        100
   2 │ 001     2023-01-10         95
   3 │ 001     2023-01-18         88
   4 │ 002     2023-01-05        110
   5 │ 002     2023-01-25        105
   6 │ 003     2023-01-20         92

julia> join_columns(patients,
                    lab_data;
                    on = [:SUBJID],
                    order = [:LAB_DATE],
                    keep = [:LAB_VALUE => :PRIOR_LAB, :LAB_DATE => :PRIOR_LAB_DATE],
                    filter_join = (t, r) -> r.LAB_DATE < t.VISIT_DATE,
                    mode = :last)
3×4 DataFrame
 Row │ SUBJID  VISIT_DATE  PRIOR_LAB  PRIOR_LAB_DATE 
     │ String  Date        Int64?     Date?          
─────┼───────────────────────────────────────────────
   1 │ 001     2023-01-15         95  2023-01-10
   2 │ 002     2023-01-20        110  2023-01-05
   3 │ 003     2023-01-25         92  2023-01-20
ADaM.make_dose_interruptionMethod
make_dose_interruption(df::DataFrame)
  • Creates the dose interruption columns INTRFL (Interruption Flag) and INTRDUR (Interruption Duration) on the EX (dose) dataset, where doses are recorded as intervals.
  • If there is a gap of more than a day between the end date AENDT of one dosing interval and the start date ASTDT of the next interval within a subject, the interruption is recorded as an additional row. Where there is no interruption, these columns are missing.

The EX dataset needs to be prepared with the following columns:

  • ASTDT (Analysis Start Date)
  • AENDT (Analysis End Date)
  • FANLDTM (First Analyte dose DateTime)
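
The gap rule can be sketched with the Dates standard library alone. The helper interruptions is hypothetical. It assumes the intervals belong to one subject and are sorted by start date, and it counts the interruption duration in whole days, inclusive of both ends.

```julia
using Dates

# Hypothetical helper: find gaps of more than one day between consecutive
# dosing intervals, given start and end dates sorted by start date.
function interruptions(astdt::Vector{Date}, aendt::Vector{Date})
    gaps = NamedTuple{(:start, :stop, :days), Tuple{Date, Date, Int}}[]
    for i in 1:length(astdt)-1
        gap_start = aendt[i] + Day(1)       # first day without dosing
        gap_stop  = astdt[i + 1] - Day(1)   # last day without dosing
        if gap_stop >= gap_start
            days = Dates.value(gap_stop - gap_start) + 1
            push!(gaps, (start = gap_start, stop = gap_stop, days = days))
        end
    end
    return gaps
end

gaps = interruptions(Date.(["2025-01-01", "2025-02-01", "2025-03-03"]),
                     Date.(["2025-01-31", "2025-02-28", "2025-03-31"]))
```

For these dates the only gap runs from 2025-03-01 to 2025-03-02, two days long.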

Example

julia> df = DataFrame(USUBJID = "1", EXDOSFRQ = "QD", EVID = 1, ASTDT = Date.(["2025-01-01", "2025-02-01", "2025-03-03"]), AENDT = Date.(["2025-01-31", "2025-02-28", "2025-03-31"]), FANLDTM = Date.("2025-01-01"))
3×6 DataFrame
 Row │ USUBJID  EXDOSFRQ  EVID   ASTDT       AENDT       FANLDTM    
     │ String   String    Int64  Date        Date        Date       
─────┼──────────────────────────────────────────────────────────────
   1 │ 1        QD            1  2025-01-01  2025-01-31  2025-01-01
   2 │ 1        QD            1  2025-02-01  2025-02-28  2025-01-01
   3 │ 1        QD            1  2025-03-03  2025-03-31  2025-01-01

julia> make_dose_interruption(df)
4×10 DataFrame
 Row │ USUBJID  EXDOSFRQ  EVID   ASTDT       AENDT       FANLDTM     ASTDY  AENDY  INTRFL   INTRDUR 
     │ String   String    Int64  Date        Date        Date        Int64  Int64  String?  Int64?  
─────┼──────────────────────────────────────────────────────────────────────────────────────────────
   1 │ 1        QD            1  2025-01-01  2025-01-31  2025-01-01      1     31  missing  missing 
   2 │ 1        QD            1  2025-02-01  2025-02-28  2025-01-01     32     59  missing  missing 
   3 │ 1        QD            1  2025-03-01  2025-03-02  2025-01-01     60     61  Y              2
   4 │ 1        QD            1  2025-03-03  2025-03-31  2025-01-01     62     90  missing  missing
ADaM.make_dtmMethod
make_dtm(df::DataFrame, col; prefix, fmt)
  • Derives DateTime column from DateTime-convertible String/Date column in a DataFrame.
  • The DateTime column, named with the custom prefix (keyword argument) followed by the suffix 'DTM', will be of the format yyyy-mm-ddTHH:MM:SS.
  • Custom DateTime formats can be passed through the fmt keyword.
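
The conversions involved can be previewed with the Dates standard library on its own. This is only a sketch of the underlying parsing. Whether make_dtm calls these constructors in exactly this way is an assumption.

```julia
using Dates

# ISO-like strings parse with the default format; omitted fields default to
# the start of the period.
year_only = DateTime("2024")

# A Date is promoted to midnight of that day.
from_date = DateTime(Date(2025, 1, 1))

# Non-ISO strings need an explicit format; `u` matches an abbreviated month name.
custom = DateTime("2024-May-21T01:10", dateformat"yyyy-u-ddTHH:MM")
```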

Example

julia> df = DataFrame(DTC = ["2024", "2025", "2026"])
3×1 DataFrame
 Row │ DTC    
     │ String 
─────┼────────
   1 │ 2024
   2 │ 2025
   3 │ 2026

julia> make_dtm(df, "DTC")
3×2 DataFrame
 Row │ DTC     ADTM                
     │ String  DateTime            
─────┼─────────────────────────────
   1 │ 2024    2024-01-01T00:00:00
   2 │ 2025    2025-01-01T00:00:00
   3 │ 2026    2026-01-01T00:00:00

julia> df = DataFrame(DTC = Date.(["2024", "2025", "2026"]))
3×1 DataFrame
 Row │ DTC        
     │ Date       
─────┼────────────
   1 │ 2024-01-01
   2 │ 2025-01-01
   3 │ 2026-01-01

julia> make_dtm(df, "DTC", prefix = "B")
3×2 DataFrame
 Row │ DTC         BDTM                
     │ Date        DateTime            
─────┼─────────────────────────────────
   1 │ 2024-01-01  2024-01-01T00:00:00
   2 │ 2025-01-01  2025-01-01T00:00:00
   3 │ 2026-01-01  2026-01-01T00:00:00

Input col and prefix can be passed as either String or Symbol.

julia> df = DataFrame(DTC = ["2024", "2025", "2026"])
3×1 DataFrame
 Row │ DTC    
     │ String 
─────┼────────
   1 │ 2024
   2 │ 2025
   3 │ 2026

julia> make_dtm(df, :DTC, prefix = :AST)
3×2 DataFrame
 Row │ DTC     ASTDTM              
     │ String  DateTime            
─────┼─────────────────────────────
   1 │ 2024    2024-01-01T00:00:00
   2 │ 2025    2025-01-01T00:00:00
   3 │ 2026    2026-01-01T00:00:00 

For a different DateTime format (any valid Julia date format can be used):

julia> df = DataFrame(DTC = ["2024-May-21T01:10", "2024-Jun-22T02:20", "2024-Aug-23T03:30"])
3×1 DataFrame
 Row │ DTC               
     │ String            
─────┼───────────────────
   1 │ 2024-May-21T01:10
   2 │ 2024-Jun-22T02:20
   3 │ 2024-Aug-23T03:30

julia> make_dtm(df, "DTC", prefix = "C", fmt = "yyyy-u-ddTHH:MM")
3×2 DataFrame
 Row │ DTC                CDTM                
     │ String             DateTime            
─────┼────────────────────────────────────────
   1 │ 2024-May-21T01:10  2024-05-21T01:10:00
   2 │ 2024-Jun-22T02:20  2024-06-22T02:20:00
   3 │ 2024-Aug-23T03:30  2024-08-23T03:30:00

julia> make_dtm(df, :DTC, prefix = "C", fmt = dateformat"yyyy-u-ddTHH:MM") # dateformat object
3×2 DataFrame
 Row │ DTC                CDTM                
     │ String             DateTime            
─────┼────────────────────────────────────────
   1 │ 2024-May-21T01:10  2024-05-21T01:10:00
   2 │ 2024-Jun-22T02:20  2024-06-22T02:20:00
   3 │ 2024-Aug-23T03:30  2024-08-23T03:30:00
ADaM.make_dtm_to_dtMethod
make_dtm_to_dt(df::DataFrame, col; prefix)
  • Derives Date column from DateTime column in a DataFrame.
  • The Date column with suffix 'DT' will be of the format yyyy-mm-dd.
  • Input col and prefix can be passed as either String or Symbol.

Example

julia> df = DataFrame(ADTM = DateTime.(["2024-10-21T01:10", "2024-11-22T02:20", "2024-12-23T03:30"]))
3×1 DataFrame
 Row │ ADTM                
     │ DateTime            
─────┼─────────────────────
   1 │ 2024-10-21T01:10:00
   2 │ 2024-11-22T02:20:00
   3 │ 2024-12-23T03:30:00

julia> make_dtm_to_dt(df, "ADTM", prefix = "A")
3×2 DataFrame
 Row │ ADTM                 ADT        
     │ DateTime             Date       
─────┼─────────────────────────────────
   1 │ 2024-10-21T01:10:00  2024-10-21
   2 │ 2024-11-22T02:20:00  2024-11-22
   3 │ 2024-12-23T03:30:00  2024-12-23

julia> make_dtm_to_dt(df, "ADTM")
3×2 DataFrame
 Row │ ADTM                 ADT        
     │ DateTime             Date       
─────┼─────────────────────────────────
   1 │ 2024-10-21T01:10:00  2024-10-21
   2 │ 2024-11-22T02:20:00  2024-11-22
   3 │ 2024-12-23T03:30:00  2024-12-23
ADaM.make_dtm_to_tmMethod
make_dtm_to_tm(df::DataFrame, col; prefix)
  • Derives Time column from DateTime column in a DataFrame.
  • The Time column with suffix 'TM' will be of the format hh:mm:ss.
  • Input col and prefix can be passed as either String or Symbol.

Example

julia> df = DataFrame(ADTM = DateTime.(["2024-10-21T01:10", "2024-11-22T02:20", "2024-12-23T03:30"]))
3×1 DataFrame
 Row │ ADTM                
     │ DateTime            
─────┼─────────────────────
   1 │ 2024-10-21T01:10:00
   2 │ 2024-11-22T02:20:00
   3 │ 2024-12-23T03:30:00

julia> make_dtm_to_tm(df, "ADTM", prefix = "A")
3×2 DataFrame
 Row │ ADTM                 ATM      
     │ DateTime             Time     
─────┼───────────────────────────────
   1 │ 2024-10-21T01:10:00  01:10:00
   2 │ 2024-11-22T02:20:00  02:20:00
   3 │ 2024-12-23T03:30:00  03:30:00

julia> make_dtm_to_tm(df, "ADTM")
3×2 DataFrame
 Row │ ADTM                 ATM      
     │ DateTime             Time     
─────┼───────────────────────────────
   1 │ 2024-10-21T01:10:00  01:10:00
   2 │ 2024-11-22T02:20:00  02:20:00
   3 │ 2024-12-23T03:30:00  03:30:00
ADaM.make_durationMethod
make_duration(; start_dtm, end_dtm, output_unit)
make_duration(df, col; start_dtm, end_dtm, output_unit)

Computes the duration between two dates or datetimes and returns the result in the specified unit. Dates can be passed as scalars, or as vectors via DataFrame columns.

Arguments:

  • start_dtm: The starting date or datetime.
  • end_dtm: The ending date or datetime.
  • output_unit: The unit for the output duration (e.g., :h for hours, :min for minutes). Default is :h.

Valid time units are those supported by DynamicQuantities.jl.

Returns:

  • The DataFrame with new columns (col) for the duration, its unit, and the stripped value.
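
The underlying calculation can be sketched without any unit package. The helper duration_hours is hypothetical and returns a plain Float64 number of hours, whereas make_duration attaches units through DynamicQuantities.jl.

```julia
using Dates

# Hypothetical helper: elapsed time between two Date or DateTime values, in hours.
function duration_hours(start_dtm, end_dtm)
    # Promote both inputs to DateTime; their difference is in milliseconds.
    elapsed_ms = Dates.value(DateTime(end_dtm) - DateTime(start_dtm))
    return elapsed_ms / 3_600_000
end

h1 = duration_hours(DateTime(2024, 1, 1, 8), DateTime(2024, 1, 1, 10))
h2 = duration_hours(Date(2024, 1, 1), Date(2024, 1, 2))
d2 = h2 / 24   # the same duration expressed in days
```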

Examples

Scalar Dates

julia> make_duration(start_dtm=DateTime(2024, 1, 1, 8), end_dtm=DateTime(2024, 1, 1, 10), output_unit=:h)
2.0 h

julia> make_duration(start_dtm=Date(2024, 1, 1), end_dtm=Date(2024, 1, 2), output_unit=:h)
24.0 h

Date columns

julia> df = DataFrame(ASTDT = [Date(2024,1,1), Date(2024,2,1)], 
                      AENDT = [Date(2024,3,1), Date(2024,4,1)])
2×2 DataFrame
 Row │ ASTDT       AENDT      
     │ Date        Date       
─────┼────────────────────────
   1 │ 2024-01-01  2024-03-01
   2 │ 2024-02-01  2024-04-01

julia> make_duration(df, :ADUR; start_dtm=:ASTDT, end_dtm=:AENDT, output_unit=:day)
2×4 DataFrame
 Row │ ASTDT       AENDT       ADUR     ADURU     
     │ Date        Date        Float64  Symbolic… 
─────┼────────────────────────────────────────────
   1 │ 2024-01-01  2024-03-01     60.0  day
   2 │ 2024-02-01  2024-04-01     60.0  day

DateTime columns

julia> df = DataFrame(ASTDTM = [DateTime(2024,1,1,8), DateTime(2024,1,1,9)], 
                      AENDTM = [DateTime(2024,1,1,10), DateTime(2024,1,1,12)])
2×2 DataFrame
 Row │ ASTDTM               AENDTM              
     │ DateTime             DateTime            
─────┼──────────────────────────────────────────
   1 │ 2024-01-01T08:00:00  2024-01-01T10:00:00
   2 │ 2024-01-01T09:00:00  2024-01-01T12:00:00

julia> make_duration(df, :ADUR; start_dtm=:ASTDTM, end_dtm=:AENDTM, output_unit=:h)
2×4 DataFrame
 Row │ ASTDTM               AENDTM               ADUR     ADURU     
     │ DateTime             DateTime             Float64  Symbolic… 
─────┼──────────────────────────────────────────────────────────────
   1 │ 2024-01-01T08:00:00  2024-01-01T10:00:00      2.0  h
   2 │ 2024-01-01T09:00:00  2024-01-01T12:00:00      3.0  h
   
julia> make_duration(df, :ADUR; start_dtm=:ASTDTM, end_dtm=:AENDTM, output_unit=:day)
2×4 DataFrame
 Row │ ASTDTM               AENDTM               ADUR       ADURU     
     │ DateTime             DateTime             Float64    Symbolic… 
─────┼────────────────────────────────────────────────────────────────
   1 │ 2024-01-01T08:00:00  2024-01-01T10:00:00  0.0833333  day
   2 │ 2024-01-01T09:00:00  2024-01-01T12:00:00  0.125      day
ADaM.make_nominal_timeMethod
make_nominal_time(pc, ex, domain; visitdy_col, atptn_col)

Derives the nominal time column NFRLT for PC and EX SDTM datasets.

Arguments

  • pc: PC DataFrame (must contain ATPTN)
  • ex: EX DataFrame
  • domain: The output DataFrame (PC or EX)
  • visitdy_col: Column name for VISITDY (Symbol or String, default = :VISITDY)
  • atptn_col: Column name for ATPTN (Symbol or String, default = :ATPTN)

Output

  • Returns a DataFrame (PC or EX) with a new NFRLT column representing nominal time in hours.

Notes

  • For PC: NFRLT = (VISITDY - 1) * 24 + ATPTN
  • For EX: NFRLT = (VISITDY - 1) * 24
  • If VISITDY is missing in either dataset, it is joined from the other dataset using common keys (excluding ATPTN for PC).
  • Throws an error if required columns are missing or if join keys cannot be determined.
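
The two formulas can be checked with a pair of one-line functions. The names nfrlt_pc and nfrlt_ex are illustrative only and are not part of ADaM.

```julia
# NFRLT in hours: VISITDY = 1 maps to 0 h, and PC adds the planned time point.
nfrlt_pc(visitdy, atptn) = (visitdy - 1) * 24 + atptn
nfrlt_ex(visitdy) = (visitdy - 1) * 24

pc_times = nfrlt_pc.([2, 3], [0, 24])
ex_times = nfrlt_ex.([2, 3])
```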

Examples

# Example PC and EX DataFrames
julia> pc = DataFrame(DOMAIN = ["PC", "PC"], USUBJID = ["01", "01"], VISITDY = [2, 3], ATPTN = [0, 24])
2×4 DataFrame
 Row │ DOMAIN  USUBJID  VISITDY  ATPTN 
     │ String  String   Int64    Int64 
─────┼─────────────────────────────────
   1 │ PC      01             2      0
   2 │ PC      01             3     24

julia> ex = DataFrame(DOMAIN = ["EX", "EX"], USUBJID = ["01", "01"], VISITDY = [2, 3])
2×3 DataFrame
 Row │ DOMAIN  USUBJID  VISITDY 
     │ String  String   Int64   
─────┼──────────────────────────
   1 │ EX      01             2
   2 │ EX      01             3

# Example 1: Standard usage for PC
julia> make_nominal_time(pc, ex, "PC")
2×5 DataFrame
 Row │ DOMAIN  USUBJID  VISITDY  ATPTN  NFRLT 
     │ String  String   Int64    Int64  Int64 
─────┼────────────────────────────────────────
   1 │ PC      01             2      0     24
   2 │ PC      01             3     24     72

# Example 1: Standard usage for EX
julia> make_nominal_time(pc, ex, "EX")
2×4 DataFrame
 Row │ DOMAIN  USUBJID  VISITDY  NFRLT 
     │ String  String   Int64    Int64 
─────┼─────────────────────────────────
   1 │ EX      01             2     24
   2 │ EX      01             3     48

# Example 2: Custom column names
julia> rename!(pc, :VISITDY => :VISDY, :ATPTN => :TIMEPT)
2×4 DataFrame
 Row │ DOMAIN  USUBJID  VISDY  TIMEPT 
     │ String  String   Int64  Int64  
─────┼────────────────────────────────
   1 │ PC      01           2       0
   2 │ PC      01           3      24

julia> make_nominal_time(pc, ex, "PC"; visitdy_col=:VISDY, atptn_col=:TIMEPT)
2×5 DataFrame
 Row │ DOMAIN  USUBJID  VISDY  TIMEPT  NFRLT 
     │ String  String   Int64  Int64   Int64 
─────┼───────────────────────────────────────
   1 │ PC      01           2       0     24
   2 │ PC      01           3      24     72
ADaM.make_time_flMethod
make_time_fl(df::DataFrame, col; group, order)

This function creates flag columns, FN (Flag Numeric) and FC (Flag Comment), that flag the time information based on the following criteria:

  • Missing Times
  • Negative Times
  • Non-ascending Times

The points to consider for data validation are described at: https://www.lexjansen.com/phuse-us/2020/as/AS06.pdf#page=3

The order and group variables can be used while flagging.
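
A minimal sketch of the three checks for a single group is given here. The helper flag_times is hypothetical. It assumes the values are already in the desired order, that a missing value takes precedence over a negative one, and that a value counts as non-ascending when it is strictly smaller than the last non-missing value. The package may resolve these details differently.

```julia
# Hypothetical helper: flag one group's times, already in the desired order.
# Returns a numeric flag (0 or 1) and a comment for each value.
function flag_times(times)
    fn = Int[]
    fc = String[]
    prev = missing                      # last non-missing time seen so far
    for t in times
        if ismissing(t)
            push!(fn, 1); push!(fc, "Missing Time")
            continue                    # a missing value does not update prev
        elseif t < 0
            push!(fn, 1); push!(fc, "Negative Time")
        elseif !ismissing(prev) && t < prev
            push!(fn, 1); push!(fc, "Non-ascending Time")
        else
            push!(fn, 0); push!(fc, "")
        end
        prev = t
    end
    return fn, fc
end

fn, fc = flag_times([0, -1, -2, 3, missing, 5, 4])
```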

Example

julia> df = DataFrame(USUBJID = 1, AFRLT = [0, -1, -2, 3, missing, 5, 4])
7×2 DataFrame
 Row │ USUBJID  AFRLT   
     │ Int64    Int64?  
─────┼──────────────────
   1 │       1        0
   2 │       1       -1
   3 │       1       -2
   4 │       1        3
   5 │       1  missing 
   6 │       1        5
   7 │       1        4

julia> make_time_fl(df, :AFRLT, group = [], order = []) # group variables can be kept empty for dataframe without group.
7×4 DataFrame
 Row │ USUBJID  AFRLT    AFRLTFN  AFRLTFC            
     │ Int64    Int64?   Int64    String             
─────┼───────────────────────────────────────────────
   1 │       1        0        0
   2 │       1       -1        1  Negative Time
   3 │       1       -2        1  Negative Time
   4 │       1        3        0
   5 │       1  missing        1  Missing Time
   6 │       1        5        0
   7 │       1        4        1  Non-ascending Time

julia> df1 = DataFrame(USUBJID = [1,1,1,2,2,2], NFRLT = [missing, 2, 1, 0, 1, -1])
6×2 DataFrame
 Row │ USUBJID  NFRLT   
     │ Int64    Int64?  
─────┼──────────────────
   1 │       1  missing 
   2 │       1        2
   3 │       1        1
   4 │       2        0
   5 │       2        1
   6 │       2       -1

julia> make_time_fl(df1, "NFRLT", group=["USUBJID"])
6×4 DataFrame
 Row │ USUBJID  NFRLT    NFRLTFN  NFRLTFC            
     │ Int64    Int64?   Int64    String             
─────┼───────────────────────────────────────────────
   1 │       1  missing        1  Missing Time
   2 │       1        2        0
   3 │       1        1        1  Non-ascending Time
   4 │       2        0        0
   5 │       2        1        0
   6 │       2       -1        1  Negative Time
ADaM.merge_columnsMethod
merge_columns(df1, df2; filter_func, keep, kwargs...)
  • Left-joins new column(s) from a secondary dataset onto the primary dataset.

Positional Arguments

  • df1: Main DataFrame to join to
  • df2: DataFrame to leftjoin columns from

Keyword Arguments

  • keep: Vector of source => destination pairs for column mapping (default: all reference columns)
  • filter_func: Function to filter rows from secondary dataset (default: include all)
  • kwargs...: Keyword arguments of the leftjoin function can be passed, e.g. on, makeunique, etc.
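
How the three keyword arguments fit together can be sketched with plain DataFrames.jl calls (filter, select, leftjoin). This is only an illustration of the idea under the assumption that the steps run in this order; it is not the package's source, and the small dm and vs tables are made up for the sketch.

```julia
using DataFrames

dm = DataFrame(USUBJID = ["01", "02"], SEX = ["M", "F"])
vs = DataFrame(
    USUBJID  = ["01", "01", "02", "02"],
    VSTESTCD = ["WEIGHT", "HEIGHT", "WEIGHT", "HEIGHT"],
    VSSTRESN = [50, 150, 60, 160],
)

# 1. filter_func: keep only the wanted rows of the secondary dataset
height = filter(r -> r.VSTESTCD == "HEIGHT", vs)

# 2. keep: retain the join key plus the source => destination renames
height = select(height, :USUBJID, :VSSTRESN => :HTBL)

# 3. the remaining keyword arguments (here `on`) are forwarded to leftjoin
merged = leftjoin(dm, height; on = :USUBJID)
```

Filtering before the join is what keeps one row per subject; without it, every matching row of the secondary dataset is joined.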

Example

julia> dm = DataFrame(USUBJID = ["01-701-1015", "01-701-1023", "01-701-1028"], SEX = ["M", "F", "F"], COUNTRY = ["USA", "USA", "NOR"])
3×3 DataFrame
 Row │ USUBJID      SEX     COUNTRY 
     │ String       String  String  
─────┼──────────────────────────────
   1 │ 01-701-1015  M       USA
   2 │ 01-701-1023  F       USA
   3 │ 01-701-1028  F       NOR

julia> vs = DataFrame(
    [
        ("01-701-1015", "WEIGHT", 50, "kg"),
        ("01-701-1015", "HEIGHT", 150, "cm"),
        ("01-701-1023", "WEIGHT", 60, "kg"),
        ("01-701-1023", "HEIGHT", 160, "cm"),
        ("01-701-1028", "WEIGHT", 70, "kg"),
        ("01-701-1028", "HEIGHT", 170, "cm"),
    ],
    [:USUBJID, :VSTESTCD, :VSSTRESN, :VSSTRESU],
)
6×4 DataFrame
 Row │ USUBJID      VSTESTCD  VSSTRESN  VSSTRESU 
     │ String       String    Int64     String   
─────┼───────────────────────────────────────────
   1 │ 01-701-1015  WEIGHT          50  kg
   2 │ 01-701-1015  HEIGHT         150  cm
   3 │ 01-701-1023  WEIGHT          60  kg
   4 │ 01-701-1023  HEIGHT         160  cm
   5 │ 01-701-1028  WEIGHT          70  kg
   6 │ 01-701-1028  HEIGHT         170  cm

julia> merge_ht = merge_columns(
            dm, 
            vs, 
            filter_func = r -> r.VSTESTCD == "HEIGHT", 
            keep = [:VSSTRESN => :HTBL, :VSSTRESU => :HTBLU], 
            on = [:USUBJID]
        )
3×5 DataFrame
 Row │ USUBJID      SEX     COUNTRY  HTBL    HTBLU   
     │ String       String  String   Int64?  String? 
─────┼───────────────────────────────────────────────
   1 │ 01-701-1015  M       USA         150  cm
   2 │ 01-701-1023  F       USA         160  cm
   3 │ 01-701-1028  F       NOR         170  cm

julia> merge_ht_wt = merge_columns(
            merge_ht, 
            vs, 
            filter_func = r -> r.VSTESTCD == "WEIGHT", 
            keep = ["VSSTRESN" => "WTBL", "VSSTRESU" => "WTBLU"], 
            on = ["USUBJID"]
        )
3×7 DataFrame
 Row │ USUBJID      SEX     COUNTRY  HTBL    HTBLU    WTBL    WTBLU   
     │ String       String  String   Int64?  String?  Int64?  String? 
─────┼────────────────────────────────────────────────────────────────
   1 │ 01-701-1015  M       USA         150  cm           50  kg
   2 │ 01-701-1023  F       USA         160  cm           60  kg
   3 │ 01-701-1028  F       NOR         170  cm           70  kg

julia> merge_columns(
            dm, 
            vs, 
            filter_func = r -> r.VSTESTCD == "HEIGHT", 
            on = [:USUBJID]
        ) # without specifying keep
3×6 DataFrame
 Row │ USUBJID      SEX     COUNTRY  VSTESTCD  VSSTRESN  VSSTRESU 
     │ String       String  String   String?   Int64?    String?  
─────┼────────────────────────────────────────────────────────────
   1 │ 01-701-1015  M       USA      HEIGHT         150  cm
   2 │ 01-701-1023  F       USA      HEIGHT         160  cm
   3 │ 01-701-1028  F       NOR      HEIGHT         170  cm

julia> merge_columns(
            dm, 
            vs, 
            on = [:USUBJID]
        ) # without specifying keep and filter_func
6×6 DataFrame
 Row │ USUBJID      SEX     COUNTRY  VSTESTCD  VSSTRESN  VSSTRESU 
     │ String       String  String   String?   Int64?    String?  
─────┼────────────────────────────────────────────────────────────
   1 │ 01-701-1015  M       USA      WEIGHT          50  kg
   2 │ 01-701-1015  M       USA      HEIGHT         150  cm
   3 │ 01-701-1023  F       USA      WEIGHT          60  kg
   4 │ 01-701-1023  F       USA      HEIGHT         160  cm
   5 │ 01-701-1028  F       NOR      WEIGHT          70  kg
   6 │ 01-701-1028  F       NOR      HEIGHT         170  cm
ADaM.merge_covariatesMethod
merge_covariates(df1, df2; domain, covariates, filter_func, baseline=true, kwargs...)
  • Iteratively merges covariate columns from a secondary DataFrame (df2) into a primary DataFrame (df1) using ADaM.merge_columns for each covariate in the covariates vector.
  • This is designed for SDTM-style domains (e.g., LB, VS) where a test code column (e.g., LBTESTCD) identifies the covariate.

Positional Arguments

  • df1: Main DataFrame to join to (primary dataset)
  • df2: DataFrame to leftjoin columns from (secondary dataset)

Keyword Arguments

  • domain: domain prefix (String or Symbol) used to construct test code and value column names (e.g., LBTESTCD, LBSTRESN, LBSTRESU)
  • covariates: Vector of covariate names to merge iteratively
  • filter_func: Function to filter rows from the secondary dataset for each covariate.
  • baseline: If true, appends "BL"/"BLU" to output column names (default: true)
  • kwargs...: Keyword arguments of the leftjoin function can be passed, e.g. on, makeunique, etc.
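
The column names described above can be derived as in the following sketch. It covers only the baseline = true case; keep_pairs is a hypothetical helper, and the assumption is that each covariate is translated into a keep vector of the kind merge_columns accepts.

```julia
domain = "LB"

# Columns constructed from the domain prefix
testcd_col = Symbol(domain, "TESTCD")   # :LBTESTCD
value_col  = Symbol(domain, "STRESN")   # :LBSTRESN
unit_col   = Symbol(domain, "STRESU")   # :LBSTRESU

# Hypothetical helper: source => destination pairs for one covariate (baseline = true)
keep_pairs(cov) = [value_col => Symbol(cov, "BL"), unit_col => Symbol(cov, "BLU")]

pairs_ast = keep_pairs("AST")
```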

Examples

julia> dm = DataFrame(USUBJID = ["01-701-1015"], SEX = ["M"], COUNTRY = ["USA"])
1×3 DataFrame
 Row │ USUBJID      SEX     COUNTRY 
     │ String       String  String  
─────┼──────────────────────────────
   1 │ 01-701-1015  M       USA

julia> lb = DataFrame(USUBJID = fill("01-701-1015", 16),
                    LBTESTCD = repeat(["AST", "ALT", "CREAT", "BILI"], inner = 4),
                    LBSTRESN = [35, 30, 28, 33, 40, 32, 29, 35, 1, 0, 1, 1, 0, 1, 1, 0],
                    LBSTRESU = vcat(fill("U/L", 8), fill("mg/dL", 8)),
                    LBBLFL = repeat(["Y", "", "", ""], 4),
                )
16×5 DataFrame
 Row │ USUBJID      LBTESTCD  LBSTRESN  LBSTRESU  LBBLFL 
     │ String       String    Int64     String    String 
─────┼───────────────────────────────────────────────────
   1 │ 01-701-1015  AST             35  U/L       Y
   2 │ 01-701-1015  AST             30  U/L
   3 │ 01-701-1015  AST             28  U/L
   4 │ 01-701-1015  AST             33  U/L
  ⋮  │      ⋮          ⋮         ⋮         ⋮        ⋮
  13 │ 01-701-1015  BILI             0  mg/dL     Y
  14 │ 01-701-1015  BILI             1  mg/dL
  15 │ 01-701-1015  BILI             1  mg/dL
  16 │ 01-701-1015  BILI             0  mg/dL
                                           8 rows omitted

julia> merge_covariates(
            dm,
            lb;
            domain = "LB",
            covariates = ["AST", "ALT", "CREAT", "BILI"],
            filter_func = r -> coalesce(r.LBBLFL, "") == "Y",
            on = [:USUBJID],
        )
1×11 DataFrame
 Row │ USUBJID      SEX     COUNTRY  ASTBL   ASTBLU   ALTBL   ALTBLU   CREATBL  CREATBLU  BILIBL  BILIBLU 
     │ String       String  String   Int64?  String?  Int64?  String?  Int64?   String?   Int64?  String? 
─────┼────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ 01-701-1015  M       USA          35  U/L          40  U/L            1  mg/dL          0  mg/dL

If filter_func is omitted, the baseline filter is determined automatically when the appropriate columns are present (LBBLFL == "Y", LBLOBXFL == "Y", VISIT == "Screening").

julia> merge_covariates(
            dm,
            lb;
            domain = "LB",
            covariates = ["AST", "ALT", "CREAT", "BILI"],
            on = [:USUBJID],
        )
1×11 DataFrame
 Row │ USUBJID      SEX     COUNTRY  ASTBL   ASTBLU   ALTBL   ALTBLU   CREATBL  CREATBLU  BILIBL  BILIBLU 
     │ String       String  String   Int64?  String?  Int64?  String?  Int64?   String?   Int64?  String? 
─────┼────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ 01-701-1015  M       USA          35  U/L          40  U/L            1  mg/dL          0  mg/dL
ADaM.nci_liver_scoreMethod
nci_liver_score(bili::Number, ast::Number, uln_bili::Number, uln_ast::Number; kwargs...)
nci_liver_score(bili::Quantity, ast::Number, uln_bili::Quantity, uln_ast::Number; kwargs...)
nci_liver_score(df::DataFrame; kwargs...)

Computes the NCI (National Cancer Institute) hepatic impairment score (numeric and categorical) from bilirubin and AST values, which can be provided as Quantities, scalars, or vectors via a DataFrame. Refer to the NCI criteria.

Arguments

  • bili: Bilirubin value.
  • ast: AST value.
  • uln_bili: Upper limit of normal for bilirubin.
  • uln_ast: Upper limit of normal for AST.
  • bili_unit: Bilirubin unit (default: "umol/L").
  • ast_unit: AST unit (default: "U/L").

Default Columns

The following default column names are used for DataFrame input:

  • bili = :BILIBL
  • ast = :ASTBL
  • uln_bili = :ULN_BILIBL
  • uln_ast = :ULN_ASTBL
  • bili_unit = :BILIBLU
  • ast_unit = :ASTBLU
  • col = :NCI

Output

  • Numeric NCI liver score (1=Normal, 2=Mild, 3=Moderate, 4=Severe, or missing)
  • Categorical NCI liver score ("Normal", "Mild", "Moderate", "Severe", or missing)

The result is returned as DataFrame columns (DataFrame input) or as a 2-element Tuple (scalar input).

Notes

  • The function implements the NCI hepatic impairment criteria:
    • Severe: bili > 3 * uln_bili
    • Moderate: 1.5 * uln_bili < bili <= 3 * uln_bili
    • Mild: (bili > uln_bili && bili <= 1.5 * uln_bili) || (ast > uln_ast)
    • Normal: bili <= uln_bili && ast <= uln_ast
  • Returns missing for both outputs if any argument is missing or does not match a category.
  • Output columns can be customized via the col keyword argument.
  • The units of bili and uln_bili must match; otherwise, an error is thrown.
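
The criteria listed above can be written as a small standalone function. This is a sketch, not the package's code: nci_category is a hypothetical name, units are ignored, and checking the more severe category first when several conditions hold is an assumption.

```julia
# Hypothetical helper (not part of ADaM): classify one set of unit-free values.
function nci_category(bili, ast, uln_bili, uln_ast)
    # Any missing input gives missing for both outputs
    any(ismissing, (bili, ast, uln_bili, uln_ast)) && return (missing, missing)
    if bili > 3 * uln_bili
        return (4, "Severe")
    elseif bili > 1.5 * uln_bili              # and bili <= 3 * uln_bili
        return (3, "Moderate")
    elseif bili > uln_bili || ast > uln_ast   # bili <= 1.5 * uln_bili here
        return (2, "Mild")
    else
        return (1, "Normal")
    end
end

severe = nci_category(7.0, 20.0, 2.0, 30.0)
mild   = nci_category(2.2, 20.0, 2.0, 15.0)
```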

Examples

julia> nci_liver_score(7.0, 20.0, 2.0, 30.0, bili_unit="mg/dL", ast_unit="U/L")
(4, "Severe")

julia> nci_liver_score(4.0, 20.0, 2.0, 30.0)
(3, "Moderate")

julia> nci_liver_score(2.2, 20.0, 2.0, 15.0)
(2, "Mild")

julia> nci_liver_score(1.0, 10.0, 2.0, 15.0)
(1, "Normal")

julia> nci_liver_score(7us"mg/dL", 20, 2us"mg/dL", 30)
(4, "Severe")

julia> nci_liver_score(4us"mg/dL", 20, 2us"mg/dL", 30)
(3, "Moderate")

julia> nci_liver_score(2.2us"mg/dL", 20, 2us"mg/dL", 15)
(2, "Mild")

julia> nci_liver_score(1us"mg/dL", 10, 2us"mg/dL", 15)
(1, "Normal")

julia> df = DataFrame(BILIBL = [7.0, 4.0, 2.2, 1.0, missing],
                    ASTBL = [20.0, 20.0, 20.0, 10.0, 10.0],
                    ULN_BILIBL = [2.0, 2.0, 2.0, 2.0, 2.0],
                    ULN_ASTBL = [30.0, 30.0, 15.0, 15.0, 15.0],
                    BILIBLU = "mg/dL",
                    ASTBLU = "U/L")
5×6 DataFrame
 Row │ BILIBL     ASTBL    ULN_BILIBL  ULN_ASTBL  BILIBLU  ASTBLU 
     │ Float64?   Float64  Float64     Float64    String   String 
─────┼────────────────────────────────────────────────────────────
   1 │       7.0     20.0         2.0       30.0  mg/dL    U/L
   2 │       4.0     20.0         2.0       30.0  mg/dL    U/L
   3 │       2.2     20.0         2.0       15.0  mg/dL    U/L
   4 │       1.0     10.0         2.0       15.0  mg/dL    U/L
   5 │ missing       10.0         2.0       15.0  mg/dL    U/L

julia> nci_liver_score(df)
5×8 DataFrame
 Row │ BILIBL     ASTBL    ULN_BILIBL  ULN_ASTBL  BILIBLU  ASTBLU  NCIN     NCIC     
     │ Float64?   Float64  Float64     Float64    String   String  Int64?   String?  
─────┼───────────────────────────────────────────────────────────────────────────────
   1 │       7.0     20.0         2.0       30.0  mg/dL    U/L           4  Severe
   2 │       4.0     20.0         2.0       30.0  mg/dL    U/L           3  Moderate
   3 │       2.2     20.0         2.0       15.0  mg/dL    U/L           2  Mild
   4 │       1.0     10.0         2.0       15.0  mg/dL    U/L           1  Normal
   5 │ missing       10.0         2.0       15.0  mg/dL    U/L     missing  missing

julia> nci_liver_score(df, col=:MYNCI)
5×8 DataFrame
 Row │ BILIBL     ASTBL    ULN_BILIBL  ULN_ASTBL  BILIBLU  ASTBLU  MYNCIN   MYNCIC   
     │ Float64?   Float64  Float64     Float64    String   String  Int64?   String?  
─────┼───────────────────────────────────────────────────────────────────────────────
   1 │       7.0     20.0         2.0       30.0  mg/dL    U/L           4  Severe
   2 │       4.0     20.0         2.0       30.0  mg/dL    U/L           3  Moderate
   3 │       2.2     20.0         2.0       15.0  mg/dL    U/L           2  Mild
   4 │       1.0     10.0         2.0       15.0  mg/dL    U/L           1  Normal
   5 │ missing       10.0         2.0       15.0  mg/dL    U/L     missing  missing
ADaM.round_columnsMethod
round_columns(df::DataFrame, digit::Int64)

Rounds the values in Float columns to the specified number of digits.

Example

julia> df = DataFrame(Col1 = [1, 1.0023], Col2 = [3.14159, 2], Col3 = [3, 1.7], Col4 = [1, 4])
2×4 DataFrame
 Row │ Col1     Col2     Col3     Col4  
     │ Float64  Float64  Float64  Int64 
─────┼──────────────────────────────────
   1 │  1.0     3.14159      3.0      1
   2 │  1.0023  2.0          1.7      4


julia> round_columns(df, 2)
2×4 DataFrame
 Row │ Col1     Col2     Col3     Col4    
     │ Float64  Float64  Float64  Float64 
─────┼────────────────────────────────────
   1 │     1.0     3.14      3.0      1.0
   2 │     1.0     2.0       1.7      4.0
ADaM.set_exclusionMethod
set_exclusion(df, comment; excl_func, group, order)
  • Flags DataFrame rows as excluded based on the excl_func condition and the group variables.
  • The exclusion comment can be passed as a String or taken from a DataFrame column (Symbol).

Keyword Arguments

  • excl_func: Function condition based on which rows of a dataframe are flagged as excluded.
  • group: Grouping variable(s) for exclusion logic.
  • order: Ordering variable(s) for sorting within groups.
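
The group-wise logic can be sketched without any packages: the condition is evaluated once per group, and every row of a group that satisfies it is flagged. exclusion_flags is a hypothetical helper that works on plain vectors and is not how ADaM implements set_exclusion.

```julia
# Hypothetical helper (not part of ADaM): 1 for every row of a group where
# excl_func holds for that group's values, 0 otherwise.
function exclusion_flags(group_keys::AbstractVector, values::AbstractVector, excl_func)
    flags = zeros(Int, length(group_keys))
    for key in unique(group_keys)
        rows = findall(==(key), group_keys)   # row indices of this group
        if excl_func(values[rows])
            flags[rows] .= 1
        end
    end
    return flags
end

ids  = [1, 1, 1, 2, 2, 2, 3, 3, 3]
conc = [missing, missing, missing, 20, 40, 60, 30, missing, 70]

all_missing_conc = exclusion_flags(ids, conc, v -> all(ismissing, v))
```

Subject 3 has one missing concentration but is not flagged, because the condition must hold for the whole group.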

Example

julia> df = DataFrame(
        ID = [1,1,1,2,2,2,3,3,3], 
        SEQ = [1,2,3,1,2,3,1,2,3],
        EVID = [1,1,1,0,0,0,0,1,0],  
        STAT = ["NA","NA", "NA", missing, missing, missing, missing, "NA", missing],
        CONC = [missing,missing,missing,20,40,60,30,missing,70],
    )
9×5 DataFrame
 Row │ ID     SEQ    EVID   STAT     CONC    
     │ Int64  Int64  Int64  String?  Int64?  
─────┼───────────────────────────────────────
   1 │     1      1      1  NA       missing 
   2 │     1      2      1  NA       missing 
   3 │     1      3      1  NA       missing 
   4 │     2      1      0  missing       20
   5 │     2      2      0  missing       40
   6 │     2      3      0  missing       60
   7 │     3      1      0  missing       30
   8 │     3      2      1  NA       missing 
   9 │     3      3      0  missing       70

# Exclusion 1: Subjects with missing conc. data
julia> set_exclusion(df, "All missing conc subs", excl_func = group -> all(ismissing, group.CONC), group = :ID)
9×7 DataFrame
 Row │ ID     SEQ    EVID   STAT     CONC     EXCLFCOM               EXCLF 
     │ Int64  Int64  Int64  String?  Int64?   String?                Int64 
─────┼─────────────────────────────────────────────────────────────────────
   1 │     1      1      1  NA       missing  All missing conc subs      1
   2 │     1      2      1  NA       missing  All missing conc subs      1
   3 │     1      3      1  NA       missing  All missing conc subs      1
   4 │     2      1      0  missing       20  missing                    0
   5 │     2      2      0  missing       40  missing                    0
   6 │     2      3      0  missing       60  missing                    0
   7 │     3      1      0  missing       30  missing                    0
   8 │     3      2      1  NA       missing  missing                    0
   9 │     3      3      0  missing       70  missing                    0

# Exclusion 2: Subjects with no dosing data
julia> set_exclusion(df, "No dose subs", excl_func = group -> all(iszero, group.EVID), group = "ID")
9×7 DataFrame
 Row │ ID     SEQ    EVID   STAT     CONC     EXCLFCOM      EXCLF 
     │ Int64  Int64  Int64  String?  Int64?   String?       Int64 
─────┼────────────────────────────────────────────────────────────
   1 │     1      1      1  NA       missing  missing           0
   2 │     1      2      1  NA       missing  missing           0
   3 │     1      3      1  NA       missing  missing           0
   4 │     2      1      0  missing       20  No dose subs      1
   5 │     2      2      0  missing       40  No dose subs      1
   6 │     2      3      0  missing       60  No dose subs      1
   7 │     3      1      0  missing       30  missing           0
   8 │     3      2      1  NA       missing  missing           0
   9 │     3      3      0  missing       70  missing           0

# Exclusion 3: Subjects with no conc. data
julia> set_exclusion(df, "No conc subs", excl_func = group -> all(isone, group.EVID), group = [:ID])
9×7 DataFrame
 Row │ ID     SEQ    EVID   STAT     CONC     EXCLFCOM      EXCLF 
     │ Int64  Int64  Int64  String?  Int64?   String?       Int64 
─────┼────────────────────────────────────────────────────────────
   1 │     1      1      1  NA       missing  No conc subs      1
   2 │     1      2      1  NA       missing  No conc subs      1
   3 │     1      3      1  NA       missing  No conc subs      1
   4 │     2      1      0  missing       20  missing           0
   5 │     2      2      0  missing       40  missing           0
   6 │     2      3      0  missing       60  missing           0
   7 │     3      1      0  missing       30  missing           0
   8 │     3      2      1  NA       missing  missing           0
   9 │     3      3      0  missing       70  missing           0

julia> set_exclusion(df, :STAT, excl_func = group -> all(ismissing, group.CONC), group = [:ID])
9×7 DataFrame
 Row │ ID     SEQ    EVID   STAT     CONC     EXCLFCOM  EXCLF 
     │ Int64  Int64  Int64  String?  Int64?   String?   Int64 
─────┼────────────────────────────────────────────────────────
   1 │     1      1      1  NA       missing  NA            1
   2 │     1      2      1  NA       missing  NA            1
   3 │     1      3      1  NA       missing  NA            1
   4 │     2      1      0  missing       20  missing       0
   5 │     2      2      0  missing       40  missing       0
   6 │     2      3      0  missing       60  missing       0
   7 │     3      1      0  missing       30  missing       0
   8 │     3      2      1  NA       missing  missing       0
   9 │     3      3      0  missing       70  missing       0

julia> set_exclusion(df, :STAT, excl_func = group -> all(ismissing, group.CONC), group = [:ID, :SEQ])
9×7 DataFrame
 Row │ ID     SEQ    EVID   STAT     CONC     EXCLFCOM  EXCLF 
     │ Int64  Int64  Int64  String?  Int64?   String?   Int64 
─────┼────────────────────────────────────────────────────────
   1 │     1      1      1  NA       missing  NA            1
   2 │     1      2      1  NA       missing  NA            1
   3 │     1      3      1  NA       missing  NA            1
   4 │     2      1      0  missing       20  missing       0
   5 │     2      2      0  missing       40  missing       0
   6 │     2      3      0  missing       60  missing       0
   7 │     3      1      0  missing       30  missing       0
   8 │     3      2      1  NA       missing  NA            1
   9 │     3      3      0  missing       70  missing       0
ADaM.subject_trt_countMethod
subject_trt_count(df::DataFrame)
  • Displays the subject count per treatment. Outputs a DataFrame with DRUG, DRUGCD (DRUG Code) and count.
  • Valid datasets are: PC, EX, DM, ADPC, ADEX, ADSL

Example

julia> ex = PharmaDatasets.dataset("SDTM/CDISCPILOT01/ex")
591×17 DataFrame
 Row │ STUDYID       DOMAIN   USUBJID      EXSEQ    EXTRT       EXDOSE   EXDOSU   EXDOSFRM  EXDOSFRQ  EXROUTE      VISITNUM  V ⋯
     │ String15      String3  String15     Float64  String15    Float64  String3  String7   String3   String15     Float64   S ⋯
─────┼──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ CDISCPILOT01  EX       01-701-1015      1.0  PLACEBO         0.0  mg       PATCH     QD        TRANSDERMAL       3.0  B ⋯
   2 │ CDISCPILOT01  EX       01-701-1015      2.0  PLACEBO         0.0  mg       PATCH     QD        TRANSDERMAL       4.0  W
   3 │ CDISCPILOT01  EX       01-701-1015      3.0  PLACEBO         0.0  mg       PATCH     QD        TRANSDERMAL      12.0  W
   4 │ CDISCPILOT01  EX       01-701-1023      1.0  PLACEBO         0.0  mg       PATCH     QD        TRANSDERMAL       3.0  B
   5 │ CDISCPILOT01  EX       01-701-1023      2.0  PLACEBO         0.0  mg       PATCH     QD        TRANSDERMAL       4.0  W ⋯
   6 │ CDISCPILOT01  EX       01-701-1028      1.0  XANOMELINE     54.0  mg       PATCH     QD        TRANSDERMAL       3.0  B
   7 │ CDISCPILOT01  EX       01-701-1028      2.0  XANOMELINE     81.0  mg       PATCH     QD        TRANSDERMAL       4.0  W
   8 │ CDISCPILOT01  EX       01-701-1028      3.0  XANOMELINE     54.0  mg       PATCH     QD        TRANSDERMAL      12.0  W
  ⋮  │      ⋮           ⋮          ⋮          ⋮         ⋮          ⋮        ⋮        ⋮         ⋮           ⋮          ⋮        ⋱
 585 │ CDISCPILOT01  EX       01-718-1355      1.0  PLACEBO         0.0  mg       PATCH     QD        TRANSDERMAL       3.0  B ⋯
 586 │ CDISCPILOT01  EX       01-718-1355      2.0  PLACEBO         0.0  mg       PATCH     QD        TRANSDERMAL       4.0  W
 587 │ CDISCPILOT01  EX       01-718-1355      3.0  PLACEBO         0.0  mg       PATCH     QD        TRANSDERMAL      12.0  W
 588 │ CDISCPILOT01  EX       01-718-1371      1.0  XANOMELINE     54.0  mg       PATCH     QD        TRANSDERMAL       3.0  B
 589 │ CDISCPILOT01  EX       01-718-1371      2.0  XANOMELINE     81.0  mg       PATCH     QD        TRANSDERMAL       4.0  W ⋯
 590 │ CDISCPILOT01  EX       01-718-1427      1.0  XANOMELINE     54.0  mg       PATCH     QD        TRANSDERMAL       3.0  B
 591 │ CDISCPILOT01  EX       01-718-1427      2.0  XANOMELINE     81.0  mg       PATCH     QD        TRANSDERMAL       4.0  W
                                                                                                  6 columns and 576 rows omitted

julia> subject_trt_count(ex)
2×3 DataFrame
 Row │ DRUG        DRUGCD  count 
     │ String      String  Int64 
─────┼───────────────────────────
   1 │ PLACEBO     NA         86
   2 │ XANOMELINE  NA        168
ADaM.validate_column_structureMethod
validate_column_structure(df::DataFrame; ignore, silence_errors)

Validate a DataFrame's column structure according to ADaM data conventions (https://sastricks.com/cdisc/ADaMIG_v1.3.pdf#page=14).

Returns a DataFrame with all validation checks. Each row represents a validation check with columns:

  • check: Description of the validation check
  • tag: Check tag identifier (Symbol)
  • column: The column name (if applicable)
  • status: Check result (pass, fail, or missing for ignored checks)
  • level: Severity level: error for failures, ignored for ignored checks
  • rows: Affected row numbers (if applicable)
  • details: Additional details about the finding

Arguments

  • df: The DataFrame to validate
  • ignore: Check tags to ignore. Can be provided as Symbols or Strings. Available tags (validation checks):
    • name_start - Column name must start with a letter
    • name_format - Column name can only contain letters, underscores, and numerals
    • name_length - Column name length must not exceed 8 characters
    • char_length - Character column value length must not exceed 200 characters
    • label_required - Column labels must be present
    • label_length - Column label length must not exceed 40 characters
  • silence_errors: If true, errors are not thrown even if validation fails; instead, the findings DataFrame is returned with all errors and passes. Default is false, which throws an error if any validation fails.
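
The three column-name checks can be illustrated with a few lines of plain Julia. This is a sketch of the rules as stated in the list above, not the package's implementation; name_checks is a hypothetical helper and the column names are made up.

```julia
# Hypothetical helper (not part of ADaM): evaluate the name rules for one column name.
name_checks(name::AbstractString) = (
    name_start  = occursin(r"^[A-Za-z]", name),        # starts with a letter
    name_format = occursin(r"^[A-Za-z0-9_]+$", name),  # letters, numerals, underscores only
    name_length = length(name) <= 8,                   # at most 8 characters
)

for col in ["USUBJID", "1DOSE", "BASELINEWT", "DV-OBS"]
    println(col, " => ", name_checks(col))
end
```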

Returns

A DataFrame with all validation checks and their pass/fail status. An error is thrown on failed checks unless silence_errors is true.

Examples

# Basic validation
julia> df = DataFrame(ID = [1, 2], USUBJID = ["S001", "S002"], DV = [10.5, 20.3])
2×3 DataFrame
 Row │ ID     USUBJID  DV      
     │ Int64  String   Float64 
─────┼─────────────────────────
   1 │     1  S001        10.5
   2 │     2  S002        20.3

julia> validate_column_structure(df)
ERROR: Column structure validation failed with 3 error(s):
1. "ID": Column label is missing (label_required)
2. "USUBJID": Column label is missing (label_required)
3. "DV": Column label is missing (label_required)

julia> validate_column_structure(df, silence_errors=true)
7×7 DataFrame
 Row │ check                              tag             column   status   level    rows     details                       
     │ String                             Symbol?         String?  String?  String?  Array…?  String?                       
─────┼──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ Column label is missing            label_required  ID       fail     error    missing  All columns must have a label
   2 │ Column label is missing            label_required  USUBJID  fail     error    missing  All columns must have a label
   3 │ Column label is missing            label_required  DV       fail     error    missing  All columns must have a label
   4 │ All column names start with a le…  name_start      missing  pass     missing  missing  Checked: 3/3 column(s)
   5 │ All column names are valid format  name_format     missing  pass     missing  missing  Checked: 3/3 column(s)
   6 │ All column name lengths are valid  name_length     missing  pass     missing  missing  Checked: 3/3 column(s)
   7 │ All character column lengths are…  char_length     missing  pass     missing  missing  Checked: 1/3 column(s)

julia> validate_column_structure(df, ignore = [:label_required])
5×7 DataFrame
 Row │ check                              tag             column   status   level    rows     details                
     │ String                             Symbol?         String?  String?  String?  Array…?  String?                
─────┼───────────────────────────────────────────────────────────────────────────────────────────────────────────────
   1 │ Column labels must be present      label_required  missing  missing  ignored  missing  Check ignored by user
   2 │ All column names start with a le…  name_start      missing  pass     missing  missing  Checked: 3/3 column(s)
   3 │ All column names are valid format  name_format     missing  pass     missing  missing  Checked: 3/3 column(s)
   4 │ All column name lengths are valid  name_length     missing  pass     missing  missing  Checked: 3/3 column(s)
   5 │ All character column lengths are…  char_length     missing  pass     missing  missing  Checked: 1/3 column(s)