Introduction to ADaM

This tutorial demonstrates ADaM.jl, a Julia package for generating ADaM (Analysis Data Model) datasets from SDTM (Study Data Tabulation Model) data. ADaM.jl supports creation of various analysis datasets including:

  • ADPPK (ADaM Dataset for Pop-PK)
  • ADNCA (ADaM Dataset for NCA)

The basic workflow of ADaM dataset creation is:

  1. Read SDTM source data.
  2. Transform and derive analysis variables.
  3. Apply data quality checks and exclusions.
  4. Generate the final analysis-ready dataset.

In this tutorial we will show how to build a complete ADPPK dataset from SDTM data, demonstrating the key functions and transformations available in ADaM.jl. The workflow involves reading concentration (PC), dosing (EX), demographics (DM), vital signs (VS), and lab (LB) data, then systematically deriving all required analysis variables.

Setup and Data Loading

First, let's import the required packages and load the SDTM source data:

# Import Required Packages
using DataFramesMeta
using Dates
using StatsBase
using PharmaDatasets
using ADaM

# Read SDTM source data
pc = @chain dataset("SDTM/CDISCPILOT01/pc") convert_to_missing(["", nothing])
ex = @chain dataset("SDTM/CDISCPILOT01/ex") convert_to_missing(["", nothing])
dm = @chain dataset("SDTM/CDISCPILOT01/dm") convert_to_missing(["", nothing])
vs = @chain dataset("SDTM/CDISCPILOT01/vs") convert_to_missing(["", nothing])
lb = @chain dataset("SDTM/CDISCPILOT01/lb") convert_to_missing(["", nothing])

# Drug lookup table
drug_lookup = unique(@select(ex, :USUBJID, :DRUG = :EXTRT))

first(select(pc, :USUBJID, :PCDTC, :PCTPT, :PCTPTNUM, :PCSTRESN, :PCSTRESU), 2)
2×6 DataFrame
Row | USUBJID     | PCDTC               | PCTPT           | PCTPTNUM | PCSTRESN | PCSTRESU
    | String15    | String31            | String31        | Float64  | Float64? | String7
1   | 01-701-1015 | 2014-01-01T23:30:00 | Pre-dose        | -0.5     | 0.0      | ug/ml
2   | 01-701-1015 | 2014-01-02T00:05:00 | 5 Min Post-dose | 0.08     | missing  | ug/ml

convert_to_missing

The convert_to_missing function converts problematic values to proper missing values. Common values to convert include empty strings (""), nothing, special markers like "NA", ".", "BLQ", and NaN for numeric data. This ensures consistent missing data handling throughout the workflow and prevents issues with downstream operations.
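As a rough illustration of the idea (not the ADaM.jl implementation), the conversion can be sketched in plain Julia. The helper name to_missing is hypothetical, and isequal is used so that a NaN marker also matches:

```julia
# Hypothetical helper illustrating the idea behind convert_to_missing;
# it is not the ADaM.jl implementation.
to_missing(x, markers) = any(m -> isequal(x, m), markers) ? missing : x

raw = Any["0.5", "", "NA", nothing, NaN]
markers = ["", "NA", nothing, NaN]

# Every value that matches a marker becomes missing; other values pass through.
clean = [to_missing(x, markers) for x in raw]
```

Only "0.5" survives here; the other four entries become missing, so later steps can rely on ismissing, skipmissing, and coalesce.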

Datetime Processing

Clinical trial data contains datetime information in various string formats that need to be parsed and transformed.

Preparing Concentration Data

# Prepare PC data with datetime conversions and derived variables
pc_prep = @chain pc begin
    make_dtm(_, :PCDTC)
    make_dtm_to_dt(_, :ADTM)
    make_dtm_to_tm(_, :ADTM)

    @rtransform @astable begin
        :NFRLT = (:PCTPTNUM < 0) ? 0 : :PCTPTNUM
        :AVISITN = Int(div(:NFRLT, 24)) + 1
        :AVISIT = "Day " * string(:AVISITN)
        :EVID = 0
        :II = 0
    end

    leftjoin(drug_lookup, on = :USUBJID)

    transform(
        groupby(_, [:USUBJID, :DRUG]),
        :ADTM => (x -> maximum(skipmissing(x))) => :LOBSDTM,
    )

    @orderby :USUBJID :DRUG :ADTM
end

first(
    select(pc_prep, :USUBJID, :DRUG, :EVID, :PCDTC, :ADTM, :ADT, :ATM, :NFRLT, :AVISITN),
    2,
)
2×9 DataFrame
Row | USUBJID     | DRUG      | EVID  | PCDTC               | ADTM                | ADT        | ATM      | NFRLT | AVISITN
    | String15    | String15? | Int64 | String31            | DateTime            | Date       | Time     | Real  | Int64
1   | 01-701-1015 | PLACEBO   | 0     | 2014-01-01T23:30:00 | 2014-01-01T23:30:00 | 2014-01-01 | 23:30:00 | 0     | 1
2   | 01-701-1015 | PLACEBO   | 0     | 2014-01-02T00:05:00 | 2014-01-02T00:05:00 | 2014-01-02 | 00:05:00 | 0.08  | 1

make_dtm

The make_dtm function converts character datetime columns to Julia DateTime objects. The prefix parameter controls the output column name (the default "A" gives ADTM; prefix = "AST" gives ASTDTM), and fmt specifies the datetime format string used for parsing.
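The parsing that make_dtm relies on can be sketched with the Dates standard library alone. The dtm_name helper below is hypothetical; it only mirrors the naming pattern visible in this tutorial (ADTM, ASTDTM, AENDTM):

```julia
using Dates

# Full ISO 8601 timestamps, as in PCDTC, parse with the default format.
adtm = DateTime("2014-01-01T23:30:00")

# Date-only strings, as in EXSTDTC, need an explicit format;
# the time part then defaults to midnight.
astdtm = DateTime("2014-01-02", dateformat"yyyy-mm-dd")

# Hypothetical naming rule inferred from the column names in this tutorial.
dtm_name(prefix) = Symbol(prefix, "DTM")
```

Note that in Julia format strings a lowercase m means month and an uppercase M means minute, which is why "yyyy-mm-dd" is the right pattern for dates.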

make_dtm_to_dt and make_dtm_to_tm

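These functions extract components from datetime objects:

  • make_dtm_to_dt: extracts the date part into a Date column (for example, ADT from ADTM)
  • make_dtm_to_tm: extracts the time-of-day part into a Time column (for example, ATM from ADTM)

Both functions use the same prefix logic as make_dtm to name output columns.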
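In plain Julia the same split is a pair of constructor calls from the Dates standard library. This is a minimal sketch of the underlying operation, not the package code:

```julia
using Dates

adtm = DateTime(2014, 1, 1, 23, 30)

adt = Date(adtm)  # calendar date component
atm = Time(adtm)  # time-of-day component
```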

Preparing Dosing Data

# Prepare EX data with datetime conversions
ex_prep = @chain ex begin
    make_dtm(_, :EXSTDTC, prefix = "AST", fmt = "yyyy-mm-dd")
    make_dtm(_, :EXENDTC, prefix = "AEN", fmt = "yyyy-mm-dd")
    @rtransform :AENDTM = ismissing(:AENDTM) ? :ASTDTM : :AENDTM
    make_dtm_to_dt(_, :ASTDTM, prefix = "AST")
    make_dtm_to_dt(_, :AENDTM, prefix = "AEN")

    @rtransform @astable begin
        :ASTDTM = :ASTDTM + Minute(1)
        :AENDTM = :AENDTM + Minute(1)
        :II = interdose_interval(:EXDOSFRQ)
        :NFRLT = 24 * (:VISITDY - 1)
        :AVISITN = Int(div(:NFRLT, 24)) + 1
        :AVISIT = "Day " * string(:AVISITN)
        :EVID = 1
        :DRUG = :EXTRT
    end

    transform(
        groupby(_, [:USUBJID, :DRUG]),
        :ASTDTM => (x -> minimum(skipmissing(x))) => :FANLDTM,
        :EXDOSE => (x -> minimum(skipmissing(x))) => :EXDOSE_first,
    )

    @orderby :USUBJID :DRUG :ASTDTM
end

Dose Event Expansion

One of the most powerful transformations for ADPPK datasets is expanding dose intervals.

expand_dose_events

# Expand dose events
ex_exp = @chain ex_prep begin
    expand_dose_events
    transform(groupby(_, :USUBJID), eachindex => :EXSEQ)

    @rtransform @astable begin
        :AVISITN = Int(div(:NFRLT, 24)) + 1
        :AVISIT = "Day " * string(:AVISITN)
        :ADTM = :ASTDTM
        :EVID = 1
    end

    make_dtm_to_dt(_, :ADTM, prefix = "A")
    make_dtm_to_tm(_, :ADTM, prefix = "A")
end

first(select(ex_exp, :USUBJID, :DRUG, :EVID, :ASTDTM, :EXDOSE, :NFRLT), 3)
3×6 DataFrame
Row | USUBJID     | DRUG     | EVID  | ASTDTM              | EXDOSE  | NFRLT
    | String15    | String15 | Int64 | DateTime            | Float64 | Float64
1   | 01-701-1015 | PLACEBO  | 1     | 2014-01-02T00:01:00 | 0.0     | 0.0
2   | 01-701-1015 | PLACEBO  | 1     | 2014-01-03T00:01:00 | 0.0     | 24.0
3   | 01-701-1015 | PLACEBO  | 1     | 2014-01-04T00:01:00 | 0.0     | 48.0

In SDTM EX data, multiple doses are often represented as intervals with a frequency code (e.g., QD = once daily, BID = twice daily). The expand_dose_events function creates a separate row for each individual dose by:

  1. Identifying dose intervals using start (ASTDT) and end (AENDT) dates
  2. Parsing the frequency code (EXDOSFRQ) to determine inter-dose interval
  3. Creating individual dose records with correct timing (NFRLT)
  4. Maintaining all other EX variables for each expanded record

This is essential for pharmacometric analysis, which requires one record per actual dose administration.
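The steps above can be sketched for a single interval using only the Dates standard library. Everything here is an assumption made for illustration: the ii_hours lookup, the expand_interval name, and the choice to count nominal time from the interval start are not taken from ADaM.jl.

```julia
using Dates

# Hypothetical frequency lookup (hours between doses); the codes handled by
# interdose_interval in ADaM.jl may differ.
ii_hours = Dict("QD" => 24, "BID" => 12)

# Expand one dosing interval into one record per administration.
function expand_interval(start_dtm::DateTime, end_dtm::DateTime, ii::Integer)
    times = start_dtm:Hour(ii):end_dtm
    return [(ADTM = t, NFRLT = Float64(ii * (k - 1)), II = ii) for (k, t) in enumerate(times)]
end

doses = expand_interval(DateTime(2014, 1, 2, 0, 1), DateTime(2014, 1, 4, 0, 1), ii_hours["QD"])
```

For a once-daily interval from 2014-01-02 to 2014-01-04 this yields three records at nominal times 0, 24, and 48 hours, the same pattern as the ex_exp output shown earlier.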

Dose Compression

After expanding individual doses, we can optionally compress consecutive identical doses into NONMEM-style records using the ADDL (additional doses) column for more compact representation.

# Combine PC and expanded EX records, then compress dose events
adppk_cmp = @chain pc_prep begin
    vcat(ex_exp, cols = :union)
    @orderby :USUBJID :DRUG :ADTM :EVID

    compress_dose_events
end

first(select(adppk_cmp, :USUBJID, :DRUG, :EVID, :ADTM, :ADDL, :II), 3)
3×6 DataFrame
Row | USUBJID     | DRUG     | EVID  | ADTM                | ADDL  | II
    | String15    | String15 | Int64 | DateTime            | Int64 | Int64
1   | 01-701-1015 | PLACEBO  | 0     | 2014-01-01T23:30:00 | 0     | 0
2   | 01-701-1015 | PLACEBO  | 1     | 2014-01-02T00:01:00 | 0     | 24
3   | 01-701-1015 | PLACEBO  | 0     | 2014-01-02T00:05:00 | 0     | 0
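To make the ADDL/II encoding concrete, here is a minimal sketch of one way to compress equally spaced doses of equal amount. It is an assumption-laden illustration rather than the algorithm used by compress_dose_events: it looks only at dose times and amounts, ignores intervening observation records, and sets II to 0 when nothing is compressed.

```julia
# Collapse runs of equally spaced, equal-amount doses into a single record
# with ADDL (number of additional doses) and II (interval between them).
function compress_doses(times::Vector{Float64}, amts::Vector{Float64})
    out = NamedTuple[]
    n = length(times)
    i = 1
    while i <= n
        # Candidate interval: gap to the next dose, if there is one.
        ii = i < n ? times[i + 1] - times[i] : 0.0
        j = i
        # Extend the run while the amount and the spacing stay the same.
        while j < n && amts[j + 1] == amts[i] && times[j + 1] - times[j] == ii
            j += 1
        end
        addl = j - i
        push!(out, (TIME = times[i], AMT = amts[i], ADDL = addl, II = addl > 0 ? ii : 0.0))
        i = j + 1
    end
    return out
end

# Hypothetical dosing history: three 54 mg doses 24 h apart, then one 81 mg dose.
recs = compress_doses([0.0, 24.0, 48.0, 72.0], [54.0, 54.0, 54.0, 81.0])
```

The four input doses collapse to two records: the first carries ADDL = 2 and II = 24, and the final dose stays on its own with ADDL = 0.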

Data Quality and Exclusions

set_exclusion

# Add exclusion flags to the combined PC and EX data
adppk_flag = @chain adppk_cmp begin
    transform(eachindex => :SEQ)
    transform(
        groupby(_, [:USUBJID, :DRUG]),
        :LOBSDTM => (x -> minimum(skipmissing(x))) => :LOBSDTM,
    )

    set_exclusion(
        "Subjects with missing conc.",
        excl_func = group -> all(ismissing, replace(group.PCSTRESN, 0 => missing)),
        group = [:USUBJID, :DRUG],
    )
    set_exclusion(
        "Subjects with no dose records",
        excl_func = group -> all((==)(0), group.EVID),
        group = [:USUBJID, :DRUG],
    )
    set_exclusion(
        "Subjects with no conc. records",
        excl_func = group -> all((==)(1), group.EVID),
        group = [:USUBJID, :DRUG],
    )
    set_exclusion(
        "Dosing records after last observation",
        excl_func = group -> group.ADTM >= group.LOBSDTM && all((==)(1), group.EVID),
        group = [:SEQ],
    )

    @orderby :USUBJID :DRUG :ADTM :EVID
end

first(select(adppk_flag, :USUBJID, :DRUG, :EVID, :ADTM, :EXCLF, :EXCLFCOM), 3)
3×6 DataFrame
Row | USUBJID     | DRUG     | EVID  | ADTM                | EXCLF | EXCLFCOM
    | String15    | String15 | Int64 | DateTime            | Int64 | String?
1   | 01-701-1015 | PLACEBO  | 0     | 2014-01-01T23:30:00 | 1     | Subjects with missing conc.
2   | 01-701-1015 | PLACEBO  | 1     | 2014-01-02T00:01:00 | 1     | Subjects with missing conc.
3   | 01-701-1015 | PLACEBO  | 0     | 2014-01-02T00:05:00 | 1     | Subjects with missing conc.

Data quality is critical for pharmacometric analysis. The set_exclusion function provides a systematic way to flag records that should be excluded. It creates two columns:

  • EXCLF: Exclusion Flag (0 = include, 1 = exclude)
  • EXCLFCOM: Exclusion Flag Comment (text description of exclusion reason)

Key parameters:

  • excl_func: Anonymous function defining the exclusion condition
  • group: Grouping columns to apply the function (e.g., by subject, by record)

You can apply multiple exclusion criteria sequentially: the function accumulates flags and appends comments.
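The accumulation behaviour can be sketched on plain vectors. This is not the set_exclusion implementation: the flag_exclusion! name, the index-based groups, and the "; " separator between comments are all assumptions made for illustration.

```julia
# Apply one exclusion rule to every group; flagged rows get EXCLF = 1
# and the comment is appended to any comment already present.
function flag_exclusion!(exclf, exclfcom, groups, comment, excl_func)
    for idx in groups
        excl_func(idx) || continue
        for i in idx
            exclf[i] = 1
            exclfcom[i] = ismissing(exclfcom[i]) ? comment : string(exclfcom[i], "; ", comment)
        end
    end
    return nothing
end

# Hypothetical records: subject A has a dose and a concentration, B has doses only.
usubjid = ["A", "A", "B", "B"]
evid = [1, 0, 1, 1]
conc = [missing, 1.2, missing, missing]

exclf = zeros(Int, 4)
exclfcom = Vector{Union{Missing, String}}(missing, 4)
groups = [findall(==(id), usubjid) for id in unique(usubjid)]

flag_exclusion!(exclf, exclfcom, groups, "Subjects with no conc. records",
                idx -> all(==(1), evid[idx]))
flag_exclusion!(exclf, exclfcom, groups, "Subjects with missing conc.",
                idx -> all(ismissing, conc[idx]))
```

Subject B fails both rules, so its rows end up with EXCLF = 1 and both comments joined, while subject A stays at EXCLF = 0 with a missing comment.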

Reference Data Joins

join_columns

# Derive reference data and time variables
adppk_nom_prev = @chain adppk_flag begin
    join_columns(
        ex_exp,
        on = [:USUBJID],
        order = [:ADTM],
        keep = [:ADTM => :ADTM_prev, :EXDOSE => :EXDOSE_prev],
        filter_join = (t, r) -> t.ADTM > r.ADTM,
        mode = "last",
    )
    join_columns(
        ex_exp,
        on = [:USUBJID],
        order = [:NFRLT],
        keep = [:NFRLT => :NFRLT_prev],
        filter_join = (t, r) -> t.NFRLT > r.NFRLT,
        mode = "last",
    )

    @orderby :USUBJID :DRUG :ADTM :EVID
end

first(
    select(
        adppk_nom_prev,
        :USUBJID,
        :DRUG,
        :EVID,
        :ADTM,
        :ADTM_prev,
        :EXDOSE_prev,
        :NFRLT,
        :NFRLT_prev,
    ),
    3,
)
3×8 DataFrame
Row | USUBJID     | DRUG     | EVID  | ADTM                | ADTM_prev           | EXDOSE_prev | NFRLT   | NFRLT_prev
    | String15    | String15 | Int64 | DateTime            | DateTime?           | Float64?    | Float64 | Float64?
1   | 01-701-1015 | PLACEBO  | 0     | 2014-01-01T23:30:00 | missing             | missing     | 0.0     | missing
2   | 01-701-1015 | PLACEBO  | 1     | 2014-01-02T00:01:00 | missing             | missing     | 0.0     | missing
3   | 01-701-1015 | PLACEBO  | 0     | 2014-01-02T00:05:00 | 2014-01-02T00:01:00 | 0.0         | 0.08    | 0.0

A critical step in ADPPK creation is joining reference data from prior records to derive relative time variables. The join_columns function is a specialized join that retrieves values from reference records based on conditions.

Parameters:

  • on: Columns to join by (typically subject ID)
  • order: Column(s) defining the temporal sequence
  • keep: Vector of pairs mapping source columns to target column names (source => target)
  • filter_join: Condition (as anonymous function) defining valid reference records
  • mode: "last" for most recent, "first" for earliest reference
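The "last" lookup can be sketched for a single target record as follows. The last_match function and its keyword names are hypothetical and only mirror the parameters listed above; the second dose amount is made up for the example.

```julia
using Dates

# Return the most recent reference record that satisfies filter_join,
# or missing when no reference record qualifies.
function last_match(target, refs; order, filter_join)
    candidates = filter(r -> filter_join(target, r), refs)
    isempty(candidates) && return missing
    return last(sort(candidates; by = order))
end

doses = [
    (ADTM = DateTime(2014, 1, 2, 0, 1), EXDOSE = 54.0),
    (ADTM = DateTime(2014, 1, 3, 0, 1), EXDOSE = 81.0),  # hypothetical amount
]
is_prior = (t, r) -> t.ADTM > r.ADTM
by_time = r -> r.ADTM

predose = last_match((ADTM = DateTime(2014, 1, 1, 23, 30),), doses;
                     order = by_time, filter_join = is_prior)
postdose = last_match((ADTM = DateTime(2014, 1, 3, 8, 0),), doses;
                      order = by_time, filter_join = is_prior)
```

The pre-dose sample has no earlier dose and gets missing, which is why ADTM_prev and EXDOSE_prev are missing in the first rows of the output above. The later sample picks up the second dose record.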

Time Duration Calculations

make_duration

# Derive relative time variables
adppk_aprlt = @chain adppk_nom_prev begin
    transform(
        groupby(_, [:USUBJID, :DRUG]),
        :FANLDTM => (x -> minimum(skipmissing(x))) => :FANLDTM,
        :NFRLT => (x -> minimum(skipmissing(x))) => :min_NFRLT,
        :EXDOSE_first => (x -> minimum(skipmissing(x))) => :EXDOSE_first,
    )

    make_duration(:AFRLT, start_dtm = :FANLDTM, end_dtm = :ADTM)
    make_duration(:APRLT, start_dtm = :ADTM_prev, end_dtm = :ADTM)

    @rtransform :APRLT = :EVID == 1 ? 0 : ismissing(:APRLT) ? :AFRLT : :APRLT
    @rtransform :NPRLT =
        (:EVID == 1) ? 0 :
        ismissing(:NFRLT_prev) ? (:NFRLT - :min_NFRLT) : (:NFRLT - :NFRLT_prev)

    @orderby :USUBJID :DRUG :ADTM :EVID
end

first(
    select(
        adppk_aprlt,
        :USUBJID,
        :DRUG,
        :EVID,
        :ADTM,
        :FANLDTM,
        :AFRLT,
        :APRLT,
        :NFRLT,
        :NPRLT,
    ),
    3,
)
3×9 DataFrame
Row | USUBJID     | DRUG     | EVID  | ADTM                | FANLDTM             | AFRLT     | APRLT     | NFRLT   | NPRLT
    | String15    | String15 | Int64 | DateTime            | DateTime            | Float64   | Real      | Float64 | Real
1   | 01-701-1015 | PLACEBO  | 0     | 2014-01-01T23:30:00 | 2014-01-02T00:01:00 | -0.516667 | -0.516667 | 0.0     | 0.0
2   | 01-701-1015 | PLACEBO  | 1     | 2014-01-02T00:01:00 | 2014-01-02T00:01:00 | 0.0       | 0         | 0.0     | 0
3   | 01-701-1015 | PLACEBO  | 0     | 2014-01-02T00:05:00 | 2014-01-02T00:01:00 | 0.0666667 | 0.0666667 | 0.08    | 0.08

With reference data available, we can calculate relative time variables. The make_duration function computes time differences in hours (fractional values allowed) and handles missing values appropriately.
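The arithmetic behind these durations is simple to reproduce with the Dates standard library. The hours_between helper is a hypothetical stand-in, and returning missing for a missing endpoint is an assumption about what "handles missing values" means here:

```julia
using Dates

# Elapsed time in hours; returns missing when either endpoint is missing.
function hours_between(start_dtm, end_dtm)
    (ismissing(start_dtm) || ismissing(end_dtm)) && return missing
    # DateTime subtraction yields milliseconds; 3_600_000 ms per hour.
    return Dates.value(end_dtm - start_dtm) / 3_600_000
end

fanldtm = DateTime(2014, 1, 2, 0, 1)
afrlt_predose = hours_between(fanldtm, DateTime(2014, 1, 1, 23, 30))
afrlt_5min = hours_between(fanldtm, DateTime(2014, 1, 2, 0, 5))
aprlt_no_prev = hours_between(missing, DateTime(2014, 1, 1, 23, 30))
```

The pre-dose sample was taken 31 minutes before the first dose, giving -31/60 ≈ -0.5167 hours, and the nominal 5-minute sample was taken 4 minutes after it, giving ≈ 0.0667 hours. Both agree with the AFRLT values in the output above.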

Standard relative time variables in ADPPK:

Variable | Description                            | Reference Point
AFRLT    | Actual Time Relative to First Dose     | FANLDTM (First Dose DateTime)
APRLT    | Actual Time Relative to Previous Dose  | ADTM_prev (Previous Dose DateTime)
NFRLT    | Nominal Time Relative to First Dose    | Planned timepoints
NPRLT    | Nominal Time Relative to Previous Dose | NFRLT_prev

Demographics and Analysis Variables

# Add demographics
adppk_dm = leftjoin(adppk_aprlt, dm, on = :USUBJID, makeunique = true)

# Derive analysis variables
adppk_aval = @chain adppk_dm begin
    @orderby :USUBJID :ADTM

    @rtransform @astable begin
        :DOSEA =
            (:EVID == 1) ? :EXDOSE : ismissing(:EXDOSE_prev) ? :EXDOSE_first : :EXDOSE_prev
        :CMT = (:EVID == 1) ? 1 : 2
        :AMT = (:EVID == 1) ? :EXDOSE : missing
    end

    transform(
        groupby(_, [:USUBJID, :DRUG]),
        :PCLLOQ => (x -> only(unique(skipmissing(x)))) => :ALLOQ,
    )
    @rtransform :BLQFL = (coalesce(:PCSTRESN, 999) <= :ALLOQ) ? "Y" : "N"
    @rtransform :BLQFN = (coalesce(:PCSTRESN, 999) <= :ALLOQ) ? 1 : 0
    @rtransform :DV = (:EVID == 1) ? missing : :PCSTRESN
    @rtransform @passmissing :DVL = (:DV > 0) ? log(:DV) : missing
    @rtransform :MDV = (:EVID == 1) ? 1 : ismissing(:DV) ? 1 : 0
    @rtransform :AVALU = (:EVID == 1) ? :EXDOSU : :PCSTRESU

    @orderby :USUBJID :DRUG :ADTM :EVID
end

first(select(adppk_aval, :USUBJID, :DRUG, :EVID, :ADTM, :DV, :DOSEA, :AMT, :BLQFL, :MDV), 3)
3×9 DataFrame
Row | USUBJID     | DRUG     | EVID  | ADTM                | DV       | DOSEA   | AMT      | BLQFL  | MDV
    | String15    | String15 | Int64 | DateTime            | Float64? | Float64 | Float64? | String | Int64
1   | 01-701-1015 | PLACEBO  | 0     | 2014-01-01T23:30:00 | 0.0      | 0.0     | missing  | Y      | 0
2   | 01-701-1015 | PLACEBO  | 1     | 2014-01-02T00:01:00 | missing  | 0.0     | 0.0      | N      | 1
3   | 01-701-1015 | PLACEBO  | 0     | 2014-01-02T00:05:00 | missing  | 0.0     | missing  | N      | 1

This step adds demographics and derives key PK analysis variables including:

  • DOSEA: Actual dose amount carried forward to observations
  • CMT: Compartment (1=central for doses, 2=observation for concentrations)
  • AMT: Dose amount
  • DV: Dependent variable (concentration)
  • DVL: Log-transformed concentration
  • MDV: Missing DV flag
  • BLQFL/BLQFN: Below limit of quantification flags
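The record-level rules used in the pipeline can be restated as small standalone functions, which makes the handling of missing values easier to see. The function names are ours; the logic follows the expressions in the code above.

```julia
# Below-LLOQ flag: a missing concentration is replaced by a large sentinel
# so that it is never flagged as BLQ.
blqfl(conc, lloq) = coalesce(conc, 999) <= lloq ? "Y" : "N"

# Dependent variable: dose records (EVID == 1) carry no concentration.
dv(evid, conc) = evid == 1 ? missing : conc

# Log-transformed DV: missing when DV is missing or not positive.
dvl(x) = (ismissing(x) || x <= 0) ? missing : log(x)

# Missing-DV flag: 1 for dose records and for observations without a value.
mdv(evid, x) = (evid == 1 || ismissing(x)) ? 1 : 0
```

For the first row of the output above (EVID = 0, concentration 0.0, LLOQ 0.01) these give BLQFL = "Y", MDV = 0, and a missing DVL, because the logarithm is only taken for positive concentrations.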

Covariate Lookup Tables

merge_columns

# Create covariate lookup tables
lb_val = @chain lb begin
    @rsubset(:LBBLFL == "Y")
    @rtransform :LBTESTCDB = string(:LBTESTCD, "BL")
    unstack(:USUBJID, :LBTESTCDB, :LBSTRESN)
end

lb_uni = @chain lb begin
    @rsubset(:LBBLFL == "Y")
    @rtransform :LBTESTCDU = string(:LBTESTCD, "BLU")
    unstack(:USUBJID, :LBTESTCDU, :LBSTRESU)
end

lb_lookup = @chain dm begin
    @select :USUBJID
    leftjoin(lb_val, on = :USUBJID)
    leftjoin(lb_uni, on = :USUBJID)
end

vs_lookup = @chain dm begin
    @select :USUBJID
    merge_columns(
        vs,
        filter_func = r -> r.VSTESTCD == "HEIGHT",
        on = [:USUBJID],
        keep = ["VSSTRESN" => "HTBL", "VSSTRESU" => "HTBLU"],
    )
    merge_columns(
        vs,
        filter_func = r ->
            coalesce(r.VSTESTCD, "") == "WEIGHT" && coalesce(r.VSBLFL, "") == "Y",
        on = [:USUBJID],
        keep = ["VSSTRESN" => "WTBL", "VSSTRESU" => "WTBLU"],
    )
end

first(vs_lookup, 2)
2×5 DataFrame
Row | USUBJID     | HTBL     | HTBLU     | WTBL     | WTBLU
    | String15    | Float64? | String15? | Float64? | String15?
1   | 01-701-1015 | 147.32   | cm        | 54.43    | kg
2   | 01-701-1023 | 162.56   | cm        | 80.29    | kg

The merge_columns function combines leftjoin, subset, and select operations into a single convenient call. It's especially useful for creating lookup tables from vertical (long) datasets like VS and LB.

Parameters:

  • filter_func: Anonymous function to subset the secondary dataset before joining
  • on: Join key columns (typically subject ID)
  • keep: Vector of pairs mapping source column names to target column names (source => target)
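The subset, select, and left-join combination can be illustrated without any packages. This sketch uses a Dict as the lookup and is not how merge_columns is implemented; the third subject ID is hypothetical and is included to show the left-join behaviour.

```julia
# A few vertical (long) VS records: one row per subject and test.
vs = [
    (USUBJID = "01-701-1015", VSTESTCD = "HEIGHT", VSSTRESN = 147.32),
    (USUBJID = "01-701-1015", VSTESTCD = "WEIGHT", VSSTRESN = 54.43),
    (USUBJID = "01-701-1023", VSTESTCD = "HEIGHT", VSSTRESN = 162.56),
]
subjects = ["01-701-1015", "01-701-1023", "01-701-9999"]  # last ID is hypothetical

# Subset and select: keep HEIGHT rows and only the value column.
height = Dict(r.USUBJID => r.VSSTRESN for r in vs if r.VSTESTCD == "HEIGHT")

# Left join: every subject is kept; subjects without a match get missing.
htbl = [get(height, id, missing) for id in subjects]
```

The result has one HTBL value per subject in the left table, with missing for the subject that has no height record.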

Derived Covariates

# Create ADPPK master dataset with all covariates
adppk_master = @chain adppk_aval begin
    @rtransform :AGEU = "yr"

    leftjoin(lb_lookup, on = :USUBJID)
    leftjoin(vs_lookup, on = :USUBJID)

    body_mass_index
    body_surface_area

    transform(
        groupby(_, [:USUBJID, :DRUG]),
        :AGE => (x -> first(skipmissing(x))) => :AGE,
        :SEX => (x -> first(skipmissing(x))) => :SEX,
    )

    creatinine_clearance
    est_glomerular_filtration_rate
end

first(
    select(
        adppk_master,
        :USUBJID,
        :DRUG,
        :EVID,
        :ADTM,
        :DV,
        :DOSEA,
        :AGE,
        :SEX,
        :WTBL,
        :BMIBL,
        :CRCLBL,
    ),
    2,
)
2×11 DataFrame
Row | USUBJID     | DRUG     | EVID  | ADTM                | DV       | DOSEA   | AGE     | SEX     | WTBL     | BMIBL    | CRCLBL
    | String15    | String15 | Int64 | DateTime            | Float64? | Float64 | Float64 | String3 | Float64? | Float64? | Float64?
1   | 01-701-1015 | PLACEBO  | 0     | 2014-01-01T23:30:00 | 0.0      | 0.0     | 63.0    | F       | 54.43    | 25.0793  | 54.977
2   | 01-701-1015 | PLACEBO  | 1     | 2014-01-02T00:01:00 | missing  | 0.0     | 63.0    | F       | 54.43    | 25.0793  | 54.977

ADaM.jl provides validated functions to compute standard derived covariates:

body_mass_index

Computes Body Mass Index: $\text{BMI} = \frac{\text{weight (kg)}}{\text{height (m)}^2}$

body_surface_area

Computes Body Surface Area using the Mosteller formula: $\text{BSA} = \sqrt{\frac{\text{height (cm)} \times \text{weight (kg)}}{3600}}$

creatinine_clearance

Computes Creatinine Clearance using the Cockcroft-Gault equation: $\text{CrCL} = \frac{(140 - \text{age}) \times \text{weight (kg)} \times (0.85 \text{ if female})}{72 \times \text{serum creatinine (mg/dL)}}$

est_glomerular_filtration_rate

Computes estimated Glomerular Filtration Rate using the CKD-EPI equation, which accounts for age, sex, race, and creatinine.

All these functions automatically detect required input columns with units, apply validated formulas, and create appropriately named output columns with units.
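As a hand check of the three formulas above, the following sketch recomputes them for subject 01-701-1028, whose baseline values appear in the final dataset of this tutorial. Two assumptions are made: that CREATBL is reported in µmol/L, and that 88.4 µmol/L per mg/dL is an adequate conversion factor. ADaM.jl's own unit handling may differ slightly.

```julia
# Baseline values for subject 01-701-1028 (a 71-year-old male).
weight_kg = 99.34
height_cm = 177.8
age_yr = 71.0
creat_umol_l = 123.76  # assumed unit: µmol/L
is_female = false

# BMI: weight in kg over squared height in m.
bmi = weight_kg / (height_cm / 100)^2

# BSA (Mosteller): square root of height (cm) times weight (kg) over 3600.
bsa = sqrt(height_cm * weight_kg / 3600)

# Cockcroft-Gault expects serum creatinine in mg/dL.
# 88.4 is a commonly used approximate conversion factor (an assumption here).
creat_mg_dl = creat_umol_l / 88.4
crcl = (140 - age_yr) * weight_kg * (is_female ? 0.85 : 1.0) / (72 * creat_mg_dl)
```

The results are roughly 31.42 for BMI, 2.215 for BSA, and 68.0 for CrCL, close to the BMIBL, BSABL, and CRCLBL values reported for this subject.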

Final Dataset

round_columns

# Generate final ADPPK dataset
adppk = @chain adppk_master begin
    @orderby :EXCLF :USUBJID :DRUG :ADTM :EVID

    transform(eachindex => :RECSEQ)
    transform(groupby(_, [:USUBJID, :DRUG]), eachindex => :ASEQ)

    transform(
        groupby(_, [:USUBJID, :DRUG]),
        :EXROUTE => (x -> first(skipmissing(x))) => :EXROUTE,
        :EXDOSFRM => (x -> first(skipmissing(x))) => :EXDOSFRM,
        :EXDOSFRQ => (x -> first(skipmissing(x))) => :EXDOSFRQ,
    )

    rename(:DRUG => :PROJID, :EXDOSFRQ => :DOSEFRQ, :EXROUTE => :ROUTE, :EXDOSFRM => :FORM)

    select(
        :EXCLF,
        :EXCLFCOM,
        :RECSEQ,
        :ASEQ,
        :STUDYID,
        :USUBJID,
        :PROJID,
        :DOSEFRQ,
        :ROUTE,
        :FORM,
        :DOSEA,
        :AMT,
        :CMT,
        :ADDL,
        :EVID,
        :AVISIT,
        :AVISITN,
        :AFRLT,
        :APRLT,
        :NFRLT,
        :NPRLT,
        :ADTM,
        :ATM,
        :FANLDTM,
        :DV,
        :DVL,
        :MDV,
        :ALLOQ,
        :BLQFL,
        :BLQFN,
        :AGE,
        :SEX,
        :RACE,
        :COUNTRY,
        :WTBL,
        :HTBL,
        :BMIBL,
        :BSABL,
        :CREATBL,
        :CRCLBL,
        :EGFRBL,
        :ASTBL,
        :ALTBL,
        :ALBBL,
    )

    round_columns(3)
end

first(adppk, 16)
16×44 DataFrame
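Row | EXCLF | EXCLFCOM | RECSEQ | ASEQ | STUDYID | USUBJID | PROJID | DOSEFRQ | ROUTE | FORM | DOSEA | AMT | CMT | ADDL | EVID | AVISIT | AVISITN | AFRLT | APRLT | NFRLT | NPRLT | ADTM | ATM | FANLDTM | DV | DVL | MDV | ALLOQ | BLQFL | BLQFN | AGE | SEX | RACE | COUNTRY | WTBL | HTBL | BMIBL | BSABL | CREATBL | CRCLBL | EGFRBL | ASTBL | ALTBL | ALBBL
 | Float64 | String? | Float64 | Float64 | String15 | String15 | String15 | String3 | String15 | String7 | Float64 | Float64? | Float64 | Float64 | Float64 | String | Float64 | Float64 | Float64 | Float64 | Float64 | DateTime | Time | DateTime | Float64? | Float64? | Float64 | Float64 | String | Float64 | Float64 | String3 | String? | String3? | Float64? | Float64 | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64? | Float64?
1 | 0.0 | missing | 1.0 | 1.0 | CDISCPILOT01 | 01-701-1028 | XANOMELINE | QD | TRANSDERMAL | PATCH | 54.0 | missing | 2.0 | 0.0 | 0.0 | Day 1 | 1.0 | -0.517 | -0.517 | 0.0 | 0.0 | 2013-07-18T23:30:00 | 23:30:00 | 2013-07-19T00:01:00 | 0.0 | missing | 0.0 | 0.01 | Y | 1.0 | 71.0 | M | WHITE | USA | 99.34 | 177.8 | 31.424 | 2.215 | 123.76 | 68.002 | 53.736 | 24.0 | 26.0 | 44.0
2 | 0.0 | missing | 2.0 | 2.0 | CDISCPILOT01 | 01-701-1028 | XANOMELINE | QD | TRANSDERMAL | PATCH | 54.0 | 54.0 | 1.0 | 0.0 | 1.0 | Day 1 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2013-07-19T00:01:00 | 00:01:00 | 2013-07-19T00:01:00 | missing | missing | 1.0 | 0.01 | N | 0.0 | 71.0 | M | WHITE | USA | 99.34 | 177.8 | 31.424 | 2.215 | 123.76 | 68.002 | 53.736 | 24.0 | 26.0 | 44.0
3 | 0.0 | missing | 3.0 | 3.0 | CDISCPILOT01 | 01-701-1028 | XANOMELINE | QD | TRANSDERMAL | PATCH | 54.0 | missing | 2.0 | 0.0 | 0.0 | Day 1 | 1.0 | 0.067 | 0.067 | 0.08 | 0.08 | 2013-07-19T00:05:00 | 00:05:00 | 2013-07-19T00:01:00 | 0.102 | -2.287 | 0.0 | 0.01 | N | 0.0 | 71.0 | M | WHITE | USA | 99.34 | 177.8 | 31.424 | 2.215 | 123.76 | 68.002 | 53.736 | 24.0 | 26.0 | 44.0
4 | 0.0 | missing | 4.0 | 4.0 | CDISCPILOT01 | 01-701-1028 | XANOMELINE | QD | TRANSDERMAL | PATCH | 54.0 | missing | 2.0 | 0.0 | 0.0 | Day 1 | 1.0 | 0.483 | 0.483 | 0.5 | 0.5 | 2013-07-19T00:30:00 | 00:30:00 | 2013-07-19T00:01:00 | 0.547 | -0.603 | 0.0 | 0.01 | N | 0.0 | 71.0 | M | WHITE | USA | 99.34 | 177.8 | 31.424 | 2.215 | 123.76 | 68.002 | 53.736 | 24.0 | 26.0 | 44.0
5 | 0.0 | missing | 5.0 | 5.0 | CDISCPILOT01 | 01-701-1028 | XANOMELINE | QD | TRANSDERMAL | PATCH | 54.0 | missing | 2.0 | 0.0 | 0.0 | Day 1 | 1.0 | 0.983 | 0.983 | 1.0 | 1.0 | 2013-07-19T01:00:00 | 01:00:00 | 2013-07-19T00:01:00 | 0.925 | -0.077 | 0.0 | 0.01 | N | 0.0 | 71.0 | M | WHITE | USA | 99.34 | 177.8 | 31.424 | 2.215 | 123.76 | 68.002 | 53.736 | 24.0 | 26.0 | 44.0
6 | 0.0 | missing | 6.0 | 6.0 | CDISCPILOT01 | 01-701-1028 | XANOMELINE | QD | TRANSDERMAL | PATCH | 54.0 | missing | 2.0 | 0.0 | 0.0 | Day 1 | 1.0 | 1.483 | 1.483 | 1.5 | 1.5 | 2013-07-19T01:30:00 | 01:30:00 | 2013-07-19T00:01:00 | 1.188 | 0.172 | 0.0 | 0.01 | N | 0.0 | 71.0 | M | WHITE | USA | 99.34 | 177.8 | 31.424 | 2.215 | 123.76 | 68.002 | 53.736 | 24.0 | 26.0 | 44.0
7 | 0.0 | missing | 7.0 | 7.0 | CDISCPILOT01 | 01-701-1028 | XANOMELINE | QD | TRANSDERMAL | PATCH | 54.0 | missing | 2.0 | 0.0 | 0.0 | Day 1 | 1.0 | 1.983 | 1.983 | 2.0 | 2.0 | 2013-07-19T02:00:00 | 02:00:00 | 2013-07-19T00:01:00 | 1.369 | 0.314 | 0.0 | 0.01 | N | 0.0 | 71.0 | M | WHITE | USA | 99.34 | 177.8 | 31.424 | 2.215 | 123.76 | 68.002 | 53.736 | 24.0 | 26.0 | 44.0
8 | 0.0 | missing | 8.0 | 8.0 | CDISCPILOT01 | 01-701-1028 | XANOMELINE | QD | TRANSDERMAL | PATCH | 54.0 | missing | 2.0 | 0.0 | 0.0 | Day 1 | 1.0 | 3.983 | 3.983 | 4.0 | 4.0 | 2013-07-19T04:00:00 | 04:00:00 | 2013-07-19T00:01:00 | 1.683 | 0.521 | 0.0 | 0.01 | N | 0.0 | 71.0 | M | WHITE | USA | 99.34 | 177.8 | 31.424 | 2.215 | 123.76 | 68.002 | 53.736 | 24.0 | 26.0 | 44.0
9 | 0.0 | missing | 9.0 | 9.0 | CDISCPILOT01 | 01-701-1028 | XANOMELINE | QD | TRANSDERMAL | PATCH | 54.0 | missing | 2.0 | 0.0 | 0.0 | Day 1 | 1.0 | 5.983 | 5.983 | 6.0 | 6.0 | 2013-07-19T06:00:00 | 06:00:00 | 2013-07-19T00:01:00 | 1.755 | 0.563 | 0.0 | 0.01 | N | 0.0 | 71.0 | M | WHITE | USA | 99.34 | 177.8 | 31.424 | 2.215 | 123.76 | 68.002 | 53.736 | 24.0 | 26.0 | 44.0
10 | 0.0 | missing | 10.0 | 10.0 | CDISCPILOT01 | 01-701-1028 | XANOMELINE | QD | TRANSDERMAL | PATCH | 54.0 | missing | 2.0 | 0.0 | 0.0 | Day 1 | 1.0 | 7.983 | 7.983 | 8.0 | 8.0 | 2013-07-19T08:00:00 | 08:00:00 | 2013-07-19T00:01:00 | 1.772 | 0.572 | 0.0 | 0.01 | N | 0.0 | 71.0 | M | WHITE | USA | 99.34 | 177.8 | 31.424 | 2.215 | 123.76 | 68.002 | 53.736 | 24.0 | 26.0 | 44.0
11 | 0.0 | missing | 11.0 | 11.0 | CDISCPILOT01 | 01-701-1028 | XANOMELINE | QD | TRANSDERMAL | PATCH | 54.0 | missing | 2.0 | 0.0 | 0.0 | Day 1 | 1.0 | 11.983 | 11.983 | 12.0 | 12.0 | 2013-07-19T12:00:00 | 12:00:00 | 2013-07-19T00:01:00 | 0.495 | -0.703 | 0.0 | 0.01 | N | 0.0 | 71.0 | M | WHITE | USA | 99.34 | 177.8 | 31.424 | 2.215 | 123.76 | 68.002 | 53.736 | 24.0 | 26.0 | 44.0
12 | 0.0 | missing | 12.0 | 12.0 | CDISCPILOT01 | 01-701-1028 | XANOMELINE | QD | TRANSDERMAL | PATCH | 54.0 | missing | 2.0 | 0.0 | 0.0 | Day 1 | 1.0 | 15.983 | 15.983 | 16.0 | 16.0 | 2013-07-19T16:00:00 | 16:00:00 | 2013-07-19T00:01:00 | 0.138 | -1.981 | 0.0 | 0.01 | N | 0.0 | 71.0 | M | WHITE | USA | 99.34 | 177.8 | 31.424 | 2.215 | 123.76 | 68.002 | 53.736 | 24.0 | 26.0 | 44.0
13 | 0.0 | missing | 13.0 | 13.0 | CDISCPILOT01 | 01-701-1028 | XANOMELINE | QD | TRANSDERMAL | PATCH | 54.0 | missing | 2.0 | 0.0 | 0.0 | Day 2 | 2.0 | 23.983 | 23.983 | 24.0 | 24.0 | 2013-07-20T00:00:00 | 00:00:00 | 2013-07-19T00:01:00 | 0.011 | -4.537 | 0.0 | 0.01 | N | 0.0 | 71.0 | M | WHITE | USA | 99.34 | 177.8 | 31.424 | 2.215 | 123.76 | 68.002 | 53.736 | 24.0 | 26.0 | 44.0
14 | 0.0 | missing | 14.0 | 14.0 | CDISCPILOT01 | 01-701-1028 | XANOMELINE | QD | TRANSDERMAL | PATCH | 54.0 | 54.0 | 1.0 | 0.0 | 1.0 | Day 2 | 2.0 | 24.0 | 0.0 | 24.0 | 0.0 | 2013-07-20T00:01:00 | 00:01:00 | 2013-07-19T00:01:00 | missing | missing | 1.0 | 0.01 | N | 0.0 | 71.0 | M | WHITE | USA | 99.34 | 177.8 | 31.424 | 2.215 | 123.76 | 68.002 | 53.736 | 24.0 | 26.0 | 44.0
15 | 0.0 | missing | 15.0 | 15.0 | CDISCPILOT01 | 01-701-1028 | XANOMELINE | QD | TRANSDERMAL | PATCH | 54.0 | missing | 2.0 | 0.0 | 0.0 | Day 2 | 2.0 | 35.983 | 11.983 | 36.0 | 12.0 | 2013-07-20T12:00:00 | 12:00:00 | 2013-07-19T00:01:00 | missing | missing | 1.0 | 0.01 | N | 0.0 | 71.0 | M | WHITE | USA | 99.34 | 177.8 | 31.424 | 2.215 | 123.76 | 68.002 | 53.736 | 24.0 | 26.0 | 44.0
16 | 0.0 | missing | 16.0 | 16.0 | CDISCPILOT01 | 01-701-1028 | XANOMELINE | QD | TRANSDERMAL | PATCH | 54.0 | missing | 2.0 | 0.0 | 0.0 | Day 3 | 3.0 | 47.983 | 23.983 | 48.0 | 24.0 | 2013-07-21T00:00:00 | 00:00:00 | 2013-07-19T00:01:00 | missing | missing | 1.0 | 0.01 | N | 0.0 | 71.0 | M | WHITE | USA | 99.34 | 177.8 | 31.424 | 2.215 | 123.76 | 68.002 | 53.736 | 24.0 | 26.0 | 44.0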

This final step:

  1. Generates sequence numbers (RECSEQ for all records, ASEQ within analysis groups)
  2. Fills constant dose characteristics across subject groups
  3. Renames columns to ADaM standards
  4. Selects and orders columns logically
  5. Rounds all numeric columns using round_columns

The sequence numbers are important:

  • RECSEQ: Unique identifier for every record in the dataset
  • ASEQ: Sequential number within analysis groups (subject + drug)
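The difference between the two sequence numbers can be shown on a toy vector of group keys. The group_sequence helper is hypothetical; in the pipeline above the same effect comes from transform with eachindex, applied once to the whole table and once per group.

```julia
# Running count of records per group key, in row order.
function group_sequence(ids)
    counts = Dict{eltype(ids), Int}()
    seq = Int[]
    for id in ids
        counts[id] = get(counts, id, 0) + 1
        push!(seq, counts[id])
    end
    return seq
end

usubjid = ["A", "A", "B", "A", "B"]   # hypothetical group keys in row order
recseq = collect(eachindex(usubjid))  # one running number for the whole dataset
aseq = group_sequence(usubjid)        # restarts at 1 within each group
```

RECSEQ runs from 1 to 5 here, while ASEQ counts 1, 2, 3 for subject A and 1, 2 for subject B.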

Summary

The complete ADPPK workflow demonstrates how ADaM.jl functions work together:

  1. convert_to_missing: Clean source data
  2. make_dtm, make_dtm_to_dt, make_dtm_to_tm: Process datetime variables
  3. expand_dose_events: Create individual dose records
  4. compress_dose_events: Compress consecutive dose records
  5. set_exclusion: Flag data quality issues systematically
  6. join_columns: Link to reference data for relative calculations
  7. make_duration: Compute time differences
  8. merge_columns: Integrate covariate data efficiently
  9. body_mass_index, body_surface_area, creatinine_clearance, est_glomerular_filtration_rate: Derive standard covariates
  10. round_columns: Round numeric values in the final DataFrame

Each function is designed to be composable (works seamlessly in DataFramesMeta @chain pipelines), robust (handles missing data and edge cases), validated (uses established formulas), and documented (clear parameters and behavior).

Key Analysis Variables

The final ADPPK dataset includes critical variables for pharmacometric analysis:

Identification and Sequencing

  • STUDYID: Study identifier
  • USUBJID: Unique subject identifier
  • PROJID: Project/treatment identifier
  • RECSEQ: Record sequence number
  • ASEQ: Analysis sequence number

Dosing Information

  • EVID: Event ID (0=observation, 1=dose)
  • CMT: Compartment (1=central, 2=observation)
  • AMT: Dose amount
  • DOSEA: Actual dose amount (carried forward to observations)
  • ROUTE: Route of administration
  • FORM: Formulation
  • DOSEFRQ: Dose frequency

Time Variables

  • ADTM: Analysis datetime
  • ATM: Analysis time
  • FANLDTM: First dose datetime for the analyte
  • NFRLT: Nominal time relative to first dose
  • AFRLT: Actual time relative to first dose
  • NPRLT: Nominal time relative to previous dose
  • APRLT: Actual time relative to previous dose
  • AVISITN: Analysis visit number
  • AVISIT: Analysis visit

Concentration Variables

  • DV: Dependent variable (concentration)
  • DVL: Log-transformed concentration
  • MDV: Missing DV flag (0=observed, 1=missing)
  • ALLOQ: Analysis lower limit of quantification
  • BLQFL: Below LLOQ flag ("Y"/"N")
  • BLQFN: Below LLOQ flag numeric (1/0)

Covariates

  • AGE: Age in years
  • SEX: Sex
  • RACE: Race
  • COUNTRY: Country
  • WTBL: Baseline weight
  • HTBL: Baseline height
  • BMIBL: Baseline BMI
  • BSABL: Baseline BSA
  • CREATBL: Baseline creatinine
  • CRCLBL: Baseline creatinine clearance
  • EGFRBL: Baseline eGFR
  • ASTBL, ALTBL, ALBBL: Baseline liver function biomarkers

Exclusion Flags

  • EXCLF: Exclusion flag (0=include, 1=exclude)
  • EXCLFCOM: Exclusion comment

Conclusion

This tutorial demonstrated the complete workflow for creating an ADPPK dataset using ADaM.jl. The resulting dataset is ready for population PK modeling.

You can customize this template for your specific study by modifying exclusion criteria, adding study-specific covariates, adjusting column selection, or extending the workflow to create related datasets (ADPC, ADNCA, etc.).

For more detailed information on each function, see the ADaM Docstrings.