Introduction to ADaM

This tutorial demonstrates ADaM.jl, a Julia package for generating ADaM (Analysis Data Model) datasets from SDTM (Study Data Tabulation Model) data. ADaM.jl supports creation of various analysis datasets including:

ADPPK (ADaM Dataset for Pop-PK)
ADNCA (ADaM Dataset for NCA)

The basic workflow of ADaM dataset creation is:

Read SDTM source data.
Transform and derive analysis variables.
Apply data quality checks and exclusions.
Generate the final analysis-ready dataset.

In this tutorial we will show how to build a complete ADPPK dataset from SDTM data, demonstrating the key functions and transformations available in ADaM.jl. The workflow involves reading concentration (PC), dosing (EX), demographics (DM), vital signs (VS), and lab (LB) data, then systematically deriving all required analysis variables.

Experimental API

The ADaM.jl API is currently experimental and subject to change in future releases. Please be aware that function signatures, behavior, and output formats may be modified as the package evolves.

Setup and Data Loading

First, let's import the required packages and load the SDTM source data:

# Import Required Packages
using DataFramesMeta
using Dates
using StatsBase
using PharmaDatasets
using ADaM

# Read SDTM source data
pc = @chain dataset("SDTM/CDISCPILOT01/pc") convert_to_missing(["", nothing])
ex = @chain dataset("SDTM/CDISCPILOT01/ex") convert_to_missing(["", nothing])
dm = @chain dataset("SDTM/CDISCPILOT01/dm") convert_to_missing(["", nothing])
vs = @chain dataset("SDTM/CDISCPILOT01/vs") convert_to_missing(["", nothing])
lb = @chain dataset("SDTM/CDISCPILOT01/lb") convert_to_missing(["", nothing])

# Drug lookup table
drug_lookup = unique(@select(ex, :USUBJID, :DRUG = :EXTRT))

first(select(pc, :USUBJID, :PCDTC, :PCTPT, :PCTPTNUM, :PCSTRESN, :PCSTRESU), 2)

2×6 DataFrame

Row	USUBJID	PCDTC	PCTPT	PCTPTNUM	PCSTRESN	PCSTRESU
	String15	String31	String31	Float64	Float64?	String7
1	01-701-1015	2014-01-01T23:30:00	Pre-dose	-0.5	0.0	ug/ml
2	01-701-1015	2014-01-02T00:05:00	5 Min Post-dose	0.08	missing	ug/ml

`convert_to_missing`

The convert_to_missing function converts problematic values to proper missing values. Common values to convert include empty strings (""), nothing, special markers like "NA", ".", "BLQ", and NaN for numeric data. This ensures consistent missing data handling throughout the workflow and prevents issues with downstream operations.

Datetime Processing

Clinical trial data contains datetime information in various string formats that need to be parsed and transformed.

Preparing Concentration Data

# Prepare PC data with datetime conversions and derived variables
pc_prep = @chain pc begin
    make_dtm(_, :PCDTC)
    make_dtm_to_dt(_, :ADTM)
    make_dtm_to_tm(_, :ADTM)

    @rtransform @astable begin
        :NFRLT = (:PCTPTNUM < 0) ? 0 : :PCTPTNUM
        :AVISITN = Int(div(:NFRLT, 24)) + 1
        :AVISIT = "Day " * string(:AVISITN)
        :EVID = 0
        :II = 0
    end

    leftjoin(drug_lookup, on = :USUBJID)

    transform(
        groupby(_, [:USUBJID, :DRUG]),
        :ADTM => (x -> maximum(skipmissing(x))) => :LOBSDTM,
    )

    @orderby :USUBJID :DRUG :ADTM
end

first(
    select(pc_prep, :USUBJID, :DRUG, :EVID, :PCDTC, :ADTM, :ADT, :ATM, :NFRLT, :AVISITN),
    2,
)

2×9 DataFrame

Row	USUBJID	DRUG	EVID	PCDTC	ADTM	ADT	ATM	NFRLT	AVISITN
	String15	String15?	Int64	String31	DateTime	Date	Time	Real	Int64
1	01-701-1015	PLACEBO	0	2014-01-01T23:30:00	2014-01-01T23:30:00	2014-01-01	23:30:00	0	1
2	01-701-1015	PLACEBO	0	2014-01-02T00:05:00	2014-01-02T00:05:00	2014-01-02	00:05:00	0.08	1

`make_dtm`

The make_dtm function converts character datetime columns to Julia DateTime objects. The prefix parameter controls the output column name (default is "A"), and fmt specifies the datetime format string.

`make_dtm_to_dt` and `make_dtm_to_tm`

These functions extract components from datetime objects:

make_dtm_to_dt: Extracts the date portion (year-month-day)
make_dtm_to_tm: Extracts the time portion (hour:minute:second)

Both functions use the same prefix logic as make_dtm to name output columns.

Preparing Dosing Data

# Prepare EX data with datetime conversions
ex_prep = @chain ex begin
    make_dtm(_, :EXSTDTC, prefix = "AST", fmt = "yyyy-mm-dd")
    make_dtm(_, :EXENDTC, prefix = "AEN", fmt = "yyyy-mm-dd")
    @rtransform :AENDTM = ismissing(:AENDTM) ? :ASTDTM : :AENDTM
    make_dtm_to_dt(_, :ASTDTM, prefix = "AST")
    make_dtm_to_dt(_, :AENDTM, prefix = "AEN")

    @rtransform @astable begin
        :ASTDTM = :ASTDTM + Minute(1)
        :AENDTM = :AENDTM + Minute(1)
        :II = interdose_interval(:EXDOSFRQ)
        :NFRLT = 24 * (:VISITDY - 1)
        :AVISITN = Int(div(:NFRLT, 24)) + 1
        :AVISIT = "Day " * string(:AVISITN)
        :EVID = 1
        :DRUG = :EXTRT
    end

    transform(
        groupby(_, [:USUBJID, :DRUG]),
        :ASTDTM => (x -> minimum(skipmissing(x))) => :FANLDTM,
        :EXDOSE => (x -> minimum(skipmissing(x))) => :EXDOSE_first,
    )

    @orderby :USUBJID :DRUG :ASTDTM
end

Dose Event Expansion

One of the most powerful transformations for ADPPK datasets is expanding dose intervals.

`expand_dose_events`

# Expand dose events
ex_exp = @chain ex_prep begin
    expand_dose_events
    transform(groupby(_, :USUBJID), eachindex => :EXSEQ)

    @rtransform @astable begin
        :AVISITN = Int(div(:NFRLT, 24)) + 1
        :AVISIT = "Day " * string(:AVISITN)
        :ADTM = :ASTDTM
        :EVID = 1
    end

    make_dtm_to_dt(_, :ADTM, prefix = "A")
    make_dtm_to_tm(_, :ADTM, prefix = "A")
end

first(select(ex_exp, :USUBJID, :DRUG, :EVID, :ASTDTM, :EXDOSE, :NFRLT), 3)

3×6 DataFrame

Row	USUBJID	DRUG	EVID	ASTDTM	EXDOSE	NFRLT
	String15	String15	Int64	DateTime	Float64	Float64
1	01-701-1015	PLACEBO	1	2014-01-02T00:01:00	0.0	0.0
2	01-701-1015	PLACEBO	1	2014-01-03T00:01:00	0.0	24.0
3	01-701-1015	PLACEBO	1	2014-01-04T00:01:00	0.0	48.0

In SDTM EX data, multiple doses are often represented as intervals with a frequency code (e.g., QD = once daily, BID = twice daily). The expand_dose_events function creates a separate row for each individual dose by:

Identifying dose intervals using start (ASTDT) and end (AENDT) dates
Parsing the frequency code (EXDOSFRQ) to determine inter-dose interval
Creating individual dose records with correct timing (NFRLT)
Maintaining all other EX variables for each expanded record

This is essential for pharmacometric analysis, which requires one record per actual dose administration.

Dose Compression

After expanding individual doses, we can optionally compress consecutive identical doses into NONMEM-style records using the ADDL (additional doses) column for more compact representation.

# Compress dose events
adppk_cmp = @chain pc_prep begin
    vcat(ex_exp, cols = :union)
    @orderby :USUBJID :DRUG :ADTM :EVID

    compress_dose_events
end

first(select(adppk_cmp, :USUBJID, :DRUG, :EVID, :ADTM, :ADDL, :II), 3)

3×6 DataFrame

Row	USUBJID	DRUG	EVID	ADTM	ADDL	II
	String15	String15	Int64	DateTime	Int64	Int64
1	01-701-1015	PLACEBO	0	2014-01-01T23:30:00	0	0
2	01-701-1015	PLACEBO	1	2014-01-02T00:01:00	0	24
3	01-701-1015	PLACEBO	0	2014-01-02T00:05:00	0	0

Data Quality and Exclusions

`set_exclusion`

# Combine PC and EX data with exclusion flags
adppk_flag = @chain adppk_cmp begin
    transform(eachindex => :SEQ)
    transform(
        groupby(_, [:USUBJID, :DRUG]),
        :LOBSDTM => (x -> minimum(skipmissing(x))) => :LOBSDTM,
    )

    set_exclusion(
        "Subjects with missing conc.",
        excl_func = group -> all(ismissing, replace(group.PCSTRESN, 0 => missing)),
        group = [:USUBJID, :DRUG],
    )
    set_exclusion(
        "Subjects with no dose records",
        excl_func = group -> all((==)(0), group.EVID),
        group = [:USUBJID, :DRUG],
    )
    set_exclusion(
        "Subjects with no conc. records",
        excl_func = group -> all((==)(1), group.EVID),
        group = [:USUBJID, :DRUG],
    )
    set_exclusion(
        "Dosing records after last observation",
        excl_func = group -> group.ADTM >= group.LOBSDTM && all((==)(1), group.EVID),
        group = [:SEQ],
    )

    @orderby :USUBJID :DRUG :ADTM :EVID
end

first(select(adppk_flag, :USUBJID, :DRUG, :EVID, :ADTM, :EXCLF, :EXCLFCOM), 3)

3×6 DataFrame

Row	USUBJID	DRUG	EVID	ADTM	EXCLF	EXCLFCOM
	String15	String15	Int64	DateTime	Int64	String?
1	01-701-1015	PLACEBO	0	2014-01-01T23:30:00	1	Subjects with missing conc.
2	01-701-1015	PLACEBO	1	2014-01-02T00:01:00	1	Subjects with missing conc.
3	01-701-1015	PLACEBO	0	2014-01-02T00:05:00	1	Subjects with missing conc.

Data quality is critical for pharmacometric analysis. The set_exclusion function provides a systematic way to flag records that should be excluded. It creates two columns:

EXCLF: Exclusion Flag (0 = include, 1 = exclude)
EXCLFCOM: Exclusion Flag Comment (text description of exclusion reason)

Key parameters:

excl_func: Anonymous function defining the exclusion condition
group: Grouping columns to apply the function (e.g., by subject, by record)

You can apply multiple exclusion criteria sequentially - the function accumulates flags and appends comments.

Reference Data Joins

`join_columns`

# Derive reference data and time variables
adppk_nom_prev = @chain adppk_flag begin
    join_columns(
        ex_exp,
        on = [:USUBJID],
        order = [:ADTM],
        keep = [:ADTM => :ADTM_prev, :EXDOSE => :EXDOSE_prev],
        filter_join = (t, r) -> t.ADTM > r.ADTM,
        mode = "last",
    )
    join_columns(
        ex_exp,
        on = [:USUBJID],
        order = [:NFRLT],
        keep = [:NFRLT => :NFRLT_prev],
        filter_join = (t, r) -> t.NFRLT > r.NFRLT,
        mode = "last",
    )

    @orderby :USUBJID :DRUG :ADTM :EVID
end

first(
    select(
        adppk_nom_prev,
        :USUBJID,
        :DRUG,
        :EVID,
        :ADTM,
        :ADTM_prev,
        :EXDOSE_prev,
        :NFRLT,
        :NFRLT_prev,
    ),
    3,
)

3×8 DataFrame

Row	USUBJID	DRUG	EVID	ADTM	ADTM_prev	EXDOSE_prev	NFRLT	NFRLT_prev
	String15	String15	Int64	DateTime	DateTime?	Float64?	Float64	Float64?
1	01-701-1015	PLACEBO	0	2014-01-01T23:30:00	missing	missing	0.0	missing
2	01-701-1015	PLACEBO	1	2014-01-02T00:01:00	missing	missing	0.0	missing
3	01-701-1015	PLACEBO	0	2014-01-02T00:05:00	2014-01-02T00:01:00	0.0	0.08	0.0

A critical step in ADPPK creation is joining reference data from prior records to derive relative time variables. The join_columns function is a specialized join that retrieves values from reference records based on conditions.

Parameters:

on: Columns to join by (typically subject ID)
order: Column(s) defining the temporal sequence
keep: Dictionary mapping source columns to target column names
filter_join: Condition (as anonymous function) defining valid reference records
mode: "last" for most recent, "first" for earliest reference

Time Duration Calculations

`make_duration`

# Derive relative time variables
adppk_aprlt = @chain adppk_nom_prev begin
    transform(
        groupby(_, [:USUBJID, :DRUG]),
        :FANLDTM => (x -> minimum(skipmissing(x))) => :FANLDTM,
        :NFRLT => (x -> minimum(skipmissing(x))) => :min_NFRLT,
        :EXDOSE_first => (x -> minimum(skipmissing(x))) => :EXDOSE_first,
    )

    make_duration(:AFRLT, start_dtm = :FANLDTM, end_dtm = :ADTM)
    make_duration(:APRLT, start_dtm = :ADTM_prev, end_dtm = :ADTM)

    @rtransform :APRLT = :EVID == 1 ? 0 : ismissing(:APRLT) ? :AFRLT : :APRLT
    @rtransform :NPRLT =
        (:EVID == 1) ? 0 :
        ismissing(:NFRLT_prev) ? (:NFRLT - :min_NFRLT) : (:NFRLT - :NFRLT_prev)

    @orderby :USUBJID :DRUG :ADTM :EVID
end

first(
    select(
        adppk_aprlt,
        :USUBJID,
        :DRUG,
        :EVID,
        :ADTM,
        :FANLDTM,
        :AFRLT,
        :APRLT,
        :NFRLT,
        :NPRLT,
    ),
    3,
)

3×9 DataFrame

Row	USUBJID	DRUG	EVID	ADTM	FANLDTM	AFRLT	APRLT	NFRLT	NPRLT
	String15	String15	Int64	DateTime	DateTime	Float64	Real	Float64	Real
1	01-701-1015	PLACEBO	0	2014-01-01T23:30:00	2014-01-02T00:01:00	-0.516667	-0.516667	0.0	0.0
2	01-701-1015	PLACEBO	1	2014-01-02T00:01:00	2014-01-02T00:01:00	0.0	0	0.0	0
3	01-701-1015	PLACEBO	0	2014-01-02T00:05:00	2014-01-02T00:01:00	0.0666667	0.0666667	0.08	0.08

With reference data available, we can calculate relative time variables. The make_duration function computes time differences in hours (fractional values allowed) and handles missing values appropriately.

Standard relative time variables in ADPPK:

Variable	Description	Reference Point
AFRLT	Actual Time Relative to First Dose	FANLDTM (First Dose DateTime)
APRLT	Actual Time Relative to Previous Dose	ADTM_prev (Previous Dose DateTime)
NFRLT	Nominal Time Relative to First Dose	Planned timepoints
NPRLT	Nominal Time Relative to Previous Dose	NFRLT_prev

Demographics and Analysis Variables

# Add demographics
adppk_dm = leftjoin(adppk_aprlt, dm, on = :USUBJID, makeunique = true)

# Derive analysis variables
adppk_aval = @chain adppk_dm begin
    @orderby :USUBJID :ADTM

    @rtransform @astable begin
        :DOSEA =
            (:EVID == 1) ? :EXDOSE : ismissing(:EXDOSE_prev) ? :EXDOSE_first : :EXDOSE_prev
        :CMT = (:EVID == 1) ? 1 : 2
        :AMT = (:EVID == 1) ? :EXDOSE : missing
    end

    transform(
        groupby(_, [:USUBJID, :DRUG]),
        :PCLLOQ => (x -> only(unique(skipmissing(x)))) => :ALLOQ,
    )
    @rtransform :BLQFL = (coalesce(:PCSTRESN, 999) <= :ALLOQ) ? "Y" : "N"
    @rtransform :BLQFN = (coalesce(:PCSTRESN, 999) <= :ALLOQ) ? 1 : 0
    @rtransform :DV = (:EVID == 1) ? missing : :PCSTRESN
    @rtransform @passmissing :DVL = (:DV > 0) ? log(:DV) : missing
    @rtransform :MDV = (:EVID == 1) ? 1 : ismissing(:DV) ? 1 : 0
    @rtransform :AVALU = (:EVID == 1) ? :EXDOSU : :PCSTRESU

    @orderby :USUBJID :DRUG :ADTM :EVID
end

first(select(adppk_aval, :USUBJID, :DRUG, :EVID, :ADTM, :DV, :DOSEA, :AMT, :BLQFL, :MDV), 3)

3×9 DataFrame

Row	USUBJID	DRUG	EVID	ADTM	DV	DOSEA	AMT	BLQFL	MDV
	String15	String15	Int64	DateTime	Float64?	Float64	Float64?	String	Int64
1	01-701-1015	PLACEBO	0	2014-01-01T23:30:00	0.0	0.0	missing	Y	0
2	01-701-1015	PLACEBO	1	2014-01-02T00:01:00	missing	0.0	0.0	N	1
3	01-701-1015	PLACEBO	0	2014-01-02T00:05:00	missing	0.0	missing	N	1

This step adds demographics and derives key PK analysis variables including:

DOSEA: Actual dose amount carried forward to observations
CMT: Compartment (1=central for doses, 2=observation for concentrations)
AMT: Dose amount
DV: Dependent variable (concentration)
DVL: Log-transformed concentration
MDV: Missing DV flag
BLQFL/BLQFN: Below limit of quantification flags

Covariate Lookup Tables

`merge_columns`

# Create covariate lookup tables
lb_val = @chain lb begin
    @rsubset(:LBBLFL == "Y")
    @rtransform :LBTESTCDB = string(:LBTESTCD, "BL")
    unstack(:USUBJID, :LBTESTCDB, :LBSTRESN)
end

lb_uni = @chain lb begin
    @rsubset(:LBBLFL == "Y")
    @rtransform :LBTESTCDU = string(:LBTESTCD, "BLU")
    unstack(:USUBJID, :LBTESTCDU, :LBSTRESU)
end

lb_lookup = @chain dm begin
    @select :USUBJID
    leftjoin(lb_val, on = :USUBJID)
    leftjoin(lb_uni, on = :USUBJID)
end

vs_lookup = @chain dm begin
    @select :USUBJID
    merge_columns(
        vs,
        filter_func = r -> r.VSTESTCD == "HEIGHT",
        on = [:USUBJID],
        keep = ["VSSTRESN" => "HTBL", "VSSTRESU" => "HTBLU"],
    )
    merge_columns(
        vs,
        filter_func = r ->
            coalesce(r.VSTESTCD, "") == "WEIGHT" && coalesce(r.VSBLFL, "") == "Y",
        on = [:USUBJID],
        keep = ["VSSTRESN" => "WTBL", "VSSTRESU" => "WTBLU"],
    )
end

first(vs_lookup, 2)

2×5 DataFrame

Row	USUBJID	HTBL	HTBLU	WTBL	WTBLU
	String15	Float64?	String15?	Float64?	String15?
1	01-701-1015	147.32	cm	54.43	kg
2	01-701-1023	162.56	cm	80.29	kg

The merge_columns function combines leftjoin, subset, and select operations into a single convenient call. It's especially useful for creating lookup tables from vertical (long) datasets like VS and LB.

Parameters:

filter_func: Anonymous function to subset the secondary dataset before joining
on: Join key columns (typically subject ID)
keep: Dictionary mapping source column names to target column names

Derived Covariates

# Create ADPPK master dataset with all covariates
adppk_master = @chain adppk_aval begin
    @rtransform :AGEU = "yr"

    leftjoin(lb_lookup, on = :USUBJID)
    leftjoin(vs_lookup, on = :USUBJID)

    body_mass_index
    body_surface_area

    transform(
        groupby(_, [:USUBJID, :DRUG]),
        :AGE => (x -> first(skipmissing(x))) => :AGE,
        :SEX => (x -> first(skipmissing(x))) => :SEX,
    )

    creatinine_clearance
    est_glomerular_filtration_rate
end

first(
    select(
        adppk_master,
        :USUBJID,
        :DRUG,
        :EVID,
        :ADTM,
        :DV,
        :DOSEA,
        :AGE,
        :SEX,
        :WTBL,
        :BMIBL,
        :CRCLBL,
    ),
    2,
)

2×11 DataFrame

Row	USUBJID	DRUG	EVID	ADTM	DV	DOSEA	AGE	SEX	WTBL	BMIBL	CRCLBL
	String15	String15	Int64	DateTime	Float64?	Float64	Float64	String3	Float64?	Float64?	Float64?
1	01-701-1015	PLACEBO	0	2014-01-01T23:30:00	0.0	0.0	63.0	F	54.43	25.0793	54.977
2	01-701-1015	PLACEBO	1	2014-01-02T00:01:00	missing	0.0	63.0	F	54.43	25.0793	54.977

ADaM.jl provides validated functions to compute standard derived covariates:

`body_mass_index`

Computes Body Mass Index: $\text{BMI} = \frac{\text{weight (kg)}}{\text{height (m)}^2}$

`body_surface_area`

Computes Body Surface Area using the Mosteller formula: $\text{BSA} = \sqrt{\frac{\text{height (cm)} \times \text{weight (kg)}}{3600}}$

`creatinine_clearance`

Computes Creatinine Clearance using the Cockcroft-Gault equation: $\text{CrCL} = \frac{(140 - \text{age}) \times \text{weight} \times (0.85 \text{ if female})}{72 \times \text{serum creatinine}}$

`est_glomerular_filtration_rate`

Computes estimated Glomerular Filtration Rate using the CKD-EPI equation, which accounts for age, sex, race, and creatinine.

All these functions automatically detect required input columns with units, apply validated formulas, and create appropriately named output columns with units.

Final Dataset

`round_columns`

# Generate final ADPPK dataset
adppk = @chain adppk_master begin
    @orderby :EXCLF :USUBJID :DRUG :ADTM :EVID

    transform(eachindex => :RECSEQ)
    transform(groupby(_, [:USUBJID, :DRUG]), eachindex => :ASEQ)

    transform(
        groupby(_, [:USUBJID, :DRUG]),
        :EXROUTE => (x -> first(skipmissing(x))) => :EXROUTE,
        :EXDOSFRM => (x -> first(skipmissing(x))) => :EXDOSFRM,
        :EXDOSFRQ => (x -> first(skipmissing(x))) => :EXDOSFRQ,
    )

    rename(:DRUG => :PROJID, :EXDOSFRQ => :DOSEFRQ, :EXROUTE => :ROUTE, :EXDOSFRM => :FORM)

    select(
        :EXCLF,
        :EXCLFCOM,
        :RECSEQ,
        :ASEQ,
        :STUDYID,
        :USUBJID,
        :PROJID,
        :DOSEFRQ,
        :ROUTE,
        :FORM,
        :DOSEA,
        :AMT,
        :CMT,
        :ADDL,
        :EVID,
        :AVISIT,
        :AVISITN,
        :AFRLT,
        :APRLT,
        :NFRLT,
        :NPRLT,
        :ADTM,
        :ATM,
        :FANLDTM,
        :DV,
        :DVL,
        :MDV,
        :ALLOQ,
        :BLQFL,
        :BLQFN,
        :AGE,
        :SEX,
        :RACE,
        :COUNTRY,
        :WTBL,
        :HTBL,
        :BMIBL,
        :BSABL,
        :CREATBL,
        :CRCLBL,
        :EGFRBL,
        :ASTBL,
        :ALTBL,
        :ALBBL,
    )

    round_columns(3)
end

first(adppk, 16)

16×44 DataFrame

Row	EXCLF	EXCLFCOM	RECSEQ	ASEQ	STUDYID	USUBJID	PROJID	DOSEFRQ	ROUTE	FORM	DOSEA	AMT	CMT	ADDL	EVID	AVISIT	AVISITN	AFRLT	APRLT	NFRLT	NPRLT	ADTM	ATM	FANLDTM	DV	DVL	MDV	ALLOQ	BLQFL	BLQFN	AGE	SEX	RACE	COUNTRY	WTBL	HTBL	BMIBL	BSABL	CREATBL	CRCLBL	EGFRBL	ASTBL	ALTBL	ALBBL
	Float64	String?	Float64	Float64	String15	String15	String15	String3	String15	String7	Float64	Float64?	Float64	Float64	Float64	String	Float64	Float64	Float64	Float64	Float64	DateTime	Time	DateTime	Float64?	Float64?	Float64	Float64	String	Float64	Float64	String3	String?	String3?	Float64?	Float64	Float64?	Float64?	Float64?	Float64?	Float64?	Float64?	Float64?	Float64?
1	0.0	missing	1.0	1.0	CDISCPILOT01	01-701-1028	XANOMELINE	QD	TRANSDERMAL	PATCH	54.0	missing	2.0	0.0	0.0	Day 1	1.0	-0.517	-0.517	0.0	0.0	2013-07-18T23:30:00	23:30:00	2013-07-19T00:01:00	0.0	missing	0.0	0.01	Y	1.0	71.0	M	WHITE	USA	99.34	177.8	31.424	2.215	123.76	68.002	53.736	24.0	26.0	44.0
2	0.0	missing	2.0	2.0	CDISCPILOT01	01-701-1028	XANOMELINE	QD	TRANSDERMAL	PATCH	54.0	54.0	1.0	0.0	1.0	Day 1	1.0	0.0	0.0	0.0	0.0	2013-07-19T00:01:00	00:01:00	2013-07-19T00:01:00	missing	missing	1.0	0.01	N	0.0	71.0	M	WHITE	USA	99.34	177.8	31.424	2.215	123.76	68.002	53.736	24.0	26.0	44.0
3	0.0	missing	3.0	3.0	CDISCPILOT01	01-701-1028	XANOMELINE	QD	TRANSDERMAL	PATCH	54.0	missing	2.0	0.0	0.0	Day 1	1.0	0.067	0.067	0.08	0.08	2013-07-19T00:05:00	00:05:00	2013-07-19T00:01:00	0.102	-2.287	0.0	0.01	N	0.0	71.0	M	WHITE	USA	99.34	177.8	31.424	2.215	123.76	68.002	53.736	24.0	26.0	44.0
4	0.0	missing	4.0	4.0	CDISCPILOT01	01-701-1028	XANOMELINE	QD	TRANSDERMAL	PATCH	54.0	missing	2.0	0.0	0.0	Day 1	1.0	0.483	0.483	0.5	0.5	2013-07-19T00:30:00	00:30:00	2013-07-19T00:01:00	0.547	-0.603	0.0	0.01	N	0.0	71.0	M	WHITE	USA	99.34	177.8	31.424	2.215	123.76	68.002	53.736	24.0	26.0	44.0
5	0.0	missing	5.0	5.0	CDISCPILOT01	01-701-1028	XANOMELINE	QD	TRANSDERMAL	PATCH	54.0	missing	2.0	0.0	0.0	Day 1	1.0	0.983	0.983	1.0	1.0	2013-07-19T01:00:00	01:00:00	2013-07-19T00:01:00	0.925	-0.077	0.0	0.01	N	0.0	71.0	M	WHITE	USA	99.34	177.8	31.424	2.215	123.76	68.002	53.736	24.0	26.0	44.0
6	0.0	missing	6.0	6.0	CDISCPILOT01	01-701-1028	XANOMELINE	QD	TRANSDERMAL	PATCH	54.0	missing	2.0	0.0	0.0	Day 1	1.0	1.483	1.483	1.5	1.5	2013-07-19T01:30:00	01:30:00	2013-07-19T00:01:00	1.188	0.172	0.0	0.01	N	0.0	71.0	M	WHITE	USA	99.34	177.8	31.424	2.215	123.76	68.002	53.736	24.0	26.0	44.0
7	0.0	missing	7.0	7.0	CDISCPILOT01	01-701-1028	XANOMELINE	QD	TRANSDERMAL	PATCH	54.0	missing	2.0	0.0	0.0	Day 1	1.0	1.983	1.983	2.0	2.0	2013-07-19T02:00:00	02:00:00	2013-07-19T00:01:00	1.369	0.314	0.0	0.01	N	0.0	71.0	M	WHITE	USA	99.34	177.8	31.424	2.215	123.76	68.002	53.736	24.0	26.0	44.0
8	0.0	missing	8.0	8.0	CDISCPILOT01	01-701-1028	XANOMELINE	QD	TRANSDERMAL	PATCH	54.0	missing	2.0	0.0	0.0	Day 1	1.0	3.983	3.983	4.0	4.0	2013-07-19T04:00:00	04:00:00	2013-07-19T00:01:00	1.683	0.521	0.0	0.01	N	0.0	71.0	M	WHITE	USA	99.34	177.8	31.424	2.215	123.76	68.002	53.736	24.0	26.0	44.0
9	0.0	missing	9.0	9.0	CDISCPILOT01	01-701-1028	XANOMELINE	QD	TRANSDERMAL	PATCH	54.0	missing	2.0	0.0	0.0	Day 1	1.0	5.983	5.983	6.0	6.0	2013-07-19T06:00:00	06:00:00	2013-07-19T00:01:00	1.755	0.563	0.0	0.01	N	0.0	71.0	M	WHITE	USA	99.34	177.8	31.424	2.215	123.76	68.002	53.736	24.0	26.0	44.0
10	0.0	missing	10.0	10.0	CDISCPILOT01	01-701-1028	XANOMELINE	QD	TRANSDERMAL	PATCH	54.0	missing	2.0	0.0	0.0	Day 1	1.0	7.983	7.983	8.0	8.0	2013-07-19T08:00:00	08:00:00	2013-07-19T00:01:00	1.772	0.572	0.0	0.01	N	0.0	71.0	M	WHITE	USA	99.34	177.8	31.424	2.215	123.76	68.002	53.736	24.0	26.0	44.0
11	0.0	missing	11.0	11.0	CDISCPILOT01	01-701-1028	XANOMELINE	QD	TRANSDERMAL	PATCH	54.0	missing	2.0	0.0	0.0	Day 1	1.0	11.983	11.983	12.0	12.0	2013-07-19T12:00:00	12:00:00	2013-07-19T00:01:00	0.495	-0.703	0.0	0.01	N	0.0	71.0	M	WHITE	USA	99.34	177.8	31.424	2.215	123.76	68.002	53.736	24.0	26.0	44.0
12	0.0	missing	12.0	12.0	CDISCPILOT01	01-701-1028	XANOMELINE	QD	TRANSDERMAL	PATCH	54.0	missing	2.0	0.0	0.0	Day 1	1.0	15.983	15.983	16.0	16.0	2013-07-19T16:00:00	16:00:00	2013-07-19T00:01:00	0.138	-1.981	0.0	0.01	N	0.0	71.0	M	WHITE	USA	99.34	177.8	31.424	2.215	123.76	68.002	53.736	24.0	26.0	44.0
13	0.0	missing	13.0	13.0	CDISCPILOT01	01-701-1028	XANOMELINE	QD	TRANSDERMAL	PATCH	54.0	missing	2.0	0.0	0.0	Day 2	2.0	23.983	23.983	24.0	24.0	2013-07-20T00:00:00	00:00:00	2013-07-19T00:01:00	0.011	-4.537	0.0	0.01	N	0.0	71.0	M	WHITE	USA	99.34	177.8	31.424	2.215	123.76	68.002	53.736	24.0	26.0	44.0
14	0.0	missing	14.0	14.0	CDISCPILOT01	01-701-1028	XANOMELINE	QD	TRANSDERMAL	PATCH	54.0	54.0	1.0	0.0	1.0	Day 2	2.0	24.0	0.0	24.0	0.0	2013-07-20T00:01:00	00:01:00	2013-07-19T00:01:00	missing	missing	1.0	0.01	N	0.0	71.0	M	WHITE	USA	99.34	177.8	31.424	2.215	123.76	68.002	53.736	24.0	26.0	44.0
15	0.0	missing	15.0	15.0	CDISCPILOT01	01-701-1028	XANOMELINE	QD	TRANSDERMAL	PATCH	54.0	missing	2.0	0.0	0.0	Day 2	2.0	35.983	11.983	36.0	12.0	2013-07-20T12:00:00	12:00:00	2013-07-19T00:01:00	missing	missing	1.0	0.01	N	0.0	71.0	M	WHITE	USA	99.34	177.8	31.424	2.215	123.76	68.002	53.736	24.0	26.0	44.0
16	0.0	missing	16.0	16.0	CDISCPILOT01	01-701-1028	XANOMELINE	QD	TRANSDERMAL	PATCH	54.0	missing	2.0	0.0	0.0	Day 3	3.0	47.983	23.983	48.0	24.0	2013-07-21T00:00:00	00:00:00	2013-07-19T00:01:00	missing	missing	1.0	0.01	N	0.0	71.0	M	WHITE	USA	99.34	177.8	31.424	2.215	123.76	68.002	53.736	24.0	26.0	44.0

This final step:

Generates sequence numbers (RECSEQ for all records, ASEQ within analysis groups)
Fills constant dose characteristics across subject groups
Renames columns to ADaM standards
Selects and orders columns logically
Rounds all numeric columns using round_columns

The sequence numbers are important:

RECSEQ: Unique identifier for every record in the dataset
ASEQ: Sequential number within analysis groups (subject + drug)

Summary

The complete ADPPK workflow demonstrates how ADaM.jl functions work together:

convert_to_missing: Clean source data
make_dtm, make_dtm_to_dt, make_dtm_to_tm: Process datetime variables
expand_dose_events: Create individual dose records
compress_dose_events: Compress consecutive dose records
set_exclusion: Flag data quality issues systematically
join_columns: Link to reference data for relative calculations
make_duration: Compute time differences
merge_columns: Integrate covariate data efficiently
body_mass_index, body_surface_area, creatinine_clearance, est_glomerular_filtration_rate: Derive standard covariates
round_columns: rounds numeric values in the dataframe

Each function is designed to be composable (works seamlessly in DataFramesMeta @chain pipelines), robust (handles missing data and edge cases), validated (uses established formulas), and documented (clear parameters and behavior).

Key Analysis Variables

The final ADPPK dataset includes critical variables for pharmacometric analysis:

Identification and Sequencing

STUDYID: Study identifier
USUBJID: Unique subject identifier
PROJID: Project/treatment identifier
RECSEQ: Record sequence number
ASEQ: Analysis sequence number

Dosing Information

EVID: Event ID (0=observation, 1=dose)
CMT: Compartment (1=central, 2=observation)
AMT: Dose amount
DOSEA: Actual dose amount (carried forward to observations)
ROUTE: Route of administration
FORM: Formulation
DOSEFRQ: Dose frequency

Time Variables

ADTM: Analysis datetime
ATM: Analysis time
FANLDTM: First Analyte dose datetime
NFRLT: Nominal time relative to first dose
AFRLT: Actual time relative to first dose
NPRLT: Nominal time relative to previous dose
APRLT: Actual time relative to previous dose
AVISITN: Analysis visit number
AVISIT: Analysis visit

Concentration Variables

DV: Dependent variable (concentration)
DVL: Log-transformed concentration
MDV: Missing DV flag (0=observed, 1=missing)
ALLOQ: Analysis lower limit of quantification
BLQFL: Below LLOQ flag ("Y"/"N")
BLQFN: Below LLOQ flag numeric (1/0)

Covariates

AGE: Age in years
SEX: Sex
RACE: Race
COUNTRY: Country
WTBL: Baseline weight
HTBL: Baseline height
BMIBL: Baseline BMI
BSABL: Baseline BSA
CREATBL: Baseline creatinine
CRCLBL: Baseline creatinine clearance
EGFRBL: Baseline eGFR
ASTBL, ALTBL, ALBBL: Baseline liver function biomarkers

Exclusion Flags

EXCLF: Exclusion flag (0=include, 1=exclude)
EXCLFCOM: Exclusion comment

Conclusion

This tutorial demonstrated the complete workflow for creating an ADPPK dataset using ADaM.jl. The resulting dataset is ready for population PK modeling.

You can customize this template for your specific study by modifying exclusion criteria, adding study-specific covariates, adjusting column selection, or extending the workflow to create related datasets (ADPC, ADNCA, etc.).

For more detailed information on each function, see the ADaM Docstrings.

make_dtm_to_dt and make_dtm_to_tm

`make_dtm_to_dt` and `make_dtm_to_tm`