NCA Analysis Walkthrough

A guided tutorial demonstrating how to perform non-compartmental analysis using PumasAide. This walkthrough follows a realistic conversation between a pharmacometrician new to Pumas and the AI assistant.

Introduction

This tutorial demonstrates a complete NCA workflow using PumasAide's MCP (Model Context Protocol) tools. You'll see how to:

  • Load and explore PK data
  • Create concentration-time visualizations
  • Build an NCA population with proper column mappings
  • Calculate NCA parameters with quality metrics
  • Generate summary tables and plots

The conversation format shows practical questions new users often have.

Getting Started

User: I'd like to learn how to do NCA analysis. I'm familiar with NCA concepts but I've never used Pumas before. Can you walk me through it?

Assistant: I'll guide you through the complete NCA workflow. Before we start, let me explain how PumasAide works.

PumasAide connects me to your Julia session through three tools. I look up documentation and API references with pumas_knowledge, write Julia scripts to the programs/ directory, and execute them with include_julia_file. Every step produces real, reusable Julia code that you can review and edit.

Let me first read the NCA workflow guide, then we'll work through it step by step.

Reads the NCA workflow via pumas_knowledge with doc_id: ["workflows/nca_analysis.md"]

The workflow has these main steps:

  1. Load and validate data - Import your PK dataset
  2. Explore and visualize - Plot concentration-time profiles
  3. Build NCA population - Map your columns to NCA requirements
  4. Calculate parameters - Run the analysis with quality metrics
  5. Review results - Summary tables and visualizations

For this tutorial, I have a sample IV PK dataset we can use. It has concentration-time data from a single-dose study with multiple dose levels - perfect for learning NCA. Shall I load it?

User: Yes, let's use that sample data.

Loading the Data

Assistant: I'll write a script to load the sample dataset. CSV.read from the CSV.jl package handles CSV files, and PumasAide also supports Arrow, SAS, and other formats.

Writes programs/01-data/01-load-pk-data.jl and executes with include_julia_file:

using CSV, DataFrames

# Read the sample dataset; treat "NA", ".", and empty cells as missing values.
pk_data = CSV.read(
    "../../datasets/sample_pk_data.csv",
    DataFrame;
    missingstring = ["NA", ".", ""],
)

Result:

Summary of `DataFrame` `pk_data`: 480 rows × 6 columns

| Column | Type    | Unique | Missing | Values          |
|:-------|:--------|-------:|--------:|:----------------|
| id     | Int64   |     30 |       0 |                 |
| dose   | Int64   |      3 |       0 | 250, 500, 750   |
| time   | Float64 |     16 |       0 |                 |
| conc   | Float64 |    459 |       0 |                 |
| amt    | Int64   |      3 |       0 | 250, 500, 750   |
| route  | String  |      1 |       0 | "iv"            |

Preview: http://localhost:8081/preview/pk_data

The data has 30 subjects across 3 dose groups (250, 500, 750 mg), all IV dosing. The columns are clean with no missing values. You can click the preview link to see the full dataset in your browser.

User: What's the difference between dose and amt? They seem to have the same values.

Assistant: Good question. In NM-TRAN/NONMEM-style data:

  • dose is typically a grouping variable indicating the dose level or cohort
  • amt is the actual amount administered at a dosing event

In this dataset they're identical because each subject received their labeled dose. When we build the NCA population, we'll use amt for calculations (it's what NCA needs for clearance) and dose for grouping the results.
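If you want to confirm that the two columns really agree, a quick consistency check is enough. The sketch below uses a few made-up records in plain Julia rather than the tutorial dataset; the `records` name and its values are hypothetical, and the third row is deliberately inconsistent so the check has something to find.

```julia
# Hypothetical records standing in for rows of pk_data (values are made up).
records = [
    (id = 1, dose = 250, amt = 250),
    (id = 2, dose = 500, amt = 500),
    (id = 3, dose = 750, amt = 500),  # deliberately inconsistent row
]

# Collect the ids whose administered amount differs from the labeled dose level.
mismatched_ids = [r.id for r in records if r.amt != r.dose]

println("Subjects where amt != dose: ", mismatched_ids)
```

An empty result would mean every subject received the labeled dose, which is the situation described for this dataset.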

Data Exploration

User: Can we see what the concentration-time profiles look like first?

Assistant: Good idea. I'll plot the individual profiles by dose group on a pseudo-log scale.

Writes programs/02-exploration/01-conc-time-profiles.jl and executes with include_julia_file:

using AlgebraOfGraphics, CairoMakie

# One line per subject (group = :id) and one panel per dose level (layout = :dose).
plt =
    data(pk_data) *
    mapping(:time, :conc; group = :id => nonnumeric, layout = :dose => nonnumeric) *
    visual(ScatterLines)
# Draw on a pseudo-log y-axis so that zero concentrations can still be shown.
conc_time_plot = draw(
    plt;
    axis = (
        xlabel = "Time (h)",
        ylabel = "Concentration (ng/mL)",
        title = "Individual Concentration-Time Profiles",
        yscale = Makie.pseudolog10,
    ),
)

Result:

Preview: http://localhost:8081/preview/conc_time_plot

The plot shows individual profiles for all 30 subjects, faceted by dose group. A few things to note:

  • pseudolog10 scale handles zero values gracefully (unlike a log10 scale, which cannot display zero)
  • group=:id => nonnumeric connects each subject's points with lines
  • layout=:dose => nonnumeric creates separate panels per dose
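To see why the pseudo-log scale copes with zeros, here is a small sketch. As far as I'm aware, Makie's pseudolog10 is defined as sign(x) * log10(abs(x) + 1); the `pseudolog10_sketch` helper below is a local stand-in written on that assumption, not the library function itself, and the concentrations are made up.

```julia
# Local re-implementation for illustration; assumed to mirror Makie.pseudolog10.
pseudolog10_sketch(x) = sign(x) * log10(abs(x) + 1)

# Made-up concentrations, including a zero (e.g. a pre-dose sample).
concs = [0.0, 1.0, 9.0, 99.0]

plain = log10.(concs)                # first entry is -Inf, which an axis cannot place
pseudo = pseudolog10_sketch.(concs)  # finite everywhere, and exactly 0.0 at zero

println("log10:       ", plain)
println("pseudolog10: ", pseudo)
```

For larger values the two transforms are nearly identical, so the familiar log-linear shape of the elimination phase is preserved.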

Click the preview link to see the plot. Do the profiles look as expected?

User: Yes, that looks right. I can see clear dose-proportional increases and typical IV elimination. Let's proceed with the NCA.

Building the NCA Population

Assistant: Now we need to build an NCA population. This maps your DataFrame columns to what NCA expects.

I'll include amt and route in the mappings - these are essential for clearance and volume calculations:

  • amt - the dose amount for calculating CL = Dose/AUC
  • route - determines whether we report IV parameters (CL, Vz) or extravascular (CL/F, Vz/F)

Without dose information, NCA can only compute parameters that do not depend on the dose, such as AUC, Cmax, and half-life.
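As a worked example of why the dose matters, the sketch below applies the standard NCA relationships CL = Dose/AUCinf, Vz = CL/λz, and t½ = ln(2)/λz. The numbers are illustrative only and are not taken from the tutorial dataset; the variable names are my own.

```julia
# Illustrative values only (not taken from the tutorial dataset).
dose_mg  = 100.0   # administered IV dose, mg
aucinf   = 20.0    # AUC extrapolated to infinity, mg·h/L
lambda_z = 0.25    # terminal elimination rate constant, 1/h

cl = dose_mg / aucinf            # clearance, L/h
vz = cl / lambda_z               # terminal volume of distribution, L
half_life = log(2) / lambda_z    # terminal half-life, h

println("CL = ", cl, " L/h, Vz = ", vz, " L, t½ = ", round(half_life; digits = 2), " h")
```

Note that the dose and AUC units have to match (here mg and mg·h/L) for CL to come out in L/h.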

Writes programs/03-population/01-nca-population.jl and executes with include_julia_file:

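using NCA

# Map DataFrame columns to the roles NCA expects; results are grouped by dose level.
nca_pop = NCA.read_nca(
    pk_data;
    id = :id,
    time = :time,
    observations = :conc,
    amt = :amt,        # administered amount, needed for CL and Vz
    route = :route,    # "iv" for every subject in this dataset
    group = [:dose],
)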

Result:

NCAPopulation (30 subjects):
  Group: [["dose" => 250], ["dose" => 500], ["dose" => 750]]
  Number of missing observations: 0
  Number of blq observations: 0

The population is ready with all 30 subjects grouped by dose. No missing or BLQ (below limit of quantification) observations to worry about.

Running NCA Analysis

User: What parameters should we calculate?

Assistant: For a standard single-dose IV study, we'll want exposure metrics (Cmax, Tmax, AUC), elimination parameters (half-life, clearance, volume), and quality metrics to assess reliability:

  • rsq_adj_kel - Adjusted R-squared of the terminal slope fit (should be >= 0.8)
  • auc_extrap_obs - Percent of AUC extrapolated to infinity (should be <= 20%)
  • n_samples_kel - Number of points used for terminal slope (should be >= 3)
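For intuition about what the terminal slope metrics measure, here is a minimal sketch of a log-linear regression on a handful of terminal-phase points, using only Julia's Statistics standard library. The data are made up, and the sketch does not reproduce how Pumas selects the terminal points; it only shows the regression, the half-life, and the adjusted R² formula for a single predictor.

```julia
using Statistics

# Hypothetical terminal-phase samples (time in h, concentration in ng/mL).
t = [4.0, 6.0, 8.0, 12.0]
c = [20.0, 14.1, 10.0, 5.0]

# Ordinary least squares of log-concentration on time.
y = log.(c)
slope = cov(t, y) / var(t)
intercept = mean(y) - slope * mean(t)

lambda_z = -slope                # terminal rate constant, 1/h
half_life = log(2) / lambda_z    # terminal half-life, h

# R² and its adjusted version for a regression with one predictor.
fitted = intercept .+ slope .* t
ss_res = sum(abs2, y .- fitted)
ss_tot = sum(abs2, y .- mean(y))
rsq = 1 - ss_res / ss_tot
n = length(t)
rsq_adj = 1 - (1 - rsq) * (n - 1) / (n - 2)

println("lambda_z = ", lambda_z, ", half-life = ", half_life, ", adjusted R² = ", rsq_adj, ", n = ", n)
```

With fewer than three points the adjusted R² is undefined (division by n - 2), which is one reason the minimum of three samples matters.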

Writes programs/05-fitting/01-run-nca.jl and executes with include_julia_file:

using NCA

nca_report = NCA.run_nca(
    nca_pop;
    parameters = [
        # Exposure metrics
        :cmax,
        :tmax,
        :auclast,
        :aucinf_obs,
        # Elimination parameters
        :half_life,
        :cl_obs,
        :vz_obs,
        # Quality metrics
        :rsq_adj_kel,
        :auc_extrap_obs,
        :n_samples_kel,
    ],
)
# Per-subject results as a DataFrame
nca_results = nca_report.reportdf

Result:

Summary of `DataFrame` `nca_results`: 30 rows × 12 columns

| Column         | Type    | Unique | Missing | Values                     |
|:---------------|:--------|-------:|--------:|:---------------------------|
| id             | String  |     30 |       0 |                            |
| dose           | Int64   |      3 |       0 | 250, 500, 750              |
| cmax           | Float64 |     30 |       0 |                            |
| tmax           | Float64 |      5 |       0 | 0.0, 0.25, 0.5, 0.75, 1.0 |
| auclast        | Float64 |     30 |       0 |                            |
| aucinf_obs     | Float64 |     30 |       0 |                            |
| half_life      | Float64 |     30 |       0 |                            |
| cl_obs         | Float64 |     30 |       0 |                            |
| vz_obs         | Float64 |     30 |       0 |                            |
| rsq_adj_kel    | Float64 |     30 |       0 |                            |
| auc_extrap_obs | Float64 |     30 |       0 |                            |
| n_samples_kel  | Int64   |     13 |       0 |                            |

Preview: http://localhost:8081/preview/nca_results

All 30 subjects have complete results. Click the preview link to review individual values.
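To make the relationship between auclast, aucinf_obs, and auc_extrap_obs concrete, here is a hand calculation for one made-up profile. It assumes the linear trapezoidal rule and a given terminal rate constant; the integration rule Pumas actually applies depends on its settings, so treat this as a sketch of the idea rather than a reproduction of the results above.

```julia
# Hypothetical single-subject profile (time in h, concentration in ng/mL).
t = [0.0, 1.0, 2.0, 4.0, 8.0]
c = [100.0, 80.0, 60.0, 40.0, 20.0]
lambda_z = 0.15  # assumed terminal rate constant, 1/h

# Linear trapezoidal rule up to the last observation.
auclast = sum((t[i+1] - t[i]) * (c[i] + c[i+1]) / 2 for i in 1:length(t)-1)

# Extrapolate from the last observed concentration to infinity.
aucinf_obs = auclast + c[end] / lambda_z

# Share of AUCinf that comes from the extrapolated tail, in percent.
auc_extrap_obs = 100 * (aucinf_obs - auclast) / aucinf_obs

println("AUClast = ", auclast, ", AUCinf = ", aucinf_obs, ", extrapolated = ", auc_extrap_obs, "%")
```

In this made-up case more than 20% of AUCinf is extrapolated, so the profile would fail the auc_extrap_obs criterion and sampling would need to run longer.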

User: How do I know if the terminal slope fits are good enough?

Assistant: Check the quality metrics against these thresholds:

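| Metric         | Threshold | Meaning                                      |
|:---------------|:----------|:---------------------------------------------|
| rsq_adj_kel    | >= 0.8    | Terminal slope fit is reliable               |
| auc_extrap_obs | <= 20%    | AUCinf not overly dependent on extrapolation |
| n_samples_kel  | >= 3      | Enough points to characterize elimination    |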

You should flag and review any subjects failing these criteria. In regulatory submissions, subjects with poor terminal slope fits are often excluded from AUCinf summaries.
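A minimal sketch of that flagging step, in plain Julia with made-up values: the `results` vector and the `passes` helper are hypothetical names, and the same predicate could be applied to the rows of nca_results.

```julia
# Hypothetical per-subject quality metrics (values are made up).
results = [
    (id = "1", rsq_adj_kel = 0.98, auc_extrap_obs = 4.2,  n_samples_kel = 5),
    (id = "2", rsq_adj_kel = 0.71, auc_extrap_obs = 8.9,  n_samples_kel = 4),
    (id = "3", rsq_adj_kel = 0.95, auc_extrap_obs = 27.5, n_samples_kel = 3),
]

# A subject passes only if all three criteria hold.
passes(r) = r.rsq_adj_kel >= 0.8 && r.auc_extrap_obs <= 20 && r.n_samples_kel >= 3

# Ids of subjects that fail at least one criterion.
flagged = [r.id for r in results if !passes(r)]

println("Subjects to review: ", flagged)
```

Here subject 2 fails on the terminal slope fit and subject 3 on the extrapolated fraction; both would be reviewed before their AUCinf values enter a summary.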

Summary Tables

User: Can you create a summary table of the main PK parameters by dose group?

Assistant: I'll create a "Table 1" style summary with statistics by dose group using SummaryTables.

Writes programs/07-results/01-nca-summary-table.jl and executes with include_julia_file:

using SummaryTables

# Summarize the main PK parameters, split by dose group.
nca_summary = table_one(
    nca_results,
    [:cmax, :tmax, :auclast, :aucinf_obs, :half_life];
    groupby = [:dose],
)

Result:

──────────────────────────────────────────────────────────────────────────────────────
                                                        dose
                  Total            250               500               750
──────────────────────────────────────────────────────────────────────────────────────
cmax
Mean (SD)         67.5 (31.4)      34.4 (6.79)       64.2 (9.94)       104 (18.3)
Median [Min,Max]  66.2 [24,140]    33.9 [24,43.9]    66.2 [46.9,75.1]  102 [80,140]

tmax
Mean (SD)         0.258 (0.282)    0.3 (0.197)       0.225 (0.299)     0.25 (0.354)
Median [Min,Max]  0.25 [0,1]       0.25 [0,0.5]      0.125 [0,0.75]    0.125 [0,1]

auclast
Mean (SD)         329 (143)        173 (18.1)        318 (38.4)        497 (74.5)
Median [Min,Max]  315 [146,605]    179 [146,195]     315 [252,374]     487 [401,605]

aucinf_obs
Mean (SD)         337 (150)        178 (19.8)        323 (38.4)        511 (91.4)
Median [Min,Max]  321 [147,687]    181 [147,201]     321 [254,380]     491 [405,687]

half_life
Mean (SD)         3.98 (1.09)      4.13 (1.23)       3.85 (0.653)      3.97 (1.36)
Median [Min,Max]  3.73 [2.58,7.5]  4.12 [2.58,6.48]  3.73 [2.94,5.42]  3.42 [3.14,7.5]
──────────────────────────────────────────────────────────────────────────────────────

Preview: http://localhost:8081/preview/nca_summary

Key observations:

  • Cmax increases approximately in proportion to dose (34 -> 64 -> 104 ng/mL)
  • AUClast also increases approximately in proportion to dose (173 -> 318 -> 497 ng·h/mL)
  • Half-life is consistent across doses (~4 hours), as expected for linear PK
  • Tmax is rapid (~0.25 h), consistent with IV bolus administration
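A rough way to put numbers on the dose-proportionality claim is to dose-normalize the group means and to look at the slope of log(AUC) against log(dose). The sketch below uses the group means from the summary table and Julia's Statistics standard library. It is only an informal check on three means, not a formal power-model assessment with confidence intervals.

```julia
using Statistics

# Group means copied from the summary table above.
doses   = [250.0, 500.0, 750.0]   # mg
cmax    = [34.4, 64.2, 104.0]     # ng/mL
auclast = [173.0, 318.0, 497.0]   # ng·h/mL

# Dose-normalized exposure should be roughly constant under linear PK.
cmax_per_mg = cmax ./ doses
auc_per_mg  = auclast ./ doses

# Slope of log(AUC) versus log(dose); a value near 1 suggests proportionality.
x, y = log.(doses), log.(auclast)
beta = cov(x, y) / var(x)

println("Cmax/dose: ", round.(cmax_per_mg; digits = 3))
println("AUC/dose:  ", round.(auc_per_mg; digits = 3))
println("log-log slope: ", round(beta; digits = 2))
```

The dose-normalized values stay within a narrow band and the log-log slope is close to 1, which is in line with the observations listed above.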

Parameter Visualization

User: A boxplot of AUC by dose would be helpful for my report.

Assistant: Sure, I'll create one showing AUClast across the dose groups.

Writes programs/07-results/02-auc-boxplot.jl and executes with include_julia_file:

using AlgebraOfGraphics, CairoMakie

# One box per dose level, with dose treated as a categorical x-axis.
plt = data(nca_results) * mapping(:dose => nonnumeric, :auclast) * visual(BoxPlot)
pk_params_boxplot = draw(
    plt;
    axis = (
        xlabel = "Dose (mg)",
        ylabel = "AUClast (ng·h/mL)",
        title = "AUClast by Dose Group",
    ),
)

Result:

Preview: http://localhost:8081/preview/pk_params_boxplot

The boxplot clearly shows the dose-proportional increase in exposure. The variability (box height, i.e. the interquartile range) is reasonable and, relative to the median, consistent across dose groups.
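One way to quantify that variability is the coefficient of variation (CV = SD / mean) per dose group. The sketch below takes the AUClast means and standard deviations from the summary table; the variable names are my own.

```julia
# Mean and SD of AUClast per dose group, copied from the summary table.
doses = [250, 500, 750]        # mg
means = [173.0, 318.0, 497.0]  # ng·h/mL
sds   = [18.1, 38.4, 74.5]     # ng·h/mL

# Coefficient of variation in percent for each group.
cv_percent = 100 .* sds ./ means

for (d, cv) in zip(doses, cv_percent)
    println(d, " mg: CV = ", round(cv; digits = 1), "%")
end
```

The CVs fall between roughly 10% and 15%, so the spread grows with dose in absolute terms but stays similar in relative terms.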

Generated Files

User: Where did all the code go? You mentioned files being written.

Assistant: Every step generated an executable Julia script in the programs/ directory:

programs/
  01-data/
    01-load-pk-data.jl         # Data loading
  02-exploration/
    01-conc-time-profiles.jl   # Concentration-time plot
  03-population/
    01-nca-population.jl       # NCA population building
  05-fitting/
    01-run-nca.jl              # NCA parameter calculation
  07-results/
    01-nca-summary-table.jl    # Summary table
    02-auc-boxplot.jl          # Boxplot

These files are:

  • Reproducible - Run them again to recreate results
  • Editable - Modify parameters and re-execute with include_julia_file
  • Auditable - Each file uses standard Pumas packages you can verify independently
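If you want to recreate everything outside the conversation, one option is to include the scripts in order from a Julia session. This is a sketch under assumptions: it expects to be run from the project root, it skips files that are missing, and it does not adjust the working directory, so the relative dataset path in the loading script may need attention depending on where you start Julia.

```julia
# Script paths as listed in the tree above, relative to the project root.
scripts = [
    joinpath("programs", "01-data", "01-load-pk-data.jl"),
    joinpath("programs", "02-exploration", "01-conc-time-profiles.jl"),
    joinpath("programs", "03-population", "01-nca-population.jl"),
    joinpath("programs", "05-fitting", "01-run-nca.jl"),
    joinpath("programs", "07-results", "01-nca-summary-table.jl"),
    joinpath("programs", "07-results", "02-auc-boxplot.jl"),
]

# Run each script in order; later scripts rely on variables defined by earlier ones.
for script in scripts
    if isfile(script)
        include(abspath(script))
    else
        @warn "Script not found, skipping" script
    end
end
```

The order matters because, for example, the population script uses pk_data from the loading script.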

Summary

User: So what did we end up with?

Assistant: Here's what we completed:

  1. Loaded 480 concentration-time observations from 30 subjects
  2. Visualized individual profiles across 3 dose groups (250, 500, 750 mg)
  3. Built an NCA population with proper dose and route mappings
  4. Calculated key PK parameters: Cmax, Tmax, AUClast, AUCinf, half-life, CL, Vz
  5. Assessed quality using rsq_adj_kel, auc_extrap_obs, and n_samples_kel
  6. Generated summary statistics by dose group
  7. Created a boxplot showing dose-proportional exposure

The data shows linear pharmacokinetics with:

  • Dose-proportional Cmax and AUC
  • Consistent half-life (~4 hours) across doses
  • Rapid Tmax consistent with IV bolus administration

For your next analysis, start a fresh conversation to keep context focused. You can reference the generated scripts in programs/ to understand the code patterns.

Key Takeaways

Workflow

  1. The AI reads the NCA workflow guide via pumas_knowledge for step-by-step guidance
  2. Each step produces a Julia script in programs/ using standard Pumas packages
  3. Visualize before analyzing to catch data issues early
  4. Map columns correctly in NCA.read_nca() - include amt and route for CL/Vz
  5. Always include quality metrics in NCA.run_nca()

Quality Thresholds

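| Metric                | Good                    | Action if Failed       |
|:----------------------|:------------------------|:-----------------------|
| rsq_adj_kel >= 0.8    | Reliable terminal slope | Review/exclude subject |
| auc_extrap_obs <= 20% | Reliable AUCinf         | Use AUClast instead    |
| n_samples_kel >= 3    | Adequate data           | Collect more samples   |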

Common Mistakes to Avoid

  • Forgetting to map amt and route (can't calculate CL/Vz without them)
  • Skipping quality metrics (won't know if results are reliable)
  • Not visualizing data before analysis (miss outliers and data issues)