NCA Analysis Walkthrough

A guided tutorial demonstrating how to perform non-compartmental analysis using PumasAide. This walkthrough follows a realistic conversation between a pharmacometrician new to Pumas and the AI assistant.

Introduction

This tutorial demonstrates a complete NCA workflow using PumasAide's MCP (Model Context Protocol) tools. You'll see how to:

Load and explore PK data
Create concentration-time visualizations
Build an NCA population with proper column mappings
Calculate NCA parameters with quality metrics
Generate summary tables and plots

The conversation format shows practical questions new users often have.

Getting Started

User: I'd like to learn how to do NCA analysis. I'm familiar with NCA concepts but I've never used Pumas before. Can you walk me through it?

Assistant: I'll guide you through the complete NCA workflow. Before we start, let me explain how PumasAide works.

PumasAide provides specialized tools for pharmacometric analysis. Each tool performs a specific task - loading data, creating plots, running analyses, etc. When I use a tool, you'll see the parameters I'm sending and the results that come back. You can always ask me to adjust my choices before I proceed.

Let me first get the NCA workflow guidance, then we'll work through it step by step.

Calls analysis_workflows with analysis_type: "nca_analysis"

The workflow has these main steps:

Load and validate data - Import your PK dataset
Explore and visualize - Plot concentration-time profiles
Build NCA population - Map your columns to NCA requirements
Calculate parameters - Run the analysis with quality metrics
Review results - Summary tables and visualizations

For this tutorial, I have a sample IV PK dataset we can use. It has concentration-time data from a single-dose study with multiple dose levels - perfect for learning NCA. Shall I load it?

User: Yes, let's use that sample data.

Loading the Data

Assistant: I'll load the sample dataset using load_dataset. This tool reads CSV, Arrow, SAS, and other formats into a DataFrame.

Calls load_dataset:

{
  "working_directory": "/path/to/project",
  "relative_path": "sample_pk_data.csv",
  "dataframe_name": "pk_data"
}

Tool Result:

# Summary of DataFrame `pk_data`
480 rows and 6 columns.
 - id::Int64, 30 unique, 0 missing
 - dose::Int64, 3 unique, 0 missing, values: 250, 500, 750
 - time::Float64, 16 unique, 0 missing
 - conc::Float64, 459 unique, 0 missing
 - amt::Int64, 3 unique, 0 missing, values: 250, 500, 750
 - route::String, 1 unique, 0 missing, values: "iv"

DataFrame can be viewed at http://localhost:8081/preview/pk_data
Code written to: programs/data_loading/pk_data.jl

The data has 30 subjects across 3 dose groups (250, 500, 750 mg), all IV dosing. The columns are clean with no missing values. You can click the preview link to see the full dataset in your browser.

User: What's the difference between dose and amt? They seem to have the same values.

Assistant: Good question. In NMTRAN/NONMEM-style data:

dose is typically a grouping variable indicating the dose level or cohort
amt is the actual amount administered at a dosing event

In this dataset they're identical because each subject received their labeled dose. When we build the NCA population, we'll use amt for calculations (it's what NCA needs for clearance) and dose for grouping the results.
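If you want to confirm that dose and amt really agree on every row before relying on that, a quick check outside Pumas could look like the sketch below. It is plain Python, not PumasAide output; the rows are hypothetical stand-ins that only mimic the tutorial's column names, and mismatched_ids is an illustrative helper.

```python
# Hypothetical rows mimicking the tutorial's id, dose, and amt columns.
# These are NOT the real sample dataset.
rows = [
    {"id": 1, "dose": 250, "amt": 250},
    {"id": 1, "dose": 250, "amt": 250},
    {"id": 2, "dose": 500, "amt": 500},
    {"id": 3, "dose": 750, "amt": 750},
]


def mismatched_ids(rows):
    """Return the sorted ids of subjects with any row where amt differs from dose."""
    return sorted({r["id"] for r in rows if r["amt"] != r["dose"]})


mismatches = mismatched_ids(rows)
print(mismatches)  # an empty list means dose and amt agree on every row
```

A non-empty result would be a reason to look at those subjects before using dose as the grouping variable.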

Data Exploration

User: Can we see what the concentration-time profiles look like first?

Assistant: Good idea. I'll plot the individual profiles by dose group on a log scale.

Calls build_plot:

{
  "plot_name": "conc_time_plot",
  "layers": [{
    "data": "pk_data",
    "mapping": ":time, :conc; group=:id => nonnumeric, layout=:dose => nonnumeric",
    "visual": ["visual(ScatterLines)"]
  }],
  "title": "Individual Concentration-Time Profiles",
  "xlabel": "Time (h)",
  "ylabel": "Concentration (ng/mL)",
  "yscale": "pseudolog10"
}

Tool Result:

Plot can be viewed at http://localhost:8081/preview/conc_time_plot
Code written to: programs/results/conc_time_plot.jl

The plot shows individual profiles for all 30 subjects, faceted by dose group. A few things to note:

pseudolog10 scale handles zero values gracefully (unlike log10, which is undefined at zero)
group=:id => nonnumeric connects each subject's points with lines
layout=:dose => nonnumeric creates separate panels per dose
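To see why a pseudo-log scale copes with zero concentrations, here is a minimal Python sketch. It assumes the common definition sign(x) * log10(|x| + 1); the exact formula used by the plotting backend is not shown in this walkthrough, so treat the definition as an assumption.

```python
import math


def pseudolog10(x):
    """Assumed pseudo-log transform: sign(x) * log10(|x| + 1)."""
    return math.copysign(math.log10(abs(x) + 1.0), x)


def log10_or_none(x):
    """Plain log10, returning None where it is undefined (x <= 0)."""
    try:
        return math.log10(x)
    except ValueError:
        return None


for conc in [0.0, 9.0, 99.0]:
    print(conc, pseudolog10(conc), log10_or_none(conc))
```

Under this definition a zero concentration maps to zero on the axis, while large values behave almost like an ordinary log scale.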

Click the preview link to see the plot. Do the profiles look as expected?

User: Yes, that looks right. I can see clear dose-proportional increases and typical IV elimination. Let's proceed with the NCA.

Building the NCA Population

Assistant: Now we need to build an NCA population. This maps your DataFrame columns to what NCA expects.

I'll include amt and route in the mappings - these are essential for clearance and volume calculations:

amt - the dose amount for calculating CL = Dose/AUC
route - determines whether we report IV parameters (CL, Vz) or extravascular (CL/F, Vz/F)

Without dose information, NCA can only compute parameters that depend on the concentrations alone, such as AUC, Cmax, and half-life; clearance and volume need the dose.
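As a rough illustration of the arithmetic behind those parameters, the Python sketch below computes CL = Dose/AUCinf and Vz = CL/lambda_z for an IV dose. The input numbers are hypothetical, the function name is illustrative, and the unit handling (mg doses, ng/mL concentrations) is an assumption based on the labels used in this tutorial; it is not how Pumas computes or reports these values internally.

```python
import math


def iv_clearance_and_volume(dose_mg, aucinf_ng_h_per_ml, half_life_h):
    """Return (CL in L/h, Vz in L) for an IV dose.

    CL = Dose / AUCinf and Vz = CL / lambda_z, with lambda_z = ln(2) / half-life.
    """
    dose_ng = dose_mg * 1e6                       # 1 mg = 1e6 ng
    cl_ml_per_h = dose_ng / aucinf_ng_h_per_ml    # ng / (ng*h/mL) = mL/h
    cl_l_per_h = cl_ml_per_h / 1000.0             # mL/h -> L/h
    lambda_z = math.log(2) / half_life_h          # terminal rate constant, 1/h
    vz_l = cl_l_per_h / lambda_z
    return cl_l_per_h, vz_l


# Hypothetical subject: 250 mg IV, AUCinf of 200 ng*h/mL, half-life of 4 h.
cl, vz = iv_clearance_and_volume(dose_mg=250, aucinf_ng_h_per_ml=200.0, half_life_h=4.0)
print(cl, vz)
```

The point is simply that neither quantity can be formed without the dose in the numerator, which is why amt must be mapped.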

Calls build_nca_population:

{
  "population_name": "nca_pop",
  "dataframe_name": "pk_data",
  "mappings": {
    "id": "id",
    "time": "time",
    "observations": "conc",
    "amt": "amt",
    "route": "route",
    "group": ["dose"]
  }
}

Tool Result:

NCAPopulation (30 subjects):
  Group: [["dose" => 250], ["dose" => 500], ["dose" => 750]]
  Number of missing observations: 0
  Number of blq observations: 0

Code written to: programs/data_loading/nca_pop.jl

The population is ready with all 30 subjects grouped by dose. No missing or BLQ (below limit of quantification) observations to worry about.
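A mapping only works if every name on its right-hand side is a real column. The sketch below shows that check in plain Python, using the column names from the load_dataset summary and the mappings from the tool call above; missing_columns is an illustrative helper, not a PumasAide function.

```python
# Column names as reported in the load_dataset summary.
columns = ["id", "dose", "time", "conc", "amt", "route"]

# Mappings as passed to build_nca_population in this walkthrough.
mappings = {
    "id": "id",
    "time": "time",
    "observations": "conc",
    "amt": "amt",
    "route": "route",
    "group": ["dose"],
}


def missing_columns(mappings, columns):
    """Return mapped column names that do not exist in the dataset."""
    needed = []
    for value in mappings.values():
        # A mapping value is either one column name or a list of names.
        needed.extend(value if isinstance(value, list) else [value])
    return [name for name in needed if name not in columns]


print(missing_columns(mappings, columns))  # an empty list means every mapping resolves
```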

Running NCA Analysis

User: What parameters should we calculate?

Assistant: For a standard single-dose IV study, we'll want exposure metrics (Cmax, Tmax, AUC), elimination parameters (half-life, clearance, volume), and quality metrics to assess reliability:

rsq_adj_kel - Adjusted R-squared of the terminal slope fit (should be >= 0.8)
auc_extrap_obs - Percent of AUC extrapolated to infinity (should be <= 20%)
n_samples_kel - Number of points used for terminal slope (should be >= 3)
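To make these metrics concrete, here is a self-contained Python sketch of the underlying calculations for one profile: a log-linear fit of the terminal points (giving lambda_z, half-life, and adjusted R-squared), a linear-trapezoid AUClast, and the percent of AUCinf that is extrapolated. The profile is hypothetical, and the choices of the linear trapezoidal rule and the last three points are assumptions made for illustration; this walkthrough does not show how Pumas selects terminal points or which AUC rule it applies.

```python
import math


def terminal_slope(times, concs):
    """Least-squares fit of ln(conc) vs time.

    Returns (lambda_z, half_life, adjusted R-squared). Needs at least 3 points.
    """
    n = len(times)
    logs = [math.log(c) for c in concs]
    t_mean = sum(times) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, logs))
    slope = sxy / sxx
    intercept = y_mean - slope * t_mean
    ss_res = sum((y - (intercept + slope * t)) ** 2 for t, y in zip(times, logs))
    ss_tot = sum((y - y_mean) ** 2 for y in logs)
    r2 = 1.0 - ss_res / ss_tot
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - 2)
    lambda_z = -slope
    return lambda_z, math.log(2) / lambda_z, r2_adj


def auc_linear_trapezoid(times, concs):
    """AUC from the first to the last observation by the linear trapezoidal rule."""
    return sum(
        (t2 - t1) * (c1 + c2) / 2.0
        for t1, t2, c1, c2 in zip(times, times[1:], concs, concs[1:])
    )


# Hypothetical mono-exponential IV profile (not the tutorial's data).
times = [0.0, 1.0, 2.0, 4.0, 8.0, 12.0]
concs = [100.0 * math.exp(-0.2 * t) for t in times]

# Assumption: use the last three points for the terminal fit.
lambda_z, half_life, r2_adj = terminal_slope(times[-3:], concs[-3:])
auclast = auc_linear_trapezoid(times, concs)
aucinf_obs = auclast + concs[-1] / lambda_z            # extrapolate from last observed conc
auc_extrap_obs = 100.0 * (aucinf_obs - auclast) / aucinf_obs
print(lambda_z, half_life, r2_adj, auc_extrap_obs)
```

A noisy or short terminal phase lowers the adjusted R-squared, and a late, high last concentration raises the extrapolated percentage, which is what the thresholds are designed to catch.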

Calls run_nca_analysis:

{
  "population_name": "nca_pop",
  "analysis_result_name": "nca_results",
  "parameters": [
    "cmax",
    "tmax",
    "auclast",
    "aucinf_obs",
    "half_life",
    "cl_obs",
    "vz_obs",
    "rsq_adj_kel",
    "auc_extrap_obs",
    "n_samples_kel"
  ]
}

Tool Result:

# Summary of DataFrame `nca_results`
30 rows and 12 columns.
 - id::String, 30 unique, 0 missing
 - dose::Int64, 3 unique, 0 missing, values: 250, 500, 750
 - cmax::Float64, 30 unique, 0 missing
 - tmax::Float64, 5 unique, 0 missing, values: 0.0, 0.25, 0.5, 0.75, 1.0
 - auclast::Float64, 30 unique, 0 missing
 - aucinf_obs::Float64, 30 unique, 0 missing
 - half_life::Float64, 30 unique, 0 missing
 - cl_obs::Float64, 30 unique, 0 missing
 - vz_obs::Float64, 30 unique, 0 missing
 - rsq_adj_kel::Float64, 30 unique, 0 missing
 - auc_extrap_obs::Float64, 30 unique, 0 missing
 - n_samples_kel::Int64, 13 unique, 0 missing

NCA results can be viewed at http://localhost:8081/preview/nca_results
Code written to: programs/results/nca_results.jl

All 30 subjects have complete results. Click the preview link to review individual values.

User: How do I know if the terminal slope fits are good enough?

Assistant: Check the quality metrics against these commonly used thresholds (they are conventions, so follow your analysis plan if it specifies different ones):

Metric	Threshold	Meaning
`rsq_adj_kel`	>= 0.8	Terminal slope fit is reliable
`auc_extrap_obs`	<= 20%	AUCinf not overly dependent on extrapolation
`n_samples_kel`	>= 3	Enough points to characterize elimination

You should flag and review any subjects failing these criteria. In regulatory submissions, subjects with poor terminal slope fits are often excluded from AUCinf summaries.
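Flagging can be as simple as comparing each subject's metrics to the three thresholds. The Python sketch below shows the logic with hypothetical subjects and values; in practice you would apply the same comparisons to the nca_results columns, and failed_criteria is an illustrative helper rather than a PumasAide tool.

```python
def failed_criteria(rsq_adj_kel, auc_extrap_obs, n_samples_kel):
    """Return the names of the quality metrics that miss their threshold."""
    failures = []
    if rsq_adj_kel < 0.8:
        failures.append("rsq_adj_kel")
    if auc_extrap_obs > 20.0:
        failures.append("auc_extrap_obs")
    if n_samples_kel < 3:
        failures.append("n_samples_kel")
    return failures


# Hypothetical per-subject metrics (not taken from the tutorial's results).
subjects = {
    "1": {"rsq_adj_kel": 0.97, "auc_extrap_obs": 4.2, "n_samples_kel": 6},
    "2": {"rsq_adj_kel": 0.71, "auc_extrap_obs": 26.0, "n_samples_kel": 3},
}

flagged = {}
for subject_id, metrics in subjects.items():
    failures = failed_criteria(**metrics)
    if failures:
        flagged[subject_id] = failures

print(flagged)
```

Keeping the failed metric names, rather than a single pass/fail flag, makes it easier to decide whether to exclude a subject from AUCinf summaries only or to review the profile as a whole.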

Summary Tables

User: Can you create a summary table of the main PK parameters by dose group?

Assistant: I'll use table_summary to create a "Table 1" style summary with statistics by dose group.

Calls table_summary:

{
  "table_name": "nca_summary",
  "table": {
    "type": "table_one",
    "options": {
      "dataframe_name": "nca_results",
      "groupby_columns": ["dose"],
      "columns_to_analyze": ["cmax", "tmax", "auclast", "aucinf_obs", "half_life"]
    }
  }
}

Tool Result:

─────────────────────────────────────────────────────────────────────────────────
                                              dose
                    Total            250             500            750
─────────────────────────────────────────────────────────────────────────────────
cmax
Mean (SD)        67.5 (31.4)     34.4 (6.79)    64.2 (9.94)    104 (18.3)
Median [Min,Max] 66.2 [24,140]   33.9 [24,43.9] 66.2 [46.9,75.1] 102 [80,140]

tmax
Mean (SD)        0.258 (0.282)   0.3 (0.197)    0.225 (0.299)  0.25 (0.354)
Median [Min,Max] 0.25 [0,1]      0.25 [0,0.5]   0.125 [0,0.75] 0.125 [0,1]

auclast
Mean (SD)        329 (143)       173 (18.1)     318 (38.4)     497 (74.5)
Median [Min,Max] 315 [146,605]   179 [146,195]  315 [252,374]  487 [401,605]

aucinf_obs
Mean (SD)        337 (150)       178 (19.8)     323 (38.4)     511 (91.4)
Median [Min,Max] 321 [147,687]   181 [147,201]  321 [254,380]  491 [405,687]

half_life
Mean (SD)        3.98 (1.09)     4.13 (1.23)    3.85 (0.653)   3.97 (1.36)
Median [Min,Max] 3.73 [2.58,7.5] 4.12 [2.58,6.48] 3.73 [2.94,5.42] 3.42 [3.14,7.5]
─────────────────────────────────────────────────────────────────────────────────

Table can be viewed at http://localhost:8081/preview/nca_summary
Code written to: programs/analysis/nca_summary.jl

Key observations:

Cmax increases proportionally with dose (34 -> 64 -> 104 ng/mL)
AUClast shows dose-proportional increases (173 -> 318 -> 497 ng·h/mL)
Half-life is consistent across doses (~4 hours), as expected for linear PK
Tmax is early (group medians of 0.25 h or less), consistent with IV bolus administration
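One quick way to sanity-check the proportionality claims is to divide each group mean by its dose and see how far the dose-normalized values stray from their average. The Python sketch below does this with the group means from the summary table above. It is only an informal check (a formal assessment would typically fit a power model), and max_relative_deviation is an illustrative helper.

```python
# Group means copied from the summary table above.
doses_mg = [250, 500, 750]
mean_cmax = [34.4, 64.2, 104.0]        # ng/mL
mean_auclast = [173.0, 318.0, 497.0]   # ng*h/mL


def max_relative_deviation(values, doses):
    """Largest relative deviation of dose-normalized values from their mean."""
    normalized = [v / d for v, d in zip(values, doses)]
    center = sum(normalized) / len(normalized)
    return max(abs(x - center) / center for x in normalized)


cmax_dev = max_relative_deviation(mean_cmax, doses_mg)
auclast_dev = max_relative_deviation(mean_auclast, doses_mg)
print(cmax_dev, auclast_dev)
```

Small deviations for both parameters are what you would expect if exposure scales linearly with dose over this range.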

Parameter Visualization

User: A boxplot of AUC by dose would be helpful for my report.

Assistant: Sure, I'll create one showing AUClast across the dose groups.

Calls build_plot:

{
  "plot_name": "pk_params_boxplot",
  "layers": [{
    "data": "nca_results",
    "mapping": ":dose => nonnumeric, :auclast",
    "visual": ["visual(BoxPlot)"]
  }],
  "title": "AUClast by Dose Group",
  "xlabel": "Dose (mg)",
  "ylabel": "AUClast (ng*h/mL)"
}

Tool Result:

Plot can be viewed at http://localhost:8081/preview/pk_params_boxplot
Code written to: programs/results/pk_params_boxplot.jl

The boxplot clearly shows the dose-proportional increase in exposure. Variability is read from the box height (the interquartile range), not the box width, and it looks reasonable in each dose group.

Generated Files

User: Where did all the code go? You mentioned files being written.

Assistant: Every tool operation generates executable Julia code in the programs/ directory:

programs/
  data_loading/
    pk_data.jl          # Data loading code
    nca_pop.jl          # NCA population building
  results/
    conc_time_plot.jl   # Concentration-time plot
    nca_results.jl      # NCA calculation
    pk_params_boxplot.jl # Boxplot
  analysis/
    nca_summary.jl      # Summary table

These files are:

Reproducible - Run them again to recreate results
Editable - Modify parameters and re-execute
Auditable - Each file includes metadata documenting the tool call

You can use include_julia_file to re-run any of these scripts.

Summary

User: So what did we end up with?

Assistant: Here's what we completed:

Loaded 480 concentration-time observations from 30 subjects
Visualized individual profiles across 3 dose groups (250, 500, 750 mg)
Built an NCA population with proper dose and route mappings
Calculated key PK parameters: Cmax, Tmax, AUClast, AUCinf, half-life, CL, Vz
Assessed quality using rsq_adj_kel, auc_extrap_obs, and n_samples_kel
Generated summary statistics by dose group
Created a boxplot showing dose-proportional exposure

The data shows linear pharmacokinetics with:

Dose-proportional Cmax and AUC
Consistent half-life (~4 hours) across doses
Rapid Tmax consistent with IV bolus administration

For your next analysis, start a fresh conversation to keep context focused. You can reference the generated scripts in programs/ to understand the code patterns.

Key Takeaways

Tool Workflow

Always call analysis_workflows first to get step-by-step guidance
Load data with load_dataset and check the summary
Visualize before analyzing with build_plot
Map columns correctly in build_nca_population - include amt and route for CL/Vz
Include quality metrics in run_nca_analysis

Quality Thresholds

Metric	Threshold	Meaning when met	Action if failed
rsq_adj_kel	>= 0.8	Reliable terminal slope	Review/exclude subject
auc_extrap_obs	<= 20%	Reliable AUCinf	Use AUClast instead
n_samples_kel	>= 3	Adequate data	Collect more samples

Common Mistakes to Avoid

Forgetting to map amt and route (can't calculate CL/Vz without them)
Skipping quality metrics (won't know if results are reliable)
Not visualizing data before analysis (miss outliers and data issues)