Simulation Walkthrough

A guided tutorial demonstrating how to run pharmacokinetic simulations using PumasAide. This walkthrough follows a realistic conversation between a pharmacometrician new to Pumas and the AI assistant.

Introduction

This tutorial demonstrates a complete simulation workflow using PumasAide's MCP (Model Context Protocol) tools. You'll see how to:

Build a PK model for simulation
Create dosing regimens (single and multiple dose)
Generate a virtual population with covariates
Run simulations with variability
Visualize concentration-time profiles with prediction intervals

The conversation format shows practical questions new users often have. Simulation is often a great starting point because you can explore PK concepts without needing real data.

Getting Started

User: I want to learn how to run PK simulations. I understand basic PK concepts but I've never used Pumas. Can you walk me through it?

Assistant: I'll guide you through the complete simulation workflow. Let me explain how PumasAide works first.

PumasAide provides specialized tools for pharmacometric analysis. Each tool performs a specific task - building models, creating doses, running simulations, making plots. When I use a tool, you'll see the parameters I'm sending and the results that come back.

Let me get the simulation workflow guidance first.

Calls analysis_workflows with analysis_type: "simulation"

The workflow has these main steps:

Build or select a model - Define the PK structure
Set model parameters - Typical values and variability
Create dosing regimens - What doses to simulate
Generate population - Virtual subjects with covariates
Run simulation - Execute and collect results
Visualize - Plot concentration-time profiles

Let's simulate a dose-ranging study for a typical oral drug. This is a common scenario for early clinical development. Ready to start?

User: Yes, let's do it.

Building the Model

Assistant: First, we need a PK model. For a typical small molecule oral drug, a one-compartment model with first-order absorption is a good starting point.

Calls build_pumas_model:

{
  "model_name": "pk_model",
  "explicit_model": "Depots1Central1",
  "observations": {
    "names": ["cp"],
    "error": "Proportional"
  }
}

Tool Result:

PumasModel
  Parameters: tvKa, tvCL, tvVc, omega_Ka, omega_CL, omega_Vc, sigma_prop
  Random effects: eta_Ka, eta_CL, eta_Vc
  Dynamical system variables: Depot, Central
  Derived: cp
  Observed: cp

Code written to: programs/models/pk_model.jl

User: What do all those parameters mean?

Assistant: Great question! Let me break them down:

Parameter	Meaning
`tvKa`	Typical value of absorption rate constant (1/h). How fast drug is absorbed.
`tvCL`	Typical value of clearance (L/h). How fast drug is eliminated.
`tvVc`	Typical value of central volume (L). Distribution space for the drug.
`omega_*`	Between-subject variability (BSV). How much individuals differ from typical.
`sigma_prop`	Residual error. Measurement noise and model misspecification.

The eta_* are the random effects - each simulated subject gets their own random deviation from the typical values.

Setting Parameters

User: How do we choose parameter values?

Assistant: For this example, I'll use typical values for a small molecule oral drug:

Ka = 1 hr^-1: Absorption mostly complete within 3-4 hours
CL = 10 L/h: Moderate clearance
Vc = 100 L: Volume larger than plasma (~3L) due to tissue distribution
30% CV for between-subject variability on each parameter
15% proportional error for residual variability

Calls pumas_model_parameters:

{
  "model_name": "pk_model",
  "parameters_name": "sim_params",
  "parameters": "(tvKa=1.0, tvCL=10.0, tvVc=100.0, omega_Ka=0.3, omega_CL=0.3, omega_Vc=0.3, sigma_prop=0.15)"
}

Tool Result:

(tvKa = 1.0, tvCL = 10.0, tvVc = 100.0, omega_Ka = 0.3, omega_CL = 0.3, omega_Vc = 0.3, sigma_prop = 0.15)

Code written to: programs/models/sim_params.jl

User: What does "30% CV" mean?

Assistant: CV stands for Coefficient of Variation - it's a way to express variability as a percentage of the mean.

If CL has a typical value of 10 L/h with 30% CV:

Most subjects (68%) will have CL between 7-13 L/h
A few subjects might have CL as low as 4 or as high as 16 L/h

Technically, omega = 0.3 is the standard deviation on the log scale. For moderate variability, this approximates 30% CV on the original scale.

Creating Dosing Regimens

Assistant: Now let's create dosing regimens. For a dose-ranging study, we'll simulate three dose levels: 50, 100, and 200 mg given once daily for 5 days.

Calls create_dosage_regimen:

{
  "regimen_name": "dose_50mg",
  "amt": 50,
  "route": "ev",
  "ii": 24,
  "addl": 4,
  "time": 0,
  "cmt": 1
}

Tool Result:

DosageRegimen
  1 dosing event
  Total dose: 50 mg to compartment 1 (extravascular)
  Repeated 5 times at 24 hour intervals

Code written to: programs/regimens/dose_50mg.jl

User: What do ii and addl mean?

Assistant: These are NMTRAN-style dosing parameters:

Parameter	Meaning	Example
`amt`	Dose amount	50 mg
`ii`	Interdose interval	24 hours (once daily)
`addl`	Additional doses	4 more doses (total of 5)
`cmt`	Compartment	1 = depot (oral dosing)
`route`	Administration route	"ev" = extravascular (oral)

So amt=50, ii=24, addl=4 means: "Give 50 mg at time 0, then repeat 4 more times every 24 hours."

Let me create the other two dose levels:

Calls create_dosage_regimen twice more:

{
  "regimen_name": "dose_100mg",
  "amt": 100,
  "route": "ev",
  "ii": 24,
  "addl": 4,
  "time": 0,
  "cmt": 1
}

{
  "regimen_name": "dose_200mg",
  "amt": 200,
  "route": "ev",
  "ii": 24,
  "addl": 4,
  "time": 0,
  "cmt": 1
}

Tool Results:

DosageRegimen (dose_100mg): 100 mg × 5 doses at 24h intervals
DosageRegimen (dose_200mg): 200 mg × 5 doses at 24h intervals

Creating the Simulation Population

Assistant: Now we need virtual subjects to simulate. I'll create 30 subjects per dose group (90 total) with realistic weight distributions.

Calls create_simulation_population:

{
  "population_name": "sim_pop",
  "subpopulations": [
    {
      "n_subjects": 30,
      "regimen_name": "dose_50mg",
      "label": "50 mg"
    },
    {
      "n_subjects": 30,
      "regimen_name": "dose_100mg",
      "label": "100 mg"
    },
    {
      "n_subjects": 30,
      "regimen_name": "dose_200mg",
      "label": "200 mg"
    }
  ],
  "observation_times": "0:0.5:120",
  "covariates": {
    "wt": {"distribution": "Normal", "mean": 70, "std": 15}
  }
}

Tool Result:

Population with 90 subjects:
  - 50 mg: 30 subjects
  - 100 mg: 30 subjects
  - 200 mg: 30 subjects
Observation times: 0, 0.5, 1, ... 120 hours (241 timepoints)
Covariates: wt ~ Normal(70, 15)

Code written to: programs/simulations/sim_pop.jl

User: What does 0:0.5:120 mean?

Assistant: That's Julia syntax for creating a range:

Start: 0
Step: 0.5 (every half hour)
End: 120

So we're collecting simulated concentrations from time 0 to 120 hours (5 days) at 30-minute intervals. This gives us smooth concentration-time curves.

For sparse sampling (like a real study), you might use specific times like [0, 0.5, 1, 2, 4, 8, 12, 24, 48, 72, 96, 120].

Running the Simulation

Assistant: Now we run the simulation. This generates concentration-time profiles for all 90 subjects.

Calls run_simulation:

{
  "simulation_name": "sim_results",
  "model_name": "pk_model",
  "population_name": "sim_pop",
  "parameters_name": "sim_params",
  "include_residual_error": true,
  "seed": 12345
}

Tool Result:

# Summary of DataFrame `sim_results`
21690 rows and 5 columns.
 - id::Int64, 90 unique, 0 missing
 - time::Float64, 241 unique, 0 missing
 - cp::Float64, 21690 unique, 0 missing
 - evid::Int64, 1 unique, 0 missing, values: 0
 - dose_group::String, 3 unique, 0 missing, values: "50 mg", "100 mg", "200 mg"

Simulation results can be viewed at http://localhost:8081/preview/sim_results
Code written to: programs/simulations/sim_results.jl

User: Why did you set a seed?

Assistant: The seed ensures reproducibility. Random number generators use seeds to start their sequence:

Same seed → same random numbers → same simulated subjects
Different seed → different random numbers → different subjects

Without a seed, you'd get different results each time you run the simulation. For reports and publications, setting a seed lets others reproduce your exact results.

Visualizing Results

User: Can we see what the concentration profiles look like?

Assistant: I'll create a plot showing individual profiles with a population summary.

Calls build_plot:

{
  "plot_name": "conc_time_plot",
  "layers": [
    {
      "data": "sim_results",
      "mapping": ":time, :cp; group=:id => nonnumeric, color=:dose_group",
      "visual": ["visual(Lines, alpha=0.2)"]
    },
    {
      "data": "sim_results",
      "mapping": ":time, :cp; color=:dose_group",
      "visual": ["visual(Lines, linewidth=2)"],
      "statistical_transformations": ["expectation()"]
    }
  ],
  "title": "Simulated Concentration-Time Profiles",
  "xlabel": "Time (h)",
  "ylabel": "Concentration (ng/mL)",
  "facet": {"layout": ":dose_group"}
}

Tool Result:

Plot can be viewed at http://localhost:8081/preview/conc_time_plot
Code written to: programs/plots/conc_time_plot.jl

The plot shows:

Faint lines: Individual subject profiles (showing variability)
Bold lines: Population mean concentration
Panels: One for each dose group

You can see dose-proportional increases in exposure and the accumulation pattern over 5 days of dosing.

User: Can we also show prediction intervals instead of all the individual lines?

Assistant: Sure, prediction intervals are cleaner for presentations. I'll show the median with 90% prediction interval.

Calls build_plot:

{
  "plot_name": "conc_time_pi",
  "layers": [
    {
      "data": "sim_results",
      "mapping": ":time, :cp; color=:dose_group",
      "visual": ["visual(Band, alpha=0.3)"],
      "statistical_transformations": ["interval(coverage=0.9)"]
    },
    {
      "data": "sim_results",
      "mapping": ":time, :cp; color=:dose_group",
      "visual": ["visual(Lines, linewidth=2)"],
      "statistical_transformations": ["expectation()"]
    }
  ],
  "title": "Simulated Concentrations with 90% Prediction Interval",
  "xlabel": "Time (h)",
  "ylabel": "Concentration (ng/mL)"
}

Tool Result:

Plot can be viewed at http://localhost:8081/preview/conc_time_pi
Code written to: programs/plots/conc_time_pi.jl

This version shows:

Shaded bands: 90% of simulated subjects fall within this range
Lines: Population mean for each dose

This is the typical format for simulation results in clinical pharmacology reports.

Exploring Exposure Metrics

User: What about summary statistics like Cmax and AUC?

Assistant: Good question. For simulations, we often want to summarize exposure metrics. Let me calculate Cmax and trough concentrations at steady state (day 5).

Calls data_wrangler:

{
  "transformations": [
    {
      "type": "filter",
      "dataframe_name": "sim_results",
      "condition": ":time >= 96"
    },
    {
      "type": "group_summarize",
      "by": ["id", "dose_group"],
      "aggregations": {
        "cmax_ss": "maximum(:cp)",
        "cmin_ss": "minimum(:cp)"
      }
    }
  ],
  "output_name": "exposure_metrics"
}

Tool Result:

# Summary of DataFrame `exposure_metrics`
90 rows and 4 columns.
 - id::Int64, 90 unique
 - dose_group::String, 3 unique
 - cmax_ss::Float64, 90 unique
 - cmin_ss::Float64, 90 unique

Code written to: programs/data/exposure_metrics.jl

Now let's visualize the exposure distributions:

Calls build_plot:

{
  "plot_name": "cmax_boxplot",
  "layers": [{
    "data": "exposure_metrics",
    "mapping": ":dose_group, :cmax_ss",
    "visual": ["visual(BoxPlot)"]
  }],
  "title": "Steady-State Cmax by Dose Group",
  "xlabel": "Dose",
  "ylabel": "Cmax,ss (ng/mL)"
}

Tool Result:

Plot can be viewed at http://localhost:8081/preview/cmax_boxplot
Code written to: programs/plots/cmax_boxplot.jl

The boxplot shows dose-proportional increases in Cmax, with the expected overlap in distributions due to between-subject variability.

Generated Files

User: Where can I find all the code that was generated?

Assistant: Every tool operation generates executable Julia code in the programs/ directory:

programs/
  models/
    pk_model.jl           # Model definition
    sim_params.jl         # Parameter values
  regimens/
    dose_50mg.jl          # 50 mg regimen
    dose_100mg.jl         # 100 mg regimen
    dose_200mg.jl         # 200 mg regimen
  simulations/
    sim_pop.jl            # Population definition
    sim_results.jl        # Simulation execution
  data/
    exposure_metrics.jl   # Exposure calculations
  plots/
    conc_time_plot.jl     # Individual profiles
    conc_time_pi.jl       # Prediction intervals
    cmax_boxplot.jl       # Cmax comparison

These files are:

Reproducible: Run them again to recreate results
Editable: Change parameters and re-execute
Educational: Read them to learn Pumas syntax

Summary

User: What did we accomplish?

Assistant: Here's what we completed:

Built a one-compartment oral PK model with variability
Set typical PK parameters (Ka, CL, Vc) with 30% BSV
Created dosing regimens for 50, 100, and 200 mg once daily
Generated 90 virtual subjects (30 per dose group)
Simulated 5 days of dosing with residual error
Visualized individual profiles and prediction intervals
Calculated steady-state Cmax across dose groups

The simulation shows:

Dose-proportional exposure increases
Typical accumulation pattern for once-daily dosing
Reasonable between-subject variability (~30% CV)

This type of simulation is commonly used for:

Dose selection in early development
Predicting exposure at untested doses
Designing clinical trials

Key Takeaways

Tool Workflow

build_pumas_model - Define the PK structure
pumas_model_parameters - Set typical values and variability
create_dosage_regimen - Define doses (can create multiple)
create_simulation_population - Generate virtual subjects
run_simulation - Execute and collect results
build_plot - Visualize concentration-time profiles

Parameter Guidelines

Parameter	Typical Range	Notes
Ka	0.5-2 hr^-1	Faster = sharper Cmax
CL	1-100 L/h	Drug-specific
Vc	10-500 L	Usually > plasma volume
omega	0.2-0.5	~20-50% CV
sigma_prop	0.1-0.2	~10-20% residual error

Common Variations

IV dosing: Use route: "iv" and cmt: 2 (central compartment)
Two-compartment: Use explicit_model: "Depots1Central1Periph1"
Infusion: Add rate parameter for infusion duration
Loading dose: Create separate regimen for first dose

Common Mistakes to Avoid

Forgetting to set seed for reproducibility
Using unrealistic parameter values (check literature)
Too few subjects to see variability patterns (use 20+ per group)
Not including residual error (makes predictions look too precise)