Simulation Walkthrough
A guided tutorial demonstrating how to run pharmacokinetic simulations using PumasAide. This walkthrough follows a realistic conversation between a pharmacometrician new to Pumas and the AI assistant.
Introduction
This tutorial demonstrates a complete simulation workflow using PumasAide's MCP (Model Context Protocol) tools. You'll see how to:
- Build a PK model for simulation
- Create dosing regimens (single and multiple dose)
- Generate a virtual population with covariates
- Run simulations with variability
- Visualize concentration-time profiles with prediction intervals
The conversation format shows practical questions new users often have. Simulation is often a great starting point because you can explore PK concepts without needing real data.
Getting Started
User: I want to learn how to run PK simulations. I understand basic PK concepts but I've never used Pumas. Can you walk me through it?
Assistant: I'll guide you through the complete simulation workflow. Let me explain how PumasAide works first.
PumasAide connects me to your Julia session through three tools. I look up documentation and API references with pumas_knowledge, write Julia scripts to the programs/ directory, and execute them with include_julia_file. Every step produces real, reusable Julia code.
Let me read the simulation workflow guide first.
Reads the simulation workflow via pumas_knowledge with doc_id: ["workflows/simulation.md"]
The workflow has these main steps:
- Build or select a model - Define the PK structure
- Set model parameters - Typical values and variability
- Create dosing regimens - What doses to simulate
- Generate population - Virtual subjects with covariates
- Run simulation - Execute and collect results
- Visualize - Plot concentration-time profiles
Let's simulate a dose-ranging study for a typical oral drug. This is a common scenario for early clinical development. Ready to start?
User: Yes, let's do it.
Building the Model
Assistant: First, we need a PK model. For a typical small molecule oral drug, a one-compartment model with first-order absorption is a good starting point.
Writes programs/04-models/01-sim-1cmt-ka.jl and executes with include_julia_file:
using Pumas
pk_model = @model begin
@param begin
tvKa ∈ RealDomain(lower = 0.0)
tvCL ∈ RealDomain(lower = 0.0)
tvVc ∈ RealDomain(lower = 0.0)
Ω ∈ PDiagDomain(3)
σ_prop ∈ RealDomain(lower = 0.0)
end
@random begin
η ~ MvNormal(Ω)
end
@pre begin
Ka = tvKa * exp(η[1])
CL = tvCL * exp(η[2])
Vc = tvVc * exp(η[3])
end
@dynamics Depots1Central1
@derived begin
μ := @. Central / Vc
cp ~ @. Normal(μ, abs(μ) * σ_prop)
end
endResult:
PumasModel
Parameters: tvKa, tvCL, tvVc, Ω, σ_prop
Random effects: η
Dynamical system: Depots1Central1
Derived: cp
Observed: cpUser: What do all those parameters mean?
Assistant: Great question! Let me break them down:
| Parameter | Meaning |
|---|---|
tvKa | Typical value of absorption rate constant (1/h). How fast drug is absorbed. |
tvCL | Typical value of clearance (L/h). How fast drug is eliminated. |
tvVc | Typical value of central volume (L). Distribution space for the drug. |
Ω | Between-subject variability (BSV). How much individuals differ from typical. |
σ_prop | Residual error. Measurement noise and model misspecification. |
The η are the random effects - each simulated subject gets their own random deviation from the typical values.
Setting Parameters
User: How do we choose parameter values?
Assistant: For this example, I'll use typical values for a small molecule oral drug:
- Ka = 1 hr^-1: Absorption mostly complete within 3-4 hours
- CL = 10 L/h: Moderate clearance
- Vc = 100 L: Volume larger than plasma (~3L) due to tissue distribution
- 30% CV for between-subject variability on each parameter
- 15% proportional error for residual variability
Adds parameters to the same model file:
sim_params =
(tvKa = 1.0, tvCL = 10.0, tvVc = 100.0, Ω = Diagonal([0.09, 0.09, 0.09]), σ_prop = 0.15)User: What does "30% CV" mean?
Assistant: CV stands for Coefficient of Variation - it's a way to express variability as a percentage of the mean.
If CL has a typical value of 10 L/h with 30% CV:
- Most subjects (68%) will have CL between 7-13 L/h
- A few subjects might have CL as low as 4 or as high as 16 L/h
Technically, omega = 0.3 is the standard deviation on the log scale. For moderate variability, this approximates 30% CV on the original scale. In the covariance matrix, we store variances: 0.3² = 0.09.
Creating Dosing Regimens
Assistant: Now let's create dosing regimens. For a dose-ranging study, we'll simulate three dose levels: 50, 100, and 200 mg given once daily for 5 days.
Writes programs/04-models/02-dose-regimens.jl and executes with include_julia_file:
using Pumas
dose_50mg = DosageRegimen(50; time = 0, cmt = 1, ii = 24, addl = 4)
dose_100mg = DosageRegimen(100; time = 0, cmt = 1, ii = 24, addl = 4)
dose_200mg = DosageRegimen(200; time = 0, cmt = 1, ii = 24, addl = 4)Result:
3 DosageRegimen objects created:
dose_50mg: 50 mg to cmt 1, repeated 5 times at 24h intervals
dose_100mg: 100 mg to cmt 1, repeated 5 times at 24h intervals
dose_200mg: 200 mg to cmt 1, repeated 5 times at 24h intervalsUser: What do ii and addl mean?
Assistant: These are NMTRAN-style dosing parameters:
| Parameter | Meaning | Example |
|---|---|---|
amt | Dose amount | 50 mg |
ii | Interdose interval | 24 hours (once daily) |
addl | Additional doses | 4 more doses (total of 5) |
cmt | Compartment | 1 = depot (oral dosing) |
So amt=50, ii=24, addl=4 means: "Give 50 mg at time 0, then repeat 4 more times every 24 hours."
Creating the Simulation Population
Assistant: Now we need virtual subjects to simulate. I'll create 30 subjects per dose group (90 total) with realistic weight distributions.
Writes programs/05-fitting/01-sim-population.jl and executes with include_julia_file:
import Random
using Pumas
Random.seed!(12345)
sim_pop = vcat(
[
Subject(;
id = i,
events = dose_50mg,
covariates = (wt = rand(Normal(70, 15)), dose_group = "50 mg"),
observations = (cp = nothing,),
time = 0:0.5:120,
) for i = 1:30
],
[
Subject(;
id = 30 + i,
events = dose_100mg,
covariates = (wt = rand(Normal(70, 15)), dose_group = "100 mg"),
observations = (cp = nothing,),
time = 0:0.5:120,
) for i = 1:30
],
[
Subject(;
id = 60 + i,
events = dose_200mg,
covariates = (wt = rand(Normal(70, 15)), dose_group = "200 mg"),
observations = (cp = nothing,),
time = 0:0.5:120,
) for i = 1:30
],
)Result:
Population with 90 subjects:
- 50 mg: 30 subjects
- 100 mg: 30 subjects
- 200 mg: 30 subjects
Observation times: 0, 0.5, 1, ... 120 hours (241 timepoints)
Covariates: wt ~ Normal(70, 15), dose_groupUser: What does 0:0.5:120 mean?
Assistant: That's Julia syntax for creating a range:
- Start: 0
- Step: 0.5 (every half hour)
- End: 120
So we're collecting simulated concentrations from time 0 to 120 hours (5 days) at 30-minute intervals. This gives us smooth concentration-time curves.
For sparse sampling (like a real study), you might use specific times like [0, 0.5, 1, 2, 4, 8, 12, 24, 48, 72, 96, 120].
Running the Simulation
Assistant: Now we run the simulation. This generates concentration-time profiles for all 90 subjects.
Writes programs/05-fitting/02-run-simulation.jl and executes with include_julia_file:
import Random
using Pumas, DataFrames, DataFramesMeta
Random.seed!(12345)
sims = simobs(pk_model, sim_pop, sim_params)
sim_results = @rsubset(DataFrame(sims), :evid == 0)Result:
Summary of `DataFrame` `sim_results`: 21690 rows × 5 columns
| Column | Type | Unique | Missing | Values |
|:-----------|:--------|-------:|--------:|:--------------------------|
| id | Int64 | 90 | 0 | |
| time | Float64 | 241 | 0 | |
| cp | Float64 | 21690 | 0 | |
| dose_group | String | 3 | 0 | "50 mg", "100 mg", "200 mg" |
Preview: http://localhost:8081/preview/sim_resultsUser: Why did you set a seed?
Assistant: The seed ensures reproducibility. Random number generators use seeds to start their sequence:
- Same seed → same random numbers → same simulated subjects
- Different seed → different random numbers → different subjects
Without a seed, you'd get different results each time you run the simulation. For reports and publications, setting a seed lets others reproduce your exact results.
Visualizing Results
User: Can we see what the concentration profiles look like?
Assistant: I'll create a plot showing individual profiles with a population mean overlay, faceted by dose group.
Writes programs/07-results/01-sim-profiles.jl and executes with include_julia_file:
using AlgebraOfGraphics, CairoMakie, DataFramesMeta, Statistics
# Individual profiles (faint) with mean overlay (bold)
plt_indiv =
data(sim_results) *
mapping(:time, :cp; group = :id => nonnumeric, layout = :dose_group) *
visual(Lines; alpha = 0.2)
# Compute mean profile per dose group
sim_mean = @chain sim_results begin
groupby([:dose_group, :time])
@combine :mean_cp = mean(:cp)
end
plt_mean =
data(sim_mean) *
mapping(:time, :mean_cp; layout = :dose_group) *
visual(Lines; linewidth = 2, color = :red)
conc_time_plot = draw(
plt_indiv + plt_mean;
axis = (
xlabel = "Time (h)",
ylabel = "Concentration (ng/mL)",
title = "Simulated Concentration-Time Profiles",
),
)Result:
Preview: http://localhost:8081/preview/conc_time_plotThe plot shows:
- Faint lines: Individual subject profiles (showing variability)
- Bold red lines: Population mean concentration
- Panels: One for each dose group
You can see dose-proportional increases in exposure and the accumulation pattern over 5 days of dosing.
User: Can we also show prediction intervals instead of all the individual lines?
Assistant: Sure, prediction intervals are cleaner for presentations. I'll show the median with 90% prediction interval.
Writes programs/07-results/02-sim-prediction-intervals.jl and executes with include_julia_file:
using AlgebraOfGraphics, CairoMakie, DataFramesMeta, Statistics
# Compute quantiles per dose group and time
sim_quantiles = @chain sim_results begin
groupby([:dose_group, :time])
@combine begin
:median_cp = median(:cp)
:lo_cp = quantile(:cp, 0.05)
:hi_cp = quantile(:cp, 0.95)
end
end
plt_band =
data(sim_quantiles) *
mapping(:time, :lo_cp, :hi_cp; layout = :dose_group) *
visual(Band; alpha = 0.3)
plt_median =
data(sim_quantiles) *
mapping(:time, :median_cp; layout = :dose_group) *
visual(Lines; linewidth = 2)
conc_time_pi = draw(
plt_band + plt_median;
axis = (
xlabel = "Time (h)",
ylabel = "Concentration (ng/mL)",
title = "Simulated Concentrations with 90% Prediction Interval",
),
)Result:
Preview: http://localhost:8081/preview/conc_time_piThis version shows:
- Shaded bands: 90% of simulated subjects fall within this range
- Lines: Population median for each dose
This is the typical format for simulation results in clinical pharmacology reports.
Exploring Exposure Metrics
User: What about summary statistics like Cmax and AUC?
Assistant: Good question. For simulations, we often want to summarize exposure metrics. Let me calculate Cmax and trough concentrations at steady state (day 5).
Writes programs/07-results/03-exposure-metrics.jl and executes with include_julia_file:
using DataFramesMeta, AlgebraOfGraphics, CairoMakie
# Steady-state is the last dosing interval (96-120h)
exposure_metrics = @chain sim_results begin
@rsubset :time >= 96
groupby([:id, :dose_group])
@combine begin
:cmax_ss = maximum(:cp)
:cmin_ss = minimum(:cp)
end
end
# Boxplot of steady-state Cmax
plt = data(exposure_metrics) * mapping(:dose_group, :cmax_ss) * visual(BoxPlot)
cmax_boxplot = draw(
plt;
axis = (
xlabel = "Dose",
ylabel = "Cmax,ss (ng/mL)",
title = "Steady-State Cmax by Dose Group",
),
)Result:
Preview: http://localhost:8081/preview/cmax_boxplotThe boxplot shows dose-proportional increases in Cmax, with the expected overlap in distributions due to between-subject variability.
Generated Files
User: Where can I find all the code that was generated?
Assistant: Every step generated an executable Julia script in the programs/ directory:
programs/
04-models/
01-sim-1cmt-ka.jl # Model definition + parameters
02-dose-regimens.jl # Dosing regimens
05-fitting/
01-sim-population.jl # Virtual population creation
02-run-simulation.jl # Simulation execution
07-results/
01-sim-profiles.jl # Individual profiles plot
02-sim-prediction-intervals.jl # Prediction interval plot
03-exposure-metrics.jl # Exposure calculations + boxplotThese files are:
- Reproducible: Run them again to recreate results
- Editable: Change parameters and re-execute with
include_julia_file - Educational: Read them to learn Pumas syntax
Summary
User: What did we accomplish?
Assistant: Here's what we completed:
- Built a one-compartment oral PK model with variability
- Set typical PK parameters (Ka, CL, Vc) with 30% BSV
- Created dosing regimens for 50, 100, and 200 mg once daily
- Generated 90 virtual subjects (30 per dose group)
- Simulated 5 days of dosing with residual error
- Visualized individual profiles and prediction intervals
- Calculated steady-state Cmax across dose groups
The simulation shows:
- Dose-proportional exposure increases
- Typical accumulation pattern for once-daily dosing
- Reasonable between-subject variability (~30% CV)
This type of simulation is commonly used for:
- Dose selection in early development
- Predicting exposure at untested doses
- Designing clinical trials
Key Takeaways
Workflow
- Define the model with the
@modelmacro - Set typical parameter values and variability
- Create dosing regimens with
DosageRegimen() - Build virtual subjects with
Subject() - Run simulations with
simobs() - Convert results to DataFrame and visualize with AlgebraOfGraphics
Parameter Guidelines
| Parameter | Typical Range | Notes |
|---|---|---|
| Ka | 0.5-2 hr^-1 | Faster = sharper Cmax |
| CL | 1-100 L/h | Drug-specific |
| Vc | 10-500 L | Usually > plasma volume |
| omega | 0.2-0.5 | ~20-50% CV |
| sigma_prop | 0.1-0.2 | ~10-20% residual error |
Common Variations
- IV dosing: Use
cmt=2(central compartment) instead of depot - Two-compartment: Use
Depots1Central1Periph1dynamics - Infusion: Add
durationparameter:DosageRegimen(500; duration=1) - Loading dose: Combine regimens:
DosageRegimen(loading, maintenance)
Common Mistakes to Avoid
- Forgetting to set
Random.seed!()for reproducibility - Using unrealistic parameter values (check literature)
- Too few subjects to see variability patterns (use 20+ per group)
- Not including residual error (makes predictions look too precise)