Simulation Walkthrough
A guided tutorial demonstrating how to run pharmacokinetic simulations using PumasAide. This walkthrough follows a realistic conversation between a pharmacometrician new to Pumas and the AI assistant.
Introduction
This tutorial demonstrates a complete simulation workflow using PumasAide's MCP (Model Context Protocol) tools. You'll see how to:
- Build a PK model for simulation
- Create dosing regimens (single and multiple dose)
- Generate a virtual population with covariates
- Run simulations with variability
- Visualize concentration-time profiles with prediction intervals
The conversation format shows practical questions new users often have. Simulation is often a great starting point because you can explore PK concepts without needing real data.
Getting Started
User: I want to learn how to run PK simulations. I understand basic PK concepts but I've never used Pumas. Can you walk me through it?
Assistant: I'll guide you through the complete simulation workflow. Let me explain how PumasAide works first.
PumasAide provides specialized tools for pharmacometric analysis. Each tool performs a specific task - building models, creating doses, running simulations, making plots. When I use a tool, you'll see the parameters I'm sending and the results that come back.
Let me get the simulation workflow guidance first.
Calls analysis_workflows with analysis_type: "simulation"
The workflow has these main steps:
- Build or select a model - Define the PK structure
- Set model parameters - Typical values and variability
- Create dosing regimens - What doses to simulate
- Generate population - Virtual subjects with covariates
- Run simulation - Execute and collect results
- Visualize - Plot concentration-time profiles
Let's simulate a dose-ranging study for a typical oral drug. This is a common scenario for early clinical development. Ready to start?
User: Yes, let's do it.
Building the Model
Assistant: First, we need a PK model. For a typical small molecule oral drug, a one-compartment model with first-order absorption is a good starting point.
Calls build_pumas_model:
{
"model_name": "pk_model",
"explicit_model": "Depots1Central1",
"observations": {
"names": ["cp"],
"error": "Proportional"
}
}Tool Result:
PumasModel
Parameters: tvKa, tvCL, tvVc, omega_Ka, omega_CL, omega_Vc, sigma_prop
Random effects: eta_Ka, eta_CL, eta_Vc
Dynamical system variables: Depot, Central
Derived: cp
Observed: cp
Code written to: programs/models/pk_model.jlUser: What do all those parameters mean?
Assistant: Great question! Let me break them down:
| Parameter | Meaning |
|---|---|
tvKa | Typical value of absorption rate constant (1/h). How fast drug is absorbed. |
tvCL | Typical value of clearance (L/h). How fast drug is eliminated. |
tvVc | Typical value of central volume (L). Distribution space for the drug. |
omega_* | Between-subject variability (BSV). How much individuals differ from typical. |
sigma_prop | Residual error. Measurement noise and model misspecification. |
The eta_* are the random effects - each simulated subject gets their own random deviation from the typical values.
Setting Parameters
User: How do we choose parameter values?
Assistant: For this example, I'll use typical values for a small molecule oral drug:
- Ka = 1 hr^-1: Absorption mostly complete within 3-4 hours
- CL = 10 L/h: Moderate clearance
- Vc = 100 L: Volume larger than plasma (~3L) due to tissue distribution
- 30% CV for between-subject variability on each parameter
- 15% proportional error for residual variability
Calls pumas_model_parameters:
{
"model_name": "pk_model",
"parameters_name": "sim_params",
"parameters": "(tvKa=1.0, tvCL=10.0, tvVc=100.0, omega_Ka=0.3, omega_CL=0.3, omega_Vc=0.3, sigma_prop=0.15)"
}Tool Result:
(tvKa = 1.0, tvCL = 10.0, tvVc = 100.0, omega_Ka = 0.3, omega_CL = 0.3, omega_Vc = 0.3, sigma_prop = 0.15)
Code written to: programs/models/sim_params.jlUser: What does "30% CV" mean?
Assistant: CV stands for Coefficient of Variation - it's a way to express variability as a percentage of the mean.
If CL has a typical value of 10 L/h with 30% CV:
- Most subjects (68%) will have CL between 7-13 L/h
- A few subjects might have CL as low as 4 or as high as 16 L/h
Technically, omega = 0.3 is the standard deviation on the log scale. For moderate variability, this approximates 30% CV on the original scale.
Creating Dosing Regimens
Assistant: Now let's create dosing regimens. For a dose-ranging study, we'll simulate three dose levels: 50, 100, and 200 mg given once daily for 5 days.
Calls create_dosage_regimen:
{
"regimen_name": "dose_50mg",
"amt": 50,
"route": "ev",
"ii": 24,
"addl": 4,
"time": 0,
"cmt": 1
}Tool Result:
DosageRegimen
1 dosing event
Total dose: 50 mg to compartment 1 (extravascular)
Repeated 5 times at 24 hour intervals
Code written to: programs/regimens/dose_50mg.jlUser: What do ii and addl mean?
Assistant: These are NMTRAN-style dosing parameters:
| Parameter | Meaning | Example |
|---|---|---|
amt | Dose amount | 50 mg |
ii | Interdose interval | 24 hours (once daily) |
addl | Additional doses | 4 more doses (total of 5) |
cmt | Compartment | 1 = depot (oral dosing) |
route | Administration route | "ev" = extravascular (oral) |
So amt=50, ii=24, addl=4 means: "Give 50 mg at time 0, then repeat 4 more times every 24 hours."
Let me create the other two dose levels:
Calls create_dosage_regimen twice more:
{
"regimen_name": "dose_100mg",
"amt": 100,
"route": "ev",
"ii": 24,
"addl": 4,
"time": 0,
"cmt": 1
}{
"regimen_name": "dose_200mg",
"amt": 200,
"route": "ev",
"ii": 24,
"addl": 4,
"time": 0,
"cmt": 1
}Tool Results:
DosageRegimen (dose_100mg): 100 mg × 5 doses at 24h intervals
DosageRegimen (dose_200mg): 200 mg × 5 doses at 24h intervalsCreating the Simulation Population
Assistant: Now we need virtual subjects to simulate. I'll create 30 subjects per dose group (90 total) with realistic weight distributions.
Calls create_simulation_population:
{
"population_name": "sim_pop",
"subpopulations": [
{
"n_subjects": 30,
"regimen_name": "dose_50mg",
"label": "50 mg"
},
{
"n_subjects": 30,
"regimen_name": "dose_100mg",
"label": "100 mg"
},
{
"n_subjects": 30,
"regimen_name": "dose_200mg",
"label": "200 mg"
}
],
"observation_times": "0:0.5:120",
"covariates": {
"wt": {"distribution": "Normal", "mean": 70, "std": 15}
}
}Tool Result:
Population with 90 subjects:
- 50 mg: 30 subjects
- 100 mg: 30 subjects
- 200 mg: 30 subjects
Observation times: 0, 0.5, 1, ... 120 hours (241 timepoints)
Covariates: wt ~ Normal(70, 15)
Code written to: programs/simulations/sim_pop.jlUser: What does 0:0.5:120 mean?
Assistant: That's Julia syntax for creating a range:
- Start: 0
- Step: 0.5 (every half hour)
- End: 120
So we're collecting simulated concentrations from time 0 to 120 hours (5 days) at 30-minute intervals. This gives us smooth concentration-time curves.
For sparse sampling (like a real study), you might use specific times like [0, 0.5, 1, 2, 4, 8, 12, 24, 48, 72, 96, 120].
Running the Simulation
Assistant: Now we run the simulation. This generates concentration-time profiles for all 90 subjects.
Calls run_simulation:
{
"simulation_name": "sim_results",
"model_name": "pk_model",
"population_name": "sim_pop",
"parameters_name": "sim_params",
"include_residual_error": true,
"seed": 12345
}Tool Result:
# Summary of DataFrame `sim_results`
21690 rows and 5 columns.
- id::Int64, 90 unique, 0 missing
- time::Float64, 241 unique, 0 missing
- cp::Float64, 21690 unique, 0 missing
- evid::Int64, 1 unique, 0 missing, values: 0
- dose_group::String, 3 unique, 0 missing, values: "50 mg", "100 mg", "200 mg"
Simulation results can be viewed at http://localhost:8081/preview/sim_results
Code written to: programs/simulations/sim_results.jlUser: Why did you set a seed?
Assistant: The seed ensures reproducibility. Random number generators use seeds to start their sequence:
- Same seed → same random numbers → same simulated subjects
- Different seed → different random numbers → different subjects
Without a seed, you'd get different results each time you run the simulation. For reports and publications, setting a seed lets others reproduce your exact results.
Visualizing Results
User: Can we see what the concentration profiles look like?
Assistant: I'll create a plot showing individual profiles with a population summary.
Calls build_plot:
{
"plot_name": "conc_time_plot",
"layers": [
{
"data": "sim_results",
"mapping": ":time, :cp; group=:id => nonnumeric, color=:dose_group",
"visual": ["visual(Lines, alpha=0.2)"]
},
{
"data": "sim_results",
"mapping": ":time, :cp; color=:dose_group",
"visual": ["visual(Lines, linewidth=2)"],
"statistical_transformations": ["expectation()"]
}
],
"title": "Simulated Concentration-Time Profiles",
"xlabel": "Time (h)",
"ylabel": "Concentration (ng/mL)",
"facet": {"layout": ":dose_group"}
}Tool Result:
Plot can be viewed at http://localhost:8081/preview/conc_time_plot
Code written to: programs/plots/conc_time_plot.jlThe plot shows:
- Faint lines: Individual subject profiles (showing variability)
- Bold lines: Population mean concentration
- Panels: One for each dose group
You can see dose-proportional increases in exposure and the accumulation pattern over 5 days of dosing.
User: Can we also show prediction intervals instead of all the individual lines?
Assistant: Sure, prediction intervals are cleaner for presentations. I'll show the median with 90% prediction interval.
Calls build_plot:
{
"plot_name": "conc_time_pi",
"layers": [
{
"data": "sim_results",
"mapping": ":time, :cp; color=:dose_group",
"visual": ["visual(Band, alpha=0.3)"],
"statistical_transformations": ["interval(coverage=0.9)"]
},
{
"data": "sim_results",
"mapping": ":time, :cp; color=:dose_group",
"visual": ["visual(Lines, linewidth=2)"],
"statistical_transformations": ["expectation()"]
}
],
"title": "Simulated Concentrations with 90% Prediction Interval",
"xlabel": "Time (h)",
"ylabel": "Concentration (ng/mL)"
}Tool Result:
Plot can be viewed at http://localhost:8081/preview/conc_time_pi
Code written to: programs/plots/conc_time_pi.jlThis version shows:
- Shaded bands: 90% of simulated subjects fall within this range
- Lines: Population mean for each dose
This is the typical format for simulation results in clinical pharmacology reports.
Exploring Exposure Metrics
User: What about summary statistics like Cmax and AUC?
Assistant: Good question. For simulations, we often want to summarize exposure metrics. Let me calculate Cmax and trough concentrations at steady state (day 5).
Calls data_wrangler:
{
"transformations": [
{
"type": "filter",
"dataframe_name": "sim_results",
"condition": ":time >= 96"
},
{
"type": "group_summarize",
"by": ["id", "dose_group"],
"aggregations": {
"cmax_ss": "maximum(:cp)",
"cmin_ss": "minimum(:cp)"
}
}
],
"output_name": "exposure_metrics"
}Tool Result:
# Summary of DataFrame `exposure_metrics`
90 rows and 4 columns.
- id::Int64, 90 unique
- dose_group::String, 3 unique
- cmax_ss::Float64, 90 unique
- cmin_ss::Float64, 90 unique
Code written to: programs/data/exposure_metrics.jlNow let's visualize the exposure distributions:
Calls build_plot:
{
"plot_name": "cmax_boxplot",
"layers": [{
"data": "exposure_metrics",
"mapping": ":dose_group, :cmax_ss",
"visual": ["visual(BoxPlot)"]
}],
"title": "Steady-State Cmax by Dose Group",
"xlabel": "Dose",
"ylabel": "Cmax,ss (ng/mL)"
}Tool Result:
Plot can be viewed at http://localhost:8081/preview/cmax_boxplot
Code written to: programs/plots/cmax_boxplot.jlThe boxplot shows dose-proportional increases in Cmax, with the expected overlap in distributions due to between-subject variability.
Generated Files
User: Where can I find all the code that was generated?
Assistant: Every tool operation generates executable Julia code in the programs/ directory:
programs/
models/
pk_model.jl # Model definition
sim_params.jl # Parameter values
regimens/
dose_50mg.jl # 50 mg regimen
dose_100mg.jl # 100 mg regimen
dose_200mg.jl # 200 mg regimen
simulations/
sim_pop.jl # Population definition
sim_results.jl # Simulation execution
data/
exposure_metrics.jl # Exposure calculations
plots/
conc_time_plot.jl # Individual profiles
conc_time_pi.jl # Prediction intervals
cmax_boxplot.jl # Cmax comparisonThese files are:
Reproducible: Run them again to recreate results
Editable: Change parameters and re-execute
Educational: Read them to learn Pumas syntax
Summary
User: What did we accomplish?
Assistant: Here's what we completed:
- Built a one-compartment oral PK model with variability
- Set typical PK parameters (Ka, CL, Vc) with 30% BSV
- Created dosing regimens for 50, 100, and 200 mg once daily
- Generated 90 virtual subjects (30 per dose group)
- Simulated 5 days of dosing with residual error
- Visualized individual profiles and prediction intervals
- Calculated steady-state Cmax across dose groups
The simulation shows:
- Dose-proportional exposure increases
- Typical accumulation pattern for once-daily dosing
- Reasonable between-subject variability (~30% CV)
This type of simulation is commonly used for:
Dose selection in early development
Predicting exposure at untested doses
Designing clinical trials
Key Takeaways
Tool Workflow
build_pumas_model- Define the PK structurepumas_model_parameters- Set typical values and variabilitycreate_dosage_regimen- Define doses (can create multiple)create_simulation_population- Generate virtual subjectsrun_simulation- Execute and collect resultsbuild_plot- Visualize concentration-time profiles
Parameter Guidelines
| Parameter | Typical Range | Notes |
|---|---|---|
| Ka | 0.5-2 hr^-1 | Faster = sharper Cmax |
| CL | 1-100 L/h | Drug-specific |
| Vc | 10-500 L | Usually > plasma volume |
| omega | 0.2-0.5 | ~20-50% CV |
| sigma_prop | 0.1-0.2 | ~10-20% residual error |
Common Variations
- IV dosing: Use
route: "iv"andcmt: 2(central compartment) - Two-compartment: Use
explicit_model: "Depots1Central1Periph1" - Infusion: Add
rateparameter for infusion duration - Loading dose: Create separate regimen for first dose
Common Mistakes to Avoid
- Forgetting to set
seedfor reproducibility - Using unrealistic parameter values (check literature)
- Too few subjects to see variability patterns (use 20+ per group)
- Not including residual error (makes predictions look too precise)