Simulation Walkthrough

A guided tutorial demonstrating how to run pharmacokinetic simulations using PumasAide. This walkthrough follows a realistic conversation between a pharmacometrician new to Pumas and the AI assistant.

Introduction

This tutorial demonstrates a complete simulation workflow using PumasAide's MCP (Model Context Protocol) tools. You'll see how to:

  • Build a PK model for simulation
  • Create dosing regimens (single and multiple dose)
  • Generate a virtual population with covariates
  • Run simulations with variability
  • Visualize concentration-time profiles with prediction intervals

The conversation format shows practical questions new users often have. Simulation is often a great starting point because you can explore PK concepts without needing real data.

Getting Started

User: I want to learn how to run PK simulations. I understand basic PK concepts but I've never used Pumas. Can you walk me through it?

Assistant: I'll guide you through the complete simulation workflow. Let me explain how PumasAide works first.

PumasAide provides specialized tools for pharmacometric analysis. Each tool performs a specific task - building models, creating doses, running simulations, making plots. When I use a tool, you'll see the parameters I'm sending and the results that come back.

Let me get the simulation workflow guidance first.

Calls analysis_workflows with analysis_type: "simulation"

The workflow has these main steps:

  1. Build or select a model - Define the PK structure
  2. Set model parameters - Typical values and variability
  3. Create dosing regimens - What doses to simulate
  4. Generate population - Virtual subjects with covariates
  5. Run simulation - Execute and collect results
  6. Visualize - Plot concentration-time profiles

Let's simulate a dose-ranging study for a typical oral drug. This is a common scenario for early clinical development. Ready to start?

User: Yes, let's do it.

Building the Model

Assistant: First, we need a PK model. For a typical small molecule oral drug, a one-compartment model with first-order absorption is a good starting point.

Calls build_pumas_model:

{
  "model_name": "pk_model",
  "explicit_model": "Depots1Central1",
  "observations": {
    "names": ["cp"],
    "error": "Proportional"
  }
}

Tool Result:

PumasModel
  Parameters: tvKa, tvCL, tvVc, omega_Ka, omega_CL, omega_Vc, sigma_prop
  Random effects: eta_Ka, eta_CL, eta_Vc
  Dynamical system variables: Depot, Central
  Derived: cp
  Observed: cp

Code written to: programs/models/pk_model.jl

User: What do all those parameters mean?

Assistant: Great question! Let me break them down:

ParameterMeaning
tvKaTypical value of absorption rate constant (1/h). How fast drug is absorbed.
tvCLTypical value of clearance (L/h). How fast drug is eliminated.
tvVcTypical value of central volume (L). Distribution space for the drug.
omega_*Between-subject variability (BSV). How much individuals differ from typical.
sigma_propResidual error. Measurement noise and model misspecification.

The eta_* are the random effects - each simulated subject gets their own random deviation from the typical values.

Setting Parameters

User: How do we choose parameter values?

Assistant: For this example, I'll use typical values for a small molecule oral drug:

  • Ka = 1 hr^-1: Absorption mostly complete within 3-4 hours
  • CL = 10 L/h: Moderate clearance
  • Vc = 100 L: Volume larger than plasma (~3L) due to tissue distribution
  • 30% CV for between-subject variability on each parameter
  • 15% proportional error for residual variability

Calls pumas_model_parameters:

{
  "model_name": "pk_model",
  "parameters_name": "sim_params",
  "parameters": "(tvKa=1.0, tvCL=10.0, tvVc=100.0, omega_Ka=0.3, omega_CL=0.3, omega_Vc=0.3, sigma_prop=0.15)"
}

Tool Result:

(tvKa = 1.0, tvCL = 10.0, tvVc = 100.0, omega_Ka = 0.3, omega_CL = 0.3, omega_Vc = 0.3, sigma_prop = 0.15)

Code written to: programs/models/sim_params.jl

User: What does "30% CV" mean?

Assistant: CV stands for Coefficient of Variation - it's a way to express variability as a percentage of the mean.

If CL has a typical value of 10 L/h with 30% CV:

  • Most subjects (68%) will have CL between 7-13 L/h
  • A few subjects might have CL as low as 4 or as high as 16 L/h

Technically, omega = 0.3 is the standard deviation on the log scale. For moderate variability, this approximates 30% CV on the original scale.

Creating Dosing Regimens

Assistant: Now let's create dosing regimens. For a dose-ranging study, we'll simulate three dose levels: 50, 100, and 200 mg given once daily for 5 days.

Calls create_dosage_regimen:

{
  "regimen_name": "dose_50mg",
  "amt": 50,
  "route": "ev",
  "ii": 24,
  "addl": 4,
  "time": 0,
  "cmt": 1
}

Tool Result:

DosageRegimen
  1 dosing event
  Total dose: 50 mg to compartment 1 (extravascular)
  Repeated 5 times at 24 hour intervals

Code written to: programs/regimens/dose_50mg.jl

User: What do ii and addl mean?

Assistant: These are NMTRAN-style dosing parameters:

ParameterMeaningExample
amtDose amount50 mg
iiInterdose interval24 hours (once daily)
addlAdditional doses4 more doses (total of 5)
cmtCompartment1 = depot (oral dosing)
routeAdministration route"ev" = extravascular (oral)

So amt=50, ii=24, addl=4 means: "Give 50 mg at time 0, then repeat 4 more times every 24 hours."

Let me create the other two dose levels:

Calls create_dosage_regimen twice more:

{
  "regimen_name": "dose_100mg",
  "amt": 100,
  "route": "ev",
  "ii": 24,
  "addl": 4,
  "time": 0,
  "cmt": 1
}
{
  "regimen_name": "dose_200mg",
  "amt": 200,
  "route": "ev",
  "ii": 24,
  "addl": 4,
  "time": 0,
  "cmt": 1
}

Tool Results:

DosageRegimen (dose_100mg): 100 mg × 5 doses at 24h intervals
DosageRegimen (dose_200mg): 200 mg × 5 doses at 24h intervals

Creating the Simulation Population

Assistant: Now we need virtual subjects to simulate. I'll create 30 subjects per dose group (90 total) with realistic weight distributions.

Calls create_simulation_population:

{
  "population_name": "sim_pop",
  "subpopulations": [
    {
      "n_subjects": 30,
      "regimen_name": "dose_50mg",
      "label": "50 mg"
    },
    {
      "n_subjects": 30,
      "regimen_name": "dose_100mg",
      "label": "100 mg"
    },
    {
      "n_subjects": 30,
      "regimen_name": "dose_200mg",
      "label": "200 mg"
    }
  ],
  "observation_times": "0:0.5:120",
  "covariates": {
    "wt": {"distribution": "Normal", "mean": 70, "std": 15}
  }
}

Tool Result:

Population with 90 subjects:
  - 50 mg: 30 subjects
  - 100 mg: 30 subjects
  - 200 mg: 30 subjects
Observation times: 0, 0.5, 1, ... 120 hours (241 timepoints)
Covariates: wt ~ Normal(70, 15)

Code written to: programs/simulations/sim_pop.jl

User: What does 0:0.5:120 mean?

Assistant: That's Julia syntax for creating a range:

  • Start: 0
  • Step: 0.5 (every half hour)
  • End: 120

So we're collecting simulated concentrations from time 0 to 120 hours (5 days) at 30-minute intervals. This gives us smooth concentration-time curves.

For sparse sampling (like a real study), you might use specific times like [0, 0.5, 1, 2, 4, 8, 12, 24, 48, 72, 96, 120].

Running the Simulation

Assistant: Now we run the simulation. This generates concentration-time profiles for all 90 subjects.

Calls run_simulation:

{
  "simulation_name": "sim_results",
  "model_name": "pk_model",
  "population_name": "sim_pop",
  "parameters_name": "sim_params",
  "include_residual_error": true,
  "seed": 12345
}

Tool Result:

# Summary of DataFrame `sim_results`
21690 rows and 5 columns.
 - id::Int64, 90 unique, 0 missing
 - time::Float64, 241 unique, 0 missing
 - cp::Float64, 21690 unique, 0 missing
 - evid::Int64, 1 unique, 0 missing, values: 0
 - dose_group::String, 3 unique, 0 missing, values: "50 mg", "100 mg", "200 mg"

Simulation results can be viewed at http://localhost:8081/preview/sim_results
Code written to: programs/simulations/sim_results.jl

User: Why did you set a seed?

Assistant: The seed ensures reproducibility. Random number generators use seeds to start their sequence:

  • Same seed → same random numbers → same simulated subjects
  • Different seed → different random numbers → different subjects

Without a seed, you'd get different results each time you run the simulation. For reports and publications, setting a seed lets others reproduce your exact results.

Visualizing Results

User: Can we see what the concentration profiles look like?

Assistant: I'll create a plot showing individual profiles with a population summary.

Calls build_plot:

{
  "plot_name": "conc_time_plot",
  "layers": [
    {
      "data": "sim_results",
      "mapping": ":time, :cp; group=:id => nonnumeric, color=:dose_group",
      "visual": ["visual(Lines, alpha=0.2)"]
    },
    {
      "data": "sim_results",
      "mapping": ":time, :cp; color=:dose_group",
      "visual": ["visual(Lines, linewidth=2)"],
      "statistical_transformations": ["expectation()"]
    }
  ],
  "title": "Simulated Concentration-Time Profiles",
  "xlabel": "Time (h)",
  "ylabel": "Concentration (ng/mL)",
  "facet": {"layout": ":dose_group"}
}

Tool Result:

Plot can be viewed at http://localhost:8081/preview/conc_time_plot
Code written to: programs/plots/conc_time_plot.jl

The plot shows:

  • Faint lines: Individual subject profiles (showing variability)
  • Bold lines: Population mean concentration
  • Panels: One for each dose group

You can see dose-proportional increases in exposure and the accumulation pattern over 5 days of dosing.

User: Can we also show prediction intervals instead of all the individual lines?

Assistant: Sure, prediction intervals are cleaner for presentations. I'll show the median with 90% prediction interval.

Calls build_plot:

{
  "plot_name": "conc_time_pi",
  "layers": [
    {
      "data": "sim_results",
      "mapping": ":time, :cp; color=:dose_group",
      "visual": ["visual(Band, alpha=0.3)"],
      "statistical_transformations": ["interval(coverage=0.9)"]
    },
    {
      "data": "sim_results",
      "mapping": ":time, :cp; color=:dose_group",
      "visual": ["visual(Lines, linewidth=2)"],
      "statistical_transformations": ["expectation()"]
    }
  ],
  "title": "Simulated Concentrations with 90% Prediction Interval",
  "xlabel": "Time (h)",
  "ylabel": "Concentration (ng/mL)"
}

Tool Result:

Plot can be viewed at http://localhost:8081/preview/conc_time_pi
Code written to: programs/plots/conc_time_pi.jl

This version shows:

  • Shaded bands: 90% of simulated subjects fall within this range
  • Lines: Population mean for each dose

This is the typical format for simulation results in clinical pharmacology reports.

Exploring Exposure Metrics

User: What about summary statistics like Cmax and AUC?

Assistant: Good question. For simulations, we often want to summarize exposure metrics. Let me calculate Cmax and trough concentrations at steady state (day 5).

Calls data_wrangler:

{
  "transformations": [
    {
      "type": "filter",
      "dataframe_name": "sim_results",
      "condition": ":time >= 96"
    },
    {
      "type": "group_summarize",
      "by": ["id", "dose_group"],
      "aggregations": {
        "cmax_ss": "maximum(:cp)",
        "cmin_ss": "minimum(:cp)"
      }
    }
  ],
  "output_name": "exposure_metrics"
}

Tool Result:

# Summary of DataFrame `exposure_metrics`
90 rows and 4 columns.
 - id::Int64, 90 unique
 - dose_group::String, 3 unique
 - cmax_ss::Float64, 90 unique
 - cmin_ss::Float64, 90 unique

Code written to: programs/data/exposure_metrics.jl

Now let's visualize the exposure distributions:

Calls build_plot:

{
  "plot_name": "cmax_boxplot",
  "layers": [{
    "data": "exposure_metrics",
    "mapping": ":dose_group, :cmax_ss",
    "visual": ["visual(BoxPlot)"]
  }],
  "title": "Steady-State Cmax by Dose Group",
  "xlabel": "Dose",
  "ylabel": "Cmax,ss (ng/mL)"
}

Tool Result:

Plot can be viewed at http://localhost:8081/preview/cmax_boxplot
Code written to: programs/plots/cmax_boxplot.jl

The boxplot shows dose-proportional increases in Cmax, with the expected overlap in distributions due to between-subject variability.

Generated Files

User: Where can I find all the code that was generated?

Assistant: Every tool operation generates executable Julia code in the programs/ directory:

programs/
  models/
    pk_model.jl           # Model definition
    sim_params.jl         # Parameter values
  regimens/
    dose_50mg.jl          # 50 mg regimen
    dose_100mg.jl         # 100 mg regimen
    dose_200mg.jl         # 200 mg regimen
  simulations/
    sim_pop.jl            # Population definition
    sim_results.jl        # Simulation execution
  data/
    exposure_metrics.jl   # Exposure calculations
  plots/
    conc_time_plot.jl     # Individual profiles
    conc_time_pi.jl       # Prediction intervals
    cmax_boxplot.jl       # Cmax comparison

These files are:

  • Reproducible: Run them again to recreate results

  • Editable: Change parameters and re-execute

  • Educational: Read them to learn Pumas syntax

Summary

User: What did we accomplish?

Assistant: Here's what we completed:

  1. Built a one-compartment oral PK model with variability
  2. Set typical PK parameters (Ka, CL, Vc) with 30% BSV
  3. Created dosing regimens for 50, 100, and 200 mg once daily
  4. Generated 90 virtual subjects (30 per dose group)
  5. Simulated 5 days of dosing with residual error
  6. Visualized individual profiles and prediction intervals
  7. Calculated steady-state Cmax across dose groups

The simulation shows:

  • Dose-proportional exposure increases
  • Typical accumulation pattern for once-daily dosing
  • Reasonable between-subject variability (~30% CV)

This type of simulation is commonly used for:

  • Dose selection in early development

  • Predicting exposure at untested doses

  • Designing clinical trials

Key Takeaways

Tool Workflow

  1. build_pumas_model - Define the PK structure
  2. pumas_model_parameters - Set typical values and variability
  3. create_dosage_regimen - Define doses (can create multiple)
  4. create_simulation_population - Generate virtual subjects
  5. run_simulation - Execute and collect results
  6. build_plot - Visualize concentration-time profiles

Parameter Guidelines

ParameterTypical RangeNotes
Ka0.5-2 hr^-1Faster = sharper Cmax
CL1-100 L/hDrug-specific
Vc10-500 LUsually > plasma volume
omega0.2-0.5~20-50% CV
sigma_prop0.1-0.2~10-20% residual error

Common Variations

  • IV dosing: Use route: "iv" and cmt: 2 (central compartment)
  • Two-compartment: Use explicit_model: "Depots1Central1Periph1"
  • Infusion: Add rate parameter for infusion duration
  • Loading dose: Create separate regimen for first dose

Common Mistakes to Avoid

  • Forgetting to set seed for reproducibility
  • Using unrealistic parameter values (check literature)
  • Too few subjects to see variability patterns (use 20+ per group)
  • Not including residual error (makes predictions look too precise)