Working With Agents
AI agents are powerful collaborators, but getting the most out of them requires some awareness of how they work and where humans add the most value. These practices apply broadly to working with coding agents, grounded in the pharmacometric context where PumasAide operates.
Plan Before Executing
For anything beyond a single step, ask the agent to plan the analysis before writing code. Most agents support a planning mode where they outline the approach (which workflow to follow, what model structure to start with, which parameters to calculate) and wait for your approval before generating scripts.
In pharmacometrics, a wrong turn early (incorrect column mappings, inappropriate model structure, missing quality metrics) can propagate through every subsequent step. Reviewing a plan takes seconds; redoing an analysis from scratch takes much longer. Get aligned on the approach first, then let the agent execute.
Don't expect the first plan to be perfect. Review it, push back on parts that don't look right, and ask the agent to revise. Iterative refinement is the most productive part of the process. It's far cheaper to adjust a plan than to redo executed code. Two or three rounds of feedback typically produce a much stronger analysis than accepting the first draft.
Planning mode also turns the agent into a prompt enhancer. You provide rough intent, like "I want to do an NCA analysis on this data", and the agent produces a structured, detailed version: which columns to map, which parameters to calculate, what quality checks to include. You can apply this pattern more broadly whenever you're unsure how to phrase a request. Ask the agent to help you articulate what you want before asking it to do the work.
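A planning exchange might look like the sketch below. The column names, mappings, and checks are illustrative, not actual PumasAide output:

```
You:    "I want to do an NCA analysis on this data."

Agent (plan):
  1. Map columns: SUBJID → id, TIME → time, CONC → observations,
     DOSE → amt (I'll confirm units with you first).
  2. Calculate AUC (0-t and 0-inf), Cmax, Tmax, half-life, CL/F.
  3. Quality checks: flag subjects with poor lambda-z fits or high
     AUC extrapolation before reporting.
  Approve this plan, or tell me what to change?
```

Notice that the plan surfaces decisions (units, quality thresholds) you might not have thought to specify in the rough prompt. That is exactly where a round of review pays off.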
You Own the Code
The conversation with the agent is ephemeral. The scripts in programs/ are the real output. They're what you'd hand to a colleague, submit to QC, or re-run for a regulatory submission. Treat them as your code, not the agent's.
Read the generated scripts. Understand what each one does. If something looks wrong or unclear, edit it directly or ask the agent to revise it. The agent writes the first draft, but you're the author of record. If you wouldn't be comfortable explaining a script to a reviewer, it isn't finished yet.
Source Control Is Essential
Agents can make large, sweeping changes to your project very quickly. A single prompt can rewrite multiple files, restructure directories, or replace existing code entirely. That's a strength when it goes right and a risk when it doesn't.
Git is what makes this safe. Commit your work before starting a new analysis or asking the agent to make changes. Commit at natural checkpoints: after data loading, after model fitting, after diagnostics. Review diffs before accepting what the agent produced. If something goes wrong, you can always return to a known good state. Without source control, an agent's speed becomes a liability rather than an advantage.
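The checkpoint rhythm can be sketched as a short git session. This runs in a scratch repository; the file names and commit messages are illustrative, not PumasAide conventions:

```shell
# Sketch: checkpoint commits around an agent's changes (scratch repo)
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.email "analyst@example.com"
git config user.name "Analyst"

# Commit a known good state before asking the agent for changes
mkdir programs
echo '# load and inspect data' > programs/01_load_data.jl
git add programs
git commit -qm "checkpoint: data loading"

# The agent rewrites the script; review the diff before accepting it
echo '# sweeping agent rewrite' > programs/01_load_data.jl
git diff --stat

# Not what you wanted? Return to the last known good state.
git checkout -- programs
cat programs/01_load_data.jl
```

The same pattern scales up: one commit per completed analysis step gives you a trail of known good states, and `git diff` between them shows exactly what the agent touched.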
Provide Context Upfront
The agent's first attempt improves significantly when you provide context about what you're working with. Before diving into an analysis, tell the agent about your study design (single ascending dose, multiple dose, crossover), any data quirks you're aware of (BLQ handling, missing covariates, unusual dosing), regulatory requirements (FDA guidance, EMA expectations), and what you already know about the compound's pharmacokinetics.
Context shapes every decision the agent makes, from column mappings to model structure to which quality metrics matter most. An agent working blind will ask these questions one at a time as they become relevant. An agent with context upfront can plan a coherent analysis from the start.
Build Incrementally
Resist the urge to ask for an entire analysis in a single prompt. The most effective workflow moves through discrete steps: load and inspect the data, explore it visually, build a base model, check diagnostics, then refine. Each step gives you a checkpoint where you can catch problems before they compound.
This mirrors good pharmacometric practice regardless of whether an agent is involved. You wouldn't fit a complex covariate model without first confirming the data looks right and a base model converges. The agent follows the same logic when you give it room to work step by step.
Manage Your Context Window
Agents have a finite context window: the amount of text they can hold in working memory during a single session. Every message you send, every tool output, every code result, and every plan revision accumulates in that window. As it fills, the agent may lose track of earlier details, repeat itself, or make less coherent decisions.
Be aware of where you are in a session. If the agent starts forgetting things you discussed earlier or its responses lose focus, the context window is likely getting full. The practical solution is to start a fresh session. Commit your work to git first so the new session can pick up where the last one left off. The code in programs/ and the git history provide all the continuity you need.
For larger tasks, consider splitting work across focused sessions rather than running one marathon conversation. Some agents also support subagents or background tasks that handle independent subtasks in their own context, keeping the main session lean. A session dedicated to data loading and exploration, followed by a separate session for model fitting, will generally produce better results than a single session that tries to do both.
Inspect Results Visually
The agent describes results in text, but text summaries can miss things your eyes won't. A concentration-time profile with an obvious outlier, a goodness-of-fit plot with systematic bias, a boxplot that reveals unexpected variability patterns: these are easier to catch visually than from a table of numbers.
PumasAide's preview system at http://localhost:8081/preview renders plots, DataFrames, and model summaries in your browser. When the agent generates a result and provides a preview URL, open it. Visual inspection is part of the analytical workflow, not an optional extra.
Verify External Content
AI agents process instructions and data without distinguishing trusted content from potentially adversarial content. This creates an attack vector known as prompt injection: if you bring external documents, datasets, or templates into your project, any embedded instructions in that content could influence the agent's behaviour.
In practice, review external files before adding them to your workspace, particularly markdown documents, configuration files, or datasets from unfamiliar sources. Vetting what enters the project keeps this risk contained. This isn't unique to PumasAide; it applies to any agent workflow.
Stay in the Driver's Seat
Most agents offer an auto-accept mode that lets the agent write code and execute tools without asking for approval at each step. This can be productive when you've agreed on a clear plan and you're confident in the content of your workspace, but it should not be your default mode of operation.
Without human checkpoints, the agent makes every decision (column mappings, model structure, quality thresholds, what to include or exclude) based on its best guess. Some of those guesses will be wrong, and without you reviewing each step, errors compound silently. The result is an analysis that runs to completion but may not reflect the choices you would have made.
Use auto-accept selectively: for well-understood steps where the plan is unambiguous and prompt injection risks are mitigated. For exploratory work, model building, or anything involving pharmacometric judgment, keep yourself in the loop. You drive the agent; don't let it drive you.
It's All Plain Text
Everything that makes agents powerful in this context is plain text. Skill files are markdown. Workflow guides are markdown. Organisation rules are markdown. Agent instructions are markdown. There is no special syntax, no compilation step, no framework to learn.
You can build complex custom workflows simply by writing markdown documents that describe the steps in natural language and reference PumasAide's three tools when execution is needed. A document that says "load the data using include_julia_file, check for missing values, then build an NCA population with these column mappings" is a workflow. Chain several such documents together and you have a multi-stage analysis pipeline, authored in the same language you'd use to explain the process to a colleague.
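A workflow document of that kind might look like the following. The file names, column mappings, and steps are illustrative, assuming a typical NCA dataset layout:

```markdown
# Workflow: NCA from raw concentration data

1. Load `data/pk_raw.csv` with `include_julia_file`, running a script
   that prints column names, types, and row counts.
2. Report missing values and BLQ records; pause for my review before
   continuing.
3. Build the NCA population with these column mappings:
   `SUBJID` → id, `TIME` → time, `CONC` → observations, `DOSE` → amt.
4. Calculate AUC, Cmax, Tmax, and half-life; flag poor lambda-z fits.
5. Save every script to `programs/` and give me the preview URL for
   each plot.
```

Because the document is just markdown, revising the workflow is as easy as editing the prose, and the same file doubles as documentation of how the analysis was run.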
Set Up Standards Once
If you find yourself repeating the same thresholds, conventions, or methodology preferences across sessions ("use rsqadjkel >= 0.8", "always include AUC extrapolation checks", "our naming convention is..."), consider setting up organisation rules as skill files. The one-time authoring effort pays for itself in consistency and reduced friction across every subsequent analysis.
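An organisation-rules skill file can be as simple as a short markdown list. The thresholds and naming convention below are illustrative placeholders, not recommended defaults:

```markdown
# Organisation rules: NCA reporting standards

- Accept lambda-z fits only when `rsqadjkel >= 0.8`; flag everything else.
- Always report the AUC extrapolation percentage and flag subjects where
  it exceeds our agreed threshold.
- Name scripts `programs/NN_description.jl`, e.g. `programs/01_load_data.jl`.
- Commit to git after each completed analysis step.
```

Once this file is in place, every session starts from the same standards without you restating them.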
Use the Knowledge Base to Learn
The pumas_knowledge tool exists for your benefit as much as the agent's. Ask the agent to look up function signatures, explain what a parameter means, compare estimation methods, or walk through how a workflow step works. The knowledge base covers API references, pharmacometric concepts, and analysis workflows in detail. It's a learning resource embedded directly in your working environment.