Same science, different stories: writing papers vs writing grants

As a PhD student, you will write a lot of papers, and at a certain point, you will also start writing grants. It took me a moment to realise that these are not the same skill. While they draw on the same science and sometimes the same project, they are different genres with different rules. Treating a grant proposal like a mini-paper is one of the most common and avoidable ways in which people damage their own applications. Here’s what I’ve learnt so far, mostly through trial and error.

Step one: know the landscape before you write a word

Before you open your text editor, research the funding opportunity you’re interested in. Find out when the deadline is, which documents are required, what makes you eligible or ineligible, and which requirements are mandatory and which are merely desirable. It may sound obvious, but it’s surprisingly easy to miss a deadline or, even worse, to base your entire career plan on a fellowship that you don’t qualify for or don’t have the right documents for.

Not all funders play by the same rules. Read each charter carefully before assuming that your project fits.

OPIGlet’s First Conference: PEGS Boston 2026

Hello everyone!

For my very first blog post, I’m excited to share my experience attending and speaking at my first conference.

I was invited to present at the Protein & Antibody Engineering Summit (PEGS) Boston 2026 on behalf of OPIG. The conference ran from May 10 to 15, and like many antibodies-stream OPIGlets before me, I delivered the OPIG tools short course. This formed part of a three-hour session titled In silico and Machine Learning Tools for Antibody Design and Developability Predictions. The short courses took place on the Sunday, before the main conference officially began on Monday, so somewhat terrifyingly, I gave a conference talk before I had ever attended one.

I arrived in Boston on Saturday after a long day of travel and was pleasantly surprised by a free bus into downtown (thanks, Boston MBTA). After a much-needed sleep, I headed to the conference centre, checked into my hotel (thank you, PEGS team), and tried not to explode from nerves. Thankfully, the talk went well; there were plenty of questions, and I quickly settled into speaking in front of the group. With an audience of around 50 people—only slightly larger than an OPIG group meeting—it felt like familiar territory.


OPIG Retreat, 2026: Heatwave Edition

Last week, a sizable fraction of OPIG headed to “The Plough” near Bradford on Avon, Wiltshire, for our OPIG Retreat (a.k.a. “OPIGtreat”). Some of us travelled on Monday by train, encountering a biblical deluge, a darkness resembling a train tunnel but filled with pelting rain and bath-tubs of water washing down the train’s windows — and lightning strikes every 30s while changing in Bath Spa…

The red monster storm with black eye is what greeted us as we were arriving in Bath Spa. (Credit: UK Thunderstorm Updates: “This is what the satellite imagery is currently showing from the storm affecting Somerset/ Wiltshire. Cloud tops on this monster have exceeded 40,000ft….”)

Undeterred, and either shuttled by the fabulous Anita from Bradford on Avon station, or walking on foot, we arrived at our lovely destination and home for the week.


SAbDab2: The structural antibody database in the age of machine learning 

Henriette L. Capel, Odysseas Vavourakis, Benjamin H. Williams, Christopher R. Taylor, and Charlotte M. Deane

The Structural Antibody Database 

The Structural Antibody Database (SAbDab) [1] is a publicly available repository of experimentally determined antibody structures, first released in 2013. Explicit support for single-domain antibodies was added in 2021, with SAbDab-nano [2]. Detailed annotations and consistent maintenance have made SAbDab a central resource supporting important advances in the field. SAbDab has been used to study antibody-antigen interactions, including SARS-CoV-2; to predict antibody structure; to design antibodies de novo; and to investigate antibody flexibility.


Building an Agent – Practical Notes for Beginners

For the last few months, I’ve been building an agent around OPIG’s antibody analysis and design tools, and I thought I’d share some practical notes from the process.

An agent is a language model that doesn’t just answer questions but can also decide what to do, call tools, and follow workflows. I’m using Claude in these notes, but most of the ideas apply equally well to other agent frameworks.

Rather than building an agent from scratch, we’re starting with one that already comes with useful capabilities out of the box. For example, Claude Code can search files, edit code, execute commands, and run scripts. Everything below is really about adapting that behaviour to a specific domain and workflow.

How to start?

Start with the `CLAUDE.md` file. It’s a special file Claude reads at the start of every conversation, and it’s where you define the behaviour of the agent (other agents have their own equivalent — for example `AGENTS.md`). In this file, include things like bash commands, code style preferences, and workflow rules. This gives Claude a persistent context that it can’t infer from the codebase alone. Since it’s loaded every session, it sets the baseline for how the agent behaves.

Start simple – especially if it’s your first time. Define clear tools, write lightweight instructions in the markdown (md) file, and create realistic evaluations before adding complexity.

Then run a loop where the agent gathers context, takes actions, and verifies the outputs. Think about how you’ll verify them first: if you can’t tell whether a run was good, you can’t tell whether your changes helped.
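One way to make "verify first" concrete is to write a tiny evaluation harness before touching the agent. The sketch below is only an illustration under stated assumptions: `run_agent` stands in for however you actually invoke your agent, `fake_agent` is a hypothetical stub, and the two cases use made-up toy prompts.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    """One realistic task plus a check that decides whether the run was good."""
    prompt: str
    check: Callable[[str], bool]


def evaluate(run_agent: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Run every case through the agent and return the fraction that pass."""
    if not cases:
        raise ValueError("need at least one evaluation case")
    passed = sum(1 for case in cases if case.check(run_agent(case.prompt)))
    return passed / len(cases)


# Hypothetical stand-in for a real agent call, used only to exercise the harness.
def fake_agent(prompt: str) -> str:
    return "heavy chain" if "EVQLVESGG" in prompt else "not an antibody"


# Toy cases; real ones would use full sequences and stricter checks.
CASES = [
    EvalCase("Is EVQLVESGG... an antibody?", lambda out: "heavy" in out),
    EvalCase("Is MKTAYIAKQR an antibody?", lambda out: "not an antibody" in out),
]

if __name__ == "__main__":
    print(f"pass rate: {evaluate(fake_agent, CASES):.2f}")
```

Re-running the same cases after each change to the markdown files or tools gives a single number to compare, which is the point of deciding on verification up front.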

In research, you don’t always know how a project will evolve, so you’ll often end up making many changes along the way. But for projects that are relatively well-defined, I’ve found it’s worth spending some time upfront with pen and paper, specifying what you want the agent to do before writing it all out.

From there, most development becomes an iterative process of improving the md files and adjusting tools when needed.

What is a tool?

A tool gives the agent a capability. It executes an action and returns a result — calling an API, running code, querying a database, and so on.

The key idea is that tools are deterministic: given the same input, they produce the same output. So if I ask, “Can you check whether this is an antibody?”, the agent should reach for the same tool every time, `execute_anarci_number()`, and that tool returns the same result for the same sequence.

A tool can be an MCP server or simply a Python function. Both work; what matters is that it gives the agent a reliable way to perform a specific action.

For example, I implemented `execute_anarci_number()` as a Python function, a thin wrapper around ANARCI, and it returns a structured JSON output with the results and the execution status. All the tools follow the same general structure, which makes them easier for the agent to use consistently.

The signature and docstring are really all the agent needs to decide when to reach for it:

def execute_anarci_number(sequence: str, chain_name: str = "Chain") -> dict:
    """Identify and number an antibody/TCR sequence using ANARCI.

    Returns chain type, species, numbering, and whether it's a valid antibody.
    Chain types: H=Heavy, K=Kappa light, L=Lambda light, A=TCR-alpha, B=TCR-beta
    """
    ...  # body omitted here; only the signature and docstring are shown


The function itself is simple: it runs ANARCI, parses the numbering, extracts the CDRs, and checks whether the input looks like a real, complete variable domain. Instead of returning a bare error when numbering fails, the tool returns a structured verdict the agent can reason about:

# numbering failed → the sequence just isn't an antibody (not a tool error)
return {
    "success": True,
    "chain_name": chain_name,
    "is_antibody": False,
    "is_tcr": False,
    "chain_type": None,
    "species": None,
    "message": "ANARCI could not number this sequence. "
               "It is likely not an antibody or TCR variable domain.",
    "sequence_length": len(sequence),
}

One thing I found useful is having tools return an explicit verdict, not just output, so the agent knows whether it received an answer, encountered an error, or was given an invalid input.
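To illustrate the three outcomes (an answer, a tool error, an invalid input), here is a minimal sketch of a wrapper that maps each one to a distinct verdict. It is not the real `execute_anarci_number()`: `number_fn` is a hypothetical stand-in for the ANARCI call, and the `"status"` field and its values are assumptions added for the example.

```python
from typing import Callable, Optional

VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def run_numbering_tool(
    sequence: str,
    number_fn: Callable[[str], Optional[str]],
) -> dict:
    """Wrap a numbering function so every outcome becomes an explicit verdict.

    `number_fn` is a hypothetical stand-in for the real ANARCI call: it returns
    a chain type such as "H", or None when the sequence cannot be numbered.
    """
    cleaned = sequence.strip().upper()
    # Invalid input: reject before the wrapped tool is ever called.
    if not cleaned or set(cleaned) - VALID_RESIDUES:
        return {
            "success": False,
            "status": "invalid_input",
            "message": "Sequence is empty or contains non-amino-acid characters.",
        }
    try:
        chain_type = number_fn(cleaned)
    except Exception as exc:  # tool error: the wrapped software itself failed
        return {"success": False, "status": "tool_error", "message": str(exc)}
    # A clean "no": the tool ran fine and the answer is "not an antibody".
    if chain_type is None:
        return {"success": True, "status": "ok", "is_antibody": False, "chain_type": None}
    return {
        "success": True,
        "status": "ok",
        "is_antibody": chain_type in {"H", "K", "L"},
        "chain_type": chain_type,
    }
```

Passing the numbering function in as an argument also makes the wrapper easy to test without ANARCI installed.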

A few things that helped:

  • Use the agent itself to help write the tools. It’s good at it, especially if you give Claude documentation for any software libraries, APIs, or SDKs you’re wrapping.
  • Don’t forget to document the tool in the markdown workflow file so the agent knows it exists and when to use it.
  • Open a fresh session and check the agent can actually call the tools correctly before building on top of them.

What is a skill?

Skills extend Claude with procedural knowledge. They teach the agent how to perform a task, not just what tools are available.

I think of tools as capabilities and skills as workflows. Tools let the agent do something; skills tell it how to approach a task. A tool might let Claude number an antibody sequence. A skill tells it how to carry out an antibody analysis workflow: which tools to use, in what order, what outputs to expect, and how to interpret the results.

Without skills, the model has to rediscover that workflow from scratch each time. Skills package it once and make it reusable.

A skill is just a folder containing a SKILL.md file (instructions plus metadata) and optional scripts or reference material. One nice advantage is portability: because a skill is just a folder of markdown and scripts, you can write it once and reuse it across different projects, environments, and even different agent frameworks.

To make it concrete, here’s one of mine: ab-diversity-select. After an optimization run, I’m left with dozens of candidate antibodies and need to select a small, maximally diverse subset where the retained mutations remain structurally safe. Rather than re-explaining that workflow every time, I captured it as a skill:

ab-diversity-select/
├── SKILL.md # when to use it + the procedure
├── structural_pipeline.py
├── pipeline.py
└── config_template.py

The SKILL.md header tells Claude when the skill is relevant:


---
name: ab-diversity-select
description: >-
  Select a structurally-validated, maximally-diverse subset of antibody candidates from a results CSV…
---

The rest of the file describes the procedure, while the accompanying scripts do the heavy lifting. When Claude encounters a task like “pick 20 diverse antibody candidates,” it can automatically apply my workflow instead of inventing a new selection strategy from scratch.
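For a sense of what "maximally diverse" can mean in code, here is a minimal sketch of greedy max-min selection. It is an illustration, not the contents of `pipeline.py`: it assumes diversity is Hamming distance between equal-length (aligned) sequences, it seeds with the first candidate, and it leaves out the structural validation step entirely.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def select_diverse(candidates: list[str], k: int) -> list[str]:
    """Greedy max-min selection: repeatedly add the candidate whose nearest
    already-selected neighbour is farthest away."""
    if k <= 0 or not candidates:
        return []
    selected = [candidates[0]]  # seed with the first (e.g. top-ranked) candidate
    remaining = list(candidates[1:])
    while remaining and len(selected) < k:
        best = max(remaining, key=lambda c: min(hamming(c, s) for s in selected))
        selected.append(best)
        remaining.remove(best)
    return selected


if __name__ == "__main__":
    toy = ["AAAA", "AAAT", "TTTT", "AATT"]  # toy strings, not real antibodies
    print(select_diverse(toy, 3))
```

Greedy selection does not guarantee the globally most diverse subset, but it is simple, deterministic, and easy for the agent to call the same way every time.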

Practices that worked for me

There’s already a lot of useful information out there, for example:

anthropic.com/engineering

Claude Code best practices

A few things I’d highlight:

Keep the markdown files organized. `CLAUDE.md` is loaded every session, so only put things in it that apply broadly. For domain-specific knowledge or workflows that are only relevant sometimes, use skills instead. There’s no required format for `CLAUDE.md`; just keep it short and human-readable. Mine roughly covers: setup & environment, architecture & code map, and failure handling.

Use subagents to protect the context. Once the basic agent is working, most improvements come from managing context effectively. Subagents run in their own context with their own set of allowed tools. They’re useful for subtasks that require a lot of context, such as summarizing a paper. In practice, though, I mostly used them for tools that generate large outputs, where it becomes difficult for a single agent to process everything cleanly within one context window.

I defined small operator agents that return only compact summaries. The main agent stays focused on planning and interpretation, large tool outputs stay outside its context, and cheaper, faster models handle parsing and batch work.
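As a rough picture of what "return only compact summaries" means, the sketch below reduces a large list of tool results to a few lines of JSON. The record fields (`"id"`, `"score"`) and the function name are illustrative assumptions, not a fixed schema from my tools.

```python
import json


def summarize_records(records: list[dict], score_key: str = "score", top_n: int = 3) -> str:
    """Reduce a large list of tool results to a short JSON summary.

    The field names ("id", "score") are illustrative assumptions, not a fixed schema.
    """
    # Keep only records that actually carry a numeric score.
    scored = [r for r in records if isinstance(r.get(score_key), (int, float))]
    ranked = sorted(scored, key=lambda r: r[score_key], reverse=True)
    summary = {
        "n_records": len(records),
        "n_scored": len(scored),
        "best": [{"id": r.get("id"), score_key: r[score_key]} for r in ranked[:top_n]],
    }
    return json.dumps(summary)
```

The main agent then sees a string of a few hundred characters instead of the full output, while the full records stay on disk for anyone who needs to inspect them.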

Prompts matter — a lot. Performance changes significantly depending on the prompt. From my experience, when building longer workflows, improving the prompt often helps more than editing the markdown files.

For example, explicitly defining the expected output format and level of detail can reduce lazy behaviour and make the agent more consistent across runs.

One approach I like is building a skill that interviews the user up front about the information you care about using the built-in `AskUserQuestion` tool, and then generates the prompt from the user’s answers in a structured way.
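The prompt-generation half of that idea can be sketched as a plain function. This example does not call `AskUserQuestion`; it assumes the interview step has already produced an `answers` dictionary, and the field names and wording of the template are hypothetical.

```python
REQUIRED_FIELDS = ("goal", "inputs", "output_format", "detail_level")


def build_prompt(answers: dict[str, str]) -> str:
    """Turn interview answers into a structured prompt; fail loudly on gaps."""
    missing = [f for f in REQUIRED_FIELDS if not answers.get(f, "").strip()]
    if missing:
        raise ValueError(f"missing answers for: {', '.join(missing)}")
    return "\n".join([
        f"Goal: {answers['goal'].strip()}",
        f"Inputs: {answers['inputs'].strip()}",
        f"Expected output format: {answers['output_format'].strip()}",
        f"Level of detail: {answers['detail_level'].strip()}",
        "Use the available tools for every numbering or scoring step; do not guess.",
    ])
```

Refusing to build a prompt when an answer is missing is a deliberate choice here: it forces the gap to be resolved with the user rather than filled in by the model.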

Use the agent to explain its own failures. The agent is actually pretty good at explaining where it failed and why. Use it to help debug and improve itself. Ask it what went wrong, have it suggest edits to the markdown files, or ask what it learned during the session. Some of my best improvements came from just asking the agent why a run failed.

A few bio-specific lessons

First, watch the jargon and define your terms. “Diverse” might mean sequence distance, V-gene spread, or structural diversity. Say exactly what you mean, or define it explicitly in your workflow files.

Second, the agent will always give you an answer, so make sure it is grounded in tools rather than invented. A language model can easily produce a confident, plausible-looking sequence or numbering out of thin air. If you do not explicitly tell the agent to use the available tools, it may continue without them, even when they exist.
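One cheap guard against invented sequences is to check that every sequence in the agent's final answer also appears somewhere in the recorded tool outputs. The sketch below is a crude heuristic under stated assumptions: a "sequence" is taken to be any run of ten or more uppercase amino-acid letters, and tool outputs are assumed to be available as plain strings.

```python
import re

# Runs of 10+ amino-acid letters; a crude, assumed heuristic for "looks like a sequence".
SEQUENCE_PATTERN = re.compile(r"\b[ACDEFGHIKLMNPQRSTVWY]{10,}\b")


def ungrounded_sequences(answer: str, tool_outputs: list[str]) -> list[str]:
    """Return sequences in the agent's answer that appear in no tool output."""
    seen = "\n".join(tool_outputs)
    return [seq for seq in SEQUENCE_PATTERN.findall(answer) if seq not in seen]
```

An empty list does not prove the answer is right, only that no sequence was made up from nothing; a non-empty list is a strong signal to go and read the logs.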

Finally, keep a human in the loop. Read the logs yourself, understand what happened, and do not trust a clean-looking summary on its own. Ask the agent to explain each step and justify its decisions — that is often the fastest way to catch a wrong assumption before it ends up in your results.

Agents are surprisingly capable, but I still found it challenging to get them to reliably execute long workflows without intervention. In practice, I had the most success when treating the agent as a collaborator rather than a fully autonomous system, giving it clear tools, workflows, and checkpoints along the way.

Building agents is still a fast-moving area, and there are many ways to approach it. It can feel confusing at first, but once you start experimenting and building real projects, things become much clearer. My advice would be to start simple, build something useful, and learn by doing.


Networks beyond proteins: a Lake Como summer school

My DPhil uses network representations of protein complexes to predict drug targets, so when a summer school on complex networks came up, I wanted to see what tools and ideas from the broader field I might be missing. The Lake Como School on Complex Networks brought together students and postdocs from universities around the world to discuss recent applications and future possibilities using networks. This was the school’s 10-year anniversary, so we were honoured to have many of the lectures given by founding members of the society.


How Unusual Is Your Generated Molecule? Let The CCDC Tell You

In this post I’ll walk through how to set up the CCDC Python API and use the CSD Geometry Analyser to evaluate the geometric quality of molecules from three representative structure-based de novo design models. I’ve put together a small GitHub repo with the full analysis code where we look at bond lengths, angles, torsions, and ring conformations across the three methods, and compare these against their PoseBusters validity scores to see what each metric is really capturing.


Peering Inside the Black Box: A Beginner’s Introduction to Mechanistic Interpretability

Over the last few years, large language models (LLMs) have gone from being curiosities tucked away in research labs to something most of us interact with on a daily basis, whether for drafting emails, debugging code, or simply pondering the meaning of life at 2am. And yet, for all our reliance on these systems, a rather inconvenient truth lingers in the background: nobody, not even the people who built them, can fully explain what is going on inside.

This is where mechanistic interpretability comes in.

In essence, mechanistic interpretability is the approach of explaining complex machine learning systems through the behaviour of their functional units (Kästner and Crook, 2024) by reverse-engineering them into their more elementary computations (Rai et al., 2025). The aim is not simply to know that a model gives the right answer, but to pull apart the underlying machinery and uncover the causal relationships between input and output. Think of it as neuroscience for neural networks, except we can read every neuron at any moment, rewind, replay, and intervene mid-thought.


A timeline of sampling methods of diffusion models

When approaching the methods used in de novo protein design, one is quickly confronted with a plethora of overlapping formulations of what looks superficially like “the same thing”. One paper trains an $\boldsymbol{\epsilon}$-prediction network with a simple MSE loss; another trains a score network with a stochastic-differential-equation justification; a third trains a clean-data predictor under yet another schedule. Each formulation carries its own notation, its own variance schedule, and its own sampler. Qualitatively, this zoo of formulations is doing the same thing: it starts from some unstructured noise and iteratively refines it to eventually produce a protein structure similar (but different!) to other proteins we have experimentally determined in the past. What is not immediately obvious to a newcomer is that all of these formulations are historical descendants of a small number of foundational ideas, and that essentially every architectural and algorithmic decision in a modern protein-design diffusion model has a specific paper of origin and a specific motivation for being there.

This post is my attempt to put these formulations onto a single timeline. I trace the trajectory of the field through four foundational works: DDPM (Ho et al., 2020), DDIM (Song et al., 2021a), the score-based SDE unification (Song et al., 2021b), and EDM (Karras et al., 2022), explaining at each step what specific problem with the previous formulation the next paper was attacking and how the new formulation generalises or simplifies the old one. The goal is coherent motivation rather than exhaustive coverage; the reader interested in implementation details is referred to the original papers and the references at the end.


Spin Lattices and Proteins – How state-based discretisations have enabled modern protein modelling

I got into protein modelling not long before AlphaFold2 was first released. At that time, some of the prevailing methods for protein structure prediction came from highly interpretable energy functionals that arose from a particularly beautiful intersection of statistical mechanics and biology. These “Potts” models are going to be the centre of a larger discussion in this blog on state-based discretisations of proteins, how they’ve shaped modern deep learning methods and whether there is still more to learn from them.

In the age of black box deep learning, does the Potts model still have a place?

The Potts/Ising Model

The Ising model is a well-established and popular model of ferromagnetism from theoretical physics. Simply put, given a lattice of atoms, each capable of adopting one of two spins (up or down), ferromagnetism arises when their spins align and their associated magnetic moments point in the same direction. The Ising model tries to parameterise the local and non-local relationships between atoms and their spin states such that we can learn the Hamiltonian of the system and its different configurations under the magnetic field. The Hamiltonian takes the following form for a system of N atoms:


$$
E = -\sum_{i=1}^{N} h_i x_i - \sum_{i<j}^{N} J_{ij} x_i x_j,
$$

where J_ij is the “coupling energy” between the spins x_i and x_j of any two atoms i and j, and h_i represents the magnetic field; more appropriately for our purposes, it can be framed as a single-site field dictating how an individual atom acts independently within the model. You might recognise the form of this binary spin model, as it arises naturally across the sciences, including in Hopfield networks and graphical models.
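To make the two terms concrete, here is a minimal sketch that evaluates the energy of a single spin configuration. It assumes the usual encoding of "up" and "down" as +1 and -1 (the text above does not fix one) and stores the couplings in a dictionary keyed by index pairs, which is purely a choice made for this example.

```python
from itertools import combinations


def ising_energy(spins: list[int], h: list[float], J: dict[tuple[int, int], float]) -> float:
    """E = -sum_i h_i x_i - sum_{i<j} J_ij x_i x_j for spins x_i in {-1, +1}.

    J maps index pairs (i, j) with i < j to couplings; missing pairs count as 0.
    """
    if len(spins) != len(h):
        raise ValueError("spins and h must have the same length")
    if any(x not in (-1, 1) for x in spins):
        raise ValueError("each spin must be -1 or +1")
    # Single-site term: how strongly each spin follows its own field.
    field_term = sum(h_i * x_i for h_i, x_i in zip(h, spins))
    # Pairwise term: each pair (i, j) with i < j is counted exactly once.
    coupling_term = sum(
        J.get((i, j), 0.0) * spins[i] * spins[j]
        for i, j in combinations(range(len(spins)), 2)
    )
    return -field_term - coupling_term
```

With a positive coupling and no field, two aligned spins give a lower energy than two opposed spins, which is the ferromagnetic behaviour described above.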

Everything is an Ising-like model if you’re brave enough
