
In what I have to admit is now becoming an annual tradition ([2023] [2019]), I’d like to highlight the 2024 edition of the fragment-to-lead success stories, published in J. Med. Chem. at the end of 2025 [Paper].

An Ångström (Å) is a unit of length equal to 10⁻¹⁰ metres, or one ten-billionth of a metre. It sits at a comfortable scale for the atomic world: the diameter of a hydrogen atom and the length of a chemical bond are both conveniently measured in Ångströms.
It is not an SI (Système International d’Unités, the International System of Units) unit. In fact, it has been formally deprecated in favour of the nanometre (1 Å = 0.1 nm), and standards bodies such as NIST and the BIPM discourage its use. Yet in structural biology, chemistry, crystallography, and materials science, the Ångström persists: partly out of stubbornness, I would say, but mostly out of convenience. Saying a protein structure was solved at 2.1 Å feels natural in a way that 0.21 nm does not.
So we keep using it. And because we keep using it, we inherit its quirks and history.
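For anyone who does need to move between the two conventions, the conversion is simple arithmetic (1 Å = 0.1 nm = 10⁻¹⁰ m). Here is a minimal Python sketch of it; the constant and function names are illustrative choices for this example and do not come from any particular library.

```python
import math

# Conversion factors (these follow directly from 1 Å = 10⁻¹⁰ m = 0.1 nm).
ANGSTROM_IN_METRES = 1e-10
ANGSTROM_IN_NANOMETRES = 0.1


def angstrom_to_nm(value_in_angstrom: float) -> float:
    """Convert a length in Ångström to nanometres."""
    return value_in_angstrom * ANGSTROM_IN_NANOMETRES


def nm_to_angstrom(value_in_nm: float) -> float:
    """Convert a length in nanometres to Ångström."""
    return value_in_nm / ANGSTROM_IN_NANOMETRES


def angstrom_to_metres(value_in_angstrom: float) -> float:
    """Convert a length in Ångström to metres."""
    return value_in_angstrom * ANGSTROM_IN_METRES


if __name__ == "__main__":
    # A 2.1 Å crystal structure resolution, expressed in nanometres.
    print(f"{angstrom_to_nm(2.1):.2f} nm")
```

Because these are floating-point multiplications, comparisons are best done with a tolerance (for example `math.isclose`) rather than exact equality.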

Something I often think about is how surprising it is that Wikipedia works, given that it is a resource accessible to the entire internet to edit and maintain. By all normal internet logic, it should be dreadful: too open, too messy, and vulnerable to misinformation. These flaws are evident now more so than ever on other platforms which permit anybody to contribute, such as X or Reddit. But Wikipedia is one of the few places online that, for the most part, feels sane and reliable. Why?
I think the main contributor to this is that Wikipedia is designed to be revised. It does not need to sound authoritative; it just needs to be checkable. For a reference work, this is a much better ambition. It also leaves the process visible. You can see the edit history, the arguments, and the sources. Each page is exposed to a large number of mildly obsessive people, which turns out to be an excellent quality-control system. The internet has no shortage of mildly obsessive people, and in the case of Wikipedia, they’re performing a noble job. Wikipedia gives their energy a useful outlet to the benefit of everyone.
It is not perfect, of course. It has gaps and biases, and can often be out of date on more niche topics. But it performs what feels like an impossible task – trying to build a repository of all human knowledge. And it works so well that we essentially take it for granted that it exists.
If you don’t know the history of Wikipedia, which you probably use on at least a weekly basis, then you can read more about it here, courtesy of Wikipedia: https://en.wikipedia.org/wiki/Wikipedia.

You’re cycling along minding your own business when your front wheel suddenly drops into a deep, jagged pothole. The handlebars twist sideways, your heart lurches and, for a split second, you fight to stay upright. For cyclists and drivers, potholes aren’t just an annoyance: they can cause falls, break wheels, and lead to more serious injuries. They are a universal frustration for all road users, and an everyday hazard that has plagued travellers throughout human history, not just in the age of bicycles and cars.
Far from being a modern infrastructure failure, potholes predate the use of asphalt. Historical records show that they have been a persistent challenge for road builders across centuries and civilisations. Yet, despite advances in materials science and engineering, potholes still represent a significant drain on public finances and pose a hazard to drivers, cyclists and pedestrians alike. They are a persistent reminder that even our best roads are in a constant battle with the elements.
So what exactly are potholes, why do they form, and what are engineers doing to finally get ahead of them? Let’s dig in.

As someone who works with T cell antigen receptor (TCR) and peptide-major histocompatibility complex (pMHC) data, I have found several Python packages to be very useful for eliminating tedious steps in the data cleaning and feature engineering stages.

Whilst I always enjoy the acquisition of knowledge, I’ve always struggled with depositing it usefully. From pen-and-paper notes with a 20-colour theme, which lost value with each additional colour, to OneNote or iPad GoodNotes-based emulations of pen and paper, it’s been a constant quest for the optimal note-taking schema. Personally, there are 3 key objectives I need my note-taking to achieve:
For me, the solution to this was Obsidian, the perhaps more cultified sibling to Notion. Obsidian is a note-taking application that uses Markdown with a surprising amount of flexibility, including the ability to partner it with an LLM, which I’ll explore in this blog, alongside my vault organisation do-or-dies and favourite customisations.

Many OPIGlets found their way into a DPhil in Protein Informatics through our Systems Approaches to Biomedical Sciences (SABS) Industrial Doctoral Landscape Award, which was open to applicants from 2009 to 2024. This innovative course, based at the MPLS Doctoral Training Centre (DTC), offered six months of intensive taught modules prior to starting PhD-level research, allowing students to upskill across a diverse range of subjects (coding, mathematics, structural biology, etc.) and to go on to do research in areas significantly distinct from their formal undergraduate training. All projects also benefited from direct co-supervision from researchers working in the pharmaceutical industry, ensuring that DPhil projects were in areas with potential for translation to drug discovery. Regrettably, having twice successfully applied for renewal of funding, we were unsuccessful in our bid to renew funding for SABS in 2024.
Happily though, we can now formally announce that our bid for a direct successor to SABS, the Transformative Technologies in Pharmaceutical Sciences IDLA, has been backed by the BBSRC, and we will shortly be opening for applications for entry this October [2026]. As someone who benefited from the interdisciplinary training and industry-adjacency of SABS, I’m thrilled to be a co-director of this new Programme and to help deliver this course to a new generation of talented students.

As a rule in scientific education, at some point between starting an undergraduate degree and doing professional research, it would seem we are expected to simply start knowing how to do things, without necessarily any formal training. One example in my degree was writing code, but another, as it turns out, is drawing figures for papers. Today, I would like to assist with the latter!
Inkscape (https://inkscape.org/) is free and very powerful. Paired with Python plotting and the TexText plugin (https://github.com/textext/textext), it can really handle a lot, from abstract concepts to technical diagrams to artistic illustrations. Here I’ll just describe the workflow for including LaTeX and plots generated in Matplotlib; the rest (the artsy bit) is more about playing around with the different tools to find an ideal workflow.
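As one possible starting point for the Matplotlib side of this, a plot can be saved as an SVG file and then opened or imported in Inkscape. The sketch below is an assumption about how that step might look, not necessarily the workflow described in the full post; the function name `save_plot_as_svg`, the example data, and the output path are all hypothetical.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Write text as SVG <text> elements instead of converting glyphs to paths,
# so that labels stay editable as text in a vector editor.
plt.rcParams["svg.fonttype"] = "none"


def save_plot_as_svg(path: str) -> str:
    """Draw a small example plot and save it as an SVG file at `path`."""
    fig, ax = plt.subplots(figsize=(4, 3))
    xs = [0, 1, 2, 3, 4]
    ax.plot(xs, [x ** 2 for x in xs], marker="o")  # placeholder data
    ax.set_xlabel("x")
    ax.set_ylabel("x squared")
    fig.savefig(path, format="svg", bbox_inches="tight")
    plt.close(fig)  # free the figure once it has been written
    return path


if __name__ == "__main__":
    out = save_plot_as_svg(os.path.join(tempfile.gettempdir(), "example_plot.svg"))
    print(f"Saved {out}")
```

Setting `svg.fonttype` to `"none"` is a design choice: it keeps labels editable, at the cost of relying on the same fonts being installed wherever the SVG is opened.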

When working with structural ensembles from molecular dynamics, AlphaFold2 subsampling, or ensemble reweighting against experimental data, you quickly run into visualization problems. Standard PyMOL tutorials don’t address many of them: for instance, what do you do when there’s no single reference structure?
In this two-part series, I’ll share the PyMOL techniques I’ve developed for visualizing weighted ensembles where multiple conformational states coexist. Part 1 covers reference state handling, RMSD-based coloring, and cluster visualization. Part 2 will tackle efficient SASA surface generation for large ensembles. To the best of my knowledge, this is the most advanced PyMOL guide EVER.
The code snippets here are extracted from full scripts attached at the end of this post. All examples use two systems: TeaA (a membrane transporter with distinct open/closed states) and MoPrP (mouse Prion Protein with partially unfolded forms).

In Part 1, we covered reference state handling, RMSD-based coloring, and cluster visualization for weighted structural ensembles. Now we tackle a more ambitious goal: generating solvent-accessible surface area (SASA) surfaces that reflect the weighted conformational distribution of your ensemble.
Why surfaces? Because they show the accessible conformational space: where your protein can actually be found, weighted by population. This is particularly powerful when comparing different fitting methods or showing how experimental constraints reshape the ensemble.
The challenge? A typical ensemble might have 500+ frames, each generating thousands of surface points. Naive approaches choke on the computational and memory demands. This post shares the optimizations that make weighted SASA visualization practical.