Category Archives: X-ray Crystallography

How Unusual Is Your Generated Molecule? Let The CCDC Tell You

In this post I’ll walk through how to set up the CCDC Python API and use the CSD Geometry Analyser to evaluate the geometric quality of molecules from three representative structure-based de novo design models. I’ve put together a small GitHub repo with the full analysis code where we look at bond lengths, angles, torsions, and ring conformations across the three methods, and compare these against their PoseBusters validity scores to see what each metric is really capturing.


A Golden Age of Nanomedicine

As someone who spent their entire academic career, from B.Sc. to M.Sc. to Ph.D., within a Kavli Institute for Nanoscience Discovery (first in Delft and now in Oxford), I’ve had the privilege of seeing firsthand just how beautifully intricate the nanoscale world can be. Now, as my research focuses on lipid nanoparticles for genetic therapeutics and vaccines, I would like to use this platform to advocate for what I believe is one of the most transformative frontiers in modern medicine: the rational design of nanomaterials for therapeutic delivery.


Misconduct, Bias or Benign? A Case of Missing Ångströms

An Ångström

An Ångström (Å) is a unit of length equal to 10⁻¹⁰ metres, or one ten-billionth of a metre. It sits at a comfortable scale for the atomic world: the diameter of a hydrogen atom and the length of a chemical bond are both measured in Ångströms.

It is not an SI (Système International d’Unités) unit. In fact, it has been formally deprecated in favour of the nanometre (1 Å = 0.1 nm), and standards bodies such as NIST and the BIPM discourage its use. Yet in structural biology and chemistry, crystallography, and materials science, the Ångström persists, partly out of stubbornness, I would say, but mostly out of convenience. Saying a protein structure was solved at 2.1 Å feels natural in a way that 0.21 nm does not.

So we keep using it. And because we keep using it, we inherit its quirks and history.
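
The conversion quoted above (1 Å = 0.1 nm = 10⁻¹⁰ m) is simple enough to sketch in a few lines of Python. This is only a minimal illustration: the function and constant names are my own and do not come from any particular library.

```python
# Minimal sketch of Ångström <-> nanometre conversion.
# All names here are illustrative assumptions, not part of any library.

ANGSTROM_IN_METRES = 1e-10  # 1 Å = 10^-10 m
ANGSTROM_IN_NM = 0.1        # 1 Å = 0.1 nm


def angstrom_to_nm(value_angstrom: float) -> float:
    """Convert a length in Ångströms to nanometres."""
    return value_angstrom * ANGSTROM_IN_NM


def nm_to_angstrom(value_nm: float) -> float:
    """Convert a length in nanometres to Ångströms."""
    return value_nm / ANGSTROM_IN_NM


# The resolution example from the text: 2.1 Å expressed in nanometres.
resolution_angstrom = 2.1
resolution_nm = angstrom_to_nm(resolution_angstrom)
print(f"{resolution_angstrom} Å = {resolution_nm:.2f} nm")
```

Floating-point arithmetic means the raw product is not exactly 0.21, which is why the example rounds when printing.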


AI generated linkers™: a tutorial

In molecular biology, cutting and tweaking a protein construct is an essential but often under-appreciated operation. Some proteins have unwanted extra bits. Some proteins may require a partner to be in the correct state, which would ideally be expressed as a fusion protein. Some proteins need parts replacing. Some proteins disfavour a desired state. Half a decade ago, toolkits existed to attempt to tackle these problems, and now, with the advent of de novo protein generation, new, powerful, precise and far less painful methods are here. Therefore, herein I will discuss how to generate de novo inserts and more with RFdiffusion and other tools in order to quickly launch a project into the right orbit.
Furthermore, even when newer methods come out, these design principles will still apply, so ignore the name of the de novo tool used.


Walk through a cell

In 2022, Maritan et al. released the first ever macromolecular model of an entire cell. The cell in question is a bacterial cell from the genus Mycoplasma. If you’re a biologist, you likely know Mycoplasma as a common cell culture contaminant.

Now, through the work of app developer Timothy Davison, you can interactively explore this cell model from the comfort of your iPhone or Apple Vision Pro. Here are three reasons why I like CellWalk:

1. It’s pretty

The visuals of CellWalk are striking. The app offers a rich depiction of the cell, allowing the user to zoom from the whole cell to individual atoms. I spent a while clicking through each protein I could find to see if I could guess what it was or what it did. Zooming out, CellWalk offers a beautiful tripartite cross section of the cell, showing first the lipid membrane, then a colourful jumble-bag of all its cellular proteins, and then finally the spaghetti-like polynucleic acids.

Tripartite cross section of a Mycoplasma cell. Screengrab taken from the CellWalk app on my phone.

RSC Fragments 2024

I attended RSC Fragments 2024 (Hinxton, 4–5 March 2024), a conference dedicated to fragment-based drug discovery. The talks were really good because they gave overviews of projects involving teams across long stretches of time. As a result, there were no slides discussing wet lab protocol optimisations and not a single Western blot was seen. The focus was primarily on either illustrating a discovery platform or recounting a declassified campaign. The latter were interesting, although I’ll admit I wish there had been more talk of organic chemistry: there was not a single moan/gloat about a yield. This top-down focus was nice as topics kept overlapping, namely:

  • Target choice,
  • covalents,
  • molecular glues,
  • whether to escape Flatland,
  • thermodynamics, and
  • cryptic pockets

9th Joint Sheffield Conference on Cheminformatics

Over the next few days, researchers from around the world will be gathering in Sheffield for the 9th Joint Sheffield Conference on Cheminformatics. As one of the organizers (wearing my Molecular Graphics and Modelling Society ‘hat’), I can say we have an exciting array of speakers and sessions:

  • De Novo Design
  • Open Science
  • Chemical Space
  • Physics-based Modelling
  • Machine Learning
  • Property Prediction
  • Virtual Screening
  • Case Studies
  • Molecular Representations

It has traditionally taken place every three years and, despite the global pandemic, it is returning this year, once again in person in the excellent conference facilities at The Edge. You can download the full programme in iCal format.


histo.fyi: A Useful New Database of Peptide:Major Histocompatibility Complex (pMHC) Structures

pMHCs are set to become a major target class in drug discovery; unusual peptide fragments presented by MHC can be used to distinguish infected/cancerous cells from healthy cells more precisely than over-expressed biomarkers. In this blog post, I will highlight a prototype resource: Dr. Chris Thorpe’s new database of pMHC structures, histo.fyi.

histo.fyi provides a one-stop shop for data on (currently) around 1400 pMHC complexes. Similar to our dedicated databases for antibody/nanobody structures (SAbDab) and T-cell receptor (TCR) structures (STCRDab), histo.fyi will scrape the PDB on a weekly basis for any new pMHC data and process these structures in a way that facilitates their analysis.


CryoEM is now the dominant technique for solving antibody structures

Last year, the Structural Antibody Database (SAbDab) listed a record-breaking 894 new antibody structures, driven in no small part by the continued efforts of researchers to understand SARS-CoV-2.

Fig. 1: The aggregate growth in antibody structure data (all methods) over time. Taken from http://opig.stats.ox.ac.uk/webapps/newsabdab/sabdab/stats/ on 25th May 2022.

In this blog post I wanted to highlight the major driving force behind this curve – the huge increase in cryo electron microscopy (cryoEM) data – and the implications of this for the field of structure-based antibody informatics.
