In molecular biology cutting and tweaking a protein construct is an often under-appreciated essential operation. Some protein have unwanted extra bits. Some protein may require a partner to be in the correct state, which would be ideally expressed as a fusion protein. Some protein need parts replacing. Some proteins disfavour a desired state. Half a decade ago, toolkits exists to attempt to tackle these problems, and now with the advent of de novo protein generation new, powerful, precise and way less painful methods are here. Therefore, herein I will discuss how to generate de novo inserts and more with RFdiffusion and other tools in order to quickly launch a project into the right orbit.
Furthermore, even when new methods will have come out, these design principles will still apply —so ignore the name of the de novo tool used.
Category Archives: Protein Engineering

De novo protein padlocks
Binding a desired protein tightly is important for biotechnology. Recent advances in deep learning have allowed the de novo design of (mostly α-helical) binding protein, sidestepping the laborious process of raising antibodies or nanobodies or evolving affibodies, darpins or similar. These deep learning designed binders will bind with okay affinity, but what if the affinity required were much stronger?
<Enter autocatalytic isopeptide bonds>
The “AI-ntibody” Competition: benchmarking in silico antibody screening/design
We recently contributed to a communication in Nature Biotechnology detailing an upcoming competition coordinated by Specifica to evaluate the relative performance of in vitro display and in silico methods at identifying target-specific antibody binders and performing downstream antibody candidate optimisation.
Following in the footsteps of tournaments such as the Critical Assessment of Structure Prediction (CASP), which have led to substantial breakthroughs in computational methods for biomolecular structure prediction, the AI-ntibody initiative seeks to establish a periodic benchmarking exercise for in silico antibody discovery/design methods. It should help to identify the most significant breakthroughs in the space and orient future methods’ development.
Continue readingProtein Property Prediction Using Graph Neural Networks
Proteins are fundamental biological molecules whose structure and interactions underpin a wide array of biological functions. To better understand and predict protein properties, scientists leverage graph neural networks (GNNs), which are particularly well-suited for modeling the complex relationships between protein structure and sequence. This post will explore how GNNs provide a natural representation of proteins, the incorporation of protein language models (PLLMs) like ESM, and the use of techniques like residual layers to improve training efficiency.
Why Graph Neural Networks are Ideal for Representing Proteins
Graph Neural Networks (GNNs) have emerged as a promising framework to fuse primary and secondary structure representation of proteins. GNNs are uniquely suited to represent proteins by modeling atoms or residues as nodes and their spatial connections as edges. Moreover, GNNs operate hierarchically, propagating information through the graph in multiple layers and learning representations of the protein at different levels of granularity. In the context of protein property prediction, this hierarchical learning can reveal important structural motifs, local interactions, and global patterns that contribute to biochemical properties.
Continue readingThe wider applications of nanobodies
This week, it was my turn to give the short talk at our group meeting. I chose to present a recently published paper on thermostability prediction for nanobodies. The motivation for this work, at least in part, is the need for thermostability in the diverse applications of nanobodies. At OPIG, our research primarily revolves around the therapeutic uses of nanobodies, but their potential extends beyond this. I thought it would be interesting to highlight some of these broader applications here:
Continue readingConference Summary: MGMS Adaptive Immune Receptors Meeting 2024
On 5th April 2024, over 60 researchers braved the train strikes and gusty weather to gather at Lady Margaret Hall in Oxford and engage in a day full of scientific talks, posters and discussions on the topic of adaptive immune receptor (AIR) analysis!

What can you do with the OPIG Immunoinformatics Suite? v3.0
OPIG’s growing immunoinformatics team continues to develop and openly distribute a wide variety of databases and software packages for antibody/nanobody/T-cell receptor analysis. Below is a summary of all the latest updates (follows on from v1.0 and v2.0).
Continue readingThe State of Computational Protein Design
Last month, I had the privilege to attend the Keystone Symposium on Computational Design and Modeling of Biomolecules in beautiful Banff, Canada. This conference gave an incredible insight into the current state of the protein design field, as we are on the precipice of advances catalyzed by deep learning.
Here are my key takeaways from the conference:
Continue readingExperience at a Keystone Symposium
From 19th-22nd February I was fortunate enough to participate in the joint Keystone Symposium on Next-Generation Antibody Therapeutics and Multispecific Immune Cell Engagers, held in Banff, Canada. Now in their 51st year, the Keystone Symposia are a comprehensive programme of scientific conferences spanning the full range of topics relating to human health, from studies on fundamental bodily processes through to drug discovery.
Continue reading3 Key Questions to Think About When Designing Proteins Computationally
We have reached the era of design, not just ‘hunting’. Particularly exciting to me is the de novo design of proteins, which have a wide and ever increasing range of applications from therapeutics to consumer products, biomanufacturing to biomaterials. Protein design has been a) enabled by decades of research that contributed to our understanding of protein sequence, structure & function and b) accelerated by computational advances – capturing the information we have learned from proteins and representing it for computers and machine learning algorithms.
In this blog post, I will discuss three key methodological considerations for computational protein design:
- Sequence- vs structure-based design
- ML- vs physics-based design
- Target-agnostic vs target-aware design