Modelling antibodies, from Sequence, to Structure… | Oxford Protein Informatics Group

Antibody modelling has come a long way in the past 5 years. The Antibody Modelling Assessment (AMA) competitions (effectively an antibody version of CASP) have shown that most antibody modelling methods are capable of modelling the antibody variable fragment (Fv) to an RMSD of ≤ 1.5 Å. Despite this feat, AMA-II provided two important lessons:

1. We can still improve our modelling of the framework region and the canonical CDRs.

Stage two of the AMA-II competition showed that CDR-H3 modelling improved once the correct crystal structure (bar the H3 loop, of course) was provided. In addition, some of the canonical CDRs (e.g. L1) and some of the framework loops were modelled poorly.

2. We can’t treat orientation as if it doesn’t exist.

Many pipelines are either vague about how they predict the orientation between the heavy and light chains, or give no explanation at all of how it is predicted for the model structure. Given how important the orientation can be to the antibody’s binding mode (Fera et al., 2014), it’s clear that this part of the pipeline has to be revisited more carefully.

In addition to these lessons, one question remains:

What do we do with these models?

No pipeline, as far as we are aware, comments on what we should do beyond creating the model. What are the model’s implications? Can we use it to guide experiments, and ultimately to develop a potential therapeutic in the long term? In light of these lessons and this glaring question, we developed our own method.

Before we begin, how does modelling work?

In my mind, most, if not all, pipelines follow this generic paradigm:

Our method, ABodyBuilder, also follows this 4-step workflow:

1. We choose the template structure based on sequence identity; below a threshold, we predict the structures of the heavy and light chains separately.
2. If we use structures from separate antibodies, we predict the orientation from the structure with the highest global sequence identity.
3. We model the loops using FREAD (Choi, Deane, 2011).
4. We graft the side chains in using SCWRL.
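
To make the first two steps concrete, here is a minimal sketch of identity-based template selection. It is not the ABodyBuilder code: the `Template` class, the `choose_templates` function, the 0.8 threshold, and the reading of "global" identity as identity over the concatenated heavy and light sequences are all assumptions made for illustration. Sequences are assumed to be pre-aligned to the same length.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Template:
    """Hypothetical template record: a name plus pre-aligned chain sequences."""
    name: str
    heavy: str
    light: str


def sequence_identity(a: str, b: str) -> float:
    """Fraction of identical, non-gap positions in two equal-length aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not a:
        return 0.0
    matches = sum(x == y and x != "-" for x, y in zip(a, b))
    return matches / len(a)


def choose_templates(heavy: str, light: str, templates: List[Template],
                     threshold: float = 0.8) -> Dict[str, str]:
    """Pick template names for the heavy chain, light chain and orientation.

    The threshold value is an assumed placeholder, not a published cutoff.
    """
    def fv_identity(t: Template) -> float:
        # "Global" identity here means identity over both chains together.
        return sequence_identity(heavy + light, t.heavy + t.light)

    best_fv = max(templates, key=fv_identity)
    if fv_identity(best_fv) >= threshold:
        # One antibody is similar enough to serve as template for the whole Fv.
        return {"heavy": best_fv.name, "light": best_fv.name,
                "orientation": best_fv.name}

    # Otherwise choose each chain separately, and take the orientation
    # from the template with the highest global identity.
    best_h = max(templates, key=lambda t: sequence_identity(heavy, t.heavy))
    best_l = max(templates, key=lambda t: sequence_identity(light, t.light))
    return {"heavy": best_h.name, "light": best_l.name,
            "orientation": best_fv.name}
```

In practice the identity would be computed over numbered, aligned framework positions rather than raw strings, but the fallback logic has the same shape.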

Following the modelling procedure, our method also annotates the accuracy of the model in a probabilistic context, i.e. an estimated probability that a particular region is modelled within a given RMSD threshold. Moreover, we also flag up any issues that an experimentalist could run into should they decide to work with the antibody experimentally.

The accuracy estimation is data-driven. Many pipelines end up giving you just a model, but there’s no way of determining its accuracy until the native structure is solved. This is particularly problematic for CDR-H3, where RMSDs between models and native structures can exceed 4.0 Å, so it would be incredibly useful to have an a priori estimate of the expected model accuracy.
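
One simple way to picture a data-driven estimate like this is as an empirical fraction: of the benchmark models built for a given region, how many landed within the RMSD threshold? The sketch below assumes exactly that. The function name `probability_within` and the RMSD values are made up for illustration; they are not ABodyBuilder's estimator or real benchmark results.

```python
from typing import Sequence


def probability_within(benchmark_rmsds: Sequence[float], threshold: float) -> float:
    """Empirical fraction of benchmark models with RMSD at or below the threshold."""
    if not benchmark_rmsds:
        raise ValueError("need at least one benchmark RMSD")
    hits = sum(rmsd <= threshold for rmsd in benchmark_rmsds)
    return hits / len(benchmark_rmsds)


# Illustrative placeholder values only (in Å), NOT real benchmark data.
toy_h3_rmsds = [0.8, 1.2, 1.9, 2.4, 3.1, 4.2, 1.5, 2.8]

# Estimated probability that a CDR-H3 model is within 2.0 Å, given the toy set.
print(probability_within(toy_h3_rmsds, 2.0))  # prints 0.5
```

A more useful estimate would condition on features known before modelling, such as loop length or template identity, by keeping a separate benchmark set for each bin.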

Furthermore, by commenting on motifs that can conflict with antibody development, we aim to offer a convenient solution for users who are considering in vitro experiments with their target antibody. Ultimately, ABodyBuilder is designed with the user in mind: easy-to-use, informative software that facilitates antibody modelling for novel applications.
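
As an example of what flagging a motif can look like, the sketch below scans a sequence for N-linked glycosylation sequons (N, then any residue except P, then S or T), a commonly cited sequence liability. Which motifs ABodyBuilder actually checks is not stated in this post, so both the choice of motif and the `find_sequons` helper are assumptions for illustration.

```python
import re
from typing import List, Tuple

# N-linked glycosylation sequon: N, any residue except P, then S or T.
# The lookahead makes the match zero-width, so overlapping sequons are all found.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(sequence: str) -> List[Tuple[int, str]]:
    """Return (0-based start index, matched triplet) for every sequon in the sequence."""
    return [(m.start(), m.group(1)) for m in SEQUON.finditer(sequence.upper())]


# Toy sequence, not a real antibody.
for start, motif in find_sequons("QVNGSLNPTANNTS"):
    print(f"possible N-glycosylation site at position {start + 1}: {motif}")
```

A real check would report positions in an antibody numbering scheme and would cover more than one motif class.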

Author

Jinwoo Leem
