What can you do with the OPIG Antibody Suite?

OPIG has now developed a whole range of tools for antibody analysis. I thought it might be helpful to summarise all the different tools we are maintaining (some of which are brand new, and some are not hosted at opig.stats), and what they are useful for.

Immunoglobulin Gene Sequencing (Ig-Seq/NGS) Data Analysis

1. OAS
Link: http://antibodymap.org/
Required Input: N/A (Database)
Paper: http://www.jimmunol.org/content/201/8/2502

OAS (Observed Antibody Space) is a quality-filtered, consistently annotated database of all publicly available next-generation sequencing (NGS) data of antibodies. Here you can:

  • Filter the data by study or filter the data across studies
  • Look at snapshots of the immune repertoire in specific disease states (e.g. healthy, day 7 after vaccination, HIV positive)
  • Analyse different isotype properties
  • Analyse different species properties, and much more…

2. SAAB
Link: http://antibodymap.org/
Required Input: (Single Chain) Antibody Variable Domain Sequence(s)
Paper: https://www.frontiersin.org/articles/10.3389/fimmu.2018.01698/full

SAAB enhances the sequences of your NGS dataset with their likely structural features (e.g. CDRH1-2, CDRL1-3 canonical forms; closest CDRH3 template in the PDB).

3. ABOSS
Downloadable here: http://opig.stats.ox.ac.uk/resources
Required Input: (Single Chain) Antibody Variable Domain Sequences
Paper: Accepted, awaiting DOI

ABOSS trims your NGS dataset of sequences likely to have incurred sequencing errors. Sequences are removed if they do not align to germlines (see ANARCI), have IMGT CDRH3 lengths > 37, possess indels in the canonical CDRs or framework regions, start at IMGT position 24 or later, or have a J gene with sequence identity < 50% to known IMGT germlines. The estimated error rate for your dataset is then calculated based on how often the C23-C104 (IMGT numbering) conserved disulfide bridge is missing from your data. Sequences with residues seen at a given position less frequently than the estimated error rate are then filtered out of the dataset.
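The last two steps (estimating the error rate from the missing disulfide bridge, then dropping sequences that carry rare residues) can be illustrated with a small Python sketch. This is not the ABOSS code: the function names and the input format (each sequence as a dict mapping IMGT position to residue) are assumptions made for illustration, and the earlier germline, length and indel filters are assumed to have been applied already.

```python
from collections import Counter, defaultdict


def estimate_error_rate(seqs):
    """Fraction of sequences lacking a Cys at IMGT position 23 or 104.

    `seqs` is a list of dicts {imgt_position: residue} (hypothetical format).
    """
    missing = sum(1 for s in seqs if s.get(23) != "C" or s.get(104) != "C")
    return missing / len(seqs)


def filter_rare_residues(seqs, error_rate):
    """Keep only sequences whose every residue is at least as frequent,
    at its position, as the estimated error rate."""
    # Count how often each residue is observed at each position.
    counts = defaultdict(Counter)
    for s in seqs:
        for pos, aa in s.items():
            counts[pos][aa] += 1

    kept = []
    for s in seqs:
        ok = all(
            counts[pos][aa] / sum(counts[pos].values()) >= error_rate
            for pos, aa in s.items()
        )
        if ok:
            kept.append(s)
    return kept


# Toy dataset: 17 typical sequences, one with a rare residue at position 50,
# and two with a broken C23-C104 pair.
toy = (
    [{23: "C", 50: "A", 104: "C"} for _ in range(17)]
    + [{23: "C", 50: "W", 104: "C"}]
    + [{23: "S", 50: "A", 104: "C"}, {23: "C", 50: "A", 104: "Y"}]
)
rate = estimate_error_rate(toy)
clean = filter_rare_residues(toy, rate)
print(rate, len(clean))
```

In this toy set, two of the twenty sequences lack the conserved cysteine pair, so any residue seen in fewer than 10% of sequences at its position is treated as a probable sequencing error.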

Antibody Numbering

4. ANARCI
Webserver + Downloadable here: http://opig.stats.ox.ac.uk/webapps/sabdab-sabpred/ANARCI.php
Required Input: (Single Chain) Antibody Variable Domain Sequence(s)
Paper: https://academic.oup.com/bioinformatics/article/32/2/298/1743894

Consistent use of a numbering scheme is essential to quickly identify CDR regions or to compare between multiple antibodies. ANARCI uses Hidden Markov Models to align your sequences to germlines of known numbering, and rapidly returns them numbered in the scheme of choice (IMGT, Chothia, Kabat, Martin).
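To show why a consistent numbering makes CDR identification quick, here is a minimal Python sketch that pulls the CDRs out of an already-numbered sequence using the IMGT CDR boundaries (CDR1: 27-38, CDR2: 56-65, CDR3: 105-117). The input format, a list of ((position, insertion code), residue) tuples with "-" for gaps, and the function name are assumptions for illustration; the sketch does not call ANARCI itself.

```python
# IMGT CDR boundaries (inclusive position ranges).
IMGT_CDRS = {"CDR1": (27, 38), "CDR2": (56, 65), "CDR3": (105, 117)}


def extract_cdrs(numbered):
    """Return the CDR sequences of an IMGT-numbered variable domain.

    `numbered` is a list of ((position, insertion_code), residue) tuples
    in sequence order (hypothetical input format); gaps are "-".
    """
    cdrs = {name: [] for name in IMGT_CDRS}
    for (pos, _insertion), aa in numbered:
        if aa == "-":
            continue  # skip unoccupied positions
        for name, (start, end) in IMGT_CDRS.items():
            if start <= pos <= end:
                cdrs[name].append(aa)
    return {name: "".join(residues) for name, residues in cdrs.items()}


# A short, made-up fragment of a numbered heavy chain.
fragment = [
    ((26, " "), "S"), ((27, " "), "G"), ((28, " "), "F"), ((29, " "), "-"),
    ((38, " "), "Y"), ((39, " "), "M"),
    ((105, " "), "A"), ((106, " "), "R"), ((111, " "), "D"),
    ((111, "A"), "G"), ((117, " "), "Y"), ((118, " "), "W"),
]
print(extract_cdrs(fragment))
```

Because the boundaries are fixed positions in the scheme, the same lookup works for any antibody once it has been numbered, which is what makes comparisons across large datasets straightforward.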


Antibody Structure Prediction

— Loops —

5. SCALOP
Webserver + Downloadable here: http://opig.stats.ox.ac.uk/webapps/sabdab-sabpred/SCALOP.php
Required Input: (Paired/Separate Chain) Antibody Variable Domain Sequence(s)
Paper: https://academic.oup.com/bioinformatics/advance-article-abstract/doi/10.1093/bioinformatics/bty877/5132697?redirectedFrom=fulltext

Five of the CDRs (CDRH1-2, CDRL1-3) are found to fall into distinct, clusterable, canonical structures. SCALOP uses environment-specific substitution matrices to assign likely canonical form from sequence alone. Its high fidelity and speed ensure that this analysis can be performed even on very large datasets (e.g. as part of SAAB).

6. Sphinx (H3 and General)
General Webserver: http://opig.stats.ox.ac.uk/webapps/sabdab-sabpred/Sphinx.php
H3 Webserver: http://opig.stats.ox.ac.uk/webapps/sabdab-sabpred/SphinxH3.php
Required Inputs: CDR Sequence + Framework structure (on which the loop will be grafted)
Paper: https://academic.oup.com/bioinformatics/article/33/9/1346/2908432

Sphinx is useful when your homology modeller (e.g. FREAD) cannot find a close template match for your loop of interest. It uses a combination of shorter, sequence-similar template data and ab initio methodology to fill in the gaps. It then returns its decoys ranked using the SOAP-loop algorithm.

— Side Chains —

7. PEARS
Webserver: http://opig.stats.ox.ac.uk/webapps/sabdab-sabpred/PEARS.php
Required Inputs: Antibody Variable Domain Structure + Corresponding Sequence (with side chains to be remodelled in capital letters)
Paper: https://onlinelibrary.wiley.com/doi/abs/10.1002/prot.25453

PEARS remodels side chains by using Gaussian Mixture Models to predict the most probable rotamer for each remodelled residue, given its position in the antibody sequence.

— Entire Structure —

8. ABodyBuilder
Webserver: http://opig.stats.ox.ac.uk/webapps/sabdab-sabpred/Modelling.php
Required Inputs: (Paired chain) Antibody Variable Domain Sequence (will model Nanobodies if only heavy chain supplied)
Paper: https://www.tandfonline.com/doi/full/10.1080/19420862.2016.1205773

ABodyBuilder chains together ANARCI – FREAD – Modeller/Sphinx (if FREAD fails to find a good loop template) – PEARS as a pipeline to create antibody models from sequence data. It also reports likely model accuracy for each region of the structure. Typical runtime is just over 30s for most antibodies.


Analysis of Antibody Structures

9. SAbDab
Link: http://opig.stats.ox.ac.uk/webapps/sabdab-sabpred/Welcome.php
Required Inputs: N/A (Database)
Paper: https://academic.oup.com/nar/article/42/D1/D1140/1044118

SAbDab mines the PDB for antibody and nanobody structures, annotating them with metadata. It can be searched for:

  • Particular PDB codes
  • PDB codes that match a series of metadata (resolution cutoffs, species, bound/unbound, has affinity data etc.)
  • CDRs that match a series of metadata
  • PDB codes with a particular VH-VL orientation
  • All known post-Phase I Therapeutic Structures within the PDB (via the Therapeutic Antibody Database section). NB: Sequences of all post-Phase I therapeutic antibodies are available in the Supporting Information of the TAP paper (see tool 11).

[We also maintain an equivalent database of PDB TCR structures (STCRDab) here: http://opig.stats.ox.ac.uk/webapps/stcrdab/, paper: https://academic.oup.com/nar/article/46/D1/D406/4566020]

10. Antibody i-Patch (Paratope Prediction)
Webserver: http://opig.stats.ox.ac.uk/webapps/sabdab-sabpred/ABipatch.php
Required Inputs: Antibody and Antigen structures
Paper: https://academic.oup.com/peds/article/26/10/621/1512673

Antibody i-Patch uses contact prediction, antibody binding profiles (derived from PDB antibody-antigen complexes), and the supplied antibody and antigen structures to rank antibody CDR residues based on their propensity to form part of the paratope [the region of the antibody that engages the antigen]. The inverse protocol (where antigen residues are ranked on their likelihood to form part of the epitope) is available here: http://opig.stats.ox.ac.uk/webapps/sabdab-sabpred/EpiPred.php

11. TAP, Therapeutic Antibody Profiler
Webserver: http://opig.stats.ox.ac.uk/webapps/sabdab-sabpred/TAP.php
Required Input: Antibody Variable Domain Sequence (Paired Chains)
Paper: https://www.biorxiv.org/content/early/2018/06/29/359141 (In Review)

TAP builds a model of your antibody variable domain sequence (via ABodyBuilder) and calculates several surface properties, extreme values of which are linked to poor therapeutic developability outcomes. It then compares these values to a set of antibodies that reached Phase II of clinical trials and flags outlying candidates.
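The comparison step can be pictured with a very simple Python sketch: flag any property of a candidate that falls outside the range seen in a reference set. This is only an illustration of the idea, not TAP's actual flagging rule; the property names, values and the plain min/max criterion are all assumptions.

```python
def flag_outliers(candidate, reference):
    """Flag each property whose value lies outside the reference range.

    `candidate` maps property name -> value; `reference` maps property
    name -> list of values from the reference antibodies (hypothetical data).
    """
    flags = {}
    for name, value in candidate.items():
        ref_values = reference[name]
        flags[name] = value < min(ref_values) or value > max(ref_values)
    return flags


# Made-up numbers for two placeholder properties.
reference = {"property_a": [1.0, 2.0, 3.0], "property_b": [10.0, 20.0]}
candidate = {"property_a": 2.5, "property_b": 25.0}
print(flag_outliers(candidate, reference))
```

A real profiler would more likely grade candidates against the reference distribution (for example with percentile thresholds) rather than a hard range, but the shape of the check is the same.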
