Category Archives: Uncategorized

Antibody modeling via AMA II and RosettaAntibody

Intro

Protein modeling is one of the most challenging problems in bioinformatics. We still lack a clear theoretical framework which would allow us to link linear protein sequence to its native 3D coordinates. Given that we only have the structures for about a promile of the known seqs, homology modeling is still one of the most successful methods to obtain a structure from a sequence. Currently, using homology modeling and the 1393 known folds we can produce models for more than half known domains. In many cases this is good enough to get an overall idea of the fold but for actual therapeutic applications, there is still a need for high-resolution modeling.

There is one group of molecules whose properties can be readily exploited via computational approaches for therapeutic applications: antibodies.  With blockbuster drugs such as Humira, Avastin or Remicade, they are the leading class of biopharmaceuticals. Antibodies share a great degree of similarity with one another (<50-60% sequence identity) and there are at least 1865 antibody structures in the PDB. Therefore, homology modeling of these structures at high resolution becomes tractable, as exemplified by WAM and PIGS. Here, we will review the antibody modeling paradigm using one of the most successful antibody modeling tools, RosettaAntibody, concluding with the most recent progress from AMA II (antibody CASP).

General Antibody-antigen modeling

Modeling of antibody structures can be divided into the following steps:

  1. Identification of the Framework template
  2. Optimizing Vh/Vl orientation of the template
  3. Modeling of the non-H3 CDRs
  4. Modeling of H3

Most of the diversity of antibodies can be found in the CDRs. Therefore, the bulk of the protein can be readily copied from the framework region. This however needs to undergo an optimization of the Vh/Vl orientation. Prediction of the CDRs is more complicated since they are much more variable than the rest of the protein. Non-H3 CDRs can be modeled using canonical structure paradigms. Prediction of H3 is much more difficult since it does not appear to follow the canonical rules.

When the entire structure is assembled, it is recommended to perform refinement using some sort of relaxation of the structure, coupled with an energy function which should guide it.

RosettaAntibody

RosettaAntibody protocol roughly follows this described above. In the first instance, an appropriate template is identified by highest BLAST bit scores. The best heavy and light chains aligned to the best-BLAST-scoring Fv region. The knowledge-base here is a set of 569 antibody structures form SACS with resolutions 3.5A and better. The Vh/Vl orientation is subsequently refined using local relaxation, guided by Charmm.

Non-H3 CDRs are modeled using the highest-scoring BLAST hit of the same length. Canonical information is not taken into account. Loops are grafted on the framework using the residues overlapping with the anchors.

H3 loops are modeled using a fragment based approach. The fragment library is Rosetta+H3 from the knowledge base of antibody structures created for the purpose of this study. The low-resolution search consists of Monte Carlo attempts to fit 3-residue fragments followed by Cyclic Coordinate Descent loop closure. This is followed by high resolution search when the H3 loop and Vh/Vl are repacked using a variety of moves.

Each decoy coming from the repacking is scored using Rosetta function. The lower the Rosetta score the better the decoy (according to Rosetta).

Results

RosettaAntibody can produce high-quality models (1.4A) on its 54 structure benchmark test. The major limitation of the method (just like any other antibody modeling method) is the H3 loop modeling. It is believed that H3 is the most important loop and therefore getting this loop right is a major challenge.

Right framework and the correct orientation of Vh/Vl have a great effect on the quality of H3 predictions. When the H3 was modeled on using the correct framework, the predictions are order of magnitude better than by using the homology model. This was demonstrated using the native recovery in RosettaAntibody study as well as during ‘Step II’ of the Antibody Modeling assessment where participants were asked to model H3 using the correct framework.

Journal club (Bernhard Knapp): MMPBSA Binding Free Energy Calculations

This week’s topic of the Journalclub was about Molecular Mechanics Poisson−Boltzmann Surface Area (MMPBSA) binding free energy calculations between ligand and receptor using Molecular Dynamics simultions (MD). As an example I selected:

David W. Wright, Benjamin A. Hall, Owain A. Kenway, Shantenu Jha, and Peter V. Coveney. Computing Clinically Relevant Binding Free Energies of HIV-1 Protease Inhibitors. J Chem Theory Comput. Mar 11, 2014; 10(3): 1228–1241

The first question is: Why do we need such rather complex and computationally expensive approaches if other (e.g. empirical) scoring functions can do similar things? The main challenges thereby is that simple scoring functions often do not work very well for systems where they were not calibrated on (e.g. Knapp et al. 2009 (http://www.ncbi.nlm.nih.gov/pubmed/19194661)). The reasons for that are manifold. MD-based approaches can improve two major limitations of classical docking/scoring functions:

1) Proteins are not static. Ligand as well as receptor can undergo various slightly different configurations even for one binding site. Therefore the view of scoring one ligand configuration against one receptor configuration is not the whole picture. The first improvement is to consider a lot of different configurations for one position score of the ligand:

multipleReceptorLigand.png

2) A more physics based scoring function can be more reliable than a simple and run-time efficient scoring function. On the basis of the MD simulations a variety of different terms can be deduced. These include:

dG_formula

– MM stands for Molecular Mechanics. It’s internal energy includes bond stretch, bend, and torsion. The electrostatic part is calculated using a Coulomb potential while the Van der Waals term is calculated using a Lennard-Jones potential.
– PB stands for Poisson−Boltzmann. It covers the polar solvation part i.e. the electrostatic free energy of solvation.
– SA stands for Surface Area. It covers the non-polar solvation part via a surface tension weighted solvent accessible surface area calculation.
– TS stands for the entropy loss of the system. This term is necessary because the non-polar solvation incorporates an estimate of the entropy changes implicitly but does not account for an entropy change upon receptor/ligand formation in vacuo. This term is calculated on the basis of a normal mode analysis.

If all these terms are calculated for each single frame of the MD simulations and those single values are averaged an estimate of the binding free energy of the complex can be obtained. However, this estimate might not represent the actual mean of the spatial distribution. Therefore at least 50 replica MD simulations are needed per investigated complex. In this aspect replica means an identically parameterized simultion of the same complex where only the inital forces are assinged randomly.

On the basis of the described MMPBSA-TS approach in combination with 50 replicas the authors achieve a reasonable correlation (0.63) for the 9 FDA-approved HIV-1
protease inhibitors with know experimental binding affinities. If the two largest complexes are excluded the correlation improves to an excellent value (0.93).

In a current study we are using the same methodology for peptide/MHC interactions. This system is completely different from the protease inhibitor study of Wright et al.: The ligands are peptides and the binding site is a groove consisting of two alpha-helices. The methods was applied as it is (without calibration or any kind of training). Prelimiary data still shows a high correlation with experimental values for the peptide/MHC system. This indicates that this MMPBSA approach can yield reliable predictions for very different systems without further modification.

Protein Folding: Man vs Machine

In 1996 Gary Kasparov, the reigning world chess champion, played IBM’s Deep blue, a computer whose sole purpose was to play chess better than any human. Losing the first match, Gary sprung back swiftly defeating Deep Blue 4-2 over the remaining matches. However, his success was short lived. In a rematch with an updated Deep Blue the following year, the score was 3.5-2.5 to the computer. The media (and IBM) declared this as a pivotal moment in history, where a machine had proven itself better than humanities champion at a game deemed a highly intellectual pursuit. The outcry was that the age of machines had arrived. Was it true? Should humanity have surrendered to machine overloads at that moment? Obviously the answer is a large and resounding no. However, this competition allows for insightful comparison between the manner in which humans and computers play chess and think. By comparing the two, we learn the strengths and weaknesses of both parties from which we can make combined approaches that may exceed either.

Firstly, lets discuss the manner in which a computer “plays” chess. They simply search all possible configurations of moves that are available and pick the most optimal. However, things are not that simple. Consider only the opening sequence, there are 20 possible moves a player can make, so after only a single move by each player there is 400 possible chess positions. This count grows exponentially fast, after 5 moves by each player there is approximately 5 million combinations. For example, it was estimated that Deep Blue could analyse 2 million positions per second. However, since this is not nearly fast enough to examine all possible games from start to end in a reasonable time scale, computers cannot foresee lines of plays which are far in the distance. To overcome this, in the early game the computer will use a reference table developed by grandmasters that list both common openings and the assumed best manner to respond to them. Obviously, these are only assumed as optimal and have never been completely tested. In short, machines participate through a brute force, utilising their intricate ability to perform calculations at high speed to find the best move. However, the search is too large in the initial and end stages of a game to be completely thorough, a reference table is instead used to “inform” of the correct move at these times.

takahashi03

While a human can quite easily see that the following board leads to a draw, computers cannot draw the same conclusion without huge effort.

In contrast, human players use far more visual and spatial recognition alongside both memory and calculation to pick their moves. Like a computer, a player will analyse a portion of the moves available at any given moment. Though since a human cannot compare on computation speed to that of a computer, they cannot analyse nearly the same magnitude of moves. Hence, this subset of moves chosen for analysis must contain the most optimal move(s) to compete against the computer’s raw power. This is where the visual and spatial recognition abilities of humans come to bare. Firstly, a human can easily dissect the board into pieces worth considering and those to be ignored. For example, consider a possible move that would result in your queen being exposed and then taken. A human would conclude this as bad (normally) and discard further moves leading from such a play. A computer, however, would explore the resultant board state. One can see how this immediately and drastically reduces the required search. Another human ability is that a player will often be able to able to see sub-structures within a full set-up that are common in the game and hence can be processed in a known manner. In other words, the game is broken down into fragments which can be processed far easier and with less computation. Obviously, both of the above techniques rely on prior knowledge of chess to be useful, but they based upon our human ability to perceive both the substructure of the game and the overall picture with relative ease.

So how does all this chess talk relate to protein folding? In 2010, the Baker group and creators of the ROSETTA protein fold prediction program produced the protein folding game “Foldit”. In Foldit the general public could attempt to fold proteins for themselves and try to get closer to the native structure than the computer algorithms. Obviously, simplified in presentation to that of academic structural biology, it was hoped that the visual and spatial reasoning abilities of humans, the same ones that differentiated them from machines at chess, would prove useful in protein structure prediction. A key issue within ROSETTA drove this train of thought, the fact that is is relatively bad at exploring fully the confirmation space. Often, it will get stuck in the one general configuration and not explore the fold space fully. Furthermore, due to the size of configuration space, this is not easily overcome with simulated annealing due to the sheer scale of the problem. The ability of humans to view the overall picture meant that it should be easier for them to see other possible configurations. As end goals for Foldit, it was hoped that structures that proved unsolvable by current algorithms would be solved by humans and also that new techniques would emerge as “moves” employed by players to achieve high scores could be studied.

To make a comparison of the structures produced by Foldit players and ROSETTA viable, the underlying energy “scores” that judge a structure is the same between the programs. It is assumed, though is not always true, that the better the score the closer you are to the native fold. In addition,  Foldit players were also able to use a set of optimisation tools that were deterministic and would alter the backbone and side chains to the most optimal local configuration to the arrangement the player would make. This meant that players could focus predominantly on altering the overall structure of the protein rather than the fine detail, such as the position of sidec hains. To make the game as approachable as possible, technical terms were replaced by common analogues and visual cues where displayed to highlight poor scoring areas of the protein. For example, clashes between atoms are shown via large spiked red orbs, while the backbone is coloured from green to red depending on how well buried the hydrophobic residues on that segment are. To drive players, gamification elements were also included such as leader boards and rewarding “fireworks” as graphical effects.

To objectively compare the ability of the player base to that of the ROSETTA algorithm, they performed blind predictions on a set of 10 proteins whose structure were not in the public domain. This was run in a similar manner to CASP for those familiar with that set-up. The results exemplified the innate human ability of visual and spatial recognition. In 5 of the cases the playerbase performed significantly better than the ROSETTA program. In 3 of the cases they performed similar. And in the remaining 2 cases the ROSETTA algorithm performed better, though in both of these the model produced was still extremely far from the native structure. Looking through the cases individually, it was identified that the most crucial element used by players was that they were able to deal with large rearrangements that ROSETTA struggled to deal with, including register shifts and strand swapping. This highlights the ability of humans to view the overall picture and to persevere through “bad scoring patches” to reach a more optimal configuration.

foldit_protein

Comparison of foldit player’s solutions (green) to ROSETTA’s solutions (red) and the native 2KPO protein structure (blue). The players correctly identified a strand swap needed to reach the native form, while this large reconfiguration was not seen by ROSETTA.

Since the release of the game and the accompanying paper in 2010, Foldit has received much praise in conveying the field of protein folding in an approachable manner to so many people. In addition, the player base has contributed to science as whole. In 2011 the player base successfully solved the structure of a M-PMV protein, a retrovirus whose structure was unobtainable via normal means. Then in 2012, by analysing the common set of moves employed by the player base, they collectively produced an algorithm that outperforms previously published fold prediction methods. Personally, I think of Foldit as a fun and relative intuitive game that introduces the core elements of the protein folding problem. As to its scientific merit, I’m unsure as to how much impact it will continue to have. As Saulo discussed last week, if infinite monkeys have infinite time then Shakespeare will be reproduced. Likewise, if enough people manipulate a protein structure, eventually the best structure will be found. Though who am I to judge, if people find the game fun, then there are far worse past-times one can have than trying to solve structures. As a finishing note I would be extremely interested in using Foldit to teach structural biology in the future, though feel it is overall too simple for a university setting.

Distance matrix clustering

In Bioinformatics, we often deal with distance matrices such as:

  • Quantifying pairwise similarities between sequences
  • Structural similarity between proteins (RMSD?)
  • etc.

Next step is to study the groupings within the distance matrix using an appropriate clustering scheme. The obvious issue with most clustering methods is that you would need to specify the number of clusters beforehand (as for K-Means). Assuming that you do not know very much about the data and ‘plotting’ it is not an option, you might try non-parametric hierarchichal clustering such as linkage. The main difference between the two approaches is that using linkage you specify what the maximal distance within each cluster should be and thus the number of clusters will be adjusted accordingly. Par contre, using K-Means you do not have such a distance-guarantee within each cluster since the number of groups is predefined.

Here I will provide a short piece of python code that employs the hcluster library to perform linkage clustering.

Installation

Download hcluster, unpack it and inside the unpacked folder type:

python setup.py install

Alternatively, if you’re not an admin on your machine type:

python setup.py install --user

 Example Code

The purpose of the example bit of code is to generate a random set of points within (0,10) in the 2D space and cluster them according to user’s euclidean distance cutoff.

import matplotlib.pyplot as plt
from matplotlib.pyplot import show
from hcluster import pdist, linkage, dendrogram
import numpy
import random
import sys

#Input: z= linkage matrix, treshold = the treshold to split, n=distance matrix size
def split_into_clusters(link_mat,thresh,n):
   c_ts=n
   clusters={}
   for row in link_mat:
      if row[2] < thresh:
          n_1=int(row[0])
          n_2=int(row[1])

          if n_1 >= n:
             link_1=clusters[n_1]
             del(clusters[n_1])
          else:
             link_1= [n_1]

          if n_2 >= n:
             link_2=clusters[n_2]
             del(clusters[n_2])
          else:
             link_2= [n_2]

	  link_1.extend(link_2)
          clusters[c_ts] = link_1
          c_ts+=1
      else:
          return clusters

#Size of the point matrix
rows = 10 #number of points
columns = 2 #number of dimensions - 2=2D, 3=3D etc.
samples = numpy.empty((rows, columns))

#Initialize a random points matrix with values between 0, 10 (all points in the upper right 0,10 quadrant)
for i in xrange(0,rows):
    for j in xrange(0,columns):
       samples[i][j] = random.randint(0,10)

#Show the points we have randomly generated
print "Samples:\n ", samples

#Create the distance matrix for the array of sample vectors.
#Look up 'squareform' if you want to submit your own distance matrices as they need to be translated into reduced matrices
dist_mat = pdist(samples,'euclidean')

#Perform linkage clustering - here we use 'average', but other options are possible which you can explore here: http://docs.scipy.org/doc/scipy-0.13.0/reference/generated/scipy.cluster.hierarchy.linkage.html
z = linkage(dist_mat, method='average',metric='euclidean')

#Specify a cutoff that will define the clustering - command line argument: 
#python ClusterExample.py 3.0
cutoff = float(sys.argv[1])
clustering = split_into_clusters(z,cutoff,rows)
if clustering==None:
	print "No clusters! Most likely your distance cutoff is too high - all falls into the same bag!"
	quit()

#Print the potential singletons - in magenta
for i in xrange(0,rows):
	plt.plot(samples[i][0],samples[i][1], marker='o', color='m', ls='')
	plt.text(samples[i][0],samples[i][1], str(i), fontsize=12)
#If there are more clusters than these the code will fail!
colors = ['b','r','g','y','c','k']

cluster_num = 0
for cluster in clustering:
   print "Cluster: ",cluster_num
   cluster_num+=1
   for i in clustering[cluster]:
	print "-->",i
	plt.plot(samples[i][0],samples[i][1], marker='o', color=colors[cluster_num], ls='')

#Set the axis limits
plt.xlim([-1,12])
plt.ylim([-1,12])

show()
#Alternatively plot it as dendogram to see where your distance cutoff performed the tree cut
dendrogram(z)
show()

When I ran the code above (python [whatever you call the script].py 2.0) that’s what I got (colors correspond to clusters with ‘magenta’ being singletons):

figure_1

And there is a dendogram command on the bottom of the script to see what the clustering has actually done and where it performed the cut according to your specified cutoff (colors DO NOT correspond to clusters here):

figure_2

 

Hcluster library forms part of scipy with very useful methods for data analysis. You can modify the above code to use a variety of other hierarchichal clustering methods which you can  further explore here.

Freezing python code

Many of us in the group use python as our primary programming language. It is in my opinion an awesome language for lots of reasons. However what happens when you write an application and want to share it with the world? Simply distributing the source code requires a great deal of configuration by the end user. I’m sure you’ve all been there, you have version 1.5.1 they use version 1.6.3. However to download and install this breaks every other bit of code you are using. Creating virtual environments can help towards this, but then do you really want to go towards all the hassle of this for every application you want to use? In the end I have given up trying on a number of projects, which is a fate you would never want for your own code!

From my point of view there are three ways of counteracting this issue.

  1. Make limited use of libraries and imports
  2. Have incredibly clear instructions on how to set up the virtual env
  3. Freeze your code!

The first solution is sometimes just not possible or desirable. For example if you want to use a web framework or connect to third party database engines. The second could be massively time consuming and it is virtually impossible to cover all bases. For example, RDKit, my favourite cheminformatics package, has a lengthy install process with platform specific quirks and many of its own dependencies.

In my project I opted for solution number three. I use PyInstaller however there are many others available (cx_freeze, py2apppy2exe). I used PyInstaller because my application uses the Django project and they offer extra Django support. Also PyInstaller is cross-platform, allowing me (in theory) to package applications for Windows, Mac and Linux using the same protocol.

Here I will briefly outline how to set freeze your code using PyInstaller. This application validates a smiles string and shows you the RDKit canonical form of the smiles string.
This is the structure of the code:

src/
  main.py
  module/
    __init__.py
    functions.py
build/
dist/

main.py is:

import sys
from module.functions import my_fun
if len(sys.argv) > 1:
  smiles = sys.argv[1]
  print my_fun(smiles)
else:
  print "No smiles string requested for validation"

functions.py is:

from rdkit import Chem
def my_fun(smiles):
  mol = Chem.MolFromSmiles(smiles)
  if mol is None:
    return "Invalid smiles"
  else:
    return "Valid smiles IN:  " + smiles + "  OUT: " + Chem.MolToSmiles(mol,isomericSmiles=True) 

  1. Download and install PyInstaller 
  2. Type the following (assuming main.py is your python script)
  3. pyinstaller src\main.py --name frozen.exe --onefile
      This will produce a the following directory structure:

      src/
        main.py
        module/
          functions.py
      build/
        frozen/
      dist/
        frozen.exe
      frozen.spec
      

      frozen.spec is a file containing the options for building the application:

      a = Analysis(['src\\main.py'],
                   pathex=['P:\\PATH\\TO\\HEAD'],
                   hiddenimports=[],
                   hookspath=None,
                   runtime_hooks=None)
      pyz = PYZ(a.pure)
      exe = EXE(pyz,
                a.scripts,
                a.binaries,
                a.zipfiles,
                a.datas,
                name='frozen.exe',
                debug=False,
                strip=None,
                upx=True,
                console=True )
      

      “build” contains files used in the building of the executable

      “dist” contains the executable that you can distribute freely around. Because I used the “–onefile” option above it creates one single .exe file. This makes the file very easy to ship – HOWEVER for large programmes this isn’t totally ideal. All the dependencies are compressed into the .exe and uncompressed into a temporary folder at runtime. If there are lots of files, this process can be VERY slow.

      So now we can run the program:

      dist/frozen.exe c1ccccc1 

      Running dist/frozen.exe returns the error: ImportError: numpy.core.multiarray failed to import
      This is because the RDKit uses this module and it is not packaged up in the frozen code. The easiest way to resolve this is to include this import in main.py:

      from rdkit import Chem
      import numpy
      import sys
      from module.functions import my_fun
      if len(sys.argv) > 1:
        smiles = sys.argv[1]
        print my_fun(smiles)
      else:
        print "No smiles string requested for validation"
      

      And there you have it. “frozen.exe” can be passed around to anyone using windows (in this case) and will work on their box.

      Obviously this is a very simple application. However I have used this to package Django applications, using Tornado web servers and with multiple complex dependencies to produce native windows desktop applications. It works! Any questions, post below!

Structural Biology Module @ the DTC

As part of the DTC Structural Biology module (Feb 2014), first year phD students were given 3 days to answer one of several questions from fields within structural biology. The format had to be an automated presentation, and it had to be ENTERTAINING.

Video 1: Is Your Ligand Really There?

The pilot episode of the award-winning series “Protein Hour”…

Video 2: Protein-Protein Docking

Do not attempt to spoof “The Matrix” – That is impossible…

http://www.youtube.com/watch?v=1hVddocYpwI

Video 3: Are Membrane Proteins Special?

An appeal from “Protein Relief 2014″…

Video 4: Structure-based and fragment-based drug design – do they really work?

Is stop-motion animation the next blockbuster in drug design?

http://www.youtube.com/watch?v=-YHicyspTR4

Expanding Anfinsen’s Principle (Journal Club)

Paper: Expanding Anfinsen’s Principle: Contributions of Synonymous Codon Selection to Rational Protein Design.

In 1961, Anfinsen performed his now (in)famous denaturing experiment upon ribonuclease A, a small one hundred residue globular protein. He showed that it could both unfold and refold via the addition and subsequent removal of chemical substances. From this he concluded that a protein’s fold is that of its global free energy minimum and, consequently, all the information required to know the folded structure of a given protein is solely encoded within its sequence of amino acids. In 1972, Anfinsen was awarded the Nobel prize for this work from which stemmed the vast field of protein folding prediction, a global arms race to see who could best predict/find the elusive global minimum for any given protein.

Unfortunately, protein fold prediction is still in its infancy with regards to its predictive power. As a scientific community, we have made huge progress using homology models, whereby we use the structure of a protein with similar sequence to the one under investigation to provide a reasonable starting point for refinement. However, when there is no similar structure in existence, we struggle abysmally due to being forced to resort to de novo models. This lack of ability when we are given solely a sequence to work with, shows that that our fundamental understanding of the protein folding process must be incomplete.

An increasingly common viewpoint, one that is at odds with Anfinsen’s conclusions, is that there is additional information required for a protein to fold. One suggested source of information is in the production of the protein itself at the ribosome. Coined as cotranslational folding, it has been shown that a protein undergoing synthesis will fold as it emerges from the ribosome, not waiting until the entire sequence is synthesised. As such, the energy landscape that the protein must explore to fold is under constant change and development as more and more of the protein emerges from the ribosome. It is an iterative process of smaller searches as the energy landscape is modulated in steps to that of the complete amino acid sequence.

Another suggested source of information is within the degeneracy observed within the genetic code. Each amino acid is encoded for by up to 6 different codons, and as such, one can never determine exactly the coding DNA that created a given protein. While this degeneracy has been suggested as merely an effect to reduce the deleterious nature of point mutations, it has also been found that each of these codons are translated at a different rate. It is evident that information is consumed when RNA is converted into protein at the ribosome, sine reverse translation is impossible, and it is hypothesised that these variations in speed can alter the final protein structure.

Figure 1. Experimental design for kinetically controlled folding. (a) Schematic of YKB, which consists of three half-domains connected by flexible (AGQ)5 linkers (black lines). The Y (yellow) and B (blue) half-domains compete to form a mutually exclusive kinetically trapped folded domain with the central K (black) half-domain. The red wedge indicates the location of synonymous codon substitutions (see text). (b) Energy landscapes for proteins that fold under kinetic control have multiple deep minima, representing alternative folded structures, separated by large barriers. The conformations of the unfolded protein and early folding intermediates (colored arrows) determine the final folded state of the protein. Forces that constrict the unfolded ensemble (gray cone) can bias folding toward one structure. (c) During translation of the nascent chain by the ribosome (orange), folding cannot be initiated from the untranslated C-terminus, which restricts the ensemble of unfolded states and leads to the preferential formation of one folded structure.

Figure 1. Experimental design for kinetically controlled folding. (a) Schematic of YKB, which consists of three half-domains connected by flexible (AGQ)5 linkers (black lines). The Y (yellow) and B (blue) half-domains compete to form a mutually exclusive kinetically trapped folded domain with the central K (black) half-domain. The red wedge indicates the location of synonymous codon substitutions (see text). (b) Energy landscapes for proteins that fold under kinetic control have multiple deep minima, representing alternative folded structures, separated by large barriers. The conformations of the unfolded protein and early folding intermediates (colored arrows) determine the final folded state of the protein. Forces that constrict the unfolded ensemble (grey cone) can bias folding toward one structure. (c) During translation of the nascent chain by the ribosome (orange), folding cannot be initiated from the untranslated C-terminus, which restricts the ensemble of unfolded states and leads to the preferential formation of one folded structure. Image sourced from J. Am. Chem. Soc., 2014, 136(3),

The journal club paper by Sander et al. looked experimentally at whether both cotranslational folding and codon choice can have effect on the resultant protein structure. This was achieved through the construction of a toy model protein, consisting of three half domains as shown in Figure 1. Each of these half domains were sourced from bifluorescent proteins, a group of protein half domains that when combined fluoresce. The second half domain (K) could combine with either the first (Y) or the third (B) half domains to create a fluorophore, crucially this occurring in a non-reversible fashion such that once a full domain was formed it could not form the other. By choosing flurophores that differed in wavelength, it was simple to measure the ratio in which the species, YK-B or Y-KB, were formed.

They found that the ratio of these two species differed between in-vitro and in-vivo formation. When denatured Y-K-B species were allowed to refold, a racemic mixtrue was produced, both species found the be equally likely to form. In contrast, when synthesised at the ribosome, the protein showed an extreme bias to form the YK-B species as shown in Figure 2. They concluded that this is caused by cotranslational folding, the half domains Y and K having time to form the YK species before B was finished being produced. As pointed out by some members within the OPIG group, it would have been appreciated to see if similar results were produced if the species were reversed, such that B was synthesised first and Y last, but this point does not invalidate what was reported.

Figure 2. Translation alters YKB folded structure. (a) Fluorescence emission spectra of intact E. coli expressing the control fluorescent protein constructs YK (yellow) or KB (cyan). (b) Fluorescence emission spectra of intact E. coli expressing YKB constructs with common or rare codon usage (green versus red solid lines) versus the same YKB constructs folded in vitro upon dilution from a chemical denaturant (dashed lines). Numbers in parentheses correspond to synonymous codon usage; larger positive numbers correspond to more common codons. (c) E. coli MG1655 relative codon usage(3) for codons encoding three representative YKB synonymous mutants: (+65) (light green), (−54) (red), and (−100) (pink line).

Figure 2. Translation alters YKB folded structure. (a) Fluorescence emission spectra of intact E. coli expressing the control fluorescent protein constructs YK (yellow) or KB (cyan). (b) Fluorescence emission spectra of intact E. coli expressing YKB constructs with common or rare codon usage (green versus red solid lines) versus the same YKB constructs folded in vitro upon dilution from a chemical denaturant (dashed lines). Numbers in parentheses correspond to synonymous codon usage; larger positive numbers correspond to more common codons. (c) E. coli MG1655 relative codon usage(3) for codons encoding three representative YKB synonymous mutants: (+65) (light green), (−54) (red), and (−100) (pink line). Image sourced from J. Am. Chem. Soc., 2014, 136(3).

Following the above, they also probed the role of codon choice using this toy model system. They varied the codons choice over a small segment of residues between the K and B half domains, such that the had multitude of species which would be encoded either “faster” or “slower” across this region. Codon usage was used as the measure of speed, though this has yet to established within the literature as to its appropriateness. They found that the slower species increased the bias towards the YK-B over Y-KB, while faster species reduced it. This experiment shows clearly that codon choice has a role on a protein’s final structure, though they only show a large global effect. My work is primarily on whether codon choice has a role at the secondary structure level, so I will be avidly hoping that more experiments will follow that show the role of codons at finer levels.

In conclusion, Sander et al. performed one of the cleanest experimental proofs of cotranslational folding to date. Older evidence is more anecdotal in nature, with reports of protein X or Y changing in response to a single synonymous mutation. In contrast, the experiment reported here is systematic in the approach and leaves little room for doubt over the results. Secondly and more ground breaking, is the (again) systematic nature in which codon choice is investigated and shown to effect the global protein structure. This is one of those rare pieces of science which the conclusions are clear and forthcoming to all readers.

Activity cliffs

Mini-perspective on activity cliffs as a medicinal chemistry tool

Recently in group meeting we discussed activity cliffs and their application to medicinal chemistry. The talk was based largely on the excellent mini-perspective on the subject written by Stumpfe et al.

What is an activity cliff?

Activity cliffs are two compounds that represent a small structural change but a large activity change. They are used commonly in the design of new compounds targeting a particular protein (drugs). They work on the principal that if a given structural change has previously had a large affect on activity it is likely to have a similar affect on a different compound series. In this way they can be used as predictive tools to suggest chemical transformations that are likely to improve activity for a given compound.

To define an activity cliff, one must consider what a small structural change and a large activity change mean.

Small structural change

Structural changes can be measured using a seemingly endless array of methods. A lot of methods will condense the chemical information of the molecule into a bit-vector. Each bit indicates the molecule contains a particular type of chemical functionality, e.g. a methyl group. Molecular similarity is then assessed by comparing the bit-vectors, most commonly by finding the Tanimoto similarity between the them. This then returns a single value between 0 and 1 indicating how similar the two molecules are (the greater the more similar). To define small structural change, one must decide upon a threshold value above which two molecules are sufficiently similar.
An alternative method is to find matched molecular pairs – compounds which are identical apart from one structural change. An example of one is shown below. For matched molecular pairs the only parameter required is the size of the non-matching part of the pair. This is usually measured in non-hydrogen atoms. The threshold to use for this parameter is chosen equally arbitrarily however it has a much more intuitive effect.

mmp

An example of a matched molecular pair

Which method to use?

Similarity methods are less rigid and are capable of finding molecules that are very similar, however that differ in two or more subtle ways. They however are also liable to find molecules similar when they would not be perceived as so. In this work Stumpfe et al. show that different similarity methods do not agree greatly on which molecules are “similar”. They compare six different fingerprint methods used to find similar molecules. Each method finds around 30% similar molecules in the datasets used, however the consensus between the methods is only 15%. This indicates that there is no clear definition of “similar” using bit-string similarity. Interestingly a third of the molecules found to be similar by all six fingerprint methods are not considered matched molecular pairs. This demonstrates a downside of the matched molecular pair approach, that it is liable to miss highly similar molecules that differ in a couple of small ways.

Matched molecular pairs are, however, least liable to find false-positives, i.e. compounds that are seen as similar but in fact are not actually similar. The transformations they represent are easily understood and this can be easily applied to novel compounds. For these reasons matched molecular pairs were chosen by Stumpfe et al. for this work to indicate small structural changes.

Large activity change

A large activity change is an equally arbitrary decision to make. The exact value that indicates an activity cliff will depend on the assay used and the protein being tested against. Stumpfe et al. reasonably suggest that approximate measures should not be used and that activity scores found between different assays should not be compared.

Rationales for activity cliffs

If structural data is available for an activity cliff, rationales for their corresponding activity change can be suggested. These can then be used to suggest other alterations that might have a similar impact. Stumpfe et al. consider the five most common rationales for activity cliffs.

  • H-bond and or ionic interactions: these interactions will increase the binding energy forming specific interactions with the protein
  • Lipophilic and aromatic groups: these groups can form specific protein-ligand interactions, e.g. pi-pi stacking and also form favourable interactions with hydrophobic residues in the protein
  • Water molecules: One molecule in the pair displaces water molecules from the active site, altering the binding energy
  • Stereochemistry changes: for example altering an enantiomeric form of a compound alters the projection of a group, forming or losing favourable/disfavourable protein-ligand interactions
  • Multiple effects: a combination of the above, and thus difficult to establish the dominant feature.

Are they generally useful?

Stumpfe et al. consider whether activity cliffs are more useful for some proteins or protein classes than others. They investigate how many compounds form activity cliffs for many protein targets for which activity data is openly available. For proteins with more than 200 compounds with activity data the number of activity cliff forming compounds is roughly equivalent (around 10%). This is an interesting and unexpected result. The proteins used in this study have different binding sites attracting different opportunities for protein-ligand interactions. It would not, therefore naturally be expected that these would attract similar opportunities for generating activity cliffs. This result shows that the activity cliff concept is generically useful, irrespective of the protein being targeted.

Are they predictive?

Although activity cliffs make intuitive sense, Stumpfe et al. consider whether it has been quantitatively successful in previous drug discovery efforts. They investigate all of the times that activity cliff information was available from openly available data. They then find all the times this information was used in a different compound series and if it was used whether it had a positive or negative effect on activity.

Interestingly available activity cliff information had not been used in 75% of cases. They suggest that this indicates this information is an as yet underused resource. Secondly, in the cases where it was used, 60% of the time it was successful in improving activity and 40% of the time is was unsuccessful. They suggest this indicates the activity cliff method is useful for suggesting novel additions to compounds. Indeed it is true that a method that gives a 60% success rate in predicting more potent compounds would be considered useful by most if not all medicinal chemists. It would be interesting to investigate if there were patterns in protein environment or the nature of the structural changes in the cases where the activity cliff method is not successful.

Have they been successful?

Finally Stumpfe et al. investigate whether using activity cliff information gives a higher probability of synthesising a compound in the 10% most active against the target protein. They show that in 54% of cases using activity cliff information a compound in the 10% most active is formed. Conversely when this information is not used only 28% of pathways produce a compound in the 10% most active. They argue this indicates that using activity cliff information improves the chances of producing active compounds.

Conclusion

The paper discussed here offers an excellent overview of the activity cliff concept and its application. They demonstrate, in this work and others, that activity cliffs are generally useful, predictive and currently underused. The method can therefore be used in new tools to improve the efficiency of drug discovery.

CAPRI, pyDock and the state of protein-protein docking.

Intro

We have discussed a paper highlighting the latest progress in the field of protein-protein docking. It covers the results of the participation of the pyDock group in CAPRI. The latest round of the experiment is particularly interesting as it challenged the participants with affinity prediction which is going one step beyond modeling protein-protein complexes.

Role of the group in this CAPRI edition

For those that are not very familiar with CAPRI, one can participate in any of the following roles:

  • Predictors: Participants are given the unbound structures or sequences of one or both interacting partners. Predictors are then supposed to submit a ranked list of modeled complexes which can be achieved by any docking/prediction tools and some human intervention. This tests the current ad hoc capacity to model protein-protein complexes.
  • Servers: Participants are given the unbound structures or sequences of one or both interacting partners that are supposed to be submitted to an automatic server which needs to return the results (ideally) within 24 hours. This tests the automated capacity to model protein-protein models.
  • Scorers: The community contributes sets of poses (unranked) for each target. The scorers are then supposed to re-rank the poses. This is a form of cross-validation of methods since one could expect that the group that samples the poses also performs better scoring. This exercise is supposed to quantify this.

The pyDock group participates in the capacity of Predictors and Scorers. Below we first outline the general docking methodology, followed by the particular implementation of the pyDock group.

General Docking Methodology

Protein-protein docking starts with two interacting structures at input: (ligand (L) and receptor (R)) (if the structures are unavailable a model is created). The docking techniques can be broadly divided into two types: template based and template-free. In template-based docking, one needs either a complex of homologs or one that is sufficiently structurally similar. However since it is estimated that the number of complexes in the PDB only covers ~4% of the presumed total, this approach is not applicable in a great number of cases.

Template-free docking does not rely on pre-solved complexes and it is the main focus of this post. Standard modus-operandi of a template-free protein-protein docker is the following:

  1. Sample N poses.
  2. Re-score top D sampled poses.
  3. Refine the top K poses.

In the first step, possible poses of ligand with respect to the receptor. This is very often achieved by FFT (ZDOCK) or more elaborate methods such as geometric hashing/pose clustering (PatchDock). The possible orientations are ranked by a simplified energy function or a statistical potential. In most cases this step is performed in rigid-body mode (intra-molecular distances are not allowed to change) for the computational efficiency. The number of sampled poses is usually in the thousands (N~ 2k-10k).

The re-scoring set uses a more elaborate energy function/statistical potential to identify more close-to native poses. Notable examples include pyDock and ZRANK. Since such functions are usually more expensive computationally (for instance pyDock calculates the LJ potential, Coulombic electrostatics and desolvation terms), it is more tractable to apply them to a smaller set of (D<<N) of top poses as returned by the rigid-body sampler. Additionally, the final poses might be pruned for redundancy by removing structures which are very similar to each other. One method in particular (ClusPro) owes its success to scoring the rankings based on the numbers of similar decoys per cluster out of a number of top predictions.

The final step constitutes the refinement of the select top K poses (K<<D). Here, flexibility might be introduced so as to account for conformational changes upon binding. Tools used to achieve this are very computationally expensive energy fields such as AMBER or CHARMM. The coordinates of side-chains or even backbones are distorted, for instance using normal mode calculations, until the energy function reaches an energetic minimum via a flavor of gradient descent.

Fig. 1: Breakdown of results of the pyDock group at the latest CAPRI round. The results are presented for each target, in the predictor as well as scorer capacity.

Fig. 1: Breakdown of results of the pyDock group at the latest CAPRI round. The results are presented for each target, in the predictor as well as scorer capacity.

The pyDock group Methodology and Results

The pyDock group follows the pattern outlined above. As a sampler they employ ZDOCK and FTDOCK, both fast rigid-body Fast Fourier Transform-based methods. They use their pyDock function to score the best models. Additionally, they remove redundancy from the final lists of decoys by removing these entries which are too similar to the higher-scoring ones according to ligand rmsd. The final refining step is carried out using TINKER.

Receptor is shown in white. Best ligand as predictors in red, as scorers in blue.

Receptor is shown in white. Best ligand as predictors in red, as scorers in blue.

The pipeline outlined above is carried out in most of targets, however there are some exceptions. For some targets, additional docking poses were generated using SwarmDock (T53, T54, T57 and T58) and RotBUS (T46 and T47). In some cases rather than using TINKER at the refinement stage, CHARMM or AMBER were used.

Receptor is shown in white. Best ligand as predictors in red, as scorers in blue.

Receptor is shown in white. Best ligand as predictors in red, as scorers in blue.

Such ad hoc approaches were employed for almost every target. Available information was employed to the fullest in order to achieve best models. For instance, in cases where good homology structures were available (T47), or it was known which residues appear of importance to the complex, appropriate distance constraints were imposed. The pyDock group achieves better accuracy as predictors rather than scorers when it comes to docking (67% correct models submitted against 57%). They follow a trend wherein predictors usually do better than scorers (See Fig. 1). This trend is however broken at the stage of predicting the water molecules at the interface of T47. Using DOWSER In the predictor capacity they correctly identify 20% contact-mediating water molecules and 43% as scorers.

Receptor is shown in white. Best ligand as predictors in red, as scorers in blue.

Receptor is shown in white. Best ligand as predictors in red, as scorers in blue.

Overall, the performance of docking is indicating that the field is steadily moving forward towards achieving the goal of modeling complexes using sequence alone. There were some cases in this round where predictor groups started with sequence only and still produced reasonable models of complexes (T47, T50 and T53). In each of these cases one of the partners was an unbound structure and the other was a sequence. The only case where both partners were sequences did not produce any reasonable models. In this case only two groups out of 40 managed to present meaningful solutions.

Receptor is shown in white. Best ligand as predictors in red, as scorers in blue.

Receptor is shown in white. Best ligand as predictors in red, as scorers in blue.

Furthermore, this CAPRI round was the first to introduce affinity prediction – in targets T55 – T56. The aim was to predict the binding energy changes upon mutations. The predictor designed by the pyDock group achieved a good results on this test case with more in-depth details on the approach found in a related community-wide experiment.

Receptor is shown in white. Best ligand as predictors in red, as scorers in blue.

Receptor is shown in white. Best ligand as predictors in red, as scorers in blue.