Journal Club: Random Coordinate Descent

The paper I chose to present at last week’s group meeting was “Random Coordinate Descent with Spinor-Matrices and Geometric Filters for Efficient Loop Closure”, by Pieter Chys and Pablo Chacón.

Loop closure is an important step in the ab initio modelling of protein loops. After a loop is initially built, normally by randomly choosing φ/ψ (phi/psi) dihedral angles from a distribution (Step 1 in the figure below), it is probably not ‘closed’ – i.e. the end of the loop does not meet the rest of the protein structure on the other side of the gap. Waiting for the algorithm to produce closed initial conformations would be horribly inefficient, so it’s much better to have some method of closing the initial loop structures computationally.

The main steps in the ab initio prediction of protein loops.

Loop closure methods can be classified into three different types:

  1. Analytical methods: the exact solution to the loop closure problem is calculated. The difficulty with this approach is that it becomes increasingly complicated the more degrees of freedom (i.e. dihedral angles) you have.
  2. Build-up methods: the loop is built residue-by-residue to construct an approximately closed loop which can then be refined. Basically, the loop is guided to the closed position as it is being built.
  3. Iterative methods: do just what they say on the tin – the loop is closed gradually through a series of iterations.

Of course, science is never simple, and loop closure algorithms often cannot be classified into just one of the above categories. Cyclic coordinate descent (CCD), the method on which the random coordinate descent algorithm introduced in this paper is based, is a mix of analytical and iterative methods. The loop is initialised starting from one of the anchor residues (the residues on either side of the loop). The anchor residue from the other side is then added to the end of the ‘open’ loop structure, so this residue is present twice: as the ‘fixed’ anchor residue (the true structure) and as the ‘mobile’ anchor residue (the one added to the loop structure). Then, starting from the end of the loop that is attached to the rest of the protein, the dihedral angles are changed sequentially to try to minimise the distance between the fixed and mobile anchor residues. For each angle, the change that would minimise this distance is calculated analytically. Once the distance is within a particular cut-off value, the loop is considered to be closed and this is taken as the final structure.
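To make the CCD update concrete, here is a minimal Python sketch. It is not the authors' code, and it works on a toy chain of 3D points rather than a real protein backbone: every bond is treated as rotatable (a real backbone has fixed ω angles), and names such as `ccd_close`, `optimal_angle` and `closure_error`, along with the cut-off and coordinates, are my own placeholders. The key line is the `atan2` in `optimal_angle`, which gives the rotation about one bond that minimises the summed squared distance between the mobile end atoms and the fixed anchor atoms.

```python
import math

def sub(a, b): return (a[0] - b[0], a[1] - b[1], a[2] - b[2])
def add(a, b): return (a[0] + b[0], a[1] + b[1], a[2] + b[2])
def scale(a, s): return (a[0] * s, a[1] * s, a[2] * s)
def dot(a, b): return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def rotate(p, origin, axis, theta):
    """Rodrigues rotation of point p about a unit axis passing through origin."""
    v = sub(p, origin)
    c, s = math.cos(theta), math.sin(theta)
    along = scale(axis, dot(axis, v) * (1.0 - c))
    return add(origin, add(add(scale(v, c), scale(cross(axis, v), s)), along))

def optimal_angle(mobile, fixed, origin, axis):
    """Angle about the axis that minimises sum |rotated mobile - fixed|^2."""
    b = c = 0.0
    for m, f in zip(mobile, fixed):
        v = sub(m, origin)
        perp = sub(v, scale(axis, dot(axis, v)))  # part of v that actually rotates
        t = sub(f, origin)
        b += dot(perp, t)
        c += dot(cross(axis, perp), t)
    # Maximising b*cos(theta) + c*sin(theta) minimises the squared distance.
    return math.atan2(c, b)

def closure_error(chain, anchor):
    """RMS distance between the last len(anchor) chain atoms and the fixed anchor."""
    mobile = chain[-len(anchor):]
    total = sum(dot(sub(m, f), sub(m, f)) for m, f in zip(mobile, anchor))
    return math.sqrt(total / len(anchor))

def ccd_close(chain, anchor, cutoff=0.1, max_sweeps=200):
    """Cyclic coordinate descent on a toy chain (cutoff is a placeholder value)."""
    chain = list(chain)
    n_end = len(anchor)
    for _ in range(max_sweeps):
        # Bond i joins atoms i and i+1; only bonds upstream of the mobile
        # end atoms are used, so every mobile atom moves with each rotation.
        for i in range(len(chain) - n_end - 1):
            origin = chain[i + 1]
            bond = sub(chain[i + 1], chain[i])
            axis = scale(bond, 1.0 / math.sqrt(dot(bond, bond)))
            theta = optimal_angle(chain[-n_end:], anchor, origin, axis)
            for j in range(i + 2, len(chain)):
                chain[j] = rotate(chain[j], origin, axis, theta)
            if closure_error(chain, anchor) < cutoff:
                return chain
    return chain

# Made-up coordinates: an 8-atom zigzag whose last 3 atoms should reach the anchor.
start = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (2.0, 1.4, 0.0), (3.5, 1.4, 0.5),
         (4.0, 2.8, 0.5), (5.5, 2.8, 1.0), (6.0, 4.2, 1.0), (7.5, 4.2, 1.5)]
anchor = [(4.0, 2.3, 2.0), (4.5, 3.7, 2.0), (6.0, 3.7, 2.5)]
closed = ccd_close(start, anchor)
print(closure_error(start, anchor), closure_error(closed, anchor))
```

Because each update is the analytical optimum for that one angle (and a rotation of zero is always available), the closure error can only stay the same or decrease from step to step. Whether it ever drops below the cut-off depends on the chain and the anchor, which is why the sketch also caps the number of sweeps.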

Random coordinate descent (RCD) is based upon CCD, but with a number of alterations and additions:

  1. Instead of iterating through each dihedral angle sequentially along the loop backbone, angles are chosen randomly.
  2. A spinor-matrix approach is used, which reduces loop closure times.
  3. Geometric filters are added at various points in the algorithm: before, during or after loop closure.
  4. ‘Switching’: if loop building fails, the direction of loop building is reversed. For example, if the structure is being grown from the N-anchor but doesn’t pass the filters, the loop is discarded and the next loop is grown from the C-anchor. This should mean that the favoured loop closure direction naturally dominates.
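Two of these changes, the random choice of angle and ‘switching’, are easy to sketch in a few lines of Python. This is only my illustration of the idea, not the paper's implementation: `next_angle`, `build_with_switching` and `toy_build` are hypothetical names, I have assumed the angle is drawn uniformly, and the success rates in the toy builder are invented.

```python
import random

def next_angle(n_angles, step, rng=None):
    """Pick which dihedral to update next.

    Without an rng the angles are visited in order (CCD-style); with one,
    an index is drawn at random (RCD-style, assumed uniform here).
    """
    if rng is None:
        return step % n_angles
    return rng.randrange(n_angles)

def build_with_switching(try_build, n_loops, start="N", max_attempts=10000):
    """Collect n_loops loops, reversing the build direction after each failure.

    try_build(direction) is a hypothetical callable that returns a loop, or
    None when the loop fails to close or is rejected by a filter.
    """
    direction = start
    accepted = []
    for _ in range(max_attempts):
        if len(accepted) == n_loops:
            break
        loop = try_build(direction)
        if loop is None:
            # Discard the loop and grow the next one from the other anchor.
            direction = "C" if direction == "N" else "N"
        else:
            accepted.append((direction, loop))
    return accepted

# Toy builder with invented success rates, purely for illustration.
rng = random.Random(1)

def toy_build(direction):
    p_success = 0.8 if direction == "C" else 0.2
    return "loop" if rng.random() < p_success else None

loops = build_with_switching(toy_build, 100)
from_c = sum(1 for direction, _ in loops if direction == "C")
print(from_c, "of", len(loops), "accepted loops were grown from the C-anchor")
```

Since the direction only changes after a failure, the builder spends most of its attempts on whichever direction fails least often, which is how the favoured direction ends up dominating.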

The different geometric filters are as follows:

  1. A grid clash filter, which checks for clashes between the loop residues and the rest of the protein structure
  2. A loop clash filter, which checks for internal clashes between loop residues
  3. An adaptive Ramachandran filter, which restrains the dihedral angles to the allowed regions of the Ramachandran plot.
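The two clash filters can be sketched as follows. This is a rough Python illustration under my own assumptions, not the authors' implementation: atoms are bare coordinate tuples, the cell size and minimum distance are placeholder values, and I have assumed that sequence-adjacent residues are skipped in the internal check. The grid version bins the protein atoms into cubic cells once, so each loop atom is only compared against atoms in nearby cells.

```python
import math
from collections import defaultdict

def cell_index(point, cell):
    """Integer grid cell containing a 3D point."""
    return tuple(math.floor(c / cell) for c in point)

def build_grid(protein_atoms, cell):
    """Bin the (fixed) protein atoms into cubic cells of side `cell`."""
    grid = defaultdict(list)
    for atom in protein_atoms:
        grid[cell_index(atom, cell)].append(atom)
    return grid

def grid_clash(loop_atoms, grid, cell, min_dist):
    """True if any loop atom is closer than min_dist to a protein atom."""
    reach = math.ceil(min_dist / cell)  # how many cells away a clash could be
    offsets = range(-reach, reach + 1)
    for p in loop_atoms:
        ix, iy, iz = cell_index(p, cell)
        for dx in offsets:
            for dy in offsets:
                for dz in offsets:
                    for q in grid.get((ix + dx, iy + dy, iz + dz), ()):
                        if math.dist(p, q) < min_dist:
                            return True
    return False

def loop_clash(loop_residues, min_dist, skip=1):
    """True if two loop residues more than `skip` apart in sequence clash.

    loop_residues is a list of residues, each a list of atom coordinates.
    """
    for i, res_a in enumerate(loop_residues):
        for res_b in loop_residues[i + skip + 1:]:
            if any(math.dist(p, q) < min_dist for p in res_a for q in res_b):
                return True
    return False

# Placeholder values: 3.0 cell size and 2.5 minimum distance are not from the paper.
protein = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0)]
grid = build_grid(protein, cell=3.0)
print(grid_clash([(1.0, 0.0, 0.0)], grid, cell=3.0, min_dist=2.5))
print(loop_clash([[(0.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)], [(5.0, 0.0, 0.0)]], min_dist=2.5))
```

The grid only needs building once per target protein, which is what would make it cheap enough to run inside the loop closure routine.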

The Ramachandran filter is a good idea, since loop closure can change the dihedral angles of a structure significantly, moving them into disallowed regions. φ (phi) angles are restricted to the range between -175˚ and -40˚, and ψ (psi) angles to between -60˚ and 175˚; this is basically the top left part of the Ramachandran plot. There are two exceptions: the φ angle of proline is fixed, and the dihedral angles of glycine residues are not restricted at all. When placed inside the loop closure routine, the filter is ‘adaptive’: if the calculated optimum angle is outside the allowed region, the filter calculates the maximum possible rotation that would still be allowed. When these angle changes become too small, however, the restriction is removed entirely and the angle is allowed to change freely.
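Here is how I picture the adaptive part, as a small Python sketch. The angle ranges are the ones quoted above; everything else is my own assumption rather than the paper's implementation: the function names, the `min_step` threshold of 1 degree, the per-step (rather than per-run) release of the restriction, and the requirement that the current angle already lies inside the allowed range.

```python
PHI_RANGE = (-175.0, -40.0)  # degrees, as quoted above
PSI_RANGE = (-60.0, 175.0)

def wrap(angle):
    """Map an angle in degrees onto [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0

def restrained_change(current, delta, lo, hi, min_step=1.0):
    """Return the rotation to apply, given the optimal rotation `delta`.

    Assumes lo <= current <= hi. min_step is a made-up threshold below
    which the restriction is dropped.
    """
    target = wrap(current + delta)
    if lo <= target <= hi:
        return delta  # the optimum is already allowed
    # Largest rotation in the same direction that stays inside the range.
    bound = hi if delta > 0 else lo
    allowed = bound - current
    if abs(allowed) < min_step:
        return delta  # too little room left: let the angle change freely
    return allowed

def filtered_change(residue, kind, current, delta, min_step=1.0):
    """Apply the exceptions for glycine and proline, then the restraint."""
    if residue == "GLY":
        return delta  # glycine is not restricted at all
    if residue == "PRO" and kind == "phi":
        return 0.0    # proline phi is fixed
    lo, hi = PHI_RANGE if kind == "phi" else PSI_RANGE
    return restrained_change(current, delta, lo, hi, min_step)

# An alanine phi at -60 degrees asking for +40 would end up at -20 (disallowed),
# so the rotation is cut back to stop at the -40 degree boundary.
print(filtered_change("ALA", "phi", -60.0, 40.0))
```

In a real closure loop the returned value would replace the analytically optimal rotation before the angle is updated.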

By testing different combinations of filters in different places, the authors decided upon a final RCD algorithm. This version includes the grid clash filter during loop closure, and the Ramachandran filter applied both before and during loop closure. They then compare their method to some other loop closure algorithms. It produces good results, outperforming all except a method called ‘direct tweak’, the only other method tested that includes clash detection during loop closure. From this, the authors conclude that clash detection during closure is a key factor in generating accurate loop conformations. They also report that RCD is 6 to 17 times faster than direct tweak.

Overall, then, the authors of this paper have introduced an accurate and fast loop closure algorithm which outperforms most other methods. Currently, my research is focussed upon developing a new antibody-specific ab initio loop modelling method, and some of the concepts used in this paper would definitely be worth investigating further. Watch this space!
