membrane proteins | Oxford Protein Informatics Group

Today I gave a talk on my research project when I joined the group. My research focuses on modeling of membrane proteins (MPs). Membrane proteins are the main class of drug targets and their mechanism of function is determined by their 3D structure. Almost 30% of the proteins in the sequenced genomes are membrane proteins. But only ~2% of the experimentally determined structures in the PDB are membranes. Therefore, computational methods have been introduced to deal with this limitation.

Homology modeling is one of the best performing computational methods which gives “accurate” models of proteins. Many homology modeling methods have been developed, with Modeller being one of the best known ones. But these methods have been tested and customised primarily on the soluble proteins. As we know there are main physical difference between the MPS and water soluble proteins. Therefor, to build a homology modeling pipeline for membrane proteins, we need a pipeline which in all its steps the unique environment of the membrane protein is taken into account.

Memoir is a tool for homology-based prediction of membrane protein structure (Figure below). As input memoir takes a target sequence and a template. First, using imembrane the lipid bilayer boundaries are detected on the template. Using this information MP-T, with its membrane specific substitution matrices, aligns the target and template. Then, Medeller is used to build the core model and finally FREAD, a fragment-based loop modeling, is used to fill in the missing loops.

Memoir Pipeline

Memoir methodology builds accurate models but potentially incomplete. Homology modeling often entails a trade-off between the level of accuracy and the level of coverage that can be achieved in predicted models. Therefore we aim to build Memoir 2.0, in which we increase coverage by modelling the missing structural information only if such prediction is sensible. Therefore, to complete the models in the best way we aim at:

1-Examine the best ways to maximise FREAD coverage, maintaining accuracy

2-Examine the best ab initio loop predictor for membrane proteins

Fread has two main parameters which contribute to its accuracy and coverage. The nature of the chosen database to look for a loop (i.e. membrane or soluble (mem/sol)) and the choice of the sequence identity (ESSS) cut-off:

ESSS >= 25: more accurate loop models are built (Hiacc)

ESSS > 0: more coverage is met but not necessary accurate models (Hicov)

To test the effect of these parameters on the prediction accuracy and coverage we chose to test set:

Mem_DS: 280 loops taken from MP X-ray structures.

Model_DS: 156 loops from homology models of MPs. The loop length in both test ranges from 4 to 17 residues

The comparisons on both dataset confirm that to achieve the highest accuracy and coverage the FREAD Pipeline should be performed as:

1. Hiacc-mem

2. Hicov-mem

3. Hiacc-sol

4. Hicov-sol

Memoir with the new FREAD Pipeline, called Memoir 2.0, achieves higher coverage in comparison to the original Memoir 1.0.

But there are still loops which are not modeled by FREAD Pipeline. These loops should be modeled using an ab initio method. To test the performance of soulable ab initio loop predictors on the membrane proteins we predicted the loops of the above testset sing six ab initio methods available for download: Loopy, LoopBuilder, Mechano, Rapper, Modeller and Plop.

Comparison between ab initio methods on membrane proteins

Comparisons in the image above shows that:

FREAD is more accurate but, doesn’t achieve complete coverage.

Greater coverage is achieved using ab initio predictors.

Mechano, LoopBuilder and Loopy are the best ab initio predictors.

We have selected Mechano for Memoir 2.0 because it:

provides higher coverage than Loopy whilst retaining a similar accuracy.

is faster than LoopBuilder (Mechano is ~30 min faster on loop length of 12)

is able to model terminals.

In memoir 2.0 the C and N terminals of up to 8 residues are built using Mechano. Then, Mechano decoy’s are ranked by their Dfire score , and accepted only if they have exited the membrane. This check improves the average RMSD up to 4Å on DS_280 terminals.

In conclusion, Memoir 2.0 provides higher coverage models while maintaining a reasonable accuracy level. Our comparison results showed that FREAD is significantly more accurate than the ab initio methods. But, greater coverage is achieved using ab initio predictors.Comparison oshows that the top ab initio predictors in terms of accuracy are Mechano, LoopBuilder and Loopy. Similar patterns were also present in the model dataset. We have selected Mechano as it provides higher coverage than Loopy whilst retaining a similar accuracy and is also much faster than LoopBuilder. Mechano also has the advantage that it is able to model terminals. Only loops smaller than 17 residues were considered for modelling since above this threshold the accuracy of predicted loops drops significantly.

Seb gave a talk at the Oxford Structural Genomics Consortium on Wednesday 9 Jan 2013. The talk mentioned the work of several other OPIG members. Below is the gist of it.

Homology modelling pipeline with several membrane-protein-specific steps. Input is the target protein’s sequence, output is the finished 3D model.

Fragment-based loop modelling pipeline for X-ray crystallography

Given an incomplete model of a protein, as well as the current electron density map, we apply our loop modelling method FREAD to fill in a gap with many decoy structures. These decoys are then scored using electron density quality measures computed by EDSTATS. This process can be iterated to arrive at a complete model.

Over the past five years the Oxford Protein Informatics Group has produced several pieces of software to model various aspects of membrane protein structure. iMembrane predicts how a given protein structure sits in the lipid bilayer. MP-T aligns a target protein’s sequence to an iMembrane-annotated template structure. MEDELLER produces an accurate core model of the target, based on this target-template alignment. FREAD then fills in the remaining gaps through fragment-based loop modelling. We have assembled all these pieces of software into a single pipeline, which will be released to the public shortly. In the future, further refinements will be added to account for errors in the core model, such as helix kinks and twists.

X-ray crystallography is the most prevalent way to obtain a protein’s 3D structure. In difficult cases, such as membrane proteins, often only low resolution data can be obtained from such experiments, making the subsequent computational steps to arrive at a complete 3D model that much harder. This usually involves tedious manual building of individual residues and much trial and error. In addition, some regions of the protein (such as disordered loops) simply are not represented by the electron density at all and it is difficult to distinguish these from areas that simply require a lot of work to build. To alleviate some of these problems, we are developing a scoring scheme to attach an absolute quality measure to each residue being built by our loop modelling method FREAD, with a view towards automating protein structure solution at low resolution. This work is being carried out in collaboration with Frank von Delft’s Protein Crystallography group at the Oxford Structural Genomics Consortium.

Oxford Protein Informatics Group

or "OPIG" to friends

Tag Archives: membrane proteins

Building accurate models of membrane protein structures

Talk: Membrane Protein 3D Structure Prediction & Loop Modelling in X-ray Crystallography