Category Archives: Protein Structure

The evolution of contact prediction – a new paper

I’m so pleased to be able to write about our paper, “The evolution of contact prediction: evidence that contact selection in statistical contact prediction is changing” (Bioinformatics, btz816). Contact prediction – the prediction of which parts of the amino-acid chain are close together in the folded structure – has been critical to improving scientists’ ability to predict protein structures over the last decade. Here we look at the properties of these predictions, and what they might mean for their use.
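To make “close together” concrete, here is a minimal sketch of how contacts are often read off a known structure. It is an illustration only, not the method from the paper: the 8 Å distance cutoff, the minimum sequence separation of 3 residues, the function name `contact_pairs`, and the toy coordinates are all assumptions chosen for this example.

```python
from math import dist

def contact_pairs(coords, cutoff=8.0, min_separation=3):
    """Return residue index pairs (i, j) that count as contacts.

    coords: one (x, y, z) point per residue, in angstroms
            (for example, C-alpha or C-beta positions).
    cutoff and min_separation are illustrative assumptions,
    not values taken from the paper.
    """
    pairs = []
    for i in range(len(coords)):
        # Skip near neighbours in sequence: they are always close in space.
        for j in range(i + min_separation, len(coords)):
            if dist(coords[i], coords[j]) < cutoff:
                pairs.append((i, j))
    return pairs

# Toy six-residue hairpin with 3.8 A spacing between consecutive residues.
toy_chain = [
    (0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0),
    (7.6, 3.8, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0),
]
print(contact_pairs(toy_chain))  # [(0, 4), (0, 5), (1, 4), (1, 5)]
```

A contact predictor tries to recover pairs like these from a sequence alignment alone, without access to the coordinates.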

The paper begins with a question. If contact prediction methods are based on statistical properties of sequence alignments, and those alignments are generated in the presence of ecological and physical constraints, what effect do the physical constraints have on the statistical properties of real sequence alignments? More concisely: when we predict contacts, do we predict particularly important contacts?


What are Hotspots in Structural Biology?

“Hotspot” is one of those extremely versatile words, similar to “model” and “buffer”, which can mean a variety of things depending on context. According to Merriam-Webster, a hotspot is “a place of more than usual interest, activity, or popularity”. This is the most general definition of the concept I could find in a quick search, and the one I find closest in spirit to the way hotspots are perceived in a structural biology context. What this blog post is definitely not about are hotspots as “areas of political, military, or civil unrest” (my experience with them has so far been mostly peaceful), or anything to do with geology, WiFi connections, or forest fires.
However, even within the context of structural biology and structure-based drug design, the word “hotspot” has multiple meanings. In this blog post, I will try to summarise the main ones I have come across, the (sometimes subtle) differences between them, and provide a few useful papers to serve as an entry point for interested readers.

Two Tools for Systematically Compiling Ensembles of Protein Structures

In order to know how a protein works, we generally want to know its three-dimensional structure. We can then either try to solve it ourselves (which requires considerable time, skill, and resources), or look for it in the Protein Data Bank (PDB), in case it has already been solved. The vast majority of structures in the PDB are solved through protein crystallography, and each represents a single “snapshot” of the conformational space available to our protein of interest.

What is the hydrophobic-polar (HP) model?

Proteins are fascinating. They are ubiquitous in living organisms, carrying out all kinds of functions: from structural support to unbelievably powerful catalysis. And yet, despite their ubiquity, we are still bemused by their functioning, not to mention by how they came to be. As computational scientists, our research at OPIG is mostly about modelling proteins in different forms. We are a very heterogeneous group that leverages approaches of diverse scale: from modelling proteins as nodes in a complex interaction network, to full atomistic models that help us understand how they behave.


More Fun With 3D Printing

Recently the students of the Systems Approaches to Biomedical Science Centre for Doctoral Training took a 2-week module on our favourite subject: structural biology! As part of this, they were given the option to create their very own 3D printed model of a protein.

This year we had some great models created, some of which are shown in the picture above. The proteins are (clockwise from top left):

  • Clathrin (PDB 1XI4) – a really interesting protein that forms cages around vesicles inside the cell. This one was mine; I wrote about clathrin as part of my undergraduate dissertation many years ago…
  • GTPase (PDB 1YZN) – a protein that binds and hydrolyses guanosine triphosphate (GTP) and is involved in membrane trafficking.
  • TAL effector (PDB 3UGM) – this bacterial protein binds to specific regions of DNA in a host plant to activate the expression of plant genes that aid bacterial infection. The DNA here is in blue, the orange wrapped around it is the protein.
  • Mechanotransduction ion channel (PDB 5VKQ) – converts mechanical stimuli into electrical signals in specialized sensory cells.
  • ATP synthase – this protein machine builds most of the energy storage molecule ATP, which powers our cellular processes.
  • DNA (PDB 5F9I) – a double-helix strand of DNA, 20 base pairs long.