Category Archives: Protein Folding

Happy 10th Birthday, Blopig!

OPIG recently celebrated its 20th year, and on 10 January 2023 I gave a talk, just a day before the 10th anniversary of BLOPIG’s first blog post. It’s worth reflecting on what’s stayed the same and what’s changed since then.

Continue reading →

Retrieving AlphaFold models from AlphaFoldDB

There are now nearly a million AlphaFold [1] protein structure predictions openly available via AlphaFoldDB [2]. This represents a huge set of new data that can be used for the development of new methods. The options for downloading structures are either in bulk (sorted by genome), or individually from the webpage for a prediction.

If you want just a few hundred or a few thousand specific structures, across different genomes, neither of these options is particularly practical. For example, if you have the PDB [3] codes for several thousand experimental structures and you want to obtain the equivalent AlphaFold predictions, there is another way!

If we take the example of the PDB’s current molecule of the month, pyruvate kinase (PDB code 4FXF), this is how you can go about downloading the equivalent AlphaFold prediction programmatically.

Query UniProt [4] for the corresponding accession number – an example Python script is shown below:
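The original script is not reproduced in this excerpt. As a stand-in, here is a minimal sketch of the same idea, with a few loudly stated assumptions: instead of querying UniProt directly it uses the PDBe SIFTS mapping endpoint to go from PDB code to UniProt accession, it assumes the JSON response is keyed by the lower-case PDB code with a "UniProt" sub-dictionary, and it assumes the AlphaFoldDB file naming pattern `AF-<accession>-F1-model_v<version>.pdb`. The function names are hypothetical, and the model version number has changed between database releases, so check the current one before relying on the default.

```python
import json
import urllib.request

# Assumed endpoints and file-name pattern; verify against the current
# PDBe and AlphaFoldDB documentation before use.
SIFTS_URL = "https://www.ebi.ac.uk/pdbe/api/mappings/uniprot/{pdb_id}"
AFDB_URL = "https://alphafold.ebi.ac.uk/files/AF-{accession}-F1-model_v{version}.pdb"


def accessions_from_sifts(payload, pdb_id):
    """Extract UniProt accessions from an (assumed) SIFTS-style JSON payload."""
    entry = payload.get(pdb_id.lower(), {})
    return sorted(entry.get("UniProt", {}))


def alphafold_url(accession, version=4):
    """Build the download URL for a prediction (version is an assumption)."""
    return AFDB_URL.format(accession=accession, version=version)


def fetch_accessions(pdb_id):
    """Look up the UniProt accessions mapped to a PDB code (network call)."""
    url = SIFTS_URL.format(pdb_id=pdb_id.lower())
    with urllib.request.urlopen(url, timeout=30) as response:
        return accessions_from_sifts(json.load(response), pdb_id)


def download_model(accession, path, version=4):
    """Save the AlphaFold prediction for one accession to a local file."""
    urllib.request.urlretrieve(alphafold_url(accession, version), path)
```

For the pyruvate kinase example, calling `fetch_accessions("4FXF")` and then `download_model(acc, f"AF-{acc}.pdb")` for each returned accession should fetch the corresponding predictions, provided the assumptions above hold. Keeping the parsing and URL-building steps separate from the network calls makes them easy to check offline.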

Continue reading →

Monoclonal antibody PRN100 therapy for Creutzfeldt–Jakob disease

Recently, University College London Hospitals (UCLH) received a “Specials License” to allow the treatment of six patients suffering from Creutzfeldt–Jakob Disease (CJD), by way of a novel antibody known as PRN100. The results of this treatment have now been published in The Lancet.

There is currently no cure for CJD, yet over 100 people per year develop it either spontaneously or through external means including (but not limited to) growth hormones, cataract surgery or infected neurosurgical implements [1]. “There is no UK legislation which implements a compassionate use programme as set out in Article 83 of the relevant EU regulation. But the UK has implemented an exemption process known as the ‘Specials’ in light of the requirement to be able to deal with special needs.” [2]

As there is no known cure, the request to use PRN100 was put before the court because, in law, “Some treatment decisions are so serious that the court has to make them.”

Continue reading →

Unraveling the role of entanglement in protein misfolding

Proteins that fail to fold correctly may populate misfolded conformations with disparate structure and function. Misfolding is the focus of intense research interest due to its putative and confirmed role in various diseases, including neurodegenerative diseases such as Parkinson’s and Alzheimer’s diseases, as well as cystic fibrosis (PMID: 16689923).

Many open questions about protein misfolding remain to be answered. For example, how do misfolded proteins evade cellular quality control mechanisms like chaperones to remain soluble but non-functional for long timescales? How long do misfolded states persist on average? How widespread is misfolding? Experiments indicate that misfolding can even be caused by synonymous mutations that alter the speed of protein translation but not the sequence of the protein produced (PMID: 23417067), introducing the additional puzzle of how the protein maintains a “memory” of its translation kinetics after synthesis is complete.

A series of four recent preprints (Preprints 1, 2, 3, and 4, see below) suggests that these questions can be answered by the partitioning of proteins into long-lived self-entangled conformations that are structurally similar to the native state but with perturbed function. Simulation of the synthesis, termination, and post-translational dynamics of a large dataset of E. coli proteins suggests that misfolding and entanglement are widespread, with two thirds of proteins misfolding some of the time (Preprint 1). Many misfolded conformations may bypass proteostasis machinery to remain soluble but non-functional due to their structural similarity to the native state. Critically, entanglement is associated with particularly long-lived misfolded states based on simulated folding kinetics.

Coarse-grained and all-atom simulation results indicate that these misfolded conformations interact with chaperones like GroEL and HtpG to a similar extent as does the native state (Preprint 2). These results suggest an explanation for why some protein always fails to refold while remaining soluble, even in the presence of multiple folding chaperones – it remains trapped in entangled conformations that resemble the native state and thus fail to recruit chaperones.

Finally, simulations indicate that changes to the translation kinetics of oligoribonuclease introduced by synonymous mutations cause a large change in its probability of entanglement at the dimerization interface (Preprint 3). These entanglements localized at the interface alter its ability to dimerize even after synthesis is complete. These simulations provide a structural explanation for how translation kinetics can have a long-timescale influence on protein behavior.

Together, these preprints suggest that misfolding into entangled conformations is a widespread phenomenon that may provide a consistent explanation for many unanswered questions in molecular biology. It should be noted that entanglement is not mutually exclusive with other types of misfolding, such as domain swapping, that may contribute to misfolding in cells. Experimental validation of the existence of entangled conformations is a critical aspect of testing this hypothesis; for comparisons between simulation and experiment, see Preprint 4.

Preprint 1: https://www.biorxiv.org/content/10.1101/2021.08.18.456613v1

Preprint 2: https://www.biorxiv.org/content/10.1101/2021.08.18.456736v1

Preprint 3: https://www.biorxiv.org/content/10.1101/2021.10.26.465867v1

Preprint 4: https://www.biorxiv.org/content/10.1101/2021.08.18.456802v1

How fast can a protein fold?

A protein’s folding time is the time required for it to reach its unique folded state starting from its unfolded ensemble. Globular, cytosolic proteins can only attain their intended biological function once they have folded. This means that protein folding times, which typically exceed the timescales of enzymatic reactions that proteins carry out by several orders of magnitude, are critical to determining when proteins become functional. Many scientists have worked tirelessly over the years to measure protein folding times, determine their theoretical bounds, and understand how they fit into biology. Here, I focus on one of the more interesting questions to fall out of this field over the years: how fast can a protein fold? Note that this is a very different question than asking “how fast do proteins fold?”

Continue reading →

Ribosome occupancy profiles are conserved between structurally and evolutionarily related yeast domains

A shameless plug: OPIG blog readers may want to take a look at our recent publication in Bioinformatics. Consider giving it a read if the summary below grabs your attention.

Many proteins are now known to fold during their synthesis through the process known as co-translational folding. Translation is an inherently non-equilibrium process – one consequence of this fact is that the speed of translation can radically influence the ability of proteins to fold and function. In this paper we compare ribosome occupancy profiles between related domains in yeast to test the hypothesis that evolutionarily related proteins with similar native folds should tend to have similar translation speed profiles to preserve efficient co-translational folding. We find strong evidence in support of this hypothesis at the level of individual protein domains and across a set of 664 pairs of related domains for which we are able to compute high-quality ribosome occupancy profiles.

To find out more, view the Advance Article at Bioinformatics.

An in vivo force sensor reveals varied mechanisms of co-translational force generation

This blog post comments on the results published by Fujiwara and co-workers in the 2020 Cell Reports article “Proteome-wide capture of co-translational protein dynamics in Bacillus subtilis using TnDR, a transposable protein-dynamics reporter.”

The study of mechanical force generation and its influence on biological systems has expanded in recent years. In the realm of nascent protein folding, we now know that both unstructured and folded nascent proteins generate forces on the order of piconewtons that propagate down the nascent chain. These forces can distort the functional site of the ribosome and may influence the rate of translation (PMIDs: 30824598, 29577725). It has also been shown that translational arrest can be relieved by mechanical force (PMID: 25908824). Much study has focused on so-called arrest peptides, short peptide sequences that interact so strongly with the ribosome exit tunnel that they can completely stall translation (e.g., SecM, MifM).

Continue reading →

Curious About the Origins of Computerized Molecules? Free Webinar Dec 22…

After the stunning announcement at CASP14 that DeepMind’s AlphaFold 2 had successfully predicted the structures of proteins from their sequence alone, it’s hard to believe we began this journey by representing molecules with punched cards…

Image of a punched card, showing 80 columns and 12 rows, with particular rectangular holes representing the 1 bits of binary numbers. The upper right corner is cut at an angle, to facilitate feeding the card into a punched card reader. The column numbers are printed along the bottom. The words “IBM UNITED KINGDOM LIMITED” are printed along the very bottom. This card is line 12 from a Fortran program, “12 PIFRA=(A(JB,37)-A(JB,99))/A(JB,47) PUX 0430”. Image Credit: Pete Birkinshaw, Manchester, U.K. CC BY 2.0

Tales of carrying stacks of punched cards to the computer centre, with a line drawn diagonally on the side of the stack to help put them back in order should you trip and fall, seem like another universe, but this is what passed for the human-computer interface in much of the mid-20th century.

Continue reading →

CASP14: what Google DeepMind’s AlphaFold 2 really achieved, and what it means for protein folding, biology and bioinformatics

Disclaimer: this post is an opinion piece based on the experience and opinions derived from attending the CASP14 conference as a doctoral student researching protein modelling. When provided, quotes have been extracted from my notes of the event, and while I hope to have captured them as accurately as possible, I cannot guarantee that they are a word-by-word facsimile of what the individuals said. Neither the Oxford Protein Informatics Group nor I accept any responsibility for the content of this post.

You might have heard it from the scientific or regular press, perhaps even from DeepMind’s own blog. Google’s AlphaFold 2 indisputably won the 14th Critical Assessment of Structure Prediction competition, a biennial blind test where computational biologists try to predict the structures of several proteins that have been determined experimentally but not yet publicly released. Their results are so incredibly accurate that many have hailed this code as the solution to the long-standing protein structure prediction problem.

Continue reading →

Electrostatic interactions govern extreme nascent protein ejection times from ribosomes and can delay ribosome recycling

Finishing up a lingering project from your PhD almost a year into your postdoc is a great feeling, especially when it has actually been about 3 years in the making.

Though somewhat outside of the usual scope of activities in OPIG, I encourage you to take a look if the below summary grabs your interest. The full paper and supporting materials (including some movies which took entirely too long to make) can be found at https://pubs.acs.org/doi/abs/10.1021/jacs.9b12264.

Continue reading →