Can we make Boltz predict allosteric binding? | Oxford Protein Informatics Group

Orthosteric vs Allosteric binding (Nano Banana generated)

(This post is meant to shed light on the problem of making AI structure prediction models like Boltz better at allosteric binding. It is also an open call for collaboration on this problem.)

I recently took part in a Boltz hackathon organised by the MIT Jameel Clinic, where I worked on improving Boltz-2 predictions for allosteric binders. The validation dataset provided was from a recent paper, "Co-folding, the future of docking – prediction of allosteric and orthosteric ligands", which benchmarks some of the recent state-of-the-art AI structure prediction models on a curated set of allosteric and orthosteric binders. AI structure prediction models are generally trained mostly on orthosteric binding cases, which means that their performance on allosteric binding is significantly worse.

To improve its predictions, the Boltz-2 team uses steering, a correction mechanism that updates the denoised atom coordinates using various physics- and geometry-based potentials. To improve the performance of Boltz-2 on allosteric binding cases, I tried a variation of the ContactPotential in Boltz-2 called Negative Steering, which discourages the ligand from sitting too close to the primary active (orthosteric) site. I set a minimum (6 Å) and a maximum (10 Å) distance from the commonly predicted site (which in most cases is the active site). The ligand is encouraged to bind within this range, so that it stays away from the active site without drifting too far from it. If it binds outside this range, the potential penalises it. This was something I tried within the 15 hours of the hackathon, but I am keen to extend this work into a paper if possible. Here is a preliminary implementation that I tried during the hackathon and am working on improving: Negative Steering
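As a rough illustration of the 6 Å to 10 Å distance band described above, here is a minimal NumPy sketch of a flat-bottom penalty. It is not the hackathon implementation and does not use the Boltz code base: the function name `band_penalty`, the choice of measuring the distance from the ligand centroid to a single site centre, and the linear shape of the penalty are all assumptions made for illustration.

```python
import numpy as np


def band_penalty(ligand_xyz, site_center, d_min=6.0, d_max=10.0):
    """Flat-bottom penalty on the ligand-centroid-to-site distance (in Å).

    Hypothetical helper: the centroid-based distance and the linear
    penalty are illustrative assumptions, not the Boltz-2 potential.
    """
    centroid = np.asarray(ligand_xyz, dtype=float).mean(axis=0)
    d = float(np.linalg.norm(centroid - np.asarray(site_center, dtype=float)))
    # Zero inside [d_min, d_max]; grows linearly with the violation outside it.
    too_close = max(d_min - d, 0.0)
    too_far = max(d - d_max, 0.0)
    return too_close + too_far
```

A ligand whose centroid sits 4 Å from the site centre is 2 Å inside the minimum distance and gets a penalty of 2.0, while any pose between 6 Å and 10 Å is left unpenalised.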

The results (plot above) from using Negative Steering are not consistent. Compared to ContactPotential, it gives a slight improvement in median RMSD (between the predicted structure and the ground truth structure) in some cases, and very similar or worse performance in others. Also, as there are only 20 allosteric binding cases to test on, it is tough to come to a reliable conclusion.
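For readers unfamiliar with the metric, the median RMSD used above can be sketched as follows. This is a simplified version under stated assumptions: predicted and ground truth ligand atoms are in the same order and already in the same coordinate frame, and no symmetry correction is applied. The function names `ligand_rmsd` and `median_rmsd` are hypothetical.

```python
import numpy as np


def ligand_rmsd(pred_xyz, true_xyz):
    """RMSD in Å between matched atoms (same order, same frame assumed)."""
    pred = np.asarray(pred_xyz, dtype=float)
    true = np.asarray(true_xyz, dtype=float)
    if pred.shape != true.shape:
        raise ValueError("predicted and reference coordinates must match in shape")
    # Mean of squared per-atom displacements, then square root.
    return float(np.sqrt(((pred - true) ** 2).sum(axis=1).mean()))


def median_rmsd(pairs):
    """Median RMSD over an iterable of (predicted, reference) coordinate pairs."""
    return float(np.median([ligand_rmsd(p, t) for p, t in pairs]))
```

With only 20 complexes, the median can shift noticeably when a single prediction changes, which is one reason the comparison is hard to interpret.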

Given that allosteric binding is still an area most structure prediction models struggle with, I am wondering if we could fine-tune Negative Steering further, or implement another potential, so that it can be applied to the denoised atom coordinates from each structure prediction model benchmarked in the paper, namely Boltz, NeuralPLexer, and RoseTTAFold. This idea would not require retraining any of these models. Instead, the models would only be used to produce the initial denoised atom coordinates, which can then be physically corrected using steering potentials to better suit allosteric binders.
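To make the model-agnostic idea concrete, here is a toy sketch of a correction applied to ligand coordinates that could come from any model. It runs gradient descent on a quadratic version of the distance-band penalty, E = 0.5 * violation², measured on the ligand centroid. Because the gradient is identical for every ligand atom, each step is a rigid translation. Everything here is an assumption for illustration: the name `steer_ligand`, the step size, and the fact that the protein is ignored entirely (so clashes are not handled). In Boltz-2, steering acts during the diffusion process rather than as a post-hoc step like this.

```python
import numpy as np


def steer_ligand(ligand_xyz, site_center, d_min=6.0, d_max=10.0,
                 step=0.5, n_steps=50):
    """Translate the ligand until its centroid lies in the [d_min, d_max] shell.

    Hypothetical helper: a toy gradient-descent correction, not Boltz code.
    """
    xyz = np.array(ligand_xyz, dtype=float)  # copy, so the input is untouched
    center = np.asarray(site_center, dtype=float)
    for _ in range(n_steps):
        offset = xyz.mean(axis=0) - center
        d = np.linalg.norm(offset)
        # Positive when too close to the site, negative when too far away.
        violation = max(d_min - d, 0.0) - max(d - d_max, 0.0)
        if violation == 0.0:
            break  # already inside the allowed band
        # Arbitrary direction if the centroid coincides with the site centre.
        direction = offset / d if d > 1e-8 else np.array([1.0, 0.0, 0.0])
        # Negative gradient of 0.5 * violation**2 w.r.t. the centroid;
        # the 1/N per-atom factor is absorbed into the step size.
        xyz += step * violation * direction
    return xyz
```

A pose that starts 4 Å from the site centre is pushed outwards towards 6 Å, one that starts at 14 Å is pulled back towards 10 Å, and a pose already inside the band is returned unchanged.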

In my implementation, Negative Steering is only used on protein-ligand (PL) complexes that are already labelled as allosteric. Otherwise, ContactPotential (from the Boltz-2 paper) is used on the orthosteric complexes. We were given these labels (‘allosteric’ or ‘orthosteric’) in the validation set. Therefore, this steering-based method requires prior knowledge about whether a given complex would be involved in allosteric or orthosteric binding.
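The label-based switch described above amounts to a simple dispatch. The sketch below only illustrates that logic: the function name `choose_potential` and the returned strings are hypothetical placeholders, not identifiers from the Boltz code base.

```python
def choose_potential(binding_label: str) -> str:
    """Pick a steering potential from a known binding-mode label.

    Hypothetical helper: the returned names are placeholders.
    """
    label = binding_label.strip().lower()
    if label == "allosteric":
        return "negative_steering"
    if label == "orthosteric":
        return "contact_potential"
    # Without a label there is no principled way to choose.
    raise ValueError(f"unknown binding label: {binding_label!r}")
```

The explicit error for an unknown label reflects the limitation noted above: without prior knowledge of the binding mode, the method cannot decide which potential to apply.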

I believe it would be a very interesting research challenge to make such AI structure prediction models perform better on allosteric binding. This could be done either through architectural tricks that learn allosteric pathways, or through inference-time tricks that make the pretrained Boltz models work better for allosteric binding cases by building on the biophysics knowledge already encoded in their weights.

If you have any thoughts on this, please feel free to reach out to me 🙂

Author

Arun Raja
