In the last group meeting, I reported on the success of ligand-fitting programs for the automated solution of ligand structures.
In Fragment Screens by X-ray Crystallography, a library of small compounds (fragments) is soaked into protein crystals, and the resulting structures are determined by diffraction experiments. Some of the fragments will bind to the protein (~5% of the library), and these are detected by their appearance in the derived electron density.
The models of binding fragments can be used to guide structure-based drug-design efforts, but first they must be built. Due to the large number of datasets (200-1000), the automated identification of the fragments that bind, and the automated building of atomic models is required for efficient processing of the data.
Anecdotally, available ligand-fitting programs are unreliable when modelling fragments. We tested three ligand fitting programs in refitting a series of ligand structures. We found that they fail more frequently when the electron density for the ligand is weak. Many fragments that are seen to bind in screens do so only weakly, due to their size. So the weaker the fragment binds, the harder it will be for the automated programs to model.
Models are usually ranked by the Real-Space Correlation Coefficient (RSCC) between the model and the experimental electron density. This metric is good at identifying ‘correct’ models, and an RSCC > 0.7 normally indicates a correct, or at least mostly correct, model.
Typically, the binding locations of ligands are found by searching for un-modelled peaks in the electron density map. Models are then generated in these locations, and are then scored and ranked. Good models can be identified and presented to the user. However, if a ‘good’ model is not generated, to be scored and ranked, the RSCCs of the ‘bad’ models will not tell you that there is something to be modelled, at a particular place, and binding may be missed…
This is especially true for weak-binding ligands, which will not give a large electron density peak to give evidence that there is something there to be modelled.
Currently, all of the datasets must be inspected manually, to check that a weak-binding fragment has not been missed…