I thought I would use this blog to summarise the recent Alchemistry Free Energy workshop in Göttingen, Germany. This event, organised by MPI BPC and BioExcel, brought together academics and industrialists who work with alchemical MM methods to calculate free energies. It was a very successful successor to a similar event organised two years ago in London and now looks set to be repeated yearly, alternating between Europe and Boston.
Category Archives: Small Molecules
Graph-based Methods for Cheminformatics
In cheminformatics, there are many possible ways to encode the chemical data of small molecules and proteins, such as SMILES, fingerprints and chemical descriptors. Recently, graph-based methods for machine learning have become more prominent. In this post, we will explore why representing molecules as graphs is a natural and suitable encoding.
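As a minimal sketch of the idea, a molecule can be stored as a graph whose nodes are atoms and whose edges are bonds. The example below uses only the Python standard library; ethanol as the example molecule and the helper names `build_adjacency`, `degree` and `neighbour_elements` are illustrative assumptions, not part of any cheminformatics toolkit.

```python
from collections import defaultdict

# Ethanol (SMILES: CCO), heavy atoms only.
# Nodes are atoms (index -> element), edges are bonds.
atoms = {0: "C", 1: "C", 2: "O"}
bonds = [(0, 1, 1), (1, 2, 1)]  # (atom index, atom index, bond order)


def build_adjacency(bond_list):
    """Turn a list of bonds into an undirected adjacency list."""
    adjacency = defaultdict(list)
    for a, b, order in bond_list:
        adjacency[a].append((b, order))
        adjacency[b].append((a, order))
    return dict(adjacency)


adjacency = build_adjacency(bonds)


def degree(atom_index):
    """Number of heavy-atom neighbours of an atom."""
    return len(adjacency.get(atom_index, []))


def neighbour_elements(atom_index):
    """Sorted element symbols of the atoms bonded to an atom."""
    return sorted(atoms[n] for n, _ in adjacency.get(atom_index, []))
```

Here the central carbon has two neighbours, one carbon and one oxygen, which is exactly the kind of local environment that graph-based learning methods aggregate over.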
Check My Blob
A brief overview and discussion of: Automatic recognition of ligands in electron density by machine learning. This paper aims to reduce the bias of crystallographers fitting ligands into electron density for protein-ligand complexes. The authors train a supervised machine learning model using known ligand sites across the whole Protein Data Bank to produce a classifier that can identify which common ligands could fit a given region of electron density.
So, you are interested in compound selectivity and machine learning papers?
At the last OPIG meeting, I gave a talk about compound selectivity and machine learning approaches to predicting whether a compound might be selective. As promised, I hereby provide a list of publications I would hand to a beginner in the field of compound selectivity and machine learning.
Mol2vec: Finding Chemical Meaning in 300 Dimensions

2D projections (t-SNE) of Mol2vec vectors of amino acids (bold arrows). These vectors were obtained by summing the vectors of the Morgan substructures (small arrows) present in the respective molecules (amino acids in the present example). The directions of the vectors provide a visual representation of similarities. Magnitudes reflect importance, i.e. more meaningful words. [Figure from Ref. 1]
A recent publication of one of my former InhibOx-colleagues, Simone Fulle, and her co-workers, Sabrina Jaeger and Samo Turk, shows how we can embed molecular substructures and chemical compounds into a similarly high-dimensional, continuous vectorial representation, which they dubbed “mol2vec” [1]. They also released a Python implementation, available on Samo Turk’s GitHub repository.
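The figure caption describes a molecule's vector as the sum of the vectors of its Morgan substructures. A toy sketch of that summation is given below. It does not use the mol2vec package: the 3-dimensional embeddings, the substructure identifiers and the function `molecule_vector` are all made up for illustration, and skipping unseen identifiers is a simplifying assumption of this sketch.

```python
# Hypothetical 3-dimensional embeddings keyed by made-up substructure
# identifiers (the vectors described in the post have 300 dimensions).
substructure_vectors = {
    "sub_a": [0.2, -0.1, 0.5],
    "sub_b": [0.0, 0.3, -0.2],
    "sub_c": [-0.4, 0.1, 0.1],
}


def molecule_vector(substructure_ids, vectors, dim=3):
    """Sum the vectors of all substructures present in a molecule.

    A substructure that occurs several times contributes several times.
    Identifiers missing from `vectors` are skipped (an assumption made
    here for simplicity).
    """
    total = [0.0] * dim
    for sub_id in substructure_ids:
        vec = vectors.get(sub_id)
        if vec is None:
            continue
        for i, value in enumerate(vec):
            total[i] += value
    return total


# A hypothetical molecule containing sub_a once and sub_b twice.
example = molecule_vector(["sub_a", "sub_b", "sub_b"], substructure_vectors)
```

Because the molecule vector is a plain sum, molecules that share many substructures end up close together in the embedding space.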
Seventh Joint Sheffield Conference on Cheminformatics Part 1 (#ShefChem16)
In early July I attended the Seventh Joint Sheffield Conference on Cheminformatics. There was a variety of talks, with speakers at all stages of their careers. I was lucky enough to be invited to speak at the conference, and gave my first conference talk! I have written two blog posts about the conference: part 1 briefly describes a talk that I found interesting and part 2 describes the work I spoke about at the conference.
One of the most interesting parts of the conference was the active Twitter presence under #ShefChem16. All of the talks were live-tweeted, which provided a summary of each talk and also included links to software or references. It also allowed speakers to gain insight and feedback on their talks instantly.
One of the talks I found most interesting presented the Protein-Ligand Interaction Profiler (PLIP), a method for the detection of protein-ligand interactions. PLIP is open-source and is available both as a web-based tool and as a command-line tool. Unlike PyMOL, which only calculates polar contacts and does not report the type of interaction, PLIP calculates 8 different types of interactions: hydrogen bonds, hydrophobic contacts, π-π stacking, π-cation interactions, salt bridges, water bridges, halogen bonds and metal complexes. For a given PDB file the interactions are calculated and displayed in a publication-quality figure, shown here.
The display can also be downloaded as a PyMOL session so that it can be modified.
This tool is an extremely useful way to calculate protein-ligand interactions and to identify the types of interactions formed in a protein-ligand complex.
PLIP can be found here: https://projects.biotec.tu-dresden.de/plip-web/plip/
Making small molecules look good in PyMOL
Another largely plagiarized post for my “personal notes” (thanks Justin Lorieau!), following on from the post about pretty-fication of macromolecules. For my slowly-progressing confirmation report I needed a beautiful small-molecule representation. Here is some PyMOL code:
show sticks
set ray_opaque_background, off
set stick_radius, 0.1
show spheres
set sphere_scale, 0.15, all
set sphere_scale, 0.12, elem H
color gray40, elem C
set sphere_quality, 30
set stick_quality, 30
set sphere_transparency, 0.0
set stick_transparency, 0.0
set ray_shadow, off
set orthoscopic, 1
set antialias, 2
ray 1024,768
And the result:
Beautiful, no?


