Category Archives: Python

PyMOL: colour by residue

PyMOL is a handy, free way of viewing three-dimensional protein structures. It allows you to toggle between different representations of the protein – such as cartoon, surface, sticks, etc. – which all have their own pros and cons.

However, one thing I felt PyMOL lacked was an easy way to visually distinguish residues by type. Whilst you can easily differentiate atom types by colour in the colour menu, and even choose which colour carbons show up as whilst keeping heteroatoms in different colours, this assigned carbon colour is constant throughout the entire protein.
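As a rough illustration of the idea (not the code from the full post), the sketch below builds one PyMOL selection per residue type and recolours only the carbon atoms in it. The grouping of residues, the colour choices and the function names are assumptions made for this example; `cmd.color` and the `resn`/`elem` selection keywords are part of PyMOL.

```python
# Hypothetical grouping of the 20 standard residues; the colours are illustrative.
RESIDUE_GROUPS = {
    "acidic": (("ASP", "GLU"), "red"),
    "basic": (("ARG", "HIS", "LYS"), "blue"),
    "polar": (("ASN", "CYS", "GLN", "SER", "THR", "TYR"), "green"),
    "hydrophobic": (
        ("ALA", "GLY", "ILE", "LEU", "MET", "PHE", "PRO", "TRP", "VAL"),
        "yellow",
    ),
}


def carbon_colour_rules(groups=RESIDUE_GROUPS):
    """Return (colour, selection) pairs that target only carbon atoms."""
    rules = []
    for residues, colour in groups.values():
        # e.g. "resn ASP+GLU and elem C" selects carbons in Asp and Glu residues
        selection = "resn {} and elem C".format("+".join(residues))
        rules.append((colour, selection))
    return rules


def colour_by_residue_type(groups=RESIDUE_GROUPS):
    """Apply the rules inside a running PyMOL session."""
    # Imported lazily so the rules can be built and inspected without PyMOL.
    from pymol import cmd

    for colour, selection in carbon_colour_rules(groups):
        cmd.color(colour, selection)
```

Inside PyMOL, calling `colour_by_residue_type()` after loading a structure would leave heteroatoms in their existing colours while the carbons vary by residue type.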

Continue reading

PyMOL: colouring proteins by property

We all love pretty, colourful pictures of proteins. There is quite a variety of programs to produce publication-quality images of proteins, some of the most popular being VMD, PyMOL and Chimera. Each has advantages and disadvantages: for example, VMD is particularly good at dealing with molecular dynamics simulations (perhaps that’s why it is called “Visual Molecular Dynamics”?), and Chimera is able to produce breathtaking graphics with very little user input. In my work, however, I tend to use PyMOL: its Python interface is incredibly helpful for producing quick analyses.

Continue reading

K-Means clustering made simple

The 21st century is often referred to as the age of “Big Data” due to the unprecedented increase in the volumes of data being generated. As most of this data comes without labels, making sense of it is a non-trivial task. To gain insight from unlabelled data, unsupervised machine learning algorithms have been developed and continue to be refined. These algorithms determine underlying relationships within the data by grouping data points into cluster families. The resulting clusters not only highlight associations within the data, but they are also critical for creating predictive models for new data.
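To make the grouping step concrete, here is a minimal pure-Python sketch of the standard K-means iteration (Lloyd's algorithm): assign each point to its nearest centroid, then move each centroid to the mean of its members. It is not taken from the full post; the function names, the toy points and the fixed random seed are assumptions for illustration.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def k_means(points, k, n_iter=100, seed=0):
    """Cluster points into k groups; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialise from k distinct data points
    labels = []
    for _ in range(n_iter):
        # Assignment step: label each point with the index of its nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            # Keep the old centroid if a cluster ends up empty.
            new_centroids.append(mean_point(members) if members else centroids[j])
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    return centroids, labels


# Toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = k_means(points, k=2)
```

On real data the result depends on the initial centroids, so it is common to run the algorithm from several random starts and keep the best clustering.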

Continue reading

Storing variables in Jupyter Notebooks using %store magic

We’ve all been there. You’ve just run an expensive computation in your Jupyter Notebook and are about to draw those conclusions which will prove that your theories were right all along (until you find the sixteen bugs in your code which render them invalid, but that’s an issue for a different time). Then, at the critical moment, your flatmate begins streaming their Lord of the Rings marathon in 4K and your already temperamental Wi-Fi severs your connection to the department servers in protest, crashing your Jupyter Notebook and leaving your hopes and dreams in tatters.
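For reference, the magic named in the title is used inside IPython or Jupyter as `%store results` to save a variable and `%store -r results` to restore it in a later session. Magics only run inside IPython, so the sketch below is a plain-Python stand-in that checkpoints a variable with `pickle`. It is an analogue for illustration, not a description of how `%store` works internally; `save_variable`, `load_variable`, the example data and the file name are all hypothetical.

```python
import pickle
import tempfile
from pathlib import Path


def save_variable(value, path):
    """Write a picklable Python object to disk."""
    with open(path, "wb") as handle:
        pickle.dump(value, handle)


def load_variable(path):
    """Read back an object written by save_variable."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


# Hypothetical result of an expensive computation.
results = {"scores": [1.2, 0.8, 2.5], "n_samples": 3}

# A temporary directory keeps the example self-contained; in practice the
# checkpoint would live somewhere that survives a crashed session.
with tempfile.TemporaryDirectory() as tmp:
    checkpoint = Path(tmp) / "results.pkl"
    save_variable(results, checkpoint)
    restored = load_variable(checkpoint)
```

Either way, the point is the same: save the expensive result as soon as it exists, so a dropped connection costs a reload rather than a rerun.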

Continue reading

Transforming Parliament – Training and deploying speech generation transformers for parliamentary speakers

Introduction

I recently wanted to explore areas of machine learning that I do not usually interact with as part of my DPhil research on antibody drug discovery. This post explores how to train and deploy a speech generation model for parliamentary speeches in the style of Jeremy Corbyn and Boris Johnson. You can play around with the resulting model at https://con-schneider.github.io/theytalktoyou.html.

Continue reading

How to be a Bayesian – ft. a completely ridiculous example

Most of the stats we are exposed to in our formative years as statisticians are viewed through a frequentist lens. Bayesian methods are often viewed with scepticism, perhaps due in part to a lack of understanding of how to specify our prior distribution and perhaps due to uncertainty as to what we should do with the posterior once we’ve got it.
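As a small worked illustration of both steps (choosing a prior, then using the posterior), here is the textbook Beta-Binomial update. This is not the example from the full post; the uniform Beta(1, 1) prior, the counts and the function names are assumptions chosen to keep the arithmetic simple.

```python
def beta_binomial_update(alpha, beta, successes, failures):
    """Posterior Beta parameters after observing binomial data.

    With a Beta(alpha, beta) prior on a success probability, the posterior
    is Beta(alpha + successes, beta + failures).
    """
    return alpha + successes, beta + failures


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)


prior = (1.0, 1.0)  # uniform prior: every success probability equally plausible
posterior = beta_binomial_update(*prior, successes=7, failures=3)  # hypothetical data

prior_mean = beta_mean(*prior)          # 1 / 2
posterior_mean = beta_mean(*posterior)  # 8 / 12
```

The posterior here is Beta(8, 4): the prior contributes one pseudo-success and one pseudo-failure, and the posterior mean of 8/12 sits between the prior mean of 1/2 and the observed frequency of 7/10.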

Continue reading

OpenMM – easy to learn, highly flexible molecular dynamics in Python

When I came to OPIG this past March I realized I had a novel opportunity – there was no one to tell me which molecular dynamics (MD) program I had to use! Usually, researchers do not have much choice in the matter due to a number of practical concerns. Conflicts between input and output file formats, forces, velocities, and basically everything else between MD suites make having multiple programs flying around tenuous at best if you want group members to be able to help one another. After weighing my options, I settled on OpenMM – and so far I am very happy with the decision.

Continue reading

Python Handout

Many OPIGlets use Jupyter extensively (in either Notebook or Lab flavour) to prototype and present their work. However, as projects progress, notebooks are frequently converted into regular Python files for a number of reasons, losing the notebook functionality.

Wouldn’t it be nice if we could combine some of the benefits of Jupyter notebooks (not least the ability to present both code & results naturally) with regular Python files?

Enter Python Handout.

Python Handout was recently (5th August 2019) released by Danijar Hafner and allows Python scripts to be converted into handouts with Markdown comments and inline figures (see above picture).

Installation is via pip (pip3 install -U handout), and Python Handout supports Python 3 scripts.

While I’ve not used Handout much (yet), I will definitely be experimenting more in the coming weeks.