Category Archives: How To

How to do things. doh.

Python Handout

Many OPIGlets extensively use Jupyter (in either Notebook or Lab flavour) to prototype and present their work. However, as project progress frequently notebooks are converted into regular python files for a number of reasons, losing the notebook functionality.

Wouldn’t it be nice if we could combine some of the benefits of Jupyter notebooks (not least the ability to present both code & results naturally) with regular python files?

Enter Python Handout.

Python Handout was recently (5th August 2019) released by Danijar Hafner and allows Python scripts to be converted into handouts with Markdown comments and inline figures (see above picture).

Installation is via pip (pip3 install -U handout) and Python Handout supports python 3 scripts.

While I’ve not used Handout much (yet), I will definitely be experimenting more in the coming weeks.

Quick Python tricks

It’s always fun when you stumble across something in your programming toolkit that you had never noticed. Here are three things I’ve recently enjoyed learning.

  • Ternary syntax
    a = int(raw_input())
    is_even = True if a % 0 == 0 else False
  • Enumerate

I’ve been looping over the length of my list, all these years, like a chump. It turns out you can do this:

for index, item in enumerate(some_list):
    # now the index of each item is available as well as the item    
# Don't do do this
for index in range(len(some_list)):
    item = some_list[index]
  • for… else

Every so often, you really need to know that a for loop has run to completion. That’s what for…else is for!

for item in iterable:
if item % 0 == 0:
first_even_number = item
else:
raise ValueError('No even numbers')

Some useful tools

For my blog post this week, I thought I would share, as the title suggests, a small collection of tools and packages that I found to make my work a bit easier over the last few months (mainly python based). I might add to this list as I find new tools that I think deserve a shout-out.

Biopandas

Reading in .pdb files for processing and writing your own parser (while being a good exercise to familiarize yourself with the format) is a pain and clutters your code with boilerplate.

Luckily for us, Sebastian Raschka has written a neat package called biopandas [1] which enables quick I/O of .pdb files via the pandas DataFrame class.

Continue reading

Graph-based Methods for Cheminformatics

In cheminformatics, there are many possible ways to encode chemical data represented by small molecules and proteins, such as SMILES, fingerprints, chemical descriptors etc. Recently, utilising graph-based methods for machine learning have become more prominent. In this post, we will explore why representing molecules as graphs is a natural and suitable encoding. Continue reading

Turning MD Trajectories into Movies using PyMOL

Putting movies into your presentations is the perfect way to cover up a terrible underlying presentation help the audience visualise the systems you are discussing. Static protein movies can enhance an introduction or help users understand important interactions between proteins and ligands. PyMOL plugins, such as emovie.py, help you move beyond the ‘rock’ and ‘roll’ scenes in PyMOL’s movie tab. But there ends the scope for your static structures.

If you want to take your PyMOL movie making skills to the next level, you should start adding some dynamics data. This allows your audience to visualise how your protein dynamics evolve over time and a much easier way to explain your results (because, who likes 10,000 graphs in a presentation!? Even if your R plots look super swish.). For example: understanding binding events, PPIs over time or even loop motion.

The following tutorial shows you how to turn a static PDB structure into a dynamic one, by adding a GROMACS trajectory. Most of the commands you will encounter while making a static structure movie, so should not be too alien.

Continue reading

What can you do with the OPIG Antibody Suite?

OPIG has now developed a whole range of tools for antibody analysis. I thought it might be helpful to summarise all the different tools we are maintaining (some of which are brand new, and some are not hosted at opig.stats), and what they are useful for.

Immunoglobulin Gene Sequencing (Ig-Seq/NGS) Data Analysis

1. OAS
Link: http://antibodymap.org/
Required Input: N/A (Database)
Paper: http://www.jimmunol.org/content/201/8/2502

OAS (Observed Antibody Space) is a quality-filtered, consistently-annotated database of all of the publicly available next generation sequencing (NGS) data of antibodies. Here you can:

Continue reading