Author Archives: Fergus Boyles

Using Singularity on Windows with WSL2

Previously on this blog, my colleagues Carlos and Eoin have extolled the many virtues of Singularity, which I will not repeat here. Instead, I’d like to talk about a rather interesting subject that was unexpectedly thrust upon me when my faithful Linux laptop started to show the early warning signs of critical existence failure: is there a good way to run a Singularity container on a pure Windows machine? It turns out that, with version 2 of the Windows Subsystem for Linux (WSL), there is.

Hosting multiple Flask apps using Apache/mod_wsgi

A common way of deploying a Flask web application in a production environment is to use an Apache server with the mod_wsgi module, which allows Apache to host any application that supports Python’s Web Server Gateway Interface (WSGI), making it quick and easy to get an application up and running. In this post, we’ll go through configuring your Apache server to host multiple Python apps in a stable manner, including how to run apps in daemon mode and how to avoid hanging processes caused by Python C extensions that do not work well with Python sub-interpreters (I’m looking at you, numpy).
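
To make the WSGI part concrete, here is a minimal sketch of the kind of object mod_wsgi hosts: a callable that takes the request environment and a start_response function. It uses only the standard library rather than Flask, and the greeting it returns is purely illustrative. By default, mod_wsgi looks for a module-level object called application.

```python
# Minimal WSGI callable using only the standard library.
# The name `application` is what mod_wsgi looks for by default.

def application(environ, start_response):
    """Return a plain-text greeting that echoes the requested path."""
    path = environ.get("PATH_INFO", "/")
    body = f"Hello from {path}\n".encode("utf-8")
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    # WSGI expects an iterable of byte strings as the response body.
    return [body]
```

For a Flask app, the WSGI script usually just imports the Flask object under that name, for example `from myapp import app as application`, where myapp is a hypothetical module name.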

Doing rigid receptor docking? Consider using multiple structures!

Here it is. It’s finally happening. I’m actually writing a blog post about docking. Are the end times upon us? Perhaps. If by my next post I’m not back to my usual techie self, the horsemen may well be on their way.

If you’ve ever used, read about, or listened to a lab mate complain about protein-ligand docking, you’re likely familiar with the rigid receptor assumption. In this model, the active site of the protein is treated as completely rigid, with no side chain flexibility, and only the rotatable bonds in the ligand are allowed to move. The motivation behind this assumption is simple. The computational cost of sampling the conformational space of a ligand within a protein’s active site, and doing so with sufficient rigour so as to sample a near-native binding mode, grows rapidly with the number of rotatable bonds in the ligand. Further increasing the degrees of freedom in the system by incorporating receptor side chain flexibility compounds this problem, making the sampling of accurate binding modes for the ligand an incredibly expensive and difficult problem.

One compromise, if multiple structures with different active site conformations are available for the target protein, is to simply dock your ligands into each of them, and trust your scoring function (!!!) to pick out the best binding mode from across the different structures. This is a crude approximation to true flexible receptor docking that won’t fully capture induced fit effects due to a particular ligand, but if the structures are available, it may offer a more computationally feasible alternative to flexible docking.

A study earlier this year by Cleves and Jain illustrates this approach nicely. They dock the ligands of the DUD-E database into multiple structures for each target, in each case treating the receptor structure as completely rigid. Unsurprisingly, when the target is rigid and there is little structural variation in the active site across the structures, the choice of structure has little influence on the docking results. However, when the receptor is flexible, with clear structural variation across the active sites in the different structures, there is a strong impact on the poses generated by rigid-receptor docking. This effect translates directly into improved virtual screening performance when docking into multiple different structures, illustrating the value of considering the conformational space of the receptor, even when it is treated as rigid during the docking process.
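
The "dock into several structures and let the scoring function choose" idea can be sketched in a few lines. This is only an illustration of the bookkeeping, not the protocol used by Cleves and Jain: the ligand and structure names and the scores are made up, and it assumes a scoring function where lower is better.

```python
from typing import Dict, List, NamedTuple


class Pose(NamedTuple):
    structure: str  # which receptor structure the ligand was docked into
    score: float    # docking score; ASSUMPTION: lower means better


def best_pose_per_ligand(results: Dict[str, List[Pose]]) -> Dict[str, Pose]:
    """For each ligand, keep the best-scoring pose across all structures."""
    return {
        ligand: min(poses, key=lambda pose: pose.score)
        for ligand, poses in results.items()
        if poses  # skip ligands with no docked poses
    }


# Made-up scores, for illustration only.
results = {
    "ligand_A": [Pose("structure_1", -7.2), Pose("structure_2", -9.1)],
    "ligand_B": [Pose("structure_1", -6.5), Pose("structure_2", -5.8)],
}

best = best_pose_per_ligand(results)
for ligand, pose in best.items():
    print(ligand, pose.structure, pose.score)
```

Note that everything here rests on the scores being comparable across structures, which is exactly the bit you are trusting your scoring function to get right.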

Editors for remote development

The ongoing COVID-19 situation has forced us all to dramatically rethink how we work, with many industries struggling to adjust their on-site procedures to ensure the safety of workers, and many more adapting to support much of their workforce in working from home. As a largely computational research group, we are incredibly fortunate in our ability to carry out most of our work remotely, and our department’s wonderful IT and administrative support staff have enabled a smooth transition to remote working.

Streamlining SSH for remote work

With the university now working remotely, and our group working entirely on Linux systems, I figured that now would be a good time to share some useful SSH commands to streamline remote access. This is far from an exhaustive list, but will hopefully serve as a useful starting point for anybody who finds themself needing to work remotely on a Linux system.

Bringing practical bioinformatics to high school classrooms

Back in July a litter of OPIGlets went rooting for interesting science at ISMB/ECCB 2019 in Basel, Switzerland. When not presenting, working on my sunburn, or paying nine Francs for a beer, I made a point to attend talks outside my usual bubble of machine learning and drug discovery. In particular, I spent the latter half of the conference in the Education track, and am very glad I did. I love teaching, and am always excited to learn from more experienced educators and trainers. Today I’m going to talk about a fantastic presentation by Stevie Bain from the University of Edinburgh about introducing practical bioinformatics to high school biology classrooms through the 4273pi project.

Why you should care about type hints in Python

Duck typing is great. Knowing that as long as my object does what the function expects it to, I can pass it to the function and get my results without having to worry about exactly what else my object might do. Coming from statically-typed languages such as Java and C++, this is incredibly liberating, and makes it easy to rapidly prototype complex and expressive code without worrying about checking types everywhere. This expressiveness, however, comes with a cost: type errors are only caught at runtime, and can be hard to debug if the original author didn’t document what that one variable in that one function signature is expected to look like.
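
As a small illustration of that trade-off, here is a sketch of a function with type hints. The function name mean_length and the example sequences are made up for this example. The hints do nothing at runtime, so the bad call still only fails once the code runs, but they document what the argument should look like, and a static checker such as mypy can flag the bad call without running anything.

```python
from typing import List


def mean_length(sequences: List[str]) -> float:
    """Return the mean length of a list of sequences."""
    if not sequences:
        raise ValueError("sequences must not be empty")
    return sum(len(s) for s in sequences) / len(sequences)


print(mean_length(["MKV", "ACDEFG"]))

# Python does not enforce the hints, so passing the wrong type
# is only caught when the function body tries to iterate over it.
try:
    mean_length(7)
except TypeError as error:
    print(f"Caught at runtime: {error}")
```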

docopt for dummies

Parsing command line arguments is an annoying piece of boilerplate we all have to do. Documenting our code is either an absolutely essential part of software engineering, or a frivolous waste of research time, depending on who you ask. But what if I told you that we can combine the two? That you can handle your argument parsing simply by documenting how your code works? Well, the dream is now reality.

Working with Jupyter notebook on a remote server

To celebrate the recent beta release of Jupyter Lab (try it out if you haven’t already), today we’re going to look at how to run a Jupyter session (Notebook or Lab) on a remote server.

Suppose you have lots of data which lives on a remote server and you want to play with it in a Jupyter notebook. You can’t copy the data to your local machine (well, you can, but you’re sensible so you won’t), but you can run your Jupyter session on the remote server. There’s just one problem – since Jupyter notebook is browser-based and works by connecting to the Jupyter session running locally, you can’t just run Jupyter remotely and forward X11 like you would a traditional graphical IDE. Fortunately, the solution is simple: we run Jupyter remotely, create an ssh tunnel connecting a local port to the one used by the Jupyter session, and connect directly to the Jupyter session using our local browser. The best part about this is that you can set up the Jupyter session once then connect to it from any browser on any machine once an ssh tunnel is created, without worrying about X11 forwarding.

Here’s how to do it.

1. First, connect to the remote server if you haven’t already:

ssh fergus@funkyserver

1.5. Jupyter takes browser security very seriously, so in order to access a remote session from a local browser we need to set up a password associated with the remote Jupyter session. Jupyter’s configuration lives in ~/.jupyter by default. You can set a hashed password manually in jupyter_notebook_config.py, but the easiest option is to run Jupyter with the password argument, which stores the hashed password in jupyter_notebook_config.json in the same directory:

jupyter notebook password
>>> Enter password:

This password will be used to access any Jupyter session running from this installation, so pick something sensible. You can set a new password at any time on the remote server in exactly the same way.

2. Launch a Jupyter session on the remote server. You can specify the access port using the --port option. This might be useful on a shared server where others might be doing the same thing. You’ll also want to run this without launching a browser on the remote server since this is of no use to you.

jupyter lab --port=9000 --no-browser &

Here I’m using Jupyter Lab, but this works in exactly the same way for Jupyter Notebook.

3. Now for the fun part. Jupyter is running on our remote server, but what we really want is to work in our favourite browser on our local machine. To do this we just need to create an ssh tunnel between a port on our machine and the port our Jupyter session is using on the remote server. On our local machine:

ssh -N -f -L 8888:localhost:9000 fergus@funkyserver

For those not familiar with ssh tunneling, we’ve just created a secure, encrypted connection between port 8888 on our local machine and port 9000 on our remote server.

  • -N tells ssh we won’t be running any remote processes using the connection. This is useful for situations like this where all we want to do is port forwarding.
  • -f runs ssh in the background, so we don’t need to keep a terminal session running just for the tunnel.
  • -L specifies that we’ll be forwarding a local port to a remote address and port. In this case, we’re forwarding port 8888 on our machine to port 9000 on the remote server. The name ‘localhost’ just means ‘this computer’; in the command above it is resolved by the remote server, so it points at the server itself. If you’re a Java programmer who lives for verbosity, you could equivalently pass -L localhost:8888:localhost:9000.
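
If you find yourself opening tunnels to different ports or servers, the pieces of that command can be assembled programmatically. The helper below is a hypothetical convenience function (tunnel_command is not part of any library) that just builds the argument list for the ssh command shown above.

```python
from typing import List


def tunnel_command(local_port: int, remote_port: int, login: str) -> List[str]:
    """Build the ssh arguments for a background, forward-only tunnel."""
    # -L takes local_port:host:remote_port, where host is resolved
    # on the remote side (so "localhost" means the server itself).
    forward = f"{local_port}:localhost:{remote_port}"
    return ["ssh", "-N", "-f", "-L", forward, login]


command = tunnel_command(8888, 9000, "fergus@funkyserver")
print(" ".join(command))
```

Passing the resulting list to subprocess.run(command, check=True) would open the tunnel, assuming ssh is installed and you can log in to the server.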

4. If you’ve done everything correctly, you should now be able to access your Jupyter session via port 8888 on your machine. Fire up your favourite browser and type localhost:8888 into the address bar. This should bring up a Jupyter session and prompt you for a password. Enter the password you specified for Jupyter on the remote server.
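
If the page doesn’t load, it helps to know whether anything is listening on the local end of the tunnel at all. The snippet below is a small sketch for checking that; port_is_open is a made-up helper name, and the default port of 8888 matches the tunnel above. It only tells you that something accepts connections on that port, not that it is Jupyter.

```python
import socket


def port_is_open(host: str = "localhost", port: int = 8888,
                 timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False


if __name__ == "__main__":
    if port_is_open():
        print("Something is listening on localhost:8888")
    else:
        print("Nothing on localhost:8888 - is the ssh tunnel running?")
```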

Congratulations! You now have a Jupyter session running remotely which you can connect to anytime, anywhere, from any machine.

Disclaimer: I haven’t tried this on Windows, nor do I intend to. I value my sanity.