Author Archives: Lucy Vost

ISMB/ECCB conference feedback 

The ISMB/ECCB conference took place in Liverpool this year. So, a couple of OPIGlets took the train up north to attend this biennial joint conference. Here we give some general feedback on the conference and highlight some interesting talks and posters.

General feedback 

ISMB/ECCB is a 4.5-day conference, starting on the Sunday evening and running until Thursday evening. It is attended by around 2,500 people, mostly from academic groups around the world. With more than 20 different tracks running in parallel, it is a broad conference, so it is recommended to have a look at the schedule beforehand to avoid being overwhelmed. Each day there is one keynote, two poster sessions, and three blocks of talks. These talks are often given by PIs, but postdocs and PhD students also get the opportunity to present. There are also some smaller slots highlighting the posters being presented that day.

This year there was a very interesting line-up of Distinguished Keynote speakers. The conference was kicked off by John Jumper talking about AlphaFold2, with a focus on how the team tackled the various problems encountered in going from the initial AlphaFold model to AlphaFold2. On Monday, Prof. Amos Bairoch talked about biocuration and the importance and challenges of public databases. He discussed the FAIR principles (Findable, Accessible, Interoperable, and Reusable) for data management [1]. The next keynote was by Prof. James Zou on computational biology in the age of AI agents (more on this later). On Wednesday we had our own Prof. Charlotte Deane (woo!) talking about structure-based drug discovery, with a focus on the importance of baselines and benchmarking. The conference ended with a short interview with Prof. David Baker, followed by a talk from Prof. Fabian Theis on decoding cellular systems. He discussed CellFlow [2], an AI tool that predicts how perturbations such as drugs affect the cellular phenotype.

Continue reading

Conference feedback: Protein Society Annual Symposium

Recently, a couple of OPIG members had the opportunity to attend and present at the 39th Annual Symposium of the Protein Society—a not-for-profit scholarly society founded in 1985 that focuses on protein structure, function, and design—held in San Francisco.

The PS39 schedule was well designed, offering a balance between plenary talks, themed parallel sessions, and networking opportunities. A wide range of topics was covered, including transient protein states, supramolecular assemblies, proteostasis, and circadian clocks. This allowed us to follow areas of personal interest, both related and unrelated to our research, while exploring unfamiliar fields. Although many talks were biology-heavy, they were generally pitched at an accessible level for those from other disciplines (i.e., the small-molecules side of OPIG). Presentations almost always included results from both in silico and experimental approaches, with relatively few focusing exclusively on one or the other; a very nifty thing to see as people who mostly just dream of experimental validation! In contrast to our generalisable-model focus, many of the researchers presenting had dedicated years to studying a single protein or system, uncovering its nuances in a way that made for some neat storytelling.

Continue reading

Publishing 101

Scientists pride themselves on clear, logical and concise communication. So naturally, the process for publishing our research involves an absurd number of formalities, like coming up with 700 slightly different ways to ‘thank the reviewer for their insightful comment’. Nevertheless, I’m told this is all a necessary part of spreading your beautiful researcher butterfly wings—and frankly, I’m enough years into my DPhil to stop questioning every quirk of academia. However, the current protocol for new researchers wanting to learn the moves to this bizarre dance seems to be begging postdocs/old-timers for examples of cover letters, marked-up manuscripts, and reviewer responses. To attempt to save everyone some time, I thought I’d provide some guidance and templates here.

Continue reading

Baby’s First NeurIPS: A Survival Guide for Conference Newbies

There’s something very surreal about stepping into your first major machine learning conference: suddenly, all those GitHub usernames, paper authors, and protagonists of heated Twitter spats become real people, the hallways are buzzing with discussions of papers you’ve been meaning to read, and somehow there are 17,000 other people trying to navigate it all alongside you. That was my experience at NeurIPS this year, and despite feeling like a microplankton in an ocean of ML research, I had a grand time. While some of this success was pure luck, much of it came down to excellent advice from the group’s ML conference veterans and lessons learned through trial and error. So, before the details fade into a blur of posters and coffee breaks, here’s my guide to making the most of your first major ML conference.

Continue reading

The War of the Roses: Tea Edition

Picture the following: the year is 1923, and it’s a sunny afternoon at a posh garden party in Cambridge. Among the polite chatter, one Muriel Bristol—a psychologist studying the mechanisms by which algae acquire nutrients—mentions she has a preference for tea poured over milk, as opposed to milk poured over tea. In a classic example of women not being able to express even the most insignificant preference without an opinionated man telling them they’re wrong, Ronald A. Fisher, a local statistician (later turned eugenicist who dismissed the notion of smoking cigarettes being dangerous as ‘propaganda’, mind you) decides to put her claim to the test with an experiment. Bristol is given eight cups of tea and asked to classify them as milk first or tea first. Luckily, she correctly identifies all eight of them, and gets to happily continue about her life (presumably until the next time she dares mention a similarly outrageous and consequential opinion like a preferred toothpaste brand or a favourite method for filing papers). Fisher, on the other hand, is incentivized to develop Fisher’s exact test, a statistical significance test used in the analysis of contingency tables.
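For the curious, the p-value Fisher’s test assigns to Bristol’s perfect score is easy to reproduce. A minimal, stdlib-only sketch of the one-sided calculation (summing the hypergeometric probabilities that underlie Fisher’s exact test; the function name and defaults are my own, purely for illustration):

```python
from math import comb

def lady_tasting_tea_pvalue(n_cups=8, n_milk_first=4, n_correct=4):
    """One-sided Fisher's exact p-value for the tea-tasting experiment:
    the probability of correctly identifying at least n_correct of the
    milk-first cups purely by chance, given that exactly n_milk_first
    cups are labelled as milk-first."""
    n_tea_first = n_cups - n_milk_first
    total = comb(n_cups, n_milk_first)  # ways to choose which cups to call milk-first
    # Sum P(exactly k correct) for k = n_correct .. n_milk_first
    p = sum(
        comb(n_milk_first, k) * comb(n_tea_first, n_milk_first - k)
        for k in range(n_correct, n_milk_first + 1)
    ) / total
    return p

print(lady_tasting_tea_pvalue())  # 1/70 ≈ 0.0143
```

With eight cups there are C(8,4) = 70 ways to pick the four milk-first cups, only one of which is entirely correct, so a perfect score happens by chance with probability 1/70. The same number falls out of `scipy.stats.fisher_exact` on the corresponding 2×2 contingency table.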

Continue reading

How to make ML your entire personality

In our silly little day-to-day lives over in stats, we forget how accustomed we all are to AI being used in many of the things we do. Going home for the holidays, though, I was reminded that the majority of people (at least, the majority of my family members) don’t actually make most of their choices according to what a random, free AI tool suggests for them. Unfortunately, though, I do! Here are some of my favourite non-ChatGPT free tools I use to make sure everyone knows that working in ML is, in fact, my entire personality.

Continue reading

Conference feedback: AI in Chemistry 2023

Last month, a drift of OPIGlets attended the Royal Society of Chemistry’s annual AI in Chemistry conference. Co-organised by the group’s very own Garrett Morris and hosted in Churchill College, Cambridge, during a heatwave (!), the two-day conference covered applications of artificial intelligence and machine learning methods in chemistry. The programme included a mixture of keynote talks, panel discussions, oral presentations, flash presentations, posters, and opportunities for open debate, networking and discussion among participants from academia and industry alike.

Continue reading

BRICS Decomposition in 3D

Inspired by this blog post by the lovely Kate, I’ve been doing some BRICS decomposing of molecules myself. Like the structure-based goblin that I am, though, I’ve been applying it to 3D structures of molecules, rather than using the SMILES approach she detailed. I thought it might be helpful to share the code snippets I’ve been using for this: unsurprisingly, it can also be done with RDKit!

I’ll use the same example as in the original blog post, propranolol.

1DY4: CBH1 IN COMPLEX WITH S-PROPRANOLOL

First, I import RDKit and load the ligand in question:
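A minimal sketch of the idea (to keep it self-contained, I build propranolol’s 3D conformer from SMILES here rather than loading the ligand from the 1DY4 entry; the approach is the same once you have a molecule with a conformer):

```python
from rdkit import Chem
from rdkit.Chem import AllChem, BRICS

# Propranolol from SMILES; embed a 3D conformer so the fragments
# carry coordinates (in practice, load your ligand from the PDB/SDF).
mol = Chem.MolFromSmiles("CC(C)NCC(O)COc1cccc2ccccc12")
mol = Chem.AddHs(mol)
AllChem.EmbedMolecule(mol, randomSeed=42)

# Find the BRICS bonds, then fragment on them directly: unlike
# BRICS.BRICSDecompose on SMILES, this keeps each atom's 3D position.
brics_bonds = list(BRICS.FindBRICSBonds(mol))
bond_indices = [
    mol.GetBondBetweenAtoms(a, b).GetIdx() for (a, b), _ in brics_bonds
]
fragmented = Chem.FragmentOnBonds(mol, bond_indices)
fragments = Chem.GetMolFrags(fragmented, asMols=True)

for frag in fragments:
    print(Chem.MolToSmiles(Chem.RemoveHs(frag)))
```

Each fragment comes back as its own molecule with dummy atoms marking the broken BRICS bonds, and with the original conformer coordinates intact, so they can be written straight out to SDF for structure-based work.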

Continue reading

SnackGPT

One of the most treasured group meeting institutions in OPIG is snackage. Each week, one group member is asked to bring in treats for our sometimes lengthy 16:30 meeting. However, as would be the case for any group of 25 or so people, there are various dietary requirements and preferences that can make this snack-quisition a somewhat tricky process: from gluten allergies to a mild dislike of cucumber, they vary in importance, but nevertheless are all to be taken into account when the pre-meeting supermarket sweep is carried out.

So, the question on every researcher’s mind: can ChatGPT help me? Turns out, yes: I gave it a list of the group’s dietary requirements and preferences, and it gave me a handy little list of snacks I might be able to bring along…

When pushed further, it even provided me with an itemised list of the ingredients required. During this process it seemed to forget a couple of the allergies I’d mentioned earlier, but they were easy to spot; almost more worryingly, it suggested I get a beetroot and mint hummus (!) for the veggie platter:

I don’t know if I’ll actually be using many of these suggestions—judging by the chats I’ve had about the above list, I think bringing in a platter of veggies as the group meeting snack may get me physically removed from the premises—but ChatGPT has once again proven itself a handy tool for saving a few minutes of thinking!

Some ponderings on generalisability

Now that machine learning has managed to get its proverbial fingers into just about every pie, people have started to worry about the generalisability of methods used. There are a few reasons for these concerns, but a prominent one is that the pressure and publication biases that have led to reproducibility issues in the past are also very present in ML-based science.

The Center for Statistics and Machine Learning at Princeton University hosted a workshop last July highlighting the scale of this problem. Alongside this, they released a running list of papers highlighting reproducibility issues in ML-based science. Included on this list are 20 papers highlighting errors from 17 distinct fields, collectively affecting a whopping 329 papers.

Continue reading