Tag Archives: deep learning

AlphaGeometry: are computers taking over math?

Last week, Google DeepMind announced AlphaGeometry, a novel deep learning system that is able to solve geometry problems of the kind presented at the International Mathematics Olympiad (IMO). The work is described in a recent Nature paper, and is accompanied by a GitHub repo including full code and weights.

This paper has caused quite a stir in some circles. Well, at least the kind of circles that you tend to get in close contact with when you work at a Department of Statistics. Like folks in structural in biology wondered three years ago, those who earn a living by veering into the mathematical void and crafting proofs, were wondering if their jobs may also have a close-by expiration date. I found this quite interesting, so I decided to read the paper and try to understand it — and, to motivate myself, I set to present this paper at an upcoming journal club, and also write this blog post.

So, let’s ask, what has actually been achieved and how powerful is this model?

What has been achieved

The image that has been making the rounds this time is the following benchmark:

Continue reading

What the heck are TPUs?

I recently became curious about TPUs, a specialised hardware for training Machine- and Deep-Learning models, where TPU stands for Tensor Processing Unit. This fancy chip can provide very high gains for anyone aiming to perform really massive parallelisation of AI tasks such as training, fine-tuning, and inference.

In this blog post, I will touch on what a TPU is, why it could be useful for AI applications when compared to GPUs and briefly discuss associated opportunity costs.

What’s a TPU?

Continue reading

Graphormer: Merging GNNs and Transformers for Cheminformatics

This is my first OPIG blog! I’m going to start with a summary of the Graphormer, a Graph Neural Network (GNN) that borrows concepts from Transformers to boost performance on graph tasks. This post is largely based on the NeurIPS paper Do Transformers Really Perform Bad for Graph Representation? by Ying et. al., which introduces the Graphormer, and which we read for our last deep learning journal club. The project has now been integrated as a Microsoft Research project.

I’ll start with a cheap and cheerful summary of Transformers and GNNs before diving into the changes in the Graphormer. Enjoy!

Continue reading

Universal graph pooling for GNNs

Graph neural networks (GNNs) have quickly become one of the most important tools in computational chemistry and molecular machine learning. GNNs are a type of deep learning architecture designed for the adaptive extraction of vectorial features directly from graph-shaped input data, such as low-level molecular graphs. The feature-extraction mechanism of most modern GNNs can be decomposed into two phases:

  • Message-passing: In this phase the node feature vectors of the graph are iteratively updated following a trainable local neighbourhood-aggregation scheme often referred to as message-passing. Each iteration delivers a set of updated node feature vectors which is then imagined to form a new “layer” on top of all the previous sets of node feature vectors.
  • Global graph pooling: After a sufficient number of layers has been computed, the updated node feature vectors are used to generate a single vectorial representation of the entire graph. This step is known as global graph readout or global graph pooling. Usually only the top layer (i.e. the final set of updated node feature vectors) is used for global graph pooling, but variations of this are possible that involve all computed graph layers and even the set of initial node feature vectors. Commonly employed global graph pooling strategies include taking the sum or the average of the node features in the top graph layer.

While a lot of research attention has been focused on designing novel and more powerful message-passing schemes for GNNs, the global graph pooling step has often been treated with relative neglect. As mentioned in my previous post on the issues of GNNs, I believe this to be problematic. Naive global pooling methods (such as simply summing up all final node feature vectors) can potentially form dangerous information bottlenecks within the neural graph learning pipeline. In the worst case, such information bottlenecks pose the risk of largely cancelling out the information signal delivered by the message-passing step, no matter how sophisticated the message-passing scheme.

Continue reading

Einops: Powerful library for tensor operations in deep learning

Tobias and I recently gave a talk at the OPIG retreat on tips for using PyTorch. For this we created a tutorial on Google Colab notebook (link can be found here). I remember rambling about the advantages of implementing your own models against using other peoples code. Well If I convinced you, einops is for you!!

Basically, einops lets you perform operations on tensors using the Einstein Notation. This package comes with a number of advantages a few of which I will try and summarise here:

Continue reading

3 Key Questions to Think About When Designing Proteins Computationally

We have reached the era of design, not just ‘hunting’. Particularly exciting to me is the de novo design of proteins, which have a wide and ever increasing range of applications from therapeutics to consumer products, biomanufacturing to biomaterials. Protein design has been a) enabled by decades of research that contributed to our understanding of protein sequence, structure & function and b) accelerated by computational advances – capturing the information we have learned from proteins and representing it for computers and machine learning algorithms.

In this blog post, I will discuss three key methodological considerations for computational protein design:

  1. Sequence- vs structure-based design
  2. ML- vs physics-based design
  3. Target-agnostic vs target-aware design
Continue reading

Issues with graph neural networks: the cracks are where the light shines through

Deep convolutional neural networks have lead to astonishing breakthroughs in the area of computer vision in recent years. The reason for the extraordinary performance of convolutional architectures in the image domain is their strong ability to extract informative high-level features from visual data. For prediction tasks on images, this has lead to superhuman performance in a variety of applications and to an almost universal shift from classical feature engineering to differentiable feature learning.

Unfortunately, the picture is not quite as rosy yet in the area of molecular machine learning. Feature learning techniques which operate directly on raw molecular graphs without intermediate feature-engineering steps have only emerged in the last few years in the form of graph neural networks (GNNs). GNNs, however, still have not managed to definitively outcompete and replace more classical non-differentiable molecular representation methods such as extended-connectivity fingerprints (ECFPs). There is an increasing awareness in the computational chemistry community that GNNs have not quite lived up to the initial hype and still suffer from a number of technical limitations.

Continue reading

AlphaFold 2 is here: what’s behind the structure prediction miracle

Nature has now released that AlphaFold 2 paper, after eight long months of waiting. The main text reports more or less what we have known for nearly a year, with some added tidbits, although it is accompanied by a painstaking description of the architecture in the supplementary information. Perhaps more importantly, the authors have released the entirety of the code, including all details to run the pipeline, on Github. And there is no small print this time: you can run inference on any protein (I’ve checked!).

Have you not heard the news? Let me refresh your memory. In November 2020, a team of AI scientists from Google DeepMind  indisputably won the 14th Critical Assessment of Structural Prediction competition, a biennial blind test where computational biologists try to predict the structure of several proteins whose structure has been determined experimentally but not publicly released. Their results were so astounding, and the problem so central to biology, that it took the entire world by surprise and left an entire discipline, computational biology, wondering what had just happened.

Continue reading