Category Archives: Publications

Interesting Antibody Papers.

Below are several antibody papers that should be of interest to those dealing with antibody engineering, be it computational or experimental. The running motif in this post will be humanization, or the process of engineering a mouse antibody sequence which binds to a target to look ‘more human’ so as to reduce the immune response (if you need an early citation on this issue, here it is).

We present two papers which talk about antibody humanization directly, one from structural point of view (Choi et al. 2015), the other one highlighting issues facing antibody engineers  mining for information (Martin & Rees, 2016). The third paper (Collins et al. 2015) takes a step back from the issues presented in the other papers and talks broadly about the nature of mouse sequences raised in the lab.

Humanization via structural means [here] (Bailey-Kellogg group). The authors introduce a novel methodology named CoDAH to facilitate humanization of antibodies. They design an approach which makes a tradeoff between sequence and structural humanization scores. The sequence score used is the Human String Content (Laza et al. 2007, Mol Immunol), which calculates how similar the query (murine) sequence is to short stretches of human sequences (mostly germilne). In line with the fact that T-Cells are one of main drivers of anti-biologics immunity, they define the sequences stretches to be 9-mer, as recognized by T-Cells. For the structural score, they use Rotameric energy as calculated by Amber. They demonstrate that constructs designed using their score express and retain affinity towards the target antigen, however they do not appear to prove that the new sequences are not immunogenic.

Extracting data from databases for humanization [here] (Martin group and Rees consulting). The main purpose of this manuscript is to warn potential antibody engineers of the pitfalls of species mis-annotations. They point out that in a routine ‘humanization’ pipeline where we aim to find human sequences given a mouse sequence, a great number of seemingly good ‘human’ templates are not human at all (sources as diverse as IMGT or PDB). This might lead to errors down the line if the engineer does not double check the annotations (unfortunate but true). Many of such annotations arise because the cells in which mouse antibodies are expressed are human cells or because the sequences are chimeric — in either case the annotation would not read mouse or chimeric, but erroneously ‘human’. NB. Another thing to watch in this publication is the fact that authors are working on a sequence database of their own: EMBLIG which is said to collect data from EMBL-ENA (nucleotide repository from EMBL). Hopefully in their database, authors will address the issues that they point out here.

What can we say about antibodies produces by laboratory mice? [here] (Collins group). Authors of this manuscript have addressed the issue that the now available High Throughput Sequencing (HTS) overlooked mouse repertoires. Different mice strains have different susceptibilities to diseases (Houpt, 2002, J Imunol; which might mean that you need to think twice which mice strain to choose for a given target). Currently known antibody repertoire of mice is based on the sequencing of two strains, BALB/c and C57BL/6. Here the authors apply HTS to two strains (BALB/c and C57BL/6) of laboratory mice (eight mice per strain) to get a better snapshot of antibody gene usage. Specifically, they pay close attention to the different genes combinations (VDJ) in the sequences that they obtain. Authors conclude that the repertoires between the two strains are strikingly different and quite restricted — which might mean that the laboratory mice were under very specific pressures (read inbred/overbred). All in all, the VDJ usage numbers that they produce in this publication are a useful reference to know which sequence combinations might be used by antibody engineers.

Interesting Antibody Papers

De Novo H3 prediction by C-terminal kink-biasing (Gray Lab) [here].

Authors introduce an improvement to the prediction of CDR-H3 in the form of a constraint for de-novo decoy generation. Working from the observation that 80% of CDR-H3 have kinked C-Terminal (Weitzner et al., 2015, Structure), they bias the loops to assume this conformation (they prove that it does not force ALL loops to do so!). The constraint is in the form of a pseudo bond angle between Ca for the three C-terminal residues and a pseudo dihedral angle for the three C-terminal residues and one adjacent residue in the framework. The bias takes the form of a penalty score if the generated angle falls outside mean +/- 1s. They use a quite stringent H3 loop benchmark of only 49 loops. Using this constraint on this dataset improves prediction for majority of the loops. They also demonstrate the utility of the score for full Fv homology modeling and Ab-Ag docking.

Therapeutic vs synthetic vs natural antibodies (Ofran Lab) [here].

The authors analyzed 137 Ab-Ag complexes from the PDB. Those from hybridoma and synthetic libraries were classified as ‘Natural’ and those coming from ‘synthetic’ libraries. They demonstrate that synthetic libraries overuse H3 in the number of contacts the antibody forms with the antigen, whereas natural constructs share the paratope with H1& H2 to a larger extent. This, together with their tool, CDRs analyzer (analysis of structural & biochemical properties of ab-ag complex) can be a useful method to inform the design of antibodies.

From the past: TABHU, tools for antibody humanization (Tramontano Lab) [here]. Authors have created a tool to aid antibody humanization. Given a sequence of an antibody, the system would look for the most suitable template from their extensive sequence databases (DIGIT) and germline sequences from IMGT. The templates are assessed on sequence similarity to the query and the similarity of the ‘binding’ mode which is assessed by their paratope prediction tool proABC. After the template had been chosen, the user can produce a structural model of the sequence.

Network Pharmacology

The dominant paradigm in drug discovery has been one of finding small molecules (or more recently, biologics) that bind selectively to one target of therapeutic interest. This reductionist approach conveniently ignores the fact that many drugs do, in fact, bind to multiple targets. Indeed, systems biology is uncovering an unsettling picture for comfortable reductionists: the so-called ‘magic bullet’ of Paul Ehrlich, a single compound that binds to a single target, may be less effective than a compound with multiple targets. This new approach—network pharmacology—offers new ways to improve drug efficacy, to rescue orphan drugs, re-purpose existing drugs, predict targets, and predict side-effects.

Building on work Stuart Armstrong and I did at InhibOx, a spinout from the University of Oxford’s Chemistry Department, and inspired by the work of Shoichet et al. (2007), Álvaro Cortes-Cabrera and I took our ElectroShape method, designed for ultra-fast ligand-based virtual screening (Armstrong et al., 2010 & 2011), and built a new way of exploring the relationships between drug targets (Cortes-Cabrera et al., 2013). Ligand-based virtual screening is predicated on the molecular similarity principle: similar chemical compounds have similar properties (see, e.g., Johnson & Maggiora, 1990). ElectroShape built on the earlier pioneering USR (Ultra-fast Shape Recognition) work of Pedro Ballester and Prof. W. Graham Richards at Oxford (Ballester & Richards, 2007).

Our new approach addressed two Inherent limitations of the network pharmacology approaches available at the time:

  • Chemical similarity is calculated on the basis of the chemical topology of the small molecule; and
  • Structural information about the macromolecular target is neglected.

Our method addressed these issues by taking into account 3D information from both the ligand and the target.

The approach involved comparing the similarity of each set ligands known to bind to a protein, to the equivalent sets of ligands of all other known drug targets in DrugBank, DrugBank is a tremendous “bioinformatics and cheminformatics resource that combines detailed drug (i.e. chemical, pharmacological and pharmaceutical) data with comprehensive drug target (i.e. sequence, structure, and pathway) information.” This analysis generated a network of related proteins, connected by the similarity of the sets of ligands known to bind to them.

2013.ElectroShapePolypharmacologyServerWe looked at two different kinds of ligand similarity metrics, the inverse Manhattan distance of our ElectroShape descriptor, and compared them to 2D Morgan fingerprints, calculated using the wonderful open source cheminformatics toolkit, RDKit from Greg Landrum. Morgan fingerprints use connectivity information similar to that used for the well known ECFP family of fingerprints, which had been used in the SEA method of Keiser et al. We also looked at the problem from the receptor side, comparing the active sites of the proteins. These complementary approaches produced networks that shared a minimal fraction (0.36% to 6.80%) of nodes: while the direct comparison of target ligand-binding sites could give valuable information in order to achieve some kind of target specificity, ligand-based networks may contribute information about unexpected interactions for side-effect prediction and polypharmacological profile optimization.

Our new target-fishing approach was able to predict drug adverse effects, build polypharmacology profiles, and relate targets from two complementary viewpoints:
ligand-based, and target-based networks. We used the DUD and WOMBAT benchmark sets for on-target validation, and the results were directly comparable to those obtained using other state-of-the-art target-fishing approaches. Off-target validation was performed using a limited set of non-annotated secondary targets for already known drugs. Comparison of the predicted adverse effects with data contained in the SIDER 2 database showed good specificity and reasonable selectivity. All of these features were implemented in a user-friendly web interface that: (i) can be queried for both polypharmacology profiles and adverse effects, (ii) links to related targets in ChEMBLdb in the three networks (2D, 4D ligand and 3D receptor), and (iii) displays the 2D structure of already annotated drugs.

2013.ElectroShapePolypharmacologyServer.Screenshot

References

Armstrong, M. S., G. M. Morris, P. W. Finn, R. Sharma, L. Moretti, R. I. Cooper and W. G. Richards (2010). “ElectroShape: fast molecular similarity calculations incorporating shape, chirality and electrostatics.” J Comput Aided Mol Des, 24(9): 789-801. 10.1007/s10822-010-9374-0.

Armstrong, M. S., P. W. Finn, G. M. Morris and W. G. Richards (2011). “Improving the accuracy of ultrafast ligand-based screening: incorporating lipophilicity into ElectroShape as an extra dimension.” J Comput Aided Mol Des, 25(8): 785-790. 10.1007/s10822-011-9463-8.

Ballester, P. J. and W. G. Richards (2007). “Ultrafast shape recognition to search compound databases for similar molecular shapes.” J Comput Chem, 28(10): 1711-1723. 10.1002/jcc.20681.

Cortes-Cabrera, A., G. M. Morris, P. W. Finn, A. Morreale and F. Gago (2013). “Comparison of ultra-fast 2D and 3D ligand and target descriptors for side effect prediction and network analysis in polypharmacology.” Br J Pharmacol, 170(3): 557-567. 10.1111/bph.12294.

Johnson, A. M., & G. M. Maggiora (1990). “Concepts and Applications of Molecular Similarity.” New York: John Willey & Sons.

Landrum, G. (2011). “RDKit: Open-source cheminformatics.” from http://www.rdkit.org.

Keiser, M. J., B. L. Roth, B. N. Armbruster, P. Ernsberger, J. J. Irwin and B. K. Shoichet (2007). “Relating protein pharmacology by ligand chemistry.” Nat Biotechnol, 25(2): 197-206. 10.1038/nbt1284.

Wishart, D. S., C. Knox, A. C. Guo, S. Shrivastava, M. Hassanali, P. Stothard, Z. Chang and J. Woolsey (2006). “DrugBank: a comprehensive resource for in silico drug discovery and exploration.” Nucleic Acids Res, 34(Database issue): D668-672. 10.1093/nar/gkj067.

Ten Simple Rules for a Successful Cross-Disciplinary Collaboration

The name of our research group (Oxford Protein Informatics Group) already indicates its cross-disciplinary character. In doing research of this type, we can acquire a lot of experience in working across the boundaries of research fields. Recently, one of our group members, Bernhard Knapp, became lead author of an article about guidelines for cross-disciplinary research. This article describes ten simple rules which you should consider if working across several disciplines. They include going to the other lab in person, understanding different rewards models, having patience with the pace of other disciplines, and recognising the importance of synergy.

The ten rules article was even picked up by a journalist of the “Times Higher Education” and further discussed in the newspaper.

Happy further interdisciplinary work!

 

SAS-5 assists in building centrioles of nematode worms Caenorhabditis elegans

We have recently published a paper in eLife describing the structural basis for the role of protein SAS-5 in initiating the formation of a new centriole, called a daughter centriole. But why do we care and why is this discovery important?

We, as humans – a branch of multi-cellular organisms, are in constant demand of new cells in our bodies. We need them to grow from an early embryo to adult, and also to replace dead or damaged cells. Cells don’t just appear from nowhere but undergo a tightly controlled process called cell cycle. At the core of cell cycle lies segregation of duplicated genetic material into two daughter cells. Pairs of chromosomes need to be pulled apart millions of millions times a day. Errors will lead to cancer. To avoid this apocalyptic scenario, evolution supplied us with centrioles. Those large molecular machines sprout microtubules radially to form characteristic asters which then bind to individual chromosomes and pull them apart. In order to achieve continuity, centrioles duplicate once per cell cycle.

Similarly to many large macromolecular assemblies, centrioles exhibit symmetry. A few unique proteins come in multiple copies to build this gigantic cylindrical molecular structure: 250 nm wide and 500 nm long (the size of a centriole in humans). The very core of the centriole looks like a 9-fold symmetrical stack of cartwheels, at which periphery microtubules are vertically installed. We study protein composition of this fascinating structure in the effort to understand the process of assembling a new centriole.

Molecular architecture of centrioles.

SAS-5 is an indispensable component in C. elegans centriole biogenesis. SAS-5 physically associates with another centriolar protein, called SAS-6, forming a complex which is required to build new centrioles. This process is regulated by phosphorylation events, allowing for subsequent recruitment of SAS-4 and microtubules. In most other systems SAS-6 forms a cartwheel (central tube in C. elegans), which forms the basis for the 9-fold symmetry of centrioles. Unlike SAS-6, SAS-5 exhibits strong spatial dynamics, shuttling between the cytoplasm and centrioles throughout the cell cycle. Although SAS-5 is an essential protein, depletion of which completely terminates centrosome-dependent cell division, its exact mechanistic  role in this  process remains  obscure.

IN BRIEF: WHAT WE DID
Using X-ray crystallography and a range of biophysical techniques, we have determined the molecular architecture of SAS-5. We show that SAS-5 forms a complex oligomeric structure, mediated by two self-associating domains: a trimeric coiled coil and a novel globular dimeric Implico domain. Disruption of either domain leads to centriole duplication failure in worm embryos, indicating that large SAS-5 assemblies are necessary for function. We propose that SAS-5 provides multivalent attachment sites that are critical for promoting assembly of SAS-6 into a cartwheel, and thus centriole formation.

For details, check out our latest paper 10.7554/eLife.07410!

@kbrogala

Top panel: cartoon overview of the proposed mechanism of centriole formation. In cytoplasm, SAS-5 exists at low concentrations as a dimer, and each of those dimers can stochastically bind two molecules of SAS-6. Once SAS-5 / SAS-6 complex is targeted to the centrioles, it starts to self-oligomerise. Such self-oligomerisation of SAS-5 allows for the attached molecules of SAS-6 to form a cartwheel. Bottom panel: detailed overview of the proposed process of centriole formation. In cytoplasm, where concentration of SAS-5 is low, the strong Implico domain (SAS-5 Imp, ZZ shape) of SAS-5 holds the molecule in a dimeric form. Each SAS-5 protomer can bind (through the disordered linker) to the coiled coil of dimeric SAS-6. Once SAS-5 / SAS-6 complex is targeted to the site where a daughter centriole is to be created, SAS-5 forms higher-order oligomers through self-oligomerisation of its coiled coil domain (SAS-5 CC – triple horizontal bar). Such large oligomer of SAS-5 provides multiple attachments sites for SAS-6 dimers in a very confied space. This results in a burst of local concentration of SAS-6 through the avidity effect, allowing an otherwise weak oligomer of SAS-6 to also form larger species. Effectively, this seeds the growth of a cartwheel (or a spiral in C. elegans), which in turn serves as a template for a new centriole.

 

Natural Move Monte Carlo: Sampling Collective Motions in Proteins

Protein and RNA structures are built up in a hierarchical fashion: from linear chains and random coils (primary) to local substructures (secondary) that make up a subunit’s 3D geometry (tertiary) which in turn can interact with additional subunits to form homomeric or heteromeric multimers (quaternary). The metastable nature of the folded polymer enables it to carry out its function repeatedly while avoiding aggregation and degradation. These functions often rely on structural motions that involve multiple scales of conformational changes by moving residues, secondary structure elements, protein domains or even whole subunits collectively around a small set of degrees of freedom.

The modular architecture of antibodies, makes them amenable to act as an example for this phenomenon. Using MD simulations and fluorescence anisotropy experiments Kortkhonjia et al. observed that Ig domain motions in their antibody of interest were shown to correlate on two levels: 1) with laterally neighbouring Ig domains (i.e. VH with VL and CH1 with CL) and 2) with their respective Fab and Fc regions.

Correlated Motion

Correlated motion between all residue pairs of an antibody during an MD simulation. The axes identify the residues whereas the colours light up as the correlation in motion increases. The individual Ig domains as well as the two Fabs and the Fc can be easily identified. ref: Kortkhonjia, et al., MAbs. Vol. 5. No. 2. Landes Bioscience, 2013.

This begs the question: Can we exploit these molecular properties to reduce dimensionality and overcome energy barriers when sampling the functional motions of metastable proteins?

In 2012 Sim et al. have published an approach that allows for the incorporation of these collective motions (they call them “Natural Moves”) into simulation. Using simple RNA model structures they have shown that explicitly sampling large structural moves can significantly accelerate the sampling process in their Monte Carlo simulation. By gradually introducing DOFs that propagate increasingly large substructures of the molecule they managed to reduce the convergence time by several orders of magnitude. This can be ascribed to the resulting reduction of the search space that narrows down the sampling window. Instead of sampling all possible conformations that a given polynucleotide chain may take, structural states that differ from the native state predominantly in tertiary structure are explored.

Reduced Dimensionality

Reducing the conformational search space by introducing Natural Moves. A) Ω1 (residue-level flexibility) represents the cube, Ω2 (collective motions of helices) spans the plane and Ω3 (collective motions of Ω2 bodies) is shown as a line. B) By integrating multiple layers of Natural Moves the dimensionality is reduced. ref: Sim et al. (2012). PNAS 109(8), 2890–5. doi:10.1073/pnas.1119918109

It is important to stress, however, that in addition to these rigid body moves local flexibility is maintained by preserving residue level flexibility. Consequently, the authors argue, high energy barriers resulting from large structural rearrangements are reduced and the resulting energy landscape is smoothened. Therefore, entrapment in local energy minima becomes less likely and the acceptance rate of the Monte Carlo simulation is improved.

Although benchmarking of this method has mostly relied on case studies involving model RNA structures with near perfect symmetry, this method has a natural link to near-native protein structure sampling. Similarly to RNA, proteins can be decomposed into local substructures that may be responsible for the main functional motions in a given protein. However, due to the complexity of protein motion and limited experimental data we have a limited understanding of protein dynamics. This makes it a challenging task to identify suitable decompositions. As more dynamic data emerges from biophysical methods such as NMR spectroscopy and databases such as www.dynameomics.org are extended we will be able to better approximate protein motions with Natural Moves.

In conclusion, when applied to suitable systems and when used with care, there is an opportunity to breathe life into the static macromolecules of the pdb, which may help to improve our understanding of the heterogeneous structural landscape and the functional motions of metastable proteins and nanomachines.

Journal Club: Human Germline Antibody Gene Segments Encode Polyspecific Antibodies

This week’s paper by Willis et al. sought to investigate how our limited antibody-encoding gene repertoire has the ability to recognise the unlimited array of antigens. There is a finite number of V, D, and J genes that encode our antibodies, but it still has the capacity to recognise an infinite number of antigens. Simply, the authors’ notion is that an antibody from the germline (via V(D)J recombination; see entry by James) is able to adopt multiple conformations, thus allowing the antibody to bind multiple antigens.

Three antibodies derived from the germline gene 5*51-01, all binding to very different antigens.

Three antibodies derived from the germline gene 5*51-01 bind to very different antigens.

To test this hypothesis, the authors performed a multiple sequence alignment for the amino acid sequence between the mature antibodies and the germline antibody sequence from which the antibodies are derived from. if a single position from ONE mature antibody showed a difference to the germline sequence, it was identified as a ‘variable’ position, and allowed to be changed by Rosetta’s multi-state design (MSD) and single-state design (SSD) protocols.

Pipeline: align mature antibodies (2XWT, 2B1A, 3HMX) to the germline sequence (5-51) , identify 'variable' positions from the alignment, then allow Rosetta to change those residues during design.

Figure 1) from Willis et al., showing the pipeline: align mature antibodies (2XWT, 2B1A, 3HMX) to the germline sequence (5-51) , identify ‘variable’ positions from the alignment, then allow Rosetta to change those residues.

Surprisingly, without any prior information of the germline sequence, the MSD yielded a sequence that was closer to the germline sequence, and the SSD for each mature antibody had retained the mature sequence. In short, this indicated that the germline sequence is a harmonising sequence that can accommodate the conformations of each of the mature antibodies (as proven by MSD), whereas the mature sequence was the lowest energy amino acid sequence for the particular antibody’s conformation (as proven by SSD).

To further demonstrate that the germline sequence is indeed the more ‘flexible’ sequence, the authors then aligned the mature antibodies and determined the deviation in ψ-ϕ angles at each of the variable positions that were used in the Rosetta study. They found that the ψ-ϕ angle deviation in the positions that recovered to the germline residue was much larger than the other variable positions along the antibody. In other words, for the positions that tend to return to the germline amino acid in MSD, the ψ-ϕ angles have a much larger degree of variation compared to the other variable positions, suggesting that the positions that returned to the germline amino acid are prone to lots of movement.

In addition to the many results that corroborate the findings mentioned in this entry, it’s neat that the authors took a ‘backwards’ spin to conventional antibody design. Most antibody design regimes aim to find amino acid(s) that give the antibody more ‘rigidity’, and hence, mature its affinity, but this paper went against the norm to find the most FLEXIBLE antibody (the most likely germline predecessor*). Effectively, they argue that this type of protocol can be exported to extract new antibodies that can bind to multiple antigens, thus increasing the versatility of antibodies as potential therapeutic agents.

[Publication] Arginine Methylation-Dependent Reader-Writer Interplay Governs Growth Control by E2F-1

If you are familiar with the reader, writer & eraser concepts or you are passionate about epigenetics and arginines, this recent publication might be of interest to you. The study addresses the transcription factor E2F-1, which plays a crucial role in the control of cell cycle and is linked with cancer. Like Yin-yang, it has opposing functional roles: to promote cell-cycle progression and to induce apoptosis. The results demonstrate that the biological outcome of E2F-1 activity is affected by arginine methylation marks. While asymmetric arginine methylation causes apoptosis, the symmetrical methylation results in proliferation. This reader-writer interplay determined by the two types of marks governs the function of E2F-1 and potentially the fate of the cell.

[Database] SAbDab – the Structural Antibody Database

An increasing proportion of our research at OPIG is about the structure and function of antibodiesCompared to other types of proteins, there is a large number of antibody structures publicly available in the PDB (approximately 1.8% of structures contain an antibody chain). For those of us working in the fields of antibody structure prediction, antibody-antigen docking and structure-based methods for therapeutic antibody design, this is great news!

However, we find that these data are not in a standard format with respect to antibody nomenclature. For instance, which chains are “heavy” chains and which are “light“? Which heavy and light chains pair? Is there an antigen present? If so, to which H-L pair does it bind to? Which numbering system is used … etc.

To address this problem, we have developed SAbDab: the Structural Antibody Database. Its primary aim is for easy creation of antibody structure and antibody-antigen complex datasets for further analysis by researchers such as ourselves. These sets can be selected using a number of criteria (e.g. experimental method, species, presence of constant domains…) and redundancy filters can be applied over the sequences of both the antibody and antigen. Thanks to Jin, SAbDab now also includes associated curated affinity (Kd) values for around 190 antibody-antigen complexes. We hope this will serve as a benchmarking tool for antibody-antigen docking prediction algorithms.

sabdab

Alternatively, the database can be used to inspect and compare properties of individual structures. For instance, we have recently published a method to characterise the orientation between the two antibody variable domains, VH and VL. Using the ABangle tool, users can select structures with a particular VH-VL orientation, visualise and quantify conformational changes (e.g. between bound and unbound forms) and inspect the pose of structures with certain amino acids at specific positions. Similarly, the CDR (complimentary determining region) search and clustering tools, allow for the antibody hyper-variable loops to be selected by length, type and canonical class and their structures visualised or downloaded.

structure_viewer

 

SAbDab also contains features such as the template search. This allows a user to submit the sequence of either an antibody heavy or light chain (or both) and to find structures in the database that may offer good templates to use in a homology modelling protocol. Specific regions of the antibody can be isolated so that structures with a high sequence identity over, for example, the CDR H3 loop can be found. SAbDab’s weekly automatic updates ensures that it contains the latest available data. Using each method of selection, the structure, a standardised and re-numbered version of the structure, and a summary file containing information about the antibody, can be downloaded both individually or en-masse as a dataset. SAbDab will continue to develop with new tools and features and is freely available at: opig.stats.ox.ac.uk/webapps/sabdab.

[Publication] Cloud computing in Molecular Modelling – a topical perspective

cloud_computingtoc

My ex-InhibOx colleagues (Simone Fulle, Garrett Morris, Paul Finn) and myself have recently published a topical review on “The emerging role of cloud computing in molecular modelling” in the Journal of Molecular Graphics and Modelling.   This paper starts with a gentle and in-depth introduction to the field of cloud computing.  The second part of the paper is how it applies to molecular modelling (and the sort of tasks we can run in the cloud).  The third and last part presents two practical case studies of cloud computations, one of which describes how we built a virtual library to use in virtual screening on AWS.

We hope that after reading this article the cloud will become a less nebulous affair! *pun intended*

As an addendum, I recently came across this paper “Teaching cloud computing: A software engineering perspective” (2013) on how to teach cloud computing at a graduate level.  This work is relevant, because lots of universities are presently including cloud computing in their curricula.