Author Archives: Thomas Hadfield

Tidbits from YouGov Polls

Some recent verdicts from the British public on YouGov polls

  • The Queen (97%) is less well known than her husband Prince Philip (98%)
  • Liz Truss’ UK popularity rating (21%) is lower than George W. Bush’s (22%)
  • The most popular British dish is ‘Chips’ (84%) followed by ‘Fish and Chips’ (83%)
  • Oxford (55%) is a less popular university than Cambridge (58%)

So much for Aristotle’s ‘wisdom of crowds’!

An A-Z of Oxford

The 2021/2 academic year is now well underway in Oxford, which means a fresh batch of new students getting to grips with some of the bewildering terminology employed here, as well as prospective applicants for next year trying to figure out what on earth a college is and which one they should apply to. As a wizened final-year DPhil student, I decided to compile an A-Z of Oxford-related terms in the hope that someone might find it useful.

A – Ashmolean Museum

Britain’s first public museum, established all the way back in 1678. Home to exhibits covering Ancient Egypt to Modern Art and everything in between.

The front of the Ashmolean, right in the middle of Oxford City Centre

B – Battels

A termly bill students receive from their college which might cover things like charges for food and accommodation, or fines for not returning books to the library on time.

C – College

The 39 colleges are small educational institutions which together comprise the University of Oxford. Every student is a member of a college, each of which has its own set of facilities, including a dining hall, bar, library and student accommodation. Colleges also have their own student unions, called the Junior Common Room (for undergraduates) and Middle Common Room (for postgraduates), which are excellent places to socialise and meet people studying lots of different subjects.

An aerial view of many of the university’s colleges

Multiple Testing: What is it, why is it bad and how can we avoid it?

P-values play a central role in the analysis of many scientific experiments. But, in 2015, the editors of the journal Basic and Applied Social Psychology prohibited the use of p-values in its pages. The primary reason for the ban was the proliferation of results obtained by so-called ‘p-hacking’, where a researcher tests a range of different hypotheses and publishes the ones which attain statistical significance while discarding the others. In this blog post, we’ll show how this can lead to spurious results and discuss a few things you can do to avoid engaging in this nefarious practice.

The Basics: What IS a p-value?

Under a Hypothesis Testing framework, a p-value associated with a dataset is defined as the probability of observing a result that is at least as extreme as the observed one, assuming that the null hypothesis is true. If the probability of observing such an event is extremely small, we conclude that it is unlikely the null hypothesis is true and reject it.

But therein lies the problem. Just because the probability of something is small, that doesn’t make it impossible. Using the standard significance test threshold of 0.05, even if the null hypothesis is true, there is a 5% chance of obtaining a p-value below the significance threshold and therefore rejecting it. Such false positives are an inescapable part of research; there’s always a possibility that the subset you were working with isn’t representative of the global data and sometimes we take the wrong decision even though we analysed the data in a perfectly rigorous fashion.
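
To make the 5% figure concrete, here is a minimal simulation sketch. It assumes a two-sided z-test on standard normal data with known unit variance, so the null hypothesis (mean zero) is true in every simulated experiment; the function names, sample size and number of experiments are illustrative choices rather than anything prescribed above. The last function gives the chance of at least one false positive when several independent tests are run, which is the arithmetic behind p-hacking.

```python
import math
import random


def two_sided_p_value(sample):
    """p-value of a two-sided z-test of H0: mean = 0, assuming known unit variance."""
    n = len(sample)
    # Sample mean divided by its standard error (1 / sqrt(n)).
    z = sum(sample) / math.sqrt(n)
    # Standard normal CDF evaluated at |z|, via the error function.
    cdf = 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0)))
    return 2.0 * (1.0 - cdf)


def false_positive_rate(n_experiments=10_000, sample_size=30, alpha=0.05, seed=0):
    """Fraction of experiments that reject H0 even though H0 is true in all of them."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_experiments):
        # Data drawn under the null hypothesis: mean 0, standard deviation 1.
        sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        if two_sided_p_value(sample) < alpha:
            rejections += 1
    return rejections / n_experiments


def prob_at_least_one_false_positive(n_tests, alpha=0.05):
    """Chance of one or more false positives across n_tests independent tests."""
    return 1.0 - (1.0 - alpha) ** n_tests
```

Calling false_positive_rate() should return a value close to 0.05, while prob_at_least_one_false_positive(20) comes out at roughly 0.64: test twenty true null hypotheses and you are more likely than not to find something "significant".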


A Smattering of Olympic Trivia!

Tokyo 2020 is now firmly in our rearview mirror, and I for one will be sad to be deprived of the opportunity to wake up at 4AM to passionately cheer on someone I’ve never heard of in an event I know nothing about as they go for Gold. The heyday of amateurism in the Olympics may be long gone, but it’s never been better for the amateur fan, with 24/7, on-demand coverage, unprecedented access to the athletes via social media and remote working offering the opportunity to watch the games on a second screen without worrying about one’s boss noticing (not that I would ever engage in such an irresponsible practice, in case my Supervisor is reading this…).

To indulge both my post-Olympics melancholy and my addiction to sports trivia, I’ve trawled the internet to find some interesting factoids related to the Summer Games and present them below for your mild enjoyment:


CAML: Courses in Applied Machine Learning

*Shameless self-promotion klaxon!! Have a look at my new website!*

I’m excited to share a project I’ve been working on for the past few months! One of the biggest challenges of working on an interdisciplinary research project is getting to grips with the core principles of the disciplines which you don’t have much formal training in. For me, that means learning the basics of Medicinal Chemistry and Structural Biology so that when someone mentions pi-stacking I don’t think they’re talking about the logistics of managing a bakery; for people coming from Bio/Chem backgrounds it can mean understanding the Maths and Statistics necessary to make sense of the different algorithms which are central to their work.


Drawing Wavy Lines That Match Your Data, or, An Introduction to Kernel Density Estimation

One of the fundamental questions of statistics is “How likely is it that event X will occur, given what we’ve observed already?”. It’s a question that pops up in all sorts of different fields, and in our daily lives as well, so it’s well worth being able to answer rationally. Under the statistician’s favourite assumption that the observed data are independent and identically distributed (i.i.d.), we can use the data to construct a probability distribution; that is, if we’re about to observe a new data point, x*, we can say how likely it is that x* will take a specific value.
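
As a rough illustration of what constructing a distribution from the data can look like, here is a minimal kernel density estimation sketch. It places a Gaussian bump on each observation and averages them. The Gaussian kernel, the function name and the default bandwidth (a normal-reference rule of thumb, 1.06 times the sample standard deviation times n to the power -1/5) are assumptions made for this example, not the only reasonable choices.

```python
import math
import statistics


def gaussian_kde(data, bandwidth=None):
    """Return a function estimating the density of 1D data with Gaussian kernels."""
    n = len(data)
    if bandwidth is None:
        # Normal-reference rule of thumb (an assumption): 1.06 * sigma * n^(-1/5).
        bandwidth = 1.06 * statistics.stdev(data) * n ** (-1 / 5)

    def density(x):
        # Sum one Gaussian bump per observation, each centred on that observation.
        total = sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data)
        # Normalise so that the estimate integrates to one.
        return total / (n * bandwidth * math.sqrt(2.0 * math.pi))

    return density
```

For example, density = gaussian_kde([1.0, 1.2, 0.8, 3.0, 3.1]) gives a function that can be evaluated at any new point x*; it returns larger values near the observed data and smaller values far away from it. A smaller bandwidth gives a wigglier line, a larger one a smoother line.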


Pigs in the Parks: OPIG Social 28JUL2020

Tuesday afternoon normally heralds Group Meeting, the precious hour of the week where we gather on Zoom to hear about recently published papers, dissect each other’s research and, most importantly, bicker about appropriate usage of the servers. Knowing that Fergus B was on holiday this week and that a Group Meeting devoid of SLURM-inspired ranting would have felt strangely empty, it was instead decided that now was the time for the first in-person group social since the lockdown began in March.

Struggling to adapt to not being able to turn off Mic and Webcam – how on earth did we manage like this all the time before?!

No labels, no problem! A quick introduction to Gaussian Mixture Models

Statistical Modelling Big Data Analytics™ is in vogue at the moment, and there’s nothing quite so fashionable as the neural network. Capable of capturing complex non-linear relationships and scalable for high-dimensional datasets, they’re here to stay.

For your garden-variety neural network, you need two things: a set of features, X, and a label, Y. But what do you do if labelling is prohibitively expensive or your expert labeller goes on holiday for 2 months and all you have in the meantime is a set of features? Happily, we can still learn something about the labels, even if we might not know what they are!
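
To give a flavour of how this can work, here is a minimal sketch of fitting a two-component Gaussian mixture to one-dimensional, unlabelled data. It uses the expectation-maximisation (EM) algorithm, which is a standard way to fit such models but is an assumption here, as are the function names, the crude initialisation and the synthetic data helper.

```python
import math
import random


def normal_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def sample_two_clusters(n_per_cluster=200, seed=1):
    """Synthetic unlabelled data (illustrative only): two clusters around 0 and 6."""
    rng = random.Random(seed)
    low = [rng.gauss(0.0, 1.0) for _ in range(n_per_cluster)]
    high = [rng.gauss(6.0, 1.0) for _ in range(n_per_cluster)]
    return low + high


def fit_two_component_gmm(data, n_iter=100):
    """Fit a two-component 1D Gaussian mixture with EM; returns (weights, means, stds)."""
    # Crude initialisation (an assumption): start the two means at the data extremes.
    means = [min(data), max(data)]
    spread = (max(data) - min(data)) / 4.0 or 1.0
    stds = [spread, spread]
    weights = [0.5, 0.5]

    for _ in range(n_iter):
        # E-step: probability that each point belongs to component 0.
        resp0 = []
        for x in data:
            p0 = weights[0] * normal_pdf(x, means[0], stds[0])
            p1 = weights[1] * normal_pdf(x, means[1], stds[1])
            total = p0 + p1
            resp0.append(p0 / total if total > 0.0 else 0.5)

        # M-step: re-estimate each component from its weighted share of the data.
        for k in (0, 1):
            resp = resp0 if k == 0 else [1.0 - r for r in resp0]
            mass = max(sum(resp), 1e-12)
            weights[k] = mass / len(data)
            means[k] = sum(r * x for r, x in zip(resp, data)) / mass
            var = sum(r * (x - means[k]) ** 2 for r, x in zip(resp, data)) / mass
            stds[k] = max(math.sqrt(var), 1e-6)  # floor avoids a collapsed component

    return weights, means, stds
```

Running fit_two_component_gmm(sample_two_clusters()) should recover means near 0 and 6 without ever seeing a label. The per-point probabilities computed in the E-step act as soft labels: they say how strongly each point belongs to each cluster.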
