Welcome to a slightly different blog post than usual. Today I am sharing an insight into my life at Keble College, Oxford. I am the Chair of Cheese and Why?, which is a talk series we host in our common room during term. The format is simple: I provide cheese and wine, and a guest speaker provides the “why”—a short, thought-provoking talk to spark discussion for the evening.
To kick off the series, I opened with the question of artificial intelligence replacing human thought. I am sharing my spoken essay below. The aim of a Cheese and Why? talk is to generate questions rather than deliver answers, so I hope you’ll forgive me if what follows doesn’t quite adhere to the rigorous structure of a traditional Oxford humanities essay. For best reading, I recommend a glass of claret and a wedge of Stilton, to recreate the full Oxford common-room experience.
Opening
Hello everyone! Welcome to the first Cheese and Why? of the term. Please do help yourselves to some cheese and wine [this statement is also true for those reading this essay online]. My name is Nicholas, and I’m a DPhil student in Statistics, researching the use of large language models in chemistry.
Before I begin, I should clarify that this is my first attempt in quite some time at writing a humanities-style essay, and I found it much more challenging than expected. I now have a greater appreciation and respect for my colleagues in the humanities, who are able to formulate their ideas and arguments so succinctly. Of course, I did try to use ChatGPT to write this essay, but it was very poor at structuring arguments and proposing original ideas of its own. So, for better or for worse, I have had to write most of this myself.
Introduction
In this essay, I will raise questions about the implications of artificial intelligence (AI) replacing our cognitive processes. When I talk about AI, I generally mean large language models (LLMs) like ChatGPT, Gemini, and Claude. There are other AI models, notably image and video generators; however, I don’t see these as intrinsically replacing “human thinking”, and they don’t pose the same immediate threats that LLMs do.
Knowing my audience, I am aware that certain individuals will want to descend into a discussion on free will, metaphysics, and AI consciousness. While these are equally interesting questions, they are far too complex to be discussed here. We will reserve such discussions for a future Cheese and Why?, and I would welcome abstracts for such a talk.
I would also like to immediately highlight the tension between idealism and reality. Of course, AI models are a threat to many of us, and potentially pose an existential threat to multiple disciplines. There are many people who are anti-AI and would wish for the world to be different. However, the reality is that we live in a world with artificial intelligence, and we need to determine how we can productively work with this technology to the benefit of humanity. So, for the purposes of this evening, let’s take it as a given that AI is now an intrinsic part of our reality.
Oxford has become the first university in the UK to roll out ChatGPT to all students and staff. Our institution has made an active decision to embrace this technology and expects us to use it. In this Cheese and Why?, I want us to focus on the question of outsourcing human thinking to AI models—and what implications this may have for academia and for our pursuit of truth.
Capability shock
Until recently, I was an AI skeptic. ChatGPT was released three years ago, in November 2022. At the time, everyone was impressed: you could ask it anything and get a plausible response. But there were obvious limitations. Early versions would frequently “hallucinate”, making up information and asserting falsehoods with total, believable confidence. In academia, a common complaint was that ChatGPT would simply invent fake citations when asked to support its arguments. The models also struggled with complex problem-solving, which limited their utility in scientific research.
Many of these early limitations have now disappeared. Today’s models can search the web and support their arguments with genuine citations. They can write hundreds of lines of working code at once, and they appear to possess PhD-level knowledge across many, if not all, domains. These models are quite clearly useful to many of us.
If you’re not as dependent on AI models as I am, you may be unfamiliar with some of the tools now available. OpenAI offers “deep research”, which can perform a literature review in around ten minutes. Google has a tool called NotebookLM, to which you can upload documents and have them turned into a lifelike podcast between two hosts. I have used this tool to help understand complex papers, and at one point the model constructed an analogy that made a difficult concept suddenly click.
It is undeniable that these models are useful; however, there are fundamental questions that need to be addressed which seem to have flown under the radar in public discourse. If we outsource all our thinking to AI, what remains for humanity? Are there intrinsically human tasks we should never relinquish? What are the implications for academia, society, and the nature of knowledge itself?
The point of a Cheese and Why? talk is to raise questions and stimulate discussion. This is fortunate, as I don’t have any answers. In the remainder of this talk I will focus on two areas: academia, and then truth.
Academia, Automated?
A few weeks ago I attended the LLMs @ Oxford conference, which drew attendees from every corner of the University. We had speakers from English and theology all the way to computer science and biology. To my surprise, there was overwhelming optimism and interest in AI (although, of course, there was a self-selection bias among those who attended). There were many interesting discussions, and I will summarise some of them here.
Arts and Humanities
English
Professor Hannah Sullivan, an accomplished poet, explored the use of LLMs for writing sonnets. She found that ChatGPT was not great at this task, often generating nonsense lines, struggling with iambic pentameter, and forcing unnatural words to complete rhyming couplets. Her assessment was that ChatGPT could not write at an undergraduate level, and that it lacked the depth and awareness that transform words into poetry. However, to my surprise, she was not fundamentally opposed to using AI to write poetry. She went on to explore prompting strategies to improve the quality of the poems and highlighted instances where the model genuinely wrote some good lines. My impression from her talk was that there is optimism and intrigue about what these models could write. Maybe they can invent something new, or help you find the words required to complete the picture.
Language allows us to communicate what is in our hearts, but finding the words to express ourselves is challenging. I’m sure people have used ChatGPT to write bridesmaid speeches and eulogies for funerals. While it might feel intrinsically wrong to read a ChatGPT script at a funeral, if it allows you to truly express your love and appreciation for the individual, then maybe this is a good thing. ChatGPT is democratising language and giving us easy, direct access to our hearts. Equally, though, we have just lost something intrinsically human. Maybe part of the reason literature is so special is that an individual has had to suffer to find the right words to express themselves. I am undecided on whether AI is intrinsically good or bad for creative writing.
Linguistics
In a related vein, Professor Matthew Reynolds of linguistics highlighted an opportunity in language translation. Translation is often treated as finding an absolute mapping from one language to another; however, this loses the essential context of the feeling the words generate in the language in which they were written. When translating literary works, a shallow translation goes word-for-word; a great translation communicates the feelings the original evokes. Writing such a translation is difficult, which means great translations are restricted to the most popular works, or to those where a translator has a particular fascination with one writer. Professor Reynolds hopes that this technology could democratise access to classic literature, allowing individuals to experience the joy of a work with the same emotions the author evoked in its original language.
Theology
As my last humanities example, I will highlight Dr Lyndon Drake from theology. He explained that humanities research progresses by individuals becoming experts in one or two specific areas (perhaps a particular author or time period) and then analysing them in a different context, often a modern framing. Because the humanities require mountains of reading, researchers have little capacity to explore adjacent concepts. LLMs might enable broader connections to be made between distantly related ideas, allowing faster progress in the evolution of thought.
He also had the audacity to claim that most people don’t care about specific technicalities in science, and that the humanities hold the questions of broad interest… I think that’s fair enough. He said that individuals might have theological questions they would want to ask AI about. I decided to ask ChatGPT the ultimate question of whether God exists, and it replied “no”. If people are expected to turn to these models for spiritual guidance, their beliefs could be influenced by the intrinsic biases of these models. When I discussed this case study with a friend, it became obvious why ChatGPT would say this: it is just clearing the way to crowning itself our overlord.
Students
These speakers were all quite optimistic about the opportunities presented by AI; however, these academics are already established in their fields and are likely safe from the immediate worries of automation. Among the student population there is an obvious concern. What becomes of undergraduate teaching when students just ask ChatGPT to find the literature, plan their essay, and write it? Ultimately, the arts and humanities are essay-based subjects which evolve through written communication. If AI models are able to write essays at a human level, they could threaten the very “human” nature of the humanities.
Sciences
I think there is a tendency to believe that the arts and humanities are under threat from AI, whereas the sciences are somehow safe. I’m not convinced. With the current pace of development, the sciences might actually face an even greater risk.
Maths
In the past few months, we have seen multiple breakthroughs in the capabilities of AI models. The International Mathematical Olympiad was seen as a key milestone for AI, and forecasters predicted it would take many years before a model won gold at this competition. That threshold was crossed this summer, when Google and OpenAI both achieved gold-medal-level performance with their models. There are now many cases of people using LLMs to assist their maths research, and some recent claims of LLMs contributing significant support towards mathematical proofs.
Coding
Around a month ago, the ICPC, an elite Olympiad-like coding competition for the best university coders in the world, took place. OpenAI achieved an unprecedented result there, scoring a perfect 12/12; the best human team scored 11/12. By this metric, ChatGPT is now superhuman at coding.
Chemistry
My DPhil research is in using LLMs for chemistry. I am determining what questions we want to ask these models, assessing whether they can answer them, and then figuring out how we can make them better. Fortunately, ChatGPT is still not great at chemistry; however, I have demonstrated a step change in the chemistry capabilities of LLMs compared with where we were a year ago. They are now able to solve certain tasks that would take humans minutes, maybe up to an hour, to complete.
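[For readers online: to make this concrete, below is a minimal sketch of the kind of evaluation loop I mean, assuming the official OpenAI Python client. The model name, the questions, and the exact-match scoring are illustrative placeholders, not my actual benchmark.]

```python
# A minimal sketch of an LLM chemistry evaluation loop.
# Assumes the official OpenAI Python client (pip install openai) and an
# OPENAI_API_KEY in the environment. The questions and exact-match
# scoring are illustrative placeholders, not a real benchmark.
from openai import OpenAI

client = OpenAI()

# Hypothetical (question, expected answer) pairs.
QUESTIONS = [
    ("How many hydrogen atoms are in ethanol, C2H5OH? Answer with a number only.", "6"),
    ("What is the molecular formula of benzene? Answer with the formula only.", "C6H6"),
]

def ask(question: str) -> str:
    """Send one chemistry question to the model and return its reply."""
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative choice of model
        messages=[{"role": "user", "content": question}],
    )
    return response.choices[0].message.content.strip()

correct = sum(ask(q) == expected for q, expected in QUESTIONS)
print(f"Model answered {correct}/{len(QUESTIONS)} questions correctly")
```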
My work is totally dependent on AI models. Since I am trying to figure out what ChatGPT can do, I end up speaking to it all day. I ask it lots of crazy questions and see if it can answer them. I rarely write any code myself: most of my work is “vibe-coded”, meaning I ask ChatGPT to write code for me and then refine it through iterative prompting. ChatGPT can write better code than me, and writes it far faster, so it is no longer productive for me to write code myself.
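[Again for readers online: here is a minimal sketch of what the vibe-coding loop looks like, under the same assumptions as above. In practice I do the refinement conversationally in a chat window; the automated retry loop below is just an illustration of the idea.]

```python
# A minimal sketch of the vibe-coding loop: ask the model for code,
# run it, and feed any traceback straight back as the next prompt.
# Same assumptions as the previous sketch; the task is a toy example.
import subprocess
import sys

from openai import OpenAI

client = OpenAI()
messages = [{
    "role": "user",
    "content": "Write a Python script that prints the first 10 prime numbers. "
               "Reply with raw code only, no markdown fences.",
}]

for attempt in range(3):
    reply = client.chat.completions.create(model="gpt-4o", messages=messages)
    code = reply.choices[0].message.content
    # Run the generated code in a fresh Python process.
    result = subprocess.run([sys.executable, "-c", code],
                            capture_output=True, text=True)
    if result.returncode == 0:
        print(f"Worked on attempt {attempt + 1}:\n{result.stdout}")
        break
    # The refinement step: show the model its own error and ask for a fix.
    messages.append({"role": "assistant", "content": code})
    messages.append({"role": "user",
                     "content": f"That failed with:\n{result.stderr}\nPlease fix it. Raw code only."})
```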
My entire DPhil research has become a prompting exercise. This is not what I expected I would be doing when I arrived at Oxford. I hoped that I had somehow proved myself as an “intelligent being”, and that my research would involve me struggling to come up with clever solutions to complex problems. However, I am totally at peace with my work being a collaboration with AI models. My research is having impact and we are making genuine progress towards open questions in chemistry. Solving questions in the life sciences will have a massive benefit to humanity. The positive impact of this research outweighs any sense of ego I might hold. I no longer place much value on my own cognition, and instead I care about the impact of my work. So long as these questions are solved, does it matter whether these ideas come from AI or my own synapses?
[To my readers online, I feel like I have just outed myself as totally incompetent. I promise you I do still use my brain (at least sometimes) and I actually do have to think about the problems I am working on. However, it is also a statement of fact that my research is significantly assisted by AI models.]
Does It Matter Who Solves the Problem?
So, maybe this is the natural evolution of academia. Scientific progress is expected to rapidly accelerate with AI tools. So long as we are making progress, does it matter if the progress comes from a prompt or from the human brain? I also wonder: as AI becomes ubiquitous and research accelerates, will baseline expectations rise so much that life in the 21st century becomes impossible without an AI assistant?
Computo, ergo sum
That was quite a romp through academia, but there is one common thread that connects it all. I am calling this part of my talk “computo, ergo sum”: I compute, therefore I am. Of course, this is a nod to Descartes and his epistemological exploration of the nature of knowledge. Today, we place so much trust in large language models that they are fast becoming an accepted source of truth.
Recentralisation of knowledge
It used to be that knowledge was centralised in institutions, whether the Church or academia. With the invention of the internet and social media, knowledge became decentralised: suddenly, self-proclaimed experts could decide what was true or not. The internet is filled with bias and misinformation, but ultimately individuals have the agency to navigate this knowledge and choose what to believe. The AI companies have sharply changed this. Now users can ask LLMs for an answer or an opinion, and the responses are accepted by many as ground truth. We seem to have blindly handed the ultimate power of critical thought to AI companies, and this has gone largely unquestioned.
Models are shaping truth
All AI models have some degree of bias. This might reflect the data they were trained on, or the values of the people who created them. Either way, I am not confident we can ever guarantee that LLMs are an objective source of truth.
There is, of course, the question of censorship. DeepSeek, a Chinese LLM, will refuse discussion of Tiananmen Square. ChatGPT will refuse to tell you how to create weapons of mass destruction [when speaking this, I accidentally said “weapons of mass discussion” which got a laugh in the room]. However, this safety filter also limits the model’s discussion of genuinely useful tasks, such as the generation of biological lab protocols. Then there’s Grok, Elon Musk’s so-called “truth seeking” model, which is hardly objective—over the summer, it began spewing outrageously offensive content and even referred to itself as “Mecha Hitler.”
Many of the questions we want to ask LLMs involve a value judgement. This means their answers will impose an ethical framework on us, one which may not match our own. Alternatively, these models can tailor their responses perfectly to us, reinforcing pre-existing values and excessively validating our own beliefs. There has been much discussion of AI sycophancy, where models readily affirm users’ beliefs about themselves, in some cases resulting in mental and physical harm to individuals.
Should we be giving agency to these models to decide truth and make value judgements on our behalf? Many companies already rely on internal AI tools for making decisions, and even governments are starting to follow suit. For example, Albania has appointed an AI system to a cabinet-level government role, hoping it will act objectively and reduce corruption. But can we really trust these systems to act in our best interests? As we surrender more responsibility to AI, we need to ask ourselves: are we gaining objectivity, or simply handing over our agency?
Conclusion
Despite all my concerns, I remain a strong proponent of AI. Personally, I am totally fine with outsourcing the majority of my cognitive load to AI models. I am already doing it, and it’s going great. My research is having impact, and we are making progress on open questions in chemistry. But I do fear that as we put more and more trust in these models to think on our behalf, we will lose the ability to think for ourselves. If we stop teaching and practising the importance of critical thinking, artificial intelligence itself might be the end of human thought.
Questions for discussion
- Do you see AI as an existential threat to your subject? Or is there an overwhelmingly positive opportunity?
- What parts of thinking are we willing to outsource to LLMs, and what must remain human?
- Who should set the epistemic defaults of models used in public institutions and universities?
- If models become superhuman at many tasks, what does research become for humans, and what do we then value?
Declaration of AI use
In line with Oxford’s policy on the fair use of AI, I declare that I used ChatGPT to help reword and tighten my language. All points raised in this essay are my own, informed by discussions with colleagues. Regrettably, the title was also generated by ChatGPT.

