Author Archives: Rebonto Haque

Peering Inside the Black Box: A Beginner’s Introduction to Mechanistic Interpretability

Over the last few years, large language models (LLMs) have gone from being curiosities tucked away in research labs to something most of us interact with on a daily basis, whether for drafting emails, debugging code, or simply pondering the meaning of life at 2am. And yet, for all our reliance on these systems, a rather inconvenient truth lingers in the background: nobody, not even the people who built them, can fully explain what is going on inside.

This is where mechanistic interpretability comes in.

In essence, mechanistic interpretability is the approach of explaining complex machine learning systems through the behaviour of their functional units (Kästner and Crook, 2024) by reverse-engineering them into their more elementary computations (Rai et al., 2025). The aim is not simply to know that a model gives the right answer, but to pull apart the underlying machinery and uncover the causal relationships between input and output. Think of it as neuroscience for neural networks, except we can read every neuron at any moment, rewind, replay, and intervene mid-thought.


Pitfalls of AI-Generated Reviews: Case Study of a Frontiers in Microbiology Review on Anti-Influenza A bnAbs

In the last five or so years, large language models (LLMs) have transformed from novel regurgitators of haphazardly stitched-together sentences into an almost ‘human’ personality standing by our side as we tackle life. Whilst the perceived humanity of these models is perhaps a topic for a future blogpost, it is hard to overstate the impact of LLMs on our daily lives. Do you need someone to proofread an essay you’ve spent hours drafting? GPT (or one of its many counterparts) has you covered. Need help drafting an email from scratch? No problem. Want to write and/or heavily edit an entire academic article which would typically require days, if not weeks, of research? Surely that just needs the push of a button… right?

Despite tremendous advances in LLMs, key issues mean they are not yet a fully dependable addition to our writing endeavours. They are known to fail when asked to generate new content with only a basic prompt. Some of these failures have made headlines [1]. Some of the scariest instances involve hallucinated information [2–4], the phenomenon where AI tools generate convincing information which is factually inaccurate or simply fabricated [2]. In Belgium, the rector of Ghent University came under fire for citing quotes, supposedly from influential thinkers, which were later found to be AI hallucinations [1].
Whilst there are numerous examples of poorly cited and often AI-hallucinated papers falling through the cracks of the peer-review process, today we focus on a Frontiers in Microbiology review titled ‘Broadly neutralizing monoclonal antibodies against influenza A viruses: current insights and future directions’ [5]. This paper attempts to provide an overview of the current landscape of monoclonal antibodies (mAbs) being developed to confer protection against influenza A, highlighting ‘technological advances, clinical performance, and scalability’. It contains many of the hallmarks of text that has been created or edited with generative AI, even though its generative AI statement reads: ‘The author(s) declared that Generative AI was not used in the creation of this manuscript.’
