What we talk about when we talk about LLMs
Stochastic parrots, blurry JPEGs, aliens, and more.
Faced with something new, people often try to understand it in terms of something they already know. There’s not much else we can do in these situations: we rely on metaphors and analogies and hope they lead us closer to something like the truth. But as George Lakoff and Mark Johnson pointed out in Metaphors We Live By, metaphors are not perfect representations of the thing itself: by their very nature, they highlight certain aspects of a concept and hide others.1
I’ve been thinking about this point recently with respect to large language models (LLMs). Much of the basic technology underlying LLMs (e.g., neural networks) has been around for a few decades, but advancements in recent years—particularly in the scale of training data and the models themselves—have produced systems that feel different in kind. There’s not yet a consensus on what LLMs can and can’t do, in part because of fundamental disagreement about what kind of thing they are.2
I’ve mentioned this issue at least once before, in my piece on LLM-ology:
What’s the right model for understanding how LLMs work? Is it the human mind––bounded, as in traditional cognitive psychology, by the skull/brain barrier––or is it something entirely more distributed?
I’m not going to resolve this question here. I’m not even sure there’s a right answer. But I do think it’s helpful to sketch out the metaphor structures and analogies people are regularly using to grasp their way towards a mental model of LLMs. That’s what this post is about: what do people talk about when they talk about LLMs?
The temptation of anthropomorphism
Probably the most common analogy is the human mind.
Because LLMs deal in language, and because they’re embedded in (relatively) user-friendly chat interfaces like ChatGPT—using these systems feels, after all, like one is “talking to” the LLM—it’s easy and convenient to attribute mental states to them, such as “beliefs” and “desires”. This shouldn’t be too surprising: humans see themselves in everything (sometimes literally), and we’re prone to imbuing the world around us with agency, from a winter storm to our computer. Our tendency towards anthropomorphism is so prevalent that it forms the basis of some psychological theories of religion’s origins. So why not LLMs?
It’s hard, in fact, to talk about LLMs without attributing some form of mental life or agency to them. We might say an LLM “prefers” certain prompts over others; that it “knows” or “doesn’t know” certain things; even that it has “beliefs”. Indeed, one of the most well-known prompt engineering techniques invites an LLM to “think step-by-step”!
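To make that last example concrete, here's a minimal sketch of what a "think step-by-step" prompt looks like in practice. The question and wording are hypothetical, and no particular model or API is assumed:

```python
# Hypothetical illustration of the "think step-by-step" prompting technique.
# This is just the text handed to an LLM; no particular model or API is assumed.
prompt = (
    "Q: A train leaves at 3pm and travels 120 km at 60 km/h. When does it arrive?\n"
    "A: Let's think step by step."
)
# The model is asked to continue this text; the continuation typically spells out
# intermediate steps ("120 km at 60 km/h takes 2 hours...") before a final answer.
```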
Of course, it’s possible to use words like “believe” and “know” without really thinking the thing in question has beliefs or knowledge. They’re convenient abstractions: for example, we might say “My phone thinks we’re still at home” when what we really mean is “My phone’s GPS hasn’t updated since leaving home”. But in the case of LLMs, I think the line between what we often say and what we mean is blurrier, perhaps because the primary way we interact with LLMs (or LLM-enabled tools) is through language.
Notably, considerable research on LLMs also conceives of LLMs as individual minds. In my post defending LLM-ology, I discussed two levels of analysis of LLMs, which I compared (roughly) to cognitive psychology and neurophysiology. Research on alignment often goes further, explicitly discussing LLMs or Artificial Intelligence (AI) systems more generally in agentive terms, i.e., the “goals” they will “try to pursue” and whether or not they have the same “values” as humans.
In a 2024 ACM paper, the philosopher Murray Shanahan argues that we ought to be careful about using words like “know” and “believe” to talk about LLMs. He suggests that while it may be appropriate to say LLMs contain or encode information about the world—much like an encyclopedia contains information—the LLMs themselves don’t necessarily know anything (again, like an encyclopedia). I think this is a really interesting debate that I’ll explore more in a future post, but my point here is just that the comparison to an individual human (“LLMs as humanlike agents”) is very much a live metaphor, and perhaps a dominant one, when it comes to LLMs.
Which raises the question: what else do people talk about when they talk about LLMs?
A partial accounting
In reading about LLMs, I’ve noticed some other common metaphors or analogies for what LLMs are. Some of these are deflationary (i.e., they suggest that LLMs are in some sense less impressive than they seem), while others are inflationary (i.e., they suggest that LLMs are or will be more impressive than they currently seem). These comparisons also highlight (and hide) different attributes of LLMs: their imitative properties, their training objective, their potential for harm, and more.
Below is my attempt to provide a rough taxonomy of some of these metaphors; I definitely don’t intend this list to be exhaustive, nor am I arguing that some of these are more common (or more correct) than others. My goal is just to explore the territory. Finally, I’ve tried to resist the urge towards systematization: it’d be great if the metaphors could be arranged in some neat, two-dimensional space with clearly interpretable axes, but the real world of discourse is probably not that simple.
LLMs as copies
One kind of analogy emphasizes the mimetic properties of LLMs. LLMs are trained on sequences of text, which they get better and better at predicting over the course of training. Even though LLMs may not always copy their training data outright (which itself is the source of considerable debate), it’s not unreasonable to think of this training process as a kind of “lossy imitation”.
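For readers who want the mechanics spelled out, here is a minimal sketch of the next-token prediction objective that training optimizes. It is not the training code of any real model (a toy embedding and linear layer stand in for a full transformer), but it shows the sense in which the model is rewarded for reproducing the sequences it sees:

```python
# A toy next-token prediction step (illustrative sizes, not a real model).
import torch
import torch.nn.functional as F

vocab_size, d_model = 100, 32                  # toy vocabulary and hidden size
embed = torch.nn.Embedding(vocab_size, d_model)
head = torch.nn.Linear(d_model, vocab_size)    # maps hidden states to vocabulary scores

tokens = torch.tensor([5, 17, 42, 8])          # a toy "sentence" of token ids
context, targets = tokens[:-1], tokens[1:]     # each position's target is the token that comes next

hidden = embed(context)                        # stand-in for a transformer's hidden states
logits = head(hidden)                          # one score per vocabulary item, per position
loss = F.cross_entropy(logits, targets)        # penalizes probability placed off the true next token
loss.backward()                                # gradients nudge the model toward better imitation
```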
The most famous example here is probably the “stochastic parrots” metaphor, which originates in a 2021 paper by Emily Bender, Timnit Gebru, Angelina McMillan-Major, and Margaret Mitchell. They write (pg. 617):
an LM is a system for haphazardly stitching together sequences of linguistic forms it has observed in its vast training data, according to probabilistic information about how they combine, but without any reference to meaning: a stochastic parrot.
The comparison to parrots is important here, because the authors are asserting that LLMs copy language without knowing what it means—like a parrot. This process is probabilistic, not exact: hence, “stochastic” parrots. Of course, human children might learn language in part by lossy imitation too, but the argument is that they have meanings (or communicative intents) to which they can attach those signals.
A similar analogical structure underlies Ted Chiang’s 2023 essay in the New Yorker, entitled “ChatGPT is a Blurry JPEG of the Web”. Chiang compares LLMs to a kind of lossy compression algorithm: their job is to take a vast training corpus (e.g., a bunch of text on the Internet) and compress it into some lower-dimensional (though still quite vast!) space of neural network parameters that nonetheless allows them to reconstruct the original text with relative accuracy. He writes:
Think of ChatGPT as a blurry jpeg of all the text on the Web. It retains much of the information on the Web, in the same way that a jpeg retains much of the information of a higher-resolution image, but, if you’re looking for an exact sequence of bits, you won’t find it; all you will ever get is an approximation. But, because the approximation is presented in the form of grammatical text, which ChatGPT excels at creating, it’s usually acceptable. You’re still looking at a blurry jpeg, but the blurriness occurs in a way that doesn’t make the picture as a whole look less sharp.
The notion that LLMs are like a “blurry JPEG” of the data they’re trained on isn’t too far from the truth. Yet there are multiple directions one could proceed from this starting point. One direction, which Chiang doesn’t take, would be to suggest that this property of “compressing input into lossy but useful representations with which to make future predictions” looks a lot like certain theories of cognition. Another direction—the one Chiang does take—is to suggest that this property of LLMs renders them not particularly useful or interesting:
In the meantime, it’s reasonable to ask, What use is there in having something that rephrases the Web? If we were losing our access to the Internet forever and had to store a copy on a private server with limited space, a large language model like ChatGPT might be a good solution, assuming that it could be kept from fabricating. But we aren’t losing our access to the Internet. So just how much use is a blurry jpeg, when you still have the original?
Chiang’s emphasis is on the “lossy” part of “lossy compression”, which is why he starts his piece with a story about increasingly blurry copies of copies:
Xerox photocopiers use a lossy compression format known as jbig2, designed for use with black-and-white images. To save space, the copier identifies similar-looking regions in the image and stores a single copy for all of them; when the file is decompressed, it uses that copy repeatedly to reconstruct the image. It turned out that the photocopier had judged the labels specifying the area of the rooms to be similar enough that it needed to store only one of them—14.13—and it reused that one for all three rooms when printing the floor plan.
A blurry JPEG is not exactly the same thing as a stochastic parrot, but both metaphor structures highlight the fact that LLMs produce outputs that resemble their inputs; they also both suggest that the fact that LLMs don’t copy perfectly is in some sense a bad thing, leading to degraded or untrustworthy outputs—when in fact, as I’ve written elsewhere, forgetting details of the input is seen by some as a necessary prerequisite to actual learning and generalization.
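As a rough sense of why the compression framing is tempting, here is a back-of-envelope sketch. The numbers are assumptions for illustration (a 70-billion-parameter model trained on ten trillion tokens), not measurements of any particular system, but they show that the weights occupy orders of magnitude less space than the text they were trained on, so a bit-for-bit copy is impossible and some "blurriness" is unavoidable:

```python
# Back-of-envelope comparison of training-corpus size vs. model size.
# All numbers are assumptions for illustration, not measurements of a real system.
corpus_tokens = 10e12        # assume ~10 trillion training tokens
bytes_per_token = 4          # assume ~4 bytes of text per token
params = 70e9                # assume a 70-billion-parameter model
bytes_per_param = 2          # assume 16-bit weights

corpus_bytes = corpus_tokens * bytes_per_token
model_bytes = params * bytes_per_param

print(f"training text: ~{corpus_bytes / 1e12:.0f} TB")    # ~40 TB
print(f"model weights: ~{model_bytes / 1e9:.0f} GB")      # ~140 GB
print(f"weights are ~{corpus_bytes / model_bytes:.0f}x smaller than the corpus")
```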
LLMs as simulators
A related analogy casts LLMs as simulators, i.e., systems that can adopt (or “perform”) various dialogic roles or identities depending on the context. Variants of this analogy have been presented in multiple places: “Simulators” by janus, a 2023 Nature Perspective by Murray Shanahan and co-authors, and a 2023 ACM paper by Joon Sung Park and co-authors—just to name a few.
Here’s how Shanahan and others describe it (bolding mine):
Now recall that the underlying LLM’s task, given the dialogue prompt followed by a piece of user-supplied text, is to generate a continuation that conforms to the distribution of the training data, which are the vast corpus of human-generated text on the Internet. What will such a continuation look like? If the model has generalized well from the training data, the most plausible continuation will be a response to the user that conforms to the expectations we would have of someone who fits the description in the preamble. In other words, the dialogue agent will do its best to role-play the character of a dialogue agent as portrayed in the dialogue prompt.
LLM-enabled chatbots like ChatGPT have been conditioned—through extensive training and feedback—to perform the role of a helpful assistant, but one could easily imagine other roles too. And as the authors point out, LLMs don’t pre-commit to performing a specific role. The “character” an LLM is performing might evolve over the course of an interaction:
Rather, it generates a distribution of characters, and refines that distribution as the dialogue progresses. The dialogue agent is more like a performer in improvisational theatre than an actor in a conventional, scripted play.
They suggest that we can think of LLMs as simulators “capable of role-playing an infinity of characters” or “generating an infinity of simulacra”. At any given point in an interaction, an LLM maintains a probability distribution over possible next tokens; the authors are analogizing that distribution to the “choices” available to the LLM, and the process of sampling from that distribution to “making a choice”. Once a choice has been made (i.e., a token has been sampled), the multiverse of possibilities available to the LLM collapses—analogous to role-playing the kind of character that would have generated that token.
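Concretely, the "choice" being analogized is just a draw from that next-token distribution. Here is a minimal sketch with made-up tokens and scores, not drawn from any particular model:

```python
# Sampling one continuation from a toy next-token distribution.
# The vocabulary and logits below are made up for illustration.
import numpy as np

rng = np.random.default_rng()
vocab = ["the", "a", "parrot", "alien", "."]
logits = np.array([2.0, 1.5, 0.7, 0.2, -1.0])   # model's scores for each candidate next token

def sample_next(logits, temperature=1.0):
    probs = np.exp(logits / temperature)
    probs /= probs.sum()                         # softmax: scores become a probability distribution
    return rng.choice(len(probs), p=probs)       # sampling "collapses" the distribution to one token

print(vocab[sample_next(logits)])                # higher-probability tokens are chosen more often
```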
The authors argue that this analogy is helpful for resisting the allure of anthropomorphism. The LLM itself “contains multitudes” but is also a passive system without agency or other mental states. The text it generates represents a particular character that this system has taken on, and we can (maybe) more appropriately refer to that character as having properties like “beliefs” or “desires”.
I initially lumped “LLMs as simulators” in with “LLMs as copies”, but I think the metaphors are subtly different. The simulator analogy implies a kind of flexibility on the part of LLMs with respect to the “roles” they can take on, and it emphasizes the dynamic nature of LLM output as the system is used to produce text. In contrast, the copy analogy evokes (at least to me) a more static system.
LLMs as crowds
LLMs are also sometimes conceptualized as harnessing the “wisdom of the crowd”, in part due to their vast training data. See, for example, this quote from a 2023 paper published in Cognitive Science:
The ‘minds’ of language models are trained on vast amounts of human expression, so their expressions can indirectly capture millions of human minds. Language models ‘express themselves’ with words elicited by human queries.
This use of the word “minds” implies something like collective intelligence, as opposed to an individual human agent. I used a similar metaphor in my 2024 paper entitled “Large Language Models and the Wisdom of Small Crowds”, which was explicitly framed as testing the hypothesis that LLMs capture the aggregate judgments of multiple individual humans:
Despite considerable debate (Crockett & Messeri, 2023; Harding et al., 2023), however, there remains a dearth of empirical evidence directly comparing the viability of LLMs to the de facto alternative: a sample of human participants. More precisely: do LLMs actually capture the wisdom of the crowd (Dillion et al., 2023)—and if so, what is the size of that crowd?
The metaphor was extended further in a recent preprint by Philipp Schoenegger and others exploring the ability of LLMs to make accurate forecasts. They find that aggregating the judgments of multiple LLMs improves upon each model’s individual judgments, concluding (bolding mine):
This replicates the human forecasting tournament’s ‘wisdom of the crowd’ effect for LLMs: a phenomenon we call the ‘wisdom of the silicon crowd.’
In each of these cases, the underlying conceit is that LLM-generated text somehow reflects the aggregate or even “average” of lots of humans. There are a few key features or entailments that this metaphor highlights. First, it emphasizes the vast size of LLM training data: the purported mechanism by which LLMs capture the “wisdom of the crowd” is that they are trained on the outputs of more than one human. Second, the wisdom of the crowd works best when the sample is diverse and unbiased, so this metaphor may suggest to some that LLM training data is also diverse and unbiased (which isn’t necessarily true). And third, if LLM outputs are “aggregates” of their inputs, then their outputs also reflect a flattening of their inputs—diverse or not. Just as the average of a distribution collapses across the distribution’s variance, so too (according to the metaphor) does an LLM collapse across the pockets of variance in its training data.
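A toy simulation makes the statistical intuition behind the metaphor explicit. The numbers below are arbitrary and not taken from any of the papers cited here: averaging many noisy judgments lands closer to the truth, but the spread across individuals disappears in the process.

```python
# A toy wisdom-of-the-crowd simulation with made-up numbers.
import numpy as np

rng = np.random.default_rng()
true_value = 100.0
judgments = true_value + rng.normal(0, 20, size=1000)       # 1,000 noisy "individual" judgments

individual_error = np.abs(judgments - true_value).mean()    # typical error of a single judgment
crowd_error = abs(judgments.mean() - true_value)            # error of the averaged judgment

print(f"typical individual error: {individual_error:.1f}")  # around 16
print(f"error of the crowd average: {crowd_error:.1f}")     # usually well under 2
print(f"spread across individuals (std): {judgments.std():.1f}")  # ~20, but the average is a single number
```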
Let’s compare this metaphor to the previous two metaphors. Like “LLMs as simulators”, this metaphor implies that there’s diversity in the training data—but unlike the simulators metaphor, the crowd metaphor implies that LLMs flatten that diversity. Like “LLMs as copies”, this metaphor captures the fact that LLMs work by reconstructing their training data—but the crowd metaphor articulates a specific mechanism by which that lossy reconstruction occurs (i.e., an average).
LLMs as inscrutable gods or aliens
The final analogy conceives of LLMs as inscrutable, powerful entities—not unlike gods or aliens. This comparison is the most clearly metaphorical, but it also reflects a broader movement of individuals who are deeply concerned about the potential existential risks posed by AI (or, in the case of “e/acc”, people who want to usher in the next technological age). It’s also applied to AI more broadly, not just to LLMs, so this section will include examples about AI in general.
For example, the computer scientist Scott Aaronson writes in a blog post (bolding mine):
For a million years, there’s been one type of entity on earth capable of intelligent conversation: primates of the genus Homo, of which only one species remains…Now there’s a second type of conversing entity. An alien has awoken—admittedly, an alien of our own fashioning, a golem, more the embodied spirit of all the words on the Internet than a coherent self with independent goals.
In this excerpt, LLMs are variously compared to aliens (inscrutable), golems (created by us to serve some purpose), and embodied spirits (a reflection or distillation of their training data). The mention of “species” also casts this discussion in terms of evolution and the alleged uniqueness of Homo sapiens, suggesting that we’ve somehow created our equal.
The journalist Ross Douthat, in a New York Times column from March 2023 (“The Return of the Magicians”), quotes Aaronson’s post and reframes our obsession with AI in terms of older tales about magic, spirits, and summoned demons (bolding mine):
In this sense what we’re doing resembles a complex incantation, a calling of spirits from Shakespeare’s “vasty deep.” Build a system that imitates human intelligence, make it talk like a person and answer questions like an encyclopedia and solve problems through leaps we can’t quite follow, and wait expectantly to see if something infuses itself into the mysterious space where the leaps are happening, summoned by the inviting home that we have made.
Such a summoning is most feared by A.I. alarmists, at present, because the spirit might be disobedient, destructive, a rampaging Skynet bent on our extermination.
But the old stories of the magicians and their bargains, of Faust and his Mephistopheles, suggest that we would be wise to fear apparent obedience as well.
Like Aaronson, Douthat is emphasizing the inscrutable aspects of LLMs (“leaps we can’t quite follow”). But he’s also connecting these systems—and the way we interact with them—to the idea of “summoning” spirits from some other realm using arcane magic. If you’re of a certain bent, then it’s not so hard to see certain practices that have developed around LLMs (e.g., prompt engineering) as a kind of “complex incantation”. We don’t necessarily have mechanistic explanations for why these practices seem to work, but we observe that they do work (at least by our limited ability to operationalize “working”), and so we continue to follow them. How else to describe this but as a kind of ritual or superstition, driven by the observed correlation between our own actions and some outcome?
The end of Douthat’s post also illustrates that concerns about “misalignment”—and that unintended behavior can emerge through sheer obedience—are very old. Humans have long warned about the dangers of creating a sentient system to serve us. Perhaps most famous is the story of the Golem of Prague, but there’s also Goethe’s The Sorcerer’s Apprentice (brought to the screen in Fantasia); and of course, warnings about creating artificial life more generally are even more widespread (e.g., Frankenstein). A more recent example of this can be found in the books of the Bartimaeus Sequence, which depicts magicians in a fictionalized England who summon djinn to do their bidding—but must take great care to give their instructions as clearly and unambiguously as possible, lest the demon intentionally misinterpret them. Crucially, all of these stories urge caution in the act of creation or “summoning”.
(Side note: I should point out that Douthat’s article is more about the language and cultural conception of LLMs—much like this article you’re reading now—than about the “kind of thing” they are. I think that differentiates it from the other examples in this section, like Aaronson’s comparison of LLMs to aliens.)
At the other end of the spectrum, Marc Andreessen’s “Techno-Optimist Manifesto” argues that we ought to embrace the promises of new technology, including AI (bolding mine):
We believe Artificial Intelligence is our alchemy, our Philosopher’s Stone – we are literally making sand think.
We believe Artificial Intelligence is best thought of as a universal problem solver. And we have a lot of problems to solve.
We believe Artificial Intelligence can save lives – if we let it. Medicine, among many other fields, is in the stone age compared to what we can achieve with joined human and machine intelligence working on new cures. There are scores of common causes of death that can be fixed with AI, from car crashes to pandemics to wartime friendly fire.
Here, the language once again harkens back to an “enchanted world” (“alchemy”, “Philosopher’s Stone”), but instead of urging caution, Andreessen emphasizes the potential pay-offs of building AI. I should note, though, that Andreessen doesn’t describe AI in terms of gods or aliens—in this case, he’s conceiving of AI as the alchemical process itself, by which something new emerges (“we are literally making sand think”).
There’s considerable variance, then, in how this metaphor system is realized, but there are also a couple of consistent themes that emerge. First, the metaphor tends to emphasize the inscrutability and novelty of AI systems, using terms like “magical”, “alien”, or “spirits”. Second, it emphasizes the potential power of these systems—either in terms of their potential danger to humanity, or their potential promise to solve important problems.
It also stands in pretty sharp contrast to the other metaphors we’ve discussed in that it abstracts away virtually all of the details of how LLMs are trained (predicting next or masked tokens), what they’re trained on (lots of text data from the Internet), what an LLM actually consists of (a bunch of weight matrices), and how they’re used to generate text (sampling tokens from a probability distribution conditioned on some context). Instead, the “LLMs as inscrutable gods or aliens” metaphor highlights the potential societal implications of LLMs and the way they fit into certain cultural scripts (such as the dangers of creation).
What do we learn from what we talk about?
Clearly, there are lots of ways to characterize large language models. None of these characterizations are literally correct: an LLM, at the end of the day, is not a blurry JPEG or a crowd, and it’s certainly not a summoned demon.
But I think we do learn something by looking at the way people talk about LLMs. The metaphors that people use—either intentionally or incidentally—inevitably highlight and hide different aspects of the thing itself. LLMs are trained to reproduce their training data, albeit in a lossy way (they are “copies”). They can also be prompted to exhibit pretty different behavior in different contexts—perhaps because they’ve learned a representation of the underlying generative process by which text is produced—and this process unfolds dynamically throughout repeated, successive sampling of tokens from an LLM (they are “simulators”). LLMs are also trained on much more data than any individual human encounters, so in some sense their outputs reflect the aggregate of lots of language producers (they are “the crowd”). And finally, LLMs are inscrutable black boxes that may hold both danger and promise for society (they are “gods”, “demons”, or “aliens”).
I’m not arguing here that one of these metaphors is better than another. But by looking at which metaphor someone uses, we can probably learn something about how they’re thinking about “the kind of thing” an LLM is. We might even think a bit more carefully about the metaphors we’re using ourselves, and whether our choice of framing is exerting a subtle influence on the direction of our thought.
Update (8/11/2024): After publishing this post, I came across this great article “Hunting for AI metaphors” by Brigitte Nerlich, which discusses some metaphors I covered (e.g., AI as an agent) and also some I didn’t cover (e.g., AI-generated content as pollution). I recommend it!
For example, the conceptual metaphor ARGUMENT IS WAR (“he shot down my argument”, “I defended my point”) seems to raise the salience of features like conflict and winners/losers. Lakoff and Johnson speculate about whether framing arguments in other terms (e.g., as a dance) might lead to different behaviors. This is also part of the premise of The Scout Mindset by Julia Galef, which suggests that we ought to view ourselves as scouts (i.e., charting out the territory) rather than soldiers (i.e., defending a position) in the context of an argument.
E.g., LLMs display impressive performance on tasks used to measure Theory of Mind, but some have argued that LLMs are a priori incapable of having Theory of Mind.
Very nice discussion. The LLMs as crowds metaphor ties to the joke:
An LLM walks into a bar. "What'll you have?" asks the bartender. The LLM looks around and replies "what is everyone else having?"
My current stance is the alien metaphor. Specifically I imagine an electrically complex gas cloud on Jupiter that after decades of listening to radio and TV begins generating and transmitting new episodes of "I Love Lucy", "All in the Family", etc. I accept LLM behavior as robust and flexible enough to count as intelligence, though a kind far different than our own.
Maybe this falls somewhere between copies and crowds in your categorization, but Alison Gopnik's framing of LLMs as a "cultural technology", i.e. like the Internet, libraries, and printed text that all serve to enhance cognitive capacity and aid knowledge transmission, is an interesting metaphor as well!
Her recent talk: https://www.youtube.com/watch?v=qoCl_OuyaDw