"Algorithmic entombment", explore-exploit trade-offs, and serendipity
Do our recommendation systems doom us to homogeneity?
I’m impressed with Spotify’s Discover Weekly playlist. Over the last few months, I (or perhaps Spotify) have “discovered” several fantastic songs and musicians I’m unlikely to have heard otherwise: Alabaster DePlume (specifically, the album To Cy & Lee: Instrumentals Vol. 1), La Femme (specifically, the song It’s Time to Wake Up), and Joose Keskitalo (specifically, the song TIETOISUUS). Somehow, these recommendations felt perfectly selected for me––not just in general, but for the particular week when I listened to them.
Yet recommender systems aren’t always so impressive––sometimes they drift towards a kind of homogenization. Amazon often recommends items I’ve already purchased, Instagram relentlessly shows me reels of black cats1, and even the content of a Discover Weekly playlist can start to sound repetitive at times.
I’m certainly not the first person to notice this tendency. In How To Do Nothing, Jenny Odell calls this algorithmic entombment: an evocative phrase that, at least for me, conjures up a future in which the systems we build to improve our lives end up imprisoning us instead––kindly, benevolently––in a kind of predictable monotony. The notion of algorithmic entombment stuck with me, and I’ve since been thinking about how it connects to the issues of technology-induced terraforming I’ve written about before.
Because you liked…
Recommender systems are a way to get us to do something, based on things we’ve already done. This applies to all manner of digital media we might consume: news articles, Instagram posts, Netflix shows, Spotify songs.
In some sense, I think data-driven recommender systems are a positive development. I don’t think it’s a stretch to say Discover Weekly has tangibly improved my life: music makes my emotional life richer and more substantive––it makes me happy, wistful, melancholic––and discovering new music adds new textures to that landscape.
And importantly, people are different. It makes sense to try to “get to know” which things a given person will like most, rather than serving everyone the same exact recommendations or ads (talk about terraforming!).
But as Odell and others have noted, there’s still something inherently limiting about this approach. Recommender systems, by definition, narrow the space of inputs we receive. They help us make sense of the blooming, buzzing confusion of potential media; they help us select an action from a space of near-infinite possibility. But the way in which they do this is by observing what we’ve done before and assuming that we’ll continue to do similar things. In doing so, they lead us down a path––pleasantly, conveniently––that essentially forces our future to look like our past.
And of course, if the goal is to make better predictions, then a recommender system will be most successful if it makes us more predictable in turn. For example, perhaps Spotify’s algorithm identifies certain attractors in song-space that exert a kind of gravitational pull: once a user is sucked into their orbit, they tend to sample more and more songs of that nature. In theory, then, it’s in the algorithm’s best interest––in terms of satisfying its objective function––to lead people to those attractors, such that their behavior becomes increasingly predictable.
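To make that feedback loop concrete, here is a deliberately crude toy simulation. It is emphatically not how Spotify works; every genre name and probability below is invented. The point is only to show how an exploit-only policy, paired with a mostly agreeable listener, can collapse a year of listening into a single attractor.

```python
import random
from collections import Counter

random.seed(0)  # for reproducibility
GENRES = ["jazz", "folk", "electronic", "classical", "pop"]
history = Counter()  # how many times the listener has played each genre

def recommend():
    """Exploit-only policy: recommend whatever the listener has played most."""
    if not history:
        return random.choice(GENRES)
    return history.most_common(1)[0][0]

# The listener is mildly agreeable: they accept 90% of recommendations
# and wander off on their own the other 10% of the time.
for week in range(52):
    genre = recommend() if random.random() < 0.9 else random.choice(GENRES)
    history[genre] += 1

print(history)  # typically, one genre dominates: an attractor has formed
```

Whichever genre gets an early lead tends to keep it; the recommender’s predictions get better precisely because the listener’s behavior gets narrower.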
Again, I should note that this isn’t worse than just recommending the same things to everybody. It’s always important to consider the counterfactual scenario. I prefer a world of data-driven recommendations to a world where we’re all force-fed the same music diet.
But it’s still terraforming. And importantly, it’s terraforming of a particular kind. If Odell and others are right, this kind of terraforming makes our experiences more homogeneous, more self-similar, all in the service of providing predictably good content––a process that brings to mind the explore-exploit trade-off.
To explore or to exploit?
Researchers in decision-making––as well as machine learning, evolutionary biology, and more––often reference the notion of an exploration-exploitation trade-off.
Any kind of decision-making agent faces this problem: should they choose what they already know (“exploit”) or gamble on some new, possibly better, alternative (“explore”)? This trade-off shows up most obviously in a foraging scenario, but it’s been applied to all manner of decision-making situations. It’s the fundamental tension between taking risks vs. playing it safe. Should you order the dish you’ve had before or try something new? Should you go to the same coffee shop you go to every Saturday or try the new place that opened up down the street? Should you live where you grew up or try out a different place?
And importantly, there are benefits and disadvantages to each approach.
Exploiting a known reward guarantees pay-off: you know what you’re going to get. At the same time, if you exploit the first “reward” you encounter, it’s pretty likely you’ll miss other, better rewards. For example, if you only ever eat the same dish at the same restaurant, you’re missing out on all sorts of potential meals you might love even more.
Exploration, then, is important. It’s good to try new things, because there’s some possibility that they’ll bring you joy. But it’s also risky. You might end up having a terrible meal, watching a bad movie, or moving to a city that makes you miserable. You might invest your money in a new venture that goes bust.
Hence, the trade-off. Too much exploitation leads to a life of sub-optimal monotony; too much exploration drives up your risk. You need to find a balance of some kind––and exactly what that balance entails depends on an agent’s goals and interests.
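For the technically inclined: the canonical formalization of this trade-off is the multi-armed bandit problem, and the simplest textbook strategy is “epsilon-greedy”, which flips a weighted coin and explores only on the rare heads. Here is a minimal sketch; the dishes and their “true” enjoyment values are invented for illustration.

```python
import random

# Invented payoffs: the "true" enjoyment of each option, unknown to the diner.
true_enjoyment = {"usual dish": 0.70, "new curry": 0.55, "chef's special": 0.85}

EPSILON = 0.1  # fraction of choices spent exploring
estimates = {dish: 0.0 for dish in true_enjoyment}  # running averages of observed enjoyment
counts = {dish: 0 for dish in true_enjoyment}

def choose():
    """Epsilon-greedy: usually exploit the best-known option, occasionally explore."""
    if random.random() < EPSILON or not any(counts.values()):
        return random.choice(list(true_enjoyment))  # explore: try something at random
    return max(estimates, key=estimates.get)  # exploit: pick the current favorite

for _ in range(1000):
    dish = choose()
    reward = random.gauss(true_enjoyment[dish], 0.1)  # noisy enjoyment on a given night
    counts[dish] += 1
    estimates[dish] += (reward - estimates[dish]) / counts[dish]  # incremental mean

print(max(estimates, key=estimates.get))  # with enough exploration: "chef's special"
```

Set EPSILON to zero and the diner eats whatever they tried first, forever; set it to one and every meal is a gamble. Everything interesting lives in between.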
The future is like the past, and so on.
Right now, it seems to me that recommender systems lean towards the “exploitation” side of this trade-off.
The goal of a recommender system is to predict what you’ll like and suggest it to you. This is an intrinsically challenging problem: human beings are fickle creatures, and our likes and dislikes often feel idiosyncratic. But recommender systems resolutely try to infer these preferences nonetheless, making the critical assumption that they can be modeled as a function of our past behavior and the past behavior of others who behave similarly to us.
The simplest form of this assumption is that if my friend and I both like the same ten movies, then there’s a good chance I’ll like other movies my friend has seen but I haven’t. By assuming some degree of shared preferences, a system can “fill in the gaps”. Even if I haven’t read The Factory, I’ve read and enjoyed similar books to other people who have read and enjoyed The Factory, so there’s a good chance I’ll like it too (and I did).
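That gap-filling intuition is the heart of user-based collaborative filtering. Here is a stripped-down sketch of the idea; the ratings matrix is invented, and real systems are far more sophisticated (among other things, this version naively averages neighbors’ raw ratings rather than correcting for how generously each person rates).

```python
import numpy as np

# Invented ratings for illustration: rows are readers, columns are books.
# 0 means "hasn't read it"; 1-5 are ratings. I'm reader 0; The Factory is book 3.
ratings = np.array([
    [5, 4, 5, 0],  # me: haven't read The Factory yet
    [5, 4, 4, 5],  # a reader with similar tastes who loved it
    [1, 2, 1, 2],  # a reader with very different tastes
])

def similarity(a, b):
    """Correlation between two readers, computed over books both have rated."""
    mask = (a > 0) & (b > 0)
    if mask.sum() < 2:
        return 0.0
    a_c = a[mask] - a[mask].mean()
    b_c = b[mask] - b[mask].mean()
    denom = np.linalg.norm(a_c) * np.linalg.norm(b_c)
    return float(np.dot(a_c, b_c) / denom) if denom else 0.0

def predict(user, item):
    """Predict a rating as a similarity-weighted average over like-minded readers."""
    sims, vals = [], []
    for other in range(len(ratings)):
        sim = similarity(ratings[user], ratings[other])
        if other != user and ratings[other, item] > 0 and sim > 0:
            sims.append(sim)
            vals.append(ratings[other, item])
    return np.average(vals, weights=sims) if sims else None

print(predict(user=0, item=3))  # 5.0: my taste-twin's rating carries all the weight
```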
Critics of this approach will note that it leads to a kind of insulation––we’re never forced to stray outside our comfort zones in preference-space, whether we’d like to or not. In predicting what we’ll like, recommender systems assume that we are, in the end, predictable––even mechanical. But this means that we might be missing entire regions of preference-space that would otherwise make us very happy.
Where, in this worldview, would something like serendipity come into play?
What happened to serendipity?
Serendipity––the development of events by chance in a fortuitous way––is a funny concept. First, even the notion of “chance” is a little odd: if you believe in a roughly causal, mechanistic view of the world, as I mostly do, then there are simply things that happen given the appropriate physical conditions. And second, serendipity is often associated with something being particularly meaningful. That is, even though this event was “random” in some sense, we also feel that it was somehow “meant to happen” and that it therefore carries larger significance for the shape of our lives––yet it also feels important that it wasn’t planned to happen.2
And at least for me, it’s really hard to deny the joy of this sense of significance. Recently, my partner and I were visiting Toronto; on our second-to-last day, I was exploring Chinatown and noticed a marquee at El Mocambo announcing that the band Monsieur Periné was playing that very evening. Both of us like the band a lot, though we don’t follow them closely enough to know when and where they’re touring. It seemed almost impossible, then, that they were playing a show that very night, while we were still in town. It seemed so much more likely that we might’ve chosen to leave the night before, or that the show would’ve taken place the following night, or that I wouldn’t have walked down that street at all and noticed the marquee in the first place. We ended up going, and it was an amazing concert––perhaps my favorite ever––and an important part of that experience was the sense of serendipity.
Now, I recognize that we’re prone to attributing meaning to what’s essentially random noise. But as I said above, there’s something quite lovely about the feeling of that randomness somehow being tailored for me––not necessarily by design, per se, but as if it were somehow meant to be that way.
Finding some of those songs on Discover Weekly felt serendipitous as well. Not all of them, certainly. But there were a few––the ones I mentioned at the beginning of this article––that felt like they perfectly fit the moment in which I listened to them.
On (not) automating serendipity.
Could Spotify reliably––say, once a week––trigger that feeling of serendipity? Would we want it to?
My first intuition here is that a more exploratory recommender system could, as the name implies, sample from broader and more diverse regions of preference-space. That is, rather than homing in on a relatively circumscribed attractor of your preferences––e.g., “Latin jazz from the last decade”––it would occasionally throw in “wild-cards”. These wild-card recommendations could either be genuinely random or simply recommendations the system has lower confidence in. Over time, such a system would be less likely to calcify and more likely to serve up surprises.3
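Here is a sketch of what I have in mind, with every name and parameter invented for illustration: fill most of a playlist with the system’s best guesses, but reserve a slice for wild-cards sampled from further down its ranking.

```python
import random

def weekly_playlist(scored_songs, n=30, wildcard_share=0.2):
    """Mostly high-confidence picks, plus a few wild-cards from the long tail.

    scored_songs maps each candidate song to the system's confidence (0 to 1)
    that the listener will enjoy it. This interface is hypothetical.
    """
    ranked = sorted(scored_songs, key=scored_songs.get, reverse=True)
    n_wild = int(n * wildcard_share)
    safe_picks = ranked[: n - n_wild]  # exploit: the system's best guesses
    long_tail = ranked[n - n_wild :]   # everything else, including low-confidence songs
    wildcards = random.sample(long_tail, min(n_wild, len(long_tail)))  # explore
    playlist = safe_picks + wildcards
    random.shuffle(playlist)  # don't quarantine the wild-cards at the end
    return playlist
```

Turning up wildcard_share makes the playlist more surprising and less reliable: the explore-exploit dial, exposed as a single parameter.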
The first problem that occurs to me––and I suspect that writers like Odell might agree––is that programming an “explore routine” is itself a form of algorithmic entombment. As soon as something is written down in code, perhaps it tends towards a kind of fossilization or “living taxidermy” (the charge Jane Jacobs leveled at city planners). It’s hard for me to say whether this is a real problem or not. Even if the exploratory routine is set down in code, the whole point is that it samples more widely from preference-space (e.g., more possible songs), so it seems to me that the recommendations themselves are less likely to become entombed.
The other problem, which I think is more serious, is that randomness simply isn’t the same thing as serendipity. Serendipity is the feeling we get when a seemingly chance event is imbued with a kind of special meaning. In my view, this feeling is not purely a function of the event itself––it’s some combination of the event, our state of mind, and the context surrounding the event. To think otherwise is to conflate the effect of a stimulus with the effect of its context.
The feeling of serendipity I experienced in Toronto wasn’t really about the concert itself. The music was lovely, of course, but that feeling emerged because of the context of that concert––happening upon the marquee during my walk in Chinatown; the knowledge that had we left a day earlier, we would’ve missed the concert entirely; the brief moment of indecision as I considered not going (it would mean a late night, and I was already tired), and the satisfaction we felt when we decided to just go ahead and buy the tickets; riding the train to Queen’s Park station and walking over to El Mocambo; waiting inside the venue, slightly anxious that the band wouldn’t show up in the end (they were late); the energy of the crowd around us; even the journey home.
When I think about it, the same is true for those Spotify recommendations that now hold a special place in my heart. The songs, I’m sure, are well-matched to my musical preferences, and I probably would’ve enjoyed them on any day. But each of them was recommended in a context that felt particularly well-suited, and I can remember each of those contexts quite well even now, months after the fact. For example, I first heard TIETOISUUS by Joose Keskitalo while walking around the field near our apartment; the color of the sky was that perfect shade of dark blue just after the sun has set, which often makes me feel a kind of bittersweet melancholy. The song, too, is melancholic; I don’t understand the lyrics––it’s in Finnish––but the combination of acoustic guitar, horns, and Keskitalo’s voice evoke a similarly pleasant sadness.
I don’t see how Spotify could reliably trigger such a feeling with its recommendations. It’s a losing game: our feelings about an event––or a song, a movie, a book––aren’t just about the thing itself. They reflect who we’re with, where we are, what we’re thinking about when we encounter or experience that event––a million thoughts and sensations that somehow sum up to us.
Spotify doesn’t, shouldn’t, and perhaps simply can’t have access to that level of detail about us––it’s the stuff of life.
Enjoyment and the thing itself.
I’ve strayed a bit from the subject.
Some readers might think I’ve missed the point. I began with the question of how to avoid homogenization and ended up arguing that no recommendation can reliably trigger a sense of meaning, since much of that feeling is produced by the context of an experience rather than the experience itself. But I’d argue that the latter argument is relevant to the former question. Would a more exploratory recommender system stave off the fossilization of our preferences? Perhaps, to some extent. But there’s a sense in which this misses the deeper point: the goal, ultimately, is to create a system that brings people enjoyment––and it’s important to understand that enjoyment is not just about the thing itself.4
Some readers might also think I’m being unfair to recommender systems and those who build them. That’s not my intent; I fully acknowledge that this is a hard problem. I also want to be clear that I’m not suggesting it’s somehow wrong to build these systems. People are going to navigate the space of aesthetic possibilities one way or another, and leveraging large datasets to make targeted recommendations is a clever way to help them navigate it. The alternative, after all, could very well be even more homogeneous: the droning monotony of a single television channel.
But I also think it’s worth thinking about alternative approaches to sifting through the blooming, buzzing confusion of stimuli.
The case for and against curation.
One alternative is curation: relying on the advice of someone with extensive experience in a domain, whose guidance we trust. Some of my most cherished memories have involved a kind of “curated experience”: a sake tasting in Kyoto, Japan and a port tasting in Petaluma, California; listening to a jazz album with my father. Curation is also, in my opinion, the main utility of taking an academic seminar with an expert in a field––the whole point of a curriculum is to curate a set of papers, readings, and discussion points in such a way that you end up with a deeper understanding of that subfield than if you tried to navigate it alone. And indeed, there are plenty of “recommender systems” that rely on the curation mechanism: fivebooks.com asks experts in a domain to recommend five books on a topic, along with their explanations; similarly, the Ezra Klein Show typically ends by asking guests to recommend three books that have influenced them; Wirecutter delivers curated product recommendations; Substack allows writers to recommend other Substacks.
Curation has its own problems. For one, it’s labor-intensive. Then again, I happen to think the kind of labor involved isn’t necessarily something we should seek to automate: the people giving these recommendations seem to enjoy the act of curation. I’d also note that even “automated” approaches to recommendation rely on labeled data of some kind (e.g., user reviews).
Second, the reliance on “expert opinion” might smack of elitism to some. I don’t see how this could be avoided, nor do I think reliance on expertise is a bad thing––people have different levels of expertise in different domains, and curation necessarily involves expertise of some kind. The problem, if there is one, is perhaps that some domains are seen as more worthy of curation than others (i.e., they’re awarded higher status) and similarly, some people are wrongly seen as having more expertise than others (i.e., because of superficial features––like class, race, gender, etc.––irrelevant to the domain itself). I think that’s a real issue––we should, obviously, aim to provide curated experiences for a larger set of domains and from a larger pool of curators––but I’m unfortunately not sure there’s an easy fix here.5
The vampire problem.
I’m going to conclude with something that may seem at first completely off-topic.
The philosopher L.A. Paul has written extensively about so-called transformative experiences. These are experiences that––as the name implies––fundamentally transform us in such a way that our very preferences and worldview are hard to reconcile with the “us” that existed before the experience. For example, prior to becoming a vampire, we might think that we wouldn’t enjoy being a vampire; we might even be horrified at the thought of drinking blood. But once we turn into a vampire, our preferences change: suddenly drinking blood sounds not so bad––maybe even quite good––and we have a great time flying around and turning invisible.6
The point here is that there are some experiences that change who we are. This makes it difficult to make rational decisions about the so-called optimal choice in some situations––decision theory assumes some amount of stability in our preferences, but if the act of making a decision can reshape those preferences, it’s unclear how we can weigh up the pros and cons. Paul, along with others like Russ Roberts, has argued that many experiences in life are like this, such as having kids. Before we have kids, perhaps we worry that we’ll miss certain activities we enjoy now (like going out at night, traveling, etc.) and resent the new obligations (like staying in, making lunch for our kid, etc.). And while that’s probably true to some extent, I’ve also heard anecdotally from many parents that they started enjoying things they never thought they’d enjoy so much. Put simply, their preferences shifted.
How does all this relate to algorithmic entombment?
My goal here is to emphasize the dynamic, ever-changing nature of the self. As time passes, we change in various ways––some small, some large. We can change in ways we never thought imaginable.
There are ways in which we stay stable too, and those constancies of spirit should be treasured.
But we should also see change as intrinsic to who we are and what we’re capable of, and thus resist all the more an entombment that threatens to turn us from warm-blooded creatures into cold, living taxidermy.
1. Which, to be honest, I’m not really complaining about.
2. This lack of design is an important feature of serendipity and something I’ll return to later in the post.
3. I should note that this is perhaps how some commercial recommender systems already work.
4. If I’m right, that is.
5. It’s certainly not a problem I’m going to solve in a short Substack essay not aimed directly at addressing the issue.
6. Depending, I guess, on what affordances we’re granted.