I’m trying out something new with The Counterfactual this year. As I described here, I’m offering a paid option, which allows paying subscribers to vote in monthly polls on the topics they’d like to see me cover; this month’s poll was released last week. (Note that, as has been the case for the last year or so, my goal is still to publish at least two free posts per month, each diving deep into a topic or reporting the results of an original, Counterfactual-funded study.)
With that in mind, let’s turn to the poll results.
Poll results
This month’s poll presented three options:
1. An empirical study using LLMs to quantify the readability of a text.
2. An empirical study on whether movies these days are more frequently about the past than they used to be.
3. An explainer piece on multimodal LLMs.
The winner was (1): an empirical study on LLMs and reading difficulty. That means you can expect to see the results of an original research study, conducted by me, sometime in February.
I’m excited to dig into a new literature. “Readability” is by no means an uncontested construct, and just defining what we mean is going to be part of the challenge. I’ve also found what looks to be a suitable corpus for exactly the task I have in mind: it was released as part of a broader “challenge” to build systems that predict human judgments of readability.
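For a sense of what “quantifying readability” has traditionally meant, here’s a minimal sketch of one classic non-LLM baseline, the Flesch Reading Ease formula. This is just my illustration of the kind of metric LLM-based approaches would be compared against, not necessarily what the study will use, and the syllable counter is a crude heuristic:

```python
import re

def count_syllables(word: str) -> int:
    # Rough heuristic: count groups of consecutive vowels (minimum one).
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_reading_ease(text: str) -> float:
    # Standard Flesch formula: higher scores = easier to read.
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(1, len(words))
    syllables = sum(count_syllables(w) for w in words)
    return 206.835 - 1.015 * (n_words / sentences) - 84.6 * (syllables / n_words)

print(flesch_reading_ease("The cat sat on the mat."))  # ~116: very easy
print(flesch_reading_ease(
    "Readability is by no means an uncontested construct."))  # ~30: harder
```

Part of what makes the construct contested is visible even here: a formula like this only sees sentence and word length, not whether a text is actually comprehensible, which is exactly the gap that human judgments (and, perhaps, LLMs) might fill.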
Now, because there were relatively few votes overall, and because “movies and the past” did receive some votes, my plan is to include that as an option in next month’s poll as well (among some other new options). So if you were rooting for that, or if there’s something else you’d like to see me explore, let me know—I’m always happy to hear new ideas.
What ChatGPT thinks you think
Just on a whim, I asked ChatGPT to predict how my subscribers would answer. This was my prompt:
Me: I'm running a poll on my Substack about which piece to write. Which of the following options do you think subscribers will choose? In general, my audience is interested in cognitive science and language models.
I then copied/pasted the description of each option from the original post. Here’s how ChatGPT responded:
ChatGPT: Predicting the preferences of your audience involves a bit of guesswork, especially since all three topics are intriguing and relevant to cognitive science and language models. However, based on the general interest in your field, I can venture a guess:
LLMs and the reading level of a text [study]: This topic seems to hit right at the core of what your audience is interested in – the practical application and analysis of language models. It combines technical detail with a practical, real-world application, which could be very appealing. I'd estimate a 45% chance of this being the top choice.
Movies and the past [study]: While this topic is interesting and connects to a broader cultural discussion, it may be slightly less appealing because it veers away from the more technical and direct aspects of cognitive science and language models. It's still relevant but might not be the primary interest of your audience. I'd say there's a 25% chance of this being selected.
Multimodal language models, explained [review]: This topic is extremely relevant and timely, given the recent advancements in MLMs. It's technical and directly in line with interests in language models and cognitive science. It also touches on philosophical and technical aspects, which could be very engaging for your readers. I'd estimate a 30% chance of this being the most popular choice.
Remember, these are just estimates based on the general interests of an audience keen on cognitive science and language models. Your audience's specific interests could sway these percentages significantly.
Readability wins again, though by a smaller margin than in the actual poll.
This isn’t really evidence of much: it’s a single data point, and one that doesn’t even control for the order in which the options were presented. If anything, it suggests that my description of the readability study is the closest match to what my audience is interested in (presumably because it involves both LLMs and a real-world application). That said, I’ll be curious to see whether ChatGPT consistently predicts each month’s poll.
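For what it’s worth, if this were to become a recurring experiment rather than a one-off, the order effect would be easy to control for programmatically. Here’s a hedged sketch using the OpenAI Python API; to be clear, I just used the ChatGPT web interface, and the model name and option texts below are placeholders:

```python
# Sketch: ask the model to predict the poll winner several times,
# shuffling the option order on each trial to control for position effects.
# Assumes the official `openai` package and an OPENAI_API_KEY env variable.
import random
from openai import OpenAI

client = OpenAI()

options = [
    "LLMs and the reading level of a text [study]",
    "Movies and the past [study]",
    "Multimodal language models, explained [review]",
]

for trial in range(3):
    random.shuffle(options)  # vary presentation order across trials
    prompt = (
        "I'm running a poll on my Substack about which piece to write. "
        "My audience is interested in cognitive science and language models. "
        "Which of the following options do you think subscribers will choose?\n\n"
        + "\n".join(f"- {opt}" for opt in options)
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    print(f"Trial {trial + 1}:", response.choices[0].message.content)
```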
What else to expect
In addition to the readability study coming out in February, I’ve got a few other articles in the pipeline, including:
- A post on the history of Artificial Intelligence and how it’s intertwined with the history of Cognitive Science. I’m teaching a class on LLMs and Cognitive Science, so I’ve been thinking a lot about that history, and you can expect at least one post delving into it.
- A bigger-picture post on learning. I wrote a bit about learning (and forgetting) in my recent post about the New York Times lawsuit, but I’m also interested in the idea that “learning” can happen across all sorts of time-scales, from an individual organism’s lifespan to cross-generational “learning” via natural selection.
- A reader survey. Now that this newsletter has a larger audience, I’m interested in learning more about who reads it, so I’m preparing a survey about your backgrounds and interests.
Thanks again to all of you who read the newsletter—I’m always grateful for your feedback on what you like, what you’d like to see more of, or what you disagree with.
"To think is to forget a difference, to generalize, to abstract." - J.L. Borges
"I’ve been thinking a lot about the history of Artificial Intelligence and how it’s intertwined with the history of Cognitive Science—so you can expect at least one post delving into that."
Wonderful!
As a newcomer to all this, I'm trying to figure out the taxonomy hierarchy/web of all the terms -- cogsci, AI, machine learning, etc.