Building the boat while sailing: studying social interaction in 2026
I have - slowly but surely - been making progress toward a long-standing goal of mine: reading Tolstoy’s War and Peace. It’s far from the slog I remember from my last attempt (~10 years ago). Instead I find myself drawn in by the evocative portraits of the novel’s central characters, in particular the ornate descriptions of their beliefs and desires - Prince Andrei’s stoicism, Pierre’s search for purpose, Natasha’s romantic fervor - which color how they perceive the world.
This degree of interiority is striking. Rarely, it seems, do we have such unfettered access to the inner workings and traces of reasoning situated in other minds. But even if we don’t have Tolstoy on our shoulder to offer astute insights into our day-to-day interactions, we do have something else: the ability to infer otherwise inaccessible mental states through interactive conversation.
To see this, imagine helping a friend with a research talk. A key part of the presentation initially comes across as unclear, but after an extended back-and-forth with carefully targeted questions and clarifications, you both converge on the fleshed-out form of the idea. Crucially, this resolution can only be reached after multiple exchanges; it’s not a one-shot thing. More broadly, this example illustrates that while we cannot literally read each other’s minds, the affordance of language gives us a way to express thoughts, and a mechanism to coax them out through unfolding interaction.
For the past few years, I’ve been thinking about the cognitive principles that enable these dynamic forms of social interaction in intelligent agents: how is it that people can accomplish feats of joint action unparalleled by any other animal species, things like building a skyscraper, playing in a band or establishing the United Nations? Increasingly, I’ve also been thinking about how new, distinct forms of intelligence - namely, AI agents built on top of large language models - interact with people and with each other, and what might result from these encounters.
The obvious common thread across all these classes of interactions is the use of language to convey meaning and to shape others’ behavior. Yet for all the clear importance of interactive language use - both as a window into what people are thinking, and as an action-oriented tool for accomplishing goals - cognitive science is still very much in the early stages of studying it.
In the rest of this essay, I will lay out some thoughts about possible future directions for the science of interaction, broadly construed. My hope is that these directions can help us (in a small way) answer long-standing questions about the nature of human cognition, and provide some methodological grounding to study what is going on in our interactions with these new forms of intelligence.
The cognitive science of (human) social interaction
Human social interaction is largely realized through language; it’s the medium that we use to convey everything from immediate practical concerns to abstract ideas that transcend space and time. The corollary is that interaction is often the impetus for language use. If you perform a task only for yourself, there’s less of a reason to create an externalized trace of that in language, either spoken or written (although there are notable exceptions, such as journal entries in a diary). Accordingly, much of the language that we are exposed to reflects this explicitly functional purpose of transmitting information from one mind to another, or jointly constructing new knowledge by combining disparate knowledge distributed across multiple individuals.
There are academic fields devoted to the study of strategic interaction sans language (e.g., game theory). Within cognitive and psychological science, researchers studying nonlinguistic populations (very young children and nonhuman primates) have used techniques such as gaze analysis to study interactions. For full-fledged natural language conversational data, important methodological foundations have been established, perhaps most notably by practitioners of conversation analysis. But the surface has barely been scratched, even though language is arguably the richest behavioral data we can collect, the most high-resolution readout of otherwise inaccessible mental processes.
In fact, cognitive and psychological science has not really focused on the study of interaction, period. Even subareas of cognitive science studying social cognition have traditionally studied human sociality in a kind of asocial way, e.g. making inferences about the mental states of a single other person. Social psychology faces a similar dynamic, in that it’s often focused on documenting internal attitudes and beliefs; the unit of analysis is often not the interaction itself.
In this sense, psychology is the least social of the social sciences. Although this feels like an oversight, it’s hard to blame past researchers because historically, it has been methodologically very challenging to study interaction. Experiments in groups are more difficult to run than experiments with individuals, and the ensuing traces of interaction in language were hard to analyze at scale.
This latter challenge, of course, has now changed with LLMs. It is now feasible to analyze open-ended language in a very fine-grained way: language models can be used to identify specific attributes in language data with a high degree of precision, or to transform language into other formal representations more suitable for quantitative analysis. There is a lot of exciting work on new LLM-powered methods to extract structured signal from text data in principled ways.
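As a concrete (and deliberately minimal) sketch of what this kind of pipeline looks like, the snippet below maps open-ended dialogue turns onto discrete speech-act labels suitable for quantitative analysis. The rubric, prompt wording, and `query_model` stub are all illustrative stand-ins, not any particular model API; in practice `query_model` would call a real LLM.

```python
import json

# Illustrative rubric of speech acts; a real study would motivate
# these categories from theory (e.g., conversation analysis).
RUBRIC = ["question", "clarification", "proposal", "acknowledgment"]

def build_prompt(turn: str) -> str:
    """Ask the model to classify a dialogue turn into one rubric label."""
    return (
        "Classify the speech act of this dialogue turn.\n"
        f"Allowed labels: {', '.join(RUBRIC)}.\n"
        'Respond with JSON: {"label": "..."}.\n'
        f"Turn: {turn!r}"
    )

def query_model(prompt: str) -> str:
    # Placeholder standing in for a real LLM API call;
    # returns a canned response so the scaffold runs end to end.
    return '{"label": "clarification"}'

def annotate(turn: str) -> str:
    """Transform an open-ended utterance into a discrete, analyzable label."""
    raw = query_model(build_prompt(turn))
    label = json.loads(raw)["label"]
    if label not in RUBRIC:
        # Guard against off-rubric outputs, a common failure mode
        # when using LLMs as annotators.
        raise ValueError(f"model returned an off-rubric label: {label}")
    return label

turns = ["Wait, do you mean the second experiment?"]
annotations = [annotate(t) for t in turns]
print(annotations)
```

The design point is the validation step: constraining free-form model output to a fixed coding scheme is what turns an LLM from a text generator into a measurement instrument for interaction data.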
But there’s a strange sort of circularity: the very tools that facilitate the study of interaction are becoming objects of study in their own right. LLMs are interacting with human users through various chatbot products, and increasingly in agentic setups with people and other agentic systems. Tasks that have been traditionally thought of as solitary endeavors (e.g. programming) are becoming fundamentally interactive; modern software engineering now entails iterating on a spec through back-and-forths with agents.
I am far from the first person to point out that AI models can occupy multiple distinct roles in behavioral science, both as a tool and a model system (some examples here and here). But what I would argue is different about this particular phenomenon is that the scientific study of social interaction is uniquely inchoate, because of the historical blockers to studying it. We’re building the boat while sailing: we’re trying to develop tools to study these kinds of interactions as they are already changing our workflows and ultimately shaping how we think.
Understanding new types of interactions
The rise of interactive AI systems has revealed gaps in our understanding of interaction, broadly construed, and has created new challenges of shared interest across many fields. I see this as a call to arms and an opportunity for cognitive science. This presents a chance to develop new tools, frameworks, models and methods that not only help resolve longstanding theoretical questions about representation and computation, but can also be brought to bear on questions of broad societal relevance.
Just as the development of information technology and feedback systems during World War II exposed new fundamental questions about computation and led to scientific breakthroughs in information theory and cybernetics (control theory), today’s conversational AI systems can play a catalyzing role for the science of interactive intelligence, which transcends traditional disciplinary boundaries. In an ideal world, we’d work toward a science of interaction that offers a sufficiently broad and rigorous toolkit that is useful across all classes of interactions - purely human, purely AI and human-AI teams - and responsive to the ways in which these interactions will change, as model capabilities improve and human behavior changes.
As a starting point, it may be useful to reflect on what’s shared and what’s different across these classes of interactions. Of course, the involved agents themselves - and the process by which they acquire worldly knowledge - can be fundamentally different. Many have pointed out that the scale and content of training data differs across people and AI models. Yet there is one possible high-level commonality: similar to how a person might listen to a conversation between others, LLMs in pre-training learn from traces of interactive reasoning, where the learning agent is not an involved party in the interaction. Even before all the post-training explicitly designed to make LLMs more useful as “interactive” conversational agents, they are exposed to interactions between multiple individuals with distinct epistemic states exchanging information over an unfolding conversation, played out in comment threads and online forums. In this sense, both humans and AI models occupy the role of an observer embedded in a society of interacting agents, taking up the externalized traces of others’ exchanges.
The form factor of the interaction itself is also worth considering. As noted earlier, all classes of interactions are primarily realized through natural language - a point of commonality. Yet there are also clear differences, e.g. in the availability and latency of responses - a chatbot can answer instantaneously, without incurring a “cost” (in a psycholinguistic, not necessarily monetary, sense) to produce such a response, becoming a sort of assistant on demand.
Given this reduction in barriers, one of the most promising, although also potentially perilous, opportunities for interactive LLM use is as a tool for personalized learning. For most of human history, before widespread literacy, learning through the experiences of others was fundamentally interactive: there was no way to learn from another person other than being physically co-located and speaking directly with them. The advent of the written word changed that dynamic; all kinds of knowledge, from mathematics to poisonous mushroom species, became broadly accessible in static, written texts. Learning through AI interaction feels like a kind of interpolation between these different modes, in that it’s primarily realized through the familiar form of written text, yet the text is no longer static; it is generated through unfolding interaction with a teacher-like figure. In this sense, learning with LLMs is both an extension of existing paradigms and a qualitatively new phenomenon.
Of course, these are very high-level reflections on the structural similarities and differences across different classes of interactions - what would it mean to actually interrogate this rigorously? There’s much to be discussed, which will likely rehash broader methodological debates on experimental control vs. naturalism, and formal modeling vs. qualitative insight - but one thing that seems clear is the need to design experiments that expose open-ended interaction in the first place, or otherwise aggregate naturally-occurring datasets that reveal traces of interaction. This may seem like a trivial point, but at the same time, much remains to be done in investigating these phenomena through the lens of cognitive science. It feels more attainable - and more important - than ever to study this fundamental aspect of intelligence.
Thank you to Bill Thompson for feedback on earlier drafts, as well as to the many people who have helped me develop the ideas here!