Studies of Human Cognition with Neural Language Models

Recollection versus Imagination: Exploring Human Memory and Cognition via Neural Language Models

ACL 2020

We started this project several years ago with a question: Might we employ advances in machine learning, particularly rich embeddings and neural language models built from large-scale language corpora, to explore the human cognitive processes behind the generation of narratives, with a focus on the language used in stories about events that were experienced versus imagined? Our intuition was that recalling an event and generating one might leave tell-tale traces in language, reflecting the different mixes of memory, reasoning, and projection that each process draws on.

In the methods and results reported here, we investigate the use of neural language models to characterize cognitive processes involved in storytelling, contrasting imagination and recollection of events. To facilitate this, we collect and release Hippocorpus, a dataset of 7,000 stories about imagined and recalled events.

We introduce a measure of narrative flow and use this to examine the narratives for imagined and recalled events. Additionally, we measure the differential recruitment of knowledge attributed to semantic memory versus episodic memory (Tulving, 1972) for imagined and recalled storytelling by comparing the frequency of descriptions of general commonsense events with more specific realis events.
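To make the idea of a narrative flow measure concrete, here is a minimal sketch in Python. It assumes that flow can be approximated by how much the preceding sentences help a language model predict the next sentence; this is an illustration under that assumption, and the paper's own formulation may differ. The names `sentence_flow_scores`, `story_flow`, and `toy_log_prob` are hypothetical, and the toy scorer is only a stand-in for a real neural language model.

```python
import math
from typing import Callable, List, Sequence

# Hypothetical scorer type: returns log p(sentence | context) under some
# language model. A real study would plug in a neural language model here.
LogProbFn = Callable[[str, Sequence[str]], float]


def sentence_flow_scores(sentences: Sequence[str], log_prob: LogProbFn) -> List[float]:
    """Per-sentence gain in log-likelihood from seeing the story so far.

    Assumption (not taken from the source): a larger score means the
    sentence follows more predictably from its preceding context.
    """
    scores = []
    for i, sent in enumerate(sentences):
        n_tokens = max(len(sent.split()), 1)  # length-normalize the score
        without_history = log_prob(sent, [])
        with_history = log_prob(sent, sentences[:i])
        scores.append(-(without_history - with_history) / n_tokens)
    return scores


def story_flow(sentences: Sequence[str], log_prob: LogProbFn) -> float:
    """Average the per-sentence scores into one value per story."""
    scores = sentence_flow_scores(sentences, log_prob)
    return sum(scores) / len(scores) if scores else 0.0


def toy_log_prob(sentence: str, context: Sequence[str]) -> float:
    """Toy stand-in for a language model: words already seen in the
    context are treated as more probable than unseen words."""
    seen = {tok.lower() for ctx in context for tok in ctx.split()}
    return sum(
        math.log(0.5) if tok.lower() in seen else math.log(0.1)
        for tok in sentence.split()
    )


if __name__ == "__main__":
    story = ["the dog ran", "the dog barked"]
    print(sentence_flow_scores(story, toy_log_prob))
    print(story_flow(story, toy_log_prob))
```

Passing the scorer in as a function keeps the measure independent of any particular model, so the same code could be reused to compare imagined and recalled stories under different language models.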

Our analyses show that imagined stories have a substantially more linear narrative flow than recalled stories, in which adjacent sentences are more disconnected. In addition, while recalled stories rely more on autobiographical events based on episodic memory, imagined stories express more commonsense knowledge based on semantic memory. Finally, our measures reveal the effect of narrativization of memories in stories (e.g., stories about frequently recalled memories flow more linearly; Bartlett, 1932). Our findings highlight the potential of using NLP tools to study the traces of human cognition in language.

ACL 2020 paper:

M. Sap, E. Horvitz, Y. Choi, N.A. Smith, J.W. Pennebaker. Recollection versus Imagination: Exploring Human Memory and Cognition via Neural Language Models, ACL 2020.

Dataset of recalled versus imagined events: Hippocorpus





Eric Horvitz

Technical Fellow and Chief Scientific Officer

James W. Pennebaker


Maarten Sap

PhD Student

Noah A. Smith


Yejin Choi