Semantic dimensions of teaching that guide human reinforcement learning

Hernán Anlló (Language, Cognition, Computation & Respiration group (LCCR), Clinical and experimental respiratory neurophysiology lab, INSERM, Hôpital Pitié Salpêtrière)

Tuesday, April 14, 2026, at 2 pm, room 26-00/534

Language is central to teaching, yet we lack principled ways to characterise which aspects of instructional text make explanations effective, particularly when people learn from experience while simultaneously being taught. We combined a two-stage human behavioural paradigm with a large language model–based analysis pipeline. “Teacher” participants first learned probabilistic two-armed bandit tasks through trial and error, then wrote free-text lessons for future “Pupil” participants, who received one such lesson (or none) before performing the same tasks. Lessons judged as high quality by external experts improved pupils’ reinforcement learning performance relative to low-quality lessons and to no-instruction controls. To understand why, we introduced LLM-DISC, an inferential, multi-step use of large language model processing that uncovers latent semantic dimensions in instructional text. Four dimensions (Memorization, Pattern Recognition, Option Ranking, and Randomness) predicted both expert judges’ rankings of the lessons and pupils’ behavioural outcomes. Finally, by manipulating these dimensions in LLM-generated “Good” and “Bad” lessons, we causally altered pupils’ learning in a manner that replicated the original “human teacher” effects. These results show that a low-dimensional semantic structure of teaching language measurably shapes experiential reinforcement learning, and that this structure can be systematically discovered, interpreted, and used to probe the cognitive mechanisms underlying teaching and learning.
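For readers unfamiliar with the paradigm, the task described above can be sketched in a few lines of code. The sketch below is purely illustrative and not the authors' implementation: it simulates a pupil learning a probabilistic two-armed bandit via a standard delta-rule (Q-learning) update with softmax choice, and models the effect of a lesson, very crudely, as a bias on the pupil's initial value estimates. All parameter values (reward probabilities, learning rate, inverse temperature) are assumptions chosen for illustration.

```python
import math
import random

def run_pupil(p_reward=(0.8, 0.2), n_trials=200, alpha=0.3, beta=3.0,
              q_init=(0.5, 0.5), seed=0):
    """Simulate one 'pupil' on a probabilistic two-armed bandit.

    p_reward : hypothetical reward probability of each arm (arm 0 is better).
    q_init   : initial value estimates; a helpful 'lesson' can be modelled
               as initial values biased toward the better arm.
    Returns the fraction of trials on which the better arm was chosen.
    """
    rng = random.Random(seed)
    q = list(q_init)
    best_choices = 0
    for _ in range(n_trials):
        # Softmax choice between the two arms.
        w0 = math.exp(beta * q[0])
        w1 = math.exp(beta * q[1])
        choice = 0 if rng.random() < w0 / (w0 + w1) else 1
        # Probabilistic binary reward.
        reward = 1.0 if rng.random() < p_reward[choice] else 0.0
        # Delta-rule (Rescorla-Wagner / Q-learning) value update.
        q[choice] += alpha * (reward - q[choice])
        best_choices += (choice == 0)
    return best_choices / n_trials

# A pupil learning purely by trial and error:
naive = run_pupil(seed=1)
# A pupil whose lesson (crudely) pre-biases values toward the better arm:
taught = run_pupil(q_init=(0.9, 0.1), seed=1)
```

In this toy model, the "taught" pupil starts exploiting the better arm immediately, while the naive pupil must discover it through feedback, which is the behavioural contrast the lesson manipulation is designed to produce.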