A spatiotemporal hierarchy of surprisal sensitivity in the human brain listening to speech

Understanding speech typically feels effortless and automatic, despite substantial noise and ambiguity in the acoustic signal. One way the brain achieves robust understanding is by integrating the content of speech with its predictability. I will present ongoing work probing the brain’s sensitivity to speech content and predictability at different levels of linguistic abstraction. Leveraging the spatiotemporal precision of intracranial electroencephalography (iEEG), we map neural populations responsive to speech content and (un)predictability (‘surprisal’) in the brains of participants listening to continuous speech. Using acoustic and text-based transformer models, together with a large-scale linguistic corpus, we compute surprisal estimates at the acoustic, phoneme, and word levels. With encoding models, we then identify neural populations sensitive to surprisal at these different levels of abstraction and examine their spatiotemporal distribution. We find evidence of a spatial hierarchy of surprisal sensitivity, ranging from early acoustic regions to more posterior temporal regions, with increasing left lateralisation. We also identify a temporal hierarchy of surprisal sensitivity within the superior temporal gyrus. Finally, somewhat counterintuitively, we find that word surprisal estimates from larger LLMs are worse at predicting neural responses to speech.
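
Surprisal is the negative log probability of a unit given its preceding context, -log2 P(w_t | w_<t). The abstract does not name the specific transformer models or corpus used, so the following is only a minimal sketch of word-level surprisal extraction, assuming a generic HuggingFace causal language model ("gpt2" here purely for illustration):

```python
# Sketch: token-level surprisal from a causal language model.
# The models used in the actual work are not named in the abstract;
# "gpt2" and the nats-to-bits conversion are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def token_surprisals(text: str) -> list[tuple[str, float]]:
    """Return (token, surprisal-in-bits) pairs: -log2 P(w_t | w_<t).

    The first token has no left context, so it receives no estimate.
    """
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(input_ids=ids).logits            # (1, T, vocab)
    # Log-probability of each token under the model, given its left context.
    logprobs = torch.log_softmax(logits[0, :-1], dim=-1)
    targets = ids[0, 1:]
    nats = -logprobs[torch.arange(targets.numel()), targets]
    bits = nats / torch.log(torch.tensor(2.0))          # convert nats -> bits
    tokens = tokenizer.convert_ids_to_tokens(targets.tolist())
    return list(zip(tokens, bits.tolist()))

print(token_surprisals("the quick brown fox jumps over the lazy dog"))
```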
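
Encoding models of the kind mentioned above typically regress a time-lagged feature time series onto each electrode's response and score prediction accuracy on held-out data. The authors' exact pipeline is not specified in the abstract; the sketch below uses simulated data, and the lag window, ridge penalty, and scoring choice (held-out correlation) are all assumptions:

```python
# Sketch: a time-lagged linear encoding model mapping a surprisal regressor
# to a simulated electrode response. Illustrative only; not the authors' pipeline.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold

def lagged_design(feature: np.ndarray, n_lags: int) -> np.ndarray:
    """Stack shifted copies of a 1-D feature so the regression can weight
    neural responses at different delays (0 .. n_lags-1 samples)."""
    T = feature.shape[0]
    X = np.zeros((T, n_lags))
    for lag in range(n_lags):
        X[lag:, lag] = feature[: T - lag]
    return X

rng = np.random.default_rng(0)
T, n_lags = 5000, 40                        # samples; ~400 ms of lags at 100 Hz
surprisal = rng.exponential(size=T)         # stand-in surprisal time series
true_trf = np.exp(-np.arange(n_lags) / 10)  # simulated temporal response profile
X = lagged_design(surprisal, n_lags)
y = X @ true_trf + rng.normal(scale=0.5, size=T)  # one simulated electrode

# Cross-validated prediction accuracy: correlation between predicted and
# observed held-out responses, a common encoding-model score.
scores = []
for train, test in KFold(n_splits=5).split(X):
    fit = Ridge(alpha=1.0).fit(X[train], y[train])
    scores.append(np.corrcoef(fit.predict(X[test]), y[test])[0, 1])
print(f"mean held-out r = {np.mean(scores):.3f}")
```

In a real analysis, separate regressors for each level of abstraction (acoustic, phoneme, word) would enter the same design matrix, and an electrode's sensitivity to a given surprisal level can be assessed by how much that regressor improves held-out prediction.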