Temporal encoding of the voice onset time phonetic parameter by field potentials recorded directly from human auditory cortex

Mitchell Steinschneider, Igor O. Volkov, M. Daniel Noh, P. Charles Garell, Matthew A. Howard

Research output: Contribution to journal › Article › peer-review

160 Scopus citations


Voice onset time (VOT) is an important parameter of speech that denotes the time interval between consonant onset and the onset of low-frequency periodicity generated by rhythmic vocal cord vibration. Voiced stop consonants (/b/, /g/, and /d/) in syllable initial position are characterized by short VOTs, whereas unvoiced stop consonants (/p/, /k/, and /t/) contain prolonged VOTs. As the VOT is increased in incremental steps, perception rapidly changes from a voiced stop consonant to an unvoiced consonant at an interval of 20-40 ms. This abrupt change in consonant identification is an example of categorical speech perception and is a central feature of phonetic discrimination. This study tested the hypothesis that VOT is represented within auditory cortex by transient responses time-locked to consonant and voicing onset. Auditory evoked potentials (AEPs) elicited by stop consonant-vowel (CV) syllables were recorded directly from Heschl's gyrus, the planum temporale, and the superior temporal gyrus in three patients undergoing evaluation for surgical remediation of medically intractable epilepsy. Voiced CV syllables elicited a triphasic sequence of field potentials within Heschl's gyrus. AEPs evoked by unvoiced CV syllables contained additional response components time-locked to voicing onset. Syllables with a VOT of 40, 60, or 80 ms evoked components time-locked to consonant release and voicing onset. In contrast, the syllable with a VOT of 20 ms evoked a markedly diminished response to voicing onset and elicited an AEP very similar in morphology to that evoked by the syllable with a 0-ms VOT. Similar response features were observed in the AEPs evoked by click trains. In this case, there was a marked decrease in amplitude of the transient response to the second click in trains with interpulse intervals of 20-25 ms.
Speech-evoked AEPs recorded from the posterior superior temporal gyrus lateral to Heschl's gyrus displayed comparable response features, whereas field potentials recorded from three locations in the planum temporale did not contain components time-locked to voicing onset. This study demonstrates that VOT is at least partially represented in primary and specific secondary auditory cortical fields by synchronized activity time-locked to consonant release and voicing onset. Furthermore, AEPs exhibit features that may facilitate categorical perception of stop consonants, and these response patterns appear to be based on temporal processing limitations within auditory cortex. Demonstrations of similar speech-evoked response patterns in animals support a role for these experimental models in clarifying selected features of speech encoding.

Original language: English (US)
Pages (from-to): 2346-2357
Number of pages: 12
Journal: Journal of Neurophysiology
Issue number: 5
State: Published - 1999

ASJC Scopus subject areas

  • General Neuroscience
  • Physiology

