Abstract

This paper describes our recent development of a computational cochlear-nucleus-like network model for studying speech encoding mechanisms in parts of the auditory system central to the auditory nerve. The network model is physiologically grounded and is designed to interface gracefully with a previously established cochlear model that incorporates a biophysically motivated dynamic nonlinearity in the basilar-membrane filtering function. The study addresses a longstanding issue in auditory research: how the unique spatial–temporal patterns observed in auditory-nerve data in response to high-level speech sounds, patterns shaped by the cochlea’s nonlinear filtering, can be transformed into a rate-place code in the auditory system in a physiologically plausible manner. Detailed neural mechanisms implemented in the model include neural inhibition, coincidence detection, short-term temporal integration of postsynaptic potentials for action potential generation, and a conjectural temporal-to-rate conversion mechanism that requires the membrane time constant of a neuron to decrease monotonically with the neuron’s characteristic frequency (CF). Model simulation experiments using both synthetic and natural speech utterances demonstrate that the auditory rate-place code constructed at the output of the network model reliably represents, with possible modification and/or enhancement, the prominent spectral characteristics of the utterances displayed in wideband spectrograms. Examples taken from the TIMIT corpus demonstrate that the rate-place code is effective across all classes of speech sounds, including vowels, liquids, fricatives, nasals, and stops.
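As an informal illustration of how a CF-dependent membrane time constant could turn input timing into firing rate, the following minimal sketch simulates a single leaky integrator driven by a periodic train of postsynaptic potentials. It is not the network model described here: the rule tau = k / CF, the pulse amplitude, the threshold, the reset-to-zero behavior, and all function and parameter names are assumptions chosen only for illustration.

```python
import math


def membrane_time_constant(cf_hz, k=1.0):
    """Hypothetical rule (assumption): tau = k / CF seconds,
    so tau decreases monotonically as CF increases."""
    return k / cf_hz


def spike_count(cf_hz, input_freq_hz, duration_s=0.1, dt=1e-5,
                psp_amp=0.7, threshold=1.0):
    """Count threshold crossings of a leaky integrator that receives one
    instantaneous PSP per input period (all values are illustrative)."""
    tau = membrane_time_constant(cf_hz)
    decay = math.exp(-dt / tau)            # per-step exponential leak
    steps_per_period = round(1.0 / (input_freq_hz * dt))
    n_steps = round(duration_s / dt)
    v = 0.0                                # membrane potential, arbitrary units
    spikes = 0
    for step in range(n_steps):
        v *= decay                         # leak toward rest
        if step % steps_per_period == 0:
            v += psp_amp                   # PSP arrives once per period
        if v >= threshold:
            spikes += 1                    # action potential generated
            v = 0.0                        # reset after the spike
    return spikes


if __name__ == "__main__":
    # Same 500 Hz input periodicity presented to units with different CFs.
    for cf in (250.0, 500.0, 2000.0):
        print(cf, spike_count(cf, input_freq_hz=500.0))
```

Under these assumptions, a unit whose time constant is long relative to the input period sums successive potentials and fires, whereas a high-CF unit with a short time constant lets each potential decay before the next arrives and stays silent. The same input timing therefore yields different firing rates at different places along the CF axis.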
