LaMIT - Lexical access Model for Italian

Vision

The goal of this research project is twofold. First, it proposes to apply the Lexical Access model conceived by Stevens [Stevens, K. N. (2002). “Toward a model for lexical access based on acoustic landmarks and distinctive features,” J. Acoust. Soc. Am., 111(4):1872–1891] to the Italian language. Second, it aims to develop an Italian Lexical Access system, i.e., a speech recognizer designed to imitate the process by which a listener derives the words intended by a speaker. Modeling this process requires a hypothesis on how lexical items are stored in memory. Stevens’ model postulates that lexical items are stored in memory according to distinctive features, and that these features are hierarchically organized. The model highlights the importance of abrupt acoustic events, named landmarks, in the perception process: landmark detection is primary in human perception and corresponds to the first phase of recognition. The temporal region around each landmark is then further processed by the listener. The expected high-impact outcomes of the work are:

The Lexical Access model has so far been applied only to American English. Applying the model to a different language may lead to a better understanding of its underlying universal, language-independent nature.
To the best of our knowledge, no system of this kind exists for Italian. The development of an Italian speech recognizer based on the detection of landmarks and acoustic features, together with the possibility of providing access to data and algorithms, may establish a reference for the speech community in Europe and abroad, given the project's potential to provide supporting evidence for universal strategies of speech perception.

An ambitious goal of the project is to understand whether this process involves language-independent mechanisms, and to distinguish them from language-dependent ones. Our vision postulates that lexical objects are stored in memory according to distinctive features that are hierarchically organized in three classes, roughly summarized as follows: a) articulator-free features, mainly referring to the manner of articulation, whose acoustic correlates are abrupt acoustic variations named landmarks, and whose detection corresponds to the first phase of recognition by a listener; b) primary articulator-bound features, reflecting the presence of an active articulator and providing further information on the place of articulation; c) secondary articulator-bound features, related to the active adjustment of articulators that are not implied by the primary features (mainly the larynx, the soft palate and the tongue body). Our hypothesis is that cues to landmarks may be language-independent. Evidence supporting universal strategies of speech perception would give the project a substantial impact.
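The three-class organization of distinctive features described above can be illustrated with a minimal sketch. It is not the project's implementation: the class name `Segment`, the function `agreement_depth`, and the specific feature names and values assigned to /p/, /b/, /t/ and /m/ are assumptions chosen for illustration only. Each segment holds one feature bundle per class, and two segments are compared top-down, stopping at the first class where they differ.

```python
from dataclasses import dataclass
from typing import Dict

# Order of the three feature classes, from first to last in the hierarchy.
LEVELS = ("articulator_free", "primary_bound", "secondary_bound")


@dataclass
class Segment:
    """One sound segment as three bundles of binary features (illustrative)."""
    articulator_free: Dict[str, bool]   # manner of articulation
    primary_bound: Dict[str, bool]      # active articulator (place)
    secondary_bound: Dict[str, bool]    # larynx, soft palate, tongue body


def agreement_depth(a: Segment, b: Segment) -> int:
    """Count how many feature classes agree, checked top-down (0 to 3)."""
    depth = 0
    for level in LEVELS:
        if getattr(a, level) != getattr(b, level):
            break
        depth += 1
    return depth


# Hypothetical feature values, simplified for the example.
STOP = {"consonant": True, "sonorant": False, "continuant": False}
NASAL_STOP = {"consonant": True, "sonorant": True, "continuant": False}

P = Segment(STOP, {"lips": True}, {"stiff_vocal_folds": True, "nasal": False})
B = Segment(STOP, {"lips": True}, {"stiff_vocal_folds": False, "nasal": False})
T = Segment(STOP, {"tongue_blade": True}, {"stiff_vocal_folds": True, "nasal": False})
M = Segment(NASAL_STOP, {"lips": True}, {"stiff_vocal_folds": False, "nasal": True})

if __name__ == "__main__":
    for name, seg in (("p", P), ("b", B), ("t", T), ("m", M)):
        print(f"/p/ vs /{name}/: {agreement_depth(P, seg)} class(es) agree")
```

Under these assumed values, /p/ and /b/ agree on manner and place and differ only in the secondary class, while /p/ and /m/ already differ in the articulator-free class. A lexical entry could then be represented as a sequence of such segments.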
The acoustic analyses carried out as part of the work leading to the scientific articles listed below were performed using the xkl software tool, developed by Dennis Klatt at the Massachusetts Institute of Technology.

Scientific outcomes of the project

J. Arango, S. Yao, A. DeCaprio, S. Baik, S. Shattuck-Hufnagel and M.-G. Di Benedetto, “Estimation of the Frequency of Occurrence of Italian Phonemes,” 179th Meeting of the Acoustical Society of America, Chicago, Illinois, 8-12 Dec. 2020.
M.-G. Di Benedetto, J.-Y. Choi, S. Shattuck-Hufnagel, L. De Nardis, S. Budoni, J. Vivaldi, J. Arango, A. DeCaprio and S. Yao, “Speech recognition of spoken Italian based on detection of landmarks and other acoustic cues to distinctive features,” 179th Meeting of the Acoustical Society of America, Chicago, Illinois, 8-12 Dec. 2020.
