Intra- and inter-speaker variability of sibilant fricative /s/ in Argentine Spanish
Keywords:
sibilant fricatives, speaker recognition, acoustic parameters, intra-speaker variability, inter-speaker variability
Abstract
This paper analyzes the discriminative power of the sibilant fricative /s/, with a view to incorporating this knowledge into future automatic speaker recognition systems. The selected fricative is the most frequent consonant in the corpus. The acoustic parameters of /s/ were ranked by favoring minimal intra-speaker variability and maximal inter-speaker variability. The evaluation uses Argentine Spanish voice samples from the SpeechDat database, recorded in a fixed-telephone environment. The best-ranked parameters were intensity, the third formant (F3), the first formant (F1), and the first spectral moment, or center of gravity (CG). Considered in isolation, the sibilant fricative /s/ yields a speaker recognition equal error rate (EER) 35% lower than the average over the 30 phonemes involved. This confirms the importance of /s/ for speaker discrimination: it ranks sixth among the phonemes, preceded by the vowels /e/, /a/, /o/ and /i/ and the nasal /n/.
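The ranking criterion described above (low intra-speaker variability, high inter-speaker variability) can be illustrated with a variance ratio. The sketch below is only an assumption about how such a ranking might be computed: the abstract does not state the exact statistic used, and the F-ratio, the function names `f_ratio` and `rank_parameters`, and all numeric values are hypothetical and chosen for illustration.

```python
from statistics import mean, pvariance


def f_ratio(samples_by_speaker):
    """Variance of the speaker means (inter-speaker) divided by the
    mean of the per-speaker variances (intra-speaker).
    NOTE: an assumed criterion, not necessarily the one used in the paper."""
    speaker_means = [mean(values) for values in samples_by_speaker.values()]
    between = pvariance(speaker_means)
    within = mean(pvariance(values) for values in samples_by_speaker.values())
    return between / within


def rank_parameters(data):
    """Return parameter names sorted from most to least discriminative."""
    scores = {name: f_ratio(by_speaker) for name, by_speaker in data.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Made-up measurements of /s/ tokens for two hypothetical speakers.
toy = {
    "CG_Hz": {
        "spk1": [4000.0, 4100.0, 3900.0],
        "spk2": [5000.0, 5100.0, 4900.0],
    },
    "duration_ms": {
        "spk1": [80.0, 120.0, 100.0],
        "spk2": [90.0, 130.0, 110.0],
    },
}

ranking = rank_parameters(toy)
print(ranking)
```

In this toy data the center of gravity separates the two speakers far more than it varies within each of them, so it ranks above duration. A parameter that varies as much within a speaker as across speakers would score near or below 1.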
License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
All articles published online by Estudios de Fonética Experimental are licensed under Creative Commons Attribution-NonCommercial-NoDerivs 4.0 International (CC BY-NC-ND 4.0), unless otherwise noted. Estudios de Fonética Experimental is an open access journal hosted by RCUB (Revistes Científiques de la Universitat de Barcelona) and powered by Open Journal Systems (OJS) software. Copyright is not transferred to the journal: authors hold the copyright and publishing rights without restrictions. Authors are free to use and distribute preprint and postprint versions of their articles. However, preprint versions are regarded as work in progress, used for internal communication with the authors, and the journal prefers that postprint versions be shared.