Abstract: Different statistical norms were applied to validate the quality of synthesized voices produced by an HTS-based Spanish synthesizer that uses Line Spectral Pair (LSP) and cepstral-coefficient parameterizations. The authors reach two conclusions. First, LSP parameterization can be as good as the standard Mel-Cepstral parameterization. Second, both parameterizations are still insufficient to qualify as natural-sounding speech synthesis.
Keywords: Speech Synthesis, Voice Parameterization, Line Spectral Pair, Mel-Cepstral Parameterization.