Research on the Recognition and Application of Montreal Forced Aligner for Singing Audio

Authors

  • Jinyu Liu

DOI:

https://doi.org/10.54097/ohpdubg1

Keywords:

MFA, Singing audio, Phoneme alignment, Model training

Abstract

This paper examines whether the Montreal Forced Aligner (MFA) can produce phoneme-aligned time segments for singing audio. First, the open-source MFA model is applied to singing audio data to test its recognition performance. Next, the samples recognized with high accuracy are manually annotated and used to train an MFA model. Finally, the open-source MFA model is compared with the MFA model trained on singing audio data. The trained model performs significantly better, which confirms that MFA is a feasible tool for aligning singing audio.
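The abstract does not state how the two models were compared. As a minimal sketch of one possible measure, the Python below computes the fraction of predicted phoneme onsets that fall within a tolerance of the manually annotated onsets. The function name `boundary_accuracy`, the 20 ms default tolerance, and the sample intervals are all illustrative assumptions and are not taken from the paper.

```python
from typing import List, Tuple

# (phoneme label, start time in seconds, end time in seconds)
Interval = Tuple[str, float, float]


def boundary_accuracy(
    reference: List[Interval],
    predicted: List[Interval],
    tolerance: float = 0.02,  # assumed tolerance, not a value from the paper
) -> float:
    """Return the fraction of predicted phoneme onsets that lie within
    `tolerance` seconds of the corresponding reference onsets.

    Assumes both lists hold the same phoneme sequence in the same order,
    as is the case when a forced aligner is given the known transcript.
    """
    if not reference:
        raise ValueError("reference alignment is empty")
    if len(reference) != len(predicted):
        raise ValueError("alignments differ in number of phonemes")

    hits = 0
    for (ref_label, ref_start, _), (pred_label, pred_start, _) in zip(
        reference, predicted
    ):
        if ref_label != pred_label:
            raise ValueError(f"label mismatch: {ref_label!r} vs {pred_label!r}")
        if abs(ref_start - pred_start) <= tolerance:
            hits += 1
    return hits / len(reference)


# Hypothetical two-phoneme example: a short consonant followed by a long sung vowel.
manual = [("s", 0.00, 0.10), ("ow", 0.10, 0.90)]
aligned = [("s", 0.01, 0.15), ("ow", 0.15, 0.90)]

print(boundary_accuracy(manual, aligned))  # 0.5
```

Running the same function on the output of each model against the same manual annotations would give two directly comparable scores. Onset-only matching is a deliberate simplification; offsets or overlap ratios could be scored in the same way.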

Published

30-04-2024

Section

Articles

How to Cite

Liu, J. (2024). Research on the Recognition and Application of Montreal Forced Aligner for Singing Audio. Journal of Computing and Electronic Information Management, 12(3), 19-21. https://doi.org/10.54097/ohpdubg1