Research on the Recognition and Application of Montreal Forced Aligner for Singing Audio


  • Jinyu Liu



MFA, Singing audio, Phoneme alignment, Model training


This paper examines the feasibility of obtaining phoneme-level time alignments for singing audio with the Montreal Forced Aligner (MFA). First, the open-source MFA model is tested on singing audio data to assess its recognition accuracy. Next, samples that were recognized with high accuracy are manually annotated and used to train an MFA model. Finally, the recognition results of the open-source MFA model are compared with those of the MFA model trained on singing audio data. The trained model performs significantly better, which supports the feasibility of using MFA to align singing audio.
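The comparison described in the abstract can be illustrated with a small sketch. The abstract does not say which metric was used, so the measure below is an assumption: the fraction of phoneme onsets that fall within a fixed tolerance of the manually annotated onsets. The function name `boundary_accuracy`, the `Interval` layout, and the 20 ms default tolerance are all hypothetical choices made for illustration, not the author's method.

```python
from typing import List, Tuple

# Hypothetical layout: (phoneme label, start time in seconds, end time in seconds)
Interval = Tuple[str, float, float]


def boundary_accuracy(
    reference: List[Interval],
    predicted: List[Interval],
    tolerance: float = 0.02,  # assumed 20 ms tolerance, not taken from the paper
) -> float:
    """Return the fraction of predicted phoneme onsets that lie within
    `tolerance` seconds of the corresponding reference onsets.

    Assumes both lists hold the same phoneme sequence in the same order,
    as is the case when a forced aligner is given the reference transcript.
    """
    if len(reference) != len(predicted):
        raise ValueError("reference and predicted must have the same length")
    if not reference:
        return 0.0

    hits = 0
    for (ref_label, ref_start, _), (pred_label, pred_start, _) in zip(reference, predicted):
        if ref_label != pred_label:
            raise ValueError(f"label mismatch: {ref_label!r} vs {pred_label!r}")
        if abs(ref_start - pred_start) <= tolerance:
            hits += 1
    return hits / len(reference)


# Made-up intervals for illustration only; these are not data from the paper.
manual = [("s", 0.00, 0.10), ("a", 0.10, 0.50), ("n", 0.50, 0.60)]
aligned = [("s", 0.00, 0.15), ("a", 0.15, 0.51), ("n", 0.51, 0.60)]
score = boundary_accuracy(manual, aligned)  # two of the three onsets are within 20 ms
```

Computing this score once for the open-source model's output and once for the retrained model's output, against the same manual annotations, would give one simple way to quantify the improvement the abstract reports.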









How to Cite

Liu, J. (2024). Research on the Recognition and Application of Montreal Forced Aligner for Singing Audio. Journal of Computing and Electronic Information Management, 12(3), 19-21.
