Research of Speech Style Transfer Based on Neural Network
DOI:
https://doi.org/10.54097/ajst.v1i3.390
Keywords:
Image style transfer, Speech style transfer, Convolutional neural network, 2D spectrogram.
Abstract
This paper draws inspiration from neural style transfer, an image style transfer model, and applies the idea to speech style transfer based on neural networks. First, it describes how a 2D spectrogram is extracted from a speech signal. Then, a speech style transfer model based on a convolutional neural network is constructed.
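The abstract does not give the details of the spectrogram extraction, so the following is only a minimal sketch of one common approach: a short-time Fourier transform with a Hann window, followed by log compression of the magnitudes. The function name `log_spectrogram`, the frame length, the hop size, and the use of `log1p` are all assumptions made for illustration and are not taken from the paper. A naive DFT is used to keep the example dependency-free; a practical implementation would use an FFT routine.

```python
import cmath
import math


def log_spectrogram(signal, frame_len=64, hop=32):
    """Return a 2D list [frame][frequency bin] of log-magnitudes.

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    # Hann window reduces spectral leakage at the frame edges.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    n_bins = frame_len // 2 + 1  # keep non-negative frequencies only
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        row = []
        for k in range(n_bins):
            # Naive DFT coefficient for frequency bin k.
            coeff = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n, x in enumerate(frame))
            # Log compression (assumed here) evens out the dynamic range.
            row.append(math.log1p(abs(coeff)))
        frames.append(row)
    return frames


# Hypothetical test signal: a pure tone centred on DFT bin 8.
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(512)]
spec = log_spectrogram(tone)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
```

The result is a time-by-frequency grid that can be treated like a single-channel image, which is what lets a convolutional network designed for image style transfer operate on speech.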