Research on Intelligent Coding Optimization and Speech Denoising & Enhancement Methods for Audio Processing

Shuxuan Li

doi:10.54097/rqms7j12

Keywords:

Intelligent Audio Analysis, Adaptive Coding, Random Forest, Spectral Subtraction, Wiener Filtering, Speech Denoising.

Abstract

To address the trade-off between audio quality and file size, and the need to improve the intelligibility of noisy speech, this paper proposes an end-to-end audio processing scheme that integrates intelligent analysis, adaptive coding optimization, speech denoising, and quantitative evaluation. First, a Quality-Size Index (QSI) is constructed from 34 music and speech samples encoded with diverse coding parameters; a random forest regression model then identifies bit rate and compression algorithm as the core factors affecting coding performance. Second, a random forest classifier based on audio time-frequency features achieves 100% accuracy in music-speech classification, which enables adaptive coding optimization by matching the optimal parameters to each audio type. Finally, an optimized cascade of spectral subtraction and Wiener filtering is designed for denoising noisy speech for which no clean reference is available. Experimental results show that the adaptive coding scheme effectively balances audio quality and file size; the proposed denoising algorithm improves the SNR of two speech samples by 6.22 dB and 5.51 dB, respectively, significantly reduces spectral flatness, eliminates musical noise, and enhances speech intelligibility. The scheme provides an intelligent technical solution for audio coding and speech denoising that is of high practical value.
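
The spectral subtraction and Wiener filtering cascade named in the abstract can be illustrated for a single frame with a minimal sketch. It assumes the power spectrum of one noisy frame and a noise power estimate (for example, averaged over speech-free frames) are already available. The over-subtraction factor `alpha`, the spectral floor `beta`, and all function names are illustrative assumptions, not the parameters or the optimized variant used in the paper. The sketch also includes the standard spectral flatness measure (geometric mean over arithmetic mean of the power spectrum), one of the quantities the abstract reports.

```python
import math


def spectral_subtraction(noisy_power, noise_power, alpha=2.0, beta=0.01):
    """Power spectral subtraction for one frame.

    alpha: over-subtraction factor (assumed value).
    beta:  spectral floor as a fraction of the noisy power (assumed value);
           it keeps bins from going negative, which limits musical noise.
    """
    return [max(p - alpha * n, beta * p) for p, n in zip(noisy_power, noise_power)]


def wiener_gain(speech_power_est, noise_power):
    """Per-bin Wiener gain G = S / (S + N), always in the range [0, 1]."""
    return [s / (s + n) if (s + n) > 0 else 0.0
            for s, n in zip(speech_power_est, noise_power)]


def cascade_denoise_frame(noisy_power, noise_power, alpha=2.0, beta=0.01):
    """Stage 1 estimates the speech power; stage 2 turns it into a Wiener gain.

    The gain scales spectral amplitudes, so the output power is G^2 * P.
    """
    speech_est = spectral_subtraction(noisy_power, noise_power, alpha, beta)
    gain = wiener_gain(speech_est, noise_power)
    return [g * g * p for g, p in zip(gain, noisy_power)]


def spectral_flatness(power, eps=1e-12):
    """Geometric mean divided by arithmetic mean of a power spectrum."""
    log_mean = sum(math.log(p + eps) for p in power) / len(power)
    arith_mean = sum(power) / len(power)
    return math.exp(log_mean) / (arith_mean + eps)


if __name__ == "__main__":
    # Toy frame: one speech-dominated bin and one noise-dominated bin.
    noisy = [10.0, 1.0]
    noise = [1.0, 1.0]
    cleaned = cascade_denoise_frame(noisy, noise)
    print("output power:", cleaned)
    print("flatness before:", spectral_flatness(noisy))
    print("flatness after:", spectral_flatness(cleaned))
```

In a full pipeline these per-frame functions would sit between a short-time Fourier transform and its inverse, with the noisy phase reused for resynthesis; that framing step is omitted here to keep the sketch dependency-free.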
