Application of subspace-based CNN speech noise reduction model in medical field
8 June 2023
Xian Fu, Sisi Dong, Weiqi Zhou, Zhuzhu Zhang, Ruiwen Ye, Ting Ren
Proceedings Volume 12707, International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2023); 127074J (2023) https://doi.org/10.1117/12.2680947
Event: International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2023), 2023, Changsha, China
Abstract
To improve the noise reduction capability and recognition efficiency of speech recognition in medical devices, this paper adopts an improved subspace-based CNN noise reduction model. A subspace projection module with two mutually orthogonal self-attention mechanisms is added to the original model: the feature vectors extracted by the convolutional layers are fed to the two self-attention layers, which learn projections orthogonal to each other and produce a noise embedding vector and a speech embedding vector, from which cleaner speech is extracted. The improved model is compared with the CNN noise reduction model in terms of distortion, segmental signal-to-noise ratio (SNR), and quality at different signal-to-noise ratios to demonstrate its reliability. The results show that the overall distortion level is reduced by about 7.27%, the overall segmental SNR is increased by about 15.27%, and quality is improved by about 8.7% compared with the traditional CNN noise reduction model.
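Segmental SNR, one of the metrics named in the abstract, is commonly defined as the mean of per-frame SNR values in dB between a clean reference and the denoised output. The following is a minimal sketch of that common definition, not the authors' evaluation code. The function name `segmental_snr`, the frame length of 256 samples, and the clamping range of -10 dB to 35 dB are assumptions chosen for illustration; the abstract does not specify them.

```python
import math


def segmental_snr(clean, enhanced, frame_len=256, min_db=-10.0, max_db=35.0):
    """Mean per-frame SNR in dB between a clean reference and an enhanced signal.

    frame_len, min_db and max_db are illustrative defaults (assumptions),
    not values taken from the paper.
    """
    if len(clean) != len(enhanced):
        raise ValueError("signals must have the same length")

    eps = 1e-10  # guards against log(0) and division by zero
    frame_snrs = []

    # Walk over non-overlapping frames; a trailing partial frame is dropped.
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        s = clean[start:start + frame_len]
        e = enhanced[start:start + frame_len]
        signal_energy = sum(x * x for x in s)
        error_energy = sum((x - y) ** 2 for x, y in zip(s, e))
        snr_db = 10.0 * math.log10((signal_energy + eps) / (error_energy + eps))
        # Clamp each frame so silent or perfectly matched frames do not dominate.
        frame_snrs.append(min(max(snr_db, min_db), max_db))

    if not frame_snrs:
        raise ValueError("signal is shorter than one frame")
    return sum(frame_snrs) / len(frame_snrs)


if __name__ == "__main__":
    # Synthetic example: a sine wave and a copy attenuated by 10 percent.
    clean = [math.sin(2 * math.pi * 5 * n / 256) for n in range(1024)]
    attenuated = [0.9 * x for x in clean]
    print(round(segmental_snr(clean, attenuated), 3))
```

Averaging in the dB domain per frame, rather than computing one global ratio, weights quiet and loud segments more evenly, which is why the metric is often reported alongside distortion and quality scores when comparing denoising models.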
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Xian Fu, Sisi Dong, Weiqi Zhou, Zhuzhu Zhang, Ruiwen Ye, and Ting Ren "Application of subspace-based CNN speech noise reduction model in medical field", Proc. SPIE 12707, International Conference on Image, Signal Processing, and Pattern Recognition (ISPP 2023), 127074J (8 June 2023); https://doi.org/10.1117/12.2680947
KEYWORDS
Denoising
Signal to noise ratio
Speech recognition
Performance modeling
Education and training
Distortion
Feature extraction
