A time-domain network for laser speech bandwidth extension
29 March 2023
Jian Fang, Shenglai Zhen, Xin Chen, Shanjing Tao, Tao Lv, Benli Yu
Proceedings Volume 12594, Second International Conference on Electronic Information Engineering and Computer Communication (EIECC 2022); 125940Q (2023) https://doi.org/10.1117/12.2671462
Event: Second International Conference on Electronic Information Engineering and Computer Communication (EIECC 2022), 2022, Xi'an, China
Abstract
When speech is acquired by a laser microphone, its high-frequency components are lost because of non-additive distortion. In this paper, we propose an end-to-end speech bandwidth extension (BWE) approach to recover the narrow-band speech acquired by laser microphones. Our preliminary research showed that frequency-domain speech enhancement algorithms based on the log-magnitude spectrogram could not achieve satisfactory performance on this task. We therefore designed a time-domain speech BWE model based on a modified Wave-U-Net structure. We introduced a time convolution module (TCM), whose dilated convolutions enlarge the receptive field and improve the modeling of long-range correlations in speech. We also replaced the mean square error (MSE) with a multi-resolution STFT loss function (L_MSTFT). Because the Wave-U-Net method operates in the time domain, it avoids the decoupling of magnitude and phase that occurs in the frequency domain. The results show that the signal-to-noise ratio (SNR) of the speech is significantly improved compared with the frequency-domain approach, and that the model recovers more detailed high-frequency components than a frequency-domain convolutional recurrent network (CRN). We also tested the model on laser speech recorded in a real scene; spectral analysis of the output further verifies the practicality of the structure, which shows better performance and generalization ability than the original Wave-U-Net model.
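The abstract does not define the multi-resolution loss. As a minimal sketch, assuming it follows the commonly used multi-resolution STFT formulation (spectral convergence plus a log-magnitude distance, averaged over several frame sizes), the idea can be illustrated as follows. All function names, frame sizes, hop lengths, and the toy signals are illustrative assumptions, not the authors' settings. A naive DFT keeps the example dependency-free; a practical implementation would use an FFT routine.

```python
import cmath
import math

EPS = 1e-8  # avoids log(0) and division by zero


def stft_magnitude(signal, frame_len, hop):
    """Return a list of magnitude spectra, one per Hann-windowed frame."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    # Precompute DFT twiddle factors for bins 0 .. frame_len / 2.
    twiddle = [[cmath.exp(-2j * math.pi * k * n / frame_len)
                for n in range(frame_len)]
               for k in range(frame_len // 2 + 1)]
    spectra = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        spectra.append([abs(sum(f * w for f, w in zip(frame, row)))
                        for row in twiddle])
    return spectra


def stft_loss(estimate, target, frame_len, hop):
    """Spectral convergence plus mean log-magnitude distance (one resolution)."""
    est = stft_magnitude(estimate, frame_len, hop)
    ref = stft_magnitude(target, frame_len, hop)
    diff_sq = ref_sq = log_dist = 0.0
    count = 0
    for est_frame, ref_frame in zip(est, ref):
        for e, r in zip(est_frame, ref_frame):
            diff_sq += (r - e) ** 2
            ref_sq += r ** 2
            log_dist += abs(math.log(r + EPS) - math.log(e + EPS))
            count += 1
    spectral_convergence = math.sqrt(diff_sq) / max(math.sqrt(ref_sq), EPS)
    return spectral_convergence + log_dist / max(count, 1)


def multi_resolution_stft_loss(estimate, target,
                               resolutions=((32, 8), (64, 16), (128, 32))):
    """Average the single-resolution loss over (frame_len, hop) pairs."""
    losses = [stft_loss(estimate, target, n, h) for n, h in resolutions]
    return sum(losses) / len(losses)


# Toy signals (assumed for illustration): a "wideband" target with a low and
# a high tone, and a "narrowband" input that is missing the high tone.
SAMPLE_RATE = 8000
low = [math.sin(2 * math.pi * 300 * t / SAMPLE_RATE) for t in range(512)]
high = [0.5 * math.sin(2 * math.pi * 3000 * t / SAMPLE_RATE)
        for t in range(512)]
wideband = [a + b for a, b in zip(low, high)]
narrowband = low

loss_missing_band = multi_resolution_stft_loss(narrowband, wideband)
loss_perfect = multi_resolution_stft_loss(wideband, wideband)
```

In this toy case the narrowband input receives a positive loss because the missing 3 kHz tone is visible at every resolution, while identical signals receive a loss of zero. Comparing spectra at several frame sizes trades off time and frequency resolution, which a single sample-wise MSE cannot do.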
© (2023) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Jian Fang, Shenglai Zhen, Xin Chen, Shanjing Tao, Tao Lv, and Benli Yu "A time-domain network for laser speech bandwidth extension", Proc. SPIE 12594, Second International Conference on Electronic Information Engineering and Computer Communication (EIECC 2022), 125940Q (29 March 2023); https://doi.org/10.1117/12.2671462
KEYWORDS
Education and training
Signal to noise ratio
Convolution
Phase reconstruction
Image restoration
Laser frequency
Performance modeling