Learning-Based Personal Speech Enhancement for Teleconferencing by Exploiting Spatial-Spectral Features

Bibliographic Details
Title: Learning-Based Personal Speech Enhancement for Teleconferencing by Exploiting Spatial-Spectral Features
Authors: Hsu, Yicheng; Lee, Yonghan; Bai, Mingsian R.
Source: ICASSP 2022 - 2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
Publication Info: IEEE, 2022.
Publication Year: 2022
Subject Terms: FOS: Computer and information sciences, Sound (cs.SD), Audio and Speech Processing (eess.AS), FOS: Electrical engineering, electronic engineering, information engineering, Computer Science - Sound, Electrical Engineering and Systems Science - Audio and Speech Processing
Description: Teleconferencing is becoming essential during the COVID-19 pandemic. However, in real-world applications, speech quality can deteriorate due to, for example, background interference, noise, or reverberation. To solve this problem, target speech extraction from the mixture signals can be performed with the aid of the user's vocal features. Various features are accounted for in this study's proposed system, including speaker embeddings derived from user enrollment and a novel long-short-term spatial coherence (LSTSC) feature pertaining to the target speaker activity. As a learning-based approach, a target speech sifting network was employed to extract the relevant features. The network trained with LSTSC in the proposed approach is robust to microphone array geometries and the number of microphones. Furthermore, the proposed enhancement system was compared with a baseline system with speaker embeddings and interchannel phase difference. The results demonstrated the superior performance of the proposed system over the baseline in enhancement performance and robustness.
Comment: accepted by ICASSP 2022
DOI: 10.1109/icassp43922.2022.9746859
Access URL: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::43e3d70c853d586df1eae153ad279a23
https://doi.org/10.1109/icassp43922.2022.9746859
Rights: OPEN
Accession Number: edsair.doi.dedup.....43e3d70c853d586df1eae153ad279a23
Database: OpenAIRE
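
Note on the baseline feature: the description mentions interchannel phase difference (IPD) as the spatial feature of the baseline system. The following is a minimal sketch of how IPD between two microphone channels is commonly computed from short-time Fourier transforms. It is not the authors' implementation; the function names, the STFT settings (`n_fft`, `hop`), and the synthetic test signal are assumptions made for illustration. The LSTSC feature is not sketched here because this record does not define it.

```python
import numpy as np

def stft(x, n_fft=512, hop=256):
    """Hann-windowed STFT of a 1-D signal; returns shape (frames, n_fft // 2 + 1)."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack(
        [x[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=-1)

def interchannel_phase_difference(x_ref, x_other, n_fft=512, hop=256):
    """Phase of the reference channel minus phase of the other channel,
    per time-frequency bin, in radians wrapped to [-pi, pi]."""
    X_ref = stft(x_ref, n_fft, hop)
    X_other = stft(x_other, n_fft, hop)
    # The angle of the cross-spectrum equals the wrapped phase difference.
    return np.angle(X_ref * np.conj(X_other))

# Hypothetical two-microphone example: a 1 kHz tone that reaches the
# second microphone 2 samples later than the reference microphone.
fs = 16000
t = np.arange(4096) / fs
x_ref = np.cos(2 * np.pi * 1000 * t)
x_other = np.cos(2 * np.pi * 1000 * (t - 2 / fs))

ipd = interchannel_phase_difference(x_ref, x_other)
bin_1khz = int(1000 * 512 / fs)  # bin index 32 for n_fft = 512
print(ipd.shape, ipd[0, bin_1khz])
```

In this example the 2-sample delay at 1 kHz corresponds to a phase lag of 2π · 1000 · 2 / 16000 = π/4, so the IPD at the 1 kHz bin should be close to π/4 in every frame. Because IPD depends on the delay between specific microphone pairs, it is tied to the array geometry, which is the kind of dependence the description says the LSTSC-based network is robust to.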