Bibliographic details
Title: |
Investigating the Role of Speaker Counter in Handling Overlapping Speeches in Speaker Diarization Systems |
Authors: |
Duong, Thanh Thi-Hien, Nguyen, Phi-Le, Nguyen, Hong-Son, Duong, Ngoc Q. K. |
Publication information: |
Authorea, Inc. |
Publication year: |
2023 |
Collection: |
The Winnower (via CrossRef) |
Description: |
In real-life conversations, meetings, or debates, several people often speak at the same time, producing overlapping speech segments. Such overlapping speech is an extremely challenging problem for the speaker diarization task. The widely used clustering-based diarization approaches perform quite poorly in these situations because of their limited ability to handle overlapping speech. This paper investigates a speaker diarization framework into which a new building block, called a speaker counter, is integrated. The speaker counter predicts the number of active speakers in each analysis window of audio; its output is then used in the conventional re-segmentation step of the diarization pipeline to better label the active speakers in each segment considered. We also investigate, through theoretical analysis, the effect of the analysis window size on diarization performance. We claim that the speaker counter block ensures a lower diarization error rate when the analysis window is small enough. Experimental results obtained from two state-of-the-art diarization systems with different settings on two benchmark datasets, AMI Headset mix and DIHARD III, confirmed the effectiveness of the proposed approach. |
Document type: |
other/unknown material |
Language: |
unknown |
DOI: |
10.22541/au.169227844.46576615/v1 |
Availability: |
http://dx.doi.org/10.22541/au.169227844.46576615/v1 |
Accession number: |
edsbas.7AB86A3D |
Database: |
BASE |
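
The description above says that the speaker counter's per-window prediction is used during re-segmentation to label the active speakers. A minimal sketch of one way this could work follows. It is not the authors' implementation: the function name `assign_active_speakers`, the per-window speaker scores, and the top-k selection rule are all assumptions made for illustration.

```python
from typing import List, Sequence


def assign_active_speakers(
    scores: Sequence[Sequence[float]],
    counts: Sequence[int],
) -> List[List[int]]:
    """Label each window with its k highest-scoring speakers.

    scores[t][s] is a hypothetical per-window score for speaker s
    (for example, a similarity to that speaker's cluster); counts[t]
    is the number of active speakers predicted for window t by a
    speaker counter. Both inputs are assumptions of this sketch.
    """
    if len(scores) != len(counts):
        raise ValueError("scores and counts must cover the same windows")

    labels: List[List[int]] = []
    for window_scores, k in zip(scores, counts):
        # Clamp the predicted count to the range [0, number of speakers].
        k = max(0, min(k, len(window_scores)))
        # Rank speaker indices by descending score and keep the top k.
        ranked = sorted(
            range(len(window_scores)),
            key=lambda s: window_scores[s],
            reverse=True,
        )
        labels.append(sorted(ranked[:k]))
    return labels


# Three windows, three candidate speakers (illustrative values only).
scores = [[0.9, 0.1, 0.0], [0.6, 0.5, 0.1], [0.2, 0.3, 0.1]]
counts = [1, 2, 0]  # single speaker, overlap of two, silence
labels = assign_active_speakers(scores, counts)
print(labels)
```

In this sketch, a window with a predicted count of two receives two speaker labels, which a one-label-per-segment clustering output cannot express; a count of zero leaves the window unlabeled.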