What Makes Sound Event Localization and Detection Difficult? Insights from Error Analysis

Bibliographic details
Title: What Makes Sound Event Localization and Detection Difficult? Insights from Error Analysis
Authors: Nguyen, Thi Ngoc Tho; Watcharasupat, Karn N.; Lee, Zhen Jian; Nguyen, Ngoc Khanh; Jones, Douglas L.; Gan, Woon Seng
Source: Proceedings of the Detection and Classification of Acoustic Scenes and Events 2021 Workshop, pp. 120-124
Publication year: 2021
Collection: Computer Science
Subject terms: Electrical Engineering and Systems Science - Audio and Speech Processing, Computer Science - Artificial Intelligence, Computer Science - Machine Learning, Computer Science - Sound, Electrical Engineering and Systems Science - Signal Processing
Description: Sound event localization and detection (SELD) is an emerging research topic that aims to unify the tasks of sound event detection and direction-of-arrival estimation. As a result, SELD inherits the challenges of both tasks, such as noise, reverberation, interference, polyphony, and non-stationarity of sound sources. Furthermore, SELD often faces an additional challenge of assigning correct correspondences between the detected sound classes and directions of arrival to multiple overlapping sound events. Previous studies have shown that unknown interferences in reverberant environments often cause major degradation in the performance of SELD systems. To further understand the challenges of the SELD task, we performed a detailed error analysis on two of our SELD systems, which both ranked second in the team category of DCASE SELD Challenge, one in 2020 and one in 2021. Experimental results indicate polyphony as the main challenge in SELD, due to the difficulty in detecting all sound events of interest. In addition, the SELD systems tend to make fewer errors for the polyphonic scenario that is dominant in the training set.
Comment: Accepted for the 6th Workshop on Detection and Classification of Acoustic Scenes and Events (DCASE), 2021
Document type: Working Paper
Access URL: http://arxiv.org/abs/2107.10469
Accession number: edsarx.2107.10469
Database: arXiv