Dissertation/Thesis

Robust and Interpretable Visual Perception Using Deep Neural Networks

Bibliographic Details
Title: Robust and Interpretable Visual Perception Using Deep Neural Networks
Authors: Wagner, Jörg
Contributors: Behnke, Sven, Gall, Jürgen
Publication Information: Universitäts- und Landesbibliothek Bonn
Publication Year: 2023
Collection: bonndoc - The Repository of the University of Bonn
Subject Terms: Deep Learning, Recurrent Neural Networks, Sensor Fusion, Semantic Forecasting, Semantic Segmentation, Multispectral Pedestrian Detection, Temporal Filtering, Interpretability by Design, Visual Explanation Methods, ddc:004
Description: Autonomous vehicles promise to revolutionize the transportation of people and goods by increasing road safety, reducing resource consumption, and improving quality of life. To achieve an unrestricted and large-scale deployment in the real world without any human supervision, many challenges still need to be solved. A key challenge is the robust perception and interpretation of the surroundings. Deep learning-based approaches have significantly advanced the creation of robust environment representations in recent years. However, further improvements are required, for example, to cope with difficult environment conditions (adverse weather, low lighting conditions, ...). In the first part of this thesis, we investigate approaches to improve the robustness of vision-based perception models. One promising approach is to fuse data of multiple complementary sensors. Building on previous deep learning-based pedestrian detectors operating on visible images, we develop a multispectral detector. Our detector combines the data of a visible and a thermal camera using a deep fusion network and provides significantly better results than comparable single-sensor models. To the best of our knowledge, this is the first work to use a deep learning-based approach for multispectral pedestrian detection. A complementary method for improving perception performance is the temporal filtering of information. The filtering task can be divided into a prediction and an update step. Initially, we explore the prediction step and propose an approach for generating semantic forecasting models by transforming trained non-predictive feed-forward networks. The predictive transformation is based on a structural extension of the network using a recurrent predictive module and a teacher-student training strategy. The resulting semantic forecasting architecture models the dynamics of the scene, enabling meaningful predictions.
Building on the knowledge gained, we design a parameter-efficient approach to temporally filter the representations of Fully ...
Document Type: doctoral or postdoctoral thesis
File Description: application/pdf
Language: English
Relation: info:eu-repo/semantics/altIdentifier/urn/urn:nbn:de:hbz:5-69413; info:eu-repo/semantics/altIdentifier/arxiv/1908.02686; https://hdl.handle.net/20.500.11811/10573
Availability: https://hdl.handle.net/20.500.11811/10573
Rights: In Copyright ; http://rightsstatements.org/vocab/InC/1.0/ ; openAccess
Accession Number: edsbas.67D0EAE6
Database: BASE