Dissertation / Thesis

Offline reinforcement learning for ambulance dispatch

Bibliographic details
Title: Offline reinforcement learning for ambulance dispatch
Authors: Lamarca Ferrés, Enric
Contributors: Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Stephan Robert, Martín Muñoz, Mario
Publisher: Universitat Politècnica de Catalunya
Publication year: 2023
Collection: Universitat Politècnica de Catalunya, BarcelonaTech: UPCommons - Global access to UPC knowledge
Subject terms: UPC subject areas::Computer science::Artificial intelligence, Reinforcement learning, Data sets, Offline Reinforcement Learning, Imitation Learning, Behavioural Cloning, Conservative Q-Learning, Behaviour Regularized Actor-Critic, Q-Learning, Deep Neural Networks, Ambulance Dispatch, Random Search, Dataset Building
Description: This master's thesis focuses on applying offline reinforcement learning techniques to the ambulance dispatch problem, which involves selecting the most appropriate ambulance to dispatch when an incident occurs. The research is part of the SIA-REMU project, conducted in both France and Switzerland. Each incident has an associated priority level: incidents of priority 0 are vital emergencies, incidents of priority 1 are non-vital emergencies, and incidents of priority 2 are non-emergencies. The primary objective of the presented work is to train reinforcement learning agents capable of prioritizing incidents appropriately when dispatching ambulances. Initially, a dataset of experiences was constructed using data provided by the Centre de régulation du Centre Hospitalier Universitaire Vaudois (CHUV), which contains valuable information about incidents, interventions, and resources. This dataset of experiences served as a static dataset for training reinforcement learning agents in an offline setting, without interacting with an environment. State-of-the-art offline reinforcement learning algorithms were employed to train the agents, and their hyperparameters were tuned by performing a random search. To evaluate and test the trained agents, a virtual environment was implemented. Finally, the policies learned by the agents were analyzed to draw meaningful conclusions from the obtained results.
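The core idea described above, training an agent from a fixed dataset of transitions with no environment interaction, can be illustrated with a minimal toy sketch. This is not the thesis's implementation (which uses deep offline RL methods such as Conservative Q-Learning and Behaviour Regularized Actor-Critic); it is a hypothetical tabular Q-learning example where states and actions are small integers standing in for incident and ambulance features, and the dataset plays the role of the CHUV experience dataset.

```python
import random
from collections import defaultdict

# Offline Q-learning sketch: the agent learns only from a static list of
# (state, action, reward, next_state, done) transitions, never stepping
# in an environment. All names and values here are illustrative.

def offline_q_learning(dataset, n_actions, alpha=0.1, gamma=0.9, epochs=50):
    q = defaultdict(float)  # Q[(state, action)] -> estimated value
    for _ in range(epochs):
        for s, a, r, s2, done in dataset:
            # Bootstrapped target; no bootstrapping past terminal states.
            target = r if done else r + gamma * max(
                q[(s2, b)] for b in range(n_actions))
            q[(s, a)] += alpha * (target - q[(s, a)])
    return q

def greedy_policy(q, state, n_actions):
    # Dispatch decision: pick the action with the highest learned value.
    return max(range(n_actions), key=lambda a: q[(state, a)])

# Toy static dataset: in state 0, action 1 was rewarded, action 0 was not.
dataset = [
    (0, 1, 1.0, 0, True),
    (0, 0, 0.0, 0, True),
] * 10

q = offline_q_learning(dataset, n_actions=2)
print(greedy_policy(q, 0, 2))  # -> 1
```

A real offline RL method additionally has to counteract overestimation on actions absent from the dataset, which is exactly the problem the conservative and behaviour-regularized algorithms named in the subject terms are designed to address.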
Document type: master thesis
File description: application/pdf
Language: English
Relation: http://hdl.handle.net/2117/405475; 178087
Availability: http://hdl.handle.net/2117/405475
Rights: Open Access
Accession number: edsbas.C62FFB50
Database: BASE