Planning with a Learned Policy Basis to Optimally Solve Complex Tasks

Bibliographic details
Title:	Planning with a Learned Policy Basis to Optimally Solve Complex Tasks
Authors:	Kuric, David; Infante, Guillermo; Gómez, Vicenç; Jonsson, Anders; van Hoof, Herke
Source:	Proceedings of the International Conference on Automated Planning and Scheduling; Vol. 34 (2024): Proceedings of the Thirty-Fourth International Conference on Automated Planning and Scheduling; 333-341 ; 2334-0843 ; 2334-0835
Publication details:	Association for the Advancement of Artificial Intelligence
Publication year:	2024
Collection:	Association for the Advancement of Artificial Intelligence: AAAI Publications
Description:	Conventional reinforcement learning (RL) methods can successfully solve a wide range of sequential decision problems. However, learning policies that can generalize predictably across multiple tasks in a setting with non-Markovian reward specifications is a challenging problem. We propose to use successor features to learn a set of local policies that each solves a well-defined subproblem. In a task described by a finite state automaton (FSA) that involves the same set of subproblems, the combination of these local policies can then be used to generate an optimal solution without additional learning. In contrast to other methods that combine local policies via planning, our method asymptotically attains global optimality, even in stochastic environments.
Document type:	article in journal/newspaper
File description:	application/pdf
Language:	English
Relation:	https://ojs.aaai.org/index.php/ICAPS/article/view/31492/33652; https://ojs.aaai.org/index.php/ICAPS/article/view/31492
DOI:	10.1609/icaps.v34i1.31492
Availability:	https://ojs.aaai.org/index.php/ICAPS/article/view/31492 https://doi.org/10.1609/icaps.v34i1.31492
Rights:	Copyright (c) 2024 Association for the Advancement of Artificial Intelligence
Accession number:	edsbas.F92030C9
Database:	BASE
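
Note on the description: the abstract relies on successor features, for which a policy's action value is the dot product of its successor features with a reward weight vector, Q(s, a) = psi(s, a) . w, whenever rewards are linear in the features. The sketch below illustrates only that identity, together with generalized policy improvement over a small set of base policies. It is not the authors' implementation and does not reproduce the FSA-level planning described in the paper; the function names (`q_value`, `gpi_action`), the single-state setup, and all numbers are hypothetical and chosen for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Features = Sequence[float]


def q_value(psi: Features, w: Features) -> float:
    """Action value as a dot product: Q(s, a) = psi(s, a) . w."""
    return sum(p * wi for p, wi in zip(psi, w))


def gpi_action(
    psis: List[Dict[str, Features]], w: Features
) -> Tuple[str, float]:
    """Generalized policy improvement in one state.

    psis[i][a] holds the successor features of base policy i for action a.
    Returns the action (and its value) that maximizes Q over all base
    policies and all actions.
    """
    best_action, best_value = "", float("-inf")
    for psi_by_action in psis:
        for action, psi in psi_by_action.items():
            value = q_value(psi, w)
            if value > best_value:
                best_action, best_value = action, value
    return best_action, best_value


# Hypothetical successor features for two base policies in a single state.
# Feature 0: discounted expected visits to goal type A; feature 1: goal type B.
EXAMPLE_PSIS: List[Dict[str, Features]] = [
    {"left": [0.9, 0.0], "right": [0.5, 0.1]},  # policy trained to reach A
    {"left": [0.1, 0.4], "right": [0.0, 0.8]},  # policy trained to reach B
]

# A new task that rewards only goal type B is evaluated without relearning.
ACTION_FOR_B = gpi_action(EXAMPLE_PSIS, [0.0, 1.0])
# A task that rewards only goal type A reuses the same features.
ACTION_FOR_A = gpi_action(EXAMPLE_PSIS, [1.0, 0.0])
```

Changing only the weight vector `w` re-evaluates the same stored features for a different task, which is the sense in which a learned policy basis can be reused without additional learning.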