Report
Automatic Curriculum Learning for Large-Scale Cooperative Multiagent Systems
Title: | Automatic Curriculum Learning for Large-Scale Cooperative Multiagent Systems |
---|---|
Authors: | Zhang, Tianle; Liu, Zhen; Pu, Zhiqiang; Yi, Jianqiang |
Publisher Information: | IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC |
Publication Year: | 2023 |
Collection: | Institute of Automation: CASIA OpenIR (Chinese Academy of Sciences) / 中国科学院自动化研究所机构知识库 |
Subject Terms: | Task analysis, Games, Markov processes, Training, Multi-agent systems, Computational intelligence, Observability, Automatic curriculum learning, large-scale multiagent systems, multiagent reinforcement learning, Computer Science, Artificial Intelligence |
Description: | Recently, much work has been devoted to how agents can learn efficient cooperation in multiagent systems. However, this remains challenging in large-scale multiagent systems (MASs) because of the complex dynamics between the agents and the environment and the dimension explosion of the state-action space. In this paper, we propose a novel MultiAgent Automatic Curriculum Learning method (MA-ACL) that solves learning problems in large-scale cooperative MASs by starting from a multiagent scenario with a few agents and automatically and progressively increasing the number of agents. An evaluation mechanism based on self-supervised learning is designed to automatically generate appropriate curricula with a progressively increasing number of agents. Moreover, since the observation state dimension of agents varies across curricula and the learned policy knowledge needs to be effectively encoded, we design a new Distributed Transferable Relation-modeling Policy network structure (DTRP) to handle the dynamic size of the network input and to model relational knowledge between agents and their surrounding environment. Simulation results show that the proposed MA-ACL using DTRP can significantly improve the performance of large-scale multiagent learning compared with manual or non-curriculum learning methods, and that DTRP greatly boosts the performance of MA-ACL. |
Document Type: | report |
Language: | English |
Relation: | IEEE TRANSACTIONS ON EMERGING TOPICS IN COMPUTATIONAL INTELLIGENCE; http://ir.ia.ac.cn/handle/173211/52555 |
DOI: | 10.1109/TETCI.2022.3209655 |
Availability: | http://ir.ia.ac.cn/handle/173211/52555 https://doi.org/10.1109/TETCI.2022.3209655 |
Accession Number: | edsbas.CD6EE14D |
Database: | BASE |
---|