Meta-Reinforcement Learning With Dynamic Adaptiveness Distillation
Title: | Meta-Reinforcement Learning With Dynamic Adaptiveness Distillation |
---|---|
Authors: | Shiji Song, Hangkai Hu, Xiang Li, Gao Huang |
Source: | IEEE Transactions on Neural Networks and Learning Systems. 34:1454-1464 |
Publication Information: | Institute of Electrical and Electronics Engineers (IEEE), 2023. |
Publication Year: | 2023 |
Subject Terms: | Computer Networks and Communications, business.industry, Computer science, Process (engineering), Probabilistic logic, Context (language use), Machine learning, computer.software_genre, Computer Science Applications, Task (computing), PEARL (programming language), Artificial Intelligence, Generalization (learning), Reinforcement learning, Artificial intelligence, business, Inefficiency, computer, Software, computer.programming_language |
Description: | Deep reinforcement learning is confronted with problems of sampling inefficiency and poor task migration capability. Meta-reinforcement learning (meta-RL) enables meta-learners to utilize the task-solving skills trained on similar tasks and quickly adapt to new tasks. However, meta-RL methods lack enough queries toward the relationship between task-agnostic exploitation of data and task-related knowledge introduced by latent context, limiting their effectiveness and generalization ability. In this article, we develop an algorithm for off-policy meta-RL that can provide the meta-learners with self-oriented cognition toward how they adapt to the family of tasks. In our approach, we perform dynamic task-adaptiveness distillation to describe how the meta-learners adjust the exploration strategy in the meta-training process. Our approach also enables the meta-learners to balance the influence of task-agnostic self-oriented adaption and task-related information through latent context reorganization. In our experiments, our method achieves 10%-20% higher asymptotic reward than probabilistic embeddings for actor-critic RL (PEARL). |
ISSN: | 2162-2388 2162-237X |
DOI: | 10.1109/tnnls.2021.3105407 |
Access URL: | https://explore.openaire.eu/search/publication?articleId=doi_dedup___::2442ef49c2016e4e288fbfa906b07688 https://doi.org/10.1109/tnnls.2021.3105407 |
Rights: | CLOSED |
Accession Number: | edsair.doi.dedup.....2442ef49c2016e4e288fbfa906b07688 |
Database: | OpenAIRE |