Comparison of PPO and SAC Algorithms Towards Decision Making Strategies for Collision Avoidance Among Multiple Autonomous Vehicles

Bibliographic Details
Title: Comparison of PPO and SAC Algorithms Towards Decision Making Strategies for Collision Avoidance Among Multiple Autonomous Vehicles
Authors: Syafiq Fauzi Kamarulzaman, Arafatur Rahman, Abu Jafar Md Muzahid
Source: 2021 International Conference on Software Engineering & Computer Systems and 4th International Conference on Computational Science and Information Management (ICSECS-ICOCSIM).
Publication info: IEEE, 2021.
Publication year: 2021
Subject terms: Computer science, Control (management), Stability (learning theory), In vehicle, Reinforcement learning, Traffic flow, Collision, Set (psychology), Algorithm, Collision avoidance
Description: Multiple-vehicle collision avoidance with safe lane changing, using learning-based vehicle control, is a crucial concern in autonomous driving systems. Statistics show that the latest autonomous driving systems are usually prone to rear-end collisions. Rear-end collisions often result in severe injuries as well as traffic jams, and the consequences are much worse in multiple-vehicle collisions. Much previous autonomous driving research focused solely on collision avoidance strategies for two consecutive vehicles. This study proposes a centralised control strategy for multiple vehicles using reinforcement learning focused on partner consideration and goal attainment. The system is depicted as a group of vehicles that communicate and coordinate with each other by a set of rays and maintain a short following distance. To address this challenge, a simulation was implemented in the Unity3D game engine, and an agent was trained with two state-of-the-art RL algorithms, PPO (Proximal Policy Optimization) and SAC (Soft Actor-Critic), using the Unity ML-Agents Toolkit. In terms of success rate, performance, training speed and stability, the two algorithms are comparable. The effectiveness of the algorithms was assessed under three traffic-flow variations: (1) changes in vehicle speed, (2) differences in the vehicles' starting positions, and (3) switching to the next lane. The agent performed similarly, at a 91% success rate, with both PPO and SAC.
DOI: 10.1109/icsecs52883.2021.00043
Access URL: https://explore.openaire.eu/search/publication?articleId=doi_________::ed5511e756e1de5b69b2a738180e2570
https://doi.org/10.1109/icsecs52883.2021.00043
Rights: CLOSED
Accession number: edsair.doi...........ed5511e756e1de5b69b2a738180e2570
Database: OpenAIRE