Reinforcement co-Learning of Deep and Spiking Neural Networks for Energy-Efficient Mapless Navigation with Neuromorphic Hardware

Bibliographic Details
Title: Reinforcement co-Learning of Deep and Spiking Neural Networks for Energy-Efficient Mapless Navigation with Neuromorphic Hardware
Authors: Neelesh Kumar, Konstantinos P. Michmizos, Guangzhi Tang
Source: IROS
Publication details: arXiv, 2020.
Publication year: 2020
Subject terms: FOS: Computer and information sciences, Computer Science - Machine Learning, Computer science, 02 engineering and technology, 010501 environmental sciences, 01 natural sciences, Machine Learning (cs.LG), Computer Science - Robotics, 0202 electrical engineering, electronic engineering, information engineering, Neural and Evolutionary Computing (cs.NE), Reinforcement, 0105 earth and related environmental sciences, Spiking neural network, business.industry, Computer Science - Neural and Evolutionary Computing, Mobile robot, 020202 computer hardware & architecture, Neuromorphic engineering, Benchmark (computing), Robot, Artificial intelligence, Gradient descent, business, Feature learning, Robotics (cs.RO), Efficient energy use
Description: Energy-efficient mapless navigation is crucial for mobile robots as they explore unknown environments with limited on-board resources. Although recent deep reinforcement learning (DRL) approaches have been successfully applied to navigation, their high energy consumption limits their use in several robotic applications. Here, we propose a neuromorphic approach that combines the energy efficiency of spiking neural networks with the optimality of DRL, and we benchmark it in learning control policies for mapless navigation. Our hybrid framework, spiking deep deterministic policy gradient (SDDPG), consists of a spiking actor network (SAN) and a deep critic network, where the two networks were trained jointly using gradient descent. The co-learning enabled synergistic information exchange between the two networks, allowing them to overcome each other's limitations through shared representation learning. To evaluate our approach, we deployed the trained SAN on Intel's Loihi neuromorphic processor. When validated in simulated and real-world complex environments, our method on Loihi consumed 75 times less energy per inference than DDPG on Jetson TX2. It also exhibited a higher rate of successful navigation to the goal, with the improvement ranging from 1% to 4.2% depending on the forward-propagation timestep size. These results reinforce our ongoing efforts to design brain-inspired algorithms for controlling autonomous robots with neuromorphic hardware.
Comment: 8 pages, 7 figures
DOI: 10.48550/arxiv.2003.01157
Access URL: https://explore.openaire.eu/search/publication?articleId=doi_dedup___::01cbadb89b3e3d5fd688ead3c90b5a37
Rights: OPEN
Accession number: edsair.doi.dedup.....01cbadb89b3e3d5fd688ead3c90b5a37
Database: OpenAIRE
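
Illustrative note: the description says the spiking actor network's success rate depended on the forward-propagation timestep size. The minimal Python sketch below is not the authors' implementation; it only shows one common way a spiking layer can turn T timesteps of input spikes into a continuous output. The leaky integrate-and-fire neuron model, the `decay` and `threshold` values, the hard reset, the firing-rate decoding, and the function name `lif_rates` are all assumptions made for illustration.

```python
from typing import List


def lif_rates(inputs: List[List[int]], weights: List[List[float]],
              decay: float = 0.75, threshold: float = 0.5) -> List[float]:
    """Run one layer of leaky integrate-and-fire neurons over T timesteps.

    inputs  : T binary spike vectors, each of length n_in
    weights : n_out rows of n_in synaptic weights (hypothetical values)
    returns : per-neuron firing rate in [0, 1], i.e. spike count / T
    """
    n_out = len(weights)
    voltage = [0.0] * n_out   # membrane potential of each output neuron
    counts = [0] * n_out      # spikes emitted by each output neuron
    for spikes in inputs:
        for j in range(n_out):
            # Weighted sum of the input spikes arriving at this timestep.
            current = sum(w * s for w, s in zip(weights[j], spikes))
            # Leak the previous potential, then integrate the new input.
            voltage[j] = decay * voltage[j] + current
            if voltage[j] >= threshold:
                counts[j] += 1
                voltage[j] = 0.0  # hard reset after a spike (assumed)
    timesteps = len(inputs)
    return [c / timesteps for c in counts]


# Four timesteps of the same input spike pattern.
rates = lif_rates([[1, 0]] * 4, [[0.3, 0.0], [0.0, 1.0]])
print(rates)
```

Under this rate-decoding assumption, an output can take only T + 1 distinct values (0, 1/T, ..., 1), so a larger T gives finer output resolution at the cost of more computation per inference.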