Bibliographic details
Title: |
Causally aware reinforcement learning agents for autonomous cyber defence |
Authors: |
Tom Purves, Kostas Kyriakopoulos, Sian Jenkins, Iain Phillips, Tim Dudman |
Publication year: |
2024 |
Collection: |
Loughborough University: Figshare |
Subject terms: |
Commerce, management, tourism and services, Information and computing sciences, Artificial intelligence, Data management and data science, Machine learning, Psychology, Autonomous cyber defence, Reinforcement learning, Causal inference, Structural Causal Model |
Description: |
Artificial Intelligence (AI) is seen as a disruptive solution to the ever-increasing security threats on network infrastructures. To automate the process of defending networked environments from such threats, approaches such as Reinforcement Learning (RL) have been used to train agents in cyber adversarial games. One primary challenge is how contextual information could be integrated into RL models to create agents which adapt their behaviour to adversarial posture. Two desirable characteristics identified for such models are that they should be interpretable and causal. To address this challenge, we propose an approach through the integration of a causal rewards model with a modified Proximal Policy Optimisation (PPO) agent in Meta’s MBRL-Lib framework. Our RL agents are trained and evaluated against a range of cyber-relevant scenarios in the Dstl YAWNING-TITAN (YT) environment. We have constructed and experimented with two types of reward functions to facilitate the agent’s learning process. Evaluation metrics include, among others, games won by the defence agent (blue wins), episode length, healthy nodes and isolated nodes. Results show that, over all scenarios, our causally aware agent achieves better performance than causally-blind state-of-the-art benchmarks in these scenarios for the above evaluation metrics. In particular, with our proposed High Value Target (HVT) rewards function, which aims not to disrupt HVT nodes, the number of isolated nodes is improved by 17% and 18% against the model-free and Neural Network (NN) model-based agents across all scenarios. More importantly, the overall performance improvement for the blue wins metric exceeded that of model-free and NN model-based agents by 40% and 17%, respectively, across all scenarios. |
Document type: |
article in journal/newspaper |
Language: |
unknown |
Relation: |
2134/27014740.v1; https://figshare.com/articles/journal_contribution/Causally_aware_reinforcement_learning_agents_for_autonomous_cyber_defence/27014740 |
Availability: |
https://figshare.com/articles/journal_contribution/Causally_aware_reinforcement_learning_agents_for_autonomous_cyber_defence/27014740 |
Rights: |
CC BY 4.0 |
Accession number: |
edsbas.F24ADF30 |
Database: |
BASE |