Publications

Hicks C, Mavroudis V, Foley M, Davies T, Highnam K, Watson T et al., 2023, Canaries and Whistles: Resilient Drone Communication Networks with (or without) Deep Reinforcement Learning, Pages: 91-101

Communication networks able to withstand hostile environments are critically important for disaster relief operations. In this paper, we consider a challenging scenario where drones have been compromised in the supply chain, during their manufacture, and harbour malicious software capable of wide-ranging and infectious disruption. We investigate multi-agent deep reinforcement learning as a tool for learning defensive strategies that maximise communications bandwidth despite continual adversarial interference. Using a public challenge for learning network resilience strategies, we propose a state-of-the-art symbolic technique and study its superiority over deep reinforcement learning agents. Correspondingly, we identify three specific methods for improving the performance of our neural agents: (1) ensuring each observation contains the necessary information, (2) using symbolic agents to provide a curriculum for learning, and (3) paying close attention to reward. We apply our methods and present a new mixed strategy enabling symbolic and neural agents to work together and improve on all prior results.

Citations: 1

Conference paper

Highnam K, Hanif Z, Van Vogt E, Parbhoo S, Maffeis S, Jennings NR et al., 2023, Adaptive Experimental Design for Intrusion Data Collection, CEUR Workshop Proceedings, Vol: 3652, Pages: 134-151, ISSN: 1613-0073

Intrusion research frequently collects data on attack techniques currently employed and their potential symptoms. This includes deploying honeypots, logging events from existing devices, employing a red team for a sample attack campaign, or simulating system activity. However, these observational studies do not clearly discern the cause-and-effect relationships between the design of the environment and the data recorded. Neglecting such relationships increases the chance of drawing biased conclusions due to unconsidered factors, such as spurious correlations between features and errors in measurement or classification. In this paper, we present the theory and empirical data on methods that aim to discover such causal relationships efficiently. Our adaptive design (AD) is inspired by the clinical trial community: a variant of a randomized control trial (RCT) to measure how a particular “treatment” affects a population. To contrast our method with observational studies and RCT, we run the first controlled and adaptive honeypot deployment study, identifying the causal relationship between an ssh vulnerability and the rate of server exploitation. We demonstrate that our AD method decreases the total time needed to run the deployment by at least 33%, while still confidently stating the impact of our change in the environment. Compared to an analogous honeypot study with a control group, our AD requests 17% fewer honeypots while collecting 19% more attack recordings.


Journal article

Foley M, Hicks C, Highnam K, Mavroudis V et al., 2022, Autonomous network defence using reinforcement learning, ASIA CCS '22: ACM Asia Conference on Computer and Communications Security, Publisher: ACM, Pages: 1252-1254

In the network security arms race, the defender is significantly disadvantaged as they need to successfully detect and counter every malicious attack. In contrast, the attacker needs to succeed only once. To level the playing field, we investigate the effectiveness of autonomous agents in a realistic network defence scenario. We first outline the problem, provide the background on reinforcement learning and detail our proposed agent design. Using a network environment simulation, with 13 hosts spanning 3 subnets, we train a novel reinforcement learning agent and show that it can reliably defend against continual attacks by two advanced persistent threat (APT) red agents: one with complete knowledge of the network layout and another which must discover resources through exploration but is more general.

Conference paper

Highnam K, Arulkumaran K, Hanif Z, Jennings N et al., 2021, BETH dataset: real cybersecurity data for anomaly detection research, Publisher: Gatsby Computational Neuroscience Unit

We present the BETH cybersecurity dataset for anomaly detection and out-of-distribution analysis. With real “anomalies” collected using a novel tracking system, our dataset contains over eight million data points tracking 23 hosts. Each host has captured benign activity and, at most, a single attack, enabling cleaner behavioural analysis. In addition to being one of the most modern and extensive cybersecurity datasets available, BETH enables the development of anomaly detection algorithms on heterogeneously-structured real-world data, with clear downstream applications. We give details on the data collection, suggestions on pre-processing, and analysis with initial anomaly detection benchmarks on a subset of the data.

Working paper

Highnam K, Puzio D, Luo S, Jennings NR et al., 2021, Real-time detection of dictionary DGA network traffic using deep learning, SN Computer Science, Vol: 2, Pages: 110-110, ISSN: 2661-8907

Botnets and malware continue to avoid detection by static rule engines when using domain generation algorithms (DGAs) for callouts to unique, dynamically generated web addresses. Common DGA detection techniques fail to reliably detect DGA variants that combine random dictionary words to create domain names that closely mirror legitimate domains. To combat this, we created a novel hybrid neural network, Bilbo the “bagging” model, that analyses domains and scores the likelihood they are generated by such algorithms and therefore are potentially malicious. Bilbo is the first parallel usage of a convolutional neural network (CNN) and a long short-term memory (LSTM) network for DGA detection. Our unique architecture is found to be the most consistent in performance in terms of AUC, F1 score, and accuracy when generalising across different dictionary DGA classification tasks compared to current state-of-the-art deep learning architectures. We validate using reverse-engineered dictionary DGA domains and detail our real-time implementation strategy for scoring real-world network logs within a large enterprise. In 4 h of actual network traffic, the model discovered at least five potential command-and-control networks that commercial vendor tools did not flag.

Journal article

Lengyel D, Petangoda J, Falk I, Highnam K, Lazarou M, Kolbeinsson A, Deisenroth MP, Jennings NR et al., 2020, GENNI: Visualising the geometry of equivalences for neural network identifiability, Publisher: arXiv

We propose an efficient algorithm to visualise symmetries in neural networks. Typically, models are defined with respect to a parameter space, where non-equal parameters can produce the same input-output map. Our proposed method, GENNI, allows us to efficiently identify parameters that are functionally equivalent and then visualise the subspace of the resulting equivalence class. By doing so, we are now able to better explore questions surrounding identifiability, with applications to optimisation and generalizability, for commonly used or newly developed neural network architectures.

Working paper

Highnam K, Puzio D, Luo S, Jennings NR et al., 2020, Real-time detection of dictionary DGA network traffic using deep learning, Publisher: arXiv

Botnets and malware continue to avoid detection by static rules engines when using domain generation algorithms (DGAs) for callouts to unique, dynamically generated web addresses. Common DGA detection techniques fail to reliably detect DGA variants that combine random dictionary words to create domain names that closely mirror legitimate domains. To combat this, we created a novel hybrid neural network, Bilbo the “bagging” model, that analyses domains and scores the likelihood they are generated by such algorithms and therefore are potentially malicious. Bilbo is the first parallel usage of a convolutional neural network (CNN) and a long short-term memory (LSTM) network for DGA detection. Our unique architecture is found to be the most consistent in performance in terms of AUC, F1 score, and accuracy when generalising across different dictionary DGA classification tasks compared to current state-of-the-art deep learning architectures. We validate using reverse-engineered dictionary DGA domains and detail our real-time implementation strategy for scoring real-world network logs within a large financial enterprise. In four hours of actual network traffic, the model discovered at least five potential command-and-control networks that commercial vendor tools did not flag.

Working paper

Highnam K, Angstadt K, Leach K, Weimer W, Paulos A, Hurley P et al., 2016, An Uncrewed Aerial Vehicle Attack Scenario and Trustworthy Repair Architecture, 46th Annual IEEE/IFIP International Conference on Dependable Systems and Networks Workshops (DSN-W), Publisher: IEEE, Pages: 222-225, ISSN: 2325-6648

Citations: 17

Conference paper

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.

Miss Kate Highnam
