Publications

Ardon L, Furelos-Blanco D, Russo A, 2024, Learning Reward Machines in Cooperative Multi-agent Tasks, Pages: 43-59, ISSN: 0302-9743

This paper presents a novel approach to Multi-Agent Reinforcement Learning (MARL) that combines cooperative task decomposition with the learning of Reward Machines (RMs) encoding the structure of the sub-tasks. The proposed method helps deal with the non-Markovian nature of the rewards in partially observable environments and improves the interpretability of the learnt policies required to complete a cooperative task. The RMs associated with the sub-tasks are learnt in a decentralised manner and then used to guide the behaviour of each agent in a team acting towards a common goal. By doing so, the complexity of a cooperative multi-agent problem is reduced, allowing for more effective learning. The results suggest that our approach is a promising direction for future research in cooperative MARL, especially in complex and partially observable environments.

Conference paper

Belle V, Fisher M, Russo A, Komendantskaya E, Nottle A et al., 2024, Neuro-Symbolic AI + Agent Systems: A First Reflection on Trends, Opportunities and Challenges, Pages: 180-200, ISSN: 0302-9743

To get one step closer to “human-like” intelligence, we need systems capable of seamlessly combining the neural learning power of symbolic feature extraction from raw data with sophisticated symbolic inference mechanisms for reasoning about “high-level” concepts. It is important to also incorporate existing prior knowledge about a given problem domain, especially since modern machine learning frameworks are typically data-hungry. Recently the field of neuro-symbolic AI has emerged as a promising paradigm for precisely such an integration. However, coming up with a single, clear, concise definition of this area is not an easy task. There are plenty of variations on this topic, and there is no “one true way” that the community can coalesce around. Recently, a workshop was organized at AAMAS-2023 (London, UK) to discuss how this definition should be broadened to also consider reasoning about agents. This article is a collection of ideas, opinions, and positions from computer scientists who were invited for a panel discussion at the workshop. This collection is not meant to be comprehensive but is rather intended to stimulate further conversation on the field of “Neuro-Symbolic Multi-Agent Systems.”

Conference paper

Costantini S, Pontelli E, Calegari R, Dodaro C, Fabiano F, Gaggl S, Garcez A, Mileo A, Russo A, Toni F et al., 2023, Preface, Electronic Proceedings in Theoretical Computer Science, EPTCS, Vol: 385, ISSN: 2075-2180

Journal article

Furelos-Blanco D, Law M, Jonsson A, Broda K, Russo A et al., 2023, Hierarchies of reward machines, International Conference on Machine Learning, Publisher: PMLR, ISSN: 2640-3498

Reward machines (RMs) are a recent formalism for representing the reward function of a reinforcement learning task through a finite-state machine whose edges encode subgoals of the task using high-level events. The structure of RMs enables the decomposition of a task into simpler and independently solvable subtasks that help tackle long-horizon and/or sparse reward tasks. We propose a formalism for further abstracting the subtask structure by endowing an RM with the ability to call other RMs, thus composing a hierarchy of RMs (HRM). We exploit HRMs by treating each call to an RM as an independently solvable subtask using the options framework, and describe a curriculum-based method to learn HRMs from traces observed by the agent. Our experiments reveal that exploiting a handcrafted HRM leads to faster convergence than with a flat HRM, and that learning an HRM is feasible in cases where its equivalent flat representation is not.
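
As an illustration of the formalism described in this abstract, a flat reward machine can be sketched as a finite-state machine whose edges are labelled with high-level events. This is a minimal sketch, not the authors' implementation: the class name, the state names, the events ("coffee", "office") and the reward values are all assumptions made up for the example, and the hierarchical calls between RMs are not modelled.

```python
from dataclasses import dataclass, field


@dataclass
class RewardMachine:
    """Flat reward machine: states linked by edges labelled with high-level events."""
    initial_state: str
    accepting_state: str
    # (state, event) -> (next_state, reward); all entries are illustrative.
    transitions: dict = field(default_factory=dict)

    def step(self, state, event):
        # An event with no matching edge leaves the state unchanged and gives no reward.
        return self.transitions.get((state, event), (state, 0.0))

    def run(self, trace):
        """Replay a trace of events; return the final state and the total reward."""
        state, total = self.initial_state, 0.0
        for event in trace:
            state, reward = self.step(state, event)
            total += reward
        return state, total


# Hypothetical task: observe "coffee" first, then "office".
rm = RewardMachine(
    initial_state="u0",
    accepting_state="u_acc",
    transitions={
        ("u0", "coffee"): ("u1", 0.0),
        ("u1", "office"): ("u_acc", 1.0),
    },
)

final_state, total_reward = rm.run(["office", "coffee", "office"])
```

In this sketch the first "office" event is ignored because no edge leaves `u0` on that event, so the reward depends on the history of events rather than on the latest one alone.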

Conference paper

Baugh KG, Cingillioglu N, Russo A, 2023, Neuro-symbolic rule learning in real-world classification tasks, AAAI-MAKE 2023, Publisher: CEUR, Pages: 3210-3220, ISSN: 1613-0073

Neuro-symbolic rule learning has attracted lots of attention as it offers better interpretability than pure neural models and scales better than symbolic rule learning. A recent approach named pix2rule proposes a neural Disjunctive Normal Form (neural DNF) module to learn symbolic rules with feed-forward layers. Although proved to be effective in synthetic binary classification, pix2rule has not been applied to more challenging tasks such as multi-label and multi-class classifications over real-world data. In this paper, we address this limitation by extending the neural DNF module to (i) support rule learning in real-world multi-class and multi-label classification tasks, (ii) enforce the symbolic property of mutual exclusivity (i.e. predicting exactly one class) in multi-class classification, and (iii) explore its scalability over large inputs and outputs. We train a vanilla neural DNF model similar to pix2rule's neural DNF module for multi-label classification, and we propose a novel extended model called neural DNF-EO (Exactly One) which enforces mutual exclusivity in multi-class classification. We evaluate the classification performance, scalability and interpretability of our neural DNF-based models, and compare them against pure neural models and a state-of-the-art symbolic rule learner named FastLAS. We demonstrate that our neural DNF-based models perform similarly to neural networks, but provide better interpretability by enabling the extraction of logical rules. Our models also scale well when the rule search space grows in size, in contrast to FastLAS, which fails to learn in multi-class classification tasks with 200 classes and in all multi-label settings.

Conference paper

Costantini S, Pontelli E, Russo A, Toni F et al., 2023, Introduction to the 39th International Conference on Logic Programming Special Issue, Theory and Practice of Logic Programming, Pages: 1-8, ISSN: 1471-0684

Journal article

Al-Negheimish H, Madhyastha P, Russo A, 2023, Towards preserving word order importance through Forced Invalidation, EACL 2023 - 17th Conference of the European Chapter of the Association for Computational Linguistics, Publisher: Association for Computational Linguistics, Pages: 2555-2562

Large pre-trained language models such as BERT have been widely used as a framework for natural language understanding (NLU) tasks. However, recent findings have revealed that pre-trained language models are insensitive to word order. The performance on NLU tasks remains unchanged even after randomly permuting the words of a sentence, where crucial syntactic information is destroyed. To help preserve the importance of word order, we propose a simple approach called FORCED INVALIDATION (FI): forcing the model to identify permuted sequences as invalid samples. We perform an extensive evaluation of our approach on various English NLU and QA-based tasks over BERT-based and attention-based models over word embeddings. Our experiments demonstrate that FI significantly improves the sensitivity of the models to word order.
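
The idea of treating permuted sequences as invalid samples can be sketched as a data augmentation step. This is a rough sketch under stated assumptions, not the authors' pipeline: the function name, the `"invalid"` label string, whitespace tokenisation, and adding exactly one permuted copy per sentence are all hypothetical choices for illustration.

```python
import random

INVALID = "invalid"  # hypothetical name for the extra class


def add_invalid_samples(samples, seed=0):
    """Append one word-order-permuted copy of each sample, labelled as invalid."""
    rng = random.Random(seed)
    augmented = list(samples)
    for text, _label in samples:
        tokens = text.split()
        if len(set(tokens)) < 2:
            continue  # no permutation can differ from the original order
        shuffled = tokens[:]
        while shuffled == tokens:
            rng.shuffle(shuffled)  # retry until the order actually changes
        augmented.append((" ".join(shuffled), INVALID))
    return augmented


data = [("the cat sat on the mat", "neutral"), ("hello", "neutral")]
augmented = add_invalid_samples(data)
```

A classifier trained on `augmented` would need one additional output class for the invalid label; how that class is weighted during training is left open here.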

Conference paper

Jabal AA, Bertino E, Lobo J, Verma D, Calo S, Russo A et al., 2023, FLAP - A Federated Learning Framework for Attribute-based Access Control Policies, Pages: 263-272

Technology advances in areas such as sensors, IoT, and robotics enable new collaborative applications (e.g., autonomous devices). A primary requirement for such collaborations is to have a secure system that enables information sharing and information flow protection. A policy-based management system is a key mechanism for secure selective sharing of protected resources. However, policies in each party of a collaborative environment cannot be static as they have to adapt to different contexts and situations. One advantage of collaborative applications is that each party in the collaboration can take advantage of the knowledge of the other parties for learning or enhancing its own policies. We refer to this learning mechanism as policy transfer. The design of a policy transfer framework has challenges, including policy conflicts and privacy issues. Policy conflicts typically arise because of differences in the obligations of the parties, whereas privacy issues result from data sharing constraints for sensitive data. Hence, the policy transfer framework should be able to tackle such challenges by considering minimal sharing of data and supporting policy adaptation to address conflict. In this paper, we propose a framework that aims at addressing such challenges. We introduce a formal definition of the policy transfer problem for attribute-based access control policies. We then introduce the transfer methodology, which consists of three sequential steps. Finally, we report experimental results.

Conference paper

Cunnington D, Law M, Lobo J, Russo A et al., 2023, FFNSL: feed-forward neural-symbolic learner, Machine Learning, Vol: 112, Pages: 515-569, ISSN: 0885-6125

Logic-based machine learning aims to learn general, interpretable knowledge in a data-efficient manner. However, labelled data must be specified in a structured logical form. To address this limitation, we propose a neural-symbolic learning framework, called Feed-Forward Neural-Symbolic Learner (FFNSL), that integrates a logic-based machine learning system capable of learning from noisy examples, with neural networks, in order to learn interpretable knowledge from labelled unstructured data. We demonstrate the generality of FFNSL on four neural-symbolic classification problems, where different pre-trained neural network models and logic-based machine learning systems are integrated to learn interpretable knowledge from sequences of images. We evaluate the robustness of our framework by using images subject to distributional shifts, for which the pre-trained neural networks may predict incorrectly and with high confidence. We analyse the impact that these shifts have on the accuracy of the learned knowledge and run-time performance, comparing FFNSL to tree-based and pure neural approaches. Our experimental results show that FFNSL outperforms the baselines by learning more accurate and interpretable knowledge with fewer examples.

Journal article

Ielo A, Law M, Fionda V, Ricca F, De Giacomo G, Russo A et al., 2023, Towards ILP-Based LTLf Passive Learning, Pages: 30-45, ISSN: 0302-9743

Inferring an LTLf formula from a set of example traces, also known as passive learning, is a challenging task for model-based techniques. Despite the combinatorial nature of the problem, current state-of-the-art solutions are based on exhaustive search. They use one example at a time to discard a single candidate formula at a time, instead of exploiting the full set of examples to prune the search space. This hinders their applicability when examples involve many atomic propositions or when the target formula is not small. This short paper proposes the first ILP-based approach for learning LTLf formulae from a set of example traces, using a learning from answer sets system called ILASP. It compares this approach to both pure SAT-based techniques and the exhaustive search method. Preliminary experimental results show that our approach improves on previous SAT-based techniques and that it has the potential to overcome the limitation of an exhaustive search by optimizing over the full set of examples. Further research directions for the ILP-based LTLf passive learning problem are also discussed.

Conference paper

Rader AP, Russo A, 2023, Active Learning in Neurosymbolic AI with Embed2Sym, ISSN: 1613-0073

Neurosymbolic AI combines neural networks with symbolic reasoners in an effort to create robust and logical machine learning frameworks. In one approach, a neural component processes raw data and outputs latent concepts. A symbolic component then conducts logical reasoning with the concepts to produce the final result. A major hurdle lies in the propagation of the end label signal to the latent space when no latent labels are available. We investigate the use of active learning to alleviate this problem. In particular, we consider the neurosymbolic framework Embed2Sym. We adapt the learning framework to incorporate active learning by gaining a latent learning signal for misclassified examples. An oracle, such as a human in the loop, provides latent labels, which are used to finetune the neural component. Using the same benchmark datasets as the original paper, we empirically evaluate our method. We demonstrate that even a small amount of labelled latent data leads to a sizeable increase in accuracy.

Conference paper

Cunnington D, Law M, Lobo J, Russo A et al., 2023, Neuro-symbolic learning of answer set programs from raw data, IJCAI 2023, Publisher: International Joint Conferences on Artificial Intelligence, Pages: 3586-3596, ISSN: 1045-0823

One of the ultimate goals of Artificial Intelligence is to assist humans in complex decision making. A promising direction for achieving this goal is Neuro-Symbolic AI, which aims to combine the interpretability of symbolic techniques with the ability of deep learning to learn from raw data. However, most current approaches require manually engineered symbolic knowledge, and where end-to-end training is considered, such approaches are either restricted to learning definite programs, or are restricted to training binary neural networks. In this paper, we introduce Neuro-Symbolic Inductive Learner (NSIL), an approach that trains a general neural network to extract latent concepts from raw data, whilst learning symbolic knowledge that maps latent concepts to target labels. The novelty of our approach is a method for biasing the learning of symbolic knowledge, based on the in-training performance of both neural and symbolic components. We evaluate NSIL on three problem domains of different complexity, including an NP-complete problem. Our results demonstrate that NSIL learns expressive knowledge, solves computationally complex problems, and achieves state-of-the-art performance in terms of accuracy and data efficiency. Code and technical appendix: https://github.com/DanCunnington/NSIL.

Conference paper

Mekhtieva RL, Forbes B, Alrajeh D, Delaney B, Russo A et al., 2023, RECAP-KG: Mining Knowledge Graphs from Raw Primary Care Physician Notes for Remote COVID-19 Assessment in Primary Care, AMIA Annu Symp Proc, Vol: 2023, Pages: 1145-1154

Building Clinical Decision Support Systems, whether from regression models or machine learning, requires clinical data either in standard terminology or as text for Natural Language Processing (NLP). Unfortunately, many clinical notes are written quickly during the consultation and contain many abbreviations and typographical errors, and lack grammar and punctuation. Processing these highly unstructured clinical notes is an open challenge for NLP that we address in this paper. We present RECAP-KG, a framework for constructing knowledge graphs from primary care clinical notes. Our framework extracts structured knowledge graphs from the clinical record by utilising the SNOMED-CT ontology, both the entire finding hierarchy and a COVID-relevant curated subset. We apply our framework to consultation notes in the UK COVID-19 Clinical Assessment Service (CCAS) dataset and provide a quantitative evaluation of our framework, demonstrating that our approach has better accuracy than traditional NLP methods when answering questions about patients.

Journal article

Stromfelt H, Dickens L, Garcez A, Russo A et al., 2022, Formalizing consistency and coherence of representation learning, 36th Conference on Neural Information Processing Systems (NeurIPS 2022), ISSN: 1049-5258

In the study of reasoning in neural networks, recent efforts have sought to improve consistency and coherence of sequence models, leading to important developments in the area of neuro-symbolic AI. In symbolic AI, the concepts of consistency and coherence can be defined and verified formally, but for neural networks these definitions are lacking. The provision of such formal definitions is crucial to offer a common basis for the quantitative evaluation and systematic comparison of connectionist, neuro-symbolic and transfer learning approaches. In this paper, we introduce formal definitions of consistency and coherence for neural systems. To illustrate the usefulness of our definitions, we propose a new dynamic relation-decoder model built around the principles of consistency and coherence. We compare our results with several existing relation-decoders using a partial transfer learning task based on a novel data set introduced in this paper. Our experiments show that relation-decoders that maintain consistency over unobserved regions of representation space retain coherence across domains, whilst achieving better transfer learning performance.

Conference paper

Russo A, Dickens L, Stromfelt H, Garcez A et al., 2022, Formalizing Coherence and Consistency Applied to Transfer Learning in Neuro-Symbolic Autoencoders, Thirty-sixth Conference on Neural Information Processing Systems

Conference paper

Aspis Y, Broda K, Lobo J, Russo A et al., 2022, Embed2Sym - scalable neuro-symbolic reasoning via clustered embeddings, The 19th International Conference on Principles of Knowledge Representation and Reasoning, Publisher: K Proceedings, Pages: 421-431

Neuro-symbolic reasoning approaches proposed in recent years combine a neural perception component with a symbolic reasoning component to solve a downstream task. By doing so, these approaches can provide neural networks with symbolic reasoning capabilities, improve their interpretability and enable generalization beyond the training task. However, this often comes at the cost of poor training time, with potential scalability issues. In this paper, we propose a scalable neuro-symbolic approach, called Embed2Sym. We complement a two-stage (perception and reasoning) neural network architecture designed to solve a downstream task end-to-end with a symbolic optimisation method for extracting learned latent concepts. Specifically, the trained perception network generates clusters in embedding space that are identified and labelled using symbolic knowledge and a symbolic solver. With the latent concepts identified, a neuro-symbolic model is constructed by combining the perception network with the symbolic knowledge of the downstream task, resulting in a model that is interpretable and transferable. Our evaluation shows that Embed2Sym outperforms state-of-the-art neuro-symbolic systems on benchmark tasks in terms of training time by several orders of magnitude while providing similar if not better accuracy.

Conference paper

Law M, Broda K, Russo A, 2022, Search space expansion for efficient incremental inductive logic programming from streamed data, The 31st International Joint Conference on Artificial Intelligence, Pages: 2697-2704, ISSN: 1045-0823

In the past decade, several systems for learning Answer Set Programs (ASP) have been proposed, including the recent FastLAS system. Compared to other state-of-the-art approaches to learning ASP, FastLAS is more scalable, as rather than computing the hypothesis space in full, it computes a much smaller subset relative to a given set of examples that is nonetheless guaranteed to contain an optimal solution to the task (called an OPT-sufficient subset). On the other hand, like many other Inductive Logic Programming (ILP) systems, FastLAS is designed to be run on a fixed learning task, meaning that if new examples are discovered after learning, the whole process must be run again. In many real applications, data arrives in a stream. Rerunning an ILP system from scratch each time new examples arrive is inefficient. In this paper we address this problem by presenting IncrementalLAS, a system that uses a new technique, called hypothesis space expansion, to enable a FastLAS-like OPT-sufficient subset to be expanded each time new examples are discovered. We prove that this preserves FastLAS's guarantee of finding an optimal solution to the full task (including the new examples), while removing the need to repeat previous computations. Through our evaluation, we demonstrate that running IncrementalLAS on tasks updated with sequences of new examples is significantly faster than re-running FastLAS from scratch on each updated task.

Conference paper

Mitchener L, Tuckey D, Crosby M, Russo A et al., 2022, Detect, understand, act: a neuro-symbolic hierarchical reinforcement learning framework (extended abstract), The 31st International Joint Conference on Artificial Intelligence, Publisher: IJCAI, Pages: 5314-5318, ISSN: 1045-0823

We introduce Detect, Understand, Act (DUA), a neuro-symbolic reinforcement learning framework. The Detect component is composed of a traditional computer vision object detector and tracker. The Act component houses a set of options, high-level actions enacted by pre-trained deep reinforcement learning (DRL) policies. The Understand component provides a novel answer set programming (ASP) paradigm for effectively learning symbolic meta-policies over options using inductive logic programming (ILP). We evaluate our framework on the Animal-AI (AAI) competition testbed, a set of physical cognitive reasoning problems. Given a set of pre-trained DRL policies, DUA requires only a few examples to learn a meta-policy that allows it to improve the state-of-the-art on multiple of the most challenging categories from the testbed. DUA constitutes the first holistic hybrid integration of computer vision, ILP and DRL applied to an AAI-like environment and sets the foundations for further use of ILP in complex DRL challenges.

Conference paper

Mitchener L, Tuckey D, Crosby M, Russo A et al., 2022, Detect, understand, act: a neuro-symbolic hierarchical reinforcement learning framework, Machine Learning, Vol: 111, Pages: 1523-1549, ISSN: 0885-6125

In this paper we introduce Detect, Understand, Act (DUA), a neuro-symbolic reinforcement learning framework. The Detect component is composed of a traditional computer vision object detector and tracker. The Act component houses a set of options, high-level actions enacted by pre-trained deep reinforcement learning (DRL) policies. The Understand component provides a novel answer set programming (ASP) paradigm for symbolically implementing a meta-policy over options and effectively learning it using inductive logic programming (ILP). We evaluate our framework on the Animal-AI (AAI) competition testbed, a set of physical cognitive reasoning problems. Given a set of pre-trained DRL policies, DUA requires only a few examples to learn a meta-policy that allows it to improve the state-of-the-art on multiple of the most challenging categories from the testbed. DUA constitutes the first holistic hybrid integration of computer vision, ILP and DRL applied to an AAI-like environment and sets the foundations for further use of ILP in complex DRL challenges.

Journal article

Tuckey D, Broda K, Russo A, 2022, A semantics for probabilistic answer set programs with incomplete stochastic knowledge, CEUR Workshop, Publisher: CEUR Workshop Proceedings, Pages: 1-14, ISSN: 1613-0073

Some probabilistic answer set programs (PASP) semantics assign probabilities to sets of answer sets and implicitly assume these answer sets to be equiprobable. While this is a common choice in probability theory, it leads to unnatural behaviours with PASPs. We argue that the user should have a level of control over what assumption is used to obtain a probability distribution when the stochastic knowledge is incomplete. To this end, we introduce the Incomplete Knowledge Semantics (IKS) for probabilistic answer set programs. We take inspiration from the field of decision making under ignorance. Given a cost function, represented by a user-defined ordering over answer sets through weak constraints, we use the notion of Ordered Weighted Averaging (OWA) operator to distribute the probability over a set of answer sets according to the user’s level of optimism. The more optimistic (or pessimistic) a user is, the more (or less) probability is assigned to the more optimal answer sets. We present an implementation and showcase the behaviour of this semantics on simple examples. We also highlight the impact that different OWA operators have on weight learning, showing that the equiprobability assumption is not always the best option.
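
For readers unfamiliar with the Ordered Weighted Averaging operator mentioned in this abstract, the standard definition can be sketched as follows: the weights are applied to the values after sorting them in descending order, so the weight vector expresses an attitude between optimism and pessimism. This sketch shows only the generic operator. The function name, the example scores, and the convention that a higher score is better are assumptions, and the way the paper turns OWA weights into a probability distribution over answer sets is not reproduced here.

```python
def owa(values, weights):
    """Ordered Weighted Averaging: weights apply to values sorted in descending order."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    ordered = sorted(values, reverse=True)
    return sum(w * v for w, v in zip(weights, ordered))


# Hypothetical scores for three alternatives (higher is assumed to be better).
scores = [3.0, 1.0, 2.0]

optimistic = owa(scores, [1.0, 0.0, 0.0])    # all weight on the best value
pessimistic = owa(scores, [0.0, 0.0, 1.0])   # all weight on the worst value
neutral = owa(scores, [1 / 3, 1 / 3, 1 / 3])  # uniform weights give the plain mean
```

The uniform weight vector corresponds to treating all alternatives equally, which is the analogue of the equiprobability assumption the abstract argues is not always the best option.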

Conference paper

Cunnington D, Law M, Lobo J, Russo A et al., 2022, Inductive Learning of Complex Knowledge from Raw Data, ISSN: 1613-0073

One of the ultimate goals of Artificial Intelligence is to learn generalised and human-interpretable knowledge from raw data. Neuro-symbolic reasoning approaches partly tackle this problem by improving the training of a neural network using a manually engineered symbolic knowledge base. In the case where symbolic knowledge is learned from raw data, this knowledge lacks the expressivity required to solve complex problems. In this paper, we introduce Neuro-Symbolic Inductive Learner (NSIL), an approach that trains a neural network to extract latent concepts from raw data, whilst learning symbolic knowledge that solves complex problems, defined in terms of these latent concepts. The novelty of our approach is a method for biasing a symbolic learner to learn improved knowledge, based on the in-training performance of both neural and symbolic components. We evaluate NSIL on two problem domains that require learning knowledge with different levels of complexity, and demonstrate that NSIL learns knowledge that is not possible to learn with other neuro-symbolic systems, whilst outperforming baseline models in terms of accuracy and data efficiency.

Conference paper

Russo A, Law M, Cunnington D, Furelos-Blanco D, Broda K et al., 2022, Logic-Based Machine Learning: Recent Advances and Their Role in Neuro-Symbolic AI, 16th International Conference on Logic Programming and Non-Monotonic Reasoning (LPNMR), Publisher: Springer International Publishing AG, Pages: XVIII-XXI, ISSN: 0302-9743

Conference paper

Law M, Russo A, Broda K, Bertino E et al., 2021, Scalable non-observational predicate learning in ASP, IJCAI, Publisher: IJCAI, Pages: 1936-1943, ISSN: 1045-0823

Recently, novel ILP systems under the answer set semantics have been proposed, some of which are robust to noise and scalable over large hypothesis spaces. One such system is FastLAS, which is significantly faster than other state-of-the-art ASP-based ILP systems. FastLAS is, however, only capable of Observational Predicate Learning (OPL), where the learned hypothesis defines predicates that are directly observed in the examples. It cannot learn knowledge that is indirectly observable, such as learning causes of observed events. This class of problems, known as non-OPL, is known to be difficult to handle in the context of non-monotonic semantics. Solving non-OPL learning tasks whilst preserving scalability is a challenging open problem. We address this problem with a new abductive method for translating examples of a non-OPL task to a set of examples, called possibilities, such that the original example is covered if at least one of the possibilities is covered. This new method allows an ILP system capable of performing OPL tasks to be “upgraded” to solve non-OPL tasks. In particular, we present our new FastNonOPL system, which upgrades FastLAS with the new possibility generation. We compare it to other state-of-the-art ASP-based ILP systems capable of solving non-OPL tasks, showing that FastNonOPL is significantly faster, and in many cases more accurate, than these other systems.

Conference paper

Koschate M, Naserian E, Dickens L, Stuart A, Russo A, Levine M et al., 2021, ASIA: Automated Social Identity Assessment using linguistic style, Behavior Research Methods, Vol: 53, Pages: 1762-1781, ISSN: 1554-351X

The various group and category memberships that we hold are at the heart of who we are. They have been shown to affect our thoughts, emotions, behavior, and social relations in a variety of social contexts, and have more recently been linked to our mental and physical well-being. Questions remain, however, over the dynamics between different group memberships and the ways in which we cognitively and emotionally acquire these. In particular, assessment methods that can be applied to naturally occurring data, such as online interactions, are currently missing, which limits our understanding of the dynamics and impact of group memberships in naturalistic settings. To provide researchers with a method for assessing specific group memberships of interest, we have developed ASIA (Automated Social Identity Assessment), an analytical protocol that uses linguistic style indicators in text to infer which group membership is salient in a given moment, accompanied by an in-depth open-source Jupyter Notebook tutorial (https://github.com/Identity-lab/Tutorial-on-salient-social-Identity-detection-model). Here, we first discuss the challenges in the study of salient group memberships, and how ASIA can address some of these. We then demonstrate how our analytical protocol can be used to create a method for assessing which of two specific group memberships (parents and feminists) is salient using online forum data, and how the quality (validity) of the measurement and its interpretation can be tested using two further corpora as well as an experimental study. We conclude by discussing future developments in the field.

Journal article

Furelos-Blanco D, Law M, Jonsson A, Broda K, Russo A et al., 2021, Induction and exploitation of subgoal automata for reinforcement learning, Journal of Artificial Intelligence Research, Vol: 70, Pages: 1031-1116, ISSN: 1076-9757

In this paper we present ISA, an approach for learning and exploiting subgoals in episodic reinforcement learning (RL) tasks. ISA interleaves reinforcement learning with the induction of a subgoal automaton, an automaton whose edges are labeled by the task’s subgoals expressed as propositional logic formulas over a set of high-level events. A subgoal automaton also consists of two special states: a state indicating the successful completion of the task, and a state indicating that the task has finished without succeeding. A state-of-the-art inductive logic programming system is used to learn a subgoal automaton that covers the traces of high-level events observed by the RL agent. When the currently exploited automaton does not correctly recognize a trace, the automaton learner induces a new automaton that covers that trace. The interleaving process guarantees the induction of automata with the minimum number of states, and applies a symmetry breaking mechanism to shrink the search space whilst remaining complete. We evaluate ISA in several gridworld and continuous state space problems using different RL algorithms that leverage the automaton structures. We provide an in-depth empirical analysis of the automaton learning performance in terms of the traces, the symmetry breaking and specific restrictions imposed on the final learnable automaton. For each class of RL problem, we show that the learned automata can be successfully exploited to learn policies that reach the goal, achieving an average reward comparable to the case where automata are not learned but handcrafted and given beforehand.

Journal article

Cingillioglu N, Russo A, 2021, pix2rule: End-to-end Neuro-symbolic Rule Learning, 15th International Workshop on Neural-Symbolic Learning and Reasoning (NeSy) as part of the 1st International Joint Conference on Learning and Reasoning (IJCLR), Publisher: RWTH AACHEN, Pages: 15-56, ISSN: 1613-0073

Conference paper

Cunnington D, Law M, Russo A, Lobo J, Kaplan L et al., 2021, Towards Neural-Symbolic Learning to support Human-Agent Operations, 24th IEEE International Conference on Information Fusion (FUSION), Publisher: IEEE, Pages: 223-230

Conference paper

Al-Negheimish H, Madhyastha P, Russo A, 2021, Numerical reasoning in machine reading comprehension tasks: are we there yet?, Conference on Empirical Methods in Natural Language Processing (EMNLP), Publisher: ASSOC COMPUTATIONAL LINGUISTICS-ACL, Pages: 9643-9649

Conference paper

Drozdov A, Law M, Lobo J, Russo A, Don MW et al., 2021, Online Symbolic Learning of Policies for Explainable Security, 3rd IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications (TPS-ISA), Publisher: IEEE COMPUTER SOC, Pages: 269-278

Conference paper

Professor Alessandra Russo
