Papers from the Department of Computing accepted at AAAI 2026

by Ruth Ntumba

[Photo: the South Kensington main entrance, seen from Exhibition Road]

We are pleased to announce that multiple papers co-authored by members of the Department of Computing were accepted to the 40th AAAI Conference on Artificial Intelligence (AAAI-26).

AAAI promotes research in Artificial Intelligence (AI) and fosters scientific exchange between researchers, practitioners, scientists, students, and engineers across AI and its affiliated disciplines. The AAAI-26 programme features technical paper presentations, special tracks, invited speakers, workshops, tutorials, poster sessions, senior member presentations, competitions, and exhibits.

Papers accepted at AAAI 2026

Behaviour Policy Optimisation: Provably Lower Variance Return Estimates for Off-policy Reinforcement Learning

Authors: Alex Goodall, Dr Edwin Hamel-de le Court, Dr Francesco Belardinelli

Short summary: The paper focuses on improving sample efficiency and stability in reinforcement learning. It shows that high-variance return estimates can hinder learning, and that collecting off-policy data with well-designed behaviour policies can reduce this variance, challenging the traditional view that on-policy data collection is optimal for variance reduction. The paper extends these insights to online reinforcement learning, where policy evaluation and improvement occur simultaneously.
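The paper's algorithm is not reproduced here, but the core phenomenon, that sampling with a well-chosen behaviour policy and importance-weighting the returns can give an unbiased estimate with lower variance than on-policy sampling, can be seen in a toy two-armed bandit. All names and numbers below are illustrative assumptions, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-armed bandit: arm 1 pays more on average but is far noisier.
def reward(arm):
    return rng.normal(loc=[1.0, 2.0][arm], scale=[0.1, 2.0][arm])

target = np.array([0.5, 0.5])     # policy whose value we want to estimate
behaviour = np.array([0.2, 0.8])  # oversamples the high-variance arm

def estimates(sample_probs, target_probs, n=100_000):
    """Per-sample return estimates, importance-weighted when off-policy."""
    arms = rng.choice(2, size=n, p=sample_probs)
    rewards = np.array([reward(a) for a in arms])
    weights = target_probs[arms] / sample_probs[arms]
    return rewards * weights

on_policy = estimates(target, target)        # all weights equal 1
off_policy = estimates(behaviour, target)    # importance-weighted

# Both means estimate the same target value; the off-policy estimator,
# which samples the noisy arm more often, has the lower variance.
print(f"on-policy:  mean {on_policy.mean():.3f}, var {on_policy.var():.3f}")
print(f"off-policy: mean {off_policy.mean():.3f}, var {off_policy.var():.3f}")
```

With these assumed numbers the off-policy estimator's variance comes out around 1.5 against roughly 2.3 on-policy, while both remain unbiased; this is the intuition the paper makes precise and optimises over.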

Expressive Temporal Specifications for Reward Monitoring

Authors: Omar Adalat, Dr Francesco Belardinelli

Short summary: This paper addresses the challenge of creating effective reward signals in reinforcement learning. It introduces a method that uses quantitative Linear Temporal Logic to generate continuous, fine-grained rewards, which help agents learn to make optimal decisions. This approach tackles the issue of sparse rewards in long-horizon decision-making and is versatile, being compatible with a variety of algorithms. Empirical results demonstrate that these reward monitors outperform traditional Boolean-based monitors in both task completion and learning speed.
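As a rough illustration of why quantitative monitoring helps (this sketch is not the paper's monitor construction; it uses a standard robustness-style semantics for a single "eventually reach the goal" specification, and the task itself is an assumption):

```python
# Boolean vs quantitative reward for the temporal specification "F goal"
# ("eventually reach the goal"). The robustness measure follows standard
# quantitative semantics for the atomic predicate; everything else is an
# illustrative assumption.

GOAL = 10.0

def rho_goal(pos: float) -> float:
    """Robustness of the atom 'pos >= GOAL': positive iff satisfied."""
    return pos - GOAL

def rho_eventually(trace) -> float:
    """Quantitative semantics of F goal: max robustness over the trace."""
    return max(rho_goal(p) for p in trace)

def boolean_eventually(trace) -> bool:
    """Boolean semantics: did any state satisfy the goal?"""
    return any(p >= GOAL for p in trace)

near_miss = [0.0, 4.0, 9.5]   # never reaches the goal, but gets close
far_miss = [0.0, 1.0, 2.0]

# Boolean monitoring cannot tell these apart (both False); the quantitative
# monitor scores the near miss higher, giving the agent a learning signal.
print(boolean_eventually(near_miss), boolean_eventually(far_miss))
print(rho_eventually(near_miss), rho_eventually(far_miss))  # -0.5 vs -8.0
```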

Feature Attribution for Human Sensing with Radio Signals

Authors: Shuokang Huang, Professor Julie A. McCann

Short summary: This paper introduces MatryMask, a new “Matryoshka-style” method for identifying which parts of radio signals matter most when sensing human activity. It is an early step toward making feature attribution more understandable in radio-based human sensing. Unlike existing methods, which require empirical knowledge about the sparsity of important features, MatryMask regularises multiple masks to highlight salient areas at different scales, adapting to the uncertain and varying sparsity of important features in radio signals. To perturb radio signals effectively, the paper further devises a novel frequency-removal perturbation that goes beyond existing spatial- and time-domain perturbations.
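The frequency-removal idea can be pictured with a standard FFT masking step. The signal, sampling rate, and band below are assumptions chosen for illustration, not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy 1-D "radio" signal: two tones plus noise, sampled at 1 kHz for 1 s.
fs = 1000
t = np.arange(1000) / fs
signal = np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 120 * t)
signal += 0.1 * rng.standard_normal(t.size)

def remove_band(x, lo_hz, hi_hz, fs):
    """Zero out one frequency band and return the time-domain signal."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(x.size, d=1 / fs)
    spectrum[(freqs >= lo_hz) & (freqs <= hi_hz)] = 0
    return np.fft.irfft(spectrum, n=x.size)

# Perturb by deleting the 100-140 Hz band (removing the 120 Hz tone here);
# a drop in a downstream sensing model's accuracy on the perturbed signal
# would mark that band as salient.
perturbed = remove_band(signal, 100, 140, fs)
```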

Heterogeneous Graph Neural Networks for Assumption-Based Argumentation

Authors: Preesha Gehlot, Dr Anna Rapberger, Dr Fabrizio Russo, Professor Francesca Toni

Short summary: This paper reimagines how AI can resolve complex debates. It models a debate using Assumption-Based Argumentation (ABA), where different assumptions are pitched against each other according to rules, and then represents the structure of the debate as a graph. This graph representation enables Graph Neural Networks (GNNs) to learn directly from the debate’s structure. Trained on these graphs, the neural models can quickly predict how a debate will resolve, combining the rigour of formal reasoning and symbolic AI with the speed and scalability of neural methods.
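A toy encoding gives a feel for the graph representation. The node and edge types below are simplifications chosen for illustration; the paper's exact heterogeneous encoding may differ:

```python
# Toy ABA framework: assumptions a, b; a rule "p <- a"; the contrary of b
# is p. Nodes and edges are typed, which is what makes the graph
# heterogeneous: a heterogeneous GNN can then learn a different message
# function per node/edge type.

nodes = {
    "a": "assumption",
    "b": "assumption",
    "p": "atom",
    "r1": "rule",      # r1: p <- a
}

edges = [
    ("a", "r1", "body"),      # a appears in the body of rule r1
    ("r1", "p", "head"),      # rule r1 derives p
    ("p", "b", "contrary"),   # p is the contrary of b: deriving p attacks b
]

# A heterogeneous GNN (e.g. one relation-specific weight matrix per edge
# type) would take this typed graph as input and predict, per assumption,
# whether it is accepted under a given argumentation semantics.
for src, dst, rel in edges:
    print(f"{src} ({nodes[src]}) -[{rel}]-> {dst} ({nodes[dst]})")
```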

Argumentative Debates for Transparent Bias Detection

Authors: Dr Hamed Ayoobi, Dr Nico Potyka, Dr Anna Rapberger, Professor Francesca Toni

Short summary: As AI becomes more common in everyday life, it is increasingly important to ensure it does not unfairly disadvantage certain groups of people. While many existing methods detect bias in AI systems, most operate as “black boxes,” offering little insight into how their conclusions are reached. Transparency—being able to understand and explain decisions—is especially important for fairness, as these issues directly affect people.

In this paper, the authors introduce ABIDE, a new transparent approach to bias detection that frames the process as a structured debate. This debate is organised through a clear map of arguments that weighs evidence about how different groups perform in similar situations and how meaningful those comparisons are. The authors show that ABIDE performs comparably to existing methods while providing clearer and more understandable explanations of its results.
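As a deliberately simplified caricature of this evidence-weighing (this is not the ABIDE algorithm; the data, matching rule, and threshold are invented for illustration), a bias "debate" might pit an observed disparity between similarly qualified groups against counter-arguments about its size:

```python
# Caricature of debate-style bias evidence: compare outcomes for similar
# cases across two groups and phrase the comparison as arguments for and
# against a bias hypothesis. All data below is invented.

cases = [
    # (group, qualification score, model decision: 1 = approved)
    ("A", 0.9, 1), ("A", 0.8, 1), ("A", 0.5, 0),
    ("B", 0.9, 0), ("B", 0.8, 1), ("B", 0.5, 0),
]

def approval_rate(group, min_score=0.8):
    """Approval rate among cases of one group above a qualification bar."""
    matched = [d for g, s, d in cases if g == group and s >= min_score]
    return sum(matched) / len(matched)

rate_a, rate_b = approval_rate("A"), approval_rate("B")
print(f"Pro-bias argument: among similarly qualified cases, group A is "
      f"approved at {rate_a:.0%} vs {rate_b:.0%} for group B.")
if abs(rate_a - rate_b) < 0.1:
    print("Counter-argument: the gap is small, weakening the bias claim.")
```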

Beyond Fixed Tasks: Unsupervised Environment Design for Task-Level Pairs 

Authors: Dr Daniel Furelos-Blanco, Charles Pert, Frederik Kelbel, Dr Alex F. Spies, Professor Alessandra Russo, Michael Dennis

Short Summary: The paper looks at a core problem in training AI agents: they’re usually trained on a fixed task while the environment changes, but in the real world both the task and the environment vary. When tasks and environments are paired at random, most combinations turn out to be impossible, which wastes training time and limits what the agent can learn.

To address this, the paper introduces a method called ATLAS that automatically designs tasks and environments together. Instead of holding the task constant, ATLAS jointly creates task–environment pairs that are solvable and appropriately challenging, and adapts them as the agent improves. The results show that co-designing tasks and environments leads to much more effective learning than random pairing, especially in difficult settings where workable combinations are rare.
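In generic unsupervised environment design terms, the co-design loop looks roughly like the schematic below. This is an assumption-laden sketch of the general technique, not the ATLAS algorithm; the generator, feasibility check, and scoring rule are all invented for illustration:

```python
import random

random.seed(0)

def propose_pair():
    """Hypothetical generator: a goal distance (task) + wall density (env)."""
    return {"goal_distance": random.randint(1, 20),
            "wall_density": random.random()}

def is_solvable(pair):
    """Stand-in feasibility check; real systems verify a path exists."""
    return pair["wall_density"] < 0.8

def challenge(pair, agent_skill):
    """Pairs near the agent's current ability are most useful to train on."""
    return -abs(pair["goal_distance"] - agent_skill)

agent_skill = 5.0
for step in range(3):
    # Propose many pairs, discard unsolvable ones, train on the best match.
    candidates = [p for p in (propose_pair() for _ in range(50))
                  if is_solvable(p)]
    best = max(candidates, key=lambda p: challenge(p, agent_skill))
    agent_skill += 0.5   # pretend the agent improved after training on it
    print(step, best)
```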

Enhancing Binary Encoded Crime Linkage Analysis Using Siamese Network

Authors: Yicheng Zhan, Fahim Ahmed, Dr Amy Burrell, Professor Matthew Tonkin, Sarah Galambos, Professor Jessica Woodhams, Dr Dalal Alrajeh

Short Summary: Linking crimes that may have been committed by the same offender is important for identifying serial criminals and improving public safety. Traditional crime linkage methods struggle because crime data is often very complex: it contains many variables, lots of missing information, and different types of data (such as behaviour, location, and time).

To address this, the paper introduces a Siamese Autoencoder, a type of machine-learning model designed to learn compact, meaningful representations of crimes and identify patterns that suggest links between them. The model is trained using data from the Violent Crime Linkage Analysis System (ViCLAS), which is maintained by the UK National Crime Agency.
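A minimal sketch of a Siamese autoencoder of this general kind is below, assuming PyTorch; the layer sizes, loss weighting, and random binary inputs are illustrative stand-ins, not the paper's architecture or ViCLAS data:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseAutoencoder(nn.Module):
    """Shared encoder/decoder applied to both crimes in a candidate pair."""

    def __init__(self, n_features=128, n_latent=16):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_features, 64), nn.ReLU(), nn.Linear(64, n_latent))
        self.decoder = nn.Sequential(
            nn.Linear(n_latent, 64), nn.ReLU(), nn.Linear(64, n_features))

    def forward(self, x1, x2):
        z1, z2 = self.encoder(x1), self.encoder(x2)   # shared weights
        return z1, z2, self.decoder(z1), self.decoder(z2)

def loss_fn(x1, x2, linked, model, margin=1.0):
    z1, z2, r1, r2 = model(x1, x2)
    # Reconstruction term: the latent code must preserve each crime record.
    recon = (F.binary_cross_entropy_with_logits(r1, x1) +
             F.binary_cross_entropy_with_logits(r2, x2))
    # Contrastive term: pull linked crime pairs together, push others apart.
    d = F.pairwise_distance(z1, z2)
    contrastive = (linked * d.pow(2) +
                   (1 - linked) * F.relu(margin - d).pow(2)).mean()
    return recon + contrastive

model = SiameseAutoencoder()
x1 = torch.randint(0, 2, (32, 128)).float()   # binary-encoded crime A
x2 = torch.randint(0, 2, (32, 128)).float()   # binary-encoded crime B
linked = torch.randint(0, 2, (32,)).float()   # 1 if same offender
loss = loss_fn(x1, x2, linked, model)
loss.backward()
```

After training, the distance between two crimes' latent codes can serve as a linkage score for analysts to triage.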

Dr Sergio Maffeis' group presented a paper at an affiliated workshop, the AAAI 2026 Workshop on Trust and Control in Agentic AI (TrustAgent):

Blue Teaming Function-Calling Agents

Authors: Greta Dolcetti, Dr Giulio Zizzo, Dr Sergio Maffeis

Short summary: Function-calling agents extend the capabilities of Large Language Models (LLMs) by enabling them to perform actions and interact with their environment, increasing their flexibility beyond text generation alone.

In this paper, the authors present an experimental evaluation of the robustness of four open-source LLMs with claimed function-calling capabilities against three different attacks, and assess the effectiveness of eight defence mechanisms. The results show that these models are not safe by default and that the proposed defences are not yet suitable for real-world deployment.
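The attacks and defences evaluated in the paper are not reproduced here, but one simple defence in this general space is a guard that validates a model's proposed function call against an allow-list before anything executes. The function names and schema below are hypothetical:

```python
import json

# Hypothetical guard: check an LLM-proposed function call against an
# allow-list and per-function argument schema before executing anything.

ALLOWED = {
    "get_weather": {"city": str},
    "search_docs": {"query": str},
}

def guard(raw_call: str):
    call = json.loads(raw_call)
    name, args = call.get("name"), call.get("arguments", {})
    if name not in ALLOWED:
        raise PermissionError(f"function {name!r} is not on the allow-list")
    schema = ALLOWED[name]
    if set(args) != set(schema) or any(
            not isinstance(v, schema[k]) for k, v in args.items()):
        raise ValueError(f"arguments for {name!r} do not match the schema")
    return name, args

# A prompt-injected call to an unlisted function is rejected here rather
# than executed; a valid call passes through.
print(guard('{"name": "get_weather", "arguments": {"city": "London"}}'))
```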

Congratulations to the authors and their collaborators on this important achievement. These acceptances highlight the Department’s active research in reinforcement learning, formal specification, and trustworthy AI.

Article text (excluding photos or graphics) © Imperial College London.

Photos and graphics subject to third party copyright used with permission or © Imperial College London.
