Imperial College London


Faculty of Engineering, Department of Computing

Professor in Computational Logic



+44 (0)20 7594 8228 f.toni




430, Huxley Building, South Kensington Campus






389 results found

Zhang K, Toni F, Williams M, 2022, A federated Cox model with non-proportional hazards, The 6th International Workshop on Health Intelligence, Publisher: Springer, ISSN: 1860-949X

Recent research has shown the potential for neural networks to improve upon classical survival models such as the Cox model, which is widely used in clinical practice. Neural networks, however, typically rely on data that are centrally available, whereas healthcare data are frequently held in secure silos. We present a federated Cox model that accommodates this data setting and also relaxes the proportional hazards assumption, allowing time-varying covariate effects. In this latter respect, our model does not require explicit specification of the time-varying effects, reducing upfront organisational costs compared to previous works. We experiment with publicly available clinical datasets and demonstrate that the federated model is able to perform as well as a standard model.

Conference paper

Potyka N, Yin X, Toni F, 2022, Explaining random forests using bipolar argumentation and Markov networks, AAAI 23, Publisher: AAAI, ISSN: 2159-5399

Random forests are decision tree ensembles that can be used to solve a variety of machine learning problems. However, as the number of trees and their individual size can be large, their decision making process is often incomprehensible. We show that their decision process can be naturally represented as an argumentation problem, which allows creating global explanations via argumentative reasoning. We generalize sufficient and necessary argumentative explanations using a Markov network encoding, discuss the relevance of these explanations and establish relationships to families of abductive explanations from the literature. As the complexity of the explanation problems is high, we present an efficient approximation algorithm with probabilistic approximation guarantees.

Conference paper

Jiang J, Leofante F, Rago A, Toni F et al., 2022, Formalising the robustness of counterfactual explanations for neural networks, The 37th AAAI Conference on Artificial Intelligence, Publisher: Association for the Advancement of Artificial Intelligence

The use of counterfactual explanations (CFXs) is an increasingly popular explanation strategy for machine learning models. However, recent studies have shown that these explanations may not be robust to changes in the underlying model (e.g., following retraining), which raises questions about their reliability in real-world applications. Existing attempts towards solving this problem are heuristic, and the robustness to model changes of the resulting CFXs is evaluated with only a small number of retrained models, failing to provide exhaustive guarantees. To remedy this, we propose the first notion to formally and deterministically assess the robustness (to model changes) of CFXs for neural networks, that we call ∆-robustness. We introduce an abstraction framework based on interval neural networks to verify the ∆-robustness of CFXs against a possibly infinite set of changes to the model parameters, i.e., weights and biases. We then demonstrate the utility of this approach in two distinct ways. First, we analyse the ∆-robustness of a number of CFX generation methods from the literature and show that they unanimously host significant deficiencies in this regard. Second, we demonstrate how embedding ∆-robustness within existing methods can provide CFXs which are provably robust.

Conference paper

Albini E, Rago A, Baroni P, Toni F et al., 2022, Descriptive accuracy in explanations: the case of probabilistic classifiers, 15th International Conference on Scalable Uncertainty Management (SUM 2022)

A user receiving an explanation for outcomes produced by an artificially intelligent system expects that it satisfies the key property of descriptive accuracy (DA), i.e. that the explanation contents are in correspondence with the internal working of the system. Crucial as this property appears to be, it has been somehow overlooked in the XAI literature to date. To address this problem, we consider the questions of formalising DA and of analysing its satisfaction by explanation methods. We provide formal definitions of naive, structural and dialectical DA, using the family of probabilistic classifiers as the context for our analysis. We evaluate the satisfaction of our given notions of DA by several explanation methods, amounting to two popular feature-attribution methods from the literature and a novel form of explanation that we propose, and complement our analysis with experiments carried out on a varied selection of concrete probabilistic classifiers.

Conference paper

Proietti M, Toni F, 2022, Learning assumption-based argumentation frameworks, 31st International Conference on Inductive Logic Programming (ILP 2022)

We propose a novel approach to logic-based learning which generates assumption-based argumentation (ABA) frameworks from positive and negative examples, using a given background knowledge. These ABA frameworks can be mapped onto logic programs with negation as failure that may be non-stratified. Whereas existing argumentation-based methods learn exceptions to general rules by interpreting the exceptions as rebuttal attacks, our approach interprets them as undercutting attacks. Our learning technique is based on the use of transformation rules, including some adapted from logic program transformation rules (notably folding) as well as others, such as rote learning and assumption introduction. We present a general strategy that applies the transformation rules in a suitable order to learn stratified frameworks, and we also propose a variant that handles the non-stratified case. We illustrate the benefits of our approach with a number of examples, which show that, on one hand, we are able to easily reconstruct other logic-based learning approaches and, on the other hand, we can work out in a very simple and natural way problems that seem to be hard for existing techniques.

Conference paper

Potyka N, Yin X, Toni F, 2022, On the tradeoff between correctness and completeness in argumentative explainable AI, 1st International Workshop on Argumentation for eXplainable AI, Publisher: CEUR Workshop Proceedings, Pages: 1-8, ISSN: 1613-0073

Explainable AI aims at making the decisions of autonomous systems human-understandable. Argumentation frameworks are a natural tool for this purpose. Among them, bipolar abstract argumentation frameworks seem well suited to explain the effect of features on a classification decision, and their formal properties can potentially be used to derive formal guarantees for explanations. Two particularly interesting properties are correctness (if the explanation says that X affects Y, then X affects Y) and completeness (if X affects Y, then the explanation says that X affects Y). The reinforcement property of bipolar argumentation frameworks has been used as a natural correctness counterpart in previous work. Applied to the classification context, it basically states that attacking features should decrease and supporting features should increase the confidence of a classifier. In this short discussion paper, we revisit this idea, discuss potential limitations when considering reinforcement without a corresponding completeness property and how these limitations can potentially be overcome.

Conference paper

Toni F, Polberg S, Booth R, Caminada M, Kido H et al., 2022, Preface, ISBN: 9781643683065


Sukpanichnant P, Rago A, Lertvittayakumjorn P, Toni F et al., 2022, Neural QBAFs: explaining neural networks under LRP-based argumentation frameworks, International Conference of the Italian Association for Artificial Intelligence, Publisher: Springer International Publishing, Pages: 429-444, ISSN: 0302-9743

In recent years, there have been many attempts to combine XAI with the field of symbolic AI in order to generate explanations for neural networks that are more interpretable and better align with human reasoning, with one prominent candidate for this synergy being the sub-field of computational argumentation. One method is to represent neural networks with quantitative bipolar argumentation frameworks (QBAFs) equipped with a particular semantics. The resulting QBAF can then be viewed as an explanation for the associated neural network. In this paper, we explore a novel LRP-based semantics under a new QBAF variant, namely neural QBAFs (nQBAFs). Since an nQBAF of a neural network is typically large, the nQBAF must be simplified before being used as an explanation. Our empirical evaluation indicates that the manner of this simplification is all-important for the quality of the resulting explanation.

Conference paper

Kori A, Toni F, Glocker B, 2022, GLANCE: Global to Local Architecture-Neutral Concept-based Explanations

Most of the current explainability techniques focus on capturing the importance of features in input space. However, given the complexity of models and data-generating processes, the resulting explanations are far from being 'complete', in that they lack an indication of feature interactions and visualization of their 'effect'. In this work, we propose a novel twin-surrogate explainability framework to explain the decisions made by any CNN-based image classifier (irrespective of the architecture). For this, we first disentangle latent features from the classifier, followed by aligning these features to observed/human-defined 'context' features. These aligned features form semantically meaningful concepts that are used for extracting a causal graph depicting the 'perceived' data-generating process, describing the inter- and intra-feature interactions between unobserved latent features and observed 'context' features. This causal graph serves as a global model from which local explanations of different forms can be extracted. Specifically, we provide a generator to visualize the 'effect' of interactions among features in latent space and draw feature importance therefrom as local explanations. Our framework utilizes adversarial knowledge distillation to faithfully learn a representation from the classifiers' latent space and use it for extracting visual explanations. We use the StyleGAN-v2 architecture with an additional regularization term to enforce disentanglement and alignment. We demonstrate and evaluate explanations obtained with our framework on Morpho-MNIST and on the FFHQ human faces dataset. Our framework is available at \url{}.

Working paper

Ward F, Belardinelli F, Toni F, 2022, Argumentative Reward Learning: Reasoning About Human Preferences, HMCaT 2022 (ICML)

Conference paper

Jiang J, Rago A, Toni F, 2022, Should counterfactual explanations always be data instances?, XLoKR 2022: The Third Workshop on Explainable Logic-Based Knowledge Representation

Counterfactual explanations (CEs) are an increasingly popular way of explaining machine learning classifiers. Predominantly, they amount to data instances pointing to potential changes to the inputs that would lead to alternative outputs. In this position paper we question the widespread assumption that CEs should always be data instances, and argue instead that in some cases they may be better understood in terms of special types of relations between input features and classification variables. We illustrate how a special type of these relations, amounting to critical influences, can characterise and guide the search for data instances deemed suitable as CEs. These relations also provide compact indications of which input features - rather than their specific values in data instances - have counterfactual value.

Conference paper

Ward F, Belardinelli F, Toni F, 2022, Argumentative Reward Learning: Reasoning About Human Preferences, MPREF 2022 (IJCAI-ECAI 2022)

Conference paper

Ward F, Toni F, Belardinelli F, 2022, A Causal Perspective on AI Deception, CAUSAL 22 (ICLP)

Conference paper

Ward F, Toni F, Belardinelli F, 2022, A Causal Perspective on AI Deception in Games, AI Safety 2022 (IJCAI-ECAI-22)

Conference paper

Ward F, Toni F, Belardinelli F, 2022, On agent incentives to manipulate human feedback in multi-agent reward learning scenarios, AAMAS 22, Publisher: ACM, Pages: 1759-1761

In settings without well-defined goals, methods for reward learning allow reinforcement learning agents to infer goals from human feedback. Existing work has discussed the problem that such agents may manipulate humans, or the reward learning process, in order to gain higher reward. We introduce the neglected problem that, in multi-agent settings, agents may have incentives to manipulate one another’s reward functions in order to change each other’s behavioral policies. We focus on the setting with humans acting alongside assistive (artificial) agents who must learn the reward function by interacting with these humans. We propose a possible solution to manipulation of human feedback in this setting: the Shared Value Prior (SVP). The SVP equips agents with an assumption that the reward functions of all humans are similar. Given this assumption, the actions of any human provide information to an agent about its reward, and so the agent is incentivised to observe these actions rather than to manipulate them. We present an expository example in which the SVP prevents manipulation.

Conference paper

Irwin B, Rago A, Toni F, 2022, Argumentative forecasting, AAMAS 2022, Publisher: ACM, Pages: 1636-1638

We introduce the Forecasting Argumentation Framework (FAF), a novel argumentation framework for forecasting informed by recent judgmental forecasting research. FAFs comprise update frameworks which empower (human or artificial) agents to argue over time with and about probability of scenarios, whilst flagging perceived irrationality in their behaviour with a view to improving their forecasting accuracy. FAFs include three argument types with future forecasts and aggregate the strength of these arguments to inform estimates of the likelihood of scenarios. We describe an implementation of FAFs for supporting forecasting agents.

Conference paper

Gaskell A, Miao Y, Toni F, Specia L et al., 2022, Logically consistent adversarial attacks for soft theorem provers, 31st International Joint Conference on Artificial Intelligence and the 25th European Conference on Artificial Intelligence, Publisher: International Joint Conferences on Artificial Intelligence

Recent efforts within the AI community have yielded impressive results towards “soft theorem proving” over natural language sentences using language models. We propose a novel, generative adversarial framework for probing and improving these models’ reasoning capabilities. Adversarial attacks in this domain suffer from the logical inconsistency problem, whereby perturbations to the input may alter the label. Our Logically consistent AdVersarial Attacker, LAVA, addresses this by combining a structured generative process with a symbolic solver, guaranteeing logical consistency. Our framework successfully generates adversarial attacks and identifies global weaknesses common across multiple target models. Our analyses reveal naive heuristics and vulnerabilities in these models’ reasoning capabilities, exposing an incomplete grasp of logical deduction under logic programs. Finally, in addition to effective probing of these models, we show that training on the generated samples improves the target model’s performance.

Conference paper

Rago A, Baroni P, Toni F, 2022, Explaining causal models with argumentation: the case of bi-variate reinforcement, 19th International Conference on Principles of Knowledge Representation and Reasoning (KR 2022), Publisher: IJCAI Organisation, ISSN: 2334-1033

Causal models are playing an increasingly important role in machine learning, particularly in the realm of explainable AI. We introduce a conceptualisation for generating argumentation frameworks (AFs) from causal models for the purpose of forging explanations for the models’ outputs. The conceptualisation is based on reinterpreting desirable properties of semantics of AFs as explanation moulds, which are means for characterising the relations in the causal model argumentatively. We demonstrate our methodology by reinterpreting the property of bi-variate reinforcement as an explanation mould to forge bipolar AFs as explanations for the outputs of causal models. We perform a theoretical evaluation of these argumentative explanations, examining whether they satisfy a range of desirable explanatory and argumentative properties.

Conference paper

Irwin B, Rago A, Toni F, 2022, Forecasting argumentation frameworks, 19th International Conference on Principles of Knowledge Representation and Reasoning (KR 2022), Publisher: IJCAI Organisation, ISSN: 2334-1033

We introduce Forecasting Argumentation Frameworks (FAFs), a novel argumentation-based methodology for forecasting informed by recent judgmental forecasting research. FAFs comprise update frameworks which empower (human or artificial) agents to argue over time about the probability of outcomes, e.g. the winner of a political election or a fluctuation in inflation rates, whilst flagging perceived irrationality in the agents’ behaviour with a view to improving their forecasting accuracy. FAFs include five argument types, amounting to standard pro/con arguments, as in bipolar argumentation, as well as novel proposal arguments and increase/decrease amendment arguments. We adapt an existing gradual semantics for bipolar argumentation to determine the aggregated dialectical strength of proposal arguments and define irrational behaviour. We then give a simple aggregation function which produces a final group forecast from rational agents’ individual forecasts. We identify and study properties of FAFs and conduct an empirical evaluation which signals FAFs’ potential to increase the forecasting accuracy of participants.

Conference paper

Paulino-Passos G, Toni F, 2022, On Monotonicity of Dispute Trees as Explanations for Case-Based Reasoning with Abstract Argumentation, ISSN: 1613-0073

Recent work on explainability raises the question of what different types of explanations actually mean. One idea is that explanations can reveal information about the behaviour of the model on a subset of the input space. When this way of interpreting explanations is thought of as an interactive process, inferences from explanations can be seen as a form of reasoning. In the case of case-based reasoning with abstract argumentation (AA-CBR), previous work has used arbitrated dispute trees as a methodology for explanation. Those are dispute trees where nodes are seen as losing or winning depending on the outcome for the new case under consideration. In this work we show how arbitrated dispute trees can be readapted for different inputs, which allows a broader interpretation of them, capturing more of the input-output behaviour of the model. We show this readaptation is correct by construction, and thus the resulting reasoning based on this reuse is monotonic and thus necessarily a faithful explanation.

Conference paper

Oksanen J, Majumder A, Saunack K, Toni F, Dhondiyal A et al., 2022, A Graph-Based Method for Unsupervised Knowledge Discovery from Financial Texts, 13th International Conference on Language Resources and Evaluation (LREC), Publisher: EUROPEAN LANGUAGE RESOURCES ASSOC-ELRA, Pages: 5412-5417

Conference paper

Sukpanichnant P, Rago A, Lertvittayakumjorn P, Toni F et al., 2021, LRP-based argumentative explanations for neural networks, 2021 - Italian Workshop on Explainable Artificial Intelligence, Pages: 71-84, ISSN: 1613-0073

In recent years, there have been many attempts to combine XAI with the field of symbolic AI in order to generate explanations for neural networks that are more interpretable and better align with human reasoning, with one prominent candidate for this synergy being the sub-field of computational argumentation. One method is to represent neural networks with quantitative bipolar argumentation frameworks (QBAFs) equipped with a particular semantics. The resulting QBAF can then be viewed as an explanation for the associated neural network. In this paper, we explore a novel LRP-based semantics under a new QBAF variant, namely neural QBAFs (nQBAFs). Since an nQBAF of a neural network is typically large, the nQBAF must be simplified before being used as an explanation. Our empirical evaluation indicates that the manner of this simplification is all-important for the quality of the resulting explanation.

Conference paper

Albini E, Rago A, Baroni P, Toni F et al., 2021, Influence-driven explanations for Bayesian network classifiers, PRICAI 2021, Publisher: Springer Verlag, Pages: 88-100, ISSN: 0302-9743

We propose a novel approach to building influence-driven explanations (IDXs) for (discrete) Bayesian network classifiers (BCs). IDXs feature two main advantages wrt other commonly adopted explanation methods. First, IDXs may be generated using the (causal) influences between intermediate, in addition to merely input and output, variables within BCs, thus providing a deep, rather than shallow, account of the BCs’ behaviour. Second, IDXs are generated according to a configurable set of properties, specifying which influences between variables count towards explanations. Our approach is thus flexible and can be tailored to the requirements of particular contexts or users. Leveraging on this flexibility, we propose novel IDX instances as well as IDX instances capturing existing approaches. We demonstrate IDXs’ capability to explain various forms of BCs, and assess the advantages of our proposed IDX instances with both theoretical and empirical analyses.

Conference paper

Rago A, Cocarascu O, Bechlivanidis C, Toni F et al., 2021, Argumentation as a framework for interactive explanations for recommendations, KR 2020, 17th International Conference on Principles of Knowledge Representation and Reasoning, Publisher: IJCAI, Pages: 805-815, ISSN: 2334-1033

As AI systems become ever more intertwined in our personal lives, the way in which they explain themselves to and interact with humans is an increasingly critical research area. The explanation of recommendations is thus a pivotal functionality in a user’s experience of a recommender system (RS), providing the possibility of enhancing many of its desirable features in addition to its effectiveness (accuracy wrt users’ preferences). For an RS that we prove empirically is effective, we show how argumentative abstractions underpinning recommendations can provide the structural scaffolding for (different types of) interactive explanations (IEs), i.e. explanations empowering interactions with users. We prove formally that these IEs empower feedback mechanisms that guarantee that recommendations will improve with time, hence rendering the RS scrutable. Finally, we prove experimentally that the various forms of IE (tabular, textual and conversational) induce trust in the recommendations and provide a high degree of transparency in the RS’s functionality.

Conference paper

Kotonya N, Spooner T, Magazzeni D, Toni F et al., 2021, Graph Reasoning with Context-Aware Linearization for Interpretable Fact Extraction and Verification, FEVER 2021

Conference paper

Cyras K, Rago A, Albini E, Baroni P, Toni F et al., 2021, Argumentative XAI: a survey, The 30th International Joint Conference on Artificial Intelligence (IJCAI-21), Publisher: International Joint Conferences on Artificial Intelligence, Pages: 4392-4399

Explainable AI (XAI) has been investigated for decades and, together with AI itself, has witnessed unprecedented growth in recent years. Among various approaches to XAI, argumentative models have been advocated in both the AI and social science literature, as their dialectical nature appears to match some basic desirable features of the explanation activity. In this survey we overview XAI approaches built using methods from the field of computational argumentation, leveraging its wide array of reasoning abstractions and explanation delivery methods. We overview the literature focusing on different types of explanation (intrinsic and post-hoc), different models with which argumentation-based explanations are deployed, different forms of delivery, and different argumentation frameworks they use. We also lay out a roadmap for future work.

Conference paper

Cocarascu O, Cyras K, Rago A, Toni F et al., 2021, Mining property-driven graphical explanations for data-centric AI from argumentation frameworks, Human-Like Machine Intelligence, Pages: 93-113, ISBN: 9780198862536

Book chapter

Zylberajch H, Lertvittayakumjorn P, Toni F, 2021, HILDIF: interactive debugging of NLI models using influence functions, 1st Workshop on Interactive Learning for Natural Language Processing (InterNLP), Publisher: ASSOC COMPUTATIONAL LINGUISTICS-ACL, Pages: 1-6

Biases and artifacts in training data can cause unwelcome behavior in text classifiers (such as shallow pattern matching), leading to lack of generalizability. One solution to this problem is to include users in the loop and leverage their feedback to improve models. We propose a novel explanatory debugging pipeline called HILDIF, enabling humans to improve deep text classifiers using influence functions as an explanation method. We experiment on the Natural Language Inference (NLI) task, showing that HILDIF can effectively alleviate artifact problems in fine-tuned BERT models and result in increased model generalizability.

Conference paper

Albini E, Baroni P, Rago A, Toni F et al., 2021, Interpreting and explaining PageRank through argumentation semantics, Intelligenza Artificiale, Vol: 15, Pages: 17-34, ISSN: 1724-8035

In this paper we show how re-interpreting PageRank as an argumentation semantics for a bipolar argumentation framework empowers its explainability. After showing that PageRank, naively re-interpreted as an argumentation semantics for support frameworks, fails to satisfy some generally desirable properties, we propose a novel approach able to reconstruct PageRank as a gradual semantics of a suitably defined bipolar argumentation framework, while satisfying these properties. We then show how the theoretical advantages afforded by this approach also enjoy an enhanced explanatory power: we propose several types of argument-based explanations for PageRank, each of which focuses on different aspects of the algorithm and uncovers information useful for the comprehension of its results.

Journal article

Paulino-Passos G, Toni F, 2021, Monotonicity and Noise-Tolerance in Case-Based Reasoning with Abstract Argumentation (with Appendix)

Recently, abstract argumentation-based models of case-based reasoning ($AA\text{-}CBR$ in short) have been proposed, originally inspired by the legal domain, but also applicable as classifiers in different scenarios. However, the formal properties of $AA\text{-}CBR$ as a reasoning system remain largely unexplored. In this paper, we focus on analysing the non-monotonicity properties of a regular version of $AA\text{-}CBR$ (that we call $AA\text{-}CBR_{\succeq}$). Specifically, we prove that $AA\text{-}CBR_{\succeq}$ is not cautiously monotonic, a property frequently considered desirable in the literature. We then define a variation of $AA\text{-}CBR_{\succeq}$ which is cautiously monotonic. Further, we prove that such variation is equivalent to using $AA\text{-}CBR_{\succeq}$ with a restricted casebase consisting of all "surprising" and "sufficient" cases in the original casebase. As a by-product, we prove that this variation of $AA\text{-}CBR_{\succeq}$ is cumulative, rationally monotonic, and empowers a principled treatment of noise in "incoherent" casebases. Finally, we illustrate $AA\text{-}CBR$ and cautious monotonicity questions on a case study on the U.S. Trade Secrets domain, a legal casebase.


