Publications

Al Wahaibi S, Foley M, Maffeis S, 2023, SQIRL: Grey-box detection of SQL injection vulnerabilities using reinforcement learning, USENIX Security, Publisher: USENIX Security, Pages: 6097-6114

Web security scanners are used to discover SQL injection vulnerabilities in deployed web applications. Scanners tend to use static rules to cover the most common injection cases, missing diversity in their payloads, leading to a high volume of requests and false negatives. Moreover, scanners often rely on the presence of error messages or other significant feedback on the target web pages, as a result of additional insecure programming practices by web developers. In this paper we develop SQIRL, a novel approach to detecting SQL injection vulnerabilities based on deep reinforcement learning, using multiple worker agents and grey-box feedback. Each worker intelligently fuzzes the input fields discovered by an automated crawling component. This approach generates a more varied set of payloads than existing scanners, leading to the discovery of more vulnerabilities. Moreover, SQIRL attempts fewer payloads, because they are generated in a targeted fashion. SQIRL finds all vulnerabilities in our microbenchmark for SQL injection, with substantially fewer requests than most of the state-of-the-art scanners it was compared with. It also significantly outperforms other scanners on a set of 14 production-grade web applications, discovering 33 vulnerabilities, with zero false positives. We have responsibly disclosed 22 novel vulnerabilities found by SQIRL, grouped in 6 CVEs.

Conference paper

Foley M, Maffeis S, 2023, HAXSS: Hierarchical reinforcement learning for XSS payload generation, IEEE TrustCom 2022, Publisher: IEEE, Pages: 147-158

Web application vulnerabilities are an ongoing problem that current black-box techniques and scanners do not entirely solve, suffering in particular from a lack of payload diversity that prevents them from capturing the long tail of vulnerabilities caused by uncommon sanitisation mistakes. In order to increase the diversity of payloads that can be automatically generated in a black-box fashion, we develop a hierarchical reinforcement learning approach where agents focus separately on the tasks of escaping the current context, and evading sanitisation. We implement this in an end-to-end prototype we call HAXSS. We compare our approach against a number of state-of-the-art black-box scanners on a new micro-benchmark for XSS payload generation, and on a macro-benchmark of established vulnerable web applications. HAXSS outperforms the other scanners on both benchmarks, identifying 131 vulnerabilities (a 20% improvement over the closest scanner), reporting 0 false positives. Finally, we demonstrate that our approach is practically useful, as HAXSS re-discovers 4 existing CVEs and discovers 5 new CVEs in 3 production-grade web applications.

Conference paper

Highnam K, Hanif Z, Van Vogt E, Parbhoo S, Maffeis S, Jennings NR et al., 2023, Adaptive Experimental Design for Intrusion Data Collection, CEUR Workshop Proceedings, Vol: 3652, Pages: 134-151, ISSN: 1613-0073

Intrusion research frequently collects data on attack techniques currently employed and their potential symptoms. This includes deploying honeypots, logging events from existing devices, employing a red team for a sample attack campaign, or simulating system activity. However, these observational studies do not clearly discern the cause-and-effect relationships between the design of the environment and the data recorded. Neglecting such relationships increases the chance of drawing biased conclusions due to unconsidered factors, such as spurious correlations between features and errors in measurement or classification. In this paper, we present the theory and empirical data on methods that aim to discover such causal relationships efficiently. Our adaptive design (AD) is inspired by the clinical trial community: a variant of a randomized control trial (RCT) to measure how a particular “treatment” affects a population. To contrast our method with observational studies and RCT, we run the first controlled and adaptive honeypot deployment study, identifying the causal relationship between an ssh vulnerability and the rate of server exploitation. We demonstrate that our AD method decreases the total time needed to run the deployment by at least 33%, while still confidently stating the impact of our change in the environment. Compared to an analogous honeypot study with a control group, our AD requests 17% fewer honeypots while collecting 19% more attack recordings.

Journal article

Alageel A, Maffeis S, 2022, EARLYCROW: Detecting APT Malware Command and Control over HTTP(S) Using Contextual Summaries, 25th International Conference, ISC 2022, Publisher: Springer International Publishing, Pages: 290-316

Advanced Persistent Threats (APTs) are among the most sophisticated threats facing critical organizations worldwide. APTs employ specific tactics, techniques, and procedures (TTPs) which make them difficult to detect in comparison to frequent and aggressive attacks. In fact, current network intrusion detection systems struggle to detect APT communications, allowing such threats to persist unnoticed on victims' machines for months or even years.

Conference paper

Rabheru R, Hanif H, Maffeis S, 2022, A Hybrid Graph Neural Network Approach for Detecting PHP Vulnerabilities, 5th IEEE Conference on Dependable and Secure Computing (IEEE DSC), Publisher: IEEE

Citations: 1

Conference paper

Hanif H, Maffeis S, 2022, VulBERTa: Simplified Source Code Pre-Training for Vulnerability Detection, IEEE International Conference on Fuzzy Systems (FUZZ-IEEE) / IEEE World Congress on Computational Intelligence (IEEE WCCI) / International Joint Conference on Neural Networks (IJCNN) / IEEE Congress on Evolutionary Computation (IEEE CEC), Publisher: IEEE, ISSN: 2161-4393

Citations: 5

Conference paper

Rabheru R, Hanif H, Maffeis S, 2021, DeepTective: Detection of PHP vulnerabilities using hybrid graph neural networks, Pages: 1687-1690

This paper presents DeepTective, a deep learning-based approach to detect vulnerabilities in PHP source code. DeepTective implements a novel hybrid technique that combines Gated Recurrent Units and Graph Convolutional Networks to detect SQLi, XSS and OSCI vulnerabilities leveraging both syntactic and semantic information. Experimental results show that our model outperformed related solutions on both synthetic and realistic datasets, and was able to discover 4 novel vulnerabilities in established WordPress plugins.

Citations: 8

Conference paper

Alageel A, Maffeis S, 2021, Hawk-Eye: holistic detection of APT command and control domains, SAC '21: The 36th ACM/SIGAPP Symposium on Applied Computing, Publisher: ACM, Pages: 1664-1673

The high complexity and low volume of APT attacks have led to limited insight into their behavior and to a scarcity of data, hindering research on effective detection techniques. In this paper we present a comprehensive study of the usage of domains in the context of the Command and Control (C&C) infrastructure of APTs, covering 63 APT campaigns spanning the last 13 years. We discuss the APT threat model, focusing in particular on evasion techniques, and collect an extensive dataset for studying APT C&C domains. Based on the gained insight, we propose a number of novel features to detect APTs, leveraging both semantic properties of the domains themselves and structural properties of their DNS infrastructure. We build Hawk-Eye, a system to classify domain names extracted from PCAP files, and use it to evaluate the performance of the various features we studied, and compare them to malicious domain detection features from the literature. We find that a holistic approach combining selected orthogonal features achieves the best performance, with an F1-score of 98.53% and an FPR of 0.35%.

Conference paper

Zizzo G, Rawat A, Sinn M, Maffeis S, Hankin C et al., 2021, Certified Federated Adversarial Training, CoRR, Vol: abs/2112.10525

Journal article

Zizzo G, Hankin C, Maffeis S, Jones K et al., 2020, Adversarial attacks on time-series intrusion detection for industrial control systems, The 19th IEEE International Conference on Trust, Security and Privacy in Computing and Communications, Publisher: Institute of Electrical and Electronics Engineers

Neural networks are increasingly used for intrusion detection on industrial control systems (ICS). With neural networks being vulnerable to adversarial examples, attackers who wish to cause damage to an ICS can attempt to hide their attacks from detection by using adversarial example techniques. In this work we address the domain-specific challenges of constructing such attacks against autoregressive-based intrusion detection systems (IDS) in an ICS setting. We model an attacker that can compromise a subset of sensors in an ICS which has an LSTM-based IDS. The attacker manipulates the data sent to the IDS, and seeks to hide the presence of real cyber-physical attacks occurring in the ICS. We evaluate our adversarial attack methodology on the Secure Water Treatment system when examining solely continuous data, and on data containing a mixture of discrete and continuous variables. In the continuous data domain our attack successfully hides the cyber-physical attacks requiring 2.87 out of 12 monitored sensors to be compromised on average. With both discrete and continuous data our attack required, on average, 3.74 out of 26 monitored sensors to be compromised.

Conference paper

Zizzo G, Hankin C, Maffeis S, Jones K et al., 2019, Adversarial machine learning beyond the image domain, the 56th Annual Design Automation Conference 2019, Publisher: ACM Press

Machine learning systems have had enormous success in a wide range of fields, from computer vision and natural language processing to anomaly detection. However, such systems are vulnerable to attackers who can cause deliberate misclassification by introducing small perturbations. With machine learning systems being proposed for cyber attack detection, such attackers are cause for serious concern. Despite this, the vast majority of adversarial machine learning security research is focused on the image domain. This work gives a brief overview of adversarial machine learning and machine learning used in cyber attack detection, and suggests key differences between the traditional image domain of adversarial machine learning and the cyber domain. Finally, we show an adversarial machine learning attack on an industrial control system.

Conference paper

Barrere Cambrun M, Hankin C, Barboni A, Zizzo G, Boem F, Maffeis S, Parisini T et al., 2019, CPS-MT: a real-time cyber-physical system monitoring tool for security research, 24th IEEE International Conference on Embedded and Real-Time Computing Systems and Applications (RTCSA2018), Publisher: IEEE

Monitoring systems are essential to understand and control the behaviour of systems and networks. Cyber-physical systems (CPS) are particularly delicate under that perspective since they involve real-time constraints and physical phenomena that are not usually considered in common IT solutions. Therefore, there is a need for publicly available monitoring tools able to contemplate these aspects. In this poster/demo, we present our initiative, called CPS-MT, towards a versatile, real-time CPS monitoring tool, with a particular focus on security research. We first present its architecture and main components, followed by a MiniCPS-based case study. We also describe a performance analysis and preliminary results. During the demo, we will discuss CPS-MT’s capabilities and limitations for security applications.

Conference paper

Zizzo G, Hankin C, Maffeis S, Jones K et al., 2019, Intrusion Detection for Industrial Control Systems: Evaluation Analysis and Adversarial Attacks, CoRR, Vol: abs/1911.04278

Journal article

Zizzo G, Hankin C, Maffeis S, Jones K et al., 2019, Deep Latent Defence, CoRR, Vol: abs/1910.03916

Journal article

Arceri V, Maffeis S, 2016, Abstract domains for type juggling, Numerical and Symbolic Abstract Domains (NSAD), Publisher: Elsevier, ISSN: 1571-0661

Web scripting languages, such as PHP and JavaScript, provide a wide range of dynamic features that make them both flexible and error-prone. In order to prevent bugs in web applications, there is a sore need for powerful static analysis tools. In this paper, we investigate how Abstract Interpretation may be leveraged to provide a precise value analysis providing rich typing information that can be a useful component for such tools. In particular, we define the formal semantics for a core of PHP that illustrates type juggling, the implicit type conversions typical of PHP, and investigate the design of abstract domains and operations that, while still scalable, are expressive enough to cope with type juggling. We believe that our approach can also be applied to other languages with implicit type conversions.

Conference paper

Bella G, Maffeis S, 2016, Special track on computer security: editorial message, SAC 2016, Publisher: ACM, Pages: 2031-2032

Conference paper

Hothersall-Thomas C, Maffeis S, Novakovic C, 2015, BrowserAudit: Automated testing of browser security features, New York, NY, 2015 International Symposium on Software Testing and Analysis, Publisher: Association for Computing Machinery, Pages: 37-47

The security of the client side of a web application relies on browser features such as cookies, the same-origin policy and HTTPS. As the client side grows increasingly powerful and sophisticated, browser vendors have stepped up their offering of security mechanisms which can be leveraged to protect it. These are often introduced experimentally and informally and, as adoption increases, gradually become standardised (e.g., CSP, CORS and HSTS). Considering the diverse landscape of browser vendors, releases, and customised versions for mobile and embedded devices, there is a compelling need for a systematic assessment of browser security. We present BrowserAudit, a tool for testing that a deployed browser enforces the guarantees implied by the main standardised and experimental security mechanisms. It includes more than 400 fully-automated tests that exercise a broad range of security features, helping web users, application developers and security researchers to make an informed security assessment of a deployed browser. We validate BrowserAudit by discovering both fresh and known security-related bugs in major browsers.

Conference paper

Bella G, Maffeis S, 2015, 2015 Special Track on Computer Security, Pages: 2125-2126

Conference paper

Bansal C, Bhargavan K, Delignat-Lavaud A, Maffeis S et al., 2014, Discovering concrete attacks on website authorization by formal analysis, Journal of Computer Security, Vol: 22, Pages: 601-657

Journal article

Bodin M, Chargueraud A, Filaretti D, Gardner P, Maffeis S, Naudziuniene D, Schmitt A, Smith G et al., 2014, A Trusted Mechanised JavaScript Specification, 41st Annual ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), Publisher: Association for Computing Machinery (ACM), Pages: 87-100, ISSN: 1523-2867

Conference paper

Bhargavan K, Delignat-Lavaud A, Maffeis S, 2014, Defensive JavaScript: building and verifying secure web components, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol: 8604, Pages: 88-123, ISSN: 0302-9743

Defensive JavaScript (DJS) is a typed subset of JavaScript that guarantees that the functional behavior of a program cannot be tampered with even if it is loaded by and executed within a malicious environment under the control of the attacker. As such, DJS is ideal for writing JavaScript security components, such as bookmarklets, single sign-on widgets, and cryptographic libraries, that may be loaded within untrusted web pages alongside unknown scripts from arbitrary third parties. We present a tutorial of the DJS language along with motivations for its design. We show how to program security components in DJS, how to verify their defensiveness using the DJS typechecker, and how to analyze their security properties automatically using ProVerif.

Citations: 6

Journal article

Filaretti D, Maffeis S, 2014, An Executable Formal Semantics of PHP, European Conference on Object-Oriented Programming (ECOOP'14), Pages: 120-145

Conference paper

Bansal C, Bhargavan K, Delignat-Lavaud A, Maffeis S et al., 2013, Keys to the Cloud: Formal Analysis and Concrete Attacks on Encrypted Web Storage, Conference on Principles of Security and Trust (POST'13), Pages: 126-146

Conference paper

Bhargavan K, Delignat-Lavaud A, Maffeis S, 2013, Language-based defenses against untrusted browser origins, 22nd Usenix Security Symposium, Pages: 653-670

Conference paper

Maffeis S, Rezk T, 2012, PLAS'12 - Proceedings of Programming Languages and Analysis for Security: Preface, PLAS'12 - Proceedings of Programming Languages and Analysis for Security

Journal article

Gardner P, Maffeis S, Smith G, 2012, Towards a program logic for JavaScript, POPL 2012, Publisher: ACM, Pages: 31-44

JavaScript has become the most widely used language for client-side web programming. The dynamic nature of JavaScript makes understanding its code notoriously difficult, leading to buggy programs and a lack of adequate static-analysis tools. We believe that logical reasoning has much to offer JavaScript: a simple description of program behaviour, a clear understanding of module boundaries, and the ability to verify security contracts. We introduce a program logic for reasoning about a broad subset of JavaScript, including challenging features such as prototype inheritance and with. We adapt ideas from separation logic to provide tractable reasoning about JavaScript code: reasoning about easy programs is easy; reasoning about hard programs is possible. We prove a strong soundness result. All libraries written in our subset and proved correct with respect to their specifications will be well-behaved, even when called by arbitrary JavaScript code.

Conference paper

Bansal C, Bhargavan K, Maffeis S, 2012, Discovering Concrete Attacks on Website Authorization by Formal Analysis, 25th Computer Security Foundations Symposium, Pages: 247-262, ISSN: 1940-1434

Conference paper

Gardner P, Maffeis S, Smith G, 2011, Towards a program logic of JavaScript, Departmental Technical Report: 11/11, Publisher: Department of Computing, Imperial College London, 11/11

JavaScript has become the most widely used language for client-side web programming. The dynamic nature of JavaScript makes understanding its code notoriously difficult, leading to buggy programs and a lack of adequate static-analysis tools. We believe that logical reasoning has much to offer JavaScript: a simple description of program behaviour, a clear understanding of module boundaries, and the ability to verify security contracts. We introduce a program logic for reasoning about a broad subset of JavaScript, including challenging features such as prototype inheritance and with. We adapt ideas from separation logic to provide tractable reasoning about JavaScript code: reasoning about easy programs is easy; reasoning about hard programs is possible. We prove a strong soundness result. All libraries written in our subset and proved correct with respect to their specifications will be well-behaved, even when called by arbitrary JavaScript code.

Report

Bengtson J, Bhargavan K, Fournet C, Gordon AD, Maffeis S et al., 2011, Refinement Types for Secure Implementations, ACM Transactions on Programming Languages and Systems, Vol: 33, ISSN: 0164-0925

Citations: 61

Journal article

Maffeis S, Mitchell JC, Taly A, 2010, Object capabilities and isolation of untrusted web applications, Departmental Technical Report: 10/6, Publisher: Department of Computing, Imperial College London, 10/6

A growing number of current web sites combine active content (applications) from untrusted sources, as in so-called mashups. The object-capability model provides an appealing approach for isolating untrusted content: if separate applications are provided disjoint capabilities, a sound object-capability framework should prevent untrusted applications from interfering with each other, without preventing interaction with the user or the hosting page. In developing language-based foundations for isolation proofs based on object-capability concepts, we identify a more general notion of authority safety that also implies resource isolation. After proving that capability safety implies authority safety, we show the applicability of our framework for a specific class of mashups. In addition to proving that a JavaScript subset based on Google Caja is capability safe, we prove that a more expressive subset of JavaScript is authority safe, even though it is not based on the object-capability model.

Report

Dr Sergio Maffeis