Busse F, Gharat P, Cadar C, et al., 2022, Combining static analysis error traces with dynamic symbolic execution (experience paper), International Symposium on Software Testing and Analysis (ISSTA 2022), Publisher: ACM
This paper reports on our experience implementing a technique for sifting through static analysis reports using dynamic symbolic execution. Our insight is that if a static analysis tool produces a partial trace through the program under analysis, annotated with conditions that the analyser believes are important for the bug to trigger, then a dynamic symbolic execution tool may be able to exploit the trace by (a) guiding the search heuristically so that paths that follow the trace most closely are prioritised for exploration, and (b) pruning the search using the conditions associated with each step of the trace. This may allow the bug to be quickly confirmed using dynamic symbolic execution, if it turns out to be a true positive, yielding an input that triggers the bug. To experiment with this approach, we have implemented the idea in a tool chain that allows the popular open-source static analysis tools Clang Static Analyzer (CSA) and Infer to be combined with the popular open-source dynamic symbolic execution engine KLEE. Our findings highlight two interesting negative results. First, while fault injection experiments show the promise of our technique, they also reveal that the traces provided by static analysis tools are not that useful in guiding search. Second, we have systematically applied CSA and Infer to a large corpus of real-world applications that are suitable for analysis with KLEE, and find that the static analysers are rarely able to find non-trivial true positive bugs for this set of applications. We believe our case study can inform static analysis and dynamic symbolic execution tool developers as to where improvements may be necessary, and serve as a call to arms for researchers interested in combining symbolic execution and static analysis to identify more suitable benchmark suites for evaluation of research ideas.
Even-Mendoza K, Cadar C, Donaldson A, 2022, CsmithEdge: more effective compiler testing by handling undefined behaviour less conservatively, Empirical Software Engineering: an international journal, Vol: 27, ISSN: 1382-3256
Compiler fuzzing techniques require a means of generating programs that are free from undefined behaviour (UB) to reliably reveal miscompilation bugs. Existing program generators such as CSMITH achieve UB-freedom by heavily restricting the form of generated programs. The idiomatic nature of the resulting programs risks limiting the test coverage they can offer, and thus the compiler bugs they can discover. We investigate the idea of adapting existing fuzzers to be less restrictive concerning UB, in the practical setting of C compiler testing via a new tool, CSMITHEDGE, which extends CSMITH. CSMITHEDGE probabilistically weakens the constraints used to enforce UB-freedom, thus generated programs are no longer guaranteed to be UB-free. It then employs several off-the-shelf UB detection tools and a novel dynamic analysis to (a) detect cases where the generated program exhibits UB and (b) determine where CSMITH has been too conservative in its use of safe math wrappers that guarantee UB-freedom for arithmetic operations, removing the use of redundant ones. The resulting UB-free programs can be used to test for miscompilation bugs via differential testing. The non-UB-free programs can still be used to check that the compiler under test does not crash or hang. Our experiments on recent versions of GCC, LLVM and the Microsoft Visual Studio compiler show that CSMITHEDGE was able to discover 7 previously unknown miscompilation bugs (5 already fixed in response to our reports) that could not be found via intensive testing using CSMITH, and 2 compiler-hang bugs that were fixed independently shortly before we considered reporting them.
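The contrast between Csmith-style safe math wrappers and CsmithEdge's weakened operations can be sketched as follows. This is a toy illustration, not code from either tool: `safe_add` and `weakened_add` are hypothetical names, modelling 32-bit signed addition, where signed overflow is undefined behaviour in C.

```python
# Toy contrast: a conservative "safe math" wrapper (always UB-free)
# vs. a weakened operation paired with a dynamic UB check, for
# 32-bit signed addition. Names are illustrative, not tool APIs.

INT_MIN, INT_MAX = -2**31, 2**31 - 1

def safe_add(a, b):
    """Conservative wrapper: sidesteps signed overflow (UB in C)."""
    if (b > 0 and a > INT_MAX - b) or (b < 0 and a < INT_MIN - b):
        return a  # fallback value instead of overflowing
    return a + b

def weakened_add(a, b):
    """Weakened operation: may overflow; a dynamic check flags UB."""
    ub = (b > 0 and a > INT_MAX - b) or (b < 0 and a < INT_MIN - b)
    return a + b, ub  # UB-exhibiting programs are filtered out later

# Programs built only from safe_add are UB-free but idiomatic in
# shape; weakened_add admits more program shapes, with UB-exhibiting
# ones detected and set aside after generation.
```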
Schemmel D, Büning J, Busse F, et al., 2022, A deterministic memory allocator for dynamic symbolic execution, 36th European Conference on Object-Oriented Programming (ECOOP 2022), Publisher: Schloss Dagstuhl -- Leibniz-Zentrum fuer Informatik, ISSN: 1868-8969
Dynamic symbolic execution (DSE) has established itself as an effective testing and analysis technique. While the memory model in DSE has attracted significant attention, the memory allocator has been largely ignored, despite its significant influence on DSE. In this paper, we discuss the different ways in which the memory allocator can influence DSE and the main design principles that a memory allocator for DSE needs to follow: support for external calls, cross-run and cross-path determinism, spatially and temporally distanced allocations, and stability. We then present KDAlloc, a deterministic allocator for DSE that is guided by these six design principles. We implement KDAlloc in KLEE, a popular DSE engine, and first show that it is competitive with KLEE’s default allocator in terms of performance and memory overhead, and in fact significantly improves performance in several cases. We then highlight its benefits for use-after-free error detection and two distinct DSE-based techniques: MoKlee, a system for saving DSE runs to disk and later (partially) restoring them, and SymLive, a system for finding infinite-loop bugs.
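Two of the design principles above, determinism and spatially distanced allocations, can be illustrated with a minimal bump-allocator sketch. This is not KDAlloc's implementation or API; the class, base address and gap are all hypothetical, chosen only to show the properties.

```python
# Minimal sketch of a deterministic allocator for DSE: the same
# allocation sequence yields the same addresses across runs (and
# paths), and objects are spaced apart so out-of-bounds accesses
# are more likely to land in unmapped space. Illustrative only.

class DeterministicAllocator:
    def __init__(self, base=0x10000000, gap=4096):
        self.base, self.gap = base, gap
        self.next = base  # bump pointer; no OS/ASLR involvement

    def allocate(self, size):
        addr = self.next
        # Round up to a gap boundary, then add one extra gap so
        # consecutive objects are spatially distanced.
        self.next += ((size + self.gap - 1) // self.gap) * self.gap + self.gap
        return addr

a1 = DeterministicAllocator()
a2 = DeterministicAllocator()
addrs1 = [a1.allocate(s) for s in (16, 4096, 32)]
addrs2 = [a2.allocate(s) for s in (16, 4096, 32)]
assert addrs1 == addrs2  # cross-run determinism for identical sequences
```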
Andronidis A, Cadar C, 2022, SnapFuzz: High-throughput fuzzing of network applications, International Symposium on Software Testing and Analysis (ISSTA 2022), Publisher: ACM
In recent years, fuzz testing has benefited from increased computational power and important algorithmic advances, leading to systems that have discovered many critical bugs and vulnerabilities in production software. Despite these successes, not all applications can be fuzzed efficiently. In particular, stateful applications such as network protocol implementations are constrained by a low fuzzing throughput and the need to develop complex fuzzing harnesses that involve custom time delays and clean-up scripts. In this paper, we present SnapFuzz, a novel fuzzing framework for network applications. SnapFuzz offers a robust architecture that transforms slow asynchronous network communication into fast synchronous communication, snapshots the target at the latest point at which it is safe to do so, speeds up file operations by redirecting them to a custom in-memory filesystem, and removes the need for many fragile modifications, such as configuring time delays or writing clean-up scripts. Using SnapFuzz, we fuzzed five popular networking applications: LightFTP, TinyDTLS, Dnsmasq, LIVE555 and Dcmqrscp. We report impressive performance speedups of 62.8x, 41.2x, 30.6x, 24.6x, and 8.4x, respectively, with significantly simpler fuzzing harnesses in all cases. Due to its advantages, SnapFuzz has also found 12 extra crashes compared to AFLNet in these applications.
Binary rewriting consists in disassembling a program to modify its instructions. However, existing solutions suffer from shortcomings in terms of soundness and performance. We present SaBRe, a load-time system for selective binary rewriting. SaBRe rewrites specific constructs—particularly system calls and functions—when the program is loaded into memory, and intercepts them using plugins through a simple API. We also discuss the theoretical underpinnings of disassembling and rewriting. We developed two backends—for x86_64 and RISC-V—which were used to implement three plugins: a fast system call tracer, a multi-version executor, and a fault injector. Our evaluation shows that SaBRe imposes little overhead, typically below 3%.
Kapus T, Busse F, Cadar C, 2020, Pending constraints in symbolic execution for better exploration and seeding, IEEE/ACM International Conference on Automated Software Engineering (ASE 2020), Publisher: IEEE / ACM
Symbolic execution is a well-established technique for software testing and analysis. However, scalability continues to be a challenge, both in terms of constraint solving cost and path explosion. In this work, we present a novel approach for symbolic execution, which can enhance its scalability by aggressively prioritising execution paths that are already known to be feasible, and deferring all other paths. We evaluate our technique on nine applications, including SQLite3, make and tcpdump, and show it can achieve higher coverage for both seeded and non-seeded exploration.
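The core idea of prioritising known-feasible paths can be sketched in a few lines. This toy model is not KLEE code: the cached inputs, constraint representation, and `classify` function are all illustrative, standing in for seeds and previously computed solver models.

```python
# Toy model of "pending constraints": a state whose path constraints
# are already satisfied by some known input is feasible for free and
# explored eagerly; otherwise it is deferred ("pending") instead of
# paying an up-front solver call. Illustrative, not KLEE code.

known_inputs = [{"x": 5}, {"x": -3}]  # e.g. seeds or cached solver models

def classify(path_constraints):
    for model in known_inputs:
        if all(c(model) for c in path_constraints):
            return "explore"   # feasibility proven by a cached input
    return "pending"           # defer; solve only if exploration stalls

state_a = [lambda m: m["x"] > 0]                         # x=5 satisfies it
state_b = [lambda m: m["x"] > 0, lambda m: m["x"] > 100] # no cached witness
```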
2020, Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE '20: 28th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Publisher: ACM
Trabish D, Kapus T, Rinetzky N, et al., 2020, Past-sensitive pointer analysis for symbolic execution, European Software Engineering Conference / ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE 2020), Publisher: ACM, Pages: 19-208
We propose a novel fine-grained integration of pointer analysis with dynamic analysis, including dynamic symbolic execution. This is achieved via past-sensitive pointer analysis, an on-demand pointer analysis instantiated with an abstraction of the dynamic state on which it is invoked. We evaluate our technique in three application scenarios: chopped symbolic execution, symbolic pointer resolution, and write integrity testing. Our preliminary results show that the approach can have a significant impact in these scenarios, by effectively improving the precision of standard pointer analysis with only a modest performance overhead.
Boehme M, Cadar C, Roychoudhury A, 2020, Fuzzing: challenges and reflections, IEEE Software, Vol: 38, Pages: 79-86, ISSN: 0740-7459
Fuzzing is a method to discover software bugs and vulnerabilities by automatic test input generation which has attracted tremendous recent interest in both academia and industry. Fuzzing comes in the form of several techniques. On one hand, we have symbolic execution, which enables a particularly effective approach to fuzzing by systematically enumerating the paths of a program. On the other hand, we have random input generation, which generates large amounts of inputs per second with no or minimal program analysis overhead. In this article, we summarize the open challenges and opportunities for fuzzing and symbolic execution as they emerged in discussions among researchers and practitioners at a Shonan Meeting, and were validated in a subsequent survey. We take a forward-looking view of software vulnerability discovery technologies and provide concrete directions for future research.
Busse F, Nowack M, Cadar C, 2020, Running symbolic execution forever, 29th ACM SIGSOFT International Symposium on Software Testing and Analysis, Publisher: ACM, Pages: 63-74
When symbolic execution is used to analyse real-world applications, it often consumes all available memory in a relatively short amount of time, sometimes making it impossible to analyse an application for an extended period. In this paper, we present a technique that can record an ongoing symbolic execution analysis to disk and selectively restore paths of interest later, making it possible to run symbolic execution indefinitely. To be successful, our approach addresses several essential research challenges related to detecting divergences on re-execution, storing long-running executions efficiently, changing search heuristics during re-execution, and providing a global view of the stored execution. Our extensive evaluation of 93 Linux applications shows that our approach is practical, enabling these applications to run for days while continuing to explore new execution paths.
Busse F, Nowack M, Cadar C, 2020, Artefact for the ISSTA 2020 paper: Running symbolic execution forever
When symbolic execution is used to analyse real-world applications, it often consumes all available memory in a relatively short amount of time, sometimes making it impossible to analyse an application for an extended period. In this paper, we present a technique that can record an ongoing symbolic execution analysis to disk and selectively restore paths of interest later, making it possible to run symbolic execution indefinitely. To be successful, our approach addresses several essential research challenges related to detecting divergences on re-execution, storing long-running executions efficiently, changing search heuristics during re-execution, and providing a global view of the stored execution. Our extensive evaluation of 93 Linux applications shows that our approach is practical, enabling these applications to run for days while continuing to explore new execution paths.
Cadar C, Nowack M, 2020, KLEE symbolic execution engine in 2019, International Journal on Software Tools for Technology Transfer, Vol: 23, Pages: 867-870, ISSN: 0945-8115
KLEE is a popular dynamic symbolic execution engine, initially designed at Stanford University and now primarily developed and maintained by the Software Reliability Group at Imperial College London. KLEE has a large community spanning both academia and industry, with over 60 contributors on GitHub, over 350 subscribers on its mailing list, and over 80 participants to a recent dedicated workshop. KLEE has been used and extended by groups from many universities and companies in a variety of different areas such as high-coverage test generation, automated debugging, exploit generation, wireless sensor networks, and online gaming, among many others.
Kapus T, Nowack M, Cadar C, 2019, Constraints in dynamic symbolic execution: bitvectors or integers?, International Conference on Tests and Proofs, Publisher: Springer International Publishing, Pages: 41-54, ISSN: 0302-9743
Dynamic symbolic execution is a technique that analyses programs by gathering mathematical constraints along execution paths. To achieve bit-level precision, one must use the theory of bitvectors. However, other theories might achieve higher performance, justifying in some cases the possible loss of precision. In this paper, we explore the impact of using the theory of integers on the precision and performance of dynamic symbolic execution of C programs. In particular, we compare an implementation of the symbolic executor KLEE using a partial solver based on the theory of integers, with a standard implementation of KLEE using a solver based on the theory of bitvectors, both employing the popular SMT solver Z3. To our surprise, our evaluation on a synthetic sort benchmark, the ECA set of Test-Comp 2019 benchmarks, and GNU Coreutils revealed that for most applications the integer solver did not lead to any loss of precision, but the overall performance difference was rarely significant.
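Where the two theories can disagree is easy to show concretely: over mathematical integers, x + 1 > x is valid, but over fixed-width bitvectors addition wraps, so the query is invalid. The sketch below models 8-bit unsigned semantics in plain Python (it is an illustration of the theory gap, not code from the paper).

```python
# Why integer reasoning can lose bit-level precision: x + 1 > x holds
# for all mathematical integers, but for 8-bit unsigned bitvectors
# addition wraps around, so x = 255 is a counterexample. A partial
# integer-theory solver must fall back to bitvectors on such queries.

def bv_add8(a, b):
    return (a + b) & 0xFF  # 8-bit wrap-around semantics

assert all(x + 1 > x for x in range(256))  # integer theory: valid
assert bv_add8(255, 1) == 0                # bitvector theory: wraps...
assert not (bv_add8(255, 1) > 255)         # ...so the query is invalid
```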
Marcozzi M, Tang Q, Cadar C, et al., 2019, Compiler Fuzzing: How Much Does It Matter?
Despite much recent interest in randomised testing (fuzzing) of compilers, the practical impact of fuzzer-found compiler bugs on real-world applications has barely been assessed. We present the first quantitative and qualitative study of the tangible impact of miscompilation bugs in a mature compiler. We follow a rigorous methodology where the bug impact over the compiled application is evaluated based on (1) whether the bug appears to trigger during compilation; (2) the extent to which generated assembly code changes syntactically due to triggering of the bug; and (3) whether such changes cause regression test suite failures, or whether we can manually find application inputs that trigger execution divergence due to such changes. The study is conducted with respect to the compilation of more than 10 million lines of C/C++ code from 309 Debian packages, using 12% of the historical and now fixed miscompilation bugs found by four state-of-the-art fuzzers in the Clang/LLVM compiler, as well as 18 bugs found by human users compiling real code or as a by-product of formal verification efforts. The results show that almost half of the fuzzer-found bugs propagate to the generated binaries for at least one package, in which case only a very small part of the binary is typically affected, yet causing two failures when running the test suites of all the impacted packages. User-reported and formal verification bugs do not exhibit a higher impact, with a lower rate of triggered bugs and one test failure. The manual analysis of a selection of the syntactic changes caused by some of our bugs (fuzzer-found and non fuzzer-found) in package assembly code, shows that either these changes have no semantic impact or that they would require very specific runtime circumstances to trigger execution divergence.
Marcozzi M, Tang Q, Cadar C, et al., 2019, Compiler Fuzzing: How Much Does It Matter?
Despite much recent interest in compiler randomized testing (fuzzing), the practical impact of fuzzer-found compiler bugs on real-world applications has barely been assessed. We present the first quantitative and qualitative study of the tangible impact of miscompilation bugs in a mature compiler. We follow a rigorous methodology where the bug impact over the compiled application is evaluated based on (1) whether the bug appears to trigger during compilation; (2) the extent to which generated assembly code changes syntactically due to triggering of the bug; and (3) the extent to which such changes cause regression test suite failures or can be used to manually trigger divergences during execution. The study is conducted with respect to the compilation of more than 10 million lines of C/C++ code from 309 Debian packages, using 12% of the historical and now fixed miscompilation bugs found by four state-of-the-art fuzzers in the Clang/LLVM compiler, as well as 18 bugs found by human users compiling real code or by formal verification. The results show that almost half of the fuzzer-found bugs propagate to the generated binaries for some packages, but rarely affect their syntax and cause two failures in total when running their test suites. User-reported and formal verification bugs do not exhibit a higher impact, with less frequently triggered bugs and one test failure. Our manual analysis of a selection of bugs, either fuzzer-found or not, suggests that none can easily trigger a runtime divergence on the packages considered in the analysis, and that in general they affect only corner cases.
Liew D, Cadar C, Donaldson A, et al., 2019, Just fuzz it: solving floating-point constraints using coverage-guided fuzzing, ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE ’19), Publisher: ACM, Pages: 521-532
We investigate the use of coverage-guided fuzzing as a means of proving satisfiability of SMT formulas over finite variable domains, with specific application to floating-point constraints. We show how an SMT formula can be encoded as a program containing a location that is reachable if and only if the program’s input corresponds to a satisfying assignment to the formula. A coverage-guided fuzzer can then be used to search for an input that reaches the location, yielding a satisfying assignment. We have implemented this idea in a tool, Just Fuzz-it Solver (JFS), and we present a large experimental evaluation showing that JFS is both competitive with and complementary to state-of-the-art SMT solvers with respect to solving floating-point constraints, and that the coverage-guided approach of JFS provides significant benefit over naive fuzzing in the floating-point domain. Applied in a portfolio manner, the JFS approach thus has the potential to complement traditional SMT solvers for program analysis tasks that involve reasoning about floating-point constraints.
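The encoding idea can be sketched directly: a formula becomes a program with a target location reachable exactly when the input is a satisfying assignment. The sketch below uses a hypothetical formula and a naive random search rather than a coverage-guided fuzzer (the paper's point is precisely that coverage guidance beats this naive baseline), so it illustrates only the encoding.

```python
import random

# Sketch of the SMT-formula-to-program encoding: the "target location"
# is reached iff the input satisfies the formula, so any input that
# reaches it is a satisfying assignment. Formula and fuzzer are toy
# stand-ins, not JFS internals.

def encoded_formula(x):
    # Formula: x > 1.0 /\ x * x < 2.5  (satisfiable, e.g. x = 1.2)
    if x > 1.0:
        if x * x < 2.5:
            return True   # target location reached => SAT witness
    return False

def naive_fuzz(seed=0, trials=100_000):
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.uniform(-10.0, 10.0)
        if encoded_formula(x):
            return x      # satisfying assignment found
    return None           # inconclusive: fuzzing is incomplete

witness = naive_fuzz()
assert witness is not None and witness > 1.0 and witness * witness < 2.5
```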
Kapus T, Cadar C, 2019, A segmented memory model for symbolic execution, ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE ’19), Publisher: ACM, Pages: 774-784
Symbolic execution is an effective technique for exploring paths in a program and reasoning about all possible values on those paths. However, the technique still struggles with code that uses complex heap data structures, in which a pointer is allowed to refer to more than one memory object. In such cases, symbolic execution typically forks execution into multiple states, one for each object to which the pointer could refer. In this paper, we propose a technique that avoids this expensive forking by using a segmented memory model. In this model, memory is split into segments, so that each symbolic pointer refers to objects in a single segment. The size of each segment is bounded by a threshold, in order to avoid expensive constraints. This results in a memory model where forking due to symbolic pointer dereferences is significantly reduced, often completely. We evaluate our segmented memory model on a mix of whole-program benchmarks (such as m4 and make) and library benchmarks (such as SQLite), and observe significant decreases in execution time and memory usage.
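The state-count difference between the two strategies can be sketched minimally. This toy contrast is not the paper's implementation: the object names and state dictionaries are illustrative, showing only that a multi-target pointer dereference produces one state per candidate object under forking, versus a single state under a segmented model.

```python
# Toy contrast: when a symbolic pointer may refer to several objects,
# vanilla symbolic execution forks one state per candidate object,
# while a segmented model co-locates the candidates in one segment so
# a single state covers the dereference. Illustrative only.

objects = ["obj_a", "obj_b", "obj_c", "obj_d"]

def states_forking(candidates):
    # One forked state per object the pointer may refer to.
    return [{"ptr_target": obj} for obj in candidates]

def states_segmented(candidates):
    # All candidates placed in one segment; a single state, with the
    # pointer constrained to an offset range within that segment.
    return [{"segment": list(candidates)}]

assert len(states_forking(objects)) == 4
assert len(states_segmented(objects)) == 1
```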
Kapus T, Ish-Shalom O, Itzhaky S, et al., 2019, Computing summaries of string loops in C for better testing and refactoring, ACM SIGPLAN Conference on Programming Language Design and Implementation, Publisher: ACM, Pages: 874-888
Analysing and comprehending C programs that use strings is hard: using standard library functions for manipulating strings is not enforced and programs often use complex loops for the same purpose. We introduce the notion of memoryless loops that capture some of these string loops and present a counterexample-guided inductive synthesis approach to summarise memoryless string loops using C standard library functions, which has applications to testing, optimization and refactoring. We prove our summarization is correct for arbitrary input strings and evaluate it on a database of loops we gathered from a set of 13 open-source programs. Our approach can summarize over two thirds of memoryless loops in less than 5 minutes of computation time per loop. We then show that these summaries can be used to (1) enhance symbolic execution testing, where we observed median speedups of 79x when employing a string constraint solver, (2) optimize native code, where certain summarizations led to significant performance gains, and (3) refactor code, where we had several patches accepted in the codebases of popular applications such as patch and wget.
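A minimal example of the kind of equivalence such summaries capture: a memoryless scanning loop that carries no state between iterations is equivalent to a single library call. The sketch below is a Python analogue of the C setting (a `for (i = 0; s[i]; i++);` loop summarised as `strlen`); it illustrates the idea, not the synthesis procedure.

```python
# A "memoryless" string loop and its library summary, as a Python
# analogue of the C case: the loop scans to the NUL terminator
# without carrying state across iterations, so it is equivalent to a
# strlen call. Illustrative, not the paper's synthesis algorithm.

def loop_version(buf):
    # models: for (i = 0; s[i]; i++);
    i = 0
    while buf[i] != "\0":
        i += 1
    return i

def summary_version(buf):
    # models the synthesised summary: i = strlen(s);
    return buf.index("\0")

s = list("hello") + ["\0"]
assert loop_version(s) == summary_version(s) == 5
```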
Pina L, Andronidis A, Hicks M, et al., 2019, Mvedsua: Higher availability dynamic software updates via multi-version execution, International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '19), Publisher: Association for Computing Machinery, Pages: 573-585
Dynamic Software Updating (DSU) is a technique for patching stateful software without shutting it down, which enables both timely updates and non-stop service. Unfortunately, bugs in the update itself—whether in the changed code or in the way the change is introduced dynamically—may cause the updated software to crash or misbehave. Furthermore, the time taken to dynamically apply the update may be unacceptable if it introduces a long delay in service. This paper makes the key observation that both problems can be addressed by employing Multi-Version Execution (MVE). To avoid delay in service, the update is applied to a forked copy while the original system continues to operate. Once the update completes, the MVE system monitors that the responses of both versions agree for the same inputs. Expected divergences are specified by the programmer using an MVE-specific DSL. Unexpected divergences signal possible errors and roll back the update, which simply means terminating the updated version and reverting to the original version. This is safe because the MVE system keeps the state of both versions in sync. If the new version shows no problems after a warmup period, operators can make it permanent and discard the original version. We have implemented this approach, which we call Mvedsua, by extending the Kitsune DSU framework with Varan, a state-of-the-art MVE system. We have used Mvedsua to update several high-performance servers: Redis, Memcached, and Vsftpd. Our results show that Mvedsua significantly reduces the update-time delay, imposes little overhead in steady state, and easily recovers from a variety of update-related errors.
Marcozzi M, Tang Q, Donaldson A, et al., 2019, A systematic impact study for fuzzer-found compiler bugs
Despite much recent interest in compiler fuzzing, the practical impact of fuzzer-found miscompilations on real-world applications has barely been assessed. We present the first quantitative and qualitative study of the tangible impact of fuzzer-found compiler bugs. We follow a novel methodology where the impact of a miscompilation bug is evaluated based on (1) whether the bug appears to trigger during compilation; (2) the extent to which generated assembly code changes syntactically due to triggering of the bug; and (3) how likely such changes are to cause runtime divergences during execution. The study is conducted with respect to the compilation of more than 10 million lines of C/C++ code from 309 Debian packages, using 12% of the historical and now fixed miscompilation bugs found by four state-of-the-art fuzzers in the Clang/LLVM compiler, as well as 18 other bugs found by the Alive formal verification tool or human users. The results show that almost half of the fuzzer-found bugs propagate to the generated binaries for some applications, but barely affect their syntax and only cause two failures in total when running their regression test suites. Our manual analysis of a selection of bugs suggests that these bugs cannot trigger on the packages considered in the analysis, and that in general they affect only corner cases which have a low probability of occurring in practice. User-reported and Alive bugs do not exhibit a higher impact, with less frequently triggered bugs and one test failure.
Kuchta T, Lutellier T, Wong E, et al., 2018, On the correctness of electronic documents: studying, finding, and localizing inconsistency bugs in PDF readers and files, Empirical Software Engineering, Vol: 23, Pages: 3187-3220, ISSN: 1382-3256
Electronic documents are widely used to store and share information such as bank statements, contracts, articles, maps and tax information. Many different applications exist for displaying a given electronic document, and users rightfully assume that documents will be rendered similarly independently of the application used. However, this is not always the case, and these inconsistencies, regardless of their causes—bugs in the application or the file itself—can become critical sources of miscommunication. In this paper, we present a study on the correctness of PDF documents and readers. We start by manually investigating a large number of real-world PDF documents to understand the frequency and characteristics of cross-reader inconsistencies, and find that such inconsistencies are common—13.5% of PDF files are inconsistently rendered by at least one popular reader. We then propose an approach to detect and localize the source of such inconsistencies automatically. We evaluate our automatic approach on a large corpus of over 230K documents using 11 popular readers, and our experiments have detected 30 unique bugs in these readers and files. We also reported 33 bugs, some of which have already been confirmed or fixed by developers.
Kuchta T, Palikareva H, Cadar C, 2018, Shadow symbolic execution for testing software patches, ACM Transactions on Software Engineering and Methodology, Vol: 27, ISSN: 1049-331X
While developers are aware of the importance of comprehensively testing patches, the large effort involved in coming up with relevant test cases means that such testing rarely happens in practice. Furthermore, even when test cases are written to cover the patch, they often exercise the same behaviour in the old and the new version of the code. In this article, we present a symbolic execution-based technique that is designed to generate test inputs that cover the new program behaviours introduced by a patch. The technique works by executing both the old and the new version in the same symbolic execution instance, with the old version shadowing the new one. During this combined shadow execution, whenever a branch point is reached where the old and the new version diverge, we generate a test case exercising the divergence and comprehensively test the new behaviours of the new version. We evaluate our technique on the Coreutils patches from the CoREBench suite of regression bugs, and show that it is able to generate test inputs that exercise newly added behaviours and expose some of the regression bugs.
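The divergence idea above can be sketched with a toy patch. This is hypothetical code, not the tool's implementation: the two predicate versions and the brute-force input enumeration stand in for the old/new program versions and the symbolic solving of divergence conditions.

```python
# Toy sketch of shadow testing: run the old and new version of a
# patched branch condition on the same input; any input on which
# their outcomes differ exercises behaviour the patch introduced,
# and becomes a divergence-exposing test case. Illustrative only.

def old_version(x):
    return x > 10            # old branch condition

def new_version(x):
    return x >= 10           # patched condition: boundary changed

def divergence_tests(inputs):
    return [x for x in inputs if old_version(x) != new_version(x)]

# Shadow symbolic execution would solve for the divergence condition
# (x > 10) != (x >= 10); here we simply enumerate a small input range.
assert divergence_tests(range(0, 20)) == [10]
```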
Soria Dustmann O, Wehrle K, Cadar C, 2018, PARTI: A Multi-interval Theory Solver for Symbolic Execution, IEEE/ACM International Conference on Automated Software Engineering (ASE 2018), Publisher: ACM, Pages: 430-440
Symbolic execution is an effective program analysis technique whose scalability largely depends on the ability to quickly solve large numbers of first-order logic queries. We propose an effective general technique for speeding up the solving of queries in the theory of arrays and bit-vectors with a specific structure, while otherwise falling back to a complete solver. The technique has two stages: a learning stage that determines the solution sets of each symbolic variable, and a decision stage that uses this information to quickly determine the satisfiability of certain types of queries. The main challenges involve deciding which operators to support and precisely dealing with integer type casts and arithmetic underflow and overflow. We implemented this technique in an incomplete solver called PARTI ("PARtial Theory solver for Intervals"), directly integrating it into the popular KLEE symbolic execution engine. We applied KLEE with PARTI and a state-of-the-art SMT solver to synthetic and real-world benchmarks. We found that PARTI practically does not hurt performance while many times achieving order-of-magnitude speedups.
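The two-stage structure can be sketched with a single-variable interval domain. This is illustrative, not PARTI's code: the constraint encoding and function names are assumptions, and real multi-interval sets, type casts, and overflow handling are elided.

```python
# Sketch of the two-stage idea: a learning stage narrows a variable to
# an interval from simple constraints; a decision stage answers some
# queries cheaply from the interval alone, falling back to a complete
# solver otherwise. Single intervals only; illustrative, not PARTI.

def learn(constraints):
    lo, hi = float("-inf"), float("inf")
    for op, c in constraints:           # only simple x <op> c constraints
        if op == ">=":
            lo = max(lo, c)
        elif op == "<=":
            hi = min(hi, c)
    return lo, hi

def decide_eq(interval, value):
    lo, hi = interval
    if not (lo <= value <= hi):
        return "unsat"                   # decided cheaply from the interval
    return "unknown"                     # fall back to the complete solver

iv = learn([(">=", 10), ("<=", 20)])
assert iv == (10, 20)
assert decide_eq(iv, 5) == "unsat"       # fast path: no solver call
assert decide_eq(iv, 15) == "unknown"    # needs the full solver
```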
Pina L, Andronidis A, Cadar C, 2018, FreeDA: deploying incompatible stock dynamic analyses in production via multi-version execution, ACM International Conference on Computing Frontiers (CF 2018), Publisher: ACM, Pages: 1-10
Dynamic analyses such as those implemented by compiler sanitizers and Valgrind are effective at finding and diagnosing challenging bugs and security vulnerabilities. However, most analyses cannot be combined on the same program execution, and they incur a high overhead, which typically prevents them from being used in production. This paper addresses the ambitious goal of running concurrently multiple incompatible stock dynamic analysis tools in production, without requiring any modifications to the tools themselves or adding significant runtime overhead to the deployed system. This is accomplished using multi-version execution, in which the dynamic analyses are run concurrently with the native version, all on the same program execution. We implement our approach in a system called FreeDA and show that it is applicable to several common scenarios, involving network servers and interactive applications. In particular, we show how incompatible stock dynamic analyses implemented by Clang's sanitizers and Valgrind can be used to check high-performance servers such as Memcached, Nginx and Redis, and interactive applications such as Git, HTop and OpenSSH.
Symbolic execution is a powerful program analysis technique that systematically explores multiple program paths. However, despite important technical advances, symbolic execution often struggles to reach deep parts of the code due to the well-known path explosion problem and constraint solving limitations. In this paper, we propose chopped symbolic execution, a novel form of symbolic execution that allows users to specify uninteresting parts of the code to exclude during the analysis, thus only targeting the exploration to paths of importance. However, the excluded parts are not summarily ignored, as this may lead to both false positives and false negatives. Instead, they are executed lazily, when their effect may be observable by code under analysis. Chopped symbolic execution leverages various on-demand static analyses at runtime to automatically exclude code fragments while resolving their side effects, thus avoiding expensive manual annotations and imprecision. Our preliminary results show that the approach can effectively improve the effectiveness of symbolic execution in several different scenarios, including failure reproduction and test suite augmentation.
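The lazy execution of excluded code that this abstract describes can be caricatured in a few lines (the `LazyChopper` class and its interface are invented for illustration; the real system tracks side effects with on-demand static analyses inside a symbolic execution engine):

```python
# Toy sketch of lazily executing a "chopped" (user-excluded) function:
# the call is recorded rather than run, and replayed only when code
# under analysis actually reads state the function may have written.

class LazyChopper:
    def __init__(self):
        self.env = {}
        self.pending = []            # deferred (func, write_set) pairs

    def skip_call(self, func, writes):
        """Record a call instead of executing it."""
        self.pending.append((func, writes))

    def read(self, var):
        """Replay any deferred call whose side effects include `var`."""
        still = []
        for func, writes in self.pending:
            if var in writes:
                func(self.env)       # resolve the side effect on demand
            else:
                still.append((func, writes))
        self.pending = still
        return self.env.get(var)

def expensive_init(env):
    env['config'] = 42               # side effect we want to defer

vm = LazyChopper()
vm.skip_call(expensive_init, writes={'config'})
# ... exploration continues without ever running expensive_init ...
print(vm.read('config'))  # -> 42, computed only at this point
```

If the exploration never reads `config`, the deferred call is never executed, which is the source of the savings.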
Liew D, Schemmel D, Cadar C, et al., 2017, Floating-Point Symbolic Execution: A Case Study in N-version Programming, IEEE/ACM International Conference on Automated Software Engineering (ASE 2017), Publisher: IEEE
Kapus T, Cadar C, 2017, Automatic Testing of Symbolic Execution Enginesvia Program Generation and Differential Testing, IEEE/ACM International Conference on Automated Software Engineering (ASE 2017), Publisher: IEEE
Perry DM, Mattavelli A, Zhang X, et al., 2017, Accelerating array constraints in symbolic execution, International Symposium on Software Testing and Analysis (ISSTA), Publisher: ACM
Despite significant recent advances, the effectiveness of symbolic execution is limited when used to test complex, real-world software. One of the main scalability challenges is related to constraint solving: large applications and long exploration paths lead to complex constraints, often involving big arrays indexed by symbolic expressions. In this paper, we propose a set of semantics-preserving transformations for array operations that take advantage of contextual information collected during symbolic execution. Our transformations lead to simpler encodings and hence better performance in constraint solving. The results we obtain are encouraging: we show, through an extensive experimental analysis, that our transformations help to significantly improve the performance of symbolic execution in the presence of arrays. We also show that our transformations enable the analysis of new code, which would be otherwise out of reach for symbolic execution.
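One rewrite in the spirit of what this abstract describes (the concrete rule below is illustrative, not taken from the paper) is to use contextual information bounding a symbolic index `i` to a small set so that a select constraint `a[i] == c` can be flattened into an index-by-index disjunction, eliminating the array theory from the query:

```python
# Sketch of a semantics-preserving array transformation: when context
# restricts index i to a known small set, rewrite `a[i] == v` as a
# disjunction over concrete indices. The output is a plain string
# standing in for the transformed constraint.

def flatten_select(array_name, index_var, value, index_set):
    """Rewrite `a[i] == v` with i in index_set as a disjunction."""
    cases = [f"({index_var} == {k} and {array_name}[{k}] == {value})"
             for k in sorted(index_set)]
    return " or ".join(cases)

print(flatten_select("a", "i", "c", {5, 2}))
# -> (i == 2 and a[2] == c) or (i == 5 and a[5] == c)
```

The transformed constraint mentions only concrete array cells, which typical SMT solvers handle far more cheaply than symbolically indexed selects.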
Ganchinho de Pina L, Grumberg D, Andronidis A, et al., 2017, A DSL approach to reconcile equivalent divergent program executions, USENIX Annual Technical Conference 2017, Publisher: USENIX, Pages: 417-429
Multi-Version Execution (MVE) deploys multiple versions of the same program, typically synchronizing their execution at the level of system calls. By default, MVE requires all deployed versions to issue the same sequence of system calls, which limits the types of versions which can be deployed. In this paper, we propose a Domain-Specific Language (DSL) to reconcile expected divergences between different program versions deployed through MVE. We evaluate the DSL by adding it to an existing MVE system (Varan) and testing it via three scenarios: (1) deploying the same program under different configurations, (2) deploying different releases of the same program, and (3) deploying dynamic analyses in parallel with the native execution. We also present an algorithm to automatically extract DSL rules from pairs of system call traces. Our results show that each scenario requires a small number of simple rules (at most 14 rules in each case) and that writing DSL rules can be partially automated.
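A toy version of the reconciliation this abstract describes (the rule syntax here is invented for illustration; the paper's DSL is richer) maps a system call name used by one version to its equivalent in the other, so the MVE monitor treats the two traces as matching:

```python
# Sketch of reconciling expected syscall-level divergences between two
# program versions with simple rewrite rules. A rule maps a syscall in
# the follower version to its leader-side equivalent.

RULES = {"openat": "open"}   # e.g. a newer release switched to openat

def normalise(trace, rules):
    """Rewrite each syscall name through the rule table."""
    return [rules.get(call, call) for call in trace]

leader   = ["open", "read", "write", "close"]
follower = ["openat", "read", "write", "close"]

# After normalisation the traces agree, so the monitor does not
# flag a divergence between the two versions.
print(normalise(follower, RULES) == normalise(leader, RULES))  # -> True
```

The paper's extraction algorithm would, in this setting, infer the `openat -> open` rule automatically from a pair of recorded traces.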
This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.