Busse F, Gharat P, Cadar C, et al., 2022, Combining static analysis error traces with dynamic symbolic execution (experience paper), International Symposium on Software Testing and Analysis (ISSTA 2022), Publisher: ACM
This paper reports on our experience implementing a technique for sifting through static analysis reports using dynamic symbolic execution. Our insight is that if a static analysis tool produces a partial trace through the program under analysis, annotated with conditions that the analyser believes are important for the bug to trigger, then a dynamic symbolic execution tool may be able to exploit the trace by (a) guiding the search heuristically so that paths that follow the trace most closely are prioritised for exploration, and (b) pruning the search using the conditions associated with each step of the trace. This may allow the bug to be quickly confirmed using dynamic symbolic execution, if it turns out to be a true positive, yielding an input that triggers the bug. To experiment with this approach, we have implemented the idea in a tool chain that allows the popular open-source static analysis tools Clang Static Analyzer (CSA) and Infer to be combined with the popular open-source dynamic symbolic execution engine KLEE. Our findings highlight two interesting negative results. First, while fault injection experiments show the promise of our technique, they also reveal that the traces provided by static analysis tools are not that useful in guiding search. Second, we have systematically applied CSA and Infer to a large corpus of real-world applications that are suitable for analysis with KLEE, and find that the static analysers are rarely able to find non-trivial true positive bugs for this set of applications. We believe our case study can inform static analysis and dynamic symbolic execution tool developers as to where improvements may be necessary, and serve as a call to arms for researchers interested in combining symbolic execution and static analysis to identify more suitable benchmark suites for evaluation of research ideas.
Even-Mendoza K, Cadar C, Donaldson A, 2022, CsmithEdge: more effective compiler testing by handling undefined behaviour less conservatively, Empirical Software Engineering: an international journal, Vol: 27, ISSN: 1382-3256
Compiler fuzzing techniques require a means of generating programs that are free from undefined behaviour (UB) to reliably reveal miscompilation bugs. Existing program generators such as CSMITH achieve UB-freedom by heavily restricting the form of generated programs. The idiomatic nature of the resulting programs risks limiting the test coverage they can offer, and thus the compiler bugs they can discover. We investigate the idea of adapting existing fuzzers to be less restrictive concerning UB, in the practical setting of C compiler testing via a new tool, CSMITHEDGE, which extends CSMITH. CSMITHEDGE probabilistically weakens the constraints used to enforce UB-freedom, thus generated programs are no longer guaranteed to be UB-free. It then employs several off-the-shelf UB detection tools and a novel dynamic analysis to (a) detect cases where the generated program exhibits UB and (b) determine where CSMITH has been too conservative in its use of safe math wrappers that guarantee UB-freedom for arithmetic operations, removing the use of redundant ones. The resulting UB-free programs can be used to test for miscompilation bugs via differential testing. The non-UB-free programs can still be used to check that the compiler under test does not crash or hang. Our experiments on recent versions of GCC, LLVM and the Microsoft Visual Studio compiler show that CSMITHEDGE was able to discover 7 previously unknown miscompilation bugs (5 already fixed in response to our reports) that could not be found via intensive testing using CSMITH, and 2 compiler-hang bugs that were fixed independently shortly before we considered reporting them.
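The contrast between Csmith-style safe math wrappers and CsmithEdge's weakened operations can be sketched as follows. This is a toy illustration, not code from either tool: `safe_add` and `weakened_add` are hypothetical names, modelling 32-bit signed addition, where signed overflow is undefined behaviour in C.

```python
# Toy contrast: a conservative "safe math" wrapper (always UB-free)
# vs. a weakened operation paired with a dynamic UB check, for
# 32-bit signed addition. Names are illustrative, not tool APIs.

INT_MIN, INT_MAX = -2**31, 2**31 - 1

def safe_add(a, b):
    """Conservative wrapper: sidesteps signed overflow (UB in C)."""
    if (b > 0 and a > INT_MAX - b) or (b < 0 and a < INT_MIN - b):
        return a  # fallback value instead of overflowing
    return a + b

def weakened_add(a, b):
    """Weakened operation: may overflow; a dynamic check flags UB."""
    ub = (b > 0 and a > INT_MAX - b) or (b < 0 and a < INT_MIN - b)
    return a + b, ub  # UB-exhibiting programs are filtered out later

# Programs built only from safe_add are UB-free but idiomatic in
# shape; weakened_add admits more program shapes, with UB-exhibiting
# ones detected and set aside after generation.
```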
Schemmel D, Büning J, Busse F, et al., 2022, A deterministic memory allocator for dynamic symbolic execution, 36th European Conference on Object-Oriented Programming (ECOOP 2022), Publisher: Schloss Dagstuhl -- Leibniz-Zentrum fuer Informatik, ISSN: 1868-8969
Dynamic symbolic execution (DSE) has established itself as an effective testing and analysis technique. While the memory model in DSE has attracted significant attention, the memory allocator has been largely ignored, despite its significant influence on DSE. In this paper, we discuss the different ways in which the memory allocator can influence DSE and the main design principles that a memory allocator for DSE needs to follow: support for external calls, cross-run and cross-path determinism, spatially and temporally distanced allocations, and stability. We then present KDAlloc, a deterministic allocator for DSE that is guided by these six design principles. We implement KDAlloc in KLEE, a popular DSE engine, and first show that it is competitive with KLEE’s default allocator in terms of performance and memory overhead, and in fact significantly improves performance in several cases. We then highlight its benefits for use-after-free error detection and two distinct DSE-based techniques: MoKlee, a system for saving DSE runs to disk and later (partially) restoring them, and SymLive, a system for finding infinite-loop bugs.
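Two of the design principles above, determinism and spatially distanced allocations, can be illustrated with a minimal bump-allocator sketch. This is not KDAlloc's implementation or API; the class, base address and gap are all hypothetical, chosen only to show the properties.

```python
# Minimal sketch of a deterministic allocator for DSE: the same
# allocation sequence yields the same addresses across runs (and
# paths), and objects are spaced apart so out-of-bounds accesses
# are more likely to land in unmapped space. Illustrative only.

class DeterministicAllocator:
    def __init__(self, base=0x10000000, gap=4096):
        self.base, self.gap = base, gap
        self.next = base  # bump pointer; no OS/ASLR involvement

    def allocate(self, size):
        addr = self.next
        # Round up to a gap boundary, then add one extra gap so
        # consecutive objects are spatially distanced.
        self.next += ((size + self.gap - 1) // self.gap) * self.gap + self.gap
        return addr

a1 = DeterministicAllocator()
a2 = DeterministicAllocator()
addrs1 = [a1.allocate(s) for s in (16, 4096, 32)]
addrs2 = [a2.allocate(s) for s in (16, 4096, 32)]
assert addrs1 == addrs2  # cross-run determinism for identical sequences
```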
Andronidis A, Cadar C, 2022, SnapFuzz: High-throughput fuzzing of network applications, International Symposium on Software Testing and Analysis (ISSTA 2022), Publisher: ACM
In recent years, fuzz testing has benefited from increased computational power and important algorithmic advances, leading to systems that have discovered many critical bugs and vulnerabilities in production software. Despite these successes, not all applications can be fuzzed efficiently. In particular, stateful applications such as network protocol implementations are constrained by a low fuzzing throughput and the need to develop complex fuzzing harnesses that involve custom time delays and clean-up scripts. In this paper, we present SnapFuzz, a novel fuzzing framework for network applications. SnapFuzz offers a robust architecture that transforms slow asynchronous network communication into fast synchronous communication, snapshots the target at the latest point at which it is safe to do so, speeds up file operations by redirecting them to a custom in-memory filesystem, and removes the need for many fragile modifications, such as configuring time delays or writing clean-up scripts. Using SnapFuzz, we fuzzed five popular networking applications: LightFTP, TinyDTLS, Dnsmasq, LIVE555 and Dcmqrscp. We report impressive performance speedups of 62.8x, 41.2x, 30.6x, 24.6x, and 8.4x, respectively, with significantly simpler fuzzing harnesses in all cases. Due to its advantages, SnapFuzz has also found 12 extra crashes compared to AFLNet in these applications.
Binary rewriting consists in disassembling a program to modify its instructions. However, existing solutions suffer from shortcomings in terms of soundness and performance. We present SaBRe, a load-time system for selective binary rewriting. SaBRe rewrites specific constructs—particularly system calls and functions—when the program is loaded into memory, and intercepts them using plugins through a simple API. We also discuss the theoretical underpinnings of disassembling and rewriting. We developed two backends—for x86_64 and RISC-V—which were used to implement three plugins: a fast system call tracer, a multi-version executor, and a fault injector. Our evaluation shows that SaBRe imposes little overhead, typically below 3%.
Kapus T, Busse F, Cadar C, 2020, Pending constraints in symbolic execution for better exploration and seeding, IEEE/ACM International Conference on Automated Software Engineering (ASE 2020), Publisher: IEEE / ACM
Symbolic execution is a well-established technique for software testing and analysis. However, scalability continues to be a challenge, both in terms of constraint solving cost and path explosion. In this work, we present a novel approach for symbolic execution, which can enhance its scalability by aggressively prioritising execution paths that are already known to be feasible, and deferring all other paths. We evaluate our technique on nine applications, including SQLite3, make and tcpdump, and show it can achieve higher coverage for both seeded and non-seeded exploration.
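The core idea of prioritising known-feasible paths can be sketched in a few lines. This toy model is not KLEE code: the cached inputs, constraint representation, and `classify` function are all illustrative, standing in for seeds and previously computed solver models.

```python
# Toy model of "pending constraints": a state whose path constraints
# are already satisfied by some known input is feasible for free and
# explored eagerly; otherwise it is deferred ("pending") instead of
# paying an up-front solver call. Illustrative, not KLEE code.

known_inputs = [{"x": 5}, {"x": -3}]  # e.g. seeds or cached solver models

def classify(path_constraints):
    for model in known_inputs:
        if all(c(model) for c in path_constraints):
            return "explore"   # feasibility proven by a cached input
    return "pending"           # defer; solve only if exploration stalls

state_a = [lambda m: m["x"] > 0]                         # x=5 satisfies it
state_b = [lambda m: m["x"] > 0, lambda m: m["x"] > 100] # no cached witness
```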
2020, Proceedings of the 28th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE '20: 28th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, Publisher: ACM
Trabish D, Kapus T, Rinetzky N, et al., 2020, Past-sensitive pointer analysis for symbolic execution, European Software Engineering Conference / ACM SIGSOFT Symposium on the Foundations of Software Engineering (ESEC/FSE 2020), Publisher: ACM, Pages: 19-208
We propose a novel fine-grained integration of pointer analysis with dynamic analysis, including dynamic symbolic execution. This is achieved via past-sensitive pointer analysis, an on-demand pointer analysis instantiated with an abstraction of the dynamic state on which it is invoked. We evaluate our technique in three application scenarios: chopped symbolic execution, symbolic pointer resolution, and write integrity testing. Our preliminary results show that the approach can have a significant impact in these scenarios, by effectively improving the precision of standard pointer analysis with only a modest performance overhead.
Boehme M, Cadar C, Roychoudhury A, 2020, Fuzzing: challenges and reflections, IEEE Software, Vol: 38, Pages: 79-86, ISSN: 0740-7459
Fuzzing is a method to discover software bugs and vulnerabilities by automatic test input generation which has attracted tremendous recent interest in both academia and industry. Fuzzing comes in the form of several techniques. On one hand, we have symbolic execution, which enables a particularly effective approach to fuzzing by systematically enumerating the paths of a program. On the other hand, we have random input generation, which generates large amounts of inputs per second with no or minimal program analysis overhead. In this article, we summarize the open challenges and opportunities for fuzzing and symbolic execution as they emerged in discussions among researchers and practitioners at a Shonan Meeting, and were validated in a subsequent survey. We take a forward-looking view of software vulnerability discovery technologies and provide concrete directions for future research.
Busse F, Nowack M, Cadar C, 2020, Running symbolic execution forever, 29th ACM SIGSOFT International Symposium on Software Testing and Analysis, Publisher: ACM, Pages: 63-74
When symbolic execution is used to analyse real-world applications, it often consumes all available memory in a relatively short amount of time, sometimes making it impossible to analyse an application for an extended period. In this paper, we present a technique that can record an ongoing symbolic execution analysis to disk and selectively restore paths of interest later, making it possible to run symbolic execution indefinitely. To be successful, our approach addresses several essential research challenges related to detecting divergences on re-execution, storing long-running executions efficiently, changing search heuristics during re-execution, and providing a global view of the stored execution. Our extensive evaluation of 93 Linux applications shows that our approach is practical, enabling these applications to run for days while continuing to explore new execution paths.
Busse F, Nowack M, Cadar C, 2020, Artefact for the ISSTA 2020 paper: Running symbolic execution forever
When symbolic execution is used to analyse real-world applications, it often consumes all available memory in a relatively short amount of time, sometimes making it impossible to analyse an application for an extended period. In this paper, we present a technique that can record an ongoing symbolic execution analysis to disk and selectively restore paths of interest later, making it possible to run symbolic execution indefinitely. To be successful, our approach addresses several essential research challenges related to detecting divergences on re-execution, storing long-running executions efficiently, changing search heuristics during re-execution, and providing a global view of the stored execution. Our extensive evaluation of 93 Linux applications shows that our approach is practical, enabling these applications to run for days while continuing to explore new execution paths.
Cadar C, Nowack M, 2020, KLEE symbolic execution engine in 2019, International Journal on Software Tools for Technology Transfer, Vol: 23, Pages: 867-870, ISSN: 0945-8115
KLEE is a popular dynamic symbolic execution engine, initially designed at Stanford University and now primarily developed and maintained by the Software Reliability Group at Imperial College London. KLEE has a large community spanning both academia and industry, with over 60 contributors on GitHub, over 350 subscribers on its mailing list, and over 80 participants to a recent dedicated workshop. KLEE has been used and extended by groups from many universities and companies in a variety of different areas such as high-coverage test generation, automated debugging, exploit generation, wireless sensor networks, and online gaming, among many others.
Kapus T, Nowack M, Cadar C, 2019, Constraints in dynamic symbolic execution: bitvectors or integers?, International Conference on Tests and Proofs, Publisher: Springer International Publishing, Pages: 41-54, ISSN: 0302-9743
Dynamic symbolic execution is a technique that analyses programs by gathering mathematical constraints along execution paths. To achieve bit-level precision, one must use the theory of bitvectors. However, other theories might achieve higher performance, justifying in some cases the possible loss of precision. In this paper, we explore the impact of using the theory of integers on the precision and performance of dynamic symbolic execution of C programs. In particular, we compare an implementation of the symbolic executor KLEE using a partial solver based on the theory of integers, with a standard implementation of KLEE using a solver based on the theory of bitvectors, both employing the popular SMT solver Z3. To our surprise, our evaluation on a synthetic sort benchmark, the ECA set of Test-Comp 2019 benchmarks, and GNU Coreutils revealed that for most applications the integer solver did not lead to any loss of precision, but the overall performance difference was rarely significant.
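Where the two theories can disagree is easy to show concretely: over mathematical integers, x + 1 > x is valid, but over fixed-width bitvectors addition wraps, so the query is invalid. The sketch below models 8-bit unsigned semantics in plain Python (it is an illustration of the theory gap, not code from the paper).

```python
# Why integer reasoning can lose bit-level precision: x + 1 > x holds
# for all mathematical integers, but for 8-bit unsigned bitvectors
# addition wraps around, so x = 255 is a counterexample. A partial
# integer-theory solver must fall back to bitvectors on such queries.

def bv_add8(a, b):
    return (a + b) & 0xFF  # 8-bit wrap-around semantics

assert all(x + 1 > x for x in range(256))  # integer theory: valid
assert bv_add8(255, 1) == 0                # bitvector theory: wraps...
assert not (bv_add8(255, 1) > 255)         # ...so the query is invalid
```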
Marcozzi M, Tang Q, Cadar C, et al., 2019, Compiler Fuzzing: How Much Does It Matter?
Despite much recent interest in randomised testing (fuzzing) of compilers, the practical impact of fuzzer-found compiler bugs on real-world applications has barely been assessed. We present the first quantitative and qualitative study of the tangible impact of miscompilation bugs in a mature compiler. We follow a rigorous methodology where the bug impact over the compiled application is evaluated based on (1) whether the bug appears to trigger during compilation; (2) the extent to which generated assembly code changes syntactically due to triggering of the bug; and (3) whether such changes cause regression test suite failures, or whether we can manually find application inputs that trigger execution divergence due to such changes. The study is conducted with respect to the compilation of more than 10 million lines of C/C++ code from 309 Debian packages, using 12% of the historical and now fixed miscompilation bugs found by four state-of-the-art fuzzers in the Clang/LLVM compiler, as well as 18 bugs found by human users compiling real code or as a by-product of formal verification efforts. The results show that almost half of the fuzzer-found bugs propagate to the generated binaries for at least one package, in which case only a very small part of the binary is typically affected, yet causing two failures when running the test suites of all the impacted packages. User-reported and formal verification bugs do not exhibit a higher impact, with a lower rate of triggered bugs and one test failure. The manual analysis of a selection of the syntactic changes caused by some of our bugs (fuzzer-found and non fuzzer-found) in package assembly code, shows that either these changes have no semantic impact or that they would require very specific runtime circumstances to trigger execution divergence.
Marcozzi M, Tang Q, Cadar C, et al., 2019, Compiler Fuzzing: How Much Does It Matter?
Despite much recent interest in compiler randomized testing (fuzzing), the practical impact of fuzzer-found compiler bugs on real-world applications has barely been assessed. We present the first quantitative and qualitative study of the tangible impact of miscompilation bugs in a mature compiler. We follow a rigorous methodology where the bug impact over the compiled application is evaluated based on (1) whether the bug appears to trigger during compilation; (2) the extent to which generated assembly code changes syntactically due to triggering of the bug; and (3) the extent to which such changes cause regression test suite failures or can be used to manually trigger divergences during execution. The study is conducted with respect to the compilation of more than 10 million lines of C/C++ code from 309 Debian packages, using 12% of the historical and now fixed miscompilation bugs found by four state-of-the-art fuzzers in the Clang/LLVM compiler, as well as 18 bugs found by human users compiling real code or by formal verification. The results show that almost half of the fuzzer-found bugs propagate to the generated binaries for some packages, but rarely affect their syntax and cause two failures in total when running their test suites. User-reported and formal verification bugs do not exhibit a higher impact, with less frequently triggered bugs and one test failure. Our manual analysis of a selection of bugs, either fuzzer-found or not, suggests that none can easily trigger a runtime divergence on the packages considered in the analysis, and that in general they affect only corner cases.
Liew D, Cadar C, Donaldson A, et al., 2019, Just fuzz it: solving floating-point constraints using coverage-guided fuzzing, ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE ’19), Publisher: ACM, Pages: 521-532
We investigate the use of coverage-guided fuzzing as a means of proving satisfiability of SMT formulas over finite variable domains, with specific application to floating-point constraints. We show how an SMT formula can be encoded as a program containing a location that is reachable if and only if the program’s input corresponds to a satisfying assignment to the formula. A coverage-guided fuzzer can then be used to search for an input that reaches the location, yielding a satisfying assignment. We have implemented this idea in a tool, Just Fuzz-it Solver (JFS), and we present a large experimental evaluation showing that JFS is both competitive with and complementary to state-of-the-art SMT solvers with respect to solving floating-point constraints, and that the coverage-guided approach of JFS provides significant benefit over naive fuzzing in the floating-point domain. Applied in a portfolio manner, the JFS approach thus has the potential to complement traditional SMT solvers for program analysis tasks that involve reasoning about floating-point constraints.
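The encoding idea can be sketched directly: a formula becomes a program with a target location reachable exactly when the input is a satisfying assignment. The sketch below uses a hypothetical formula and a naive random search rather than a coverage-guided fuzzer (the paper's point is precisely that coverage guidance beats this naive baseline), so it illustrates only the encoding.

```python
import random

# Sketch of the SMT-formula-to-program encoding: the "target location"
# is reached iff the input satisfies the formula, so any input that
# reaches it is a satisfying assignment. Formula and fuzzer are toy
# stand-ins, not JFS internals.

def encoded_formula(x):
    # Formula: x > 1.0 /\ x * x < 2.5  (satisfiable, e.g. x = 1.2)
    if x > 1.0:
        if x * x < 2.5:
            return True   # target location reached => SAT witness
    return False

def naive_fuzz(seed=0, trials=100_000):
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.uniform(-10.0, 10.0)
        if encoded_formula(x):
            return x      # satisfying assignment found
    return None           # inconclusive: fuzzing is incomplete

witness = naive_fuzz()
assert witness is not None and witness > 1.0 and witness * witness < 2.5
```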
Kapus T, Cadar C, 2019, A segmented memory model for symbolic execution, ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE ’19), Publisher: ACM, Pages: 774-784
Symbolic execution is an effective technique for exploring paths in a program and reasoning about all possible values on those paths. However, the technique still struggles with code that uses complex heap data structures, in which a pointer is allowed to refer to more than one memory object. In such cases, symbolic execution typically forks execution into multiple states, one for each object to which the pointer could refer. In this paper, we propose a technique that avoids this expensive forking by using a segmented memory model. In this model, memory is split into segments, so that each symbolic pointer refers to objects in a single segment. The size of each segment is bounded by a threshold, in order to avoid expensive constraints. This results in a memory model where forking due to symbolic pointer dereferences is significantly reduced, often completely. We evaluate our segmented memory model on a mix of whole-program benchmarks (such as m4 and make) and library benchmarks (such as SQLite), and observe significant decreases in execution time and memory usage.
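The state-count difference between the two strategies can be sketched minimally. This toy contrast is not the paper's implementation: the object names and state dictionaries are illustrative, showing only that a multi-target pointer dereference produces one state per candidate object under forking, versus a single state under a segmented model.

```python
# Toy contrast: when a symbolic pointer may refer to several objects,
# vanilla symbolic execution forks one state per candidate object,
# while a segmented model co-locates the candidates in one segment so
# a single state covers the dereference. Illustrative only.

objects = ["obj_a", "obj_b", "obj_c", "obj_d"]

def states_forking(candidates):
    # One forked state per object the pointer may refer to.
    return [{"ptr_target": obj} for obj in candidates]

def states_segmented(candidates):
    # All candidates placed in one segment; a single state, with the
    # pointer constrained to an offset range within that segment.
    return [{"segment": list(candidates)}]

assert len(states_forking(objects)) == 4
assert len(states_segmented(objects)) == 1
```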
Kapus T, Ish-Shalom O, Itzhaky S, et al., 2019, Computing summaries of string loops in C for better testing and refactoring, ACM SIGPLAN Conference on Programming Language Design and Implementation, Publisher: ACM, Pages: 874-888
Analysing and comprehending C programs that use strings is hard: using standard library functions for manipulating strings is not enforced and programs often use complex loops for the same purpose. We introduce the notion of memoryless loops that capture some of these string loops and present a counterexample-guided inductive synthesis approach to summarise memoryless string loops using C standard library functions, which has applications to testing, optimization and refactoring. We prove our summarization is correct for arbitrary input strings and evaluate it on a database of loops we gathered from a set of 13 open-source programs. Our approach can summarize over two thirds of memoryless loops in less than 5 minutes of computation time per loop. We then show that these summaries can be used to (1) enhance symbolic execution testing, where we observed median speedups of 79x when employing a string constraint solver, (2) optimize native code, where certain summarizations led to significant performance gains, and (3) refactor code, where we had several patches accepted in the codebases of popular applications such as patch and wget.
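A minimal example of the kind of equivalence such summaries capture: a memoryless scanning loop that carries no state between iterations is equivalent to a single library call. The sketch below is a Python analogue of the C setting (a `for (i = 0; s[i]; i++);` loop summarised as `strlen`); it illustrates the idea, not the synthesis procedure.

```python
# A "memoryless" string loop and its library summary, as a Python
# analogue of the C case: the loop scans to the NUL terminator
# without carrying state across iterations, so it is equivalent to a
# strlen call. Illustrative, not the paper's synthesis algorithm.

def loop_version(buf):
    # models: for (i = 0; s[i]; i++);
    i = 0
    while buf[i] != "\0":
        i += 1
    return i

def summary_version(buf):
    # models the synthesised summary: i = strlen(s);
    return buf.index("\0")

s = list("hello") + ["\0"]
assert loop_version(s) == summary_version(s) == 5
```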
Pina L, Andronidis A, Hicks M, et al., 2019, Mvedsua: Higher availability dynamic software updates via multi-version execution, International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS '19), Publisher: Association for Computing Machinery, Pages: 573-585
Dynamic Software Updating (DSU) is a technique for patching stateful software without shutting it down, which enables both timely updates and non-stop service. Unfortunately, bugs in the update itself—whether in the changed code or in the way the change is introduced dynamically—may cause the updated software to crash or misbehave. Furthermore, the time taken to dynamically apply the update may be unacceptable if it introduces a long delay in service. This paper makes the key observation that both problems can be addressed by employing Multi-Version Execution (MVE). To avoid delay in service, the update is applied to a forked copy while the original system continues to operate. Once the update completes, the MVE system monitors that the responses of both versions agree for the same inputs. Expected divergences are specified by the programmer using an MVE-specific DSL. Unexpected divergences signal possible errors and roll back the update, which simply means terminating the updated version and reverting to the original version. This is safe because the MVE system keeps the state of both versions in sync. If the new version shows no problems after a warmup period, operators can make it permanent and discard the original version. We have implemented this approach, which we call Mvedsua, by extending the Kitsune DSU framework with Varan, a state-of-the-art MVE system. We have used Mvedsua to update several high-performance servers: Redis, Memcached, and Vsftpd. Our results show that Mvedsua significantly reduces the update-time delay, imposes little overhead in steady state, and easily recovers from a variety of update-related errors.
Marcozzi M, Tang Q, Donaldson A, et al., 2019, A systematic impact study for fuzzer-found compiler bugs
Despite much recent interest in compiler fuzzing, the practical impact of fuzzer-found miscompilations on real-world applications has barely been assessed. We present the first quantitative and qualitative study of the tangible impact of fuzzer-found compiler bugs. We follow a novel methodology where the impact of a miscompilation bug is evaluated based on (1) whether the bug appears to trigger during compilation; (2) the extent to which generated assembly code changes syntactically due to triggering of the bug; and (3) how likely such changes are to cause runtime divergences during execution. The study is conducted with respect to the compilation of more than 10 million lines of C/C++ code from 309 Debian packages, using 12% of the historical and now fixed miscompilation bugs found by four state-of-the-art fuzzers in the Clang/LLVM compiler, as well as 18 other bugs found by the Alive formal verification tool or human users. The results show that almost half of the fuzzer-found bugs propagate to the generated binaries for some applications, but barely affect their syntax and only cause two failures in total when running their regression test suites. Our manual analysis of a selection of bugs suggests that these bugs cannot trigger on the packages considered in the analysis, and that in general they affect only corner cases which have a low probability of occurring in practice. User-reported and Alive bugs do not exhibit a higher impact, with less frequently triggered bugs and one test failure.
Kuchta T, Lutellier T, Wong E, et al., 2018, On the correctness of electronic documents: studying, finding, and localizing inconsistency bugs in PDF readers and files, Empirical Software Engineering, Vol: 23, Pages: 3187-3220, ISSN: 1382-3256
Electronic documents are widely used to store and share information such as bank statements, contracts, articles, maps and tax information. Many different applications exist for displaying a given electronic document, and users rightfully assume that documents will be rendered similarly independently of the application used. However, this is not always the case, and these inconsistencies, regardless of their causes—bugs in the application or the file itself—can become critical sources of miscommunication. In this paper, we present a study on the correctness of PDF documents and readers. We start by manually investigating a large number of real-world PDF documents to understand the frequency and characteristics of cross-reader inconsistencies, and find that such inconsistencies are common—13.5% of PDF files are inconsistently rendered by at least one popular reader. We then propose an approach to detect and localize the source of such inconsistencies automatically. We evaluate our automatic approach on a large corpus of over 230K documents using 11 popular readers, and our experiments have detected 30 unique bugs in these readers and files. We also reported 33 bugs, some of which have already been confirmed or fixed by developers.
Kuchta T, Palikareva H, Cadar C, 2018, Shadow symbolic execution for testing software patches, ACM Transactions on Software Engineering and Methodology, Vol: 27, ISSN: 1049-331X
While developers are aware of the importance of comprehensively testing patches, the large effort involved in coming up with relevant test cases means that such testing rarely happens in practice. Furthermore, even when test cases are written to cover the patch, they often exercise the same behaviour in the old and the new version of the code. In this article, we present a symbolic execution-based technique that is designed to generate test inputs that cover the new program behaviours introduced by a patch. The technique works by executing both the old and the new version in the same symbolic execution instance, with the old version shadowing the new one. During this combined shadow execution, whenever a branch point is reached where the old and the new version diverge, we generate a test case exercising the divergence and comprehensively test the new behaviours of the new version. We evaluate our technique on the Coreutils patches from the CoREBench suite of regression bugs, and show that it is able to generate test inputs that exercise newly added behaviours and expose some of the regression bugs.
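The divergence idea above can be sketched with a toy patch. This is hypothetical code, not the tool's implementation: the two predicate versions and the brute-force input enumeration stand in for the old/new program versions and the symbolic solving of divergence conditions.

```python
# Toy sketch of shadow testing: run the old and new version of a
# patched branch condition on the same input; any input on which
# their outcomes differ exercises behaviour the patch introduced,
# and becomes a divergence-exposing test case. Illustrative only.

def old_version(x):
    return x > 10            # old branch condition

def new_version(x):
    return x >= 10           # patched condition: boundary changed

def divergence_tests(inputs):
    return [x for x in inputs if old_version(x) != new_version(x)]

# Shadow symbolic execution would solve for the divergence condition
# (x > 10) != (x >= 10); here we simply enumerate a small input range.
assert divergence_tests(range(0, 20)) == [10]
```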
Soria Dustmann O, Wehrle K, Cadar C, 2018, PARTI: A Multi-interval Theory Solver for Symbolic Execution, IEEE/ACM International Conference on Automated Software Engineering (ASE 2018), Publisher: ACM, Pages: 430-440
Symbolic execution is an effective program analysis technique whose scalability largely depends on the ability to quickly solve large numbers of first-order logic queries. We propose an effective general technique for speeding up the solving of queries in the theory of arrays and bit-vectors with a specific structure, while otherwise falling back to a complete solver. The technique has two stages: a learning stage that determines the solution sets of each symbolic variable, and a decision stage that uses this information to quickly determine the satisfiability of certain types of queries. The main challenges involve deciding which operators to support and precisely dealing with integer type casts and arithmetic underflow and overflow. We implemented this technique in an incomplete solver called PARTI ("PARtial Theory solver for Intervals"), directly integrating it into the popular KLEE symbolic execution engine. We applied KLEE with PARTI and a state-of-the-art SMT solver to synthetic and real-world benchmarks. We found that PARTI practically does not hurt performance while many times achieving order-of-magnitude speedups.
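The two-stage structure can be sketched with a single-variable interval domain. This is illustrative, not PARTI's code: the constraint encoding and function names are assumptions, and real multi-interval sets, type casts, and overflow handling are elided.

```python
# Sketch of the two-stage idea: a learning stage narrows a variable to
# an interval from simple constraints; a decision stage answers some
# queries cheaply from the interval alone, falling back to a complete
# solver otherwise. Single intervals only; illustrative, not PARTI.

def learn(constraints):
    lo, hi = float("-inf"), float("inf")
    for op, c in constraints:           # only simple x <op> c constraints
        if op == ">=":
            lo = max(lo, c)
        elif op == "<=":
            hi = min(hi, c)
    return lo, hi

def decide_eq(interval, value):
    lo, hi = interval
    if not (lo <= value <= hi):
        return "unsat"                   # decided cheaply from the interval
    return "unknown"                     # fall back to the complete solver

iv = learn([(">=", 10), ("<=", 20)])
assert iv == (10, 20)
assert decide_eq(iv, 5) == "unsat"       # fast path: no solver call
assert decide_eq(iv, 15) == "unknown"    # needs the full solver
```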
Pina L, Andronidis A, Cadar C, 2018, FreeDA: deploying incompatible stock dynamic analyses in production via multi-version execution, ACM International Conference on Computing Frontiers (CF 2018), Publisher: ACM, Pages: 1-10
Dynamic analyses such as those implemented by compiler sanitizers and Valgrind are effective at finding and diagnosing challenging bugs and security vulnerabilities. However, most analyses cannot be combined on the same program execution, and they incur a high overhead, which typically prevents them from being used in production. This paper addresses the ambitious goal of running concurrently multiple incompatible stock dynamic analysis tools in production, without requiring any modifications to the tools themselves or adding significant runtime overhead to the deployed system. This is accomplished using multi-version execution, in which the dynamic analyses are run concurrently with the native version, all on the same program execution. We implement our approach in a system called FreeDA and show that it is applicable to several common scenarios, involving network servers and interactive applications. In particular, we show how incompatible stock dynamic analyses implemented by Clang's sanitizers and Valgrind can be used to check high-performance servers such as Memcached, Nginx and Redis, and interactive applications such as Git, HTop and OpenSSH.
Symbolic execution is a powerful program analysis technique that systematically explores multiple program paths. However, despite important technical advances, symbolic execution often struggles to reach deep parts of the code due to the well-known path explosion problem and constraint solving limitations. In this paper, we propose chopped symbolic execution, a novel form of symbolic execution that allows users to specify uninteresting parts of the code to exclude during the analysis, thus only targeting the exploration to paths of importance. However, the excluded parts are not summarily ignored, as this may lead to both false positives and false negatives. Instead, they are executed lazily, when their effect may be observable by code under analysis. Chopped symbolic execution leverages various on-demand static analyses at runtime to automatically exclude code fragments while resolving their side effects, thus avoiding expensive manual annotations and imprecision. Our preliminary results show that the approach can effectively improve the effectiveness of symbolic execution in several different scenarios, including failure reproduction and test suite augmentation.
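The lazy execution of excluded code that this abstract describes can be caricatured in a few lines (the `LazyChopper` class and its interface are invented for illustration; the real system tracks side effects with on-demand static analyses inside a symbolic execution engine):

```python
# Toy sketch of lazily executing a "chopped" (user-excluded) function:
# the call is recorded rather than run, and replayed only when code
# under analysis actually reads state the function may have written.

class LazyChopper:
    def __init__(self):
        self.env = {}
        self.pending = []            # deferred (func, write_set) pairs

    def skip_call(self, func, writes):
        """Record a call instead of executing it."""
        self.pending.append((func, writes))

    def read(self, var):
        """Replay any deferred call whose side effects include `var`."""
        still = []
        for func, writes in self.pending:
            if var in writes:
                func(self.env)       # resolve the side effect on demand
            else:
                still.append((func, writes))
        self.pending = still
        return self.env.get(var)

def expensive_init(env):
    env['config'] = 42               # side effect we want to defer

vm = LazyChopper()
vm.skip_call(expensive_init, writes={'config'})
# ... exploration continues without ever running expensive_init ...
print(vm.read('config'))  # -> 42, computed only at this point
```

If the exploration never reads `config`, the deferred call is never executed, which is the source of the savings.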
Liew D, Schemmel D, Cadar C, et al., 2017, Floating-Point Symbolic Execution: A Case Study in N-version Programming, IEEE/ACM International Conference on Automated Software Engineering (ASE 2017), Publisher: IEEE
Kapus T, Cadar C, 2017, Automatic Testing of Symbolic Execution Enginesvia Program Generation and Differential Testing, IEEE/ACM International Conference on Automated Software Engineering (ASE 2017), Publisher: IEEE
Perry DM, Mattavelli A, Zhang X, et al., 2017, Accelerating array constraints in symbolic execution, International Symposium on Software Testing and Analysis (ISSTA), Publisher: ACM
Despite significant recent advances, the effectiveness of symbolic execution is limited when used to test complex, real-world software. One of the main scalability challenges is related to constraint solving: large applications and long exploration paths lead to complex constraints, often involving big arrays indexed by symbolic expressions. In this paper, we propose a set of semantics-preserving transformations for array operations that take advantage of contextual information collected during symbolic execution. Our transformations lead to simpler encodings and hence better performance in constraint solving. The results we obtain are encouraging: we show, through an extensive experimental analysis, that our transformations help to significantly improve the performance of symbolic execution in the presence of arrays. We also show that our transformations enable the analysis of new code, which would be otherwise out of reach for symbolic execution.
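One rewrite in the spirit of what this abstract describes (the concrete rule below is illustrative, not taken from the paper) is to use contextual information bounding a symbolic index `i` to a small set so that a select constraint `a[i] == c` can be flattened into an index-by-index disjunction, eliminating the array theory from the query:

```python
# Sketch of a semantics-preserving array transformation: when context
# restricts index i to a known small set, rewrite `a[i] == v` as a
# disjunction over concrete indices. The output is a plain string
# standing in for the transformed constraint.

def flatten_select(array_name, index_var, value, index_set):
    """Rewrite `a[i] == v` with i in index_set as a disjunction."""
    cases = [f"({index_var} == {k} and {array_name}[{k}] == {value})"
             for k in sorted(index_set)]
    return " or ".join(cases)

print(flatten_select("a", "i", "c", {5, 2}))
# -> (i == 2 and a[2] == c) or (i == 5 and a[5] == c)
```

The transformed constraint mentions only concrete array cells, which typical SMT solvers handle far more cheaply than symbolically indexed selects.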
Ganchinho de Pina L, Grumberg D, Andronidis A, et al., 2017, A DSL approach to reconcile equivalent divergent program executions, USENIX Annual Technical Conference 2017, Publisher: USENIX, Pages: 417-429
Multi-Version Execution (MVE) deploys multiple versions of the same program, typically synchronizing their execution at the level of system calls. By default, MVE requires all deployed versions to issue the same sequence of system calls, which limits the types of versions which can be deployed. In this paper, we propose a Domain-Specific Language (DSL) to reconcile expected divergences between different program versions deployed through MVE. We evaluate the DSL by adding it to an existing MVE system (Varan) and testing it via three scenarios: (1) deploying the same program under different configurations, (2) deploying different releases of the same program, and (3) deploying dynamic analyses in parallel with the native execution. We also present an algorithm to automatically extract DSL rules from pairs of system call traces. Our results show that each scenario requires a small number of simple rules (at most 14 rules in each case) and that writing DSL rules can be partially automated.
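A toy version of the reconciliation this abstract describes (the rule syntax here is invented for illustration; the paper's DSL is richer) maps a system call name used by one version to its equivalent in the other, so the MVE monitor treats the two traces as matching:

```python
# Sketch of reconciling expected syscall-level divergences between two
# program versions with simple rewrite rules. A rule maps a syscall in
# the follower version to its leader-side equivalent.

RULES = {"openat": "open"}   # e.g. a newer release switched to openat

def normalise(trace, rules):
    """Rewrite each syscall name through the rule table."""
    return [rules.get(call, call) for call in trace]

leader   = ["open", "read", "write", "close"]
follower = ["openat", "read", "write", "close"]

# After normalisation the traces agree, so the monitor does not
# flag a divergence between the two versions.
print(normalise(follower, RULES) == normalise(leader, RULES))  # -> True
```

The paper's extraction algorithm would, in this setting, infer the `openat -> open` rule automatically from a pair of recorded traces.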
This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.