We describe a technique for systematic testing of multi-threaded programs. We combine Quasi-Optimal Partial-Order Reduction, a state-of-the-art technique that tackles path explosion due to interleaving non-determinism, with symbolic execution to handle data non-determinism. Our technique iteratively and exhaustively finds all executions of the program. It represents program executions using partial orders and finds the next execution using an underlying unfolding semantics. We avoid the exploration of redundant program traces using cutoff events. We implemented our technique as an extension of KLEE and evaluated it on a set of large multi-threaded C programs. Our experiments found several previously undiscovered bugs and undefined behaviors in memcached and GNU sort, showing that the new method is capable of finding bugs in industrial-size benchmarks.
The main reason for the standardization of network protocols, like QUIC, is to ensure interoperability between implementations, which poses a challenging task. Manual tests are currently used to test the different existing implementations for interoperability, but given the complex nature of network protocols, it is hard to cover all possible edge cases. State-of-the-art automated software testing techniques, such as Symbolic Execution (SymEx), have proven themselves capable of analyzing complex real-world software and finding hard to detect bugs. We present a SymEx-based method for finding interoperability issues in QUIC implementations, and explore its merit in a case study that analyzes the interoperability of picoquic and QUANT. We find that, while SymEx is able to analyze deep interactions between different implementations and uncovers several bugs, in order to enable efficient interoperability testing, implementations need to provide additional information about their current protocol state.
Processes in computer simulations tend to be highly repetitive. In particular, parameter studies further exasperate the situation as the same model is repeatedly executed with only partially varying parameters. Consequently, computer simulations perform identical computations, with identical code, identical input, and hence identical output. These redundant computations waste significant amounts of time and energy. Memoization, dating back to 1968, enables the caching of such identical intermediate results, thereby significantly speeding up those computations. However, until now, automated approaches were limited to pure functions. At ACM SIGSIM-PADS 2016 we published, to the best of our knowledge, the first practical approach for automated memoization for impure code. In this work, we extend this approach and evaluate the performance characteristics of a number of extensions that deal with questions posed at PADS: (1) To reduce and bound the memory footprint, we investigate several cache eviction strategies. (2) We allow the original and the memoized code to coexist via a runtime-switch and analyze the crossover point, thereby mitigating memoization overhead. (3) By optionally persisting the Memoization Cache to disk, we expand the scope to exploratory parameter studies where cached results can now be reused across multiple simulation runs. Altogether, automated memoization for impure code is a valuable technique, the versatility of which we explore further in this article. It sped up a case study of an OFDM network simulation by a factor of more than 80 with an only marginal increase of memory consumption.
Liveness violation bugs are notoriously hard to detect, especially due to the difficulty inherent in applying formal methods to real-world programs. We present a generic and practically useful liveness property which defines a program as being live as long as it will eventually either consume more input or terminate. We show that this property naturally maps to many different kinds of real-world programs.To demonstrate the usefulness of our liveness property, we also present an algorithm that can be efficiently implemented to dynamically find lassos in the target program’s state space during Symbolic Execution. This extends Symbolic Execution, a well known dynamic testing technique, to find a new class of program defects, namely liveness violations, while only incurring a small runtime and memory overhead, as evidenced by our evaluation. The implementation of our method found a total of five previously undiscovered software defects in BusyBox and the GNU Coreutils. All five defects have been confirmed and fixed by the respective maintainers after shipping for years, most of them well over a decade.
Especially in the case of Cyber-Physical Systems (CPSs), testbed validations and benchmarks, while necessary, incur significant setup and operation costs. Optimized benchmark sets reduce the number of tests that need to be performed, which ultimately reduces costs. In this paper, we propose a new methodology to provide automated assistance for optimizing existing benchmarks or for creating new ones from scratch. The proposed methodology is based on complete Symbolic Execution of a single control loop iteration, optionally expanded with a Nondeterministic Finite Automaton (NFA) model that represents possible changes in the environment or the system in between control loop iterations. This enables us to compute a stress number that represents the computational burden put upon the controller by a respective benchmark. By iteratively searching for benchmarks with high stress numbers and automatically detecting and pruning benchmarks that induce the same path through the controller code, we can ultimately create a minimal set of relevant benchmarks for a CPS.
Symbolic execution is a well-known program analysis technique for testing software, which makes intensive use of constraint solvers. Recent support for floating-point constraint solving has made it feasible to support floating-point reasoning in symbolic execution tools. In this paper, we present the experience of two research teams that independently added floating-point support to KLEE, a popular symbolic execution engine. Since the two teams independently developed their extensions, this created the rare opportunity to conduct a rigorous comparison between the two implementations, essentially a modern case study on N-version programming. As part of our comparison, we report on the different design and implementation decisions taken by each team, and show their impact on a rigorously assembled and tested set of benchmarks, itself a contribution of the paper.
The softwarization of networks provides a new degree of flexibility in network operation but its software components can result in unexpected runtime performance and erratic network behavior. This challenges the deployment of flexible software functions in performance critical (core) networks. To address this challenge, we present a methodology enabling the prediction of runtime performance and testing of functional behavior of Network Functions. Unlike traditional performance evaluation, e.g., testbed testing or simulation, our methodology can characterize the Network Function performance for any possible workload only by code analysis.
In computer simulations many processes are highly repetitive. These repetitions are amplified further when a parameter study is conducted where the same model is repeatedly executed with varying parameters, especially when performing multiple runs to increase statistical confidence. Inevitably, such repetitions result in the execution of identical computations, with identical code, identical input, and hence identical output. Performing computations redundantly wastes resources and the execution time of a parameter study could be reduced if the redundancies were avoided. To this end, the idea of memoization was proposed decades ago. However, until today memoization is either performed manually or automated memoization approaches are used that can only handle pure functions. This means that only the function parameters and the return value may be input and output of the function whereas side effects are not allowed. In order to expand the scope of automated memoization to a larger class of programs, we propose an approach able to reliably detect the full input and output of a function, including reading and writing objects through arbitrarily indirect pointers with some preconditions. We show the feasibility of our approach and derive simple performance approximations enabling rough predictions of the expected benefit. By means of a simple case study performing an OFDM network simulation, we demonstrate the practical suitability of our approach, speeding up the execution of the whole parameter study by a factor of 75, while only doubling memory consumption.
Developing complex technical systems requires a systematic exploration of the given design space in order to identify optimal system configurations. However, studying the effects and interactions of even a small number of system parameters often requires an extensive number of simulation runs. This in turn results in excessive runtime demands which severely hamper thorough design space explorations. In this paper, we present a parallel discrete event simulation scheme that enables cost- and time-efficient execution of large scale parameter studies on GPUs. In order to efficiently accommodate the stream-processing paradigm of GPUs, our parallelization scheme exploits two orthogonal levels of parallelism = External parallelism among the inherently independent simulations of a parameter study and internal parallelism among independent events within each individual simulation of a parameter study. Specifically, we design an event aggregation strategy based on external parallelism that generates workloads suitable for GPUs. In addition, we define a pipelined event execution mechanism based on internal parallelism to hide the transfer latencies between host- and GPU-memory. We analyze the performance characteristics of our parallelization scheme by means of a prototype implementation and show a 25-fold performance improvement over purely CPU-based execution.