# Efficient Verification of Programs with Complex Data Structures Using SMT Solvers

#### PhD Thesis

Author(s):Tianhai Liu
School:Karlsruhe Institute of Technology
Year:2018
DOI:10.5445/IR/1000084545

## Abstract

Complex data structures such as list, tree, and graph are mainly located on the heap. Verifying programs with complex data structures against the properties that constrain the configurations of the objects on the heap is particularly important for safety-critical software systems with extensive heap manipulations. Erroneous heap manipulations may cause loss or unauthorized access to data, violate software security, and may eventually cause a system to crash. Program verification techniques like bounded and deductive program verification, in general, are capable of verifying a program against a property of complex data structures. However, there is always a trade-off between the scope of their analyses, and their level of automation. Bounded program verification techniques are fully automatic, but they only analyze a program for a small scope. Deductive program verification, on the other hand, have no restriction on the scope, but they often require users to provide auxiliary specifications such as loop invariants and method contracts for sub-routines. In this thesis, we present a verification infrastructure combing the benefits of different program verification techniques. To verify a program with respect to a specification—specifying a property that is expected to be satisfied by the program, the envisioned verification process is as follows:
1. Bounded program verification. For gaining fast and initial confidence in the correctness of the program regarding the expected property, we first check whether the property holds for a small scope without providing any auxiliary specification. If a counterexample to the correctness of the program is found, no further analysis is required. Otherwise, the property holds—but only for the scope; thus a deductive verification—an unbounded program verification, is still needed.
2. Abstraction. Construct a semantic slice of the program with respect to the property. The semantic slice is an abstract program which contains those statements of the original program that are relevant to the property. The remainder of the original program (i.e., the irrelevant statements) is replaced by abstractions. Proving the correctness of slices requires less auxiliary specifications as slices have less details.
3. Bounded verification of auxiliary specifications. Before continuing with the deductive verification, we check whether the user-provided auxiliary specifications hold in the slice using bounded program verification. If a counterexample to the correctness of the specifications is found, an inspection of the specification is needed. Otherwise, go to step 4.
4. Deductive program verification. Prove the correctness of the slice with the bounded-verified auxiliary specifications. If the slice (i.e., the abstract program) is proved, the original program satisfies the expected property as well (by the construction of the slice), and the analysis terminates. Otherwise, a counterexample to the slice is found, and thus go to step 5.
5. Refinement. Check whether the counterexample is valid for the original program, i.e., it is also a counterexample to the original program. If it is valid, a fault has been discovered, and the analysis terminates. Otherwise, we refine the slice so that the invalid counterexample is eliminated. The process then starts over at step 3.
This verification process is an instantiation of counterexample-guided abstraction refinement framework and forms the basis of our verification infrastructure. Though it still relies on user-provided specifications for deductive program verification, it holds promise for efficiency. It allows only the necessary parts of the program for the expected property to be analyzed in deductive program verification, so it eases the burden of manually discovering useful annotations. It guarantees fast and initial confidence in the correctness of program and auxiliary specifications, hence avoids unnecessary attempts and facilitates users to inspect the failed proofs in deductive program verification. Our infrastructure repeatedly uses bounded program verification. To improve its efficiency, we provide novel approaches for bounded program verification. We provide an SMT-based encoding for bounded program verification by exploiting the recent advances of satisfiability modulo theory solvers. The encoding supports programs with complex data structures and specifications with arbitrarily nested quantifiers and reachability expressions, and is used in the rest approaches presented in this thesis. Besides, we provide a calculus to compute suitable scopes for gaining a high code coverage at bounded program verification. Finally, we study the effect of using various constraint solving techniques on the time efficiency of symbol execution—a technique used in program verification systems.

## BibTeX

@phdthesis{Liu2018PhD,
author    = {Tianhai Liu},
title     = {Efficient Verification of Programs with Complex Data Structures Using SMT Solvers},
school    = {Karlsruhe Institute of Technology},
year      = {2018},
month     = jul,
doi       = {10.5445/IR/1000084545},
abstract  = {Complex data structures such as list, tree, and graph are
mainly located on the heap. Verifying programs with complex
data structures against the properties that constrain the
configurations of the objects on the heap is particularly
important for safety-critical software systems with
extensive heap manipulations. Erroneous heap manipulations
software security, and may eventually cause a system to crash.
Program verification techniques like bounded and deductive
program verification, in general, are capable of verifying a
program against a property of complex data structures.
However, there is always a trade-off between the scope of
their analyses, and their level of automation.
\emph{Bounded program verification} techniques are fully
automatic, but they only analyze a program for a small scope.
\emph{Deductive program verification}, on the other hand,
have no restriction on the scope, but they often require
users to provide auxiliary specifications such as loop
invariants and method contracts for sub-routines.
In this thesis, we present a verification infrastructure
combing the benefits of different program verification
techniques. To verify a program with respect
to a specification—specifying a \emph{property} that is
expected to be satisfied by the program, the envisioned
verification process is as follows:
\newline
1. Bounded program verification. For gaining fast and initial
confidence in the correctness of the program regarding the
expected property, we first check whether the property
holds for a small scope without providing any auxiliary
specification. If a counterexample to the correctness
of the program is found, no further analysis is required.
Otherwise, the property holds—but only for the scope; thus
a deductive verification—an unbounded program verification,
is still needed.\newline
2. Abstraction. Construct a semantic slice of the program with
respect to the property. The semantic slice is an abstract
program which contains those statements of the original
program that are relevant to the property. The remainder of
the original program (i.e., the irrelevant statements) is
replaced by abstractions. Proving the correctness of slices
requires less auxiliary specifications as slices have less
details.\newline
3. Bounded verification of auxiliary specifications. Before
continuing with the deductive verification, we check
whether the user-provided auxiliary specifications hold in
the slice using bounded program verification. If a
counterexample to the correctness of the specifications is
found, an inspection of the specification is needed.
Otherwise, go to step 4.\newline
4. Deductive program verification. Prove the correctness of
the slice with the bounded-verified auxiliary
specifications. If the slice (i.e., the abstract program)
is proved, the original program satisfies the expected
property as well (by the construction of the slice), and
the analysis terminates. Otherwise, a counterexample to
the slice is found, and thus go to step 5.\newline
5. Refinement. Check whether the counterexample is valid for
the original program, i.e., it is also a counterexample to
the original program. If it is valid, a fault has been
discovered, and the analysis terminates. Otherwise, we
refine the slice so that the invalid counterexample is
eliminated. The process then starts over at step 3.
\newline
This verification process is an instantiation of
counterexample-guided abstraction refinement framework
and forms the basis of our verification infrastructure.
Though it still relies on user-provided specifications
for deductive program verification, it holds promise for
efficiency. It allows only the necessary parts of the
program for the expected property to be analyzed in
deductive program verification, so it eases the burden
of manually discovering useful annotations. It
guarantees fast and initial confidence in the correctness
of program and auxiliary specifications, hence avoids
unnecessary attempts and facilitates users to inspect the
failed proofs in deductive program verification.
Our infrastructure repeatedly uses bounded program
verification. To improve its efficiency, we provide novel
approaches for bounded program verification. We provide
an SMT-based encoding for bounded program verification
by exploiting the recent advances of satisfiability modulo
theory solvers. The encoding supports programs with
complex data structures and specifications with arbitrarily
nested quantifiers and reachability expressions, and is used
in the rest approaches presented in this thesis. Besides, we
provide a calculus to compute suitable scopes for gaining a
high code coverage at bounded program verification. Finally,
we study the effect of using various constraint solving
techniques on the time efficiency of symbol execution—a
technique used in program verification systems.}
}