Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve symbolic execution performance when running against concrete values #2363

Open
celinval opened this issue Apr 12, 2023 · 10 comments · May be fixed by #2396
Open

Improve symbolic execution performance when running against concrete values #2363

celinval opened this issue Apr 12, 2023 · 10 comments · May be fixed by #2396
Assignees
Labels
[C] Feature / Enhancement A new feature request or enhancement to an existing feature. [E] Performance Track performance improvement (Time / Memory / CPU) T-CBMC Issue related to an existing CBMC issue

Comments

@celinval
Copy link
Contributor

Requested feature: When running Kani assess, many unit tests take a long time to run. Since these tests run with concrete inputs, I would expect the performance to be somewhat similar to MIRI.
Use case: Running tests to find undefined behavior + potential improvement to harnesses.
Link to relevant documentation (Rust reference, Nomicon, RFC):

Test case:

// TODO
@celinval celinval added [E] Performance Track performance improvement (Time / Memory / CPU) [C] Feature / Enhancement A new feature request or enhancement to an existing feature. labels Apr 12, 2023
@zhassan-aws
Copy link
Contributor

zhassan-aws commented Apr 12, 2023

Here is one such example:

#[kani::proof]
#[kani::unwind(7)]
#[kani::solver(cadical)]
fn main() {
    let s = "Mary had a little lamb";
    let v: Vec<&str> = s.split(' ').collect();
    assert_eq!(v, ["Mary", "had", "a", "little", "lamb"]);
}

This takes ~77 seconds with cadical:

$ kani split.rs
...
Runtime Symex: 21.1936s
size of program expression: 583537 steps
slicing removed 412796 assignments
Generated 39013 VCC(s), 26663 remaining after simplification
Runtime Postprocess Equation: 0.695186s
Passing problem to propositional reduction
converting SSA
Runtime Convert SSA: 8.50542s
Running propositional reduction
Post-processing
Runtime Post-process: 20.7397s
Solving with CaDiCaL sc2021
3893095 variables, 33892927 clauses
SAT checker: instance is SATISFIABLE
Runtime Solver: 32.2184s
Runtime decision procedure: 40.7989s
Running propositional reduction
Solving with CaDiCaL sc2021
3893095 variables, 33892928 clauses
SAT checker: instance is UNSATISFIABLE
Runtime Solver: 0.626649s
Runtime decision procedure: 0.744567s
...
SUMMARY:
 ** 0 of 773 failed (14 unreachable)

VERIFICATION:- SUCCESSFUL
Verification Time: 76.816055s

~150 seconds with kissat:

SUMMARY:
 ** 0 of 773 failed (14 unreachable)

VERIFICATION:- SUCCESSFUL
Verification Time: 151.37282s

and ~635 seconds with minisat:


SUMMARY:
 ** 0 of 773 failed (14 unreachable)

VERIFICATION:- SUCCESSFUL
Verification Time: 634.53174s

MIRI takes 0.2 seconds on this example.

@tautschnig
Copy link
Member

Given

Runtime Symex: 21.1936s
size of program expression: 583537 steps

we seem to manage to do ~27.5k (complex) instructions per second, which is certainly not below what we usually see. But: we can and should still seek to improve on that, so I'll happily take up that challenge. (Also, the number of instructions does not seem excessive: perf stat -- ./2363 reports 1540867 concrete machine instructions on my system for this example.

@tautschnig tautschnig self-assigned this Apr 14, 2023
@celinval
Copy link
Contributor Author

@tautschnig do you know why this is invoking the SAT solver? I don't know much about the symbolic execution engine, but I was hoping that it would be able to solve these cases on its own.

@tautschnig
Copy link
Member

@tautschnig do you know why this is invoking the SAT solver? I don't know much about the symbolic execution engine, but I was hoping that it would be able to solve these cases on its own.

That's a good point that needs digging in a bit more: I believe that Kani's reachability assertions will mean we invoke the solver in some way, but then all the queries should be trivial (because the path conditions for those assertions should trivially be TRUE or FALSE, and the solver should have no clauses in the input formula).

@zhassan-aws
Copy link
Contributor

I tried it without reachability checks as well, and it was slightly faster. It also performed a single SAT query:

$ kani split.rs --no-assertion-reach-checks
Runtime Symex: 20.8828s
size of program expression: 579251 steps
slicing removed 412796 assignments
Generated 34727 VCC(s), 22377 remaining after simplification
Runtime Postprocess Equation: 0.672573s
Passing problem to propositional reduction
converting SSA
Runtime Convert SSA: 8.46135s
Running propositional reduction
Post-processing
Runtime Post-process: 20.7269s
Solving with CaDiCaL sc2021
3889146 variables, 33877177 clauses
SAT checker: instance is UNSATISFIABLE
Runtime Solver: 38.9186s
Runtime decision procedure: 47.4509s
...
SUMMARY:
 ** 0 of 773 failed

VERIFICATION:- SUCCESSFUL
Verification Time: 72.50889s

@zhassan-aws
Copy link
Contributor

Disabling all CBMC checks cuts the time in half:

$ kani split.rs --no-assertion-reach-checks --no-default-checks
...
Runtime Symex: 19.7691s
size of program expression: 559970 steps
slicing removed 412673 assignments
Generated 7491 VCC(s), 4215 remaining after simplification
Runtime Postprocess Equation: 0.575882s
Passing problem to propositional reduction
converting SSA
Runtime Convert SSA: 7.36501s
Running propositional reduction
Post-processing
Runtime Post-process: 0.398001s
Solving with CaDiCaL sc2021
3150503 variables, 22299522 clauses
SAT checker: instance is UNSATISFIABLE
Runtime Solver: 0.71818s
Runtime decision procedure: 8.14251s
...
SUMMARY:
 ** 0 of 152 failed

VERIFICATION:- SUCCESSFUL
Verification Time: 32.27915s

@feliperodri feliperodri added the T-CBMC Issue related to an existing CBMC issue label Apr 17, 2023
@tautschnig
Copy link
Member

We need to look into overriding align_offset to always produce zero instead of doing arithmetic over pointers in a way that doesn't make much sense given CBMC does not actually produce a concrete heap layout.

tautschnig added a commit to tautschnig/kani that referenced this issue Apr 20, 2023
This hook intercepts calls to `std::ptr::align_offset<T>` as CBMC's memory
model has no concept of alignment of allocations, so we would have to
non-deterministically choose an alignment of the base pointer, add the
pointer's offset to it, and then do the math that is done in
`library/core/src/ptr/mod.rs`. Instead, we choose to always return
`usize::MAX`, per `align_offset`'s documentation, which states: "It is
permissible for the implementation to always return usize::MAX. Only your
algorithm’s performance can depend on getting a usable offset here, not its
correctness."

Fixes: model-checking#2363
@tautschnig tautschnig linked a pull request Apr 20, 2023 that will close this issue
4 tasks
@tautschnig
Copy link
Member

With the fix from #2396 the "Verification Time" is now 1.9 seconds. When adding in #2395 the time decreases to 0.9 seconds.

@zhassan-aws
Copy link
Contributor

zhassan-aws commented May 9, 2023

Another example from #2235:

fn main() {
    let x = String::new().repeat(1);
    assert!(x.chars().nth(1).is_none());
}

This runs for several minutes more than two hours even if I turn off all optional checks --no-default-checks.

@zhassan-aws
Copy link
Contributor

Another case is #2517.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[C] Feature / Enhancement A new feature request or enhancement to an existing feature. [E] Performance Track performance improvement (Time / Memory / CPU) T-CBMC Issue related to an existing CBMC issue
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants