Facebook Redocs 2020 Project Topic

Automatic Exploit Generation

The main automatic tools for finding security issues are fuzzing and static analysis. Fuzzing is usually push-button and provides reliable signal, but often struggles to cover significant amounts of code in large applications; security engineers can write harnesses to point a fuzzer at an interesting part of the code. Static analysis, and in particular compositional techniques such as those implemented in Infer, scales to millions of lines of code without user intervention and can "start anywhere" in a code base. However, not all issues reported by static analysis lead to real crashes or security vulnerabilities. Much of the time this is because the analysis over-approximates the program behaviour, which introduces false positives. More recently, under-approximating techniques have emerged [2] that can guarantee the absence of false positives in some situations. Still, knowing that a static analyser reported a bug is a long way from having a working exploit against the code! Reports from static analysis on large code bases can be hard to triage: investigating whether a report is a bug that can actually happen can take days, and it is not unusual to get hundreds of reports, making manual triage impractical.

The goal of this project is to automatically turn static analysis reports into executables confirming the vulnerability. This would also provide reliable test cases for developers to fix the issue. We can even imagine turning this into a start-anywhere fuzzer that doesn’t require users to write a harness.

Main Goal: Generate Harnesses from Infer.Pulse Summaries

Pulse is one of the analysers inside the Infer static analysis platform [1]. Pulse implements a version of Incorrectness Separation Logic [2,3] and generates pre- and post-condition pairs for each procedure in a program (or library). These pre- and post-conditions are abstract states, meaning "if the state of the program at the beginning is of this shape, then the function will exit with the state being of this other shape".

Here is how this looks on a tiny example. Notice the ERROR spec given to goo().

void foo(int* x, int c) {
  if (c > 0) {
    free(x);
  }
}
/* Pulse summary: 2 pre/post pairs
#0
PRE: x allocated
POST: x freed
PATH CONDITION: c > 0

#1
PRE=POST=nothing
PATH CONDITION: c ≤ 0
*/

void goo(int* z, int c) {
  foo(z, *z);
  free(z);
}
/* Pulse summary: 2 pre/post pairs
#0
PRE: z allocated, points to some value v
POST: ERROR: double free
PATH CONDITION: v > 0

#1
PRE: z allocated, points to some value v
POST: z freed
PATH CONDITION: v ≤ 0
*/

While this example is straightforward, Pulse scales to millions of lines of C, C++, or Java code.

The goal of this project is to automatically generate programs that turn ERROR specs into executable vulnerabilities. Starting from a Pulse ERROR spec, we want to generate a crash witness:

  1. Materialise the memory shape in its PRE, e.g. for goo() that would be a single allocated pointer to an integer.
  2. Fill in concrete values that satisfy the constraints of the path condition, e.g. using an off-the-shelf SMT solver. For goo() that means setting the pointed-to value to something positive.
  3. Run the function from that memory state and observe the crash!

From Crashes to Exploits

Now we have an automatic method to validate Pulse reports, turning them into concrete crashes. But are these security vulnerabilities?

The goal is to find an automatic way to detect if a repro can be turned into an exploit. Possible ideas:

  • Use taint analysis: is the function called with external inputs?
  • Decide from the type of bug and the shape of the memory. E.g. a use-after-free could be more exploitable than a null-pointer dereference (although the latter is still a potential Denial-of-Service).

Either way, this can inform the choice of which ERROR specs we want to generate crash witnesses for.

Start-Anywhere Concolic Fuzzing

Another way to take crash witnesses further is to use them as the basis for a "start-anywhere" fuzzer, e.g. one starting near functions that static analysis has flagged as suspicious! This would take fuzzers beyond what they are currently capable of [4]. For example, the crash witnesses can be turned into fuzzer harnesses taking memory states as input.

The technique of mixing concrete (i.e. executing a real program) and symbolic (i.e. statically computing abstract states possible at every program point) data is often dubbed concolic, as in concolic testing (e.g. Klee [5]) or concolic fuzzing (e.g. SAGE [6]).

References

[1] Infer: https://fbinfer.com/ and https://github.com/facebook/infer/
[2] Incorrectness Logic: http://www0.cs.ucl.ac.uk/staff/p.ohearn/papers/IncorrectnessLogic.pdf
[3] Local Reasoning about the Presence of Bugs: Incorrectness Separation Logic, CAV 2020: http://www.soundandcomplete.org/papers/CAV2020/ISL-CAV-2020.pdf
[4] Incorrectness Logic by Example (on Incorrectness Logic and fuzzing): https://hexgolems.com/2020/04/incorrectness-logic-by-example/
[5] KLEE: https://klee.github.io/
[6] SAGE: https://patricegodefroid.github.io/public_psfiles/ndss2008.pdf