How the Atheris Python Fuzzer Works

On Friday, we announced that we’ve released the Atheris Python fuzzing engine as open source. In this post, we’ll briefly talk about its origins, and then go into lots more detail on how it works.

The Origin Story 
Every year since 2013, Google has held a “Fuzzit”, an internal event where Googlers write fuzzers for their code or open source software. By October 2019, however, we’d already written fuzzers for most of the open-source C/C++ code we use. So for that Fuzzit, the author of this post wrote a Python fuzzing engine based on libFuzzer. Since then, over 50 Python fuzzers have been written at Google, and countless bugs have been reported and fixed.
Originally, this fuzzing engine could only fuzz native extensions, as it did not support Python coverage. But over time, the fuzzer morphed into Atheris, a high-performance fuzzing engine that supports both native and pure-Python fuzzing.
A Bit of Background 
Atheris is a coverage-guided fuzzer. To use Atheris, you specify an “entry point” via atheris.Setup(). Atheris will then rapidly call this entry point with different inputs, hoping to produce a crash. While doing so, Atheris will monitor how the program execution changes based on the input, and will attempt to find new and interesting code paths. This allows Atheris to find unexpected and buggy behavior very effectively.

import atheris

import sys

def TestOneInput(data):  # Our entry point

  if data == b”bad”:

    raise RuntimeError(“Badness!”)

    

atheris.Setup(sys.argv, TestOneInput)

atheris.Fuzz()

Atheris is a native Python extension, and uses libFuzzer to provide its code coverage and input generation capabilities. The entry point passed to atheris.Setup() is wrapped in the C++ entry point that’s actually passed to libFuzzer. This wrapper will then be invoked by libFuzzer repeatedly, with its data proxied back to Python.

Python Code Coverage 

Atheris is a native Python extension, and is typically compiled with libFuzzer linked in. When you initialize Atheris, it registers a tracer with CPython to collect information about Python code flow. This tracer can keep track of every line reached and every function executed.

We need to get this trace information to libFuzzer, which is responsible for generating code coverage information. There’s a problem, however: libFuzzer assumes that the amount of code is known at compile-time. The two primary code coverage mechanisms are __sanitizer_cov_pcs_init (which registers a set of program counters that might be visited) and __sanitizer_cov_8bit_counters_init (which registers an array of booleans that are to be incremented when a basic block is visited). Both of these need to know at initialization time how many program counters or basic blocks exist. But in Python, that isn’t possible, since code isn’t loaded until well after Python starts. We can’t even know it when we start the fuzzer: it’s possible to dynamically import code later, or even generate code on the fly.

Thankfully, libFuzzer supports fuzzing shared libraries loaded at runtime. Both __sanitizer_cov_pcs_init and __sanitizer_cov_8bit_counters_init are able to be safely called from a shared library in its constructor (called when the library is loaded). So, Atheris simulates loading shared libraries! When tracing is initialized, Atheris first calls those functions with an array of 8-bit counters and completely made-up program counters. Then, whenever a new Python line is reached, Atheris allocates a PC and 8-bit counter to that line; Atheris will always report that line the same way from then on. Once Atheris runs out of PCs and 8-bit counters, it simply loads a new “shared library” by calling those functions again. Of course, exponential growth is used to ensure that the number of shared libraries doesn’t become excessive.

What’s Special about Python 3.8+?

In the README, we advise users to use Python 3.8+ where possible. This is because Python 3.8 added a new feature: opcode tracing. Not only can we monitor when every line is visited and every function is called, but we can actually monitor every operation that Python performs, and what arguments it uses. This allows Atheris to find its way through if statements much better.

When a COMPARE_OP opcode is encountered, indicating a boolean comparison between two values, Atheris inspects the types of the values. If the values are bytes or Unicode, Atheris is able to report the comparison to libFuzzer via __sanitizer_weak_hook_memcmp. For integer comparison, Atheris uses the appropriate function to report integer comparisons, such as __sanitizer_cov_trace_cmp8.

In recent Python versions, a Unicode string is actually represented as an array of 1-byte, 2-byte, or 4-byte characters, based on the size of the largest character in the string. The obvious solution for coverage is to:

  • first compare two strings for equivalent character size and report it as an integer comparison with __sanitizer_cov_trace_cmp8
  • Second, if they’re equal, call __sanitizer_weak_hook_memcmp to report the actual string comparison

However, performance measurements discovered that the surprising best strategy is to convert both strings to utf-8, then compare those with __sanitizer_weak_hook_memcmp. Even with the performance overhead of conversion, libFuzzer makes progress much faster.


Building Atheris
Most of the effort to release Atheris was simply making it build outside of Google’s environment. At Google, building a Python project builds its entire universe of dependencies, including the Python interpreter. This makes it trivial for us to use libFuzzer with our projects – we just compile it into our Python interpreter, along with Address Sanitizer or whatever other features we want.
Unfortunately, outside of Google, it’s not that simple. We had many false starts regarding how to link libFuzzer with Atheris, including making it a standalone shared object, preloading it, etc. We eventually settled on linking it into the Atheris shared object, as it provides the best experience for most users.
However, this strategy still required us to make minor changes to libFuzzer, to allow it to be called as a library. Since most users won’t have the latest Clang and it typically takes several years for distributions to update their Clang installation, actually getting this new libFuzzer version would be quite difficult for most people, making Atheris installation a hassle. To avoid this, we actually patch libFuzzer if it’s too old. Atheris’s setup.py will detect an out-of-date libFuzzer, make a copy of it, mark its fuzzer entry point as visible, and inject a small wrapper to allow it to be called via the name LLVMFuzzerRunDriver. If the libFuzzer is sufficiently new, we just call it using LLVMFuzzerRunDriver directly.
The true problem comes from fuzzing native extensions with sanitizers. In theory, fuzzing a native extension with Atheris should be trivial – just build it with -fsanitize=fuzzer-no-link, and make sure Atheris is loaded first. Those magic function calls that Clang injected will point to the libFuzzer symbols inside Atheris. When just fuzzing a native extension without sanitizers, it actually is that simple. Everything works. Unfortunately, sanitizers make everything more complex.
When using a sanitizer like Address Sanitizer with Atheris, it’s necessary to LD_PRELOAD the sanitizer’s shared object. ASan requires that it be loaded first, before anything else; it must either be preloaded, or statically linked into the executable (in this case, the Python interpreter). ASan and UBSan define many of the same code coverage symbols as libFuzzer. In typical libFuzzer usage, this isn’t an issue, since ASan/UBSan declare those symbols weak; the libFuzzer ones take precedence. But when libFuzzer is loaded in a shared object later, that doesn’t work. The symbols from ASan/UBSan have already been loaded via LD_PRELOAD, and coverage information therefore goes to those libraries, leaving libFuzzer very broken.
The only good way to solve this is to link libFuzzer into python itself, instead of Atheris. Since it’s therefore part of the proper executable rather than a shared object that’s dynamically loaded later, symbol resolution works correctly and libFuzzer symbols take precedence. This is nontrivial. We’ve provided documentation about this, and a script to build a modified CPython 3.8.6. These scripts will use the same possibly-patched libFuzzer as Atheris.
Why is it called Atheris? 
Atheris Hispida, or the “Hairy bush viper”, is the closest thing that exists to a fuzzy Python.


This post appeared first on Google Online Security Blog
Author: Sarah O’Rourke