Research Project: `derive` Fuzz Testing

*Co-authored with @joshlf.*

## Overview
Create a library for fuzz-testing proc macro derives.

## Background
Zerocopy is a crate that provides safe abstractions over transmutation (i.e., reinterpreting the bits of a type as if they belong to another type). Zerocopy provides core traits, each of which can only be derived for a type with a procedural macro (e.g., `#[derive(TryFromBytes)]`):

- [`TryFromBytes`](https://docs.rs/zerocopy/latest/zerocopy/trait.TryFromBytes.html) provides the ability to determine at runtime whether a sequence of bytes represents a valid instance of a type
- [`FromZeros`](https://docs.rs/zerocopy/latest/zerocopy/trait.FromZeros.html) indicates that a sequence of zero bytes represents a valid instance of a type
- [`FromBytes`](https://docs.rs/zerocopy/latest/zerocopy/trait.FromBytes.html) indicates that a type may safely be converted from an arbitrary byte sequence
- [`IntoBytes`](https://docs.rs/zerocopy/latest/zerocopy/trait.IntoBytes.html) indicates that a type may safely be converted to a byte sequence
- [`Unaligned`](https://docs.rs/zerocopy/latest/zerocopy/trait.Unaligned.html) indicates that a type’s alignment requirement is 1 
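
To make the difference between `TryFromBytes` and `FromBytes` concrete, here is a minimal std-only sketch. The helper names `try_bool_from_byte` and `u32_from_bytes` are hypothetical and are not part of zerocopy's API; they only mirror the properties the traits describe.

```rust
/// Hypothetical helper (not zerocopy API): the kind of runtime validity check
/// that `TryFromBytes` describes. Only the bit patterns 0 and 1 are valid
/// for `bool`, so the conversion can fail.
fn try_bool_from_byte(byte: u8) -> Option<bool> {
    match byte {
        0 => Some(false),
        1 => Some(true),
        _ => None, // any other bit pattern is not a valid `bool`
    }
}

/// Hypothetical helper (not zerocopy API): every 4-byte pattern is a valid
/// `u32`, which is the property `FromBytes` describes, so no check is needed.
fn u32_from_bytes(bytes: [u8; 4]) -> u32 {
    u32::from_ne_bytes(bytes)
}
```

A type like `bool` can therefore support only the fallible conversion, while `u32` supports the infallible one, including from all-zero bytes.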

When a user derives one or more of these traits for their types, zerocopy must prove that the properties associated with the traits actually hold. It does so in two stages. First, zerocopy analyzes the syntax tree of the type definition. If any required elements are missing (e.g., the type is not annotated with an appropriate `#[repr(...)]`), zerocopy produces an error that halts compilation.

Otherwise, zerocopy proceeds to emit both the requisite trait implementation and a type-level proof of soundness. For instance, for a type to be soundly `FromBytes`, each of its fields must also be `FromBytes`. Zerocopy emits code that, at type-checking time, asserts that each field implements `FromBytes`.
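
As an illustration of such a type-checking-time assertion, the sketch below guards a trait implementation with one `where` bound per field type. The trait `AllBitsValid` and the struct `Header` are hypothetical stand-ins, and the code is an assumption about the general shape of the emitted code, not zerocopy's actual output.

```rust
/// Hypothetical stand-in for a trait like `FromBytes`; not the real trait.
unsafe trait AllBitsValid {}

// Primitive types for which every bit pattern is valid.
unsafe impl AllBitsValid for u8 {}
unsafe impl AllBitsValid for u32 {}

#[repr(C)]
struct Header {
    tag: u8,
    len: u32,
}

// What a derive might emit: the impl carries one bound per field type.
// If a field type did not implement `AllBitsValid` (e.g. `bool`), the
// corresponding bound would fail and compilation would stop with an error.
unsafe impl AllBitsValid for Header
where
    u8: AllBitsValid,
    u32: AllBitsValid,
{
}

/// Compiles only when `T: AllBitsValid`, so calling it is itself the check.
fn is_all_bits_valid<T: AllBitsValid>() -> bool {
    true
}
```

Changing the type of `tag` to `bool` (and the matching bound to `bool: AllBitsValid`) would turn this into a compile error, which is the behavior the UI tests look for.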

We currently use a small number of UI tests (using the `trybuild` crate) to assure ourselves that these analyses are correct. For each test, we craft a stand-alone Rust file that contains an *unsound* `derive` for a hand-written type definition. Our testing harness compiles each of these files, and confirms that the expected compilation error is produced.

For code that is known to compile, we also use [`miri`](https://github.com/rust-lang/miri), a Rust interpreter, to run the code and detect undefined behavior.

## Motivation
Zerocopy's current UI testing approach offers a high degree of control (e.g., we are able to track minute changes in error messages), but only with a large amount of labor. It is sufficiently difficult to create and maintain these tests that zerocopy does not have many of them.

Also, as with any codebase, zerocopy's UI tests cover only the error conditions that have occurred to us. As a result, some error conditions go untested, and bugs that more thorough testing could have caught slip through (see, e.g., https://github.com/google/zerocopy/pull/672).

To remedy this, we would like to augment our small set of fine-grained, hand-written UI tests with a large, dynamically-generated set of coarse-grained UI tests.

## Design
We would like to write fuzz tests using the [`cargo-fuzz`](https://github.com/rust-fuzz/cargo-fuzz) testing framework. A zerocopy fuzz test will randomly generate a Rust datatype, derive zerocopy traits for that datatype, and then use `miri` to run methods from those traits. The test passes if this process either produces a compile error or runs successfully under `miri`. It fails if `miri` detects undefined behavior, which would indicate that the derive is unsound.
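
The pass/fail rule above could be implemented along the following lines. This is a sketch under stated assumptions: `Outcome`, `case_passes`, and `run_case` are hypothetical names, and `project_dir` is assumed to be a scratch Cargo project that already depends on zerocopy and contains the generated program as `src/main.rs`.

```rust
use std::io;
use std::path::Path;
use std::process::Command;

/// Hypothetical classification of a single fuzz case.
#[derive(Debug, PartialEq)]
enum Outcome {
    CompileError,
    MiriPass,
    MiriFail,
}

/// A case passes unless the program compiled and Miri then reported a failure.
fn case_passes(outcome: &Outcome) -> bool {
    !matches!(outcome, Outcome::MiriFail)
}

/// Type-checks the project, then runs it under Miri if it compiled.
/// Assumes `cargo` and the Miri component are installed.
fn run_case(project_dir: &Path) -> io::Result<Outcome> {
    let check = Command::new("cargo")
        .arg("check")
        .current_dir(project_dir)
        .output()?;
    if !check.status.success() {
        return Ok(Outcome::CompileError);
    }

    // Limitation of this sketch: any non-zero exit from `cargo miri run`
    // is treated as a Miri failure, including errors that `cargo check`
    // does not surface.
    let miri = Command::new("cargo")
        .args(["miri", "run"])
        .current_dir(project_dir)
        .output()?;
    Ok(if miri.status.success() {
        Outcome::MiriPass
    } else {
        Outcome::MiriFail
    })
}
```

Separating the classification (`Outcome`) from the verdict (`case_passes`) keeps the rule itself testable without invoking `cargo`.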

A sample `cargo-fuzz` fuzzing harness might look like this:
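```rust
#![no_main]

use arbitrary_typedef::AdtDef;
use libfuzzer_sys::fuzz_target;

fuzz_target!(|adt_def: AdtDef| {
    // Assumptions: `AdtDef` implements `Display`, rendering as a Rust type
    // definition, and `name()` is a hypothetical accessor for the name of
    // the generated type.
    let name = adt_def.name();
    // Literal braces must be doubled inside `format!`.
    assert!(compile_error_or_miri_success(format!(r#"
        use zerocopy::FromZeros;

        #[derive(FromZeros)]
        {adt_def}

        fn main() {{
            let _value: {name} = FromZeros::new_zeroed();
        }}
    "#)));
});
```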
For this, we need to define:
1. `arbitrary_typedef::AdtDef`, a type that abstractly describes a data type definition and implements [`Arbitrary`](https://docs.rs/arbitrary/1.3.1/arbitrary/trait.Arbitrary.html), allowing `AdtDef` values to be generated automatically.
2. `compile_error_or_miri_success`, a function that compiles its argument and returns `true` if compilation fails; otherwise, it runs the program under `miri` and returns `true` if `miri` reports no failure, or `false` if it does.

The first item is the primary research challenge: How do we randomly generate interesting (compositions of) Rust datatypes? 
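
As a starting point for that question, here is a minimal std-only sketch of a byte-driven generator. Everything in it is hypothetical: the shape of `AdtDef`, the `FieldTy` and `Repr` enums, the fixed type name `Generated`, and the `ByteSource` helper, which stands in for the `Unstructured` input that a real `Arbitrary` implementation would consume. It covers only flat structs with primitive fields, not compositions of types.

```rust
use std::fmt;

/// Hypothetical set of primitive field types.
#[derive(Debug, Clone, Copy, PartialEq)]
enum FieldTy { U8, U16, U32, Bool }

impl FieldTy {
    fn as_str(self) -> &'static str {
        match self {
            FieldTy::U8 => "u8",
            FieldTy::U16 => "u16",
            FieldTy::U32 => "u32",
            FieldTy::Bool => "bool",
        }
    }
}

/// Hypothetical set of `#[repr(...)]` choices.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Repr { C, Packed }

/// Hypothetical abstract description of a struct definition.
struct AdtDef {
    name: String,
    repr: Option<Repr>,
    fields: Vec<FieldTy>,
}

/// Stand-in for fuzzer input: yields bytes, then zeros once exhausted.
struct ByteSource<'a> { bytes: &'a [u8], pos: usize }

impl<'a> ByteSource<'a> {
    fn new(bytes: &'a [u8]) -> Self { ByteSource { bytes, pos: 0 } }

    fn next_byte(&mut self) -> u8 {
        let byte = self.bytes.get(self.pos).copied().unwrap_or(0);
        self.pos += 1;
        byte
    }
}

impl AdtDef {
    /// Consumes one byte for the repr, one for the field count (0 to 3),
    /// and one per field for its type.
    fn generate(src: &mut ByteSource<'_>) -> AdtDef {
        let repr = match src.next_byte() % 3 {
            0 => None,
            1 => Some(Repr::C),
            _ => Some(Repr::Packed),
        };
        let field_count = (src.next_byte() % 4) as usize;
        let fields = (0..field_count)
            .map(|_| match src.next_byte() % 4 {
                0 => FieldTy::U8,
                1 => FieldTy::U16,
                2 => FieldTy::U32,
                _ => FieldTy::Bool,
            })
            .collect();
        AdtDef { name: "Generated".to_string(), repr, fields }
    }
}

/// Renders the description as Rust source for interpolation into a test program.
impl fmt::Display for AdtDef {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self.repr {
            Some(Repr::C) => writeln!(f, "#[repr(C)]")?,
            Some(Repr::Packed) => writeln!(f, "#[repr(C, packed)]")?,
            None => {}
        }
        writeln!(f, "struct {} {{", self.name)?;
        for (i, ty) in self.fields.iter().enumerate() {
            writeln!(f, "    f{}: {},", i, ty.as_str())?;
        }
        write!(f, "}}")
    }
}
```

Because generation is a pure function of the input bytes, any failing case can be replayed from the bytes the fuzzer saved. Generating enums, unions, generics, and nested types is where the open research questions begin.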

## Related Work

- PLT Redex's [`generate-term`](https://docs.racket-lang.org/redex/reference.html#%28form._%28%28lib._redex%2Freduction-semantics..rkt%29._generate-term%29%29) randomly generates a programming language 'term' of a given size.




