Benjamin Sago / ogham / cairnrefinery / etc…

Run cargo-fuzz in a RAM disk

Fuzzing, or fuzz testing, is a nice way to find unexpected bugs and edge cases in your code, and cargo-fuzz is a great way to set this up for Rust. If you have a non-trivial function that takes an array of bytes or other structured user input data as a parameter, I can’t recommend fuzzing enough. cargo-fuzz has found over a dozen crashes in dog’s DNS parser and What-The-Format’s input parser, where I made incorrect assumptions when dealing with user input.

The downside of fuzzing is that it is incredibly computationally intensive. Not only does fuzzing typically gobble up all your memory and peg all your CPU cores to 100%, it also stores its “corpus” (the set of inputs that get mutated) on disk, and that corpus can easily grow to tens of thousands of files.

This is mentioned in the README of AFL, another widely used fuzzer (cargo-fuzz itself drives libFuzzer, but the same concern applies):

Fuzzing involves billions of reads and writes to the filesystem. On modern systems, this will be usually heavily cached, resulting in fairly modest “physical” I/O - but there are many factors that may alter this equation. It is your responsibility to monitor for potential trouble; with very heavy I/O, the lifespan of many HDDs and SSDs may be reduced.

Furthermore, while it’s sometimes necessary to keep these files around, I find that I usually don’t need them after I’m done fuzzing the current version of my code. More than once I’ve forgotten to exclude the corpus/ directory from backups, only to find myself backing up a bajillion tiny files I’m never going to use again.

So I wrote a script to run fuzzing in a RAM disk, saving the life of my SSD. Here it is, in its entirety:

fuzzdisk.sh
#!/bin/bash -e

# Size of the RAM disk in 512-byte blocks (2048 blocks = 1 MiB, so this is 512 MiB)
SIZE=1048576

# Name of the RAM disk
VOLUME_NAME="fuzzdisk"

# The mount point where the RAM disk will appear
VOLUME_PATH="/Volumes/$VOLUME_NAME"

# Some pre-flight checks
if test "$(pwd)" = "$HOME"; then
    echo "[Refusing to run in home directory!]" >&2
    exit 1
fi

if test -d "$VOLUME_PATH"; then
    echo "[There is already a fuzz disk mounted!]" >&2
    exit 1
fi

# Create, format, and mount the RAM disk
# hdiutil prints the device node followed by padding; keep only the first field
DISK="$(hdiutil attach -nomount "ram://$SIZE" | awk '{print $1}')"
echo "[Created $DISK]"
diskutil erasevolume HFS+ "$VOLUME_NAME" "$DISK"
echo "[Created filesystem]"

# Copy the files over
rsync . "$VOLUME_PATH" \
    --recursive \
    --chmod Fa-w \
    --exclude-from ".gitignore" \
    --exclude-from "$HOME/.gitignore-global"
echo "[Copied over files]"

# Spawn subshell and wait for it to exit
export RAM_DISK="$VOLUME_PATH"
cd "$VOLUME_PATH"
"$SHELL"

# Leave the disk directory so it’s no longer being used
cd

# Dispose of the RAM disk
echo "[Unmounting RAM disk...]"
sleep 1
diskutil unmountDisk "$DISK"
echo "[Ejecting RAM disk...]"
diskutil eject "$DISK"

As you can see from the use of hdiutil and diskutil, it’s macOS-only (sorry about that), but it should be easily adaptable to other OSes.
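
As a rough idea of what a Linux adaptation could look like, here is a minimal sketch that uses a directory under /dev/shm instead of a mounted disk image. It assumes /dev/shm is a tmpfs mount (true on most distributions, but worth checking on yours), and the function names and the FUZZDISK_BASE variable are made up for illustration. It differs from the script above in two ways: it cleans up with a trap on exit, and it only passes --exclude-from for a .gitignore that actually exists. The available space is whatever limit the tmpfs mount has, rather than a size chosen by the script.

```shell
#!/bin/bash
set -euo pipefail

# Assumption: /dev/shm is RAM-backed (tmpfs). FUZZDISK_BASE is a
# hypothetical override for systems where it lives somewhere else.
FUZZDISK_BASE="${FUZZDISK_BASE:-/dev/shm}"

# Create a private scratch directory in the RAM-backed filesystem
# and print its path.
create_fuzzdisk() {
    mktemp -d "$FUZZDISK_BASE/fuzzdisk.XXXXXX"
}

# Copy the source tree at $1 into the scratch directory at $2, making
# regular files read-only, as the macOS script does.
copy_sources() {
    local src="$1" dest="$2"
    local args=(--recursive --chmod=Fa-w)
    # Only use the ignore file if there is one, so rsync does not fail.
    if [ -f "$src/.gitignore" ]; then
        args+=(--exclude-from "$src/.gitignore")
    fi
    rsync "${args[@]}" "$src/" "$dest/"
}

# Remove the scratch directory. Read-only files are still deletable
# because their parent directories stay writable.
destroy_fuzzdisk() {
    rm -rf -- "$1"
}

# Tie it together: create, copy, spawn a sub-shell, then clean up.
fuzzdisk_main() {
    FUZZDISK_DIR="$(create_fuzzdisk)"
    # Clean up even if a later step fails.
    trap 'destroy_fuzzdisk "$FUZZDISK_DIR"' EXIT
    copy_sources "$PWD" "$FUZZDISK_DIR"
    # The sub-shell runs inside the scratch directory; the parent shell
    # never changes directory, so nothing holds the directory open.
    (cd "$FUZZDISK_DIR" && RAM_DISK="$FUZZDISK_DIR" "${SHELL:-/bin/bash}") || true
}
```

Calling fuzzdisk_main from the root of a project would then play the same role as running fuzzdisk.sh on macOS.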

I use it by running the script in the root directory of my source code. It starts by creating a RAM disk and copying the files over using rsync, making them non-writable so you don’t accidentally modify the in-memory copy of your code instead of the real one. It then spawns a sub-shell in which I’m free to run any commands, including cargo fuzz to do the fuzzing. When I’m done, I can simply exit the sub-shell (or hit Control-D) and the RAM disk will be unmounted and removed.
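
One thing to keep in mind is that everything on the RAM disk disappears when the sub-shell exits, including any crashes the fuzzer found. The following is a small sketch of a helper for copying them out first. It assumes cargo-fuzz’s default layout, where crashing inputs are written under fuzz/artifacts/, and it relies on the RAM_DISK variable that the script exports. The save_artifacts name is hypothetical and not part of fuzzdisk.sh.

```shell
# Copy crash artifacts out of the RAM disk before leaving the sub-shell.
# $1 is a destination directory on a real disk.
save_artifacts() {
    local dest="$1"
    # RAM_DISK is exported by fuzzdisk.sh; bail out if it is not set.
    local src="${RAM_DISK:?not inside a fuzzdisk sub-shell}/fuzz/artifacts"
    if [ ! -d "$src" ]; then
        echo "[No artifacts to save]"
        return 0
    fi
    mkdir -p "$dest"
    # Copy the contents of the artifacts directory, keeping one
    # sub-directory per fuzz target.
    cp -R "$src/." "$dest/"
    echo "[Saved artifacts to $dest]"
}
```

With this defined in a shell startup file, a session might consist of running the script, running cargo fuzz against a target, calling save_artifacts with a directory of your choice, and then exiting the sub-shell.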