Releasing jsfunfuzz and DOMFuzz

July 28th, 2015

Today I'm releasing two fuzzers: jsfunfuzz, which tests JavaScript engines, and DOMFuzz, which tests layout and DOM APIs.

Over the last 11 years, these fuzzers have found 6450 Firefox bugs, including 790 bugs that were rated as security-critical.

I had to keep these fuzzers private for a long time because of the frequency with which they found security holes in Firefox. But three things have changed that have tipped the balance toward openness.

First, each area of Firefox has been through many fuzz-fix cycles. So now I'm mostly finding regressions in the Nightly channel, and the severe ones are fixed well before they reach most Firefox users. Second, modern Firefox is much less fragile, thanks to architectural changes to areas that once oozed with fuzz bugs. Third, other security researchers have noticed my success and demonstrated that they can write similarly powerful fuzzers.

My fuzzers are no longer unique in their ability to find security bugs, but they are unusual in their ability to churn out reliable, reduced testcases. Each fuzzer alternates between randomly building a JS string and then evaling it. This construction makes it possible to make a reproduction file from the same generated strings. Furthermore, most DOMFuzz modules are designed so their functions will have the same effect even if other parts of the testcase are removed. As a result, a simple testcase reduction tool can reduce most testcases from 3000 lines to 3-10 lines, and I can usually finish reducing testcases in less than 15 minutes.
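The generate-then-eval loop and the reduction step can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the actual jsfunfuzz or DOMFuzz code: the names `makeRng`, `makeStatement`, `fuzz`, `replay`, and `reduceLines` are hypothetical, `new Function` stands in for eval, and the greedy line-removal loop is only one simple way to reduce a testcase.

```javascript
// Deterministic PRNG (a 32-bit linear congruential generator) so runs are repeatable.
function makeRng(seed) {
  let state = seed >>> 0;
  return function next() {
    state = (Math.imul(state, 1664525) + 1013904223) >>> 0;
    return state / 4294967296; // in [0, 1)
  };
}

function pick(rng, items) {
  return items[Math.floor(rng() * items.length)];
}

// Build one random statement as a string. Each statement only touches `env`,
// so it keeps its meaning when neighbouring statements are removed.
function makeStatement(rng) {
  const target = pick(rng, ["env.a", "env.b", "env.c"]);
  const op = pick(rng, ["=", "+="]);
  const value = pick(rng, ["1", "'x'", "[]", "env.a", "env.b"]);
  return `${target} ${op} ${value};`;
}

// Run one statement string against `env`; new Function stands in for eval here.
function run(env, source) {
  try {
    new Function("env", source)(env);
  } catch (e) {
    // Generated code may throw; that is expected and ignored in this sketch.
  }
}

// Alternate between generating a string and running it, logging every string.
function fuzz(seed, count) {
  const rng = makeRng(seed);
  const env = {};
  const log = [];
  for (let i = 0; i < count; i++) {
    const source = makeStatement(rng);
    log.push(source);
    run(env, source);
  }
  return { env, log };
}

// Replaying the log (the "reproduction file") rebuilds the same state.
function replay(lines) {
  const env = {};
  for (const line of lines) run(env, line);
  return env;
}

// Greedy line-based reduction: drop any line whose removal keeps the testcase interesting.
function reduceLines(lines, isInteresting) {
  let current = lines.slice();
  let changed = true;
  while (changed) {
    changed = false;
    for (let i = 0; i < current.length; i++) {
      const candidate = current.slice(0, i).concat(current.slice(i + 1));
      if (isInteresting(candidate)) {
        current = candidate;
        changed = true;
        i--; // the next line has shifted into position i
      }
    }
  }
  return current;
}
```

Because the log holds exactly the strings that were run, `replay(fuzz(42, 50).log)` ends in the same state as the original run. `reduceLines` can then shrink that log against any predicate, such as "the replay still triggers the failure".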

The ease of getting reduced testcases lets me afford to report less severe bugs. Occasionally, one of these turns out to be a security bug in disguise. But most importantly, these bug reports help me establish positive relationships with Firefox developers, by frequently saving them time.

A JavaScript engine developer can easily spend a day trying to figure out why a web site doesn't work in Firefox. If instead I can give them a simple testcase that shows an incorrect result with a new JS optimization enabled, they can quickly find the source of the bug and fix it. Similarly, they much prefer reliable assertion testcases over bug reports saying "sometimes, Google Maps crashes after a while".

As a result, instead of being hostile to fuzzing, Firefox developers actively help me fuzz their code. They've added numerous assertions to their code, allowing fuzzers to notice as soon as the smallest thing goes wrong. They've fixed most of the bugs that impede fuzzing progress. And several have suggested new ways to test their code, even (especially) ways that scare them.

Developers working on the JavaScript engine have been especially helpful. First, they ensured I could test their code directly, apart from the rest of the browser. They already had a JavaScript shell for running regression tests, and they added a --fuzzing-safe option to disable the more dangerous testing functions.

The JS team also created a large set of testing functions to let me control things that would normally be based on heuristics. Fuzzers can now choose when garbage collection happens and even how much. They can make expensive JITs kick in after 2 loop iterations rather than 100. Fuzzers can even simulate out-of-memory conditions. All of these things make it possible to create small, reliable testcases for nasty classes of bugs.

Finally, the JS team has supported differential testing, a form of fuzzing where output is checked for correctness against some oracle. In this case, the oracle is the same JavaScript engine with most of its optimizations disabled. By fixing inconsistencies quickly and supporting --enable-more-deterministic, they've ensured that differential testing doesn't get stuck finding the same problems repeatedly.
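The shape of a differential test can be shown with a small in-process sketch. The real setup compares two configurations of the same JavaScript shell; here, as a simplifying assumption, two hypothetical functions (`referenceAbs` as the oracle and `fastAbs` as the "optimized" version with a deliberately planted limitation) stand in for them.

```javascript
// Reference implementation: straightforward and trusted (the "oracle").
function referenceAbs(x) {
  return Math.abs(x);
}

// "Optimized" implementation: a bit trick that is only valid for 32-bit integers.
function fastAbs(x) {
  const mask = x >> 31;
  return (x ^ mask) - mask;
}

// Run both implementations on every input and collect the disagreements.
function differentialTest(inputs, reference, optimized) {
  const mismatches = [];
  for (const input of inputs) {
    const expected = reference(input);
    const actual = optimized(input);
    if (!Object.is(expected, actual)) {
      mismatches.push({ input, expected, actual });
    }
  }
  return mismatches;
}

const mismatches = differentialTest([0, 5, -5, 1.5], referenceAbs, fastAbs);
console.log(mismatches);
```

With these inputs the only disagreement is for 1.5, where the bit trick truncates to 1. The harness never needs to know what the right answer is, only that the two implementations disagree.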

Andreas Gal, a developer working on Firefox's JavaScript engine, once commented on Bugzilla: 'From this day forward, I shall never write a JIT again without Jesse.'

Please join us on IRC, or just dive in and contribute! Your suggestions and patches can have a large impact: fuzzer modules often act together to find complex interactions within the browser. For example, bug 893333 was found by my designMode module interacting with a <table> module contributed by a Firefox developer, Mats Palmgren. Likewise, bug 1158427 was found by Christoph Diehl's WebAudio module combined with my reflection-based API-discovery modules.

To the next 6450 browser bug fixes!

Fuzzers love assertions

February 3rd, 2014

Fuzzers make things go wrong.
Assertions make sure we find out.

Assertions can improve code quality in many ways, but they truly shine when combined with fuzzing. Fuzzing is normally limited to finding obvious symptoms like crashes, because it's rare to be able to tell correct behavior from incorrect behavior when the input is generated randomly. Assertions expand the scope of fuzzing to include everything they check.

Assertions can even help find crash bugs: some bugs are relatively easy for fuzzers to trigger, but only lead to crashes when additional conditions are met. A well-placed assertion can let us know every time we trigger the bug.

Fuzzing JS and DOM has found about 4000 assertion bugs, including about 300 security bugs.

Asserting safe use of generic data structures

Assertions in widely-used data structures can find bugs in many callers.

  • Array indices must be within bounds. This simple precondition assert in nsTArray has caught about 90 bugs.
  • Hash tables must not be modified during enumeration. If the modification happened to resize the hash table, it would leave stack pointers dangling. This PLDHashTable assertion has caught over 50 bugs.
  • Cached values should not be out of date. When a cache's get method takes a key and a closure for computing values in the case of a cache miss, debug builds can check whether the cached values are still correct. This is effectively a form of differential testing that notices bugs in cache-invalidation logic.
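
The cache check in the last item can be sketched as follows. This is an illustration in JavaScript rather than Gecko's C++, and `CheckedCache` and its `debug` flag are hypothetical names: on a cache hit in debug mode, the value is recomputed and compared with what was stored.

```javascript
// A cache whose get() takes a key and a closure for computing missing values.
// In debug mode, every hit is re-verified against a fresh computation.
class CheckedCache {
  constructor(debug) {
    this.map = new Map();
    this.debug = debug;
  }

  get(key, compute) {
    if (this.map.has(key)) {
      const cached = this.map.get(key);
      if (this.debug) {
        const fresh = compute(key);
        if (!Object.is(fresh, cached)) {
          throw new Error(`stale cache entry for key ${String(key)}`);
        }
      }
      return cached;
    }
    const value = compute(key);
    this.map.set(key, value);
    return value;
  }

  // Callers must invalidate when the underlying data changes.
  invalidate(key) {
    this.map.delete(key);
  }
}
```

If some code path changes the underlying data and forgets to call `invalidate`, the next `get` in a debug build throws instead of silently returning the stale value.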

Asserting module invariants

When an entire module must maintain an invariant, a single assertion can catch dozens of bugs.

Making the frame arena safer

Gecko's CSS box objects, called "frames", are created and destroyed manually. They are allocated within an arena to reduce malloc overhead and fragmentation. The arena also made it possible to reduce the risk associated with manual memory management. A combination of assertions (in debug builds) and runtime mitigations (in all builds) mitigates dangling pointer bugs that involve frames.

Requests for Gecko developers

Please add assertions, especially when:

  • A bug would be a security hole
  • Crashing is not guaranteed
  • Many callers must fulfill a precondition
  • Complex, extensive code must maintain an invariant

Also consider:

Customizing the Mozilla Manifesto

December 17th, 2013

I have mixed feelings about requiring Mozillians to “agree” to the Mozilla Manifesto. I get the impression that many volunteers aren’t fond of “commercial involvement” (9). Firefox development often does not live up to the ideals of absolute security (4) or transparency (8), so we’d be asking new contributors to commit to behavior for which they may have little support.

Meanwhile, the manifesto is oddly silent on two issues that many Mozillians care about deeply. First, it says little about privacy. “Shaping your own experience on the Internet” (5) suggests control over customized ads, but not control over tracking by advertisers or governments.

Second, the manifesto does not adequately address removing barriers to contribution or promoting inclusiveness in community processes. The relevant principles (6, 8) are worded as vague beliefs rather than strong values. Compare with my favorite part of the Ada Initiative FAQ:

“Open technology and culture are shaping the future of global society. If we want that society to be socially just and to serve the interests of all people, [all kinds of people] must be involved in its creation and organization.”

Rather than asking each Mozillian to agree to the entire manifesto, let’s instead encourage everyone to Likert the 10 existing principles and add a few of their own.

Indicating how you feel about each principle is more memorable than clicking “Agree” once. Each Mozillian would have a personal version of the Manifesto to remind them what drives them to contribute. Such a survey could also lead to better understanding of the community and suggest improvements to the Manifesto.

Mobile apps for car-free living

April 16th, 2012
A man swings through the aisle of a train.

Each of these apps makes transit more efficient or convenient. Together, they can do something almost magical: make transit attractive to urbanites who previously saw owning a car as a necessity.

Planning your trips

These apps try to find the best way to reach your destination by combining timetables from multiple transit agencies:

Google Maps shows your current location along with walking, transit, or driving directions. In the iPhone app, you can double-tap the locator button to align the map with the iPhone's compass.

HopStop (iOS, Android) lets you specify whether you prefer trains or buses, and whether you prefer walking or waiting for a transfer. It shows a zoomed-in map for each transfer.

Reroute.it lets you quickly compare modes of transportation before getting directions.

Catching your ride

Routesy, Nextime, and Nextbus use real-time transit data to help you make quick decisions on familiar routes. For example, you'll know when to walk to your stop, when to run, and when to wait inside.

Not missing your stop

A location-based alarm, such as Get Off Now or GPSAlarms, can allow you to nap, read, or work without worrying about missing your stop.

These apps can run in the background and have surprisingly little effect on battery life. They use power-hungry GPS only when cell/wifi location data indicates that you are somewhat close.

Staying productive and entertained

One of the biggest advantages of public transportation is being able to get things done while in transit. Some people check email, watch TV shows, or even order from Chipotle using their phones.

I often use time on the train to read articles. Whenever I find myself with too many Wikipedia tabs open, I send them to my phone using the Instapaper or Spool bookmarklet. Sometimes I read books on my phone using the Amazon Kindle app.

Getting a car when you need one

The Zipcar app lets you borrow cars from Zipcar locations, while Getaround lets you borrow cars from awesome neighbors.

Or you can pay for a ride using Taxi Magic or Uber.

More reading

Some transit authorities recommend apps for their cities: San Francisco, New York, Chicago, Seattle, and Portland, Oregon.

In my next posts, I'll list my ideas for new transit apps and explain how platforms could better support location-aware apps.

Fuzzing for consistent rendering

March 3rd, 2012

My DOM fuzzer can now find bugs where the layout of a DOM tree depends on its history.

In this example, forcing a re-layout swapped a “1” and “3” on the screen. My fuzzer didn’t know which rendering was correct, but it could tell that Firefox was being inconsistent.

Initial DOM tree:
  • DIV
    • ت
    • SPAN
      • 1
      • SPAN
      • 3
Rendering: 31ت

Random change (remove the inner span):
  • DIV
    • ت
    • SPAN
      • 1
      • 3
Rendering: 31ت

Force re-layout:
  • DIV
    • ت
    • SPAN
      • 1
      • 3
Rendering: 13ت

Gecko developer Simon Montagu quickly determined that 13ت is the correct rendering and attached a patch. Later, when a user reported that the bug affected Persian comments on Facebook, we were able to backport Simon’s fix to Firefox 11.

How it works

The fuzzer starts by making random dynamic changes to a page. Then it compares two snapshots: one taken immediately after the dynamic changes, and another taken after also forcing a relayout.

To force a relayout, it removes the root from the document and then adds it back:

  // Detach the root element, then reattach it, so layout is rebuilt from scratch.
  var r = document.documentElement;
  document.removeChild(r);
  document.appendChild(r);

Like reftest, it uses drawWindow() to take snapshots and compareCanvases() to compare them.
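
The compare-before-and-after-relayout idea can be modeled without a browser. In this sketch, `ToyEngine` is a hypothetical stand-in for the layout engine, with a planted bug in its incremental update; its string snapshots play the role of the canvas snapshots, and `checkConsistency` is an assumed name for the comparison step.

```javascript
// Hypothetical toy "engine": keeps a cached rendering that it updates incrementally.
class ToyEngine {
  constructor() {
    this.items = [];
    this.cached = "";
  }
  append(text) {
    this.items.push(text);
    this.cached += text;
  }
  remove(text) {
    this.items = this.items.filter((item) => item !== text);
    // Planted bug: the cached rendering is not updated on removal.
  }
  snapshot() {
    return this.cached;
  }
  // Throw away incremental state and rebuild the rendering from the current tree.
  forceRelayout() {
    this.cached = this.items.join("");
  }
}

// Apply the changes, snapshot, force a relayout, snapshot again, and compare.
// Returns null when consistent, or both snapshots when they differ.
function checkConsistency(engine, changes) {
  for (const change of changes) change(engine);
  const afterChanges = engine.snapshot();
  engine.forceRelayout();
  const afterRelayout = engine.snapshot();
  return afterChanges === afterRelayout ? null : { afterChanges, afterRelayout };
}
```

As in the real fuzzer, the check does not say which snapshot is correct. It only reports that the result depends on the history of changes.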

In theory, I could also look for bugs where dynamic changes do not repaint enough of the window. But I've been told that testing for painting invalidation bugs is tricky, so I'll wait until most of the layout bugs are fixed.

Exceptions

Since the testcases are random, I have to be heavy-handed in ignoring known bugs. If I file a rendering bug where the weirdest part of the testcase is floats, I'll have the fuzzer ignore inconsistent rendering in testcases with floats until the bug is fixed.
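
One simple way to express such an ignore rule is a list of patterns matched against the testcase source. The sketch below is an assumption about how this could look, not the fuzzer's actual mechanism; `knownIssues`, `matchingKnownIssues`, and `shouldReport` are hypothetical names, and the float pattern is only an example.

```javascript
// Hypothetical ignore list: each entry names a feature tied to an open bug.
const knownIssues = [
  { feature: "floats", pattern: /float\s*:\s*(left|right)/i },
];

// List the known-buggy features that a testcase uses.
function matchingKnownIssues(testcaseSource) {
  return knownIssues
    .filter((issue) => issue.pattern.test(testcaseSource))
    .map((issue) => issue.feature);
}

// Report an inconsistency only if the testcase avoids every known-buggy feature.
function shouldReport(testcaseSource) {
  return matchingKnownIssues(testcaseSource).length === 0;
}
```

The rule is deliberately coarse: it also hides new bugs that happen to involve floats, which is the cost of being heavy-handed. Removing the entry once the bug is fixed restores coverage.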

The current list of exceptions is fairly large and includes key web technologies:

Renting movies is hard

March 1st, 2012

None of the major video rental systems appeal to me:

The iTunes Store mostly works for my current set of devices, but all the movies I want to watch are either too new or too obscure for them to have rentals available.

Maybe I should sign up for Netflix but use other means to actually watch movies. At least then Hollywood will have enough money to make good films, buy politicians, print and distribute billions of optical discs, prevent paying customers from exercising their fair use rights, and sue my neighbors.

I dream of Alpha

February 5th, 2012

This museum’s rooms are empty, waiting to be filled with answers to visitors’ questions.

-

In my search for nutrition, have I overlooked some fruit that I might find convenient and delicious? I start by trying to find out what’s popular throughout the world.

What fruits are liked by the most people? Human thoughts are not my forte.

What fruits are eaten the most? I get an answer, but not in the chart form I expected.

A row of fruit appears on the floor. The larger ones are shown both whole and sliced. Does the five-second rule apply to food that suddenly appeared on the floor, or only to food that has been dropped? Am I looking at holograms?

A bigger problem is that the list is dominated by small fruits like berries. I don’t like berries.

What fruits are eaten the most, by weight? Insufficient data.

I probe, using simpler questions, to figure out what it knows. What’s the weight of an apple? 180 grams. What’s the total weight of apples eaten in a year? Insufficient data.

I guess I have to be explicit if I want it to combine its weight and consumption data.

For each fruit for which you have sufficient data, chart the number eaten in a year, the average weight, and the product of the two.

I don’t get an answer right away. Is it just taking a while? Did I mangle the question, causing it to make a chart that is invisible because it has no entries? Did I confuse it with the phrase “the product of the two”?

-

Two women are debating the merits of bananas. In this place, they aren’t limited to speculation. Can you chart fruit by potassium per Calorie? Vitamin B6 per dollar? It helpfully highlights the “banana” row in each chart.

They explore the supply side as well. Show me maps of where bananas are grown. Can you add a yearly animation with harvests shown as glowing dots? Draw a chart with axes for temperature and latitude, colored to show how well bananas grow in each condition.

I start thinking of my own questions, but I don’t expect it to be able to answer them. How do most people open bananas? How many bananas are used in recipes rather than eaten directly?

How many bananas are used as sex toys? Oops, did I ask that out loud?

It doesn’t even acknowledge my question, but one of the women retorts with a question of her own.

What percent of the time are men thinking about sex? Human thoughts are not my forte.

-

When I wake up, it’s still dark outside.

Today, the closest thing to the museum of my dream is a web site called Wolfram Alpha. It can chart many things. But it requires us to phrase questions carefully, and sometimes it simply misinterprets queries.

As for fruit? Wolfram Alpha has consumption data for some fruit. But some fruit is missing, and some fruit confuses it.

I start writing this post while eating the last two apples from my fridge.

I go back to bed, hoping for additional pleasant dreams.

Lessons from JS engine bugs

September 1st, 2011

Last week, I asked Luke Wagner to explain some security bugs that he fixed in the past. I hoped to learn from each bug at multiple levels, in ways that could help prevent future security bugs from arising and persisting.

Luke is one of the developers working on Firefox's JavaScript engine, which is currently our largest source of critical security bugs.

Method

I imagined we would recurse in exhaustive breadth and exhausting depth. Instead, we recursed only on the most interesting items, and refined a checklist of starting points:

  • What was the bug?
  • What went wrong in the developer's thinking that caused the bug to be introduced?
  • What made the bug exploitable?
  • What caused us to use especially dangerous features of C++?
  • Could a new abstraction make it possible to do this both fast and safe?
  • What caused the bug to persist? Could we have caught this earlier with improved regression tests, fuzz testing, dynamic analysis, or static analysis?

Luke and I made trees for all ten bugs, at first on paper and later using EtherPad. Then I extracted and categorized what I thought were the most useful lessons and recommendations.

Recommendations for introducing fewer bugs

Casts

  • Create centralized, type-restricted cast functions. This protects you when you change the representation of one of the types. It also protects against mistakes that cause the input type to be incorrect.

Sentinel values

  • Use tagged unions instead.
  • Use a typed wrapper (a struct containing a single value). When assigning from the underlying numeric type, convert using one of two functions: one that checks for special values, and one that explicitly does not.
  • Audit existing code paths to ensure they cannot generate the special value.
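
The typed-wrapper recommendation is about C++, but its shape can be shown in JavaScript. In this sketch everything is hypothetical: `SlotIndex` wraps a single number, -1 is assumed to be the sentinel, and the two conversion functions are the checking one and the one that explicitly does not check.

```javascript
// Hypothetical sentinel: -1 means "no slot" in the raw numeric representation.
const NO_SLOT_RAW = -1;

// Wrapper holding a single value, so raw numbers cannot be passed by accident.
class SlotIndex {
  constructor(raw) {
    this.raw = raw;
    Object.freeze(this);
  }

  // For values that must be real indices: rejects the sentinel and other invalid input.
  static fromChecked(n) {
    if (!Number.isInteger(n) || n < 0) {
      throw new RangeError(`not a valid slot index: ${n}`);
    }
    return new SlotIndex(n);
  }

  // For callers that knowingly pass the raw representation through, sentinel included.
  static fromRawUnchecked(n) {
    return new SlotIndex(n);
  }

  isNone() {
    return this.raw === NO_SLOT_RAW;
  }
}
```

Every conversion from a plain number now has to pick one of the two functions by name, so a reviewer can see at the call site whether the special value was considered.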

Clarity of invariants

Interacting with other developers

  • If you're about to do something gross because someone else doesn't expose the right API/helper, maybe you should get it exposed.

JS Engine specific

  • Any patch that touches rooting should be reviewed by Igor.
  • Interpreter could have better abstraction and encapsulation for its stack.

Recommendations for catching bugs earlier

Static analysis

  • Find all casts (C-style casts, the reinterpret_cast keyword, and casts through unions) for a given type. Could be used to enforce centralization or to find things that should be centralized.
  • Be suspicious of a function with multiple return statements, all of which return the same primitive value.
  • Be suspicious of a function returning true/success in an OOM path.

Dynamic analysis

  • Ask Valgrind developers what they think of providing (in valgrind.h) a way to tie the addressability of "stacklike memory" to a variable that represents the end of the stack.

Fuzzing

  • We should fuzz worker threads somehow.
    • In browser (slow and messy, but it's what users are running).
    • In thread-safe shell (--enable-threadsafe?), which has "toy workers".
  • We should fuzz compartments better.
    • I should ask Blake and Andreas for help with testing compartments and wrappers.
    • I should ask Gary to run jsfunfuzz in xpcshell, where I can test both same-origin and different-origin compartments, and thus get more interesting wrappers.
  • We should give JS OOM fuzzing another shot.

Next steps

I'm curious if others have additional ideas for what could have prevented the ten bugs we looked at. For example, someone like Jeff Walden, who loves to write exhaustive regression tests, might have ideas that Luke and I did not consider.

I'd also like to do this kind of analysis with other developers on bugs they have fixed.