Making Ag Faster: Profiling with Valgrind

These days, a lot of software is written to be “fast enough”. Since code bases can be very large, there’s no such thing as “fast enough” for The Silver Searcher. In fact, my main goal with Ag is speed.

Improving performance is not always easy, but it is simple:

  1. Find the slowest part of the program.
  2. Make that part faster.
  3. Repeat until it’s fast enough or you go insane.

There are lots of profiling tools and programmers often argue about which is the best. I use gprof, callgrind, and Instruments.app. Which profiler you use doesn’t matter as much as actually using one. They all have their advantages and disadvantages, but for this post I’ll only cover Valgrind’s callgrind. Using callgrind doesn’t require special compilation. Just invoke it with your program’s name and it will generate profiling data for callgrind_annotate to analyze.

Here’s a typical profiling run for Ag:

Liquid error: No such file or directory - posix_spawnp

I snipped out the annotated source code. You can see the full output here.

This profiling info tells me that I’m spending all my time in strnstr(). I did some research on string-matching and found out about the Boyer-Moore algorithm. After some more reading, I decided to go with a simplified version of Boyer-Moore called Boyer-Moore-Horspool.

Here’s the data after I implemented Boyer-Moore-Horspool strstr:

Liquid error: No such file or directory - posix_spawnp

For the curious, full output of callgrind_annotate is here.

That’s a 3x overall speedup and a 27x speedup in string matching. Impressive! Now Ag is spending most of the time figuring out whether or not it should search a file. It’s clear where I need to optimize next.

Valgrind isn’t perfect though. It makes programs run 25-50x slower than they normally would, so you won’t notice if you’re spending all your time waiting for network or disk I/O. In the case of Ag, this turned into a 20% performance improvement in my benchmarks.

Getting more useful data requires switching from an instrumenting profiler to a sampling profiler. Both Instruments.app and gprof are sampling profilers, but this post is already too long. I’ll cover them some other time.