Geoff's SiteGeoff Greer2012-05-19T19:12:19-07:00http://geoff.greer.fm/A Primer on Programmer IRC Etiquette2012-05-19T18:12:50-07:00http://geoff.greer.fm//2012/05/19/programmer-irc-etiquette/<p><a href='http://en.wikipedia.org/wiki/Internet_Relay_Chat'>IRC</a> is a great way to get hands-on help when learning a new language or framework. Experts in tech-related IRC channels are amazing. In the real world, they could charge insane rates for consulting, but they give advice freely on IRC and mailing lists. That said, it’s important to note that IRC veterans find some common newbie behaviors very annoying. Getting help from these experts can be frustrating for a newbie, but the rewards are great.</p>
<p>If you’re new to IRC, here is a short list of guidelines:</p>
<ol>
<li>Before asking for help, use Google and <a href='http://en.wikipedia.org/wiki/RTFM'>read the docs</a>. People will get annoyed if you ask questions that are answered in documentation. It shows you didn’t try very hard to find the answer before wasting their time.</li>
<li>Don’t ask to ask, just ask.</li>
<li>Explain the problem in detail. Saying “It doesn’t work” will earn you the ire of everyone in the channel. How doesn’t it work? What error messages do you see? What do the logs say? Be specific. People need to know the answers to these questions so they can help. Mention what you expected to happen and what actually happens.</li>
<li><a href='https://gist.github.com/'>Gist</a> any error logs or source code you have a problem with. Pasting large chunks of text in IRC is frowned upon. It’s called flooding, and it makes conversation difficult for others. Some channels will kick you for flooding. If your IRC client doesn’t rate-limit, the server can even disconnect you.</li>
<li>Explain what you’re trying to do. Often there’s an easier way to get what you want out of a language or framework. Others might know of a useful module or library that does much of the heavy lifting for you.</li>
<li>Finally, <a href='http://meta.wikimedia.org/wiki/Don%27t_be_a_dick'>don’t be a dick</a>. Don’t call a project stupid. Every piece of software has stupid parts. Granted, some have more stupidity than others, but name-calling isn’t going to help fix the problems. More importantly, name-calling won’t get you help. It will probably get you kick-banned.</li>
</ol>
<p>Following these rules will make life much more pleasant for everyone involved. You’ll be more likely to get your problem solved and the channel regulars will have less stress in their lives.</p>
<p>I was motivated to write this after an inexperienced person joined #node.js on Freenode. I looked for a similar list of rules that I had read years before, but couldn’t find it. I thought about linking to ESR’s <a href='http://www.catb.org/~esr/faqs/smart-questions.html'>How to Ask Questions the Smart Way</a>, but it’s too general, too long, and insults the reader. Now that I’ve written this, I’m ready for next time.</p>Chile Trip2012-03-31T20:58:46-07:00http://geoff.greer.fm//2012/03/31/chile-trip/<p><a href='/images/IMG_1068.jpg'><img src='/images/IMG_1068_small.jpg' alt='The view from LaPostolle' /></a></p>
<p>I went with <a href='http://journal.paul.querna.org/'>Paul</a> and <a href='http://www.erenkrantz.com/'>Justin</a> to Chile and Argentina. Most of the trip consisted of touring wineries, eating gourmet meals, and staying in fancy hotels. Even <a href='http://futurama.wikia.com/wiki/Hedonismbot'>Hedonismbot</a> would blush at this past week.</p>Work Patterns on a Plane2012-03-22T21:01:17-07:00http://geoff.greer.fm//2012/03/22/work-patterns-on-a-plane/<p>I’ve found that air travel provides a good opportunity for me to get stuff done. The isolation<a href='#ref_1'>[1]</a> helps me avoid distractions and forces my mind to stay focused on the job at hand. I’m also buoyed by the knowledge that I will have minimal interruptions for the duration of the flight.</p>
<p>That said, there are some caveats.</p>
<p>Having a small laptop is crucial. Large laptops won’t fit on your lap if someone reclines their seat.</p>
<p>And I can’t do just any sort of work on a plane. Anything requiring Internet access is right out. Learning a new programming language isn’t going to happen. There are just too many tools I’d have to download and newbie error messages that I’d have to google. Writing something new in Twisted Python is fine, since I know the language and I can save all the necessary documentation beforehand. Writing C is also fine, since most of its documentation is in man pages. And anything I can’t figure out from docs I can usually figure out with <a href='/2012/01/30/programming-we-can-do-science/'>science</a>.</p>
<p>Another issue: If I travel with any sort of <a href='/2011/12/04/consume-less-shallow-content'>shallow content</a>, I end up wasting time (and battery life) watching movies instead of coding.</p>
<p>Speaking of battery life: it’s a bit of a problem. My 11” Air lasts 2-7 hours depending on what I’m doing<a href='#ref_2'>[2]</a>. That’s good enough for most domestic flights, but the poor thing didn’t stand a chance on my recent <a href='/2012/03/19/japan-trip'>Japan trip</a>. If my Air dies, I switch to my backup netbook. It’s not as functional, but the spare battery can give it over 12 hours of run-time.</p>
<p>I got a ton of work done on <a href='https://github.com/ggreer/the_silver_searcher'>The Silver Searcher</a> while flying. I managed to <a href='https://github.com/ggreer/the_silver_searcher/commit/050ead66ee98abbfba639fd5ff7eded53c630455'>add support for pipes</a>, <a href='https://github.com/ggreer/the_silver_searcher/pull/16/files'>refactor the searching code</a>, <a href='https://github.com/ggreer/the_silver_searcher/commit/b4dd2ac496edb75fec7bc4f66dde2fedead23b6f'>clean up some particularly ugly printing code</a>, and fix some bugs related to <a href='https://github.com/ggreer/the_silver_searcher/commit/46cc97f1ebe843e93825fbf8245d2dd2592a3a73'>printing</a> <a href='https://github.com/ggreer/the_silver_searcher/commit/a2bbca668dac9dcfbf55dad2887d2d2569bae2f7'>matches</a>. In total I changed over 1,000 lines on the trans-Pacific flights. I accomplished more in those 20 hours than I had in the previous month. I also felt an unusually large sense of accomplishment.</p>
<p>As <a href='http://russellhaering.com/'>Russell</a> <a href='https://twitter.com/#!/russell_h/status/180862812074164224'>says</a>, “If you want to find time to hack on a side project, get on an airplane.”</p>
<hr /><a name='ref_1'> </a>
<ol>
<li>Lack of Internet access, phone, or friends nearby. <a name='ref_2'> </a></li>
<li>I realize this is a huge range. The CPU and screen seem to be the biggest power hogs. If I don’t pay any attention to power consumption and just run the screen bright, the battery will last about 4 hours. Keeping brightness low and CPU usage to a minimum can almost double that.</li>
</ol>Japan Trip2012-03-19T04:54:22-07:00http://geoff.greer.fm//2012/03/19/japan-trip/<p>I was close to my earned-time-off cap, so I took the month of March off. The first half of the month consisted of a trip to Japan with <a href='http://journal.paul.querna.org/'>Paul</a>, <a href='http://shawnps.net/'>Shawn</a>, <a href='https://github.com/morgabra'>Brad</a>, and Brad’s wife Caity. It was quite enjoyable. We managed to have a pretty good balance of calm time and party time. I also paid off some of my <a href='http://en.wikipedia.org/wiki/Sleep_debt'>sleep debt</a>.</p>
<p>Photos are <a href='/photos/Japan_Trip.html'>here</a>. Apologies for the terrible gallery software. Replacing it is on my todo list.</p>
<p>My favorite part of the entire trip was going to the 52nd floor of the Park Hyatt in Tokyo. We drank expensive cocktails while admiring <a href='/photos/Japan_Trip_files/Media/IMG_0978/IMG_0978.jpg'>the Blade Runner-esque view</a>. There was a magnitude 6.1 earthquake right after we ordered drinks. I knew we were perfectly safe, but some primitive part of my brain panicked and released copious quantities of adrenaline. The building took a <em>long</em> time to stop swaying.</p>
<p>A close second was running in the hills above Kyoto as snow fell. The dusting of snow over the trees made for beautiful scenery.</p>
<p>I stopped in San Francisco for the weekend (and my birthday, also known as St. Patrick’s Day). Now it’s off to Chile and Argentina until the end of the month.</p>From Wordpress to Jekyll2012-02-21T09:56:46-08:00http://geoff.greer.fm//2012/02/21/from-wordpress-to-jekyll/<p>I rewrote my site in <a href='https://github.com/mojombo/jekyll'>Jekyll</a> this weekend. Apologies for the extra items in the RSS feed. Unfortunately I misconfigured the site URL, which changed every post ID, which made RSS readers think I had a bunch of new posts.</p>
<p>I’m still going back through old posts to fix formatting errors. Also, comments will come back at some point in the future.</p>
<p>I hope you like the new design. It’s a blatant rip-off of <a href='http://mnmlist.com/'>mnmlist</a>.</p>
<p>Edit: Everyone’s comments are back.</p>Profiling with Gprof2012-02-08T02:36:31-08:00http://geoff.greer.fm//2012/02/08/profiling-with-gprof/<p><a href='/2012/01/23/making-programs-faster-profiling/'>I said I’d post about gprof</a>, so here goes.</p>
<p>Valgrind and gprof are two very different tools. Valgrind is an <a href='http://en.wikipedia.org/wiki/Profiling_%28computer_programming%29#Instrumenting_profilers'>instrumenting profiler</a>. Gprof is a <a href='http://en.wikipedia.org/wiki/Profiling_%28computer_programming%29#Statistical_profilers'>sampling profiler</a>. Gprof spends most of its time doing nothing. Then, about 100 times a second, it looks at the <a href='http://en.wikipedia.org/wiki/Program_counter'>instruction pointer</a> to see what function your program is in. (The “Each sample counts as 0.01 seconds” line in the output below is that sampling period.) It collects that data enough times to end up with a good idea of where your program is spending its time. The advantage of this approach is that your program runs almost at full speed. This gives you a better idea of how much time your program spends waiting for things like disk or network I/O.</p>
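<p>To make the sampling idea concrete, here is a minimal sketch in Python. It is not how gprof is implemented; it only imitates the technique with an interval timer. The names <code>record_sample</code>, <code>profile</code>, and <code>busy</code> are made up for illustration, and it assumes a Unix system where <code>signal.ITIMER_PROF</code> is available.</p>

```python
import collections
import signal

# Hypothetical names throughout; this only imitates the sampling technique.
samples = collections.Counter()  # function name -> number of samples


def record_sample(signum, frame):
    # The frame is whatever Python code was running when the timer fired.
    samples[frame.f_code.co_name] += 1


def profile(func, interval=0.001):
    """Run func() while sampling the current function every `interval`
    seconds of CPU time. Returns (function name, sample count) pairs."""
    samples.clear()
    signal.signal(signal.SIGPROF, record_sample)
    # ITIMER_PROF counts CPU time used by the process and delivers SIGPROF.
    signal.setitimer(signal.ITIMER_PROF, interval, interval)
    try:
        func()
    finally:
        signal.setitimer(signal.ITIMER_PROF, 0)  # disarm the timer
    return samples.most_common()


def busy():
    # Something CPU-bound to profile.
    total = 0
    for i in range(3_000_000):
        total += i * i
    return total


if __name__ == "__main__":
    for name, count in profile(busy):
        print(name, count)
```

<p>The handler does almost no work, which is the point: the profiled code runs nearly undisturbed between samples, and the counts only become meaningful once enough samples have piled up.</p>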
<p>My typical profiling experience with gprof looks like this:</p>
<div class='highlight'><pre><code class='text'>(sets CFLAGS=-pg in Makefile.am)
$ make clean && ./build.sh
(snip)
$ time ./ag --literal abcdefghijklmnopqrstuvwxyz ../ | wc -l
271
real 0m1.144s
user 0m0.792s
sys 0m0.340s
$ gprof -bp ag gmon.out
Flat profile:
Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls  ms/call  ms/call  name
 36.09      0.22     0.22    20953     0.01     0.01  is_binary
 32.81      0.42     0.20    16567     0.01     0.01  boyer_moore_strnstr
 13.12      0.50     0.08       63     1.27     1.27  print_file_matches
 11.48      0.57     0.07                             filename_filter
  3.28      0.59     0.02    26377     0.00     0.00  strlcpy
  1.64      0.60     0.01    52754     0.00     0.00  strlcat
  1.64      0.61     0.01        1    10.01   540.36  search_dir
  0.00      0.61     0.00    40160     0.00     0.00  log_debug
  0.00      0.61     0.00    40160     0.00     0.00  vplog
  0.00      0.61     0.00      213     0.00     0.00  add_ignore_pattern
  0.00      0.61     0.00       63     0.00     0.00  print_path
  0.00      0.61     0.00       16     0.00     0.00  load_ignore_patterns
  0.00      0.61     0.00        1     0.00     0.00  cleanup_ignore_patterns
  0.00      0.61     0.00        1     0.00     0.00  generate_skip_lookup
  0.00      0.61     0.00        1     0.00     0.00  init_options
  0.00      0.61     0.00        1     0.00     0.00  parse_options
  0.00      0.61     0.00        1     0.00     0.00  set_log_level
</code></pre>
</div>
<p>Some caveats: for gprof to work, you need to add <code>-pg</code> to your CFLAGS. Also, <a href='http://lists.apple.com/archives/PerfOptimization-dev/2006/Apr/msg00014.html'>gprof is broken on OS X</a>, so run it on a Linux server. If you want a sampling profiler on OS X, I recommend Instruments.app.</p>My Twisted Hack Day Project: Why is the Reactor Pausing?2012-02-04T23:04:44-08:00http://geoff.greer.fm//2012/02/04/my-twisted-hack-day-project-why-is-the-reactor-pausing/<p>Last week we had a <a href='http://twistedmatrix.com/trac/'>Twisted</a> hack day at work. The project I work on has over a dozen Twisted services, so this was right up my alley. I knew a couple of services were doing dumb things (like DB calls in the main thread) but nobody had gotten around to fixing the issues.<a href='#ref_1'>[1]</a> It’s pretty easy to find most instances of Django calls in the main thread, but there are many other ways to hang the reactor. I wanted to find every instance of pausing, so I decided to build a tool for that as my hack day project.</p>
<p>Before the hack day, <a href='http://as.ynchrono.us/'>Jean-Paul Calderone</a> came to the office and gave some talks about Twisted. I asked him if there were any tools for finding the cause of reactor pauses. He said he’d built his own little script a while back, and that the key was to use <a href='http://en.wikipedia.org/wiki/SIGALRM'>SIGALRM</a>. That was enough to get me on the right track.</p>
<p><a href='https://github.com/ggreer/twisted_hang'>Here’s the result</a>. It’s a pretty simple script. Every 100 milliseconds, it cancels any pending SIGALRM and calls <a href='http://docs.python.org/library/signal.html#signal.setitimer'>setitimer</a>, so that the OS will send SIGALRM to the process in 500ms. It also has a handler for SIGALRM that logs a traceback and adds the offending function to a stats dict.<a href='#ref_2'>[2]</a></p>
<p>If the reactor is paused for more than 500ms, the pending SIGALRM won’t be cancelled, so it will be sent to the process. Then the handler will print out the traceback and update stats. Pretty handy.</p>
<p>My tool can give false positives on a heavily-loaded system. This is because I’m calling setitimer with ITIMER_REAL instead of ITIMER_VIRTUAL. Using virtual time won’t catch stuff like sleep()s in the main thread, since sleeping doesn’t count toward execution time. Using real time will fire the SIGALRM after 0.5 seconds even if the process has gotten zero time on the CPU. I got a few false positives on my overloaded VM, but this turned out to be beneficial. The tracebacks went all the way to <a href='http://en.wikipedia.org/wiki/Select_%28Unix%29'>select()</a>. I mentioned this fact to <a href='http://journal.paul.querna.org/'>Paul</a> and he reminded me that we should be using the <a href='http://twistedmatrix.com/documents/current/core/howto/choosing-reactor.html#auto9'>epoll reactor</a>.</p>
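<p>The mechanism is small enough to sketch without Twisted. This is not the twisted_hang source, just a minimal illustration of the same SIGALRM trick. The names (<code>reset_watchdog</code>, <code>blocking_call</code>, and so on) are made up, the timeout is shortened from 500ms so the demo finishes quickly, and it assumes Python 3 on a Unix system, run from the main thread.</p>

```python
import collections
import signal
import time
import traceback

TIMEOUT = 0.05  # seconds; the post uses 0.5, shortened so the demo is quick
hang_stats = collections.defaultdict(int)  # function name -> hangs seen


def on_alarm(signum, frame):
    # Only runs if nobody re-armed the timer in time. The frame is whatever
    # Python code was executing when SIGALRM arrived.
    hang_stats[frame.f_code.co_name] += 1
    traceback.print_stack(frame)


def reset_watchdog():
    # Arming ITIMER_REAL again replaces any timer that is still pending.
    signal.setitimer(signal.ITIMER_REAL, TIMEOUT)


def stop_watchdog():
    signal.setitimer(signal.ITIMER_REAL, 0)


def blocking_call():
    # Stands in for a DB call or sleep() on the main thread.
    time.sleep(4 * TIMEOUT)


def demo():
    signal.signal(signal.SIGALRM, on_alarm)
    hang_stats.clear()
    reset_watchdog()
    blocking_call()  # nothing re-arms the timer during this, so it fires
    stop_watchdog()
    return dict(hang_stats)


if __name__ == "__main__":
    print(demo())
```

<p>In a real service, something like <code>reset_watchdog</code> would be scheduled on the reactor at a short interval (for example with <code>twisted.internet.task.LoopingCall</code>), so the alarm only goes off when the reactor itself is too busy to run it. Because the timer is <code>ITIMER_REAL</code>, a sleeping process still trips it, which is the behavior described above.</p>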
<p>That’s two production issues identified and fixed because of a silly hack day project. Booyah.<a href='#ref_3'>[3]</a></p>
<hr /><a name='ref_1'> </a>
<ol>
<li>We got around performance issues by running multiple instances of these services. The problems were never fixed because they were in really old code that was written before anyone at Cloudkick knew how to write Twisted code. <a name='ref_2'> </a></li>
<li>A stats <a href='http://docs.python.org/library/collections.html#collections.defaultdict'>defaultdict</a>, to be precise. If you haven’t used defaultdict before, check it out. It will save you from writing some dumb boilerplate code. <a name='ref_3'> </a></li>
<li>I wasn’t the only person to do a hack day project. <a href='https://github.com/HackThePlanet/TwistedPython-HackDay'>Here’s the full list.</a></li>
</ol>Programming: We can do Science2012-01-30T08:11:28-08:00http://geoff.greer.fm//2012/01/30/programming-we-can-do-science/<p>Let’s say you’re messing with some Python and you forget how the <code>global</code> keyword changes scoping<a href='#ref_1'>[1]</a>. To find the answer you might try…</p>
<ol>
<li><a href='https://www.google.com/'>Google</a></li>
<li><a href='http://stackoverflow.com/'>Stack Overflow</a></li>
<li><a href='http://docs.python.org/'>Actually reading the docs</a></li>
<li>Asking someone who knows more about it than you</li>
</ol>
<p>…usually in that order. That’s not a terrible way of going about it, but there’s an oft-overlooked option. Instead of scouring documentation or bugging a coworker, sometimes it’s best to run experiments.</p>
<p>This code…</p>
<div class='highlight'><pre><code class='python'><span class='c'>#!/usr/bin/env python</span>
<span class='n'>blah</span> <span class='o'>=</span> <span class='mi'>1</span>
<span class='k'>def</span> <span class='nf'>foo</span><span class='p'>():</span>
    <span class='n'>blah</span> <span class='o'>=</span> <span class='mi'>0</span>
    <span class='k'>print</span> <span class='s'>"foo:blah"</span><span class='p'>,</span> <span class='n'>blah</span>
<span class='k'>print</span> <span class='s'>"global:blah"</span><span class='p'>,</span> <span class='n'>blah</span>
<span class='n'>foo</span><span class='p'>()</span>
<span class='k'>print</span> <span class='s'>"global:blah"</span><span class='p'>,</span> <span class='n'>blah</span>
</code></pre>
</div>
<p>…prints…</p>
<div class='highlight'><pre><code class='text'>global:blah 1
foo:blah 0
global:blah 1
</code></pre>
</div>
<p>While adding a <code>global</code> in foo…</p>
<div class='highlight'><pre><code class='python'><span class='c'>#!/usr/bin/env python</span>
<span class='n'>blah</span> <span class='o'>=</span> <span class='mi'>1</span>
<span class='k'>def</span> <span class='nf'>foo</span><span class='p'>():</span>
    <span class='k'>global</span> <span class='n'>blah</span>
    <span class='n'>blah</span> <span class='o'>=</span> <span class='mi'>0</span>
    <span class='k'>print</span> <span class='s'>"foo:blah"</span><span class='p'>,</span> <span class='n'>blah</span>
<span class='k'>print</span> <span class='s'>"global:blah"</span><span class='p'>,</span> <span class='n'>blah</span>
<span class='n'>foo</span><span class='p'>()</span>
<span class='k'>print</span> <span class='s'>"global:blah"</span><span class='p'>,</span> <span class='n'>blah</span>
</code></pre>
</div>
<p>…prints…</p>
<div class='highlight'><pre><code class='text'>global:blah 1
foo:blah 0
global:blah 0
</code></pre>
</div>
<p>Without <code>global</code>, foo’s blah is not the same blah. With <code>global</code>, it is.</p>
<p>These sorts of code blurbs are great for learning. They’re also great for solving disagreements about code behavior. Warning: You will be wrong sometimes. You will look stupid in front of your peers. On the bright side, these embarrassing moments will stick in your mind. You won’t forget what you were wrong about.</p>
<p>Code experiments can be much trickier, and much more useful, than above. For example…</p>
<p>…at work we recently switched a project’s MySQL engine from <a href='http://en.wikipedia.org/wiki/MyISAM'>MyISAM</a> to <a href='http://en.wikipedia.org/wiki/InnoDB'>InnoDB</a>. After the switch, we encountered some weird errors. A process would save an object to the DB, but some services couldn’t find the newly-created object<a href='#ref_2'>[2]</a>. I had a hunch that transactions were responsible for the weirdness; my reasoning being that MyISAM lacks transaction support, and it had worked fine.</p>
<p>So I did science. I opened up two MySQL consoles. In console #1, I ran <code>start transaction;</code>. Then in console #2, I ran:</p>
<div class='highlight'><pre><code class='mysql'><span class='n'>mysql</span><span class='o'>></span> <span class='n'>start</span> <span class='n'>transaction</span><span class='p'>;</span>
<span class='n'>Query</span> <span class='n'>OK</span><span class='p'>,</span> <span class='mi'>0</span> <span class='n'>rows</span> <span class='nf'>affected</span> <span class='p'>(</span><span class='mi'>0</span><span class='p'>.</span><span class='mi'>00</span> <span class='n'>sec</span><span class='p'>)</span>
<span class='n'>mysql</span><span class='o'>></span> <span class='k'>insert</span> <span class='k'>into</span> <span class='nf'>inventory_nodeaddress</span> <span class='p'>(</span><span class='n'>node_id</span><span class='p'>,</span> <span class='n'>ip</span><span class='p'>,</span> <span class='n'>ip_version</span><span class='p'>,</span> <span class='n'>type</span><span class='p'>)</span> <span class='k'>values</span> <span class='p'>(</span><span class='no'>NULL</span><span class='p'>,</span> <span class='s1'>'31.22.190.54'</span><span class='p'>,</span> <span class='mi'>4</span><span class='p'>,</span> <span class='mi'>0</span><span class='p'>);</span>
<span class='n'>Query</span> <span class='n'>OK</span><span class='p'>,</span> <span class='mi'>1</span> <span class='n'>row</span> <span class='nf'>affected</span> <span class='p'>(</span><span class='mi'>0</span><span class='p'>.</span><span class='mi'>00</span> <span class='n'>sec</span><span class='p'>)</span>
<span class='n'>mysql</span><span class='o'>></span> <span class='n'>commit</span><span class='p'>;</span>
<span class='n'>Query</span> <span class='n'>OK</span><span class='p'>,</span> <span class='mi'>0</span> <span class='n'>rows</span> <span class='nf'>affected</span> <span class='p'>(</span><span class='mi'>0</span><span class='p'>.</span><span class='mi'>00</span> <span class='n'>sec</span><span class='p'>)</span>
<span class='n'>mysql</span><span class='o'>></span> <span class='k'>select</span> <span class='o'>*</span> <span class='k'>from</span> <span class='n'>inventory_nodeaddress</span> <span class='k'>where</span> <span class='n'>node_id</span> <span class='k'>is</span> <span class='no'>NULL</span><span class='p'>;</span>
<span class='o'>+-----+---------+-------------------------------------+------------+------+</span>
<span class='o'>|</span> <span class='n'>id</span> <span class='o'>|</span> <span class='n'>node_id</span> <span class='o'>|</span> <span class='n'>ip</span> <span class='o'>|</span> <span class='n'>ip_version</span> <span class='o'>|</span> <span class='n'>type</span> <span class='o'>|</span>
<span class='o'>+-----+---------+-------------------------------------+------------+------+</span>
<span class='o'>|</span> <span class='mi'>106</span> <span class='o'>|</span> <span class='no'>NULL</span> <span class='o'>|</span> <span class='mi'>50</span><span class='p'>.</span><span class='mi'>57</span><span class='p'>.</span><span class='mi'>96</span><span class='p'>.</span><span class='mi'>184</span> <span class='o'>|</span> <span class='mi'>4</span> <span class='o'>|</span> <span class='mi'>0</span> <span class='o'>|</span>
<span class='o'>|</span> <span class='mi'>107</span> <span class='o'>|</span> <span class='no'>NULL</span> <span class='o'>|</span> <span class='mi'>10</span><span class='p'>.</span><span class='mi'>182</span><span class='p'>.</span><span class='mi'>67</span><span class='p'>.</span><span class='mi'>171</span> <span class='o'>|</span> <span class='mi'>4</span> <span class='o'>|</span> <span class='mi'>1</span> <span class='o'>|</span>
<span class='o'>|</span> <span class='mi'>147</span> <span class='o'>|</span> <span class='no'>NULL</span> <span class='o'>|</span> <span class='mi'>31</span><span class='p'>.</span><span class='mi'>22</span><span class='p'>.</span><span class='mi'>190</span><span class='p'>.</span><span class='mi'>54</span> <span class='o'>|</span> <span class='mi'>4</span> <span class='o'>|</span> <span class='mi'>0</span> <span class='o'>|</span>
<span class='o'>+-----+---------+-------------------------------------+------------+------+</span>
<span class='mi'>3</span> <span class='n'>rows</span> <span class='k'>in</span> <span class='kt'>set</span> <span class='p'>(</span><span class='mi'>0</span><span class='p'>.</span><span class='mi'>00</span> <span class='n'>sec</span><span class='p'>)</span>
<span class='n'>mysql</span><span class='o'>></span>
</code></pre>
</div>
<p>OK, the data’s committed. I even double-checked that it was there by selecting it. Everything is fine, right?</p>
<p>Nope. Back in console #1, I ran:</p>
<div class='highlight'><pre><code class='mysql'><span class='n'>mysql</span><span class='o'>></span> <span class='k'>select</span> <span class='o'>*</span> <span class='k'>from</span> <span class='n'>inventory_nodeaddress</span> <span class='k'>where</span> <span class='n'>node_id</span> <span class='k'>is</span> <span class='no'>NULL</span><span class='p'>;</span>
<span class='o'>+-----+---------+-------------------------------------+------------+------+</span>
<span class='o'>|</span> <span class='n'>id</span> <span class='o'>|</span> <span class='n'>node_id</span> <span class='o'>|</span> <span class='n'>ip</span> <span class='o'>|</span> <span class='n'>ip_version</span> <span class='o'>|</span> <span class='n'>type</span> <span class='o'>|</span>
<span class='o'>+-----+---------+-------------------------------------+------------+------+</span>
<span class='o'>|</span> <span class='mi'>106</span> <span class='o'>|</span> <span class='no'>NULL</span> <span class='o'>|</span> <span class='mi'>50</span><span class='p'>.</span><span class='mi'>57</span><span class='p'>.</span><span class='mi'>96</span><span class='p'>.</span><span class='mi'>184</span> <span class='o'>|</span> <span class='mi'>4</span> <span class='o'>|</span> <span class='mi'>0</span> <span class='o'>|</span>
<span class='o'>|</span> <span class='mi'>107</span> <span class='o'>|</span> <span class='no'>NULL</span> <span class='o'>|</span> <span class='mi'>10</span><span class='p'>.</span><span class='mi'>182</span><span class='p'>.</span><span class='mi'>67</span><span class='p'>.</span><span class='mi'>171</span> <span class='o'>|</span> <span class='mi'>4</span> <span class='o'>|</span> <span class='mi'>1</span> <span class='o'>|</span>
<span class='o'>+-----+---------+-------------------------------------+------------+------+</span>
<span class='mi'>2</span> <span class='n'>rows</span> <span class='k'>in</span> <span class='kt'>set</span> <span class='p'>(</span><span class='mi'>0</span><span class='p'>.</span><span class='mi'>00</span> <span class='n'>sec</span><span class='p'>)</span>
<span class='n'>mysql</span><span class='o'>></span>
</code></pre>
</div>
<p>Once I ended the transaction in console #1 (either through a rollback or a commit), the new row showed up in selects. After some Googling I finally found <a href='http://dev.mysql.com/doc/refman/5.0/en/set-transaction.html#isolevel_read-committed'>the relevant MySQL documentation</a>. Once I changed the transaction isolation from <code>REPEATABLE-READ</code> to <code>READ-COMMITTED</code>, selects inside transactions showed recently-inserted rows.</p>
<p>The experiment plus <a href='https://github.com/morgabra'>Brad</a>’s knowledge of Django helped solve the mystery. Django only runs <code>commit</code> when writing to the DB. This sucks for any long-running service that never writes. The service will start up, connect to MySQL, run <code>start transaction</code> and do selects without ever ending the transaction. With the default InnoDB configuration, these services will see an ever-older version of the database. Not fun.</p>
<p>After I added <code>transaction-isolation = READ-COMMITTED</code> to the my.cnf <a href='http://wiki.opscode.com/display/chef/Home'>chef</a> template, everything worked swimmingly. Hooray for science.</p>
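<p>If you want to watch the same effect without setting up MySQL, here is a rough sketch that swaps in a different database: SQLite in WAL mode, via Python’s <code>sqlite3</code> module. SQLite is not InnoDB and has no <code>REPEATABLE-READ</code> setting, but a WAL-mode read transaction also keeps reading from the snapshot taken at its first read, so the stale-read behavior looks the same. The table name <code>addresses</code> and the function names are made up for the example.</p>

```python
import os
import sqlite3
import tempfile


def count_rows(conn):
    # Fetch everything and close the cursor so no read statement stays open.
    cur = conn.execute("SELECT COUNT(*) FROM addresses")
    rows = cur.fetchall()
    cur.close()
    return rows[0][0]


def snapshot_demo():
    path = os.path.join(tempfile.mkdtemp(), "demo.db")

    # isolation_level=None: no implicit transactions; we issue BEGIN/COMMIT.
    writer = sqlite3.connect(path, isolation_level=None)
    writer.execute("PRAGMA journal_mode=WAL").fetchall()
    writer.execute("CREATE TABLE addresses (ip TEXT)")
    writer.execute("INSERT INTO addresses VALUES ('50.57.96.184')")

    # The reader plays console #1: a long-lived transaction that only reads.
    reader = sqlite3.connect(path, isolation_level=None)
    reader.execute("BEGIN")
    before = count_rows(reader)  # first read: the snapshot starts here

    # The writer plays console #2: insert a row and commit immediately.
    writer.execute("INSERT INTO addresses VALUES ('31.22.190.54')")

    during = count_rows(reader)  # still reading from the old snapshot
    reader.execute("COMMIT")     # end the read transaction
    after = count_rows(reader)   # a fresh read finally sees the new row

    reader.close()
    writer.close()
    return before, during, after


if __name__ == "__main__":
    print(snapshot_demo())
```

<p>The middle count is the interesting one: the row is committed, yet the open read transaction cannot see it until that transaction ends.</p>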
<p>The next time you’re stumped, try some experiments. As a programmer, you have immense power over the program’s universe. Your code runs on a perfectly deterministic machine<a href='#ref_3'>[3]</a>. With the right software tools, you can stop time. You can read or change any part of memory. <a href='http://en.wikipedia.org/wiki/GNU_Debugger'>You can step through</a>, <a href='http://docs.python.org/library/pdb.html'>one line at a time</a>, to see exactly what’s happening.</p>
<p>Of course, this isn’t <em>real</em> science. These tools make programming a cakewalk compared to real science.</p>
<hr /><a name='ref_1'> </a>
<ol>
<li>You forgot this fact not because you suck at Python, but because you usually write clean code with no globals. At least, that’s what you keep telling yourself. <a name='ref_2'> </a></li>
<li>Just so nobody freaks out: This thing is non-customer-facing and currently under heavy development. I’m also over-simplifying the process. The actual changes happened in a development branch and weren’t merged until things were hunky-dory. <a name='ref_3'> </a></li>
<li><a href='http://en.wikipedia.org/wiki/Single_event_upset'>Cosmic rays</a> notwithstanding.</li>
</ol>Making Ag Faster: Profiling with Valgrind2012-01-23T10:47:36-08:00http://geoff.greer.fm//2012/01/23/making-programs-faster-profiling/<p>These days, a lot of software is written to be “fast enough”. Since code bases can be very large, there’s no such thing as “fast enough” for <a href='https://github.com/ggreer/the_silver_searcher'>The Silver Searcher</a>. In fact, my main goal with Ag is speed.</p>
<p>Improving performance is not always easy, but it is simple:</p>
<ol>
<li>Find the slowest part of the program.</li>
<li>Make that part faster.</li>
<li>Repeat until it’s fast enough or you go insane.</li>
</ol>
<p>There are lots of profiling tools and programmers often argue about which is the best. I use <a href='http://www.cs.utah.edu/dept/old/texinfo/as/gprof.html'>gprof</a>, <a href='http://valgrind.org/docs/manual/cl-manual.html'>callgrind</a>, and <a href='http://developer.apple.com/library/mac/#documentation/DeveloperTools/Conceptual/InstrumentsUserGuide/Introduction/Introduction.html'>Instruments.app</a>. Which profiler you use doesn’t matter as much as <em>actually using one</em>. They all have their advantages and disadvantages, but for this post I’ll only cover <a href='http://valgrind.org/'>Valgrind’s</a> callgrind. Using callgrind doesn’t require special compilation. Just invoke it with your program’s name and it will generate profiling data for callgrind_annotate to analyze.</p>
<p>Here’s a typical profiling run for Ag:</p>
<div class='highlight'><pre><code class='text'>$ make clean && ./build.sh
(snip)
$ time valgrind --tool=callgrind --dsymutil=yes ./ag --literal abcdefghijklmnopqrstuvwxyz ../
(snip)
real 1m34.709s
user 1m33.206s
sys 0m1.492s
$ callgrind_annotate --auto=yes callgrind.out.10361
--------------------------------------------------------------------------------
Profile data file 'callgrind.out.10361' (creator: callgrind-3.6.1-Debian)
--------------------------------------------------------------------------------
I1 cache:
D1 cache:
LL cache:
Timerange: Basic block 0 - 798409857
Trigger: Program termination
Profiled target: ./ag --literal abcdefghijklmnopqrstuvwxyz ../ (PID 10361, part 1)
Events recorded: Ir
Events shown: Ir
Event sort order: Ir
Thresholds: 99
Include dirs:
User annotated:
Auto-annotation: on
--------------------------------------------------------------------------------
Ir
--------------------------------------------------------------------------------
3,068,387,924 PROGRAM TOTALS
--------------------------------------------------------------------------------
Ir file:function
--------------------------------------------------------------------------------
1,764,541,095 src/util.c:ag_strnstr [/home/geoff/code/the_silver_searcher/ag]
386,020,821 /build/buildd/eglibc-2.13/posix/fnmatch_loop.c:internal_fnmatch [/lib/x86_64-linux-gnu/libc-2.13.so]
226,548,868 /build/buildd/eglibc-2.13/string/../sysdeps/x86_64/multiarch/../strcmp.S:__GI_strncmp [/lib/x86_64-linux-gnu/libc-2.13.so]
181,861,517 src/util.c:is_binary [/home/geoff/code/the_silver_searcher/ag]
123,211,270 /build/buildd/eglibc-2.13/posix/fnmatch.c:fnmatch@@GLIBC_2.2.5 [/lib/x86_64-linux-gnu/libc-2.13.so]
104,867,805 src/print.c:print_file_matches [/home/geoff/code/the_silver_searcher/ag]
77,058,570 /build/buildd/eglibc-2.13/string/../sysdeps/x86_64/multiarch/../strlen.S:__GI_strlen [/lib/x86_64-linux-gnu/libc-2.13.so]
60,030,629 /build/buildd/eglibc-2.13/posix/fnmatch_loop.c:internal_fnmatch'2 [/lib/x86_64-linux-gnu/libc-2.13.so]
44,019,376 src/ignore.c:filename_filter [/home/geoff/code/the_silver_searcher/ag]
27,072,821 /build/buildd/eglibc-2.13/string/../sysdeps/x86_64/memchr.S:memchr [/lib/x86_64-linux-gnu/libc-2.13.so]
9,329,984 /build/buildd/eglibc-2.13/string/../sysdeps/x86_64/multiarch/../strcmp.S:__GI_strcmp [/lib/x86_64-linux-gnu/libc-2.13.so]
7,803,075 /build/buildd/eglibc-2.13/malloc/malloc.c:_int_malloc [/lib/x86_64-linux-gnu/libc-2.13.so]
7,040,644 /build/buildd/eglibc-2.13/posix/../locale/weight.h:internal_fnmatch
6,062,124 /build/buildd/eglibc-2.13/string/../string/memmove.c:__GI_memmove [/lib/x86_64-linux-gnu/libc-2.13.so]
4,384,383 /build/buildd/eglibc-2.13/string/../sysdeps/x86_64/multiarch/../memcpy.S:__GI_memcpy [/lib/x86_64-linux-gnu/libc-2.13.so]
3,951,640 /build/buildd/eglibc-2.13/malloc/malloc.c:_int_free [/lib/x86_64-linux-gnu/libc-2.13.so]
3,779,300 /build/buildd/eglibc-2.13/dirent/../sysdeps/unix/readdir.c:readdir [/lib/x86_64-linux-gnu/libc-2.13.so]
3,181,118 /build/buildd/eglibc-2.13/malloc/malloc.c:malloc [/lib/x86_64-linux-gnu/libc-2.13.so]
(snip)
</code></pre>
</div>
<p>I snipped out the annotated source code. You can see the full output <a href='/code/ag_callgrind_slow.txt'>here</a>.</p>
<p>This profiling info tells me that I’m spending all my time in strnstr(). I did some research on string-matching and found out about the <a href='http://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string_search_algorithm'>Boyer-Moore algorithm</a>. After some <a href='http://blog.phusion.nl/2010/12/06/efficient-substring-searching/'>more reading</a>, I decided to go with a simplified version of Boyer-Moore called <a href='http://en.wikipedia.org/wiki/Boyer%E2%80%93Moore%E2%80%93Horspool_algorithm'>Boyer-Moore-Horspool</a>.</p>
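<p>To give a feel for why Horspool’s variant is so much cheaper than a naive scan, here is a minimal sketch of the idea in Python. It is not the code from Ag (which is C); the function name horspool_find and the use of a slice comparison are assumptions made for illustration. The key piece is the skip table: when a window doesn’t match, the byte under the window’s last position decides how far the window can safely jump.</p>

```python
def horspool_find(haystack: bytes, needle: bytes) -> int:
    """Return the index of the first occurrence of needle, or -1.

    Illustrative Boyer-Moore-Horspool sketch; not Ag's implementation.
    """
    n, m = len(haystack), len(needle)
    if m == 0:
        return 0
    if m > n:
        return -1

    # Skip table: for each possible byte value, how far the window may slide
    # when that byte is aligned with the needle's last position.
    # Bytes that do not occur in needle[:-1] allow a full-length jump.
    skip = [m] * 256
    for i in range(m - 1):
        skip[needle[i]] = m - 1 - i

    pos = 0
    while pos <= n - m:
        # A real implementation usually compares right-to-left byte by byte;
        # a slice comparison keeps the sketch short.
        if haystack[pos:pos + m] == needle:
            return pos
        # Slide by the amount recorded for the byte under the window's end.
        pos += skip[haystack[pos + m - 1]]
    return -1
```

<p>With a long needle such as the 26-letter literal used in the profiling runs, a window that ends on a byte missing from the needle can be skipped 26 bytes at a time, so most of the haystack is never examined.</p>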
<p>Here’s the data after I <a href='https://github.com/ggreer/the_silver_searcher/pull/12'>implemented</a> Boyer-Moore-Horspool strstr:</p>
<div class='highlight'><pre><code class='text'>$ time valgrind --tool=callgrind ./ag --literal abcdefghijklmnopqrstuvwxyz ../
real 0m32.429s
user 0m31.034s
sys 0m1.324s
$ callgrind_annotate --auto=yes callgrind.out.11921
--------------------------------------------------------------------------------
Profile data file 'callgrind.out.11921' (creator: callgrind-3.6.1-Debian)
--------------------------------------------------------------------------------
I1 cache:
D1 cache:
LL cache:
Timerange: Basic block 0 - 228181262
Trigger: Program termination
Profiled target: ./ag --literal abcdefghijklmnopqrstuvwxyz ../ (PID 11921, part 1)
Events recorded: Ir
Events shown: Ir
Event sort order: Ir
Thresholds: 99
Include dirs:
User annotated:
Auto-annotation: on
--------------------------------------------------------------------------------
Ir
--------------------------------------------------------------------------------
1,139,437,344 PROGRAM TOTALS
--------------------------------------------------------------------------------
Ir file:function
--------------------------------------------------------------------------------
386,014,011 /build/buildd/eglibc-2.13/posix/fnmatch_loop.c:internal_fnmatch [/lib/x86_64-linux-gnu/libc-2.13.so]
181,870,097 src/util.c:is_binary [/home/geoff/code/the_silver_searcher/ag]
123,209,345 /build/buildd/eglibc-2.13/posix/fnmatch.c:fnmatch@@GLIBC_2.2.5 [/lib/x86_64-linux-gnu/libc-2.13.so]
104,867,805 src/print.c:print_file_matches [/home/geoff/code/the_silver_searcher/ag]
76,747,163 /build/buildd/eglibc-2.13/string/../sysdeps/x86_64/multiarch/../strlen.S:__GI_strlen [/lib/x86_64-linux-gnu/libc-2.13.so]
63,421,170 src/util.c:boyer_moore_strnstr [/home/geoff/code/the_silver_searcher/ag]
60,028,609 /build/buildd/eglibc-2.13/posix/fnmatch_loop.c:internal_fnmatch'2 [/lib/x86_64-linux-gnu/libc-2.13.so]
44,018,667 src/ignore.c:filename_filter [/home/geoff/code/the_silver_searcher/ag]
27,072,637 /build/buildd/eglibc-2.13/string/../sysdeps/x86_64/memchr.S:memchr [/lib/x86_64-linux-gnu/libc-2.13.so]
8,312,570 /build/buildd/eglibc-2.13/string/../sysdeps/x86_64/multiarch/../strcmp.S:__GI_strcmp [/lib/x86_64-linux-gnu/libc-2.13.so]
7,803,075 /build/buildd/eglibc-2.13/malloc/malloc.c:_int_malloc [/lib/x86_64-linux-gnu/libc-2.13.so]
7,040,534 /build/buildd/eglibc-2.13/posix/../locale/weight.h:internal_fnmatch
6,061,868 /build/buildd/eglibc-2.13/string/../string/memmove.c:__GI_memmove [/lib/x86_64-linux-gnu/libc-2.13.so]
4,384,383 /build/buildd/eglibc-2.13/string/../sysdeps/x86_64/multiarch/../memcpy.S:__GI_memcpy [/lib/x86_64-linux-gnu/libc-2.13.so]
3,951,640 /build/buildd/eglibc-2.13/malloc/malloc.c:_int_free [/lib/x86_64-linux-gnu/libc-2.13.so]
3,779,220 /build/buildd/eglibc-2.13/dirent/../sysdeps/unix/readdir.c:readdir [/lib/x86_64-linux-gnu/libc-2.13.so]
3,181,118 /build/buildd/eglibc-2.13/malloc/malloc.c:malloc [/lib/x86_64-linux-gnu/libc-2.13.so]
3,089,135 src/main.c:search_dir'2 [/home/geoff/code/the_silver_searcher/ag]
2,095,514 /build/buildd/eglibc-2.13/malloc/malloc.c:free [/lib/x86_64-linux-gnu/libc-2.13.so]
2,018,298 /build/buildd/eglibc-2.13/dirent/../sysdeps/wordsize-64/../../dirent/scandir.c:scandir [/lib/x86_64-linux-gnu/libc-2.13.so]
1,941,992 /build/buildd/eglibc-2.13/string/strcoll_l.c:strcoll_l [/lib/x86_64-linux-gnu/libc-2.13.so]
1,889,859 /build/buildd/eglibc-2.13/stdlib/msort.c:msort_with_tmp.part.0'2 [/lib/x86_64-linux-gnu/libc-2.13.so]
1,704,553 /build/buildd/eglibc-2.13/malloc/malloc.c:malloc_consolidate.part.3 [/lib/x86_64-linux-gnu/libc-2.13.so]
1,644,688 src/ignore.c:ignorefile_filter [/home/geoff/code/the_silver_searcher/ag]
1,601,628 /build/buildd/eglibc-2.13/dirent/../sysdeps/unix/sysv/linux/getdents.c:__getdents [/lib/x86_64-linux-gnu/libc-2.13.so]
1,582,620 src/util.c:strlcat [/home/geoff/code/the_silver_searcher/ag]
(snip)
</code></pre>
</div>
<p>For the curious, full output of callgrind_annotate is <a href='/code/ag_callgrind.txt'>here</a>.</p>
<p>That’s a 3x overall speedup and a 27x speedup in string matching. Impressive! Now Ag is spending most of the time figuring out whether or not it should search a file. It’s clear where I need to optimize next.</p>
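<p>Those ratios come straight from the instruction counts (Ir) in the two callgrind_annotate runs above. A quick sanity check, with the variable names being my own:</p>

```python
# Instruction counts (Ir) copied from the two callgrind_annotate runs.
total_before = 3_068_387_924
total_after = 1_139_437_344
strnstr_before = 1_764_541_095  # src/util.c:ag_strnstr
strnstr_after = 63_421_170      # src/util.c:boyer_moore_strnstr

overall_speedup = total_before / total_after
matching_speedup = strnstr_before / strnstr_after
share_before = strnstr_before / total_before  # fraction of work spent matching

print(f"overall: {overall_speedup:.1f}x")
print(f"string matching: {matching_speedup:.1f}x")
print(f"matching share before the change: {share_before:.0%}")
```

<p>Keep in mind these are counts of executed instructions, not wall-clock time.</p>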
<p>Valgrind isn’t perfect though. It makes programs run 25-50x slower than they normally would, so you won’t notice if you’re spending all your time waiting for network or disk I/O. In the case of Ag, the 3x reduction in instructions turned into only a 20% performance improvement in my benchmarks.</p>
<p>Getting more useful data requires switching from an instrumenting profiler to a sampling profiler. Both Instruments.app and gprof are sampling profilers, but this post is already too long. I’ll cover them some other time.</p>Building for Others2012-01-20T08:14:57-08:00http://geoff.greer.fm//2012/01/20/building-for-others/<p>I like to write new code. Unfortunately, writing new code is only a small part of building something useful. The <a href='http://en.wikipedia.org/wiki/Pareto_principle'>Pareto principle</a> applies. Once you’ve written 80% of the code in a short, fun-filled period, you spend much longer finishing up little things. You have to debug some odd edge case, clean up messy stuff that mostly works, and get it to build and run on some <a href='http://www.ubuntu.com/'>obscure Linux distribution</a>.</p>
<p>Worst of all, you have to write documentation.</p>
<p>This stuff isn’t fun, but it’s necessary if you want others to use your project.</p>
<p>Because you made it, the various dependencies and <a href='http://lesswrong.com/lw/ke/illusion_of_transparency_why_no_one_understands/'>quirks are obvious to you</a>. For the poor soul who clones your repository, the same is not true. Even compiling is a challenge for a newbie. What build system does your project use? Make? Scons? Ant? What build dependencies does it have? Does it check for them or print out a useful error message if any are missing? Is there a helpful README?</p>
<p>If you want other people to use (and possibly one day improve) your work, you need to polish the build scripts and write documentation. Think of it like a <a href='http://en.wikipedia.org/wiki/Sales_process'>sales funnel</a>. Perhaps 100 people download your code, 80 get it to build, 75 run it, 50 use it regularly, 5 make modifications, and finally, 2 contribute patches back. You can increase the numbers. Making those steps easier will grow your user base and contributions.</p>
<p>So that’s what I’ve done with <a href='/2011/12/27/the-silver-searcher-better-than-ack/'>Ag</a>. Over the past couple of weeks I’ve added a man page and <a href='https://github.com/ggreer/the_silver_searcher/wiki'>a wiki</a>, accepted <a href='https://github.com/ggreer/the_silver_searcher/pull/9'>pull requests</a> to <a href='https://github.com/ggreer/the_silver_searcher/pull/10'>clean up the build</a>, and even improved the --help output.</p>
<p>I’ll be the first to admit that fixing trivial inconveniences is boring. But frequently, boring work is the difference between a personal project and a community of users. Too often, <a href='http://lesswrong.com/lw/f1/beware_trivial_inconveniences/'>trivial inconveniences</a> stop projects from reaching critical mass. So if you want to build something for others, grit those teeth and go <a href='http://paulgraham.com/schlep.html'>schlep</a>.</p>The Silver Searcher: Better than Ack2011-12-27T16:36:40-08:00http://geoff.greer.fm//2011/12/27/the-silver-searcher-better-than-ack/<p>A lot of my time spent “writing” code is actually spent reading code. And a decent chunk of my time spent reading code is actually spent searching code. Lately I’ve started working with a larger codebase.<a href='#ref_1'>[1]</a> Both grep and ack take a non-negligible amount of time to search it. Both are slow, but for different reasons. <a href='http://www.gnu.org/s/grep/'>Grep</a> is fast, but doesn’t ignore files.<a href='#ref_2'>[2]</a> <a href='http://betterthangrep.com/'>Ack</a> is very good at ignoring files, but it’s written in Perl instead of C. What I really want is something that’s fast <em>and</em> ignores files.</p>
<p>So I built it. I call it <a href='https://github.com/ggreer/the_silver_searcher'>The Silver Searcher</a>, or <a href='http://en.wikipedia.org/wiki/Symbol_(chemical_element)'>Ag</a> for short. Ag is like ack, but better. It’s fast. It’s damn fast. The only thing faster is stuff that builds indices beforehand, like <a href='http://ctags.sourceforge.net/'>Exuberant Ctags</a>.</p>
<p>Don’t believe me? Here are some benchmarks. I ran them multiple times and grabbed the median for each.</p>
<div class='highlight'><pre><code class='text'>ggreer@carbon:~/cloudkick/reach% du -sh
250M .
ggreer@carbon:~% time grep -r -i SOLR ~/cloudkick/reach | wc -l
617
11.06s user 0.81s system 96% cpu 12.261 total
ggreer@carbon:~% time ack -i SOLR ~/cloudkick/reach | wc -l
488
2.87s user 0.78s system 97% cpu 3.750 total
ggreer@carbon:~% time ag -i SOLR ~/cloudkick/reach | wc -l
573
1.00s user 0.51s system 95% cpu 1.587 total
</code></pre>
</div>
<p>Here’s Ag with some extra ignores, similar to how ack ignores many files by default:</p>
<div class='highlight'><pre><code class='text'>ggreer@carbon:~% cat ~/cloudkick/reach/.agignore
extern
release
fixtures
ggreer@carbon:~% time ag -i SOLR ~/cloudkick/reach | wc -l
499
0.35s user 0.15s system 94% cpu 0.528 total
</code></pre>
</div>
<p>That’s the same as <a href='http://book.git-scm.com/4_finding_with_git_grep.html'>git grep</a>:</p>
<div class='highlight'><pre><code class='text'>ggreer@carbon:~/cloudkick/reach% time git grep -i SOLR ~/cloudkick/reach | wc -l
489
0.32s user 0.58s system 161% cpu 0.556 total
</code></pre>
</div>
<p>…except git grep only works in git repos. And it doesn’t ignore stuff in the repository like extern or generated files.<a href='#ref_3'>[3]</a></p>
<p>The bottom line: Grep’s output was the least useful. It dutifully reported matches in .pyc files and other things I don’t care about. Ack’s results were better and faster than grep. Ag had more results than ack, but <strong>took half as long.</strong> With a couple of clever ignores (like the extern directory), Ag took a mere half-second and gave even more pertinent results.</p>
<p>I can already hear someone saying, “Big deal. It’s only a second faster. What does one second matter when searching an entire codebase?” My reply: <a href='http://lesswrong.com/lw/f1/beware_trivial_inconveniences/'>trivial inconveniences matter</a>. Using Ag is like having a faster computer; you don’t realize how slow things were until you’ve experienced fast. The difference is big enough that I can’t go back to ack, just like ack users can’t go back to grep.</p>
<p>Since it behaves like ack, Ag can be used by many fancy ack GUI front-ends. This makes searching convenient as well as fast. After I got Ag sorta-working, I forked <a href='https://github.com/protocool/AckMate/'>AckMate</a> so that I could use Ag in <a href='http://macromates.com/'>my favorite editor</a>. <a href='https://github.com/ggreer/AckMate/'>My fork</a> bundles both AckMate’s Ack and my own Ag. You can switch between them with a simple check box. The tmbundle is on the <a href='https://github.com/ggreer/AckMate/downloads'>downloads page</a>. Be warned: it replaces your current AckMate.</p>
<p>There’s still plenty of stuff I want to add,<a href='#ref_4'>[4]</a> but it’s good enough for my own daily use so I figured I should tell others about it. And of course, patches are welcome!</p>
<hr /><a name='ref_1'> </a>
<ol>
<li>The decision was made to put all python dependencies into extern/ instead of using pip. A good call, in my opinion. <a name='ref_2'> </a></li>
<li>At least not without a bunch of pipes and find and xargs. Yes I know there are aliases but it’s annoying to keep those up-to-date. <a name='ref_3'> </a></li>
<li>Yes, I know it’s bad form to put generated files in revision control. <a name='ref_4'> </a></li>
<li>Ctags support, for one. Also inverted matching, accepting piped input, and basic stuff like retrying a search with fewer ignores and no case-sensitivity.</li>
</ol>The Moral Trajectory2011-12-12T00:20:11-08:00http://geoff.greer.fm//2011/12/12/the-moral-trajectory/<p>Apologies in advance. This post is about philosophy and morality; way out of my area of expertise. You have been warned.</p>
<p>By our standards, ancient Rome was not a nice place. I’m not just talking about problems due to inferior technology or sanitation or disease. Even a hypothetical ancient Rome that fixed those things would be worse than the modern world. Slaves would still exist. Racism and discrimination would still be rampant. The citizenship of one’s parents would still determine your lot in life. Simply put, ancient Roman morality was worse than ours.</p>
<p>It is my claim that over time, most civilizations have moved along a moral trajectory. Slowly -ever so slowly- people have become nicer. We’ve recognized more humans as worthy of moral value. Eventually we decided slavery was wrong. We began to treat women and men equally. Currently we’re moving toward treating homosexuals and heterosexuals equally.</p>
<p>The fact that morals have improved over time should strike people as extraordinary. Think about what must have happened for slavery to go from right to wrong. At some point, a person went from thinking “slavery is right” to thinking “slavery is wrong”, <em>without initially wanting to change his mind!</em> I wish I knew how to trigger that sort of thinking. It’s as valuable as vaccines.</p>
<p>It’s great that people have become more moral over the millennia. Now what are the odds that after thousands of years, we suddenly got morality right? What are the chances that we won’t improve upon our current ideas about right and wrong?</p>
<p>Yeah, pretty damn small.</p>
<p>OK, so if we’re wrong, we should ask the question, “What are some things that future societies might condemn us for?”</p>
<p>Stop here and think about them. I don’t want to contaminate your initial ideas.</p>
<p>…</p>
<p>There are some obvious ones that are already in political discourse. Recreational drugs. LGBT rights. <a href='http://elidourado.com/blog/smash-the-new-aristocracy/'>Border and immigration laws.</a> Religious indoctrination of children. Other things that <a href='http://www.paulgraham.com/say.html'>You Can’t Say</a>… yet.</p>
<p>One day morality will even go beyond those things. What about eating meat? It’s absolutely indefensible that we breed, kill, chop up, and eat animals. It’s not the eating of meat that’s wrong. If we could grow meat in vats, there would be no problem. The problem is that animals experience pain much like we do. While animals are not open to the same range of experiences as humans, they still seem to have some moral value. All else equal, I think we’d prefer that animals not suffer and die.</p>
<p>Future technologies could improve morality as well. Like education and nutrition today, in the future it could be considered abusive to not give your child anti-aging and intelligence-enhancing treatments. And that’s just the start.</p>
<p>Does this future scare you? Well so would the present to those who lived in the past. <a href='http://lesswrong.com/lw/xl/eutopia_is_scary/'>Eutopia is Scary</a>.</p>Consume Less Shallow Content2011-12-04T15:21:13-08:00http://geoff.greer.fm//2011/12/04/consume-less-shallow-content/<p>Many of my friends and coworkers own iPads. I do not. I don’t have anything against Apple, in fact <a href='/2010/11/16/five-years-of-progress-in-laptops/'>I love their products</a>. I agree that the iPad is an engineering marvel, both of hardware and software. It’s elegant. It’s responsive. It’s just plain fun to play with. But I don’t own one.</p>
<p>I don’t own an iPad for the same reason I don’t have a cable subscription: Because it encourages shallow content consumption.</p>
<p>Don’t misunderstand. I don’t think content consumption is bad. I enjoy it and spend lots of time doing it. But there are different ways to consume it, different types to consume, and differing amounts of time one can spend on it.</p>
<p>So what is shallow content? The shallowest content consists of action movies and chick flicks. These are equivalent to sitting in front of a machine that pulls levers in your brain to trigger emotional reactions. Shallow content is easily accessible, but uninformative and in the long run, less rewarding.</p>
<p>Deep content is complex. Profound. Memorable. It is organized on multiple levels. It rewards re-reading (or re-watching). <a href='http://en.wikipedia.org/wiki/G%C3%B6del,_Escher,_Bach'>Gödel, Escher, Bach: An Eternal Golden Braid</a> is the prototype of deep content.</p>
<p>Shallow content is cotton candy and Coca-Cola. Deep content is raspberry cheesecake and Riesling.</p>
<p>When it comes to content consumption, different devices encourage different behaviors. For example, a television is the ultimate device for consuming shallow content. You select a channel. The pictures and sounds elicit emotional responses. Changing the channel is easy, so TV shows are selected for stickiness and addictiveness. TV shows can’t be as complex as other media.[1] They have to be accessible to people who start watching in the middle.</p>
<p>Whether or not a device is meant for content consumption is not a binary attribute; it’s more of a sliding scale. A television is solely for consuming. An iPad is more general-purpose, but encourages consumption over production, and shallow content over deep. A laptop or desktop computer is fully general-purpose. A good indicator of a device’s purpose is its input interface. A TV’s input is simple: channel selection. An iPad has a touch-screen that can show a virtual keyboard for occasions when you need to write something. A computer has a physical keyboard, the fastest brain-machine interface currently available.</p>
<p>Of course, even with a computer you can waste your days on Reddit and IRC (consuming and producing shallow content) or you can do something <a href='http://projecteuler.net/problems'>more</a> <a href='https://github.com/'>rewarding</a>.</p>
<ol>
<li>Also, the cost to make a TV show is much greater than the cost to make a book. So to be profitable, a TV show must appeal to a greater population than a book. This limits the ideas and opinions on television.</li>
</ol>Information Dieting with a Kindle2011-03-14T00:44:35-07:00http://geoff.greer.fm//2011/03/14/information-dieting-with-a-kindle/<p>I got a Kindle a few months ago, and it has completely changed how I read. It’s yet another example of <a href='http://lesswrong.com/lw/f1/beware_trivial_inconveniences/'>trivial inconveniences</a> affecting behavior.</p>
<p>Before, my information diet consisted of countless hours reading short, forgettable pieces linked from <a href='http://en.wikipedia.org/wiki/Internet_Relay_Chat'>IRC</a>, <a href='http://twitter.com/'>Twitter</a>, <a href='http://www.google.com/reader/'>Google Reader</a>, or <a href='http://news.ycombinator.com/'>Hacker News</a>. I would try to read longer articles on my laptop, but it was hard to avoid distractions such as e-mail and IM clients.</p>
<p>Reading full-length books was also a hassle, although I didn’t realize it at the time. I’d order a book from Amazon and wait two days. Then I had to carry a chunk of dead tree around. When travelling, I usually took multiple books with me, since I’d finish more than one on a trip.</p>
<p>Right after I got my Kindle, <a href='http://trolocsis.com/wp/'>Ryan Phillips</a> told me about <a href='http://www.instapaper.com/'>Instapaper</a>. Now I skim or ignore short things, and use Instapaper to mark a couple of large gems for evening reading. Instead of habitually refreshing Hacker News, I load it maybe once a day.</p>
<p>Fan fiction such as <a href='http://www.fanfiction.net/s/5782108/1/Harry_Potter_and_the_Methods_of_Rationality'>Harry Potter and the Methods of Rationality</a> has fan-created .mobi and .epub files, but many stories aren’t popular enough to warrant such devotion. Fortunately I found <a href='http://fanfictionloader.appspot.com/'>FanFiction Downloader</a>, a semi-automated way to get <a href='http://fanfiction.net/'>FanFiction.net</a> stories onto my Kindle. The app isn’t perfect though. It runs out of memory on longer books.</p>
<p>The only major annoyance I’ve encountered is from flight attendants asking me to turn off my Kindle during takeoff and landing. I smile and wait for them to move on, then continue reading. Flipping the power switch wouldn’t do much anyway; Kindles don’t really turn off. They wake up to check for new content periodically, and there’s no way to remove the battery. I’m surprised the “no electronic devices” rule has lasted so long. If I were a conspiracy theorist, I’d rant about Amish Illuminati controlling the FAA. Googling for that idea doesn’t return much. I guess conspiracy theorists aren’t very creative.</p>VPS.net is Really Annoying2010-11-20T03:55:10-08:00http://geoff.greer.fm//2010/11/20/vps-net-is-really-annoying/<p>I’m writing this because I am annoyed at <a href='http://www.vps.net/'>VPS.net</a>. I wanted a server in the UK so I could use things like <a href='http://en.wikipedia.org/wiki/Spotify'>Spotify</a> and <a href='http://www.bbc.co.uk/iplayer/tv'>BBC iPlayer</a>. <a href='https://www.cloudkick.com/'>Cloudkick</a> supports <a href='http://www.linode.com/'>Linode</a> and VPS.net. Both are closer to normal virtual private server providers than “true” cloud providers.[1] Linode billed monthly and VPS.net offered daily billing, so I went with the latter. The signup process was typical, except that I was asked to fill out four security questions. I entered stupid nonsensical answers, since while many people know my mother’s maiden name, nobody knows my current passwords and I am unlikely to forget them. Not long after signing up, I received a confirmation e-mail… containing my password in plaintext.</p>
<p>E-mailing me my password tells me several things about your company. It tells me that you store passwords in plaintext instead of hashing them. If anyone gets ahold of a DB dump, they’ll have passwords and e-mail addresses. Lots of people use the same password everywhere, making their e-mail vulnerable. E-mailing my password also tells me that you either don’t know or don’t care about the dangers of sending secrets via e-mail. E-mail isn’t always encrypted and messages are often relayed through many servers. Anyone with access to one of those servers could see my password.</p>
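<p>For contrast, here is roughly what “hashing them” looks like. This is a minimal sketch using Python’s standard library, not a description of anything VPS.net runs; the function names and the iteration count are assumptions chosen for illustration. The point is that the server keeps only a random salt and a derived key, so there is no plaintext password to e-mail or to leak in a DB dump.</p>

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # assumed work factor; tune it for your own hardware


def hash_password(password):
    """Return (salt, derived_key); store these instead of the password."""
    salt = os.urandom(16)  # fresh random salt per user
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key


def verify_password(password, salt, expected_key):
    """Re-derive the key from the login attempt and compare in constant time."""
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return hmac.compare_digest(key, expected_key)
```

<p>At signup you would call hash_password once and store both values; at login you call verify_password with the stored salt and key.</p>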
<p>The password thing was a big red flag, but I didn’t want to give up so easily. I booted a server and started screwing around with it. I renamed my server in the VPS.net dashboard. Suddenly, my ssh session died. It turns out that renaming a server reboots it without warning. Frustrated, I gave up and decided to try again when I had more patience.</p>
<p>I woke up and saw my inbox contained an invoice for $1.00. Yes, VPS.net sends an invoice every day. Worse, after a week, VPS.net started warning me that my invoices were overdue. I tried to log in and pay the measly $10. I was confronted with a login page asking me to enter my username, password, and answer some security questions. They noticed I was trying to log in from a different IP address and threw some security questions at me. I finally managed to get enough answers correct to log in.</p>
<p><a href='http://geoff.greer.fm/rambling/wp-content/uploads/2010/11/Screen-shot-2010-11-11-at-1.06.18-AM.png'><img src='http://geoff.greer.fm/rambling/wp-content/uploads/2010/11/Screen-shot-2010-11-11-at-1.06.18-AM-500x242.png' alt='' /></a></p>
<p>That “Pay Now” button is actually a “try to pay $1 and show a big failure message, but mark the invoice as paid if there have been no payment attempts in the past few hours” button. I had to click it once for each invoice, waiting 3-4 hours between tries if I wanted them to work. Later I noticed the charges actually showed up on my card.</p>
<p>There were other things I noticed, such as sequential instance IDs. Did you know VPS.net has only booted a total of 33,000 instances? Anyway, after the invoice thing I wrote VPS.net off as amateurs and tried out Linode. I’ve had no problems with Linode. Their stuff works without annoying the hell out of me. If you want a server in the UK, go with them.</p>
<ol>
<li>The litmus test I use is, “Can I make an API call and get a booted server in under 5 minutes?” If not, then it’s not really cloud computing. It’s more of a reasonably fast VPS provider.</li>
</ol>Five Years of Progress in Laptops2010-11-16T01:07:52-08:00http://geoff.greer.fm//2010/11/16/five-years-of-progress-in-laptops/<p>The last iteration of the iBook G4 came out in September of 2005 and was sold until mid-2006.</p>
<table cellspacing='0' border='1' cellpadding='4'>
<tbody><tr>
<th />
<th>11.6″ MacBook Air</th>
<th>14″ iBook G4</th>
</tr>
<tr>
<td>Processor</td>
<td>1.6GHz Core 2 Duo</td>
<td>1.42GHz PowerPC G4</td>
</tr>
<tr>
<td>Memory</td>
<td>4GB 1067MHz DDR3 RAM</td>
<td>512MB 333MHz DDR2 RAM <sup>[1]</sup></td>
</tr>
<tr>
<td>Storage</td>
<td>128GB SSD</td>
<td>60GB 4200RPM HDD</td>
</tr>
<tr>
<td>Battery</td>
<td>5-7 hours</td>
<td>3-4 hours</td>
</tr>
<tr>
<td>Weight</td>
<td>2.3 lbs</td>
<td>5.9 lbs</td>
</tr>
<tr>
<td>Dimensions (in)</td>
<td>0.11-0.68 x 11.8 x 7.6</td>
<td>1.35 x 12.7 x 10.2</td>
</tr>
<tr>
<td>Cost</td>
<td>$1500</td>
<td>$1700</td>
</tr>
</tbody></table>
<p>[1] Bottlenecked by 142MHz front side bus.</p>
<p>There are a dozen things that don’t show up in the numbers. The iBook’s keyboard is spongy. It has a smaller trackpad that only supports two-finger scrolling. The iBook’s display is much worse, although that may be due to age. The iBook’s trackpad is plastic instead of glass. Even with a fresh install of Leopard, it feels slow. Yet somehow I used it as my main computer for two years.</p>
<p><a href='http://geoff.greer.fm/rambling/wp-content/uploads/2010/11/air_ibook2.jpg'><img src='http://geoff.greer.fm/rambling/wp-content/uploads/2010/11/air_ibook2-500x332.jpg' alt='' /></a></p>
<p>In case you forgot, that skinny ethernet port on the iBook is a 56k modem. It’s like ethernet, but data goes 1,800 times slower and makes angry noises.</p>
<p><a href='http://geoff.greer.fm/rambling/wp-content/uploads/2010/11/air_ibook.jpg'><img src='http://geoff.greer.fm/rambling/wp-content/uploads/2010/11/air_ibook-500x250.jpg' alt='' /></a></p>
<p>I wonder what I’ll compare my Air to in 2015.</p>Expensive Computers are Worth the Price2010-10-30T08:37:15-07:00http://geoff.greer.fm//2010/10/30/expensive-computers-are-worth-the-price/<p>You should not be afraid to spend lots of money on a computer. All too often developers skimp on their computers. They balk at high price tags and try to save money by buying a less expensive model.</p>
<p>For example: A <a href='http://en.wikipedia.org/wiki/Samsung_NC20'>Samsung NC20</a> goes for around $500. A completely maxed-out 11.6” <a href='http://www.apple.com/macbookair/'>MacBook Air</a> costs $1500. Most people look at those two prices and think, “Wow, I could save $1,000 by taking a small hit in performance and features.” Or worse, “$1500 is too much money for a computer.” That is the wrong way to think about it.</p>
<p>Here’s why: According to RescueTime, I average 10 hours a day on my laptop. Assuming I upgrade every 18 months, that’s 5,400 hours of use. Amortized over its life, a $1,500 laptop costs me 28 cents per hour. A $500 laptop would be 9 cents per hour. The important factor I haven’t mentioned yet is the amount of value I can create per hour. Let’s be extremely pessimistic and say I create $5 worth of value per hour on average. Let’s also say the $1500 laptop makes me 10% more efficient than the $500 laptop. That means I would create $5.50 worth of value each hour on the expensive laptop, 50 cents more than on the cheap one. Subtracting the extra 19 cents per hour that the expensive laptop costs still leaves me ahead by 31 cents per hour.</p>
<p>The expensive laptop costs 3x as much and only improves my performance by 10%, but it still comes out ahead purely for economic reasons. This is because the amount of value you extract or create using a computer is much greater than the hourly cost of that computer. I’m not even factoring in more subjective things like ease-of-use or aesthetics. You shouldn’t be worried about spending too much on your computer. You should be worried about not spending enough!</p>
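<p>If you want to plug in your own numbers, the arithmetic is short enough to script. This is just a sketch of the calculation above; the 30-day month and the function names are my own simplifications.</p>

```python
HOURS_PER_DAY = 10
DAYS_OF_USE = 18 * 30  # 18 months, approximated as 30-day months
hours = HOURS_PER_DAY * DAYS_OF_USE  # 5,400 hours


def hourly_cost(price):
    """Purchase price amortized over the laptop's working life."""
    return price / hours


def net_value_per_hour(price, value_per_hour):
    """Value created per hour minus the amortized hardware cost."""
    return value_per_hour - hourly_cost(price)


cheap = net_value_per_hour(500, 5.00)
expensive = net_value_per_hour(1500, 5.00 * 1.10)  # 10% more efficient
advantage = expensive - cheap

print(f"cheap laptop:     ${hourly_cost(500):.2f}/hour")
print(f"expensive laptop: ${hourly_cost(1500):.2f}/hour")
print(f"advantage:        ${advantage:.2f}/hour")
```

<p>Raise the value-per-hour figure to something less pessimistic and the gap only widens, since the hardware cost stays fixed.</p>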
<p>I’ve followed my own advice here and bought a maxed-out 11” MacBook Air. I’m using it as my sole development machine. It’s amazing.</p>
<p>OK, I admit it. This entire post was just an excuse to brag about my MacBook Air.</p>On Alpha Geeks and Gadgets2010-10-25T03:43:39-07:00http://geoff.greer.fm//2010/10/25/on-alpha-geeks-and-gadgets/<p>Earlier this month, Benjamin Stein observed that <a href='http://benjaminste.in/post/1223476561/hey-guys-whatcha-doing'>“alpha geeks” are now using the same hardware as normal people</a>. I have a hypothesis for why this is.</p>
<p>Today, there are three main computing devices that people use: Laptops, which fit in a backpack and are fast enough for most stuff; smartphones, which fit in a pocket and are a tradeoff of size and speed/features; and desktops, which can have more storage and better graphics than laptops.</p>
<p>Now which one of those three things can an individual assemble and customize?</p>
<p>That’s why nerds/hackers are using the same hardware as everyone else. It used to be that almost everyone had desktops. Alpha geeks built custom machines from parts. Now, people use laptops and smartphones. Alpha geeks don’t have the resources to build or modify such tightly-integrated devices. Instead, they buy whatever best satisfies their needs. Lately, those products have been Macs and iPhones.</p>
<p>So are alpha geeks done with hardware tinkering? I doubt it. I think a new type of device will show up: the wearable. Recent advances in microdisplays and embedded computing have made wearable computers a borderline-practical idea. And individuals can build them for about the cost of a laptop. <a href='http://www.umpcportal.com/2009/07/awesome-wearable-computer-setup-is-powered-by-sony-vaio-ux-umpc'>A</a> <a href='http://www.linux.com/community/blogs/my-wearable-computer-updates-and-what-happens-next.html'>few</a> <a href='http://blog.2yb.org/2010/07/cd-case-wearable-computer.html'>people</a> have already hacked their own together from video goggles (like the Myvu Crystal) and single-board computers (like the <a href='http://beagleboard.org/'>BeagleBoard</a>).</p>
<p>An often-heard objection to wearables is, “I already carry a computer with me. It’s called an iPhone and I can use it any time.” That’s true, but most people don’t realize how much <a href='http://lesswrong.com/lw/f1/beware_trivial_inconveniences/'>trivial inconveniences</a> can affect their behavior. Every time you want to look something up, you have to pull the phone out of your pocket and unlock it. You lose eye contact with anyone you were conversing with. It takes time for the phone to come out of hibernation and run whatever app you want to use. The screen is visible to those nearby, so you can’t message a friend, “What’s the name of this person next to me?”, or search your e-mail archives for a forgotten message that has come up in discussion.</p>
<p>Two major issues with wearables are fashion and software. Fashion will probably be resolved as technology gets better and people become more used to seeing wearables. Software is going to be a bigger problem. Wearables have to be extremely responsive. The idea is to completely integrate the computer with your life. Smartphone software has similar constraints, but not to the same degree.</p>
<p>A wearable with good software would be outrageously useful. Imagine having a local cache of Wikipedia, e-mail, personal notes, and other data sources. Add a camera to the mix to get <a href='http://en.wikipedia.org/wiki/Lifelog'>lifelogging</a> and <a href='http://en.wikipedia.org/wiki/Augmented_reality'>augmented reality</a>. While AR is more of a toy, people completely underestimate the utility of lifelogging. I know of one case where a lifelogger had video of the first time he met his wife. A lifelog would also come in handy if you were witness to a crime or accident. To get some idea of what you do all day, you could put the lifelog data into something like <a href='https://www.rescuetime.com/'>RescueTime</a>. If automated transcription gets better, you could build a searchable database of all your conversations. With these new sources of data and forms of interaction, the possibilities are quite vast.</p>
<p>Note: This post is expanded from <a href='http://news.ycombinator.com/item?id=1758750'>a comment I made</a> on Hacker News.</p>Time Management2010-10-10T21:07:33-07:00http://geoff.greer.fm//2010/10/10/time-management/<p>Each year has 365 days. Each day has 24 hours. That’s 8,760 hours a year. Sounds like a lot, but how much of it is spent doing stuff you <em>have</em> to do versus what you <em>want</em> to do?</p>
<p>You need around 8 hours a day for sleep. That leaves 5,840 hours of wakefulness.</p>
<p>A full-time job is around 2,000 hours a year. 3,840 hours left.</p>
<p>In addition to work you’re paid for, you have to run errands and do chores (shop for groceries, go to the bank, yard work, vacuum, cook, wash dishes, etc). Let’s say you average 90 minutes a day on that, which is 548 hours per year. 3,292 hours remaining.</p>
<p>How long is your commute? The average is 25 minutes each way, so just over 208 hours a year, assuming about 250 workdays. 3,084 hours left.</p>
<p>How long do you take to get ready in the morning? 40 minutes? That’s roughly 240 hours a year. 2,844 left.</p>
<p>Do you exercise? Maybe you don’t work out every day. 1 hour a day, 3 times a week? 156 hours. 2,688 left.</p>
<p>Now let’s say you’re 25 years old and healthy. According to most <a href='http://www.ssa.gov/OACT/STATS/table4c6.html'>actuarial tables</a>, that gives you even odds of sticking around for another 50 years.</p>
<p>2,688 hours * 50 = 134,400 hours. Divided by 8,760 hours per year, that’s about 15 years. I admit it’s a very rough estimate. I left out a lot of factors (such as anything related to raising children). People work less when they’re older, but that’s partially offset by more time spent being sick. I’d also weight calculations toward free time in younger years, since youth usually means fewer responsibilities and greater physical ability. Later years are spent undergoing general cognitive and physical decline.</p>
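<p>If you want to plug in your own numbers, the running tally above can be sketched in a few lines of Python. This is only an illustration: the names <code>COMMITMENTS</code>, <code>free_hours_per_year</code>, and <code>lifetime_free_years</code> are made up for this example, the figures are the rounded ones from the post, and the 250-workday basis for the commute is an assumption.</p>

```python
HOURS_PER_YEAR = 365 * 24  # 8,760

# Annual hours per obligation, using the post's rounded figures.
COMMITMENTS = {
    "sleep": 8 * 365,            # 2,920
    "full-time job": 2000,
    "errands and chores": 548,   # 90 minutes a day, rounded up from 547.5
    "commute": 208,              # 50 minutes a day; assumes ~250 workdays
    "getting ready": 240,        # the post's round figure for 40 minutes a day
    "exercise": 3 * 52,          # 1 hour, 3 times a week = 156
}


def free_hours_per_year(commitments=COMMITMENTS):
    """Hours left in a year after subtracting every obligation."""
    return HOURS_PER_YEAR - sum(commitments.values())


def lifetime_free_years(years_remaining=50):
    """Return (total free hours, the same total expressed in years)."""
    total_hours = free_hours_per_year() * years_remaining
    return total_hours, total_hours / HOURS_PER_YEAR


if __name__ == "__main__":
    total, years = lifetime_free_years()
    print(f"Free hours per year: {free_hours_per_year():,}")
    print(f"Free hours over 50 years: {total:,} (about {years:.1f} years)")
```

<p>Change any entry in <code>COMMITMENTS</code>, or pass a different <code>years_remaining</code>, to see how sensitive the total is to a longer commute or a shorter workweek.</p>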
<p>What wisdom do I take away from this calculation? Pretty straightforward stuff: Find a job you enjoy. Use a tool like <a href='https://www.rescuetime.com/'>RescueTime</a> to ensure you spend your free time wisely. Finally, you spend a significant chunk of your life using your bed, office chair, and/or computer. Don’t skimp on those things.</p>Stanislav Petrov Day2010-09-26T00:34:28-07:00http://geoff.greer.fm//2010/09/26/stanislav-petrov-day/<p>Many people say Thanksgiving is the fourth Thursday in November. I disagree. If there is any day of the year on which to give thanks, it is September 26th.</p>
<p>Why?</p>
<p>Today is the 27th anniversary of when <a href='http://en.wikipedia.org/wiki/Stanislav_Petrov'>Stanislav Petrov</a> chose not to destroy the world. When an early warning system reported the Americans had launched ICBMs, Petrov told his superiors it was a malfunction. Had he decided incorrectly, the Soviets would have launched their nukes, causing the United States to <em>actually</em> launch its nukes. Billions would have died.</p>
<p>I am continually amazed by how few know about this incident. The average person could probably tell you the latest celebrity gossip or which team won some sporting competition, but ask them who prevented a gigadeath event and they wouldn’t have a clue.</p>
<p>Lieutenant Colonel Petrov, your name isn’t as famous as it should be, but thank you nonetheless. The gratitude owed to you simply cannot be conveyed.</p>