Bugs are a failure of prediction

We think of bugs as flaws in code. This is incorrect, or at best, banal.

Rather, the essential definition of a bug is that a programmer failed to predict what a piece of code would do.

See, smart programmers wrote the code. Other smart programmers looked at the code. It went into production, and did the wrong thing.

No programmer wanted it to do the wrong thing. They thought it would do the right thing. The prediction was wrong. So let’s focus not on the broken code, but on the flawed prediction.


I like tests not for safety, per se. I like tests because they require us to articulate what a piece of code should do. Further, they require us to interact with our code from the outside (you know, like a user). Safety is a side effect of these two things.

When we test code, we test our predictions of what the code will do.
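A minimal sketch of that idea, with a hypothetical Truncate function invented for illustration: each check is a prediction written down from the outside, inputs and expected outputs only, which the computer then confirms or refutes.

```go
package main

import "fmt"

// Truncate shortens s to max runes, appending "…" when it cuts.
// (A hypothetical function, purely for illustration.)
func Truncate(s string, max int) string {
	r := []rune(s)
	if len(r) <= max {
		return s
	}
	return string(r[:max]) + "…"
}

// check compares a prediction against what the code actually did.
func check(got, want string) {
	if got == want {
		fmt.Printf("prediction held: %q\n", got)
	} else {
		fmt.Printf("prediction failed: got %q, want %q\n", got, want)
	}
}

func main() {
	// Two predictions, stated before execution decides.
	check(Truncate("hello", 10), "hello")
	check(Truncate("hello world", 5), "hello…")
}
```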


Humans form a mental model of what a piece of code will do. That mental model is limited in size. To the extent that understanding a piece of code requires understanding other, possibly far away, pieces of code, we make forming an accurate mental model more difficult.

Code becomes unpredictable by humans when it taxes our mental models.

Playing computer

Some rare programmers can look at a piece of code and know what it will do. Most of us can’t. Instead, we have educated guesses.

The computer (the compiler, the environment) is the unambiguous decider of what the code will do. When we think we know what a piece of code will do, we are playing computer.

The computer is better at playing computer than the human mind. A prediction about what a piece of code will do, short of executing it, is folly.


Failure of prediction often happens because a piece of code doesn’t look like what it does. This is a very subjective idea, of course.

The measure for me is: how often would a programmer new to this code make correct predictions about how it behaves?

This is of course an argument about abstractions. One person’s “readable abstraction” is another’s “too much magic”.

What we call “readability” is in fact predictability.

None of these are new ideas. They are mostly best practices. We believe them, often correctly, to lead to higher quality software.

They work, however, not because they “improve code”. Rather, they improve human understanding of code. That distinction is the essence of understanding bugs.



The legacy ad industry is wrong about ad blockers

I was recently directed to a couple of papers on browser ad-blocking software (here and here). They are not so much alarmist as they are self-serving and hindered by status quo bias.

Ad blockers remind me of music piracy since the advent of Napster. The technical details are different, but the dynamic is the same — a consumer gets what they want at a lower price, through technical means.

Then, as now, few users thought of themselves as violating any rights. They simply used a product they liked. And if a friend turned them on to a tool that made it better or cheaper, great.

Then, as now, one could make the not-illegitimate argument that one is stealing or in some way depriving a creative artist or organization of revenue. Those accustomed to making a living in the field call their customers immoral. They predict the decline of product as we starve it of revenue.

Plausible! But wrong. The argument is as impotent as it was 10 years ago.

Ad blocking, like music piracy, is best understood as a market phenomenon. A class of users prefers that experience over the more “expensive” one.

That market data looks less like a threat than an opportunity to make a better product.

Steve Jobs recognized this, and responded with iTunes. A better product that competed against piracy. (It also competed with the legacy model of album sales.)

Today, Medium.com is positioned as Apple was with music piracy. They see ads as making a worse product. So they are trying to make a better one.

They sell transparent sponsorships that respect the user, so it’s not like they are against making money. Rather, it’s a more evolved product shaped by user preference.

By the way, the record companies weren’t wrong. They shrank a lot. They were indeed threatened by piracy and by Apple’s decoupling of songs from albums.

But it’s not like music has suffered. It’s as good or better than it’s ever been. Production quality and distribution are great. More people hear more songs.

Remember, the “good old days” of profitable record-making worked for maybe 1% of artists. Those were not halcyon days; they were high barriers to entry and a power law that favored a few.

Maybe we’ve shaved some profit from middlemen. But consumer surplus has increased. Which seems like economic progress, no?


Embedding a struct declaration in a text template in Go

Here’s a nice little technique I came across in developing gen.

Say you have a text template, and that template is intended to output Go code. You’d like to pass a data structure into the template, and have the structure appear in the template as its own value literal.

There is a printf format string that does just this: %#v

Let’s say you have a template like:

p := {{.}}

Where {{.}} represents the passed-in data. Let’s say the data is a struct.

type Person struct {
	Name string
	Age  int
}

person := Person{"Jane", 30}

In order to have the output work as we wish, we make this modification to the template:

p := {{ printf "%#v" . }}

…which means “call printf with the format string %#v and the value ., which represents the passed data”.

We execute the template along the lines of:

tmpl.Execute(w, person)

(Where tmpl is the compiled template above, and w is an io.Writer)

The resulting output will be:

p := pkg.Person{Name:"Jane", Age:30}

…where pkg is your package name. That’s valid Go code which in turn can be executed. Neat.
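Putting the pieces together, here is a self-contained sketch of the technique (the render helper and template name are my own, for illustration; in package main the output is prefixed main. rather than pkg.):

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

type Person struct {
	Name string
	Age  int
}

// render executes a template that prints the passed value as a
// Go value literal, via the %#v format verb.
func render(v interface{}) string {
	tmpl := template.Must(template.New("decl").Parse(`p := {{ printf "%#v" . }}`))
	var buf bytes.Buffer
	if err := tmpl.Execute(&buf, v); err != nil {
		panic(err)
	}
	return buf.String()
}

func main() {
	fmt.Println(render(Person{"Jane", 30}))
	// Prints: p := main.Person{Name:"Jane", Age:30}
}
```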

Here’s a runnable version on the Go playground.

Uncanny valley tech recruiting

As recruiter pitches show up in my inbox, it’s clear there’s a lot of “fake it til you make it” when it comes to tech terminology.

One arrived today, telling me that “dependency injection” is a hot technology these days, and that their client was hiring people with that skill.

For a short moment, I was impressed that such specificity was included in a cold email — a hint of credibility. But the subsequent usage quickly indicated to me that the writer doesn’t know what it means. Just a phrase dropped in alongside other tech jargon, template-like.

See, this feeling of plausibility followed by disappointment is what we experience as the uncanny valley. Close enough to be considered real for a moment — but then the fakeness reveals itself and we are repulsed. (If you found the visuals of The Polar Express a little creepy, that’s the feeling.)

(Figure: Masahiro Mori’s uncanny valley curve.)

Two legitimate, non-creepy zones exist on either side of the valley. To the left, one makes no pretension of being real and works with that. To the right, one is actually the thing that one resembles.

In the recruiting world, both sides are fine. If one has been asked to fulfill a position, but doesn’t have a strong grasp on the tech, that’s OK. Just phrase it honestly, like “the Engineering Lead is looking for X, Y and Z and asked for my help”.

To the right of the uncanny valley is to be an expert on tech. It’s harder, but there’s payoff there.

There is (likely) a small number of effective people who do a disproportionate amount of the successful recruiting — a power law distribution.

Some of them really are tech experts and wield that knowledge; others are honest about their own technical limitations but know how to appeal to talented developers. There’s room on both sides of the valley.

Yes, net neutrality is conservative. That’s the problem.

Fred Wilson makes the argument (here and here) that net neutrality is a conservative idea. That’s correct, and that’s the problem.

My definition of “conservative” is “seeking to preserve accreted value”. It fights change, because change is considered dangerous. It is (perhaps rational) risk-aversion and loss-aversion.

For example, environmental protection is conservative. Preferring traditional marriage is conservative.

(You’ll note that people who support those respective ideas are not typically found within the same political tribe, which is my semantic point.)

Net neutrality wishes to “preserve the internet”. It wishes to lock in a certain model, believing that one is protecting value.

But the internet is not conservative. It allows for many unpredictable outcomes. It is emergent and adaptive.

To me, neutrality locks in the worst of the internet (last-mile monopoly) while hindering its best qualities (routing around damage).

Imagine it’s 1996 and the internet is emergent. It is largely designed around discrete text protocols – email and HTTP and such. Binary data is supported, but pipes are narrow.

Now, companies start popping up to stream video over that network. Video was not considered in the network’s design – in 1996, video over IP is wildly inefficient. Coax cables can stream dozens of (analog) channels, but that modem on the phone line? Not so much.

So netizens quite reasonably wish to prevent this change. Imagine if bandwidth-hungry video providers crowd out those of us sending email! It’s an abuse of the network, greedy, and a departure from history. At the very least, the FCC should step in.

That’s a conservative case, and it looks a lot like the argument we are having today.

My security and privacy tools

A quick list of the things I use to improve my web experience:

HTTPS Everywhere – Does its best to detect whether a site offers HTTPS, and switches to that if so.

DNSCrypt – DNS traffic can be encrypted, too.

Block third-party cookies – In Chrome it’s a checkbox at chrome://settings/content. In Safari, it’s the default. Fluid is a nice Safari wrapper, btw.

AdBlock Plus – Ad blocker that’s open source.
Funny story: I work for a company that makes money on ads, and I wrote one of our ad servers. Every six months or so, I give myself a heart attack when I don’t see the ads on the site, until I remember the above.

I don’t hate ads by the way, just the 90% of them that are worthless.

Disconnect – Blocking of analytics and such.

All of the above have the ability to cause trouble under certain conditions, but they work well for me. DNSCrypt takes a few seconds to get going on a new connection.

A type system in runtime’s clothing

Any sufficiently complicated C or Fortran program contains an ad hoc, informally-specified, bug-ridden, slow implementation of half of Common Lisp. – Greenspun

I am reminded of this quote when I see a common pattern in data stores, especially RDBMS. It’s a key-value pattern, along the lines of:

Columns: IDType | ID | Value

…combined with application code which branches on the IDType column. The Value (and heck, the identifier) are interpreted differently based on the type.

This is a fine pattern, depending on one’s goals. But it’s important to understand the choice being made here: we’ve created a dynamic type system. Those ifs and switches are type resolution, happening at runtime.

With an RDBMS, a table typically maps to a single type, say “Person”. One can completely express the shape of that entity in code, requiring no conditionals at runtime. Values flow into known slots.

Using the pattern at top, by contrast, one might create a “Documents” table. IDType might be “PDF” or “Section” or whatever; the Value may be a complex payload or a reference to another entity.

And it can work great. As a key-value store, the store is “dumb”: meaning happens in code. This can give you great performance and a lot of presentation choices at runtime.

But one gives up a large class of static (compile-time) type guarantees; one inevitably will do “type” checks at runtime, to combat newly-possible illegal states.
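A minimal sketch of that runtime type resolution, in Go (the Row struct, field names, and Interpret function are all hypothetical, mirroring the IDType | ID | Value columns above):

```go
package main

import "fmt"

// Row mirrors the IDType | ID | Value key-value columns.
type Row struct {
	IDType string
	ID     string
	Value  string
}

// Interpret branches on IDType: a "type" check happening at
// runtime, rather than a guarantee the schema or compiler enforces.
func Interpret(r Row) (string, error) {
	switch r.IDType {
	case "PDF":
		return "render PDF payload: " + r.Value, nil
	case "Section":
		return "resolve section reference: " + r.Value, nil
	default:
		// A newly-possible illegal state the compiler can't rule out.
		return "", fmt.Errorf("unknown IDType %q", r.IDType)
	}
}

func main() {
	out, _ := Interpret(Row{IDType: "PDF", ID: "42", Value: "report.pdf"})
	fmt.Println(out)
}
```

The default case is the tell: every consumer of the table must account for values of IDType that no code path expects.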

Too often, such code looks like an ad hoc, informally-specified, slow type system.

The upshot is, it’s a trade-off between safety and flexibility, exactly as with static and dynamic type systems. If one chooses the latter, plan on accounting for legal and illegal states in application code — and be clear about guarantees the system will and won’t offer.