Imagining version control for APIs

Versioning is, right now, merely a human label. We have helpful conventions like semver, but one still relies on human judgment as to what counts as a “notable” change.

What if the machine could detect, and perhaps describe, API changes in a meaningful way?

What I imagine is this: a tool that examines a program’s API for changes, and describes them. Version control, in other words, of the sort we expect from Git.

Instead of a whole program being described as a version, each callable API (endpoint, method, etc) gets its own version. We, the consumer, know when Foo() has changed while Bar() has not; API versioning becomes granular.
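To make that concrete, here’s a minimal sketch in Go — the names and signatures are invented, and hashing a signature string is a stand-in for whatever the versioner actually inspects. The point is just that each callable carries its own version, so Foo’s version can move while Bar’s stays put.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sort"
)

// Version derives a short per-callable version from a hash of the
// callable's signature. The signatures below are invented for illustration.
func Version(sig string) string {
	h := sha256.Sum256([]byte(sig))
	return fmt.Sprintf("%x", h[:4])
}

func main() {
	api := map[string]string{
		"Foo": "func Foo(n int) (string, error)",
		"Bar": "func Bar() string",
	}
	names := make([]string, 0, len(api))
	for name := range api {
		names = append(names, name)
	}
	sort.Strings(names)
	// Each callable gets its own version; change Foo's signature
	// and only Foo's version moves.
	for _, name := range names {
		fmt.Printf("%s@%s\n", name, Version(api[name]))
	}
}
```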

Black box

One way we might go about this is black box versioning, by which I mean the versioner can interact with an API, but knows nothing of its underlying implementation.

In this case, the versioner would be an extension of the test suite. This versioner could only provide the guarantees that the test suite does.

Most tests are necessarily finite and incomplete; they are only as good as the test writer’s imagination. We don’t (can’t) write tests for every possible input, and so programmers choose representative cases.

Such a system could prove the existence of changes, but could not prove the absence of changes. There is also the meta-issue that the tests themselves would need to be versioned.
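A toy illustration of that limitation, with invented functions standing in for two revisions of the same callable: a black-box versioner can only replay representative cases and diff the observable outputs. It can surface a change, but a clean run proves nothing about inputs it never tried.

```go
package main

import "fmt"

// Two hypothetical revisions of the same callable.
func greetOld(name string) string { return "Hello, " + name }
func greetNew(name string) string { return "Hello, " + name + "!" }

// changed replays representative cases and diffs observable outputs —
// all a black-box versioner can do. A return of false is not proof
// of equivalence, only a failure to find a difference.
func changed(cases []string) bool {
	for _, c := range cases {
		if greetOld(c) != greetNew(c) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(changed([]string{"Ada", "Grace"})) // prints true
}
```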

Even with such limitations, it would be a substantial improvement over the status quo of human-imagined “versions”.

White box

This gets interesting. What if the versioner didn’t simply call the API as an outsider, but could inspect the source code comprising the API?

In this case, we could bring static analysis to bear and provide more guarantees. Instead of calling the API looking for an exceptional case, we analyze the source code to detect semantic changes.

Is there a form of static analysis that transforms source code into a normalized representation, such that two semantically identical programs with different source can be proved identical?

I don’t know the answer, which is one reason I am writing this post. (Would love your feedback on HN or Twitter.)

One representation that comes to mind is SSA — could it identify some classes of semantically identical programs? What about the various forms of IR in compiler front ends or back ends (LLVM, etc)?

Such a versioner would likely need to be language specific (though perhaps it could exploit common IRs like LLVM).


This really intrigues me and I haven’t come across logical show-stoppers yet. Such a system would only need to be an improvement over the status quo, not perfect.

That said, I would love to see real, Git-like versioning of API semantics. This would be especially helpful in the world of dependency management — where versioning is a particularly intractable bear.

Improve the median, not the mean

I like seeing Google Fiber prompt incumbent dinosaurs into boosting speeds. It’s great.

But: it doesn’t matter much. I have 10x the bandwidth at work (gigabit) that I have at home. I barely notice it, however, for both technical and perceptual reasons.

In fact, my home internet recently improved 3x from ~30mbps to its current ~100mbps. Maybe videos achieve HD a little more quickly? Maybe?

Point being, I had decent bandwidth for my use cases, and the new bandwidth — which I am intellectually thrilled about — offers diminishing returns of actual utility.

Because I already had above-average bandwidth, these improvements move the mean more than they move the median. This disparity offers a clue about utility.
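A toy calculation with made-up numbers makes the point: upgrade the already-fast user in a five-user town and the mean jumps while the median stays put.

```go
package main

import "fmt"

// stats assumes mbps is sorted ascending with odd length (toy example).
func stats(mbps []int) (mean float64, median int) {
	sum := 0
	for _, v := range mbps {
		sum += v
	}
	return float64(sum) / float64(len(mbps)), mbps[len(mbps)/2]
}

func main() {
	before := []int{1, 5, 10, 30, 30}
	after := []int{1, 5, 10, 30, 100} // my ~30mbps → ~100mbps upgrade
	fmt.Println(stats(before)) // prints 15.2 10
	fmt.Println(stats(after))  // prints 29.2 10 — mean moved, median didn't
}
```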

There’s more bang for the buck in improving the lower percentiles of performance. Moving a substantial number of users from 1mbps to 10mbps is a greater increase in utility than my recent improvements.

Google has the right idea here with QUIC. It offers a substantial improvement for the worst 1% of latency, which nudges the median upward. Let’s focus on that.

Liquidity, open source and security

Jeff has a thoughtful post about open source, security and incentives. A few points stood out to me.


First, the “all bugs are shallow” idea is a bit idealistic, as he points out. What comes to mind for me is Joel’s Hitting the High Notes. Tens of thousands of average developers will not pick up a bug that only experts would recognize, and adding another ten thousand won’t help.

If we have a “chunky”, discrete Gaussian distribution of talent reviewing the code, the area under the far right-hand tail may be indistinguishable from zero.

Few markets are liquid enough for their distributions to be smooth, the kind of smoothness that leaves some area under the right tail.

For example, casinos doing millions of bets with known probabilities have smooth, measurable, non-zero tails; they are liquid enough to predict that someone will win a million dollars.

An open source project with an audience not in the millions, less so. At some point moving right, the graph will discretely drop to zero. That zero represents “the number of people smart enough to identify difficult bugs”.
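A back-of-envelope version, with an assumed (entirely made-up) share of reviewers capable of spotting such a bug: for modest audiences the expected number of experts rounds to zero people, and it only becomes reliably non-zero at audience sizes most projects never see.

```go
package main

import "fmt"

// expectedExperts is audience size times the (hypothetical) share of
// reviewers capable of spotting a deep bug.
func expectedExperts(audience, share float64) float64 {
	return audience * share
}

func main() {
	share := 1e-5 // assumed: 1 in 100,000 reviewers can spot the bug
	for _, audience := range []float64{1e4, 1e6, 1e8} {
		fmt.Printf("audience %.0f -> expected experts %.1f\n",
			audience, expectedExperts(audience, share))
	}
}
```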


Second, we consider incentives. Jeff explores the idea that paying for bugs may both be necessary and risky.

He sees moral hazard: perhaps there is an incentive to hoard important information for a payoff. Maybe only the wealthiest organizations can afford to pay for vulnerabilities, as their value is bid up.

But let’s consider the audience. A person who discovers a bug in an important piece of software is someone with an unusually strong interest in that software. They are likely a user, and therefore more likely interested in having better software, in their own interest.

The alternative is to imagine mercenaries that dive into unfamiliar software in the hope of a payoff. Not impossible! But unlikely.

This is an essential quality of open source that confuses those new to it — that volunteers work not only out of goodwill, but out of self-interest.

I’ll stretch the analogy. The person next to you on the plane might be a terrorist. Not impossible!

But it’s more likely that, if they showed up where you showed up, they simply want the same things you do.


What if security becomes too cheap to meter? Which is to say, what happens if improving software quality requires a lot fewer humans?

In that case, the economics and incentives questions become a lot less salient.

It’s possible — likely, to my mind — that safer software will not come mainly from greater resources, but better tools.

The two that come to mind are formal methods and safer languages. (These are actually two sides of the same coin.)

To the extent that we can formally articulate a definition for “safety”, we can prove a program’s characteristics with static analysis. Describing code paths in terms of provable propositions allows us to know where logical guarantees exist, and where they don’t.

We talk less about talent and trust, and more about the existence, or non-existence, of guarantees.

And heck, even informally: languages like Rust and Go prevent classes of human errors that C cannot. Using such languages, we prevent the humans from making certain classes of mistakes.
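For instance, here’s a sketch of one such class of error: in C, an out-of-bounds read can quietly return arbitrary memory; Go bounds-checks every slice access and panics, which we observe here by recovering.

```go
package main

import "fmt"

// readByte turns a class of C error (a silent out-of-bounds read) into
// something observable: Go bounds-checks the access and panics, and we
// convert the panic into an error via recover.
func readByte(buf []byte, i int) (b byte, err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("%v", r)
		}
	}()
	return buf[i], nil
}

func main() {
	buf := make([]byte, 4)
	if _, err := readByte(buf, 10); err != nil {
		fmt.Println("caught:", err) // in C, this read could quietly return garbage
	}
}
```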

Both of the above strike me as relatively cheap and automatable, and therefore more likely a source of progress than foundations and funding.

Statements are statements, and expressions are expressions (in Go)

I got trolled by a facetious article on Go on April 1. But it did trigger a conversation about why Go doesn’t do certain things other languages do.

The answer, in several cases, is that Go chooses to make a clear distinction between expressions and statements. It chooses not to conflate them.

By way of definition, an expression is a thing that has (returns) a value. A statement is an imperative command to do something, but itself does not have a value.

I’ll use C# by way of comparison.

Increment operators

Most C-family languages have an operator like a++, which says “increment the value of a by 1”.

In most languages, this expression has a return value. In Go, it does not.


var a = 5;
Console.WriteLine(a++);
// prints 5 (though a is now valued at 6)


a := 5
fmt.Println(a++)
// syntax error: unexpected ++

To be clear, a++ is a valid statement in Go; it increments a by 1. It does not, however, return a value, avoiding error-prone patterns like if (a++ == 6) { ...


Assignments

In C#, assignments have return values.


int a;
Console.WriteLine(a = 5);
// prints 5

The expression a = 5 has a return value of 5. Further shenanigans:

int a;
Console.WriteLine((a = 5) == 5);
// prints True

The expression (a = 5) returns a value of 5, which is then compared to 5.


In Go, assignments are statements.

a := 5
fmt.Println(a = 6)
// syntax error: unexpected =

a = 6 is a valid statement. It is not, however, an expression (and thus can’t be evaluated and printed).


Ternary operators

You are probably familiar with an expression like condition ? value : other. It’s generally understood as syntactic sugar for an if-else statement, with a return value.


var temp = 50;
Console.WriteLine(temp > 30 ? "warm" : "cold");
// prints warm


Go doesn’t have ternaries! Reason being, it’s sugar for an if-else, and if-elses are statements, not expressions.
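The Go spelling of the C# example above is a plain if-else statement; if you want the expression-like ergonomics back, a small value-returning function does it:

```go
package main

import "fmt"

// describe is the if-else spelling of C#'s temp > 30 ? "warm" : "cold".
func describe(temp int) string {
	if temp > 30 {
		return "warm"
	}
	return "cold"
}

func main() {
	fmt.Println(describe(50)) // prints warm
}
```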

You may be detecting a pattern here: Go prefers orthogonality over sugar. There are classes of (human) error that result from a statement also being an expression, and Go chooses to make that class of error less likely.

Theory of the firm: opt-in initiatives

I’ve become adamant about making internal initiatives opt-in for the following reason: it brings more information to bear.

For example, at Stack, I started something called “Ask a Nerd”. The goal is to create a channel of communication between our sales people and our developers.

Now, I might easily have pushed for compelling all developers and sales people to participate. It stands to reason: if it’s a good idea, everyone should be part of it. It’s a company initiative.

But I didn’t push in this way. Rather, I communicated the idea and made it opt-in on all sides. This has several advantages.

First, it allows the initiative to get off the ground with as little drag as possible. Had I wanted to compel participation, I’d have needed more permission and consensus.

Second, it’s an ongoing test. Growth, or lack thereof, is apparent in the participation numbers; participation becomes a proxy for value creation. The onus is on me to improve the product and sell it.

Third, it selects for people who are interested in the idea. It reveals hidden leadership and undiscovered skills.

And fourth, frankly, it’s safer. It’s allowed to die at a small cost, whereas a mandatory initiative risks being propped up to prevent embarrassment.

Had participation in the initiative been mandatory, a lot of information would have been muted. We’d have optimized not for creating the most value, but for having a successful initiative. See the difference?

Put another way, mandatory participation becomes a test of a fairly static a priori idea. Opt-in means we start from a kernel, and evolve.

Related: Google+ had this problem

discuss on hacker news

Bugs are a failure of prediction

We think of bugs as flaws in code. This is incorrect, or at best, banal.

Rather, the essential definition of a bug is that a programmer failed to predict what a piece of code would do.

See, smart programmers wrote the code. Other smart programmers looked at the code. It went into production, and did the wrong thing.

No programmer wanted it to do the wrong thing. They thought it would do the right thing. The prediction was wrong. So let’s focus not on the broken code, but on the flawed prediction.


I like tests not for safety, per se. I like tests because they require us to articulate what a piece of code should do. Further, they require us to interact with our code from the outside (you know, like a user). Safety is a side effect of these two things.

When we test code, we test our predictions of what the code will do.
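A tiny sketch of that framing, with Abs as a stand-in function under test: each case is a recorded prediction, “given this input, I predict this output”.

```go
package main

import "fmt"

// Abs is a hypothetical function under test.
func Abs(n int) int {
	if n < 0 {
		return -n
	}
	return n
}

func main() {
	// A test is a recorded prediction about the code's behavior.
	predictions := map[int]int{-3: 3, 0: 0, 7: 7}
	for in, want := range predictions {
		if got := Abs(in); got != want {
			fmt.Printf("prediction failed: Abs(%d) = %d, want %d\n", in, got, want)
			return
		}
	}
	fmt.Println("all predictions held")
}
```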


Humans form a mental model of what a piece of code will do. That mental model is limited in size. To the extent that understanding a piece of code requires understanding other, possibly far away, pieces of code, we make forming an accurate mental model more difficult.

Code becomes unpredictable by humans when it taxes our mental models.

Playing computer

Some rare programmers can look at a piece of code and know what it will do. Most of us can’t. Instead, we have educated guesses.

The computer (the compiler, the environment) is the unambiguous decider of what the code will do. When we think we know what a piece of code will do, we are playing computer.

The computer is better at playing computer than the human mind. A prediction about what a piece of code will do, short of executing it, is folly.


Failure of prediction often happens because a piece of code doesn’t look like what it does. This is a very subjective idea, of course.

The measure for me is: how often would a programmer new to this code make correct predictions about how it behaves?

This is of course an argument about abstractions. One person’s “readable abstraction” is another’s “too much magic”.

What we call “readability” is in fact predictability.

None of these are new ideas. They are mostly best practices. We believe them, often correctly, to lead to higher quality software.

They work, however, not because they “improve code”. Rather, they improve human understanding of code. That distinction is the essence of understanding bugs.


The legacy ad industry is wrong about ad blockers

I was recently directed to a couple of papers on browser ad-blocking software (here and here). They are not so much alarmist as they are self-serving and hindered by status quo bias.

Ad blockers remind me of music piracy since the advent of Napster. The technical details are different, but the dynamic is the same — a consumer gets what they want at a lower price, through technical means.

Then, as now, few users thought of themselves as violating any rights. They simply used a product they liked. And if their friend turns them on to a tool that makes it better or cheaper, great.

Then, as now, one could make the not-illegitimate argument that one is stealing or in some way depriving a creative artist or organization of revenue. Those accustomed to making a living in the field call their customers immoral. They predict the decline of product as we starve it of revenue.

Plausible! But wrong. The argument is as impotent as it was 10 years ago.

Ad blocking, like music piracy, is best understood as a market phenomenon. A class of users prefers that experience over the more “expensive” one.

This market data seems like an opportunity to make a better product.

Steve Jobs recognized this, and responded with iTunes. A better product that competed against piracy. (It also competed with the legacy model of album sales.)

Today, some publishers are positioned as Apple was with music piracy. They see ads as making a worse product. So they are trying to make a better one.

They sell transparent sponsorships that respect the user, so it’s not like they are against making money. Rather, it’s a more evolved product shaped by user preference.

By the way, the record companies weren’t wrong. They shrank a lot. They were indeed threatened by piracy and by Apple’s decoupling of songs from albums.

But it’s not like music has suffered. It’s as good or better than it’s ever been. Production quality and distribution are great. More people hear more songs.

Remember, the “good old days” of profitable record-making worked for maybe 1% of artists. It was not a halcyon era. It was high barriers to entry and a power law that favored few.

Maybe we’ve shaved some profit from middlemen. But consumer surplus has increased. Which seems like economic progress, no?
