The emergence of Big Data (2.0)

I’ve written previously on the idea that data, not algorithms, will be a main differentiator for web companies moving forward. What does this mean?

In short, it’s not hard to hire very smart people to write algorithms. One or two PhDs can produce a lot of great logic. But algorithms need data (big data) to be useful, and that’s considerably harder to amass. Only a small number of firms will have it, and it will represent a barrier to entry that nimble startups can’t climb.

So I was intrigued to learn that Google recently dropped both of its map data providers. Apparently, Google has been able to build its own database from its own data collection and public sources. Combine this development with Google’s entry into the turn-by-turn navigation business, and I see three, four, maybe five firms being kneecapped.

(Google doesn’t entirely confirm my premise about data vs algorithms: they have both. Why choose?)

I also learned recently that Amazon’s book database is becoming the canonical bibliographic data source, possibly eclipsing an incumbent provider called RR Bowker. Amazon might also be the best source of data on “what people buy”.

Going further, Facebook is becoming the source for “who people know” while Twitter defines “what people are talking about”.

These companies therefore represent the advent of Big Data 2.0. (The 1.0 version was the unsexy B2B brands — like RR Bowker — that didn’t make the leap to consumers.)

If you want to understand why these firms do what they do as they fight it out, ask yourself what data they are trying to own.

Published November 16, 2009