Big Data can be Small Data and Useful

This article in Computerworld, a tour of the predictive analytics behind Proactive in iOS9 (the Siri AI improvements), does two things at once: it points at some pretty useful features, and it showcases how little "AI" a program really needs if the data it collects has high enough fidelity. For instance, how hard is it to grep your personal mail repository for the telephone number of an incoming call? Or to "predict" what you might need given a repeating calendar activity (gym/office/home)?
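To make the "grep your mail" point concrete, here is a minimal sketch in Python of that kind of heuristic. It makes no claim about how Apple actually implements the feature. The function names, the sample messages, and the rule of comparing only the last ten digits are all assumptions made for illustration.

```python
import re

# Loose pattern for phone-number-like strings: an optional "+", a digit,
# then at least seven digits or separators, ending in a digit.
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def normalize(number):
    """Strip everything but digits and keep the last ten.

    Comparing on the last ten digits is a crude assumption that ignores
    country codes; it is good enough for a sketch.
    """
    digits = re.sub(r"\D", "", number)
    return digits[-10:]


def guess_caller(incoming, messages):
    """Return the sender of the first message whose body mentions the number.

    `messages` is a hypothetical list of (sender, body) pairs standing in
    for a real mail store.
    """
    target = normalize(incoming)
    for sender, body in messages:
        for candidate in PHONE_RE.findall(body):
            if normalize(candidate) == target:
                return sender
    return None


# Made-up sample data.
messages = [
    ("Dana <dana@example.com>", "Running late, call me at (415) 555-0132."),
    ("Lee <lee@example.com>", "My new cell is 212.555.0187, old one is dead."),
]

print(guess_caller("+1 415 555 0132", messages))
```

No model, no training set: a regular expression and a loop over data the device already holds.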

Of course, all of these activities and more will benefit from machine learning eventually, but we are so early in the development of personal agents that simple heuristics will carry the day for a long time yet.

Which brings me to the ultimate fallacy of "big data" as justification for the predominant consumer Internet business model. Companies like Google and Facebook rationalize the collection of vast amounts of personal data in the name of user experience, arguing that the more data they have, the better the experience gets. And while this is true for some domains (speech recognition, for instance), it is not true for many others, or at least won't be until we are far into strip mining the tricks coming in iOS9.

Tim Wu's recent New Yorker piece arguing that Facebook should pay users for the right to collect their data may point to a growing sensitivity on the part of mainstream users. So hopefully others will unpack the big data fallacy when it comes to "improvements" by asking: yeah, but improvements for whom?