Concise review by Tim Harford of the promise of big data and how it can go awry. He cites the inaccuracies that emerged in the celebrated Google Flu Trends, possibly because Google's own search engine may have been prompting flu-related searches even when the searcher's symptoms didn't suggest flu. According to an analysis in Science, in 2013 this led to an estimate almost double the actual incidence of flu. Harford also covers the problem of multiple comparisons, which can yield patterns that are statistically significant yet spurious, while the comparisons that fail to show an effect remain in the desk drawer.
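To see how multiple comparisons manufacture spurious patterns, here is a minimal sketch of my own (not from Harford's article). It compares two groups of pure random noise many times and counts how often the difference looks significant. The function name, the group size of 30 and the 0.05 threshold are all assumptions chosen for illustration.

```python
import random
from statistics import NormalDist, mean


def count_spurious_hits(n_comparisons=1000, n_per_group=30, alpha=0.05, seed=0):
    """Compare two groups of pure noise many times; count 'significant' results."""
    rng = random.Random(seed)
    std_normal = NormalDist()
    hits = 0
    for _ in range(n_comparisons):
        # Both groups are drawn from the same distribution, so the true effect is zero.
        a = [rng.gauss(0, 1) for _ in range(n_per_group)]
        b = [rng.gauss(0, 1) for _ in range(n_per_group)]
        # Two-sided z-test; usable here because the standard deviation is known to be 1.
        z = (mean(a) - mean(b)) / (2 / n_per_group) ** 0.5
        p = 2 * (1 - std_normal.cdf(abs(z)))
        if p < alpha:
            hits += 1
    return hits


hits = count_spurious_hits()
print(f"{hits} of 1000 comparisons on pure noise look significant at p < 0.05")
```

Roughly one comparison in twenty should clear the threshold by chance alone. If only those "hits" get written up and the rest stay in the desk drawer, the published record looks full of effects that are not there.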
Related, but not quite: xkcd's 'little data' representation, Frequency.
04 April 2014