Big + Beautiful Data

Here's a data-oriented double act, with associate director Claire Moon on author/broadcaster Tim Harford's Google Firestarters presentation, and Eric Shapiro, our creative delivery knowledge leader, reviewing David McCandless' talk at a Guardian Live event. Let's go...

In the first of our two reports, author, broadcaster and FT columnist Tim Harford gave two TED-style talks – one titled ‘Big Mistakes With Big Data’ and the second on ‘How To Tell The Future’. Here are four relevant insights from his presentations.

  1. Data can’t always speak for itself

At first glance, big data promises to render traditional methods of sampling obsolete (because we now have the data for ‘n=all’), and does away with the need for theories and hypotheses because we can simply ‘listen’ to the data by running algorithms to analyse it.

However, the rise and fall of Google Flu Trends – the poster child for big data – highlights the importance of ‘old-fashioned, boring lessons around how we behave with data’ and the enduring importance of human intelligence at all stages of analysis.

Despite working well at the start, the success rate of the predictions made by Google Flu Trends began to fall spectacularly – and because Google didn’t have a theory for why it worked in the first place, it was impossible to work out why it had gone wrong.

  2. The importance of being human

Despite calling himself a huge fan of big data, Tim advocated human intuition over computer learning and algorithms, and explained why speaking to ‘n=all that matter’ is still a far better approach than attempting to listen to ‘n=all’.

As the volume of ‘found data’ increases, big data is becoming increasingly good at telling us what is happening and identifying correlations, but it can’t tell us why it’s happening or whether a correlation actually represents causation – you still need to speak to real humans for that!

  3. Be self-critical

Tim’s final lesson was around prediction, and the importance of being open-minded. He spoke at length about a research programme set up by psychologist Philip Tetlock that aggregated a large number (20,000) of quantifiable forecasts made by a broad variety of people. Through this experiment, Tetlock found that the success of predictions lies in correcting biases, working in teams, and practising ‘actively open-minded thinking’.

In short, the best way to ensure accuracy when carrying out research and looking to the future is to continually challenge what you find and be prepared to change your mind when new information arises.

  4. Research isn’t always about finding answers

During the Q&A session after Tim’s talks, he was asked about his work for the Scenario Planning division at Shell. Tim’s description of it as ‘science fiction’ got a few laughs, but his point was a serious one – research shouldn’t always be about finding answers. Instead, research should be about stimulating thinking.

(If you want a more detailed account of the event and Tim’s talks, check out Neil Perkin’s great write-up here)

 

In the second of our reports, we heard Mr Information Is Beautiful (more commonly known as David McCandless) discuss his new book, Knowledge Is Beautiful. He spoke not only of the art of data visualisation but, more deeply, of the dividing line between ‘data’ and ‘knowledge’.

Psychology tells us that seven pieces of knowledge is about the most a person can hold, so here are three things to remember from David’s speech to add to the four from Tim’s.

Knowledge is joined up data

Bored with drawing up immaculate and fascinating data representations, McCandless sought to understand and illustrate knowledge in his new book. He came to the realisation that single data sets only tell you so much. If you want to find something new and genuinely interesting, you need to join up different banks of data to paint a clearer representation. For example, if you want to know who’s top dog, you need to look at a huge range of factors, including vet records, dog genealogies and popularity to reach your goal. It’s the same with insights. To find something new, you need to join up different data types and studies, and view them as one.

3/4 of our brain is vision

Astonishingly, three quarters of our neurons are dedicated to the visual system. We’re incredibly sensitive to beautiful things, but we’re equally aware of ugly things. Even more fascinatingly, we trust the former and are suspicious of the latter. It’s why we describe companies with older or simpler websites as ‘dodgy’, and equally why we forgive glamorous celebrities for just about anything (nice corn rows, Justin…). This means that no matter how great, relevant, or life-changing a piece of knowledge is, we won’t trust it unless it’s packaged in something beautiful that earns our trust. Equally, we need to be careful not to present something incorrect beautifully, which would encourage the wrong sort of knowledge – which means data integrity still matters.

Up wide, crash zoom, to the side

Finally, we learned that to extract the best information from data, you need to examine it from all angles. That means looking at the whole picture, exploring the tiny details within, and changing the angle of approach. Take the world of cash crops. From afar, wheat is the most planted, sugar cane the most fecund and most popular, and cannabis yields the highest revenue. That last one’s interesting, no? Well, if we zoom in, we can see that cannabis generates £47,660,000 per square kilometre. And if we look at it from another angle, we see that in Colorado, a state where cannabis is now legal, it reels in more tax revenue than alcohol. The insight? Cannabis is more lucrative than you might have thought.

Rather than point you towards the illegal drug trade, we reckon this is a lesson in analysis: specifically, the importance of using frameworks to view data through different lenses and extract the best and most interesting bits.

(You can see more of David’s beautiful works here, and he’d probably want this blog to link to the Amazon page for his new book – we’ll acquiesce and do this here.)