What can we learn from umbrellas? Why social data science matters
When it starts raining suddenly and heavily, everyone puts their umbrellas up at once; when the change is more gradual, umbrellas go up one by one. Social data science lets us observe similar differences in change in business and society.
When it starts to pour on a large crowd, what can we learn from watching the raising of umbrellas? What does that tell us about data science breakthroughs coming from basic social science research?
This is how MIT Professor Sinan Aral introduced his research at his May lecture in the Distinguished Speaker Series, hosted by the Data Science Institute (DSI) and its Business Analytics Lab (Imperial Business Analytics), a team of researchers housed principally at Imperial College Business School. Aral is the David Austin Professor of Management at MIT’s Sloan School of Management, where he holds joint appointments in the IT and Marketing groups and co-leads the school’s Initiative on the Digital Economy.
An economist by training, Aral is at the vanguard of a new field – social data science – that is revolutionising social science by using data science and big data to address basic questions about economy, society and culture. Historically, this field has also been called mathematical and computational social science; to be sure, maths, statistics and computing are essential to social data science, as are theoretical models of crowds and social cognition. In recognition of this new field’s rising importance, Aral has recently joined the board of the Alan Turing Institute, the UK’s national centre for data science research.
To explain this new field (social data science) and why it matters, we must first ask, what is the “social data” we are talking about? In essence, it is a picture of us – of society, economy, culture. With billions of us now using smartphones and the internet in everyday life, we are supported by a distributed ecosystem of data centres and services that are capturing a staggeringly high-definition picture of how we all live. As we communicate, connect, navigate, and do business, we are all taking a giant social selfie.
What’s in this picture? Only nearly everything. It’s not just what websites we visit, or what we read or watch online; it’s also the emotion in our comments about these things. It’s who we talk to, and what we talk about. Of course, it’s also what we buy and sell, and what we research but do not buy (at least not yet). And, crucially, it’s where we go, how we get there, how slow or fast we travel, who we hang out with when we get there, and how long we stay. And soon, it will increasingly also be how fast our hearts are beating, how awake or dozy we are, and measurements of feelings – even ones we are not putting into words, or tweets. Where will it stop?
Now back to the umbrellas. Social data give us a picture that allows us to say when social, economic or cultural changes happen in unison, or in the cascades we see when changes spread from person to person by word of mouth, imitation, or something more like contagion. When sodden clouds unleash heavy rains on a large crowd all at once, everyone who has an umbrella raises it as quickly as they can. You do not see the spread of umbrellas wending its way through the crowd, as you would if people saw someone nearby raise an umbrella and thought, “Hey, that looks cool – think I’ll raise my umbrella too.”
Telling the difference between changes in unison and by cascade might sound simple, but it is nearly impossible without the kind of social data we now have. And even with today’s social data, it takes some pretty wicked social data science.
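The contrast between the two patterns can be made concrete with a toy simulation. The sketch below is illustrative only and is not drawn from Aral's work: it assumes a crowd standing in a single line, and the function names (`unison_times`, `cascade_times`, `share_preceded_by_neighbour`) are hypothetical. It records when each person raises an umbrella, then asks how many people acted only after someone right next to them had already done so.

```python
import random


def unison_times(n, rng):
    # Assumption: everyone reacts to the rain itself, with an
    # independent reaction delay of 0 or 1 time steps.
    return [rng.randint(0, 1) for _ in range(n)]


def cascade_times(n, seed_index):
    # Assumption: the behaviour spreads along the line one neighbour
    # per time step, starting from a single person (the seed).
    return [abs(i - seed_index) for i in range(n)]


def share_preceded_by_neighbour(times):
    # Fraction of people who have at least one adjacent neighbour
    # that acted strictly earlier than they did.
    n = len(times)
    count = 0
    for i, t in enumerate(times):
        neighbours = [times[j] for j in (i - 1, i + 1) if 0 <= j < n]
        if any(tn < t for tn in neighbours):
            count += 1
    return count / n


if __name__ == "__main__":
    rng = random.Random(0)
    unison = unison_times(100, rng)
    cascade = cascade_times(100, seed_index=50)
    print("unison: ", share_preceded_by_neighbour(unison))
    print("cascade:", share_preceded_by_neighbour(cascade))
```

In the cascade, everyone except the seed follows a neighbour, and the change takes many time steps to cross the crowd. In the unison case, the whole crowd has acted within two steps and far fewer people follow a neighbour. Real social data are much noisier than this, which is why the distinction is so hard to draw in practice.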
This is where Aral’s research has been influential. Often drawing on the lexicon of public health, Aral and his colleagues tackle difficult problems of understanding influence in social networks and the spread of behaviours. For example, a study of 1.3 million Facebook users found that younger users were more susceptible to the influence of other users in adopting products (in this case a cinema-related Facebook app); that men were more influential than women, though men were more likely than other women to be influenced by women using the app; and that influential individuals form clusters, which subsequently are less likely to be influenced.
He found, in a study of 1.4 million Facebook friends of 10,000 experimental users, that viral features designed into products and marketing campaigns generated econometrically identifiable peer influence and social contagion effects, with passive-broadcast features resulting in greater peer adoption than active-personalised ones. A project conducted with a news-aggregation site found prior ratings created significant bias in individual rating behaviour, though that herding effect was asymmetrical, with upvotes having far greater effect than downvotes. In each of these studies, the method used to reach findings was itself a research contribution in addition to the individual results. To put it more simply, this research is showing us how to see and understand change cascades as they take place.
At Imperial College London, the DSI and Imperial Business Analytics are also tackling basic problems of social data science. In innovation and politics, cascades start so slowly and innocuously that it seems for a long time nothing is happening. Until, that is, it is too late to respond. In markets, however, cascades happen so quickly they are indistinguishable from in-unison changes – at least by analyses using traditional methods and data sources. In securities markets, for example, reactions to superior information that is not widely known – perhaps because it is insider trading – can spread so quickly the change seems like a reaction to information that might have been public. And in supply chains, both outages and gluts are the visible pile-ups of knock-on effects that cascade through complex systems.
These basic questions run through the many challenges being taken up by DSI fellows in Imperial Business Analytics – with their collaborators around the world. How did Brexit happen? What is fake news? When does a social movement become real? Can we spot disruptions earlier? How will AI-based automation shape not only the organisations of the future, but also the economy and society overall?
Already, we have developed an impressive library of visualisations for sharing the results of these and other projects. Since opening the Data Observatory in late 2015, we have shared this work with hundreds of distinguished visitors and, through the Imperial Festival, members of the general public. Our distinguished visitors include not just academics like Aral, but also heads of state, government officials, corporate boards, and C-suite officers from many organisations large and small. To discover and share new findings with such visitors, we use the Data Observatory as astronomers use telescopes, or as life scientists use microscopes. Using visualisation to help us discover and explain patterns of consequence, the Data Observatory creates an immersive experience that both brings discoveries to life and makes them more accessible.
The DSI speaker series and the visit from Professor Aral are an expression of our collaborative approach – an essential strategy for social data science research. Social data science is a complex and expensive undertaking that calls not only for interdisciplinary dialogue about theory and research methods, but also for collaboration with industry and government. These days, the datasets needed for social data science are increasingly collected and held by government and large corporations. Gaining access to these data is essential for the advancement of the field, and this effort requires cooperation from leading academics and institutions around the world.
To that end, we look forward to hosting “Socialising Data Science”, our first international conference on social data science, in May 2018. By convening leaders from academia, industry and government, we aim to highlight the research questions and practical problems that demand collaboration between academics and leaders in industry and government. To close on Aral’s umbrellas, we believe rains of change are coming, and we want to be ready.