Progress isn’t just about innovation

In the information age, data is our most valuable resource. We use it to make our decisions, to run our economy and to guarantee our security. All of this rests on innovations that have emerged in a relatively short amount of time. But data is not like any other resource humanity has encountered. Unlike oil, gas or water, our ability to produce data is seemingly limitless and still growing.

In a 2017 report, IBM revealed that 90 per cent of all data ever generated up until that point had been created in the previous two years. That rate of growth has only accelerated. This year, software company Domo estimated that by 2020, every person will generate 1.7MB of data a second – that’s 146,880MB, or roughly 147GB, a day. Think what humanity could achieve if our energy supply could multiply at such a rate.
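For readers who want to check the per-day figure, here is a minimal sketch of the conversion. It assumes, beyond what the estimate itself states, that the 1.7MB-per-second rate holds for a full 24 hours and that 1GB equals 1,000MB (decimal units); the variable names are illustrative only.

```python
# Back-of-the-envelope conversion from a per-second rate to a per-day total.
# Assumptions (not stated in the article): the rate is sustained for a full
# 24 hours, and 1 GB = 1,000 MB (decimal units).

MB_PER_SECOND = 1.7
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day

mb_per_day = MB_PER_SECOND * SECONDS_PER_DAY
gb_per_day = mb_per_day / 1000

print(f"{mb_per_day:,.0f} MB per day")
print(f"{gb_per_day:,.1f} GB per day")
```

Under those assumptions the total comes to 146,880 MB, or roughly 147 GB, per person per day.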

Data, data, everywhere…

With so much information available, you would think we would have the answers to the world’s biggest problems, but we don’t. Technology has gifted us with rich seams of data that have had a massive economic benefit. And yet, much of the potential for positive social impact is still locked up.

For the impoverished, the sick and the disenfranchised, the fruits of our data revolution are scarce. The secret to getting meaningful and positive outcomes from this vast sea of information lies in ethical data science, a field of study that lets us extract actionable, life-changing insight from big data.

Worldwide, artificial intelligence (AI) and machine learning have become an integral part of everyday life. The advances in these fields have the potential to transform society, but systemic imbalances in society still exist and must be addressed to fulfil the United Nations Sustainable Development Goals.

AI requires high-quality data. If the data is of poor quality, the best and fastest algorithms in the world are meaningless. It is important to understand the source of our data so that we can be sure it is relevant and does not introduce unintended biases. We need a systematic approach and the right infrastructure to collect this data and to document the data collection process. Data collection, retention and use should be done in a way that is professional, legal and ethical.

This is why the work being done by the Data Science for Social Good (DSSG) Summer Fellowship is essential. The 12-week, full-time summer fellowship is the result of a collaboration between the Gandhi Centre for Inclusive Innovation, part of Imperial College Business School, and the University of Chicago.

The programme sits at the nexus of several disciplines where Imperial College London has traditionally excelled. It brings together the brightest fellows – undergraduates and recent graduates – from all over the world to work on machine learning, big data and data science projects that solve real-world problems and have social impact.

Real-world solutions

Data science matters in the social sector because it can provide us with new ways of understanding and tracking problems, designing and scaling solutions, and communicating the results of our work. The systems developed must work within a fair and ethical framework, and must alleviate, not exacerbate, the circumstances of those we wish to help. In order for data to have a significant social impact, the technology has to be universally accessible so it can transcend cultural and linguistic boundaries.

Projects the DSSG fellows are currently working on include:

  • The use of data science to expand a pre-emptive scheme to identify high-frequency 911 callers, improve healthcare and free up emergency services.
  • A partnership with a Uganda-based not-for-profit group which offers legal aid to people with no access to lawyers and uses questionnaire data to increase its capacity and efficiency.
  • A partnership with the City of London helping to prevent air pollution exposure, improve traffic and better inform policymakers by building a library that classifies vehicles and estimates vehicle counts and velocity.
  • Improving heart health diagnosis from echocardiogram images using machine learning, in collaboration with the cardiology AI team at the University of Salamanca Hospital.
  • Enabling data-driven recommendations for the Institute of Employment and Vocational Training in Portugal to connect job seekers with more relevant and effective jobs and interventions.

Building a community

What the DSSG reveals is that true world-changing progress does not come from innovation alone, but from innovation with a conscience. To achieve this we need the right people, who can not only solve problems with data but can also seed a new generation of socially aware scientists.

But this work cannot be done in isolation. We must find ways to achieve continuity and sustainability. One way to do this is by creating a system that provides a feedback loop on external factors, incorporates domain knowledge from experts in the NGO sector and helps set expectations.

By leveraging technology to have a greater social impact, we can unleash the potential of information so that we are not improving the lives of just a few. The successful collaboration of various stakeholders is essential to applying data science for social good and creating a deep understanding of the problem domain.