Releasing the Value of Big Data

Catalysing Economic Growth: Releasing the Value of Big Data

Principal Investigator: Dr Aija Leiponen

Researchers: Professor Jonathan Haskel, Dr Emil Lupu, Dr Catherine Mulligan, Dr Pantelis Koutroumpis, Dr Peter Goodridge, Dr Llewellyn Thomas, Dr Konstantina Spanaki

Funder: EPSRC

Duration: This project was funded by the EPSRC from October 2013 to May 2016.

“Big Data” (very large or complex datasets beyond standard data management technologies) could transform how societies and communities view themselves, and how governments, large corporations, and entrepreneurial start-ups relate to those populations.  So, how will Big Data affect innovation, growth and well-being in the UK economy? This project created, analysed and described new economic and business models for valuing Big Data and evaluating the impacts this could have.

The project examined how firms innovate and compete around Big Data and how the technical and regulatory frameworks enabled the commercialisation and consumption of the information that Big Data generates. Uniting leading scholars in Economics, Business, and Computing across Imperial College London, this research investigated critical societal and industrial issues related to Big Data by analysing data value chains to shed light on value creation, firms’ strategies in standardisation and governance of data assets and activities in network industries, as well as individuals’ preferences and constraints in creating and sharing data. It also investigated the regional implications for Big Data capabilities and its effects on macroeconomic performance, whilst addressing issues of data quality, management, protection, and sharing agreements.

By exploring and generating new conceptual models informed by technical developments in computer science, this project described the emerging economic and business phenomena, collected and analysed new samples of empirical and experimental data, and integrated technological, economic and business insights to provide specific policy and managerial advice.

Key Research Questions and Findings

The project considered the following three strands of research:

Big Data Supply Chains, Governance, and Commercialization

Large datasets are everywhere. With the vast expansion of data availability, the focus of attention has shifted to the exchange processes of sharing, trading, and distribution. Nevertheless, data are not being shared or traded openly and transparently on a large scale, so we analysed the economic characteristics of data to understand why this is the case. First, data are an easily disseminated digital good that is difficult to protect, making data owners understandably wary of data leaks. Second, data are often “inalienable”: they cannot be disconnected from their source, be it a person or an organization. Third, data are highly “inferable”: combining a few anonymized sources of data could in all likelihood reveal identity.

Additional impediments to data trading include the legal and regulatory environments, which provide weak legal protection and create potential future regulatory liability for data owners. Basically, data is an economically valuable good that cannot be owned! As a result, most data products are traded bilaterally between companies, based on contracts and relying on the reputations of the trading parties, rather than anonymously via digital marketplaces, which would provide much less costly trading mechanisms. Data transactions are thus protected by secrecy, by contractual specifications such as non-compete and confidentiality agreements, and by product designs that hide the underlying data, and not by the intellectual property rights that are applied to other forms of content. However, we note the emergence of technologies such as blockchains, which could potentially enable anonymous trading of some relatively high-value data assets. Once such trading technologies are adopted in specific digital industries, we anticipate the creation of veritable data supply chains that take full advantage of the unlimited flexibility of creating new information products out of data assets.

Data Licensing

To assess how data trading takes place in practice, we analysed the terms and conditions of both private agreements and open data licenses from hundreds of online services. We found that open data trading is still in a turbulent phase, with little standardisation or alignment over terms and conditions. However, commercial data exchanges are more restrictive on the further sharing or commercial use of the data than government or non-profit sources, while services providing personal data online are the least permissive, reflecting a concern over inalienability.

We also compared proprietary data license agreements between commercial entities with other types of intellectual property licenses, such as those for patents, trademarks and copyrights. Data licenses have noticeably short terms of 1-2 years, reflecting the rapid evolution of the data marketplace: in many contexts, data assets may depreciate very quickly. They also frequently stipulate confidentiality and abundant use restrictions, whereas agreements for other forms of intellectual property rarely do so. Finally, a data license agreement sets up an ongoing relationship between the parties: the data provider commits to correcting, refunding, replacing, or updating the data if mistakes or technical problems are found, and the data user commits to audits of its data handling systems by the data provider to verify use and management. As a result, data are usually monetized via an annual subscription rather than a one-time fee or a royalty arrangement.

Investment and the Macroeconomic Impact of Big Data

Big data and analytics are widespread and growing rapidly, but little was previously known about the extent of UK businesses' investment. We used publicly available profiles of workers registered on a social media network and estimated that ‘big data employment’ in the UK market sector was c.190,000 in 2010. In comparison, UK firms employed 178,000 workers in R&D (2013) and 749,000 workers in software (2010), so big data-related activities have exceeded R&D in terms of human resources. Furthermore, we estimated the UK market sector investment in data-based assets at $7bn in 2013. UK R&D investments, on the other hand, were $15.5bn in 2012.

However, data investments are growing more rapidly in the economy, so the next step was to investigate whether big data investments yield equally significant returns to the UK economy. We assessed the contribution of data-based capital to GDP to be on average 0.015% annually between 2005 and 2012. This reflects the fact that data is in the very early stages of commercialization and growth, and we expect its contribution to grow as companies come to understand better how to make the most of these investments.

Publications

  • Chebli, O., Goodridge, P., Haskel, J., Measuring Activity in Big Data: New Estimates of Big Data Employment in the UK Market Sector, Discussion Paper
  • Goodridge, P., Haskel, J., How Much is UK Business Investing in Big Data?, Discussion Paper
  • Goodridge, P., Haskel, J., How does big data affect GDP? Theory and evidence for the UK, Discussion Paper
  • Goodridge, P., Haskel, J., Big Data in UK industries: An intangible investment approach, Discussion Paper (in progress)
  • Koutroumpis P., Thomas L. and Leiponen A., The (unfulfilled) potential of data marketplaces (in progress)
  • Koutroumpis P., Leiponen A., and Thomas L., In ICT, Small is Big: The impact of R&D on ICT firm performance (in progress)
  • Koutroumpis P., Leiponen A., and Thomas L., How important is ‘IT’? Assessing the influence of information and communication technologies over a century of invention (in progress)
  • Koutroumpis P., Leiponen A., and Thomas L., Invention machines: how instruments and information technologies drive global technological progress (in progress)
  • Koutroumpis P., Leiponen A., and Thomas L., ICT is different: a PageRank analysis of PatStat (in progress)
  • Koutroumpis P. and Leiponen A., Understanding the value of big data (in progress)
  • Ahlfeldt G., Koutroumpis P. and Valletti T., Evaluating access to universal digital highways (in progress)
  • Spanaki K., Adams R., Mulligan C., Lupu E., Data Supply Chains (DSC): Development and validation of a measurement instrument (in progress)
  • Spanaki K., Adams R., Mulligan C., Lupu E., A research agenda on Data Supply Chains (DSC) (in progress)
  • Thomas, L. and Leiponen A., Big data commercialization (in progress)
  • Thomas L., Koutroumpis P. and Leiponen A., Understanding traded data (in progress)
  • Thomas L., Koutroumpis P. and Leiponen A., Data Contracts (in progress)