Data science team



The challenge

  1. Clean the deals dataset and merge it with the timecard-transaction data
  2. Investigate and test current hypotheses about the effects of M&A deal variables on legal work
  3. Generate insight on the legal work of future deals given certain variables

How did we help?

The team received anonymised timecard-transaction data and ‘deals’ data, the latter recording characteristics of historic deals. The timecard data, drawn from an assortment of 28 tables, was rolled up to deal level as a set of key aggregated variables using SQL Server Management Studio (SSMS). The deals data was then parsed and cleaned using a Python data pipeline before being merged with the timecard dataset.
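To make the roll-up and merge described above concrete, here is a minimal plain-Python sketch of the same idea. It is illustrative only: the team performed the roll-up in SSMS, and every field name (`deal_id`, `sector`, `deal_value`, `total_hours`, `n_timecards`) and every value below is a made-up assumption rather than something taken from the project.

```python
from collections import defaultdict

# Hypothetical timecard rows: (deal_id, fee_earner_grade, hours)
timecards = [
    ("D1", "partner", 2.5),
    ("D1", "associate", 6.0),
    ("D2", "associate", 4.0),
    ("D2", "associate", 1.5),
]

# Hypothetical deal records with messy text fields
deals = [
    {"deal_id": "D1", "sector": " Energy ", "deal_value": "1,200"},
    {"deal_id": "D2", "sector": "retail", "deal_value": "350"},
    {"deal_id": "D3", "sector": "Retail", "deal_value": ""},
]


def roll_up(rows):
    """Aggregate timecards to one record per deal."""
    totals = defaultdict(lambda: {"total_hours": 0.0, "n_timecards": 0})
    for deal_id, _grade, hours in rows:
        totals[deal_id]["total_hours"] += hours
        totals[deal_id]["n_timecards"] += 1
    return dict(totals)


def clean_deal(deal):
    """Normalise text fields and parse the deal value to a number."""
    value = deal["deal_value"].replace(",", "").strip()
    return {
        "deal_id": deal["deal_id"],
        "sector": deal["sector"].strip().lower(),
        "deal_value": float(value) if value else None,
    }


def merge(deal_rows, rolled):
    """Inner join: keep only deals that have timecard data."""
    return [
        {**clean_deal(d), **rolled[d["deal_id"]]}
        for d in deal_rows
        if d["deal_id"] in rolled
    ]


working = merge(deals, roll_up(timecards))
```

An inner join is used here so that deals without any timecards (such as `D3`) drop out of the working dataset; whether the project kept or dropped such deals is not stated.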

After creating the working dataset, the team investigated the effects of deal characteristics on multiple target variables using a series of visualisations and test models. These target variables related to the volume, intensity and type of legal work completed on M&A transactions. Two interactive visualisations created in R were also used to quickly display combinations of effects against the target variables.
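As a rough illustration of what a test model of this kind can look like, the sketch below fits a simple least-squares line relating one deal characteristic to one target variable. The write-up does not say which models the team used, so this is only one possible choice; the variable names (`n_parties`, `total_hours`) and the numbers are invented for the example.

```python
def fit_line(x, y):
    """Ordinary least squares for a single predictor; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical toy data: number of parties to a deal vs. total hours of legal work
n_parties = [2, 3, 4, 5]
total_hours = [100.0, 150.0, 200.0, 250.0]

slope, intercept = fit_line(n_parties, total_hours)


def predict(parties):
    """Estimated legal hours for a future deal with the given number of parties."""
    return intercept + slope * parties
```

The slope is the quantified effect: the estimated change in the target variable per unit change in the deal characteristic, which can then be used to estimate the work on a future deal.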

Some of the quantified effects were in line with legal intuition; others were initially surprising but were later explained. In every case, intuition and feeling were put into numbers, which opened up interesting discussions on estimating the legal work required for future M&A deals.

Key facts & findings

– Cleaned and merged two distinct databases into a signal-rich working dataset

– Engineered 1,000,000 timecards into principal deal variables

– Identified key drivers of the volume, intensity and type of legal work

– Estimated effects of drivers on future legal work

– Developed two interactive visualisation tools for easy data insights and discovery


Team members: Otto Godwin (team lead), Žana Krstičević, Mingyang Tham, Mark O’Shea, Wenyu Guo

Author: Otto Godwin