Data Processing Systems

Module aims

In this module you will have the opportunity to:
- extend your knowledge of data structures and algorithms to data-processing algorithms and applications
- acquire theoretical and practical knowledge of data processing systems design and implementation for correct results and (close-to) optimal performance
- understand how Database Management Systems (DBMSs) optimize query performance
- understand Data Processing System tuning
- explore challenges and opportunities of cloud-native Data Processing Systems
- explore research directions such as Big Data or data management on modern hardware

Learning outcomes

Upon successful completion of this module you will be able to:
- select, apply and implement appropriate algorithms for common data-processing problems
- plan and optimize the execution of declarative queries
- design and implement a query processor
- assess fundamental bottlenecks in data management applications and determine how to optimize for them
- solve data processing scalability challenges through scale-up and scale-out techniques
- reason about concurrency control and transactions

Module syllabus

We will assume prior knowledge of the following material/courses:
- 40007: Introduction to Databases
- 40005: Introduction to Computer Architecture
- 50001: Algorithm Design and Analysis

We will cover:
- Data Processing Algorithms
- Data Storage Models
- Data Processing Models
- Query planning and optimization
- Data Indexing
- Concurrency Control
- Scale-Up and Scale-Out Data Processing

Teaching methods

The module will be delivered in a flipped-classroom style. Each week there will be two hours of pre-recorded lectures, one hour of interactive discussion/Q&A, and one hour of unassessed exercises, worksheets or short coding assignments. The module has a practical flavor: we will discuss specific techniques by attempting their implementation (under simplifying assumptions).

An online service will be used as a discussion forum for the module.

Coursework is implementation-focused and team-based, and counts for 30% of the module mark. The remaining 70% comes from a written examination designed to assess both theoretical and practical aspects of the subject.

You will receive detailed feedback on the coursework exercise, including written feedback on your submissions and class-wide feedback explaining common pitfalls and suggestions for improvement.

Module leaders

Dr Holger Pirk

Reading list

To be advised - module reading list in Leganto