Imperial College London

Professor Alastair Donaldson

Faculty of Engineering, Department of Computing

Professor of Programming Languages
 
 
 

Contact

 

+44 (0)20 7594 8266
alastair.donaldson

 
 

Location

 

422 Huxley Building, South Kensington Campus



Citation

BibTeX format

@inproceedings{Deligiannis:2016,
author = {Deligiannis, P and McCutchen, M and Thomson, P and Chen, S and Donaldson, AF and Erickson, J and Huang, C and Lal, A and Mudduluru, R and Qadeer, S and Schulte, W},
publisher = {USENIX},
title = {Uncovering Bugs in Distributed Storage Systems during Testing (not in Production!)},
url = {http://hdl.handle.net/10044/1/31770},
year = {2016}
}

RIS format (EndNote, RefMan)

TY  - CPAPER
AB - Testing distributed systems is challenging due to multiple sources of nondeterminism. Conventional testing techniques, such as unit, integration and stress testing, are ineffective in preventing serious but subtle bugs from reaching production. Formal techniques, such as TLA+, can only verify high-level specifications of systems at the level of logic-based models, and fall short of checking the actual executable code. In this paper, we present a new methodology for testing distributed systems. Our approach applies advanced systematic testing techniques to thoroughly check that the executable code adheres to its high-level specifications, which significantly improves coverage of important system behaviors. Our methodology has been applied to three distributed storage systems in the Microsoft Azure cloud computing platform. In the process, numerous bugs were identified, reproduced, confirmed and fixed. These bugs required a subtle combination of concurrency and failures, making them extremely difficult to find with conventional testing techniques. An important advantage of our approach is that a bug is uncovered in a small setting and witnessed by a full system trace, which dramatically increases the productivity of debugging.
AU - Deligiannis,P
AU - McCutchen,M
AU - Thomson,P
AU - Chen,S
AU - Donaldson,AF
AU - Erickson,J
AU - Huang,C
AU - Lal,A
AU - Mudduluru,R
AU - Qadeer,S
AU - Schulte,W
PB - USENIX
PY - 2016///
TI - Uncovering Bugs in Distributed Storage Systems during Testing (not in Production!)
UR - http://hdl.handle.net/10044/1/31770
ER -