Imperial College London


Faculty of Engineering, Dyson School of Design Engineering

Reader in Human-Centred Systems



+44 (0)20 7594 2584 | h.haddadi | Website




Dyson Building, South Kensington Campus






98 results found

Amar Y, Haddadi H, Mortier R, Brown A, Colley J, Crabtree A et al., An Analysis of Home IoT Network Traffic and Behaviour

Internet-connected devices are increasingly present in our homes, and privacy breaches, data thefts, and security threats are becoming commonplace. In order to avoid these, we must first understand the behaviour of these devices. In this work, we analyse network traces from a testbed of common IoT devices, and describe general methods for fingerprinting their behavior. We then use the information and insights derived from this data to assess where privacy and security risks manifest themselves, as well as how device behavior affects bandwidth. We demonstrate simple measures that circumvent attempts at securing devices and protecting privacy.

Journal article

Mo F, Shamsabadi AS, Katevas K, Cavallaro A, Haddadi H et al., Towards Characterizing and Limiting Information Exposure in DNN Layers

Pre-trained Deep Neural Network (DNN) models are increasingly used in smartphones and other user devices to enable prediction services, leading to potential disclosures of (sensitive) information from training data captured inside these models. Based on the concept of generalization error, we propose a framework to measure the amount of sensitive information memorized in each layer of a DNN. Our results show that, when considered individually, the last layers encode a larger amount of information from the training data compared to the first layers. We find that, while the neurons of convolutional layers can expose more (sensitive) information than those of fully connected layers, the same DNN architecture trained with different datasets has similar exposure per layer. We evaluate an architecture to protect the most sensitive layers within the memory limits of a Trusted Execution Environment (TEE) against potential white-box membership inference attacks without significant computational overhead.

Working paper

Malekzadeh M, Athanasakis D, Haddadi H, Livshits B et al., Privacy-Preserving Bandits

Contextual bandit algorithms (CBAs) often rely on personal data to provide recommendations. Centralized CBA agents utilize potentially sensitive data from recent interactions to provide personalization to end-users. Keeping the sensitive data local, by running a local agent on the user's device, protects the user's privacy; however, the agent takes longer to produce useful recommendations, as it does not leverage feedback from other users. This paper proposes a technique we call Privacy-Preserving Bandits (P2B): a system that updates local agents by collecting feedback from other local agents in a differentially-private manner. Comparisons of our proposed approach with a non-private, as well as a fully-private (local) system, show competitive performance on both synthetic benchmarks and real-world data. Specifically, we observed only a decrease of 2.6% and 3.6% in multi-label classification accuracy, and a CTR increase of 0.0025 in online advertising for a privacy budget $\epsilon \approx 0.693$. These results suggest P2B is an effective approach to challenges arising in on-device privacy-preserving personalization.
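To give a feel for the size of the privacy budget quoted above: 0.693 is approximately ln 2, so e^ε is about 2, meaning an ε-differentially-private mechanism may make any output at most about twice as likely for one input as for a neighbouring one. The sketch below illustrates this with binary randomized response, a textbook ε-DP mechanism chosen here purely for illustration. It is an assumption of this example and not the mechanism used by P2B, which the abstract does not describe; the function names are hypothetical.

```python
import math
import random


def truth_probability(epsilon: float) -> float:
    """Probability of reporting the true bit under binary randomized response."""
    return math.exp(epsilon) / (1.0 + math.exp(epsilon))


def randomized_response(bit: int, epsilon: float, rng: random.Random) -> int:
    """Report `bit` truthfully with probability e^eps / (1 + e^eps), else flip it."""
    if rng.random() < truth_probability(epsilon):
        return bit
    return 1 - bit


# Illustrative budget only: ln 2 is roughly the 0.693 quoted in the abstract.
epsilon = math.log(2)
p_truth = truth_probability(epsilon)  # 2/3 when epsilon = ln 2
likelihood_ratio = p_truth / (1.0 - p_truth)  # equals e^epsilon, i.e. 2
```

With ε = ln 2, a respondent tells the truth two times out of three, which is the sense in which this budget bounds what any single report reveals.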

Working paper

Katevas K, Bagdasaryan E, Waterman J, Safadieh MM, Birrell E, Haddadi H, Estrin D et al., Policy-Based Federated Learning

We are increasingly surrounded by applications, connected devices, services, and smart environments which require fine-grained access to various personal data. The inherent complexities of our personal and professional policies and preferences in interactions with these analytics services raise important challenges in privacy. Moreover, due to sensitivity of the data and regulatory and technical barriers, it is not always feasible to do these policy negotiations in a centralized manner. In this paper we present PoliFL, a decentralized, edge-based framework for policy-based personal data analytics. PoliFL brings together a number of existing established components to provide privacy-preserving analytics within a distributed setting. We evaluate our framework using a popular exemplar of private analytics, Federated Learning, and demonstrate that for varying model sizes and use cases, PoliFL is able to perform accurate model training and inference within very reasonable resource and time budgets.

Journal article

Mandalari AM, Kolcun R, Haddadi H, Dubois DJ, Choffnes D et al., Towards Automatic Identification and Blocking of Non-Critical IoT Traffic Destinations

The consumer Internet of Things (IoT) space has experienced a significant rise in popularity in recent years. From smart speakers to baby monitors, smart kettles, and TVs, these devices are increasingly found in households around the world, while users may be unaware of the risks associated with owning these devices. Previous work showed that these devices can threaten individuals' privacy and security by exposing information online to a large number of service providers and third party analytics services. Our analysis shows that many of these Internet connections (and the information they expose) are neither critical, nor even essential to the operation of these devices. However, automatically separating out critical from non-critical network traffic for an IoT device is nontrivial, and requires expert analysis based on manual experimentation in a controlled setting. In this paper, we investigate whether it is possible to automatically classify network traffic destinations as either critical (essential for devices to function properly) or not, hence allowing the home gateway to act as a selective firewall to block undesired, non-critical destinations. Our initial results demonstrate that some IoT devices contact destinations that are not critical to their operation, and there is no impact on device functionality if these destinations are blocked. We take the first steps towards designing and evaluating IoTrimmer, a framework for automated testing and analysis of various destinations contacted by devices, and selectively blocking the ones that do not impact device functionality.

Journal article

Aloufi R, Haddadi H, Boyle D, Privacy-preserving Voice Analysis via Disentangled Representations

Voice User Interfaces (VUIs) are increasingly popular and built into smartphones, home assistants, and Internet of Things (IoT) devices. Despite offering an always-on convenient user experience, VUIs raise new security and privacy concerns for their users. In this paper, we focus on attribute inference attacks in the speech domain, demonstrating the potential for an attacker to accurately infer a target user's sensitive and private attributes (e.g. their emotion, sex, or health status) from deep acoustic models. To defend against this class of attacks, we design, implement, and evaluate a user-configurable, privacy-aware framework for optimizing speech-related data sharing mechanisms. Our objective is to enable primary tasks such as speech recognition and user identification, while removing sensitive attributes in the raw speech data before sharing it with a cloud service provider. We leverage disentangled representation learning to explicitly learn independent factors in the raw data. Based on a user's preferences, a supervision signal informs the filtering out of invariant factors while retaining the factors reflected in the selected preference. Our experimental evaluation over five datasets shows that the proposed framework can effectively defend against attribute inference attacks by reducing their success rates to approximately that of guessing at random, while maintaining accuracy in excess of 99% for the tasks of interest. We conclude that negotiable privacy settings enabled by disentangled representations can bring new opportunities for privacy-preserving applications.

Journal article

Malekzadeh M, Clegg RG, Cavallaro A, Haddadi H et al., DANA: Dimension-Adaptive Neural Architecture for Multivariate Sensor Data

Current deep neural architectures for processing sensor data are mainly designed for data coming from a fixed set of sensors, with a fixed sampling rate. Changing the dimensions of the input data causes considerable accuracy loss, unnecessary computations, or application failures. To address this problem, we introduce a dimension-adaptive pooling (DAP) layer that makes deep architectures robust to temporal changes in sampling rate and in sensor availability. DAP operates on convolutional filter maps of variable dimensions and produces an input of fixed dimensions suitable for feedforward and recurrent layers. Building on this architectural improvement, we propose a dimension-adaptive training (DAT) procedure to generalize over the entire space of feasible data dimensions at inference time. DAT comprises the random selection of dimensions during the forward passes and optimization with accumulated gradients of several backward passes. We then combine DAP and DAT to transform existing non-adaptive deep architectures into a Dimension-Adaptive Neural Architecture (DANA) without altering other architectural aspects. Our solution does not need up-sampling or imputation, and thus reduces unnecessary computations at inference time. Experimental results on public datasets show that DANA prevents losses in classification accuracy of the state-of-the-art deep architectures, under dynamic sensor availability and varying sampling rates.
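The idea of mapping variable-sized feature maps to a fixed-size output can be illustrated with generic adaptive average pooling in one dimension. The sketch below is an assumption made for illustration, not the paper's DAP layer, whose details the abstract does not give; the function name is hypothetical. It splits an input of any length into a fixed number of near-equal windows and averages each one.

```python
def adaptive_avg_pool_1d(values, out_size):
    """Average-pool a sequence of any length down (or up) to `out_size` entries.

    Window i covers indices floor(i*n/out_size) to ceil((i+1)*n/out_size),
    so every window is non-empty and together they cover the whole input.
    """
    n = len(values)
    if n == 0 or out_size <= 0:
        raise ValueError("need a non-empty input and a positive output size")
    pooled = []
    for i in range(out_size):
        start = (i * n) // out_size
        end = ((i + 1) * n + out_size - 1) // out_size  # integer ceiling
        window = values[start:end]
        pooled.append(sum(window) / len(window))
    return pooled


# Two hypothetical sensor streams sampled at different rates
# both end up with the same fixed output length.
slow = adaptive_avg_pool_1d([1, 2, 3, 4, 5, 6], 3)
fast = adaptive_avg_pool_1d(list(range(50)), 3)
```

Because the output length depends only on `out_size`, downstream layers that expect a fixed input size can stay unchanged when the sampling rate varies.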

Journal article

Mo F, Borovykh A, Malekzadeh M, Haddadi H, Demetriou S et al., Layer-wise Characterization of Latent Information Leakage in Federated Learning

Training a deep neural network (DNN) via federated learning allows participants to share model updates (gradients), instead of the data itself. However, recent studies show that unintended latent information (e.g. gender or race) carried by the gradients can be discovered by attackers, compromising the promised privacy guarantee of federated learning. Existing privacy-preserving techniques (e.g. differential privacy) either have limited defensive capacity against the potential attacks, or suffer from considerable model utility loss. Moreover, characterizing the latent information carried by the gradients and the consequent privacy leakage has been a major theoretical and practical challenge. In this paper, we propose two new metrics to address these challenges: the empirical $\mathcal{V}$-information, a theoretically grounded notion of information which measures the amount of gradient information that is usable for an attacker, and a sensitivity analysis that utilizes the Jacobian matrix to measure the amount of change in the gradients with respect to latent information, which further quantifies privacy risk. We show that these metrics can localize the private information in each layer of a DNN and quantify the leakage depending on how sensitive the gradients are with respect to the latent information. As a practical application, we design LatenTZ: a federated learning framework that lets the most sensitive layers run in the clients' Trusted Execution Environments (TEE). The implementation evaluation of LatenTZ shows that TEE-based approaches are promising for defending against powerful property inference attacks without a significant overhead in the clients' computing resources or a trade-off in the model's utility.

Journal article

This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.
