Aloufi R, Haddadi H, Boyle D, 2020, Privacy-preserving Voice Analysis via Disentangled Representations, Proceedings of the 2020 ACM SIGSAC Conference on Cloud Computing Security Workshop
Saidi SJ, Mandalari AM, Kolcun R, et al., 2020, A haystack full of needles: scalable detection of IoT devices in the wild, Publisher: arXiv
Consumer Internet of Things (IoT) devices are extremely popular, providing users with rich and diverse functionalities, from voice assistants to home appliances. These functionalities often come with significant privacy and security risks, with notable recent large scale coordinated global attacks disrupting large service providers. Thus, an important first step to address these risks is to know what IoT devices are where in a network. While some limited solutions exist, a key question is whether device discovery can be done by Internet service providers that only see sampled flow statistics. In particular, it is challenging for an ISP to efficiently and effectively track and trace activity from IoT devices deployed by its millions of subscribers, all with sampled network data. In this paper, we develop and evaluate a scalable methodology to accurately detect and monitor IoT devices at subscriber lines with limited, highly sampled data in-the-wild. Our findings indicate that millions of IoT devices are detectable and identifiable within hours, both at a major ISP as well as an IXP, using passive, sparsely sampled network flow headers. Our methodology is able to detect devices from more than 77% of the studied IoT manufacturers, including popular devices such as smart speakers. While our methodology is effective for providing network analytics, it also highlights significant privacy consequences.
Siracusano G, Galea S, Sanvito D, et al., 2020, Running neural networks on the NIC, Publisher: arXiv
In this paper we show that the data plane of commodity programmable Network Interface Cards (NICs) can run neural network inference tasks required by packet monitoring applications, with low overhead. This is particularly important as the data transfer costs to the host system and dedicated machine learning accelerators, e.g., GPUs, can be more expensive than the processing task itself. We design and implement our system, N3IC, on two different NICs and we show that it can greatly benefit three different network monitoring use cases that require machine learning inference as a first-class primitive. N3IC can perform inference for millions of network flows per second, while forwarding traffic at 40Gb/s. Compared to an equivalent solution implemented on a general purpose CPU, N3IC can provide 100x lower processing latency, with a 1.5x increase in throughput.
Osia SA, Shahin Shamsabadi A, Sajadmanesh S, et al., 2020, A Hybrid Deep Learning Architecture for Privacy-Preserving Mobile Analytics, IEEE INTERNET OF THINGS JOURNAL, Vol: 7, Pages: 4505-4518, ISSN: 2327-4662
Zhao Y, Haddadi H, Skillman S, et al., 2020, Privacy-preserving activity and health monitoring on databox, Pages: 49-54
© 2020 ACM. Activity recognition using deep learning and sensor data can help monitor activities and health conditions of people who need assistance in their daily lives. Deep Neural Network (DNN) models to infer the activities require data collected by in-home sensory devices. These data are often sent to a centralised cloud to be used for training the model. Centralising the data introduces privacy risks. The collected data contain sensitive information about the subjects. The cloud-based approach increases the risk that the data be stored and reused for other purposes without the owner's control. We propose a system that uses edge devices to implement activity and health monitoring locally and applies federated learning to facilitate the training process. The devices use the Databox platform to manage sensor data collected in people's homes, conduct activity recognition locally, and collaboratively train a DNN model without transferring the collected data into the cloud. We illustrate the applicability of the processing time of activity recognition on edge devices. We use a hierarchical model in which a global model is generated in the cloud, without requiring the raw data, and local models are trained on edge devices. The activity inference accuracy of the global model converges to a sufficient level after a few rounds of communication between edge devices and the cloud.
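The abstract above describes a global model built in the cloud from locally trained models, without the raw data leaving the edge devices. A minimal sketch of one aggregation round follows. It assumes a FedAvg-style average weighted by each device's sample count, which the abstract does not specify, and the function name `federated_average` is hypothetical.

```python
from typing import List, Sequence


def federated_average(
    local_weights: Sequence[Sequence[float]],
    sample_counts: Sequence[int],
) -> List[float]:
    """Combine per-device weight vectors into one global weight vector.

    Assumption: each device contributes in proportion to the number of
    local samples it trained on (FedAvg-style weighting).
    """
    if not local_weights or len(local_weights) != len(sample_counts):
        raise ValueError("need one sample count per non-empty weight vector")
    total = sum(sample_counts)
    dim = len(local_weights[0])
    # Only the weights are aggregated; raw sensor data never appears here.
    return [
        sum(w[i] * n for w, n in zip(local_weights, sample_counts)) / total
        for i in range(dim)
    ]
```

In such a scheme the cloud would send the averaged vector back to the devices for the next round of local training.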
Lisi E, Malekzadeh M, Haddadi H, et al., 2020, Modelling and forecasting art movements with CGANs, Publisher: ROYAL SOC
Shamsabadi AS, Gascon A, Haddadi H, et al., 2020, PrivEdge: from local to distributed private training and prediction, Publisher: arXiv
Machine Learning as a Service (MLaaS) operators provide model training and prediction on the cloud. MLaaS applications often rely on centralised collection and aggregation of user data, which could lead to significant privacy concerns when dealing with sensitive personal data. To address this problem, we propose PrivEdge, a technique for privacy-preserving MLaaS that safeguards the privacy of users who provide their data for training, as well as users who use the prediction service. With PrivEdge, each user independently uses their private data to locally train a one-class reconstructive adversarial network that succinctly represents their training data. As sending the model parameters to the service provider in the clear would reveal private information, PrivEdge secret-shares the parameters among two non-colluding MLaaS providers, to then provide cryptographically private prediction services through secure multi-party computation techniques. We quantify the benefits of PrivEdge and compare its performance with state-of-the-art centralised architectures on three privacy-sensitive image-based tasks: individual identification, writer identification, and handwritten letter recognition. Experimental results show that PrivEdge has high precision and recall in preserving privacy, as well as in distinguishing between private and non-private images. Moreover, we show the robustness of PrivEdge to image compression and biased training data. The source code is available at https://github.com/smartcameras/PrivEdge.
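To illustrate what secret-sharing parameters between two non-colluding providers can look like, here is a minimal sketch of two-party additive secret sharing. The scheme, the fixed-point encoding, and every name in it (`MODULUS`, `SCALE`, `share`, `reconstruct`) are assumptions chosen for illustration; the abstract does not state which sharing scheme PrivEdge uses.

```python
import secrets
from typing import List, Tuple

MODULUS = 2**64  # ring size for the shares (assumed)
SCALE = 2**16    # fixed-point scaling factor for real-valued parameters (assumed)


def encode(x: float) -> int:
    """Map a real parameter to a fixed-point element of the ring."""
    return round(x * SCALE) % MODULUS


def decode(v: int) -> float:
    """Map a ring element back to a real number (upper half is negative)."""
    if v >= MODULUS // 2:
        v -= MODULUS
    return v / SCALE


def share(params: List[float]) -> Tuple[List[int], List[int]]:
    """Split each parameter into two shares that sum to it modulo MODULUS."""
    share0 = [secrets.randbelow(MODULUS) for _ in params]
    share1 = [(encode(p) - r) % MODULUS for p, r in zip(params, share0)]
    return share0, share1


def reconstruct(share0: List[int], share1: List[int]) -> List[float]:
    """Recombine both shares; either share alone is uniformly random."""
    return [decode((a + b) % MODULUS) for a, b in zip(share0, share1)]
```

Each provider would hold one of the two lists; neither list on its own reveals the parameters, so privacy rests on the two providers not colluding.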
Mo F, Shamsabadi AS, Katevas K, et al., 2020, DarkneTZ: towards model privacy at the edge using trusted execution environments, Publisher: arXiv
We present DarkneTZ, a framework that uses an edge device's Trusted Execution Environment (TEE) in conjunction with model partitioning to limit the attack surface against Deep Neural Networks (DNNs). Increasingly, edge devices (smartphones and consumer IoT devices) are equipped with pre-trained DNNs for a variety of applications. This trend comes with privacy risks as models can leak information about their training data through effective membership inference attacks (MIAs). We evaluate the performance of DarkneTZ, including CPU execution time, memory usage, and accurate power consumption, using two small and six large image classification models. Due to the limited memory of the edge device's TEE, we partition model layers into more sensitive layers (to be executed inside the device TEE), and a set of layers to be executed in the untrusted part of the operating system. Our results show that even if a single layer is hidden, we can provide reliable model privacy and defend against state-of-the-art MIAs, with only 3% performance overhead. When fully utilizing the TEE, DarkneTZ provides model protections with up to 10% overhead.
Conditional generative adversarial networks (CGANs) are a recent and popular method for generating samples from a probability distribution conditioned on latent information. The latent information often comes in the form of a discrete label from a small set. We propose a novel method for training CGANs which allows us to condition on a sequence of continuous latent distributions f(1), …, f(K). This training allows CGANs to generate samples from a sequence of distributions. We apply our method to paintings from a sequence of artistic movements, where each movement is considered to be its own distribution. Exploiting the temporal aspect of the data, a vector autoregressive (VAR) model is fitted to the means of the latent distributions that we learn, and used for one-step-ahead forecasting, to predict the latent distribution of a future art movement f(K+1). Realizations from this distribution can be used by the CGAN to generate ‘future’ paintings. In experiments, this novel methodology generates accurate predictions of the evolution of art. The training set consists of a large dataset of past paintings. While there is no agreement on exactly what current art period we find ourselves in, we test on plausible candidate sets of present art, and show that the mean distance to our predictions is small.
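The one-step-ahead forecast mentioned in the abstract above can be sketched for the simplest case. The sketch assumes a first-order VAR whose intercept `c` and coefficient matrix `A` have already been estimated, so that the predicted mean is c + A·μ_K. The model order and the function name `var1_forecast` are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Sequence


def var1_forecast(
    A: Sequence[Sequence[float]],
    c: Sequence[float],
    last_mean: Sequence[float],
) -> List[float]:
    """Predict the next latent mean from the most recent one.

    Computes c + A @ last_mean, i.e. one step of a VAR(1) model with
    previously fitted coefficients (the fitting itself is not shown).
    """
    return [
        c_i + sum(a * x for a, x in zip(row, last_mean))
        for row, c_i in zip(A, c)
    ]
```

The predicted mean would then parameterise the latent distribution f(K+1) from which the CGAN draws its conditioning input.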
Malekzadeh M, Clegg RG, Cavallaro A, et al., 2020, Privacy and utility preserving sensor-data transformations, Pervasive and Mobile Computing, Vol: 63, Pages: 1-13, ISSN: 1574-1192
Sensitive inferences and user re-identification are major threats to privacy when raw sensor data from wearable or portable devices are shared with cloud-assisted applications. To mitigate these threats, we propose mechanisms to transform sensor data before sharing them with applications running on users' devices. These transformations aim at eliminating patterns that can be used for user re-identification or for inferring potentially sensitive activities, while introducing a minor utility loss for the target application (or task). We show that, on gesture and activity recognition tasks, we can prevent inference of potentially sensitive activities while keeping the reduction in recognition accuracy of non-sensitive activities to less than 5 percentage points. We also show that we can reduce the accuracy of user re-identification and of the potential inference of gender to the level of a random guess, while keeping the accuracy of activity recognition comparable to that obtained on the original data.
Ren J, Dubois DJ, Choffnes D, et al., 2019, Information exposure from consumer IoT devices: a multidimensional, network-informed measurement approach, ACM Internet Measurement Conference (IMC), Publisher: ASSOC COMPUTING MACHINERY, Pages: 267-279
Internet of Things (IoT) devices are increasingly found in everyday homes, providing useful functionality for devices such as TVs, smart speakers, and video doorbells. Along with their benefits come potential privacy risks, since these devices can communicate information about their users to other parties over the Internet. However, understanding these risks in depth and at scale is difficult due to heterogeneity in devices' user interfaces, protocols, and functionality. In this work, we conduct a multidimensional analysis of information exposure from 81 devices located in labs in the US and UK. Through a total of 34,586 rigorous automated and manual controlled experiments, we characterize information exposure in terms of destinations of Internet traffic, whether the contents of communication are protected by encryption, what are the IoT-device interactions that can be inferred from such content, and whether there are unexpected exposures of private and/or sensitive information (e.g., video surreptitiously transmitted by a recording device). We highlight regional differences between these results, potentially due to different privacy regulations in the US and UK. Last, we compare our controlled experiments with data gathered from an in situ user study comprising 36 participants.
Aloufi R, Haddadi H, Boyle D, 2019, Emotion filtering at the edge, Publisher: arXiv
Voice controlled devices and services have become very popular in the consumer IoT. Cloud-based speech analysis services extract information from voice inputs using speech recognition techniques. Service providers can thus build very accurate profiles of users' demographic categories, personal preferences, emotional states, etc., and may therefore significantly compromise their privacy. To address this problem, we have developed a privacy-preserving intermediate layer between users and cloud services to sanitize voice input directly at edge devices. We use CycleGAN-based speech conversion to remove sensitive information from raw voice input signals before regenerating neutralized signals for forwarding. We implement and evaluate our emotion filtering approach using a relatively cheap Raspberry Pi 4, and show that performance accuracy is not compromised at the edge. In fact, signals generated at the edge differ only slightly (~0.16%) from cloud-based approaches for speech recognition. Experimental evaluation of generated signals shows that identification of the emotional state of a speaker can be reduced by ~91%.
Aloufi R, Haddadi H, Boyle D, 2019, Emotionless: privacy-preserving speech analysis for voice assistants, Publisher: arXiv
Voice-enabled interactions provide more human-like experiences in many popular IoT systems. Cloud-based speech analysis services extract useful information from voice input using speech recognition techniques. The voice signal is a rich resource that discloses several possible states of a speaker, such as emotional state, confidence and stress levels, physical condition, age, gender, and personal traits. Service providers can build a very accurate profile of a user's demographic category and personal preferences, and may compromise privacy. To address this problem, a privacy-preserving intermediate layer between users and cloud services is proposed to sanitize the voice input. It aims to maintain utility while preserving user privacy. It achieves this by collecting real-time speech data and analyzing the signal to ensure privacy protection prior to sharing this data with service providers. Specifically, the sensitive representations are extracted from the raw signal using transformation functions and then wrapped via voice conversion technology. Experimental evaluation based on emotion recognition, used to assess the efficacy of the proposed method, shows that identification of the sensitive emotional state of the speaker is reduced by ~96%.
Malekzadeh M, Clegg RG, Cavallaro A, et al., 2019, Mobile sensor data anonymization, ACM/IEEE International Conference on Internet of Things Design and Implementation (IoTDI 2019), Publisher: ACM, Pages: 49-58
Data from motion sensors such as accelerometers and gyroscopes embedded in our devices can reveal secondary undesired, private information about our activities. This information can be used for malicious purposes such as user identification by application developers. To address this problem, we propose a data transformation mechanism that enables a device to share data for specific applications (e.g. monitoring their daily activities) without revealing private user information (e.g. user identity). We formulate this anonymization process based on an information theoretic approach and propose a new multi-objective loss function for training convolutional auto-encoders (CAEs) to provide a practical approximation to our anonymization problem. This effective loss function forces the transformed data to minimize the information about the user's identity, as well as the data distortion to preserve application-specific utility. Our training process regulates the encoder to disregard user-identifiable patterns and tunes the decoder to shape the final output independently of users in the training set. Then, a trained CAE can be deployed on a user's mobile device to anonymize sensor data before sharing with an app, even for users who are not included in the training dataset. The results, on a dataset of 24 users for activity recognition, show a promising trade-off on transformed data between utility and privacy, with an accuracy for activity recognition over 92%, while reducing the chance of identifying a user to less than 7%.
Moore J, Arcia-Moret A, Yadav P, et al., 2019, Zest: REST over ZeroMQ, 2019 IEEE INTERNATIONAL CONFERENCE ON PERVASIVE COMPUTING AND COMMUNICATIONS WORKSHOPS (PERCOM WORKSHOPS), Pages: 1015-1019, ISSN: 2474-2503
Katevas K, Hansel K, Clegg R, et al., 2019, Finding Dory in the Crowd: Detecting Social Interactions using Multi-Modal Mobile Sensing, SENSYS-ML'19: PROCEEDINGS OF THE FIRST WORKSHOP ON MACHINE LEARNING ON EDGE IN SENSOR SYSTEMS, Pages: 37-42
Mo F, Shamsabadi AS, Katevas K, et al., 2019, Poster: Towards Characterizing and Limiting Information Exposure in DNN Layers, ACM SIGSAC Conference on Computer and Communications Security (CCS), Publisher: ASSOC COMPUTING MACHINERY, Pages: 2653-2655
Zhan Y, Haddadi H, Zhan Y, et al., 2019, Activity Prediction for Improving Well-Being of Both The Elderly and Caregivers, ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp) / ACM International Symposium on Wearable Computers (ISWC), Publisher: ASSOC COMPUTING MACHINERY, Pages: 1214-1217
Zhan Y, Haddadi H, Zhan Y, et al., 2019, Activity Prediction for Mapping Contextual-Temporal Dynamics, ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp) / ACM International Symposium on Wearable Computers (ISWC), Publisher: ASSOC COMPUTING MACHINERY, Pages: 246-249
Zhan Y, Haddadi H, Zhan Y, et al., 2019, Towards Automating Smart Homes: Contextual and Temporal Dynamics of Activity Prediction, ACM International Joint Conference on Pervasive and Ubiquitous Computing (UbiComp) / ACM International Symposium on Wearable Computers (ISWC), Publisher: ASSOC COMPUTING MACHINERY, Pages: 413-417
Varvello M, Katevas K, Plesa M, et al., 2019, BatteryLab, A Distributed Power Monitoring Platform For Mobile Devices, 18th ACM Workshop on Hot Topics in Networks (HotNets), Publisher: ASSOC COMPUTING MACHINERY, Pages: 101-108
Osia SA, Rassouli B, Haddadi H, et al., 2019, Privacy Against Brute-Force Inference Attacks, Publisher: IEEE
Zhang C, Patras P, Haddadi H, 2019, Deep Learning in Mobile and Wireless Networking: A Survey, IEEE COMMUNICATIONS SURVEYS AND TUTORIALS, Vol: 21, Pages: 2224-2287
Osia SA, Taheri A, Shamsabadi AS, et al., 2018, Deep Private-Feature Extraction, IEEE Transactions on Knowledge and Data Engineering, ISSN: 1041-4347
We present and evaluate Deep Private-Feature Extractor (DPFE), a deep model which is trained and evaluated based on information theoretic constraints. Using the selective exchange of information between a user's device and a service provider, DPFE enables the user to prevent certain sensitive information from being shared with a service provider, while allowing them to extract approved information using their model. We introduce and utilize the log-rank privacy, a novel measure to assess the effectiveness of DPFE in removing sensitive information and compare different models based on their accuracy-privacy tradeoff. We then implement and evaluate the performance of DPFE on smartphones to understand its complexity, resource demands, and efficiency tradeoffs. Our results on benchmark image datasets demonstrate that under moderate resource utilization, DPFE can achieve high accuracy for primary tasks while preserving the privacy of sensitive features.
Servia-Rodriguez S, Wang L, Zhao JR, et al., 2018, Privacy-preserving personal model training, Proceedings - ACM/IEEE International Conference on Internet of Things Design and Implementation, IoTDI 2018, Pages: 153-164
© 2018 IEEE. Many current Internet services rely on inferences from models trained on user data. Commonly, both the training and inference tasks are carried out using cloud resources fed by personal data collected at scale from users. Holding and using such large collections of personal data in the cloud creates privacy risks to the data subjects, but is currently required for users to benefit from such services. We explore how to provide for model training and inference in a system where computation is pushed to the data in preference to moving data to the cloud, obviating many current privacy risks. Specifically, we take an initial model learnt from a small set of users and retrain it locally using data from a single user. We evaluate on two tasks: one supervised learning task, using a neural network to recognise users' current activity from accelerometer traces; and one unsupervised learning task, identifying topics in a large set of documents. In both cases the accuracy is improved. We also analyse the robustness of our approach against adversarial attacks, as well as its feasibility by presenting a performance evaluation on a representative resource-constrained device (a Raspberry Pi).
Osia SA, Shamsabadi AS, Taheri A, et al., 2018, Private and scalable personal data analytics using hybrid edge-to-cloud deep learning, Computer, Vol: 51, Pages: 42-49, ISSN: 0018-9162
Although the ability to collect, collate, and analyze the vast amount of data generated from cyber-physical systems and Internet of Things devices can be beneficial to both users and industry, this process has led to a number of challenges, including privacy and scalability issues. The authors present a hybrid framework where user-centered edge devices and resources can complement the cloud for providing privacy-aware, accurate, and efficient analytics.
Hänsel K, Poguntke R, Haddadi H, et al., 2018, What to put on the user: Sensing technologies for studies and physiology aware systems, ACM Conference on Human Factors in Computing Systems (ACM CHI’18), Publisher: ACM
Fitness trackers not only provide an easy means to acquire physiological data in real-world environments thanks to affordable sensing technologies, they also offer opportunities for physiology-aware applications and studies in HCI; however, their performance is not well understood. In this paper, we report findings on the quality of 3 sensing technologies: PPG-based wrist trackers (Apple Watch, Microsoft Band 2), an ECG-belt (Polar H7) and a reference device with stick-on ECG electrodes (Nexus 10). We collected physiological (heart rate, electrodermal activity, skin temperature) and subjective data from 21 participants performing combinations of physical activity and stressful tasks. Our empirical research indicates that wrist devices provide a good sensing performance in stationary settings. However, they lack accuracy when participants are mobile or if tasks require physical activity. Based on our findings, we suggest a Design Space for Wearables in Research Settings and reflect on the appropriateness of the investigated technologies in research contexts.
Chamberlain A, Crabtree A, Haddadi H, et al., 2018, Special theme on privacy and the Internet of things, PERSONAL AND UBIQUITOUS COMPUTING, Vol: 22, Pages: 289-292, ISSN: 1617-4909
Crabtree A, Lodge T, Colley J, et al., 2018, Building accountability into the Internet of Things: the IoT Databox model, Journal of Reliable Intelligent Environments, Vol: 4, Pages: 39-55, ISSN: 2199-4668
This paper outlines the IoT Databox model as a means of making the Internet of Things (IoT) accountable to individuals. Accountability is a key to building consumer trust and is mandated by the European Union’s general data protection regulation (GDPR). We focus here on the ‘external’ data subject accountability requirement specified by GDPR and how meeting this requirement turns on surfacing the invisible actions and interactions of connected devices and the social arrangements in which they are embedded. The IoT Databox model is proposed as an in principle means of enabling accountability and providing individuals with the mechanisms needed to build trust into the IoT.
Malekzadeh M, Clegg RG, Haddadi H, 2018, Replacement AutoEncoder: A Privacy-Preserving Algorithm for Sensory Data Analysis, 2018 IEEE/ACM Third International Conference on Internet-of-Things Design and Implementation (IoTDI)
This data is extracted from the Web of Science and reproduced under a licence from Thomson Reuters. You may not copy or re-distribute this data in whole or in part without the written consent of the Science business of Thomson Reuters.