Privacy gaps in Apple’s data collection scheme revealed

by Caroline Brogan

[Image: Apple desktop computer, laptop and iPhone]

Imperial researchers have demonstrated how Apple’s use of a widely adopted data protection model could expose individuals to privacy attacks.

By investigating Apple’s use of the model, called local differential privacy (LDP), the researchers found that individuals’ preferred emoji skin tone and political leanings could be inferred from the data the company collects.

Companies collect behavioural data generated by users’ devices at scale to improve apps and services. These data, however, contain fine-grained records and can reveal sensitive information about individual users.

"Our results emphasise the need for more research on how to apply these safeguards effectively in practice to protect users’ data." Dr Yves-Alexandre de Montjoye Department of Computing

Companies such as Apple and Microsoft use LDP to collect user data without learning private information that could be traced back to individuals. However, the new paper, presented at the peer-reviewed USENIX Security Symposium, shows how emoji and website usage records collected under LDP can be used to infer an individual’s preferred emoji skin tone and political affiliations.

The Imperial College London researchers say that this violates the guarantees that LDP is supposed to offer, and that more must be done to protect Apple customers’ data.

Senior author Dr Yves-Alexandre de Montjoye, from the Department of Computing, said: “Learning users’ sensitive information like the emoji skin tone used during chats or the political orientation of their most-visited news websites would constitute a concrete privacy violation.

“Our paper shows that Apple’s current implementation of LDP leaves users’ data vulnerable to privacy attacks. Our results emphasise the need for more research on how to apply these safeguards effectively in practice to protect users’ data.”

Safeguarding concerns

To safeguard user data, Apple uses LDP on its iOS and macOS devices when collecting some types of data. The company says this helps it discover usage patterns across a vast number of users without compromising individual privacy.

LDP works by adding 'noise' to a user's data locally, on the device itself, producing scrambled records that are then sent to Apple’s servers.
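To make the idea concrete, the sketch below shows randomized response, the simplest LDP mechanism. Apple’s deployment actually uses a more elaborate mechanism called Count Mean Sketch (see the paper cited below), so this is an illustration of the principle rather than Apple’s implementation.

```python
import math
import random

def randomized_response(true_bit: int, epsilon: float) -> int:
    """Perturb one bit locally, before it leaves the device.

    With probability e^eps / (e^eps + 1) the true bit is reported;
    otherwise it is flipped. The server only ever sees the noisy bit.
    """
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1)
    return true_bit if random.random() < p_truth else 1 - true_bit

def estimate_frequency(noisy_bits: list[int], epsilon: float) -> float:
    """Server-side: estimate the population-level frequency of the bit
    by correcting for the known flipping probability."""
    p = math.exp(epsilon) / (math.exp(epsilon) + 1)
    observed = sum(noisy_bits) / len(noisy_bits)
    return (observed + p - 1) / (2 * p - 1)
```

The key property is that any single noisy bit is plausibly deniable, while the aggregate over many users still yields an accurate estimate; the privacy parameter epsilon controls this trade-off.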

Dr de Montjoye said: “If implemented strictly, LDP guarantees that anyone collecting noisy records from users — including Apple itself — will never be able to learn anything sensitive about individual users, no matter how hard they try.”

However, questions have previously been raised about how the company chooses to implement LDP and whether it could be attacked in practice.

Now, researchers have found that even these noisy records can reveal sensitive information about individual users, via a new type of attack called pool inference.

"LDP is a powerful technology for collecting data while preserving privacy, but it must be implemented carefully to provide robust guarantees of privacy." Andrea Gadotti Department of Computing

They modelled two attacks, one targeting emoji use and the other website visits, and found that users were vulnerable to both despite current privacy safeguards.

The attacks proved especially effective against users with particularly revealing phone usage: for example, those who visit news websites most often, and those with strong views who tend to visit news websites of a single political orientation.
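The intuition behind a pool inference attack can be illustrated with a toy model. The sketch below is not the attack from the paper: it uses k-ary randomized response instead of Apple’s Count Mean Sketch, and it assumes, purely for illustration, that a user’s true records are drawn uniformly from their preferred 'pool' (for example, emoji variants sharing a skin tone). What it demonstrates is that an attacker who sees many noisy records from the same user can compare likelihoods across pools.

```python
import math
import random

EPSILON = 2.0

def k_rr(true_item: int, k: int, eps: float) -> int:
    """k-ary randomized response: keep the true item with probability
    e^eps / (e^eps + k - 1); otherwise report one of the other k - 1
    items uniformly at random."""
    p_truth = math.exp(eps) / (math.exp(eps) + k - 1)
    if random.random() < p_truth:
        return true_item
    return random.choice([i for i in range(k) if i != true_item])

def pool_inference(noisy_reports, pools, k, eps):
    """Attacker-side: return the pool whose hypothesis 'the user's true
    records come from this pool' best explains the noisy reports."""
    p = math.exp(eps) / (math.exp(eps) + k - 1)  # P[report = true item]
    q = 1.0 / (math.exp(eps) + k - 1)            # P[report = each other item]
    scores = {}
    for name, pool in pools.items():
        log_lik = 0.0
        for y in noisy_reports:
            # Likelihood of report y if the true item is uniform over pool
            pr_y = sum(p if y == x else q for x in pool) / len(pool)
            log_lik += math.log(pr_y)
        scores[name] = log_lik
    return max(scores, key=scores.get)

# Toy experiment: ten items split into five pools of two (think: five
# skin tones, two emoji each). A user devoted to pool "B" sends 200
# noisy records; the attacker usually recovers "B" despite the noise.
pools = {"A": [0, 1], "B": [2, 3], "C": [4, 5], "D": [6, 7], "E": [8, 9]}
true_records = [random.choice(pools["B"]) for _ in range(200)]
noisy = [k_rr(r, 10, EPSILON) for r in true_records]
print(pool_inference(noisy, pools, 10, EPSILON))
```

With only a handful of records the noise dominates, but as the number of noisy records from the same user grows, the likelihood gap between pools widens; this accumulation over a user’s records is the regime the researchers studied.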

The researchers say that Apple must do more to ensure LDP is implemented properly.

Lead author Andrea Gadotti, also of the Department of Computing, said: “LDP is a powerful technology for collecting data while preserving privacy, but it must be implemented carefully to provide robust guarantees of privacy. Currently, Apple’s implementation is vulnerable to attacks which could be used to infer an individual’s political leanings and other sensitive information with potential for abuse and discrimination.

“While Apple does put in place some measures that would, in theory, mitigate our attack, these measures fully rely on trusting that Apple will enforce them, defeating the purpose of using LDP as a technical safeguard that does not rely on trust.”

'Pool Inference Attacks on Local Differential Privacy: Quantifying the Privacy Guarantees of Apple’s Count Mean Sketch in Practice' by Gadotti et al., presented at the USENIX Security Symposium, August 2022.

Main image credit: Shutterstock.


Reporter

Caroline Brogan
Communications Division

Tel: +44 (0)20 7594 3415
Email: caroline.brogan@imperial.ac.uk

