
New AI Model Can Help Prevent Damaging and Costly Data Breaches

Privacy researchers at Imperial College London have developed an AI algorithm that automatically tests privacy-preserving systems for potential data leaks. It is the first time AI has been used to automatically find vulnerabilities in this type of system, examples of which are used by Google Maps and Facebook.

The experts, from Imperial’s Computational Privacy Group, studied attacks on query-based systems (QBS): controlled interfaces through which analysts can query data to extract useful aggregate information about the world. They then developed a new AI-enabled method called QuerySnout to discover attacks on QBS.

QBS give analysts access to collections of statistics compiled from individual-level data such as location and demographics. They are currently used in Google Maps to show live information on how busy an area is, and in Facebook’s audience measurement feature to estimate the size of the audience in a particular location or demographic to help with advertising campaigns.

In their new study, published as part of the 29th ACM Conference on Computer and Communications Security, the team, including the Data Science Institute’s Ana-Maria Cretu, Dr. Florimond Houssiau, Dr. Antoine Cully and Dr. Yves-Alexandre de Montjoye, found that powerful and accurate attacks against QBS can easily be discovered automatically at the press of a button.

According to Senior Author Dr. Yves-Alexandre de Montjoye: “Attacks have so far been manually developed using highly skilled expertise. This means it was taking a long time for vulnerabilities to be discovered, which leaves systems at risk.”

“QuerySnout is already outperforming humans at discovering vulnerabilities in real-world systems.”

The need for query-based systems

Over the past decade, our capacity to collect and store data has grown enormously. Because the majority of this data is personal, its use raises substantial privacy concerns, which are addressed by legislation such as the EU’s General Data Protection Regulation.

Therefore, a topical and important question for data scientists and privacy specialists is how to allow data to be used for good while preserving our fundamental right to privacy.

QBS may make it possible to perform anonymous data analysis at scale while protecting privacy. In QBS, curators maintain authority over the data and can therefore analyze and scrutinize queries sent by analysts to guarantee that any returned information does not include personally identifiable information.

However, malicious attackers can circumvent such systems by crafting queries that exploit system flaws or implementation errors to infer personal information about specific people.
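To make this concrete, here is a minimal sketch of one well-known attack of this kind, the difference attack, against a toy QBS. Everything in it is an assumption made for illustration: the records, the `answer` function, and the minimum query-set-size defence are invented and do not describe any real system or the systems tested in the study.

```python
# Toy records; every value here is invented for illustration.
RECORDS = [
    {"postcode": "A1", "age": 34, "sensitive": 1},  # the target
    {"postcode": "A1", "age": 51, "sensitive": 0},
    {"postcode": "A1", "age": 29, "sensitive": 1},
    {"postcode": "A1", "age": 62, "sensitive": 0},
    {"postcode": "A1", "age": 45, "sensitive": 1},
    {"postcode": "B2", "age": 34, "sensitive": 0},
    {"postcode": "B2", "age": 58, "sensitive": 1},
]

# Hypothetical defence: refuse to answer when too few people match a query.
MIN_QUERY_SET_SIZE = 3


def answer(predicate):
    """Toy QBS: count matching people with the sensitive attribute, or None if suppressed."""
    matched = [r for r in RECORDS if predicate(r)]
    if len(matched) < MIN_QUERY_SET_SIZE:
        return None
    return sum(r["sensitive"] for r in matched)


def difference_attack(postcode, age):
    """Infer the target's sensitive bit from two queries that each look harmless."""
    with_target = answer(lambda r: r["postcode"] == postcode)
    without_target = answer(
        lambda r: r["postcode"] == postcode and r["age"] != age
    )
    if with_target is None or without_target is None:
        return None  # the defence blocked at least one of the two queries
    return with_target - without_target


# Asking about the target directly is suppressed, but the difference leaks the value.
direct = answer(lambda r: r["postcode"] == "A1" and r["age"] == 34)
inferred = difference_attack("A1", 34)
print(direct, inferred)  # None 1
```

Each query on its own matches enough people to pass the size check, yet subtracting the two answers isolates a single person. Real systems layer several defences, which is why finding working attacks by hand is slow.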

Testing the system

The development and deployment of QBS have been stalled by the risk of unknown, powerful “zero-day” attacks, in which attackers exploit vulnerabilities in systems.

Data breach attacks can be simulated to detect information leaks and identify potential vulnerabilities, testing the robustness of these systems in much the same way as penetration testing does in cyber-security.

However, manually designing and implementing these attacks against complex QBS is a difficult and lengthy process.

Accordingly, the researchers assert that reducing the possibility of powerful unmitigated attacks is crucial for enabling QBS to be implemented safely and effectively while upholding the privacy rights of individuals.

QuerySnout

The Imperial team developed a new AI-enabled method called QuerySnout, which works by learning which queries to ask the system in order to obtain answers. It then learns to combine those answers automatically to detect potential privacy vulnerabilities.

Using machine learning, the model can construct an attack consisting of a set of queries whose answers are combined to reveal a specific piece of sensitive information. The process is fully automated and relies on a technique known as “evolutionary search”, which enables QuerySnout to discover the right sets of queries to ask.
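The general idea of evolutionary search over sets of queries can be sketched in a few lines. This is a heavily simplified illustration, not the authors’ implementation: the toy system `qbs_answer`, the query encoding, the assumption that the target’s record is unique, and all parameter values are hypothetical. In particular, the rule for combining answers is reduced here to a signed sum that is evolved together with the queries, whereas the article describes QuerySnout as learning how to combine answers.

```python
import random

OPS = ("any", "eq", "neq")  # per-attribute condition, relative to the target's value
NUM_ATTRS, NUM_VALUES, NUM_OTHERS, MIN_USERS = 2, 5, 40, 3  # hypothetical settings


def make_dataset(rng, target_attrs):
    """Simulate one dataset: the target with a random secret bit, plus other users."""
    secret = rng.randint(0, 1)
    records = [(target_attrs, secret)]
    while len(records) < NUM_OTHERS + 1:
        attrs = tuple(rng.randrange(NUM_VALUES) for _ in range(NUM_ATTRS))
        if attrs != target_attrs:  # assumption: the target's record is unique
            records.append((attrs, rng.randint(0, 1)))
    return records, secret


def qbs_answer(records, query, target_attrs):
    """Toy black-box QBS: count matching users with secret bit 1; small sets return 0."""
    def matches(attrs):
        for op, value, target_value in zip(query, attrs, target_attrs):
            if op == "eq" and value != target_value:
                return False
            if op == "neq" and value == target_value:
                return False
        return True

    matched = [bit for attrs, bit in records if matches(attrs)]
    return sum(matched) if len(matched) >= MIN_USERS else 0


def fitness(attack, datasets, target_attrs):
    """Fraction of simulated datasets on which the attack guesses the secret bit."""
    correct = 0
    for records, secret in datasets:
        score = sum(w * qbs_answer(records, q, target_attrs) for q, w in attack)
        correct += int((score > 0) == bool(secret))
    return correct / len(datasets)


def random_gene(rng):
    """One (query, weight) pair; the weight says whether to add or subtract the answer."""
    return (tuple(rng.choice(OPS) for _ in range(NUM_ATTRS)), rng.choice((-1, 1)))


def mutate(attack, rng):
    """Copy an attack and replace one of its (query, weight) pairs at random."""
    child = list(attack)
    child[rng.randrange(len(child))] = random_gene(rng)
    return child


def evolve(datasets, target_attrs, rng, pop_size=20, elite=5, generations=10, genes=2):
    """Keep the best attacks each generation and refill with mutated copies of them."""
    def score(attack):
        return fitness(attack, datasets, target_attrs)

    population = [[random_gene(rng) for _ in range(genes)] for _ in range(pop_size)]
    initial_best = max(score(a) for a in population)
    for _ in range(generations):
        population.sort(key=score, reverse=True)
        elites = population[:elite]
        children = [mutate(rng.choice(elites), rng) for _ in range(pop_size - elite)]
        population = elites + children
    best = max(population, key=score)
    return best, score(best), initial_best


rng = random.Random(0)
TARGET = (1, 3)  # the target's known attribute values (invented)
DATASETS = [make_dataset(rng, TARGET) for _ in range(100)]
best_attack, best_fitness, initial_best = evolve(DATASETS, TARGET, rng)
print(best_attack, best_fitness)
```

The search only ever calls `qbs_answer` and observes its outputs, so it needs no knowledge of how the toy system decides what to suppress. Because the best attacks are carried over unchanged each generation, the final fitness can never be lower than that of the best random starting attack.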

This takes place in a ‘black-box setting’, which means the AI only needs access to the system and does not need to know how the system works in order to detect vulnerabilities.

Co-first Author Ana-Maria Cretu said: “We demonstrate that QuerySnout finds more powerful attacks than those currently known on real-world systems. This means our AI model is better than humans at finding these attacks.”

Next steps

At present, QuerySnout tests only a small number of functionalities. According to Dr. de Montjoye: “The main challenge moving forward will be to scale the search to a much larger number of functionalities to make sure it discovers even the most advanced attacks.”

Even so, the model enables analysts to test the resistance of QBS to various kinds of attackers. The creation of QuerySnout is a significant advancement in protecting people’s privacy when it comes to query-based systems.

“QuerySnout: Automating the Discovery of Attribute Inference Attacks against Query-Based Systems” by A. M. Cretu, F. Houssiau, A. Cully, and Y. A. de Montjoye, published on 7 November 2022 in Proceedings of the 2022 ACM SIGSAC Conference on Computer and Communications Security.
