AI models that combine machine learning and natural language processing can be trained to detect potential data breaches in real time. They analyze large volumes of data, identify patterns and anomalies, and flag potential security threats, which reduces the risk of breaches and helps protect sensitive information. Because AI-powered security systems can keep learning as new threats emerge, they are a valuable tool for preventing damaging and costly data breaches.
Imperial privacy experts have developed an AI algorithm that detects potential data leaks in privacy-preserving systems. This is the first time AI has been used to automatically detect vulnerabilities in this type of system, examples of which are used in Google Maps and Facebook.
The experts from Imperial’s Computational Privacy Group investigated attacks on query-based systems (QBS), which are controlled interfaces that allow analysts to query data to extract useful aggregate information about the world. They then created QuerySnout, a new AI-enabled method for detecting QBS attacks.
QBS give analysts access to aggregate statistics derived from individual-level data such as location and demographics. They are currently used in Google Maps to display real-time information on how busy an area is, and in Facebook’s Audience Measurement feature to estimate the size of an audience in a specific location or demographic to aid in advertising promotions.
The team, which included Ana-Maria Cretu of the Data Science Institute, Dr Florimond Houssiau, Dr Antoine Cully, and Dr Yves-Alexandre de Montjoye, found that powerful and accurate attacks against QBS can be detected automatically at the press of a button. Their new study was published as part of the 29th ACM Conference on Computer and Communications Security.
According to Senior Author Dr Yves-Alexandre de Montjoye: “Attacks have so far been manually developed using highly skilled expertise. This means it was taking a long time for vulnerabilities to be discovered, which leaves systems at risk.
“QuerySnout is already outperforming humans at discovering vulnerabilities in real-world systems.”
The need for query-based systems
Our ability to collect and store data has exploded in the last decade. Although this data can help drive scientific advancements, most of it is personal, so its use raises serious privacy concerns and is regulated by laws such as the EU’s General Data Protection Regulation. Enabling data to be used for good while preserving our fundamental right to privacy is therefore a timely and crucial question for data scientists and privacy experts.
QBS have the potential to enable anonymous, privacy-preserving data analysis at scale. Curators keep control over the data in a QBS, which lets them check and examine the queries sent by analysts to ensure that the answers returned do not reveal personal information about individuals. However, attackers can circumvent such systems by crafting queries that exploit system vulnerabilities or implementation bugs to infer personal information about specific people.
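To make the idea of circumventing a QBS concrete, here is a minimal Python sketch of one well-known kind of attack, a differencing attack. Everything in it is an assumption made for illustration: the `ToyQBS` class, the `Person` records, and the small-count suppression rule are hypothetical and do not describe any real system or the systems studied by the researchers.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Person:
    age: int
    city: str
    has_condition: bool  # the sensitive attribute the attacker wants


class ToyQBS:
    """Hypothetical query-based system that only answers counting queries."""

    def __init__(self, people: Iterable[Person], min_count: int = 3):
        self._people = list(people)
        self._min_count = min_count

    def count(self, predicate: Callable[[Person], bool]) -> int:
        n = sum(1 for p in self._people if predicate(p))
        # Naive defence: report 0 instead of any count below the threshold.
        return n if n >= self._min_count else 0


people = [
    Person(34, "London", True),   # the target: the only 34-year-old
    Person(41, "London", False),
    Person(29, "London", True),
    Person(52, "London", True),
    Person(47, "London", False),
    Person(38, "London", True),
]
qbs = ToyQBS(people)

# Asking about the target directly is blocked by the suppression rule.
direct = qbs.count(lambda p: p.age == 34 and p.has_condition)

# Two individually "safe" queries whose difference isolates the target.
with_target = qbs.count(lambda p: p.has_condition)
without_target = qbs.count(lambda p: p.age != 34 and p.has_condition)
inferred = with_target - without_target

print(direct, inferred)  # prints: 0 1
```

Each query on its own passes the curator's check, yet subtracting the two answers reveals the target's sensitive attribute. Real systems layer further defences on top of a rule like this one, which is part of why finding attacks by hand is laborious.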
Testing the system
The risk of unknown, strong “zero-day” attacks, in which attackers exploit system vulnerabilities, has stymied QBS development and deployment. To test the robustness of these systems, data breach attacks can be simulated to detect information leakage and identify potential vulnerabilities, much like penetration testing in cyber-security. However, manually designing and implementing these attacks against complex QBS is a difficult and time-consuming process.
According to the researchers, limiting the potential for strong unmitigated attacks is critical to enabling QBS to be usefully and safely implemented while protecting individual privacy rights.
QuerySnout
The Imperial team developed a new AI-enabled method called QuerySnout which works by learning which questions to ask the system to gain answers. It then learns to combine the answers automatically to detect potential privacy vulnerabilities.
Using machine learning, the model can create an attack that consists of a collection of queries together with a rule for combining their answers in order to reveal a specific piece of private information. This process is completely automated, and it employs a technique known as ‘evolutionary search,’ which allows the QuerySnout model to discover the appropriate sets of questions to ask. This occurs in a ‘black-box setting,’ which means that the AI only needs to be able to query the system and does not need to understand how it works to detect the vulnerabilities.
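The general idea of an evolutionary search against a black box can be sketched in a few lines of Python. This is a toy under stated assumptions and not the researchers' implementation: the query format, the `black_box` function, the suppression threshold, and the fixed rule of subtracting two answers are all hypothetical, whereas QuerySnout is described as also learning how to combine the answers. The search only ever sees the counts the system returns, never its internals.

```python
import random
from functools import lru_cache

rng = random.Random(0)   # fixed seed so runs are repeatable
TARGET_A = 0             # attribute value assumed unique to the target
OPS = ("any", "==", "!=")
MIN_COUNT = 3            # toy defence: counts below this are reported as 0


def make_world(target_secret):
    """One synthetic dataset: the target plus 30 random other users."""
    others = [(rng.randint(1, 4), rng.random() < 0.5) for _ in range(30)]
    return [(TARGET_A, target_secret)] + others, target_secret


# 40 training datasets, half with the target's secret set and half without.
WORLDS = [make_world(secret) for secret in [True, False] * 20]


def matches(a, query):
    op, value = query
    if op == "any":
        return True
    return (a == value) if op == "==" else (a != value)


def black_box(dataset, query):
    """All the attacker sees: a (possibly suppressed) count of secret holders."""
    n = sum(1 for a, secret in dataset if secret and matches(a, query))
    return n if n >= MIN_COUNT else 0


@lru_cache(maxsize=None)
def fitness(attack):
    """Fraction of worlds in which the attack guesses the target's secret."""
    q1, q2 = attack
    hits = 0
    for dataset, secret in WORLDS:
        guess = black_box(dataset, q1) - black_box(dataset, q2) >= 1
        hits += guess == secret
    return hits / len(WORLDS)


def random_query():
    return rng.choice(OPS), rng.randint(0, 4)


def random_attack():
    return random_query(), random_query()


def mutate(attack):
    """Replace one of the two queries with a fresh random query."""
    q1, q2 = attack
    return (random_query(), q2) if rng.random() < 0.5 else (q1, random_query())


population = [random_attack() for _ in range(20)]
for _ in range(50):
    mutants = [mutate(rng.choice(population)) for _ in range(10)]
    newcomers = [random_attack() for _ in range(10)]
    # dict.fromkeys removes duplicates while keeping a stable order.
    pool = list(dict.fromkeys(population + mutants + newcomers))
    # Selection: keep the 20 candidates that guess the secret most often.
    population = sorted(pool, key=fitness, reverse=True)[:20]

best = population[0]
best_fitness = fitness(best)
print(best, best_fitness)
```

In this toy setting, the pair of queries that should score best is "count everyone" minus "count everyone whose attribute differs from the target's", which is a differencing attack the search has to stumble on without being told how the suppression rule works. The search space here is tiny, so the sketch shows only the loop of proposing, scoring, and selecting candidate attacks, not the scale of the real problem.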
Ana-Maria Cretu, co-first author, stated: “We show that QuerySnout detects more powerful attacks than are currently known to exist on real-world systems. This means that our AI model outperforms humans in detecting these attacks.”