Machine-learning (ML) systems are becoming more common not only in technologies that affect our daily lives but also in those that observe them, such as facial expression recognition systems.
Companies that build and use such widely deployed services rely on so-called privacy preservation tools, which often use generative adversarial networks (GANs), typically produced by a third party, to scrub images of individuals’ identities.
But how effective are they?
The answer is “not very,” according to researchers at NYU Tandon School of Engineering who looked at the machine-learning frameworks that underpin these technologies.
In the paper “Subverting Privacy-Preserving GANs: Hiding Secrets in Sanitized Images,” presented last month at the 35th AAAI Conference on Artificial Intelligence, a team led by Siddharth Garg, Institute Associate Professor of electrical and computer engineering at NYU Tandon, explored whether private data could still be recovered from images that had been “sanitized” by deep-learning tools such as privacy-protecting GANs (PP-GANs) and that had even passed empirical tests.
The researchers discovered that PP-GAN designs can be subverted to pass privacy checks while still allowing secret information to be extracted from sanitized images. Lead author Kang Liu, a Ph.D. candidate, and Benjamin Tan, research assistant professor of electrical and computer engineering, were part of the team.
Machine-learning-based privacy solutions have a wide range of applications, including removing location-relevant information from vehicle camera data, masking the identity of a person who created a handwriting sample, and removing barcodes from photographs.
Because of the complexity involved, the design and training of these GAN-based tools are often outsourced to vendors.
“Many third-party tools for protecting the privacy of people who may show up on a surveillance or data-gathering camera use these PP-GANs to manipulate images,” said Garg.
“Versions of these systems are designed to sanitize images of faces and other sensitive data so that only application-critical information is retained. While our adversarial PP-GAN passed all existing privacy checks, we found that it actually hid secret data pertaining to the sensitive attributes, even allowing for reconstruction of the original private image.”
The study provides information on PP-GANs and associated empirical privacy checks, formulates an attack scenario to see if empirical privacy checks can be bypassed, and outlines a method for doing so.
The researchers present the first comprehensive security analysis of privacy-preserving GANs and show that existing privacy checks are insufficient to detect leakage of sensitive information.
Using a novel steganographic approach, they adversarially modify a state-of-the-art PP-GAN to hide a secret (the user ID) in ostensibly sanitized face images.
They demonstrate that their proposed adversarial PP-GAN can successfully hide sensitive attributes in “sanitized” output images that pass privacy checks, with a secret recovery rate of 100%.
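To make the idea of a recoverable secret hidden in an innocuous-looking image concrete, here is a minimal Python sketch. It uses plain least-significant-bit (LSB) steganography, a far simpler stand-in that is not the technique from the paper. The function names `embed_id` and `extract_id`, the 16-bit ID width, and the random list of pixel values standing in for a sanitized image are all assumptions made for illustration.

```python
import random


def embed_id(pixels, user_id, n_bits=16):
    """Hide user_id in the least significant bits of the first n_bits pixels.

    Toy LSB steganography; NOT the PP-GAN method described in the paper.
    """
    if not 0 <= user_id < 2 ** n_bits:
        raise ValueError("user_id does not fit in n_bits")
    if len(pixels) < n_bits:
        raise ValueError("image is too small to hold the secret")
    out = list(pixels)
    for i in range(n_bits):
        bit = (user_id >> i) & 1
        out[i] = (out[i] & ~1) | bit  # overwrite only the lowest bit
    return out


def extract_id(pixels, n_bits=16):
    """Recover the hidden ID by reading the lowest bit of each carrier pixel."""
    return sum((pixels[i] & 1) << i for i in range(n_bits))


# Hypothetical "sanitized" image: 64 random 8-bit grayscale values.
rng = random.Random(0)
sanitized = [rng.randrange(256) for _ in range(64)]

stego = embed_id(sanitized, user_id=4242)
max_change = max(abs(a - b) for a, b in zip(sanitized, stego))
recovered = extract_id(stego)

print("largest per-pixel change:", max_change)
print("recovered user ID:", recovered)
```

No pixel changes by more than one intensity level, yet anyone who knows where to look recovers the ID exactly. The attack in the paper is more sophisticated, but the intuition is the same: an output can look clean and still carry the secret.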
Garg and his collaborators argue that such privacy checks lack the rigor needed to guarantee privacy, because the empirical metrics depend on the discriminators’ learning capacities and training budgets.
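The dependence on the checker's capacity can be illustrated with a toy experiment, sketched below in Python under loudly stated assumptions. It is not the evaluation protocol from the paper: the "images" are random pixel lists, the secret is a single bit stored in one pixel's lowest bit, and both "checkers" are simple threshold rules. The helper names (`make_image`, `coarse_feature`, `fine_feature`, `fit_threshold`, `accuracy`) are hypothetical.

```python
import random


def make_image(rng, secret_bit, n_pixels=32):
    """Random 8-bit image with secret_bit stored in the lowest bit of pixel 0."""
    pixels = [rng.randrange(256) for _ in range(n_pixels)]
    pixels[0] = (pixels[0] & ~1) | secret_bit
    return pixels


def coarse_feature(pixels):
    """Low-capacity view: average brightness with every lowest bit discarded."""
    return sum(p >> 1 for p in pixels) / len(pixels)


def fine_feature(pixels):
    """Higher-capacity view: able to inspect the lowest bit of pixel 0."""
    return pixels[0] & 1


def fit_threshold(features, labels):
    """Pick the threshold and polarity with the best training accuracy."""
    best = (-1.0, 0.0, 1)  # (accuracy, threshold, polarity)
    for t in sorted(set(features)):
        for polarity in (1, -1):
            preds = [int(polarity * (f - t) >= 0) for f in features]
            acc = sum(p == y for p, y in zip(preds, labels)) / len(labels)
            if acc > best[0]:
                best = (acc, t, polarity)
    return best[1], best[2]


def accuracy(feature_fn, threshold, polarity, images, labels):
    """Fraction of images whose secret bit the checker predicts correctly."""
    preds = [int(polarity * (feature_fn(img) - threshold) >= 0) for img in images]
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


rng = random.Random(1)
train_y = [rng.randrange(2) for _ in range(200)]
test_y = [rng.randrange(2) for _ in range(200)]
train_x = [make_image(rng, y) for y in train_y]
test_x = [make_image(rng, y) for y in test_y]

results = {}
for name, fn in (("weak", coarse_feature), ("strong", fine_feature)):
    t, pol = fit_threshold([fn(img) for img in train_x], train_y)
    results[name] = accuracy(fn, t, pol, test_x, test_y)
    print(name, "checker accuracy on held-out images:", results[name])
```

The weak checker's feature is statistically independent of the secret, so its held-out accuracy stays near chance and the images would "pass" its privacy check. The stronger checker recovers the bit every time. An empirical check only shows that one particular model, with one particular training budget, failed to find the leak.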
“From a practical standpoint, our results sound a note of caution against the use of data sanitization tools, and specifically PP-GANs, designed by third parties,” explained Garg. “Our experimental results highlighted the insufficiency of existing DL-based privacy checks and the potential risks of using untrusted third-party PP-GAN tools.”