Generative AI Fools Scientists with Artificial Data, Bringing Automated Data Analysis Closer

Artificial intelligence (AI) can now generate artificial scientific data, a development that brings the goal of fully automated data analysis closer.

Researchers at the University of Illinois Urbana-Champaign have created an AI that generates artificial data resembling the output of microscopy experiments, which are commonly used to characterize the atomic-level structure of materials.

Built on the same technology as AI art generators, the system lets the researchers include background noise and experimental imperfections in the generated data, which is then used to train analysis routines to detect material features much faster and more efficiently than before.

“Generative AIs take information and generate new things that haven’t existed before in the world, and now we’ve leveraged that for the goal of automated data analysis,” said Pinshane Huang, a U. of I. professor of materials science and engineering and a project co-lead. “What is used to make paintings of llamas in the style of Monet on the internet can now make scientific data so good it fools me and my colleagues.”

Other types of AI and machine learning are already common in materials science for assisting with data analysis, but they require frequent, time-consuming human intervention. Making these analysis routines more efficient takes a large set of labeled data that shows the algorithm what to look for.

In addition, for the data set to be useful, it must account for a wide range of background noise and experimental imperfections, effects that are difficult to model.

Since collecting and labeling such a vast data set using a real microscope is infeasible, Huang worked with U. of I. physics professor Bryan Clark to develop a generative AI that could create a large set of artificial training data from a comparatively small set of real, labeled data. To achieve this, the researchers used a cycle generative adversarial network, or CycleGAN.

“You can think of a CycleGAN as a competition between two entities,” Clark said. “There’s a ‘generator’ whose job is to imitate a provided data set, and there’s a ‘discriminator’ whose job is to spot the differences between the generator and the real data. They take turns trying to foil each other, improving themselves based on what the other was able to do. Ultimately, the generator can produce artificial data that is virtually indistinguishable from the real data.”
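The generator-versus-discriminator competition Clark describes can be illustrated with a deliberately tiny sketch. This is a plain one-dimensional GAN, not a CycleGAN and not the researchers' code: the "real data" are numbers drawn from a bell curve centered at 4, the generator is just a learnable offset `b` added to random noise, and the discriminator is a two-parameter logistic classifier. All names and values here (`discriminator_step`, `generator_step`, the learning rate, the target mean) are assumptions chosen for illustration.

```python
import math
import random


def sigmoid(t):
    """Numerically stable logistic function."""
    if t >= 0:
        return 1.0 / (1.0 + math.exp(-t))
    e = math.exp(t)
    return e / (1.0 + e)


def discriminate(x, w, c):
    """Discriminator's estimated probability that sample x is real."""
    return sigmoid(w * x + c)


def discriminator_step(w, c, real, fake, lr):
    """One gradient-ascent step on mean log D(real) + mean log(1 - D(fake))."""
    grad_w = grad_c = 0.0
    for x in real:
        err = 1.0 - discriminate(x, w, c)  # d/dt log sigmoid(t) = 1 - sigmoid(t)
        grad_w += err * x / len(real)
        grad_c += err / len(real)
    for x in fake:
        p = discriminate(x, w, c)  # d/dt log(1 - sigmoid(t)) = -sigmoid(t)
        grad_w -= p * x / len(fake)
        grad_c -= p / len(fake)
    return w + lr * grad_w, c + lr * grad_c


def generator_step(b, noise, w, c, lr):
    """One gradient-ascent step on mean log D(z + b): the generator tries to fool D."""
    grad_b = sum((1.0 - discriminate(z + b, w, c)) * w for z in noise) / len(noise)
    return b + lr * grad_b


def train(steps=200, batch=64, lr=0.05, seed=0):
    """Alternate discriminator and generator updates, as in the 'taking turns' description."""
    rng = random.Random(seed)
    w, c, b = 0.0, 0.0, 0.0  # discriminator weights and generator offset (assumed starting point)
    for _ in range(steps):
        real = [rng.gauss(4.0, 1.0) for _ in range(batch)]  # stand-in for real measurements
        noise = [rng.gauss(0.0, 1.0) for _ in range(batch)]
        fake = [z + b for z in noise]  # generator output for this batch
        w, c = discriminator_step(w, c, real, fake, lr)
        b = generator_step(b, noise, w, c, lr)
    return w, c, b
```

Calling `train()` returns the final discriminator parameters and generator offset. In a toy like this, the offset is pushed toward the real data whenever the discriminator can tell the two apart, although simple GANs are known to oscillate rather than settle cleanly. A CycleGAN adds further machinery on top of this basic contest.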

After the researchers provided the CycleGAN with a small sample of real microscopy images, it learned to generate images that were then used to train the analysis routine. That routine can now recognize a wide range of structural features despite background noise and systematic imperfections.

“The remarkable part of this is that we never had to tell the AI what things like background noise and imperfections like aberration in the microscope are,” Clark said. “That means even if there’s something that we hadn’t thought about, the CycleGAN can learn it and run with it.”

Huang’s research group has used the CycleGAN in experiments to detect defects in two-dimensional semiconductors, a class of materials that is promising for applications in electronics and optics but difficult to characterize without the aid of AI. However, she noted that the method has a much broader reach.

“The dream is to one day have a ‘self-driving’ microscope, and the biggest barrier was understanding how to process the data,” she said. “Our work fills in this gap. We show how you can teach a microscope how to find interesting things without having to know what you’re looking for.”
