Vikas Nanda has spent more than two decades studying the properties of proteins, the highly complex substances found in all living things. The Rutgers scientist has long wondered how the distinct sequences of amino acids that make up proteins determine whether they become anything from collagen to hemoglobin. He has also puzzled over the subsequent, enigmatic process of self-assembly, in which only certain proteins clump together to form even more complex substances.
So when scientists wanted to conduct an experiment pitting a human with a profound, intuitive understanding of protein design and self-assembly against the predictive capabilities of an artificially intelligent computer program, Nanda, a researcher at the Center for Advanced Biotechnology and Medicine (CABM) at Rutgers, was near the top of their list.
The results of the competition to see who, or what, could better predict which protein sequences would combine most successfully are now available. Nanda, together with scientists at Argonne National Laboratory in Illinois and colleagues from across the country, reports in Nature Chemistry that the contest was close. The competition, which matched Nanda and several colleagues against an artificial intelligence (AI) program, was won, ever so slightly, by the computer program.
Protein self-assembly is of great interest to scientists because they think that a better understanding of it could help them develop a variety of ground-breaking products for both industrial and medical applications, such as artificial human tissue for wounds and catalysts for novel chemical products.
“Despite our extensive expertise, the AI did as good or better on several data sets, showing the tremendous potential of machine learning to overcome human bias,” said Nanda, a professor in the Department of Biochemistry and Molecular Biology at Rutgers Robert Wood Johnson Medical School.
Proteins are made of large numbers of amino acids joined end to end. In order to create three-dimensional molecules with intricate shapes, the chains fold up. Each protein’s function is governed by its specific shape and the amino acids it contains.
Nanda is among the researchers who “design” proteins by constructing amino acid sequences that yield novel proteins. Recently, Nanda and a group of scientists created a synthetic protein that swiftly detects the lethal nerve agent VX, opening the door to new biosensors and therapies.
For reasons that are mostly unknown, proteins self-assemble with other proteins to create superstructures that are crucial to biology. Sometimes they appear to be following a design, as when they self-assemble into the protective capsid of a virus. In other instances, when something goes wrong, they self-assemble into lethal biological structures linked to diseases as diverse as Alzheimer’s and sickle cell disease.
“Understanding protein self-assembly is fundamental to making advances in many fields, including medicine and industry,” Nanda said.
Nanda and five other researchers were given a list of proteins and asked to predict which ones would be most likely to self-assemble. Their predictions were then compared with those made by the computer program.
Using rules of thumb based on what they had observed of protein behavior in experiments, including patterns of electrical charges and degree of water repulsion, the human experts selected 11 proteins they believed would self-assemble. The computer program, which used a sophisticated machine-learning algorithm, selected nine proteins.
The humans were correct for six of the 11 proteins they chose. The computer program was correct for six of the nine proteins it selected, giving it a higher success rate.
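For readers who want the two success rates as percentages, the short sketch below works them out from the counts reported above (six of 11 for the human experts, six of nine for the program). The function name `success_rate` is purely illustrative and does not come from the study.

```python
from fractions import Fraction


def success_rate(correct: int, selected: int) -> Fraction:
    """Return the share of selected proteins that were predicted correctly."""
    return Fraction(correct, selected)


# Counts as reported in the article.
human = success_rate(6, 11)
ai = success_rate(6, 9)

print(f"Human experts: {float(human):.1%}")  # 54.5%
print(f"AI program:    {float(ai):.1%}")     # 66.7%
```

Both sides made the same number of correct picks. The program's higher rate comes from it having made two fewer incorrect ones.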
The experiment revealed that human experts occasionally made the wrong decisions because they “favored” specific amino acids over others. Additionally, the computer algorithm successfully identified certain proteins that lacked characteristics that would make them obvious candidates for self-assembly, providing a starting point for additional research.
The experience has made Nanda, once a doubter of machine learning for protein assembly investigations, more open to the technique.
“We’re working to get a fundamental understanding of the chemical nature of interactions that lead to self-assembly, so I worried that using these programs would prevent important insights,” Nanda said. “But what I’m beginning to really understand is that machine learning is just another tool, like any other.”
Other researchers on the paper included Rohit Batra, Henry Chan, Srilok Srinivasan, Harry Fry, and Subramanian Sankaranarayanan, all of Argonne National Laboratory; Troy Loeffler, SLAC National Accelerator Laboratory; Honggang Cui, Johns Hopkins University; Ivan Korendovych, Syracuse University; Liam Palmer, Northwestern University; and Lee Solomon, George Mason University.