Using Nanopores and Supercomputers to Identify Individual Proteins

The amount and types of proteins produced by our cells provide crucial information about our health and how our bodies function. However, our current methods for identifying and measuring individual proteins are insufficient for the task.

Not only is the full variety of proteins unknown, but amino acids are also frequently altered after synthesis through post-translational modifications. Nanopores are tiny holes in a membrane, just large enough to let an unspooled DNA strand pass through. In recent years, tremendous progress has been made in reading DNA with nanopores.

Biologists can quickly determine the order of bases in a sequence by carefully measuring the ionic current through the nanopore as the DNA passes through it. In fact, nanopores were used this year to sequence the whole human genome, which had previously been impossible with existing technologies.

In a new study published in Science, researchers from Delft University of Technology in the Netherlands and the University of Illinois at Urbana-Champaign (UIUC) in the United States have extended these DNA nanopore successes. They provide proof of concept that the same method can be used to characterize proteins with single-amino-acid resolution and a vanishingly small margin of error (10⁻⁶, or 1 in a million).

“This nanopore peptide reader provides site-specific information about the peptide’s primary sequence that may find applications in single-molecule protein fingerprinting and variant identification,” the authors wrote.

Proteins, the workhorses of our cells, are long peptide strings made up of 20 different types of amino acids. The researchers used an enzyme called helicase Hel308, which can attach to DNA-peptide hybrids and pull them in a controlled manner through a biological nanopore called MspA (Mycobacterium smegmatis porin A).

The Hel308 DNA helicase was chosen because it pulls peptides through the pore in observable half-nucleotide steps, which correspond closely to single amino acids. Because the amino acid partially blocks the electrical current carried by ions through the nanopore, each step through the narrow constriction theoretically produces a unique current signal.

Lead author Henry Brinkerhoff, who pioneered the work as a postdoc in physicist Cees Dekker’s group, compares the protein to a necklace with different-sized beads.

“Imagine you turn on the tap as you slowly move that necklace down the drain, which in this case is the nanopore,” he said. “If a big bead is blocking the drain, the water flowing through will only be a trickle; if you have smaller beads in the necklace right at the drain, more water can flow through.”

Because the step-wise transit through the pore is uneven, a single read measures the ion current very precisely but not perfectly. By loading the liquid medium with helicases, however, the researchers can obtain multiple distinct, overlapping reads of the same molecule; in other words, they can “rewind” the protein and read its amino acid sequence again. As a result, the error rate dropped from 13% to almost zero.
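To get a feel for why re-reading is so effective, consider a toy model. It is an illustration only, not the authors' analysis: it assumes that each read independently picks the wrong one of two candidate variants with probability 0.13, and that the final call is a simple majority vote over all reads. The function name `majority_error` is hypothetical.

```python
from math import comb


def majority_error(p: float, n: int) -> float:
    """Probability that a majority of n independent reads are wrong.

    p is the per-read error probability; n must be odd so that
    a majority vote can never tie. (Toy model, assumed for illustration.)
    """
    if n % 2 == 0:
        raise ValueError("use an odd number of reads to avoid ties")
    k_min = n // 2 + 1  # smallest number of wrong reads that forms a majority
    return sum(
        comb(n, k) * p**k * (1 - p) ** (n - k)
        for k in range(k_min, n + 1)
    )


if __name__ == "__main__":
    # 0.13 is the single-read error rate quoted in the article.
    for n in (1, 5, 11, 21, 31):
        print(f"{n:2d} reads -> error {majority_error(0.13, n):.2e}")
```

Under these assumptions the error shrinks rapidly as reads accumulate, which is the qualitative effect the re-reading trick exploits. Real reads need not be independent, so the numbers this prints should not be taken as the study's error rates.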

The researchers showed that the system can distinguish peptide variants that differ by only one amino acid, using synthetic peptides in which a single amino acid had been changed.

However, in order to read out the individual amino acids, they had to first figure out what kind of signal each one emits as it passes through the pore. The researchers discovered that some of these signals are counterintuitive.

Compared with the small and medium-sized variants, the bulky amino acid tryptophan behaved unexpectedly: as it passed through the constriction, the ion current first decreased and then increased.

To figure out where these patterns came from, the researchers used supercomputer simulations by computational biologist Aleksei Aksimentiev (UIUC), which were run on three of the world’s fastest supercomputers: Frontera at the Texas Advanced Computing Center, Blue Waters at the National Center for Supercomputing Applications, and Expanse at the San Diego Supercomputer Center.

The behavior of the nanopore, proteins, and the surrounding medium was recreated with atomic resolution by Aksimentiev’s team using a method called molecular dynamics simulation.

Such simulations cannot directly capture the true timescale of the nanopore’s activity, which is measured in seconds. Instead, the researchers were able to extract statistics for distinct peptide conformations by generating 40 to 50 starting states at different positions and then running 70 simulations in parallel.

They calculated the current from these simulations and compared it with experiments. Jingqian Liu, a biophysics graduate student in Aksimentiev’s lab, led the computational study.
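The idea of pooling many short, independent runs into one estimate can be sketched as follows. This is a minimal illustration and not the team's actual pipeline: the function name `ensemble_current` is hypothetical, and the 70 replica values are synthetic Gaussian stand-ins in arbitrary units, not data from the study.

```python
import random
from math import sqrt
from statistics import mean, stdev


def ensemble_current(replica_means):
    """Return (mean, standard error) of the current across replicas.

    replica_means holds one time-averaged current per independent
    simulation; at least two values are needed for a standard error.
    """
    n = len(replica_means)
    avg = mean(replica_means)
    sem = stdev(replica_means) / sqrt(n)  # standard error of the mean
    return avg, sem


if __name__ == "__main__":
    # Synthetic stand-in data (NOT from the study): 70 replicas,
    # arbitrary units, fixed seed for reproducibility.
    rng = random.Random(0)
    replicas = [rng.gauss(1.0, 0.2) for _ in range(70)]
    avg, sem = ensemble_current(replicas)
    print(f"ensemble current = {avg:.3f} +/- {sem:.3f} (arbitrary units)")
```

The standard error gives a rough sense of how tightly the pooled simulations pin down the current, which is the kind of quantity one would then set beside an experimental measurement.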

The simulations contained 30,000 atoms interacting over a 200-500 nanosecond time period and were able to match experimental data. They also showed why certain amino acids produce counterintuitive signals as they travel through the nanopore.

The signal for the tryptophan variant could be traced back to the peptide side chain binding to the nanopore surface above the constriction.

“For each specific conformation, we could see what happened to the sidechain, whether it interacts with the surface or remains inside of the pore,” said Aksimentiev, professor of Physics at UIUC. “Then we could establish directly that the binding of the sidechain enhanced the current.”

Frontera, the world’s tenth-fastest supercomputer and the most powerful at any university, required weeks to run the simulations. On the type of computing cluster available on most campuses, however, they would have taken years.

Science published the research online as a “First Release” on November 4, 2021. Identifying single proteins is a goal that research groups around the world are racing to achieve. The research was funded by the Dutch Research Council, the National Institutes of Health in the United States, and the National Science Foundation in the United States, among others.

“There’s tremendous opportunities to develop diagnostics by reading individual protein using this nanopore approach,” Aksimentiev said. “The computation will play a big role in developing these technologies. It’s amazing that with computer models we can reproduce experiments and tell what sort of interactions are going on on the nano-scale.”

Computer models also offer a new design approach, allowing researchers to experiment with nanopores of various sizes and with strategically placed residues that produce stronger signals.

More work is needed to perform readings longer than 20 amino acids and to identify heterogeneously charged amino acids, but Aksimentiev believes that a workable model might be developed in three to five years.

“We think that our new approach will allow us to detect post-translational changes,” said Dekker, “and thus shine some light on the proteins that we carry with us.”
