Researchers at The University of Texas at Dallas and Novartis Pharmaceuticals Corp. have developed a computer-based drug discovery platform that could make the process more effective and efficient while lowering its cost.
Dr. Baris Coskunuzer, professor of mathematical sciences at UT Dallas, and his colleagues developed a method based on topological data analysis to virtually screen thousands of potential drug candidates and significantly narrow the compound candidates to those most suitable for laboratory and clinical testing.
The findings will be presented at the 36th Conference on Neural Information Processing Systems.
In the early stages of drug discovery, researchers typically identify a biological target, such as a protein associated with a disease of interest. The next step is to screen libraries of thousands of potential chemical compounds to see whether they affect the target, or can be modified to affect it, in a way that alleviates the disease’s cause or symptoms. The most promising candidates proceed to the time-consuming and costly process of laboratory and clinical testing, as well as regulatory approval.
“The drug-discovery process can take 10 to 15 years and cost a billion dollars,” Coskunuzer said. “Drug companies want a more cost-effective way to do this. They want to find the most promising compounds at the beginning of the process so they’re not wasting time testing dead ends.
“We have provided a completely new method of virtual screening that is computationally efficient and ranks compounds based on how likely they are to work.”
While virtual screening of libraries of chemical compounds is not new, Coskunuzer said his group’s approach significantly outperforms other state-of-the-art methods on large data sets.
The UTD and Novartis team framed the virtual screening process as a new type of graph ranking problem, using tools from topological data analysis, a branch of mathematics. Their method classifies each molecular compound based on the shape of its underlying physical substructure – its topology – as well as a set of physical and chemical properties of the molecule’s constituents. From these data, the researchers create a unique “topological fingerprint” for each compound, which is used to rank it based on how well it matches the desired properties.
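To make the idea of a "topological fingerprint" more concrete, here is a minimal Python sketch. It is not the researchers' algorithm, which the article does not describe in detail. It assumes each molecule is encoded as a graph of heavy atoms (atomic numbers plus bonds), adds atoms to the graph in order of atomic number, and records two simple shape counts at each step: the number of connected pieces and the number of rings. Compounds are then ranked by how closely their counts match those of a reference compound. All function names, thresholds, and example molecules are hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple

# Assumption: a molecule is (atomic numbers of heavy atoms, list of bonds).
Molecule = Tuple[List[int], List[Tuple[int, int]]]


def betti_numbers(num_nodes: int, edges: List[Tuple[int, int]]) -> Tuple[int, int]:
    """Return (connected components, independent cycles) of a graph."""
    parent = list(range(num_nodes))

    def find(x: int) -> int:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = num_nodes
    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
            components -= 1
    # For a graph: cycles = edges - nodes + components.
    cycles = len(edges) - num_nodes + components
    return components, cycles


def topological_fingerprint(mol: Molecule, thresholds: List[int]) -> List[int]:
    """Toy fingerprint: shape counts of the subgraph at each atomic-number cutoff."""
    atoms, bonds = mol
    fingerprint: List[int] = []
    for t in thresholds:
        keep = [i for i, z in enumerate(atoms) if z <= t]
        index = {old: new for new, old in enumerate(keep)}
        sub_edges = [(index[a], index[b]) for a, b in bonds
                     if a in index and b in index]
        fingerprint.extend(betti_numbers(len(keep), sub_edges))
    return fingerprint


def rank_by_similarity(reference: Molecule, library: Dict[str, Molecule],
                       thresholds: List[int]) -> List[str]:
    """Order library compounds from most to least similar to the reference."""
    ref_fp = topological_fingerprint(reference, thresholds)

    def distance(name: str) -> int:
        fp = topological_fingerprint(library[name], thresholds)
        return sum(abs(x - y) for x, y in zip(ref_fp, fp))

    return sorted(library, key=distance)


# Hypothetical heavy-atom graphs (hydrogens omitted).
RING = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 0)]
BENZENE: Molecule = ([6] * 6, RING)
PYRIDINE: Molecule = ([7, 6, 6, 6, 6, 6], RING)
HEXANE: Molecule = ([6] * 6, RING[:-1])  # open chain, no ring closure
THRESHOLDS = [6, 7, 8]  # cutoffs at carbon, nitrogen, oxygen

ranking = rank_by_similarity(BENZENE, {"hexane": HEXANE, "pyridine": PYRIDINE},
                             THRESHOLDS)
print(ranking)
```

In this toy setup, pyridine ranks ahead of hexane when benzene is the reference, because both are rings and differ only in when the ring closes as atoms are added. A realistic pipeline would use richer chemical properties and a learned ranking model rather than a simple distance.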
“Our algorithm has the advantage of screening about 100,000 compounds in a couple of days, which is much faster than other methods,” Coskunuzer explained. The method will then be generalized to molecular property prediction, which will include scoring a compound based on how soluble it is in water. Solubility is important for a drug’s efficacy in the human body.
“If you find a good compound but it lacks the desired molecular properties (for example, it is not soluble), it is unlikely to work. You want to be able to test these properties before a drug candidate progresses too far,” Coskunuzer said.