Compared with current approaches that rely on manual record review, an automated process combining natural language processing and machine learning identified people who inject drugs (PWID) in electronic health records more quickly and accurately.
Currently, International Classification of Diseases (ICD) codes are used to identify people who inject drugs. These codes are either entered by healthcare practitioners in patients’ electronic health records or extracted from those records by trained human coders who review them for billing purposes.
However, because there is no specific ICD code for injection drug use, physicians and coders must rely on a combination of nonspecific codes as a proxy for it. This is a laborious process that is prone to inaccuracy.
The researchers manually reviewed 1,000 records from 2003 to 2014 of patients admitted to Veterans Administration hospitals with Staphylococcus aureus bacteremia, a common infection that occurs when the bacteria enter openings in the skin, such as those at injection sites. They then developed and trained algorithms using natural language processing and machine learning, and compared them with 11 proxy ICD code combinations for identifying PWID.
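As a rough illustration of the two approaches being compared, and not the study's actual code, data, or models, the Python sketch below contrasts a rule based on code combinations with a small text classifier trained on note wording. Everything in it is an assumption made for illustration: the code labels are placeholders rather than real ICD codes, the four training notes are invented, and a simple naive Bayes model stands in for whatever algorithms the researchers used.

```python
import math
import re
from collections import Counter

# Placeholder labels, NOT real ICD codes and not the study's 11 combinations.
DRUG_USE_CODES = {"OPIOID_USE", "STIMULANT_USE"}
INFECTION_CODES = {"HEPATITIS_C", "ENDOCARDITIS"}


def icd_proxy_flag(codes):
    """Proxy rule: flag a record that has a code from both groups."""
    codes = set(codes)
    return bool(codes & DRUG_USE_CODES) and bool(codes & INFECTION_CODES)


def tokenize(text):
    """Lowercase the note and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesNoteClassifier:
    """Multinomial naive Bayes with add-one smoothing over note words."""

    def fit(self, notes, labels):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for note, label in zip(notes, labels):
            self.word_counts[label].update(tokenize(note))
        self.vocab = set().union(*self.word_counts.values())
        return self

    def predict(self, note):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for c in self.classes:
            total_words = sum(self.word_counts[c].values())
            score = math.log(self.class_counts[c] / total_docs)
            for word in tokenize(note):
                if word in self.vocab:  # ignore words never seen in training
                    count = self.word_counts[c][word] + 1
                    score += math.log(count / (total_words + len(self.vocab)))
            scores[c] = score
        return max(scores, key=scores.get)


# Invented example notes; True means the reviewer labeled the patient as PWID.
TRAIN = [
    ("patient reports injecting heroin daily, track marks on both arms", True),
    ("history of intravenous drug use, shares needles", True),
    ("denies any drug use, infection attributed to dialysis catheter", False),
    ("no history of substance use, recent surgical wound", False),
]

clf = NaiveBayesNoteClassifier().fit(
    [note for note, _ in TRAIN], [label for _, label in TRAIN]
)

print(icd_proxy_flag(["OPIOID_USE", "HEPATITIS_C"]))
print(clf.predict("admits to injecting heroin into arms"))
print(clf.predict("infection from dialysis catheter, no substance use"))
```

On this toy data the code rule flags the record only because both placeholder groups are present, while the classifier flags the first note and not the second based on wording alone. A model trained on real clinical notes would need far more data and some way of handling negation, such as "denies drug use".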
Limitations of the study include potentially poor documentation by providers. In addition, the dataset spans the years 2003 to 2014, but the epidemic of injection drug use has since shifted from heroin and prescription opioids to synthetic opioids like fentanyl, which the algorithm might not recognize because the dataset it learned the classification from contains few cases involving that drug.
Finally, because the findings are based solely on data from the Veterans Administration, they may not apply to other settings.
The use of this artificial intelligence model accelerates the identification of PWIDs dramatically, which may enhance clinical judgment, healthcare research, and administrative surveillance.
“By using natural language processing and machine learning, we could identify people who inject drugs in thousands of notes in a matter of minutes compared to several weeks that it would take a manual reviewer to do this,” said lead author Dr. David Goodman-Meza, assistant professor of medicine in the division of infectious diseases at the David Geffen School of Medicine at UCLA. “This would allow health systems to identify PWIDs to better allocate resources like syringe services programs and substance use and mental health treatment for people who use drugs.”
The study’s other researchers are Dr. Amber Tang, Dr. Matthew Bidwell Goetz, Steven Shoptaw, and Alex Bui of UCLA; Dr. Michihiko Goto of the University of Iowa and Iowa City VA Medical Center; Dr. Babak Aryanfar of VA Greater Los Angeles Healthcare System; Sergio Vazquez of Dartmouth College; and Dr. Adam Gordon of the University of Utah and VA Salt Lake City Health Care System. Goodman-Meza and Goetz also have appointments with VA Greater Los Angeles Healthcare System.