Study Shows That ChatGPT Scores Nearly 50% on Ophthalmology Board Certification Practice Test

A study of ChatGPT found that the artificial intelligence tool correctly answered fewer than half of the questions from a study guide commonly used by physicians preparing for board certification in ophthalmology.

The study, led by St. Michael’s Hospital, a Unity Health Toronto site, and published in JAMA Ophthalmology, found that ChatGPT correctly answered 46% of questions when it was first tested in January 2023. One month later, when researchers repeated the test, ChatGPT scored more than 10 percentage points higher.

The potential of AI in medicine and exam preparation has generated excitement since ChatGPT became publicly available in November 2022, but concerns have also been raised about the possibility of misinformation and academic dishonesty. ChatGPT is free to anyone with an internet connection and operates in a conversational way.

“ChatGPT may have an increasing role in medical education and clinical practice over time, however, it is important to stress the responsible use of such AI systems,” said Dr. Rajeev H. Muni, principal investigator of the study and a researcher at the Li Ka Shing Knowledge Institute at St. Michael’s. “ChatGPT, as used in this investigation, did not answer sufficient multiple choice questions correctly for it to provide substantial assistance in preparing for board certification at this time.”

Researchers selected a series of sample multiple-choice questions from the free trial of OphthoQuestions, a popular resource for board certification exam preparation. Before each question was entered, previous entries and conversations with ChatGPT were cleared, and a new ChatGPT account was used to ensure that the responses were not influenced by earlier conversations.
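The study queried ChatGPT through its chat interface rather than through code, but the isolation it describes, one fresh, history-free exchange per question, can be sketched with the OpenAI Python API. Everything below (the model name, the sample question, and the scoring loop) is an illustrative assumption, not the study’s actual procedure:

```python
# Hypothetical sketch: submit each board-style question as an independent,
# single-turn request so no earlier conversation can influence the answer.
# The model name, the sample question, and the scoring are assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

questions = [
    # Each item: (question text with lettered options, correct letter)
    ("Which muscle is primarily responsible for elevating the upper eyelid?\n"
     "A) Orbicularis oculi\n"
     "B) Levator palpebrae superioris\n"
     "C) Superior oblique\n"
     "D) Frontalis",
     "B"),
]

correct = 0
for prompt, answer in questions:
    # A brand-new message list per question mimics clearing the chat:
    # the model sees only this question, never the previous ones.
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user",
                   "content": prompt + "\n\nAnswer with a single letter."}],
    )
    reply = response.choices[0].message.content.strip().upper()
    if reply.startswith(answer):
        correct += 1

print(f"Accuracy: {correct}/{len(questions)} = {correct / len(questions):.0%}")
```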

Questions that used images or videos were excluded because ChatGPT accepts only text input.

Of the 125 text-based multiple-choice questions, ChatGPT answered 58 (46%) correctly when the study was first conducted in January 2023. When researchers repeated the analysis in February 2023, its performance improved to 58%.

“ChatGPT is an artificial intelligence system that has tremendous promise in medical education. Though it provided incorrect answers to board certification questions in ophthalmology about half the time, we anticipate that ChatGPT’s body of knowledge will rapidly evolve,” said Dr. Marko Popovic, a co-author of the study and a resident physician in the Department of Ophthalmology and Vision Sciences at the University of Toronto.

ChatGPT selected the same multiple-choice response as the one most frequently chosen by ophthalmology trainees 44% of the time.

ChatGPT chose the response that was least popular among ophthalmology trainees 11% of the time, the second least popular 8% of the time, and the second most popular 22% of the time.
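The study’s analysis code is not described here, but comparing ChatGPT’s pick against the popularity of each option among trainees amounts to a simple ranking. The helper and data below are hypothetical and shown only for illustration:

```python
# Hypothetical sketch: rank ChatGPT's chosen option against how often
# ophthalmology trainees selected each option for the same question.
def popularity_rank(trainee_counts: dict[str, int], chatgpt_choice: str) -> int:
    """Return 1 if ChatGPT matched the trainees' most popular option,
    2 for the second most popular, and so on down to the least popular."""
    ranked = sorted(trainee_counts, key=trainee_counts.get, reverse=True)
    return ranked.index(chatgpt_choice) + 1

# Illustrative counts for a single question (not data from the study).
trainee_counts = {"A": 120, "B": 45, "C": 30, "D": 15}
print(popularity_rank(trainee_counts, "A"))  # 1 -> matched the most popular answer
print(popularity_rank(trainee_counts, "D"))  # 4 -> matched the least popular answer

# Tallying these ranks over all 125 questions yields the percentages
# reported above (most popular 44%, second most popular 22%, and so on).
```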

“ChatGPT performed most accurately on general medicine questions, answering 79 percent of them correctly. On the other hand, its accuracy was considerably lower on questions for ophthalmology subspecialties. For instance, the chatbot answered 20 percent of questions correctly on oculoplastics and zero percent correctly from the subspecialty of retina. The accuracy of ChatGPT will likely improve most in niche subspecialties in the future,” said Andrew Mihalache, lead author of the study and undergraduate student at Western University.
