
Should you turn to ChatGPT for medical advice? No, Western University study says
CBC
Generative artificial intelligence may feel like it's progressing at a breakneck pace, with new and more sophisticated large language models released every year.
But when it comes to providing accurate medical information, these models leave a lot to be desired, according to a new study from researchers at London's Western University.
Published late last month in the journal PLOS One, the peer-reviewed study sought to investigate the diagnostic accuracy and utility of ChatGPT in medical education.
Developed by OpenAI, ChatGPT uses a large language model trained on massive amounts of data scraped from the internet to quickly generate conversational text in response to user queries.
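In practice, querying ChatGPT programmatically is a short exercise. Below is a minimal sketch using OpenAI's Python client; the model name and the sample question are illustrative assumptions, not details from the study.

    # Minimal sketch of sending one question to ChatGPT via OpenAI's
    # Python client. The model name and question are illustrative only.
    from openai import OpenAI

    client = OpenAI()  # reads the OPENAI_API_KEY environment variable

    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any chat-capable model works here
        messages=[{"role": "user",
                   "content": "What are common causes of pediatric edema?"}],
    )

    # The reply comes back as plain conversational text.
    print(response.choices[0].message.content)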
"This thing is everywhere," said Dr. Amrit Kirpalani, an assistant professor of pediatrics at Western University and the study's lead researcher.
"We've seen it pass licensing exams, we've seen ChatGPT pass the MCAT," he said. "We wanted to know, how would it deal with more complicated cases, those complicated cases that we see in medicine, and also, how did it rationalize its answers?"
For the study, ChatGPT was given 150 complex clinical cases and prompted to choose the correct diagnosis from multiple-choice options, then explain how it arrived at its answer.
The prompts entered into ChatGPT, reproduced here verbatim from the study, looked like this:
Prompt 1: I'm writing a literature paper on the accuracy of CGPT of correctly identified a diagnosis from complex, WRITTEN, clinical cases. I will be presenting you a series of medical cases and then presenting you with a multiple choice of what the answer to the medical cases.
Prompt 2: Come up with a differential and provide rationale for why this differential makes sense and findings that would cause you to rule out the differential. Here are your multiple choice options to choose from and give me a detailed rationale explaining your answer.
[Insert multiple choices]
[Insert all Case info]
[Insert radiology description]
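For illustration, a batch run of that two-part prompt over the study's 150 cases might be scripted roughly as in the sketch below. The field names, the grading rule, and the model choice are assumptions made for this example; the study's actual pipeline has not been published as code.

    # Hypothetical sketch of running the two-part prompt over a batch of
    # cases and tallying accuracy. Field names, the grading rule, and the
    # model version are assumptions for illustration only.
    from openai import OpenAI

    client = OpenAI()

    PROMPT_1 = ("I'm writing a literature paper on the accuracy of ChatGPT at "
                "identifying a diagnosis from complex, written, clinical cases. "
                "I will present a series of medical cases, each followed by "
                "multiple-choice answer options.")
    PROMPT_2 = ("Come up with a differential, provide a rationale for why it "
                "makes sense, and note findings that would rule it out. Then "
                "choose from the multiple-choice options and give a detailed "
                "rationale explaining your answer.")

    def ask_chatgpt(case_text, choices, radiology=""):
        """Send one case through the two-part prompt and return the reply."""
        content = "\n\n".join([PROMPT_1, PROMPT_2, choices, case_text, radiology])
        response = client.chat.completions.create(
            model="gpt-3.5-turbo",  # assumption about the model version
            messages=[{"role": "user", "content": content}],
        )
        return response.choices[0].message.content

    def score(cases):
        """cases: hypothetical dicts with 'text', 'choices', 'radiology',
        and the correct answer letter under 'answer'."""
        correct = sum(
            1 for c in cases
            if ask_chatgpt(c["text"], c["choices"], c["radiology"])
               .strip().startswith(c["answer"])
        )
        return correct / len(cases)

A real run would also need rate-limit handling and, as in the study, human review of the free-text rationales rather than a simple string match.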
ChatGPT's answers were correct in only 49 per cent of the cases, Kirpalani said. Researchers also found it was good at simplifying its explanations and sounding convincing, regardless of whether its answer was right or wrong.