Tuesday, October 24, 2023

Evaluation of an Artificial Intelligence Chatbot for Delivery of IR Patient Education Material: A Comparison with Societal Website Content


Clinical question

How do the completeness, accuracy, and reliability of a large language model chatbot as a tool for patient education in interventional radiology compare with those of a traditional societal website?


Take away point

While employing a large language model chatbot for patient education in interventional radiology shows promise, it also has limitations. Readers should be aware that while the chatbot's responses are generally thorough and factual, they can occasionally be incomplete or incorrect. Additionally, content provided by ChatGPT was found to be longer and more difficult to read than that of a traditional societal website. For now, patients and providers should be cautious about relying solely on chatbot-generated content and should consider augmenting it with other trusted sources.


Reference

McCarthy CJ, Berkowitz S, Ramalingam V, Ahmed M. Evaluation of an Artificial Intelligence Chatbot for Delivery of IR Patient Education Material: A Comparison with Societal Website Content. J Vasc Interv Radiol. 2023 Oct;34(10):1760-1768.e32. doi: 10.1016/j.jvir.2023.05.037. Epub 2023 Jun 16. PMID: 37330210.

Study design

Artificial intelligence study.

Funding Source

No reported funding.


Setting

Not explicitly mentioned.



Figure

Summary of Patient Education Materials Assessment Tool for Printable Materials (PEMAT-P) scores

Summary


The study involved analyzing 104 questions posed to ChatGPT and comparing its responses to content from the Society of Interventional Radiology Patient Center website. The goal was to assess whether ChatGPT could effectively serve as a resource for patient education in the field of interventional radiology. Readability was assessed using five validated scales, and understandability and actionability were evaluated using the Patient Education Materials Assessment Tool for Printable Materials (PEMAT-P).
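As a rough illustration of how such readability scales work (not the authors' exact methodology), one widely used measure, the Flesch-Kincaid Grade Level, can be approximated in a few lines of Python. The syllable counter below is a crude vowel-group heuristic, and the sample sentence is hypothetical rather than drawn from the study.

```python
import re

def count_syllables(word):
    # Crude heuristic: count runs of consecutive vowels.
    # Validated tools use dictionary-based syllabification.
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def flesch_kincaid_grade(text):
    # Flesch-Kincaid Grade Level:
    # 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * (len(words) / len(sentences))
            + 11.8 * (syllables / len(words))
            - 15.59)

# Hypothetical patient-education sentence, for illustration only
sample = "An angiogram is an imaging test that uses X-rays to view your blood vessels."
print(round(flesch_kincaid_grade(sample), 1))
```

Higher scores correspond to text written at a higher grade level, which is how the study could conclude that both sources exceeded the reading level recommended for patient education materials.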

ChatGPT generally provided longer and more complex responses compared to the website. Additionally, chatbot-generated content was found to be more challenging to read, scoring nearly one grade level higher than the website content. Notably, content from both sources was written at a higher grade level than recommended for patient education materials. The study also revealed that, while uncommon, ChatGPT could provide incomplete or inaccurate information.

Most importantly, the study highlighted both the potential and the limitations of current chatbots for patient education in interventional radiology. Concerns were raised about the chatbot's tendency to provide verbose responses and to guess at answers when faced with ambiguous questions. The authors suggested that ChatGPT and similar chatbot models hold promise as patient education tools while underscoring the need for improvements in accuracy and readability through customization.

Commentary


The study addresses an important and relevant clinical question while providing valuable insights into the challenges and limitations of using AI chatbots for patient education. This technology remains new to the general public, and this study may serve as a starting point for ongoing research in the field as transformative changes occur. Future research could involve a larger pool of reviewers to assess accuracy, reassess performance after visual aids are added to patient education materials, and propose solutions for optimizing and improving AI-driven patient education content.

Post Author:
Ryan R. Babayev, MD, MSc
Diagnostic Radiology Resident
Hartford Hospital
@RyanBabayevMD
