Parents rate child health care information generated by expert-supervised ChatGPT prompts as more accurate and trustworthy than, or equivalent to, information developed by human experts, according to study results published in the Journal of Pediatric Psychology.
Artificial intelligence (AI) chatbots and large language models, such as ChatGPT, are rapidly transforming the way individuals consume health information. Although early research suggests that people trust AI responses to basic health questions, the models' lack of domain-specific expertise, the limited nuance of prompt engineering, and known errors in large language model output have raised concerns about their effects on health literacy.
To evaluate how parents perceive health information generated by expert-supervised ChatGPT vs health-related text written by a human expert, researchers recruited parents aged 18 to 65 years via Amazon Mechanical Turk to participate in a cross-sectional study. Participants were asked to read 2 vignettes of equivalent length (about 500 words): 1 developed by ChatGPT and 1 developed by a human expert. The vignettes covered health topics including over-the-counter cold medicines, infant sleep training, and children’s diets. The ChatGPT vignettes were created using expert-supervised prompt engineering to match the expert vignettes in structure and readability.
Before and after reading the vignettes, participants completed questionnaires assessing their behavioral intentions; they also rated each vignette’s trustworthiness, expertise, accuracy, and morality, along with their likelihood of relying on the information.
A total of 116 parents were included in the analysis. The parents had a mean (SD) age of 45.02 (10.92) years, 55.9% were women, 55.9% were White, and 68.6% reported a family income of $70,000 or below. Nearly all participants (97%) somewhat agreed, agreed, or strongly agreed that they seek health information online.
Overall, the participants reported that they learned new information from the vignettes on medication (mean, 5.54; SD, 1.49), sleep (mean, 5.10; SD, 1.72), and diet (mean, 5.14; SD, 1.61), with no significant differences between the ChatGPT and expert vignettes.
For behavioral intentions, the researchers found that participants were significantly less likely to recommend over-the-counter medication for pediatric viral infections after reading both the ChatGPT (t60 = 3.69; P <.001) and expert (t55 = 4.70; P <.001) vignettes. Similarly, behavioral intentions regarding infant sleep training and improving a child’s diet changed significantly after reading both vignettes, with neither source outperforming the other (all P <.001).
The researchers observed no significant differences in parent ratings of author morality and expertise between the ChatGPT and expert vignettes. However, participants found ChatGPT more trustworthy for over-the-counter medication information (t116 = 4.32; P <.001), were more likely to rely on ChatGPT for over-the-counter medication information (t115 = 4.17; P <.001), and viewed ChatGPT as more accurate on medication information (t116 = 4.91; P <.001). No significant differences were observed between the vignettes for sleep and diet information.
The researchers concluded, “Given that parents will trust and rely on information generated by ChatGPT, it is critically important that human domain-specific expertise be applied to healthcare information that will ultimately be presented to consumers (e.g., parents).”
Study limitations include the relatively small sample size and the inability to run a single statistical model addressing shared variance, owing to the complexity of the factorial design.