Evaluating AI-Generated Geriatric Case Studies for Interprofessional Education: Systematic Analysis Across 5 Platforms. Article

Ruggiano, Nicole, Sahoo, Sudikshya, Brashear, Ava et al. (2026). Evaluating AI-Generated Geriatric Case Studies for Interprofessional Education: Systematic Analysis Across 5 Platforms. Volume 12, e83085. doi:10.2196/83085

cited authors

  • Ruggiano, Nicole; Sahoo, Sudikshya; Brashear, Ava; Nwatu, Uche; Brunson, Amie; Noh, Hyunjin; Cole, Heather; McKinney, Robert; Framil Suarez, C Victoria; Brown, Ellen L; Prevost, Suzanne

abstract

  • Background

    Simulation-based learning (SBL) has become standard practice in educating health care professionals to apply their knowledge and skills in patient care. While SBL has demonstrated its value in education, many educators find developing new, unique scenarios time-intensive, which limits the variety of issues students encounter in educational settings. Generative artificial intelligence (AI) platforms, such as ChatGPT (OpenAI), have emerged as potential tools for developing simulation case studies more efficiently, though little is known about how well AI generates high-quality case studies for interprofessional education.

    Objective

    This study aimed to have a transdisciplinary team generate geriatric case scenarios across 5 AI platforms and systematically evaluate them for quality, accuracy, and bias.

    Methods

    Ten geriatric case studies were generated from each of 5 generative AI platforms using the same prompt (N=50): ChatGPT, Claude (Anthropic), Copilot (Microsoft), Gemini (Google), and Grok (xAI). An evaluation tool was developed to assess the content and quality of each case, the sociodemographic data of the featured patient, the appropriateness of each case for interprofessional education, and potential bias. Case quality was evaluated using the Simulation Scenario Evaluation Tool (SSET). Each case was evaluated by 3 team members who had experience in SBL education. Assessment scores were averaged, and qualitative responses were extracted to triangulate patterns found in the quantitative data.

    Results

    While each AI platform was able to generate 10 unique case studies, the quality of the cases varied within and across platforms. Generally, evaluators felt that the content in the cases was accurate, though some cases were not realistic. Some patient populations and common conditions among older adults were underrepresented or absent across the cases. All cases were set within traditional health care settings (eg, hospitals and routine medical visits). No cases featured home-based care. Based on the average SSET scores, reviewers assessed ChatGPT to be the highest overall performer (mean 3.27, SD 0.45, 95% CI 2.95-3.59), while Grok received the lowest scores (mean 1.61, SD 1.26, 95% CI 0.71-2.51). Platforms performed best at generating learning objectives (mean 3.35, SD 1.08, 95% CI 3.04-3.65) and worst at describing supplies and materials that may be available in hypothetical scenarios (mean 1.27, SD 0.84, 95% CI 1.03-1.51).

    Conclusions

    This study is the first to systematically evaluate and compare multiple generative AI platforms for case study generation using a validated assessment tool (SSET) and provides evidence-based guidance on selecting and using AI tools effectively. The findings offer practical direction for educators navigating available generative AI tools to enhance training for health care professionals, including specific strategies for prompt engineering that can improve the quality of SBL resources in interprofessional education. These insights enable educators to leverage AI capabilities while maintaining pedagogical rigor.
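
The 95% confidence intervals reported in the Results can be approximately reproduced from the published means and SDs. The sketch below is a reader's check, not the authors' documented method: it assumes a Student t interval, n = 10 cases per platform, and n = 50 cases for item-level summaries. The function name `t_interval` and the lookup table `T_CRIT` are hypothetical, and the critical values are taken from standard t tables.

```python
import math

# Two-sided 95% t critical values (97.5th percentile) from standard tables,
# keyed by degrees of freedom. Assumption: the abstract does not state how
# the intervals were computed; a t interval is one plausible choice.
T_CRIT = {9: 2.262, 49: 2.010}


def t_interval(mean, sd, n):
    """Return the (lower, upper) 95% CI for a mean, rounded to 2 decimals."""
    half_width = T_CRIT[n - 1] * sd / math.sqrt(n)
    return round(mean - half_width, 2), round(mean + half_width, 2)


# Platform-level summaries (assumed n = 10 cases per platform).
chatgpt = t_interval(3.27, 0.45, 10)
grok = t_interval(1.61, 1.26, 10)

# Item-level summaries across all cases (assumed n = 50).
supplies = t_interval(1.27, 0.84, 50)
learning = t_interval(3.35, 1.08, 50)

print(chatgpt, grok, supplies)
# prints (2.95, 3.59) (0.71, 2.51) (1.03, 1.51)
```

Under these assumptions the three printed intervals match the reported values. The learning objectives interval computes to 3.04-3.66 rather than the reported 3.04-3.65, a 0.01 difference that may reflect rounding of the published mean and SD.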

publication date

  • January 1, 2026

keywords

  • Aged
  • Artificial Intelligence
  • Female
  • Geriatrics
  • Health Personnel
  • Humans
  • Interprofessional Education
  • Male
  • Simulation Training

Digital Object Identifier (DOI)

  • 10.2196/83085

Medium

  • Electronic

start page

  • e83085

volume

  • 12