Integrating AI Predictions With Clinician Expertise

NCT07457840 · Status: NOT_YET_RECRUITING · Phase: NA · Type: INTERVENTIONAL · Enrollment: 100

Last updated 2026-03-09

No results posted yet for this study

Summary

Optimizing the interaction between human and machine is a central challenge when deploying artificial intelligence (AI) at the bedside. The goal of this randomized clinical vignette study is to learn whether presenting AI model outputs as continuous Bayesian updates and/or with uncertainty quantification can improve diagnostic accuracy and trust in the AI among healthcare professionals (physicians, residents, fellows, physician assistants (PAs), and nurse practitioners (NPs)) from US academic institutions evaluating simulated patients with chest pain or dyspnea.

The main questions it aims to answer are:

* Does presenting AI predictions as Bayesian-updated post-test probabilities improve diagnostic accuracy compared to standard predicted probabilities?
* Does the addition of uncertainty quantification (95% confidence intervals) to AI predictions improve diagnostic accuracy?
* Do these interventions (Bayesian updating and/or uncertainty quantification) help clinicians recover from the negative effects of intentionally misleading AI predictions?

Comparison: Researchers will compare standard AI predicted probabilities (presented without uncertainty) to Bayesian-updated post-test probabilities and/or outputs containing 95% confidence intervals to see if the interventions improve diagnostic accuracy, clinician confidence, and resilience against misleading AI.

Participants will:

* Review 8 clinical vignettes (simulated patient cases) focusing on chest pain or dyspnea.
* Provide an initial "pre-test" diagnostic probability for 5 possible diagnoses based on the clinical history alone.
* View AI model outputs that vary by experimental condition (standard probability vs. Bayesian update, with or without uncertainty intervals, and accurate vs. misleading).
* Provide an updated "post-test" diagnostic probability for the diagnoses after viewing the AI output.
* Select and rank diagnostic tests and therapeutic steps for each vignette.
* Complete a post-survey regarding their trust in the AI, comfort with the data presentation, and demographics.
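
The three factors named in the participant steps above (presentation format, uncertainty display, and AI accuracy) can be crossed into a set of experimental cells. The following is a minimal sketch that assumes a full 2 x 2 x 2 crossing; the record states only that the design is factorial, so the cell structure, the function name, and the label strings are illustrative assumptions, not the study's actual allocation scheme.

```python
from itertools import product

# Hypothetical labels for the three factors described in the record.
PRESENTATION = ("standard_probability", "bayesian_update")
UNCERTAINTY = ("no_interval", "with_95_interval")
AI_ACCURACY = ("accurate", "misleading")


def enumerate_conditions():
    """Return every combination of the three factors as a list of dicts.

    Assumes a fully crossed design, which the source does not confirm.
    """
    return [
        {"presentation": p, "uncertainty": u, "ai_accuracy": a}
        for p, u, a in product(PRESENTATION, UNCERTAINTY, AI_ACCURACY)
    ]


if __name__ == "__main__":
    for cell in enumerate_conditions():
        print(cell)
```

Under this assumption the crossing yields eight cells. How vignettes and participants are assigned to cells is not described in the record.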

Conditions

Diagnostic Decision Making

Interventions

BEHAVIORAL

Bayesian-Updated Post-Test Probability

Rather than presenting the AI model's raw predicted probability, the system takes the clinician's pre-test probability (entered before seeing AI output) and applies a continuous likelihood ratio (CLR) derived from the AI model to calculate a Bayesian-updated post-test probability. The output is displayed as a shift from the clinician's own assessment (e.g., "Your assessment: 45% -> Updated assessment: 72%"). The raw AI prediction is not shown. This approach mirrors how clinicians use diagnostic test results such as D-dimer to update pre-test probability of pulmonary embolism.
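
The update described above follows the standard odds form of Bayes' rule: post-test odds equal pre-test odds multiplied by the likelihood ratio. The sketch below illustrates that arithmetic only. The record does not describe how the CLR is derived from the AI model, so the function name and the likelihood ratio value used in the example are hypothetical.

```python
def bayesian_update(pre_test_prob: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability.

    Uses the odds form of Bayes' rule:
        post-test odds = pre-test odds * likelihood ratio
    Probabilities are fractions in the open interval (0, 1).
    """
    if not 0.0 < pre_test_prob < 1.0:
        raise ValueError("pre_test_prob must be strictly between 0 and 1")
    if likelihood_ratio <= 0.0:
        raise ValueError("likelihood_ratio must be positive")

    pre_odds = pre_test_prob / (1.0 - pre_test_prob)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


if __name__ == "__main__":
    # Hypothetical likelihood ratio (22/7, about 3.14), chosen only so that
    # a 45% pre-test probability maps to the 72% shown in the example text.
    updated = bayesian_update(0.45, 22 / 7)
    print(f"Your assessment: 45% -> Updated assessment: {updated:.0%}")
```

A likelihood ratio of 1 leaves the clinician's estimate unchanged, values above 1 raise it, and values below 1 lower it.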

BEHAVIORAL

Standard AI Predicted Probability

The AI model's prediction is presented as a simple predicted probability (0-100%) for each of the possible diagnoses, together with the top 3 clinical features driving the prediction (e.g., "Acute Myocardial Infarction: 68% - Key factors: elevated troponin, ST-segment changes on ECG, chest pain radiation to left arm"). This represents the most common current approach to presenting AI-based diagnostic predictions in clinical settings.

BEHAVIORAL

Uncertainty Quantification (95% Confidence Interval)

The AI output (whether Bayesian-updated post-test probability or standard predicted probability) is presented together with a 95% confidence interval displayed as error bars on the probability bars. For accurate AI predictions, the interval half-width is approximately 12-15 percentage points (+/-12-15). For misleading AI predictions, confidence intervals are widened by a factor of 1.5 (approximately +/-18-23 percentage points) to simulate reduced model confidence in unfamiliar or edge-case scenarios. Confidence intervals are constrained to the 0-100% range.
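
The interval rules above (a base half-width, a 1.5 widening factor for misleading predictions, and clamping to 0-100%) can be sketched as follows. This is an illustration under stated assumptions: the function and parameter names are hypothetical, and the record does not say how the base half-width is chosen within the 12-15 point range.

```python
def display_interval(
    point_pct: float,
    half_width_pct: float,
    misleading: bool = False,
    widen_factor: float = 1.5,
) -> tuple[float, float]:
    """Return (lower, upper) bounds in percentage points for the error bars.

    The half-width is multiplied by widen_factor for misleading predictions,
    and both bounds are clamped to the 0-100% range.
    """
    half_width = half_width_pct * (widen_factor if misleading else 1.0)
    lower = max(0.0, point_pct - half_width)
    upper = min(100.0, point_pct + half_width)
    return lower, upper


if __name__ == "__main__":
    print(display_interval(68.0, 12.0))                   # accurate prediction
    print(display_interval(68.0, 12.0, misleading=True))  # widened by 1.5
    print(display_interval(95.0, 15.0, misleading=True))  # upper bound clamped
```

Because of the clamping, intervals near 0% or 100% become asymmetric around the point estimate.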

Sponsors & Collaborators

University of California, San Francisco
lead OTHER

Principal Investigators

Romain Pirracchio, MD, PhD, MPH · University of California, San Francisco

Study Design

Allocation: RANDOMIZED
Purpose: OTHER
Masking: SINGLE
Model: FACTORIAL

Eligibility

Min Age: 18 Years
Sex: ALL
Healthy Volunteers: Yes

Timeline & Regulatory

Start: 2026-02-28
Primary Completion: 2026-04-30
Completion: 2026-12-31

Related Clinical Trials

Artificial Intelligence for Learning Point-of-Care Ultrasound

Physician Response Evaluation With Contextual Insights vs. Standard Engines - Artificial Intelligence RAG vs LLM Clinical Decision Support

Physician Reasoning on Diagnostic Cases With Large Language Models

Mitigating Automation Bias in Physician-LLM Diagnostic Reasoning Using Behavioral Nudges

Qualitative Research Among Physicians and Junior Doctors Into the Preconditions for Implementing a CDSS Based on AI in the ICU
