Trial Outcomes & Findings for Artificial Intelligent Clinical Decision Support System Simulation Center Study for Technology Acceptance (NCT05816473)
NCT ID: NCT05816473
Last Updated: 2026-05-22
Results Overview
The study uses a common set of dependent variables to assess baseline and post-intervention attitudes towards machine learning algorithms in clinical care. Attitudes are measured with an adapted Unified Theory of Acceptance and Use of Technology (UTAUT) survey covering perceived usefulness of the system, perceived ease of use, attitudes towards using it, behavioral intentions, and trust, each rated on a 5-point Likert scale. The outcome is the percent change in UTAUT survey response, compared between the Large Language Model-based Interaction and Machine Learning Dashboard arms, from recruitment (before the scenarios are administered) to immediately after the scenarios are completed, an interval of approximately 60 minutes. A higher change indicates greater acceptance of, or intention to use, the GutGPT+Dashboard.
COMPLETED
NA
108 participants
Approximately 60 minutes
2026-05-22
Participant Flow
Participant milestones
| Measure | Large Language Model-based Interaction: LLM-powered chatbot with the machine learning dashboard that provides the risk assessment and a rationale based on the dashboard's interpretability metrics, and with which study participants can interact directly using natural language. Participants will be provided the Generative Pre-trained Transformer (GPT) chatbot-powered machine learning model dashboard. | Machine Learning Dashboard: Machine learning algorithm output with an interactive dashboard that can be used to explain or interpret the input factors that contribute most towards the generated risk score. Participants will have access to the machine learning dashboard only. |
|---|---|---|
| Overall Study: STARTED | 52 | 56 |
| Overall Study: COMPLETED | 52 | 54 |
| Overall Study: NOT COMPLETED | 0 | 2 |
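The milestone counts can be cross-checked arithmetically: in each arm, STARTED should equal COMPLETED plus NOT COMPLETED. The sketch below runs that check on the counts reported above and derives a completion percentage per arm. The helper name `completion_percent` is hypothetical and not part of the registry record.

```python
def completion_percent(started, completed, not_completed):
    """Verify the milestone counts add up and return the completion rate in percent."""
    if started != completed + not_completed:
        raise ValueError("STARTED must equal COMPLETED + NOT COMPLETED")
    return round(100.0 * completed / started, 1)

# Counts per arm as (STARTED, COMPLETED, NOT COMPLETED), taken from the milestones table.
arms = {
    "Large Language Model-based Interaction": (52, 52, 0),
    "Machine Learning Dashboard": (56, 54, 2),
}

completion = {name: completion_percent(*counts) for name, counts in arms.items()}

# Overall enrollment and completers across both arms.
total_started = sum(counts[0] for counts in arms.values())
total_completed = sum(counts[1] for counts in arms.values())
```

The totals from this check (108 started, 106 completed) match the enrollment figure and the baseline analysis population reported elsewhere in the record.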
Reasons for withdrawal
| Measure | Large Language Model-based Interaction | Machine Learning Dashboard |
|---|---|---|
| Overall Study: Lost to Follow-up | 0 | 2 |
Baseline Characteristics
Not collected
Baseline characteristics by cohort
| Measure | Large Language Model-based Interaction (n=52 Participants) | Machine Learning Dashboard (n=54 Participants) | Total of all reporting groups (n=106 Participants) |
|---|---|---|---|
| Age, Customized: 18-24 years | 2 Participants | 0 Participants | 2 Participants |
| Age, Customized: 25-29 years | 28 Participants | 32 Participants | 60 Participants |
| Age, Customized: 30-34 years | 15 Participants | 16 Participants | 31 Participants |
| Age, Customized: 35-39 years | 6 Participants | 4 Participants | 10 Participants |
| Age, Customized: 40-44 years | 1 Participants | 1 Participants | 2 Participants |
| Age, Customized: 45-49 years | 0 Participants | 1 Participants | 1 Participants |
| Sex: Female | 9 Participants | 6 Participants | 15 Participants |
| Sex: Male | 43 Participants | 48 Participants | 91 Participants |
| Race (NIH/OMB): American Indian or Alaska Native | 0 Participants | 0 Participants | 0 Participants |
| Race (NIH/OMB): Asian | 7 Participants | 17 Participants | 24 Participants |
| Race (NIH/OMB): Native Hawaiian or Other Pacific Islander | 0 Participants | 0 Participants | 0 Participants |
| Race (NIH/OMB): Black or African American | 6 Participants | 7 Participants | 13 Participants |
| Race (NIH/OMB): White | 34 Participants | 26 Participants | 60 Participants |
| Race (NIH/OMB): More than one race | 0 Participants | 0 Participants | 0 Participants |
| Race (NIH/OMB): Unknown or Not Reported | 5 Participants | 4 Participants | 9 Participants |
| Ethnicity (NIH/OMB): Hispanic or Latino | — | — | 0 Participants (Not collected) |
| Ethnicity (NIH/OMB): Not Hispanic or Latino | — | — | 0 Participants (Not collected) |
| Ethnicity (NIH/OMB): Unknown or Not Reported | — | — | 0 Participants (Not collected) |
| Training level: Residency | 40 Participants | 41 Participants | 81 Participants |
| Training level: Medical Student | 12 Participants | 13 Participants | 25 Participants |
| Familiarity with Artificial Intelligence (AI): Some AI Coursework | 3 Participants | 4 Participants | 7 Participants |
| Familiarity with Artificial Intelligence (AI): Not at all or slightly | 42 Participants | 41 Participants | 83 Participants |
| Familiarity with Artificial Intelligence (AI): Unknown/Did not answer | 7 Participants | 9 Participants | 16 Participants |
| Mean baseline Unified Theory of Acceptance and Use of Technology (UTAUT) survey score: Behavioral Intention | 3.3 score on a scale (Standard Deviation 0.1) | 3.5 score on a scale (Standard Deviation 0.1) | 3.4 score on a scale (Standard Deviation 0.1) |
| Mean baseline UTAUT survey score: Performance Expectancy | 3.4 score on a scale (Standard Deviation 0.1) | 3.6 score on a scale (Standard Deviation 0.1) | 3.5 score on a scale (Standard Deviation 0.1) |
| Mean baseline UTAUT survey score: Effort Expectancy | 2.8 score on a scale (Standard Deviation 0.1) | 3.0 score on a scale (Standard Deviation 0.1) | 2.9 score on a scale (Standard Deviation 0.1) |
| Mean baseline UTAUT survey score: Social Influence | 3.4 score on a scale (Standard Deviation 0.1) | 3.7 score on a scale (Standard Deviation 0.1) | 3.6 score on a scale (Standard Deviation 0.1) |
| Mean baseline UTAUT survey score: Facilitating Conditions | 2.8 score on a scale (Standard Deviation 0.1) | 2.7 score on a scale (Standard Deviation 0.1) | 2.8 score on a scale (Standard Deviation 0.1) |
PRIMARY outcome
Timeframe: Approximately 60 minutes
The study uses a common set of dependent variables to assess baseline and post-intervention attitudes towards machine learning algorithms in clinical care. Attitudes are measured with an adapted Unified Theory of Acceptance and Use of Technology (UTAUT) survey covering perceived usefulness of the system, perceived ease of use, attitudes towards using it, behavioral intentions, and trust, each rated on a 5-point Likert scale. The outcome is the percent change in UTAUT survey response, compared between the Large Language Model-based Interaction and Machine Learning Dashboard arms, from recruitment (before the scenarios are administered) to immediately after the scenarios are completed, an interval of approximately 60 minutes. A higher change indicates greater acceptance of, or intention to use, the GutGPT+Dashboard.
Outcome measures
| Measure | Large Language Model-based Interaction (GutGPT+ Dashboard) (n=52 Participants) | Machine Learning Dashboard (n=54 Participants) |
|---|---|---|
| Median Change in Attitudes Towards Machine Learning Algorithms in Clinical Care Using UTAUT: Behavioral intentions | 0.0 units on a scale (Interval 0.0 to 0.3) | 0.0 units on a scale (Interval 0.0 to 0.3) |
| Median Change in Attitudes Using UTAUT: Performance Expectancy | 0.0 units on a scale (Interval 0.0 to 0.3) | 0.3 units on a scale (Interval 0.0 to 0.5) |
| Median Change in Attitudes Using UTAUT: Effort Expectancy | 0.6 units on a scale (Interval 0.3 to 1.0) | 0.3 units on a scale (Interval 0.0 to 0.5) |
| Median Change in Attitudes Using UTAUT: Social influence | 0.0 units on a scale (Interval 0.0 to 0.3) | 0.0 units on a scale (Interval 0.0 to 0.3) |
| Median Change in Attitudes Using UTAUT: Facilitating conditions | 0.1 units on a scale (Interval 0.0 to 0.3) | 0.0 units on a scale (Interval 0.0 to 0.3) |
| Median Change in Attitudes Using UTAUT: Trust | 0.2 units on a scale (Interval 0.1 to 0.6) | 0.4 units on a scale (Interval 0.2 to 0.8) |
| Median Change in Attitudes Using UTAUT: Benefit | 0.2 units on a scale (Interval 0.0 to 0.5) | 0.2 units on a scale (Interval 0.2 to 0.5) |
| Median Change in Attitudes Using UTAUT: Risk | -0.1 units on a scale (Interval -0.3 to 0.0) | -0.1 units on a scale (Interval -0.4 to 0.0) |
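The table reports each UTAUT construct as a median change with an interval. As a minimal sketch of how such a summary could be computed from paired survey scores, the Python below takes per-participant post-minus-pre differences and returns their median and quartiles. It assumes the reported interval is an interquartile range, which the record does not state. The function name `median_change` and the five example scores are hypothetical and are not study data.

```python
from statistics import median, quantiles

def median_change(pre, post):
    """Return (median, q1, q3) of per-participant post-minus-pre changes."""
    if len(pre) != len(post):
        raise ValueError("pre and post need one score per participant")
    changes = [after - before for before, after in zip(pre, post)]
    # Assumption: the interval is the interquartile range (25th to 75th percentile).
    q1, _, q3 = quantiles(changes, n=4, method="inclusive")
    return median(changes), q1, q3

# Hypothetical scores for five participants on one construct (5-point scale).
pre = [3.0, 3.5, 2.5, 4.0, 3.0]
post = [3.5, 3.5, 3.0, 4.5, 3.0]

result = median_change(pre, post)
```

A positive median means scores on that construct rose after the scenarios. For the Risk construct the reported medians are negative, meaning scores on that construct fell.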
SECONDARY outcome
Timeframe: Approximately 60 minutes
Mean percentage of decision accuracy per participant. Accuracy is defined as the percentage of simulation scenarios of acute upper GI bleeding in which a participant chooses the correct clinical decision, within each treatment condition. Measured immediately after completion of scenarios (60 minutes from initiation of study for each participant), with no further follow-up.
Outcome measures
| Measure | Large Language Model-based Interaction (GutGPT+ Dashboard) (n=52 Participants) | Machine Learning Dashboard (n=54 Participants) |
|---|---|---|
| Clinician Decision Making of Triage of GI Bleeding | 91.7 percent accuracy per participant (Standard Deviation 27.9) | 92.1 percent accuracy per participant (Standard Deviation 27.2) |
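The accuracy measure above is a per-participant percentage that is then averaged within each arm. The sketch below illustrates that two-step calculation under stated assumptions: the scenario answer key, the decision labels, and the three participants are hypothetical, and the sample standard deviation is used because the record does not say which form was reported.

```python
from statistics import mean, stdev

def accuracy_percent(choices, answer_key):
    """Percentage of scenarios in which the participant chose the correct decision."""
    if len(choices) != len(answer_key):
        raise ValueError("one choice is required per scenario")
    hits = sum(choice == correct for choice, correct in zip(choices, answer_key))
    return 100.0 * hits / len(answer_key)

# Hypothetical answer key for four simulated scenarios.
answer_key = ["admit", "discharge", "admit", "admit"]

# Hypothetical responses from three participants in one arm.
responses = {
    "p1": ["admit", "discharge", "admit", "admit"],          # 4 of 4 correct
    "p2": ["admit", "admit", "admit", "admit"],              # 3 of 4 correct
    "p3": ["discharge", "discharge", "admit", "discharge"],  # 2 of 4 correct
}

per_participant = [accuracy_percent(r, answer_key) for r in responses.values()]
arm_mean = mean(per_participant)
arm_sd = stdev(per_participant)  # assumption: sample standard deviation
```

Averaging per-participant percentages weights every participant equally, which matches the "percent accuracy per participant" unit used in the table.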
Adverse Events
Large Language Model-based Interaction
Machine Learning Dashboard
Serious adverse events
Adverse event data not reported
Other adverse events
Adverse event data not reported
Additional Information
Results disclosure agreements
- Principal investigator is a sponsor employee
- Publication restrictions are in place