Trial Outcomes & Findings for Project 3 Example: Human-AI Collaboration Tester (HAICT) Exp. 7 (NCT05272189)
NCT ID: NCT05272189
Last Updated: 2026-01-20
Results Overview
D' (d-prime) is the signal detection theory measure of performance on a task. It is computed from the proportion of true positive responses, p(TP) = (true positive trials)/(true positive + false negative trials), and the proportion of false positive responses, p(FP) = (false positive trials)/(false positive + true negative trials). These proportions are transformed into z-scores using the inverse of the standard normal distribution (for example, NORMSINV in Excel). D' is defined as z(TP) - z(FP). It ranges from 0, where no signal can be discriminated from the noise, to about 4.0. The upper limit is not defined, but a value of 4 would mean that an observer is essentially perfect at discriminating signal from noise.
COMPLETED
NA
12 participants
Data are collected within a session of about an hour.
2026-01-20
Participant Flow
Participant milestones
| Measure | Experiment: All participants are tested in all conditions of this experiment. Simulated Second Reader AI: In this experiment, in some conditions, the participant makes their decision in the presence of information about a simulated artificial intelligence decision. Target Prevalence: The frequency with which targets are presented varies from 10% to 90%. |
|---|---|
| Overall Study: STARTED | 12 |
| Overall Study: COMPLETED | 12 |
| Overall Study: NOT COMPLETED | 0 |
Reasons for withdrawal
Withdrawal data not reported
Baseline Characteristics
Project 3 Example: Human-AI Collaboration Tester (HAICT) Exp. 7
Baseline characteristics by cohort
| Measure | Experiment (n=12 Participants): All participants are tested in all conditions of this experiment. Simulated Second Reader AI: In this experiment, in some conditions, the participant makes their decision in the presence of information about a simulated artificial intelligence decision. Target Prevalence: The frequency with which targets are presented varies from 10% to 90%. |
|---|---|
| Age, Continuous | 31 years; Standard Deviation 9; n=37 Participants |
| Sex: Female | 7 Participants; n=37 Participants |
| Sex: Male | 5 Participants; n=37 Participants |
| Ethnicity (NIH/OMB): Hispanic or Latino | 0 Participants; n=37 Participants |
| Ethnicity (NIH/OMB): Not Hispanic or Latino | 12 Participants; n=37 Participants |
| Ethnicity (NIH/OMB): Unknown or Not Reported | 0 Participants; n=37 Participants |
| Race (NIH/OMB): American Indian or Alaska Native | 0 Participants; n=37 Participants |
| Race (NIH/OMB): Asian | 1 Participants; n=37 Participants |
| Race (NIH/OMB): Native Hawaiian or Other Pacific Islander | 0 Participants; n=37 Participants |
| Race (NIH/OMB): Black or African American | 4 Participants; n=37 Participants |
| Race (NIH/OMB): White | 7 Participants; n=37 Participants |
| Race (NIH/OMB): More than one race | 0 Participants; n=37 Participants |
| Race (NIH/OMB): Unknown or Not Reported | 0 Participants; n=37 Participants |
PRIMARY outcome
Timeframe: Data are collected within a session of about an hour.
D' (d-prime) is the signal detection theory measure of performance on a task. It is computed from the proportion of true positive responses, p(TP) = (true positive trials)/(true positive + false negative trials), and the proportion of false positive responses, p(FP) = (false positive trials)/(false positive + true negative trials). These proportions are transformed into z-scores using the inverse of the standard normal distribution (for example, NORMSINV in Excel). D' is defined as z(TP) - z(FP). It ranges from 0, where no signal can be discriminated from the noise, to about 4.0. The upper limit is not defined, but a value of 4 would mean that an observer is essentially perfect at discriminating signal from noise.
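As an illustration only, the d' calculation described above can be sketched in Python. The function name `d_prime` and the trial counts in the usage line are hypothetical and are not taken from the study data; the standard library's `NormalDist().inv_cdf` stands in for Excel's NORMSINV. The sketch applies no correction for proportions of exactly 0 or 1, since the record does not say how such cases were handled.

```python
from statistics import NormalDist


def d_prime(tp: int, fn: int, fp: int, tn: int) -> float:
    """Return d' = z(TP) - z(FP) from raw trial counts (hypothetical helper)."""
    p_tp = tp / (tp + fn)  # proportion of true positive responses
    p_fp = fp / (fp + tn)  # proportion of false positive responses
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    # inv_cdf raises StatisticsError if a proportion is exactly 0 or 1;
    # no correction is applied here (handling of such cases is an open assumption).
    return z(p_tp) - z(p_fp)


# Made-up counts: 80 hits, 20 misses, 20 false alarms, 80 correct rejections.
example_d_prime = d_prime(tp=80, fn=20, fp=20, tn=80)
print(round(example_d_prime, 2))  # about 1.68
```

With these made-up counts, p(TP) = 0.8 and p(FP) = 0.2, which gives a d' of roughly 1.68, in the same general range as the baseline values reported in this outcome.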
Outcome measures
| Measure | Experiment (n=12 Participants): All participants are tested in all conditions of this experiment. |
|---|---|
| D': Baseline, Prevalence = .1 ('Prevalence' = proportion of trials that are 'positive'. Here 10%.) | 1.69 d'; Standard Error .21 |
| D': Baseline, Prevalence = .33 (33% target present trials) | 1.93 d'; Standard Error .13 |
| D': Baseline, Prevalence = .67 | 1.86 d'; Standard Error .12 |
| D': Baseline, Prevalence = .9 | 1.92 d'; Standard Error .15 |
| D': Second Reader, Prevalence = .1 | 2.44 d'; Standard Error .09 |
| D': Second Reader, Prevalence = .3 | 2.28 d'; Standard Error .07 |
| D': Second Reader, Prevalence = .67 | .230 d'; Standard Error .10 |
| D': Second Reader, Prevalence = .9 | 2.34 d'; Standard Error .15 |
PRIMARY outcome
Timeframe: Data are collected within a session of about an hour.
Criterion, like D' (see above), is calculated from z(TP) and z(FP): criterion (c) = -(z(TP) + z(FP))/2. A value of zero means that the observer is equally likely to make a positive (e.g. 'target present') response as a negative ('target absent') response. Positive values mean that the observer is more likely to say "absent" (a "conservative" criterion). Negative values mean that the observer is more likely to say "present" (a "liberal" criterion). "Liberal" and "conservative" have no political connotations in this case. Criterion values almost always fall between -2 and 2.
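The criterion formula above can be sketched in the same way. This is a minimal illustration, assuming the same proportions p(TP) and p(FP) used for D'; the function name `criterion` and the counts are hypothetical and do not come from the study data.

```python
from statistics import NormalDist


def criterion(tp: int, fn: int, fp: int, tn: int) -> float:
    """Return c = -(z(TP) + z(FP)) / 2 from raw trial counts (hypothetical helper)."""
    p_tp = tp / (tp + fn)  # proportion of true positive responses
    p_fp = fp / (fp + tn)  # proportion of false positive responses
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    # As with d', proportions of exactly 0 or 1 raise StatisticsError here.
    return -(z(p_tp) + z(p_fp)) / 2


# Made-up counts for an observer who rarely says "present":
# 50 hits, 50 misses, 5 false alarms, 95 correct rejections.
example_c = criterion(tp=50, fn=50, fp=5, tn=95)
print(round(example_c, 2))  # about 0.82, a conservative criterion
```

A positive result such as this one corresponds to the "conservative" case described above; swapping the roles so that hits and false alarms are both frequent would give a negative, "liberal" value.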
Outcome measures
| Measure | Experiment (n=12 Participants): All participants are tested in all conditions of this experiment. |
|---|---|
| Criterion: Second Reader, Prevalence = 0.33 | .23 criterion (c); Standard Error .06 |
| Criterion: Baseline, Prevalence = 0.1 | .7 criterion (c); Standard Error .14 |
| Criterion: Baseline, Prevalence = 0.33 | .24 criterion (c); Standard Error .09 |
| Criterion: Baseline, Prevalence = 0.67 | -.26 criterion (c); Standard Error .08 |
| Criterion: Baseline, Prevalence = 0.9 | -.54 criterion (c); Standard Error .14 |
| Criterion: Second Reader, Prevalence = 0.1 | .47 criterion (c); Standard Error .13 |
| Criterion: Second Reader, Prevalence = 0.67 | -.16 criterion (c); Standard Error .05 |
| Criterion: Second Reader, Prevalence = 0.9 | -.35 criterion (c); Standard Error .10 |
SECONDARY outcome
Timeframe: Data are collected within a session of about an hour.
Reaction time is the measure of how long it takes to make a response.
Outcome measures
| Measure | Experiment (n=12 Participants): All participants are tested in all conditions of this experiment. |
|---|---|
| Reaction Time: Baseline, Prevalence = 0.1 | 543 msec; Standard Error 56 |
| Reaction Time: Baseline, Prevalence = 0.3 | 735 msec; Standard Error 108 |
| Reaction Time: Baseline, Prevalence = 0.67 | 620 msec; Standard Error 70 |
| Reaction Time: Baseline, Prevalence = 0.9 | 592 msec; Standard Error 74 |
| Reaction Time: Second Reader, Prevalence = 0.1 | 510 msec; Standard Error 49 |
| Reaction Time: Second Reader, Prevalence = 0.33 | 772 msec; Standard Error 77 |
| Reaction Time: Second Reader, Prevalence = 0.67 | 963 msec; Standard Error 9 |
| Reaction Time: Second Reader, Prevalence = 0.9 | 557 msec; Standard Error 66 |
Adverse Events
Experiment
Serious adverse events
Adverse event data not reported
Other adverse events
Adverse event data not reported
Additional Information
Jeremy M Wolfe, Professor and Principal Investigator
Brigham & Women's Hospital
Results disclosure agreements
- Principal investigator is a sponsor employee
- Publication restrictions are in place