Trial Outcomes & Findings for Project 3 Example: Human-AI Collaboration Tester (HAICT) Exp. 7 (NCT05272189)

Last Updated: 2026-01-20

Results Overview

D' (d-prime) is the signal detection theory measure of performance on a task. It is computed from two proportions: the proportion of true positive responses, p(TP) = (true positive trials)/(true positive + false negative trials), and the proportion of false positive responses, p(FP) = (false positive trials)/(false positive + true negative trials). These proportions are transformed into z-scores (for example, using NORMSINV in Excel to calculate the inverse of the standard normal distribution). D' is defined as z(TP) - z(FP). It ranges from 0, when no signal can be discriminated from the noise, to about 4.0. The upper limit is not defined, but a value of 4 would mean that an observer is essentially perfect at discriminating signal from noise.
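The calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the study's analysis code: the function name `d_prime` and the example counts are hypothetical, and `statistics.NormalDist().inv_cdf` stands in for Excel's NORMSINV. The sketch assumes both proportions lie strictly between 0 and 1, since a proportion of exactly 0 or 1 has no finite z-score; how such cases were handled in the study is not stated here.

```python
from statistics import NormalDist


def d_prime(tp: int, fn: int, fp: int, tn: int) -> float:
    """Return d' = z(TP) - z(FP) computed from raw trial counts."""
    p_tp = tp / (tp + fn)  # proportion of true positive responses
    p_fp = fp / (fp + tn)  # proportion of false positive responses
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    # Assumption: 0 < p < 1 for both proportions; inv_cdf raises
    # StatisticsError when given exactly 0 or 1.
    return z(p_tp) - z(p_fp)


# Hypothetical counts for illustration only, not data from this study.
print(round(d_prime(tp=80, fn=20, fp=20, tn=80), 2))
```

When the true positive and false positive proportions are equal, the two z-scores cancel and the function returns 0, matching the "no discrimination" end of the range.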

Recruitment status

COMPLETED

Study phase

Target enrollment

12 participants

Primary outcome timeframe

Data are collected within a session of about an hour.

Results posted on

2026-01-20

Participant Flow

Participant milestones

Measure	Experiment All participants are tested in all conditions of this experiment. Simulated Second Reader AI: In this experiment, in some conditions, the participant makes their decision in the presence of information about a simulated artificial intelligence decision. Target Prevalence: The frequency with which targets are presented varies from 10% to 90%
Overall Study STARTED	12
Overall Study COMPLETED	12
Overall Study NOT COMPLETED	0

Reasons for withdrawal

Withdrawal data not reported

Baseline Characteristics

Project 3 Example: Human-AI Collaboration Tester (HAICT) Exp. 7

Baseline characteristics by cohort

Measure	Experiment n=12 Participants All participants are tested in all conditions of this experiment. Simulated Second Reader AI: In this experiment, in some conditions, the participant makes their decision in the presence of information about a simulated artificial intelligence decision. Target Prevalence: The frequency with which targets are presented varies from 10% to 90%
Age, Continuous	31 years STANDARD_DEVIATION 9 • n=12 Participants
Sex: Female, Male Female	7 Participants n=12 Participants
Sex: Female, Male Male	5 Participants n=12 Participants
Ethnicity (NIH/OMB) Hispanic or Latino	0 Participants n=12 Participants
Ethnicity (NIH/OMB) Not Hispanic or Latino	12 Participants n=12 Participants
Ethnicity (NIH/OMB) Unknown or Not Reported	0 Participants n=12 Participants
Race (NIH/OMB) American Indian or Alaska Native	0 Participants n=12 Participants
Race (NIH/OMB) Asian	1 Participants n=12 Participants
Race (NIH/OMB) Native Hawaiian or Other Pacific Islander	0 Participants n=12 Participants
Race (NIH/OMB) Black or African American	4 Participants n=12 Participants
Race (NIH/OMB) White	7 Participants n=12 Participants
Race (NIH/OMB) More than one race	0 Participants n=12 Participants
Race (NIH/OMB) Unknown or Not Reported	0 Participants n=12 Participants

PRIMARY outcome

Timeframe: Data are collected within a session of about an hour.

Outcome measures

Measure	Experiment n=12 Participants All participants are tested in all conditions of this experiment.
D' Baseline, Prevalence = .1 'Prevalence' = proportion of trials that are 'positive'. Here 10%.	1.69 d' Standard Error .21
D' Baseline, Prevalence = .33 = 33% target present trials.	1.93 d' Standard Error .13
D' Baseline, Prevalence = .67	1.86 d' Standard Error .12
D' Baseline, Prevalence = .9	1.92 d' Standard Error .15
D' Second Reader, Prevalence = .1	2.44 d' Standard Error .09
D' Second Reader, Prevalence = .33	2.28 d' Standard Error .07
D' Second Reader, Prevalence = .67	.230 d' Standard Error .10
D' Second Reader, Prevalence = .9	2.34 d' Standard Error .15

PRIMARY outcome

Timeframe: Data are collected within a session of about an hour.

Criterion, like D' (see above), is calculated from z(TP) and z(FP): criterion (c) = -(z(TP) + z(FP))/2. A value of zero means that the observer is equally likely to make a positive (e.g., "target present") response as a negative ("absent") response. Positive values mean that the observer is more likely to say "absent" (a "conservative" criterion). Negative values mean that the observer is more likely to say "present" (a "liberal" criterion). "Liberal" and "conservative" have no political connotations in this case. Criterion values almost always fall between -2 and 2.
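The criterion formula can be sketched in Python in the same way. This is an illustrative sketch rather than the study's analysis code: the function name `criterion` and the example counts are hypothetical, and it again assumes both proportions lie strictly between 0 and 1 so that the z-scores are finite.

```python
from statistics import NormalDist


def criterion(tp: int, fn: int, fp: int, tn: int) -> float:
    """Return c = -(z(TP) + z(FP)) / 2 computed from raw trial counts."""
    p_tp = tp / (tp + fn)  # proportion of true positive responses
    p_fp = fp / (fp + tn)  # proportion of false positive responses
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    # Assumption: 0 < p < 1 for both proportions; inv_cdf raises
    # StatisticsError when given exactly 0 or 1.
    return -(z(p_tp) + z(p_fp)) / 2


# Hypothetical counts for illustration only, not data from this study.
# This observer says "present" on only 30 of 100 trials, so c is positive.
print(criterion(tp=50, fn=50, fp=10, tn=90) > 0)
```

An observer whose true positive and false positive proportions are symmetric around 0.5 (for example 0.8 and 0.2) gets c close to 0, the unbiased case described above.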

Outcome measures

Measure	Experiment n=12 Participants All participants are tested in all conditions of this experiment.
Criterion Baseline, Prevalence = 0.1	.7 criterion (c) Standard Error .14
Criterion Baseline, Prevalence = 0.33	.24 criterion (c) Standard Error .09
Criterion Baseline, Prevalence = 0.67	-.26 criterion (c) Standard Error .08
Criterion Baseline, Prevalence = 0.9	-.54 criterion (c) Standard Error .14
Criterion Second Reader, Prevalence = 0.1	.47 criterion (c) Standard Error .13
Criterion Second Reader, Prevalence = 0.33	.23 criterion (c) Standard Error .06
Criterion Second Reader, Prevalence = 0.67	-.16 criterion (c) Standard Error .05
Criterion Second Reader, Prevalence = 0.9	-.35 criterion (c) Standard Error .10

SECONDARY outcome

Timeframe: Data are collected within a session of about an hour.

This is the measure of how long it takes to make a response.

Outcome measures

Measure	Experiment n=12 Participants All participants are tested in all conditions of this experiment.
Reaction Time Baseline, Prevalence =0.1	543 Response Time (msec) Standard Error 56
Reaction Time Baseline, Prevalence =0.33	735 Response Time (msec) Standard Error 108
Reaction Time Baseline, Prevalence =0.67	620 Response Time (msec) Standard Error 70
Reaction Time Baseline, Prevalence =0.9	592 Response Time (msec) Standard Error 74
Reaction Time Second Reader, Prevalence =0.1	510 Response Time (msec) Standard Error 49
Reaction Time Second Reader, Prevalence =0.33	772 Response Time (msec) Standard Error 77
Reaction Time Second Reader, Prevalence =0.67	963 Response Time (msec) Standard Error 9
Reaction Time Second Reader, Prevalence =0.9	557 Response Time (msec) Standard Error 66

Adverse Events

Experiment

Serious events: 0 serious events

Other events: 0 other events

Deaths: 0 deaths

Serious adverse events

Adverse event data not reported

Other adverse events

Adverse event data not reported

Additional Information

Jeremy M Wolfe, Professor and Principal Investigator

Brigham & Women's Hospital

Phone: 6178511166

Email: [email protected]

Results disclosure agreements

Principal investigator is a sponsor employee
Publication restrictions are in place