Trial Outcomes & Findings for Assessing the Performance of Artificial Intelligence (AI)-Augmented Electronic Health Record (EHR) Data Abstraction for Clinical Trial Patient Screening (NCT06561217)
NCT ID: NCT06561217
Last Updated: 2025-07-25
Results Overview
The primary outcome measured was mean chart-level accuracy, defined as the percentage of elements identified by clinical research coordinators among all elements in the gold-standard set, measured for each chart, and averaged across all charts. Research coordinator-abstracted responses were identified as being accurate when they exactly matched with the gold-standard set. The gold-standard set was determined by 2-3 clinicians blinded to experimental arms.
COMPLETED
355 participants
1 year
2025-07-25
Participant Flow
Each record was reviewed in all 3 arms, so the total enrollment was 355.
Participant milestones
| Measure | All Participants: Patients that underwent all chart reviews (Human-alone, AI-alone, and Human + AI) |
|---|---|
| Human-alone: STARTED | 355 |
| Human-alone: COMPLETED | 355 |
| Human-alone: NOT COMPLETED | 0 |
| AI-alone: STARTED | 355 |
| AI-alone: COMPLETED | 355 |
| AI-alone: NOT COMPLETED | 0 |
| Human + AI: STARTED | 355 |
| Human + AI: COMPLETED | 355 |
| Human + AI: NOT COMPLETED | 0 |
Reasons for withdrawal
Withdrawal data not reported
Baseline Characteristics
Sex/gender data were not collected from any participant.
Baseline characteristics by cohort
| Measure | All Participants (n=355 Participants): Patients who underwent Human-alone, AI-alone, and Human+AI chart review |
|---|---|
| Age, Customized: <18 years | 0 Participants |
| Age, Customized: ≥18 years | 355 Participants |
| Region of Enrollment: United States | 355 Participants |
| Cancer Type: Non-Small Cell Lung Cancer | 195 Participants |
| Cancer Type: Colorectal Cancer | 160 Participants |
PRIMARY outcome
Timeframe: 1 year
Population: The study population was drawn from a 15-physician community oncology practice in California serving patients from urban and surrounding rural communities. This study cohort consisted of unstructured medical records from patients within the dataset with 1) a diagnosis of non-small cell lung cancer (NSCLC) or colorectal cancer (CrCa), 2) a minimum of five clinical documents available, and 3) the most recent document being within five years from the time of data extraction.
The primary outcome measured was mean chart-level accuracy, defined as the percentage of elements identified by clinical research coordinators among all elements in the gold-standard set, measured for each chart, and averaged across all charts. Research coordinator-abstracted responses were identified as being accurate when they exactly matched with the gold-standard set. The gold-standard set was determined by 2-3 clinicians blinded to experimental arms.
Outcome measures
| Measure | Human + AI (n=355 Participants): Charts reviewed by AI-augmented human reviewers | Human-alone (n=355 Participants): Charts reviewed by human abstraction alone, no AI support | AI-alone (n=355 Participants): Charts reviewed by AI model alone, without human support |
|---|---|---|---|
| Abstracted Chart-level Accuracy (% of elements correctly abstracted) | 76.10 (Standard Deviation 20.59) | 71.48 (Standard Deviation 24.92) | 59.92 (Standard Deviation 23.75) |
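The chart-level accuracy definition above (per-chart percentage of gold-standard elements matched exactly, then averaged across charts) can be illustrated with a minimal Python sketch. This is not the study's analysis code: the function names, the dictionary representation of a chart, and the example element names (`stage`, `ecog`) are hypothetical and chosen only for illustration.

```python
from statistics import mean, stdev

_MISSING = object()  # sentinel so an absent element never counts as a match


def chart_accuracy(abstracted: dict, gold: dict) -> float:
    """Percentage of gold-standard elements that the abstraction matched exactly."""
    if not gold:
        raise ValueError("gold-standard set must contain at least one element")
    correct = sum(
        1 for key, value in gold.items()
        if abstracted.get(key, _MISSING) == value
    )
    return 100.0 * correct / len(gold)


def mean_chart_level_accuracy(charts: list) -> tuple:
    """Average the per-chart accuracy over all charts.

    `charts` is a list of (abstracted, gold) pairs. Returns (mean, sample SD);
    the SD is reported as 0.0 when only one chart is supplied.
    """
    scores = [chart_accuracy(abstracted, gold) for abstracted, gold in charts]
    spread = stdev(scores) if len(scores) > 1 else 0.0
    return mean(scores), spread


# Hypothetical example: two charts, two gold-standard elements each.
example_charts = [
    ({"stage": "IV", "ecog": "1"}, {"stage": "IV", "ecog": "1"}),   # 2 of 2 correct
    ({"stage": "III", "ecog": "1"}, {"stage": "IV", "ecog": "1"}),  # 1 of 2 correct
]
average, spread = mean_chart_level_accuracy(example_charts)
```

Averaging per-chart percentages, rather than pooling all elements, gives every chart equal weight regardless of how many gold-standard elements it contains.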
SECONDARY outcome
Timeframe: 1 year
Population: The study population was the same as the one described in the primary outcome analysis population. However, AI-alone chart reviews were not analyzed for efficiency: efficiency data were not collected for the AI-alone arm and therefore cannot be reported in the outcome table. This was prespecified, as the fully automated nature of the AI-alone arm renders direct comparison with human-involved workflows inappropriate and uninformative.
Efficiency was calculated as the number of minutes spent on each chart abstraction.
Outcome measures
| Measure | Human + AI (n=355 Participants): Charts reviewed by AI-augmented human reviewers | Human-alone (n=355 Participants): Charts reviewed by human abstraction alone, no AI support | AI-alone: Charts reviewed by AI model alone, without human support |
|---|---|---|---|
| Efficiency of Chart-level Abstraction (number of minutes spent on abstraction) | 32.12 (Interval 20.55 to 48.67) | 31.75 (Interval 20.27 to 49.34) | Not collected |
Adverse Events
AI-alone
Human-alone
Human + AI
Serious adverse events
Adverse event data not reported
Other adverse events
Adverse event data not reported
Additional Information
Ravi B. Parikh, MD, MPP
Emory University School of Medicine
Results disclosure agreements
- Principal investigator is a sponsor employee
- Publication restrictions are in place