Trial Outcomes & Findings for Assessing the Performance of Artificial Intelligence (AI)-Augmented Electronic Health Record (EHR) Data Abstraction for Clinical Trial Patient Screening (NCT06561217)

Results Overview

The primary outcome was mean chart-level accuracy, defined as the percentage of elements identified by clinical research coordinators among all elements in the gold-standard set, measured for each chart and averaged across all charts. A research coordinator-abstracted response was counted as accurate when it exactly matched the gold-standard set. The gold-standard set was determined by 2-3 clinicians blinded to experimental arms.

Recruitment status

COMPLETED

Target enrollment

355 participants

Primary outcome timeframe

1 year

Results posted on

2025-07-25

Participant Flow

Each record was reviewed in all 3 arms, so the total enrollment is 355.

Participant milestones

Measure	All Participants (patients who underwent all chart reviews: Human-alone, AI-alone, and Human + AI)
Human-alone STARTED	355
Human-alone COMPLETED	355
Human-alone NOT COMPLETED	0
AI-alone STARTED	355
AI-alone COMPLETED	355
AI-alone NOT COMPLETED	0
Human + AI STARTED	355
Human + AI COMPLETED	355
Human + AI NOT COMPLETED	0

Reasons for withdrawal

Withdrawal data not reported

Baseline Characteristics

Sex/gender data were not collected from any participant.

Baseline characteristics by cohort

Measure	All Participants (n=355; patients who underwent Human-alone, AI-alone, and Human+AI chart review)
Age, Customized: <18 years	0 participants
Age, Customized: ≥18 years	355 participants
Region of Enrollment: United States	355 participants
Cancer Type: Non-Small Cell Lung Cancer	195 participants
Cancer Type: Colorectal Cancer	160 participants

PRIMARY outcome

Timeframe: 1 year

Population: The study population was drawn from a 15-physician community oncology practice in California serving patients from urban and surrounding rural communities. The study cohort consisted of unstructured medical records from patients in the dataset with 1) a diagnosis of non-small cell lung cancer (NSCLC) or colorectal cancer (CrCa), 2) a minimum of five clinical documents available, and 3) a most recent document dated within five years of the time of data extraction.

The primary outcome was mean chart-level accuracy, defined as the percentage of elements identified by clinical research coordinators among all elements in the gold-standard set, measured for each chart and averaged across all charts. A research coordinator-abstracted response was counted as accurate when it exactly matched the gold-standard set. The gold-standard set was determined by 2-3 clinicians blinded to experimental arms.
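The definition above can be illustrated with a minimal sketch, not the study's actual analysis code. It assumes each chart is a mapping from element name to abstracted value, and that "exactly matched" means plain string equality. The element names and values in the toy data (stage, ecog, egfr, kras) are hypothetical and do not come from the study.

```python
from statistics import mean, stdev


def chart_accuracy(abstracted: dict, gold: dict) -> float:
    """Percent of gold-standard elements whose abstracted value matches exactly."""
    if not gold:
        raise ValueError("gold-standard set is empty")
    # An element counts as correct only if it is present and identical to gold.
    hits = sum(
        1 for key, value in gold.items()
        if key in abstracted and abstracted[key] == value
    )
    return 100.0 * hits / len(gold)


def mean_chart_accuracy(charts: list) -> tuple:
    """Average per-chart accuracy across charts; returns (mean, sample SD)."""
    scores = [chart_accuracy(abstracted, gold) for abstracted, gold in charts]
    spread = stdev(scores) if len(scores) > 1 else 0.0
    return mean(scores), spread


# Hypothetical toy data: (abstracted, gold) pairs for two charts.
toy_charts = [
    (
        {"stage": "IV", "ecog": "1", "egfr": "negative", "histology": "adenocarcinoma"},
        {"stage": "IV", "ecog": "1", "egfr": "positive", "histology": "adenocarcinoma"},
    ),
    (
        {"stage": "III", "kras": "mutant"},
        {"stage": "III", "kras": "wild-type"},
    ),
]

average, spread = mean_chart_accuracy(toy_charts)
print(f"mean chart-level accuracy: {average:.2f}% (SD {spread:.2f})")
```

Scoring each chart first and then averaging gives every chart equal weight regardless of how many elements it contains, which is what "measured for each chart, and averaged across all charts" describes.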

Outcome measures

Measure	Human + AI (n=355; charts reviewed by AI-augmented human reviewers)	Human-alone (n=355; charts reviewed by human abstraction alone, no AI support)	AI-alone (n=355; charts reviewed by AI model alone, without human support)
Abstracted Chart-level Accuracy (% of elements correctly abstracted), mean (SD)	76.10 (20.59)	71.48 (24.92)	59.92 (23.75)
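As a reading aid, the reported arm means can be compared with simple arithmetic. The sketch below only subtracts the posted means to give absolute differences in percentage points. It is purely descriptive, assumes nothing about the study's statistical analysis, and says nothing about significance. The names reported_mean_accuracy and difference_in_points are illustrative.

```python
# Mean chart-level accuracy (%) as posted in the outcome table.
reported_mean_accuracy = {
    "Human + AI": 76.10,
    "Human-alone": 71.48,
    "AI-alone": 59.92,
}


def difference_in_points(arm_a: str, arm_b: str) -> float:
    """Absolute difference in mean accuracy (percentage points), arm_a minus arm_b."""
    diff = reported_mean_accuracy[arm_a] - reported_mean_accuracy[arm_b]
    return round(diff, 2)


for arm_a, arm_b in [
    ("Human + AI", "Human-alone"),
    ("Human + AI", "AI-alone"),
    ("Human-alone", "AI-alone"),
]:
    print(f"{arm_a} vs {arm_b}: {difference_in_points(arm_a, arm_b):+.2f} points")
```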

SECONDARY outcome

Timeframe: 1 year

Population: The study population was the same as that described for the primary outcome analysis. However, AI-alone chart reviews were not analyzed for efficiency: efficiency data were not collected for the AI-alone arm and therefore cannot be reported in the outcome table. This exclusion was prespecified, because the fully automated nature of the AI-alone arm renders direct comparison with human-involved workflows inappropriate and uninformative.

Efficiency was calculated as the number of minutes spent on each chart abstraction.

Outcome measures

Measure	Human + AI (n=355; charts reviewed by AI-augmented human reviewers)	Human-alone (n=355; charts reviewed by human abstraction alone, no AI support)	AI-alone (charts reviewed by AI model alone, without human support)
Efficiency of Chart-level Abstraction (minutes spent on abstraction)	32.12 (interval 20.55 to 48.67)	31.75 (interval 20.27 to 49.34)	Not collected

Adverse Events

AI-alone

Serious events: 0 serious events

Other events: 0 other events

Deaths: 0 deaths

Human-alone

Serious events: 0 serious events

Other events: 0 other events

Deaths: 0 deaths

Human + AI

Serious events: 0 serious events

Other events: 0 other events

Deaths: 0 deaths

Serious adverse events

Adverse event data not reported

Other adverse events

Adverse event data not reported

Additional Information

Ravi B. Parikh, MD, MPP

Emory University School of Medicine

Phone: (352) 422-4285

Email: [email protected]

Results disclosure agreements

Principal investigator is a sponsor employee
Publication restrictions are in place