A Privacy-Preserving OCR-LLM System for Coronary Syndrome Subtyping From Admission HPI: Multicenter Validation in China and the US

NCT07449429 · Status: NOT_YET_RECRUITING · Type: OBSERVATIONAL · Enrollment: 10

Last updated 2026-03-04

No results posted yet for this study

Summary

This study develops and validates a privacy-preserving OCR-LLM pipeline that converts admission history of present illness (HPI) records into structured coronary syndrome subtypes (STEMI, NSTEMI, unstable angina, and chronic coronary syndrome). The system first extracts text from de-identified HPI images using locally deployed OCR, then applies large language models with a fixed diagnostic prompt to generate a subtype classification with supporting evidence. Performance is evaluated in an internal validation cohort and in multiple external datasets covering heterogeneous EHR templates, emergency department cases, and an English-language dataset from MIMIC-IV. A clinician usability study compares diagnostic accuracy and time taken with and without tool assistance.
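The two-stage flow described in the summary (local OCR, then an LLM called with a fixed prompt) can be sketched roughly as follows. This is a minimal illustration, not the study's implementation: the prompt wording, the `classify_hpi` function, the `SubtypeResult` type, and the `'<subtype> | <excerpt>'` reply format are all assumptions, and the OCR and LLM steps are passed in as plain callables so that no specific engine or model is implied.

```python
from dataclasses import dataclass
from typing import Callable

# The four subtypes named in the study summary.
SUBTYPES = ("STEMI", "NSTEMI", "unstable angina", "chronic coronary syndrome")

# Hypothetical fixed prompt; the study's actual prompt is not given in this record.
FIXED_PROMPT = (
    "Classify the admission HPI below into exactly one of: "
    + ", ".join(SUBTYPES)
    + ". Answer as '<subtype> | <supporting excerpt>'.\n\nHPI:\n{hpi}"
)


@dataclass
class SubtypeResult:
    subtype: str   # one of SUBTYPES
    evidence: str  # excerpt the model cites in support


def classify_hpi(
    image: bytes,
    ocr: Callable[[bytes], str],
    llm: Callable[[str], str],
) -> SubtypeResult:
    """Run OCR on a de-identified HPI image, then classify the text."""
    text = ocr(image)                           # stage 1: locally run OCR
    reply = llm(FIXED_PROMPT.format(hpi=text))  # stage 2: same prompt for every case
    label, _, evidence = reply.partition("|")
    label = label.strip()
    # Accept only the prespecified categories (case-insensitive match).
    matches = [s for s in SUBTYPES if s.lower() == label.lower()]
    if not matches:
        raise ValueError(f"unexpected subtype in model reply: {label!r}")
    return SubtypeResult(subtype=matches[0], evidence=evidence.strip())
```

Passing `ocr` and `llm` as arguments keeps the sketch independent of any particular OCR library or model endpoint, and rejecting labels outside the four prespecified subtypes keeps the output structured.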

Conditions

  • Coronary Artery Disease (CAD) (e.g., Angina, Myocardial Infarction, and Atherosclerotic Heart Disease (ASHD))
  • Acute Coronary Syndromes
  • ST-segment Elevation Myocardial Infarction (STEMI)
  • Non-ST-Segment Elevation Myocardial Infarction (NSTEMI)

Interventions

DEVICE

OCR-Prompt-LLM Information Extraction and Classification Workflow (OCR-Prompt-LLM)

An automated clinical data management workflow integrating Optical Character Recognition (OCR), optimized prompt engineering, and large language models (LLMs). The system processes unstructured inpatient/ED records (primarily admission history of present illness and related narrative text) to extract prespecified key clinical indicators (e.g., left ventricular ejection fraction, coronary syndrome subtype, medications) and to classify cases into prespecified coronary artery disease categories (e.g., unstable angina, STEMI, NSTEMI, chronic coronary syndrome). The workflow outputs structured fields and a classification result with supporting evidence excerpts.
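One way to picture the "structured fields plus supporting evidence excerpts" output is as a small validated record. The sketch below assumes the model returns JSON; the field names (`subtype`, `lvef_percent`, `medications`, `evidence`) and the `parse_record` helper are hypothetical and are not taken from the study record. As an illustrative safeguard, evidence excerpts are kept only if they appear verbatim in the OCR text.

```python
import json
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ExtractedRecord:
    subtype: str                   # coronary syndrome category
    lvef_percent: Optional[float]  # left ventricular ejection fraction, if stated
    medications: List[str]
    evidence: List[str]            # excerpts confirmed to occur in the source text


def parse_record(llm_json: str, source_text: str) -> ExtractedRecord:
    """Parse a (hypothetical) JSON reply and validate it against the OCR text."""
    data = json.loads(llm_json)

    lvef = data.get("lvef_percent")
    if lvef is not None:
        lvef = float(lvef)
        if not 0.0 <= lvef <= 100.0:
            raise ValueError(f"LVEF out of range: {lvef}")

    # Drop any excerpt that cannot be found verbatim in the source text.
    evidence = [e for e in data.get("evidence", []) if e in source_text]

    return ExtractedRecord(
        subtype=data["subtype"],
        lvef_percent=lvef,
        medications=list(data.get("medications", [])),
        evidence=evidence,
    )
```

Filtering excerpts against the source text is a design choice made here for illustration: it lets a reviewer trace each classification back to wording that is actually present in the record.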

DEVICE

Manual Clinical Data Review

Standard manual process in which experienced clinicians review patient medical records and extract the same prespecified clinical indicators and coronary artery disease categories using routine clinical judgment and documentation review. This manual abstraction serves as the human benchmark for comparing diagnostic accuracy, completeness, and operational efficiency against the automated OCR-Prompt-LLM workflow.
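The comparison against the manual benchmark can be illustrated with two simple statistics computed on paired labels: raw accuracy and Cohen's kappa. The record names diagnostic accuracy as a comparison target but does not say which agreement statistics are used, so kappa is an assumption here, as are the function names.

```python
from collections import Counter
from typing import Sequence


def accuracy(auto: Sequence[str], manual: Sequence[str]) -> float:
    """Fraction of cases where the automated label equals the manual label."""
    if len(auto) != len(manual) or not auto:
        raise ValueError("label lists must be non-empty and of equal length")
    return sum(a == m for a, m in zip(auto, manual)) / len(auto)


def cohen_kappa(auto: Sequence[str], manual: Sequence[str]) -> float:
    """Chance-corrected agreement between automated and manual labels."""
    n = len(auto)
    p_observed = accuracy(auto, manual)
    auto_counts, manual_counts = Counter(auto), Counter(manual)
    # Agreement expected by chance, from each rater's marginal label frequencies.
    p_expected = sum(auto_counts[k] * manual_counts[k] for k in auto_counts) / (n * n)
    if p_expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (p_observed - p_expected) / (1.0 - p_expected)
```

For example, with automated labels `["STEMI", "NSTEMI", "STEMI", "NSTEMI"]` and manual labels `["STEMI", "NSTEMI", "NSTEMI", "NSTEMI"]`, observed agreement is 0.75, chance agreement is 0.5, and kappa is 0.5.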

Sponsors & Collaborators

  • China National Center for Cardiovascular Diseases

    lead OTHER_GOV

Eligibility

Min Age
18 Years
Sex
ALL
Healthy Volunteers
No

Timeline & Regulatory

Start
2026-02-28
Primary Completion
2026-03-08
Completion
2026-03-08
