Construction of a Benchmark for Breast Ultrasound AI Interpretation and Performance Evaluation of Multimodal AI Models

NCT07500428 · Status: RECRUITING · Type: OBSERVATIONAL · Enrollment: 1380

Last updated 2026-03-30

No results posted yet for this study

Summary

This single-center, retrospective, observational study aims to construct a standardized benchmark evaluation system for intelligent breast ultrasound image interpretation and to systematically assess the diagnostic performance of current mainstream multimodal artificial intelligence (AI) models.

De-identified B-mode breast ultrasound images with confirmed pathological diagnoses will be retrospectively collected from the institutional archive (2018-2025) and supplemented with images from published open-access datasets. Expert radiologists with varying experience levels will independently annotate all images according to the American College of Radiology (ACR) Breast Imaging Reporting and Data System (BI-RADS) v2025 criteria, including glandular tissue composition, lesion characterization (mass vs. non-mass lesion), morphological descriptors, and final BI-RADS classification.

Baseline deep learning models (CNN-based ResNet-50 and Transformer-based USFM) will be trained to establish performance baselines and to stratify cases by diagnostic difficulty through cross-architecture consensus. Multiple multimodal large language models (MLLMs), including both general-purpose and medical-domain models, will then be evaluated via standardized API calls using BI-RADS-guided chain-of-thought prompts at temperature 0 for reproducibility.
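The record does not say how cross-architecture consensus is turned into difficulty strata. One plausible reading, shown here only as a minimal sketch, is to count how many of the two baseline models classify each case correctly. The function name, the three tier names, and the toy labels below are all illustrative assumptions, not details taken from the study protocol.

```python
from collections import Counter

# Hypothetical tier names: the study record does not define its strata.
TIERS = {2: "easy", 1: "intermediate", 0: "hard"}


def stratify_by_consensus(truth, resnet_pred, usfm_pred):
    """Label each case by how many baseline models classified it correctly."""
    labels = []
    for y, a, b in zip(truth, resnet_pred, usfm_pred):
        n_correct = int(a == y) + int(b == y)
        labels.append(TIERS[n_correct])
    return labels


# Toy data (1 = malignant, 0 = benign); not study data.
truth = [1, 0, 1, 0]
resnet_pred = [1, 0, 0, 1]
usfm_pred = [1, 1, 0, 1]

difficulty = stratify_by_consensus(truth, resnet_pred, usfm_pred)
print(Counter(difficulty))
```

Under this assumed rule, cases that both architectures miss form the "hard" stratum, on which the MLLMs could then be compared separately.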

Primary endpoints include BI-RADS classification accuracy and diagnostic AUC for benign-malignant differentiation. Model robustness and safety will be assessed through out-of-distribution rejection testing, temperature-stability experiments, and thinking-mode ablation studies. This study adheres to the FLAIR and TRIPOD-LLM reporting guidelines.
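For orientation, the two primary endpoints can be computed as in the sketch below. It assumes each model returns a BI-RADS category plus a numeric malignancy score per image; the record does not specify the output format, and the function names and toy values are illustrative. AUC is computed here with the rank-based (Mann-Whitney) formulation, counting ties as one half.

```python
def accuracy(truth, pred):
    """Fraction of cases where the predicted category matches the reference."""
    if not truth:
        raise ValueError("no cases to score")
    return sum(t == p for t, p in zip(truth, pred)) / len(truth)


def auc(labels, scores):
    """AUC as the probability that a malignant case outscores a benign one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one malignant and one benign case")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Toy data, not study data.
birads_truth = ["4A", "3", "5", "2"]
birads_pred = ["4A", "3", "4C", "2"]
malignant = [1, 0, 1, 0]               # pathology-confirmed labels
model_score = [0.9, 0.2, 0.4, 0.6]     # assumed malignancy scores

print(accuracy(birads_truth, birads_pred))
print(auc(malignant, model_score))
```

The pairwise loop is quadratic in the number of cases, which is acceptable at the planned enrollment of 1380; a sorted-rank implementation would be preferable for much larger sets.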

Conditions

Breast Neoplasms
Breast Diseases
Ultrasonography

Interventions

Multimodal AI Model Diagnostic Evaluation · Type: DIAGNOSTIC_TEST

Retrospective evaluation of de-identified breast ultrasound images by multiple AI systems, including baseline deep learning models (ResNet-50, USFM) and multimodal large language models, using standardized BI-RADS-guided chain-of-thought prompts via API. No patient contact or clinical decision-making is involved.

Sponsors & Collaborators

Chinese Academy of Medical Sciences · Collaborator · OTHER
Peking Union Medical College Hospital · Lead · OTHER

Principal Investigators

Qingli Zhu, MD · Peking Union Medical College Hospital

Eligibility

Min Age: 18 Years
Max Age: 75 Years
Sex: FEMALE
Healthy Volunteers: Yes

Timeline & Regulatory

Start: 2026-03-12
Primary Completion: 2026-12-01
Completion: 2027-03-01

Countries

China

Related Clinical Trials

Using Deep Learning Methods to Analyze Automated Breast Ultrasound and Hand-held Ultrasound Images, to Establish a Diagnosis, Therapy Assessment and Prognosis Prediction Model of Breast Cancer.

Multi-center Study of Deep Learning AI in Breast Mass

Development of Artificial Intelligence System for Detection and Diagnosis of Breast Lesion Using Mammography

Using Deep Learning and Radiomics to Diagnose Benign and Malignant Breast Lesions Based on Ultrasound

Artificial Intelligence Assisted Breast Ultrasound in Breast Cancer Screening
