Clarity in Motion Phase-1 Perceptual Study of Speech Intelligibility

NCT07020728 · Status: COMPLETED · Phase: NA · Type: INTERVENTIONAL · Enrollment: 72

Last updated 2025-06-25

No results posted yet for this study

Summary

Participants received a bilateral pure-tone hearing screen administered by the research team. All potential participants who failed the hearing screen were given information about what the result means and a referral for further audiological testing.

Participants who passed the hearing screen and met the other inclusion criteria were divided into 6 groups, each of which was presented with 144 stimuli equally distributed among processing conditions. Listeners chose a comfortable listening level using supplied headphones and were able to control the rate of presentation. Following a short practice session, listeners were asked to transcribe each target sentence. The intelligibility of each stimulus was estimated by determining the mean percentage of content words correctly transcribed. After transcription, listeners were asked for two qualitative judgments: (1) the "clarity" of the stimulus, and (2) the "listening effort" involved. The quality of each stimulus was estimated by the median clarity judgment, and the effort by the median listening-effort judgment. Listening sessions were conducted in a quiet room, and presentation was controlled by the Superlab presentation software program.
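The scoring described above can be sketched as follows. This is a minimal illustration, not the study's actual scoring procedure: the function names, the exact-match rule for counting a content word as correct, and the numeric rating values are all assumptions, since the record does not specify them.

```python
import string
from statistics import mean, median


def percent_content_words_correct(target_words, transcription):
    """Percentage of target content words found in one listener's transcription.

    Assumption: a word counts as correct only on an exact, case-insensitive
    match after stripping surrounding punctuation.
    """
    if not target_words:
        raise ValueError("target_words must not be empty")
    typed = [w.strip(string.punctuation) for w in transcription.lower().split()]
    hits = 0
    for word in target_words:
        w = word.lower()
        if w in typed:
            typed.remove(w)  # each typed word can satisfy only one target word
            hits += 1
    return 100.0 * hits / len(target_words)


def score_stimulus(target_words, transcriptions, clarity_ratings, effort_ratings):
    """Summarize one stimulus across listeners.

    Intelligibility is the mean percent-correct over listeners; clarity and
    listening effort are each summarized by the median rating.
    """
    percents = [percent_content_words_correct(target_words, t) for t in transcriptions]
    return {
        "intelligibility": mean(percents),
        "clarity": median(clarity_ratings),
        "effort": median(effort_ratings),
    }


# Hypothetical data: two listeners transcribe one stimulus, three listeners rate it.
example = score_stimulus(
    ["hotdog", "baseball"],
    ["Say hotdog and baseball.", "say hotdog and football"],
    clarity_ratings=[3, 4, 5],
    effort_ratings=[2, 2, 5],
)
```

In this hypothetical example the first listener gets both content words and the second gets one of two, so the stimulus intelligibility is 75.0, with a median clarity of 4 and a median effort of 2.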

The stimuli consisted of audio recordings of target spondaic words embedded in a carrier sentence, produced by a male and a female native speaker of American English and recorded under quiet conditions. Each stimulus presented to the listeners for identification was either unmasked pristine speech or speech that had been processed in one of five ways with different mixtures of noise and sensor movement. The latter are identified as QoS Levels 1-5.
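With six processing conditions (the unmasked speech plus the five processed variants), 144 stimuli divide evenly into 24 per condition. The record does not say how stimuli were assigned to conditions within each of the 6 listener groups; the sketch below assumes a simple rotation, so that each block of 24 stimuli is heard under a different condition by each group. The rotation scheme and the function name are illustrative assumptions; the condition labels are the intervention names listed in this record.

```python
from collections import Counter

# Intervention labels as listed in this record.
CONDITIONS = ["Solo", "Raw", "StatScrub", "SlideSpch", "SlideNoise", "SlideMic"]
N_STIMULI = 144
N_GROUPS = 6


def allocate(group, n_stimuli=N_STIMULI, conditions=CONDITIONS):
    """Map each stimulus index to a condition for one listener group (0-based).

    Assumption: stimuli are split into equal consecutive blocks, and the
    block-to-condition mapping is rotated by one position per group.
    """
    if n_stimuli % len(conditions) != 0:
        raise ValueError("stimuli must divide evenly among conditions")
    per_condition = n_stimuli // len(conditions)  # 144 // 6 == 24
    plan = {}
    for stim in range(n_stimuli):
        block = stim // per_condition
        plan[stim] = conditions[(block + group) % len(conditions)]
    return plan


# Every group hears 24 stimuli per condition under this scheme.
counts_group0 = Counter(allocate(0).values())

# Across the 6 groups, any given stimulus appears once in each condition.
conditions_for_stim0 = {allocate(g)[0] for g in range(N_GROUPS)}
```

Under this assumed scheme no listener hears the same stimulus in two conditions, while each stimulus is still evaluated under all six conditions across groups.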

Collectively, the estimates of word intelligibility, clarity, and listening effort under the different conditions shed light on the effectiveness with which the tested algorithm preserves listener intelligibility with acceptable effort and quality.

Conditions

  • Healthy

Interventions

BEHAVIORAL

Solo: Unmasked Speech Stimuli

Speech stimuli recorded using non-moving speakers and mics. No masking sources present. No BSS applied to multi-channel recordings. Very high output QoS values.

BEHAVIORAL

Raw: Fully masked speech--no motion stimuli

Speech stimuli recorded using non-moving speakers and mics. All masking sources present. No speech separation or extraction methods applied to multi-channel recordings. Very low output QoS values.

BEHAVIORAL

StatScrub: Extracted Speech--no motion stimuli

Speech stimuli recorded using non-moving speakers and mics. All masking sources present. Joint ACES scrubbing of both noise sources applied to multi-channel recordings. Very high output QoS values.

BEHAVIORAL

SlideSpch: Scrubbed Speech emitted from linearly moving speaker stimuli

Speech stimuli recorded using linearly moving speech source and stationary masking sources and mics. All masking sources present. Joint ACES scrubbing of both noise sources applied to multi-channel recordings. Moderately high output QoS values.

BEHAVIORAL

SlideNoise: Speech Scrubbed from linearly moving and stationary noise stimuli

Mixed speech and noise sources recorded using a stationary speech source, a stationary noise source, and a linearly moving noise source. A valid source hypothesis of the speech source is used to extract the speech source. High output QoS values.

BEHAVIORAL

SlideMic: Stationary sources scrubbed from a linearly moving mic stimuli

Mixed speech and noise sources recorded using all stationary sources and a linearly moving microphone (Mic 1). Joint ACES scrubbing of both noise sources is used to reduce the response of Mic 1 to a residue of speech. Low output QoS values.

Sponsors & Collaborators

  • National Institute on Deafness and Other Communication Disorders (NIDCD)

    collaborator NIH
  • University of Cincinnati

    collaborator OTHER
  • Speech Technology and Applied Research Corp.

    lead INDUSTRY

Principal Investigators

  • Richard S Goldhor, PhD · Speech Technology & Applied Research Corp.

Study Design

Allocation
RANDOMIZED
Purpose
BASIC_SCIENCE
Masking
SINGLE
Model
CROSSOVER

Eligibility

Min Age
18 Years
Sex
ALL
Healthy Volunteers
Yes

Timeline & Regulatory

Start
2024-08-01
Primary Completion
2024-08-31
Completion
2024-08-31

Countries

  • United States
