Abstract: Speaker-conditioned target speaker extraction algorithms aim to extract a target speaker from a mixture of multiple speakers using additional information about that speaker. Previous studies have evaluated the performance of such algorithms using either instrumental measures or subjective assessments with normal-hearing or hearing-impaired listeners. Notably, one study employing a quasi-causal algorithm reported significant intelligibility improvements for both normal-hearing and hearing-impaired listeners, while another demonstrated that a fully causal algorithm could enhance speech intelligibility and reduce listening effort for normal-hearing listeners. Building on these findings, this study presents an in-depth subjective assessment of two fully causal deep neural network-based speaker-conditioned target speaker extraction algorithms with hearing-impaired listeners, both without hearing loss compensation (unaided) and with linear hearing loss compensation (aided). Three subjective performance measures were used to cover a broad range of listening conditions: paired comparisons, speech recognition thresholds, and categorically scaled perceived listening effort. Results from fifteen hearing-impaired listeners showed that one algorithm significantly reduced listening effort and improved intelligibility compared to both the unprocessed stimuli and the other algorithm. The data also suggest that hearing-impaired listeners benefit more than normal-hearing listeners in terms of listening effort (for both male and female interfering speakers) and speech recognition thresholds (especially in the presence of female interfering speakers), and that hearing loss compensation (linear amplification) is not required to obtain a benefit from the algorithms.
[Four sets of audio examples, each comparing the three processing conditions: Unprocessed, Algo-1, and Algo-2.]
The results shown here were obtained on the test set of the LibriSpeech dataset (https://www.openslr.org/12).
1. Paired comparisons: percentage of wins in the paired comparison tests for each pair of the three processing conditions (unprocessed, Algo-1, and Algo-2) for normal-hearing (NH) and hearing-impaired (HI) listeners, for stimuli with one (F/M) or two (FF/MM) interfering speakers.
2. Speech recognition thresholds: SRTs and corresponding benefits for NH and HI listeners for unprocessed stimuli and for stimuli processed with Algo-1 and Algo-2. FF and MM denote two female and two male interfering speakers, respectively.
3. Perceived listening effort: perceived listening effort ratings and corresponding benefits for NH and HI listeners for unprocessed stimuli with one (F/M) or two (FF/MM) interfering speakers and for stimuli processed with Algo-1 and Algo-2. F and M denote the gender of the interfering speaker(s): F, female; M, male.
4. Participant-specific SRT distributions for NH and HI listeners (unaided and aided) with male and female interfering speakers across the three processing conditions (Unprocessed, Algo-1, and Algo-2). Violin plots illustrate the score distribution and density, with boxes indicating the interquartile range and median. Individual data points on each violin represent individual participant scores, plotted as a swarm plot to show the spread of scores.
5. Participant-specific listening effort benefit distributions for NH and HI listeners (unaided and aided) with one interfering speaker, comparing Algo-1 and Algo-2 against the unprocessed condition at each SNR. Violin plots illustrate the score distribution and density, with boxes indicating the interquartile range and median. Individual data points are omitted for clarity due to the six SNR groupings.
6. Participant-specific listening effort benefit distributions for NH and HI listeners (unaided and aided) with two interfering speakers, comparing Algo-1 and Algo-2 against the unprocessed condition at each SNR. Violin plots illustrate the score distribution and density, with boxes indicating the interquartile range and median. Individual data points are omitted for clarity due to the six SNR groupings.
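The violin-with-swarm presentation used in the figures above can be sketched as follows. This is an illustrative example, not the authors' plotting code: the SRT values are synthetic, and the group means, spreads, and file name are hypothetical placeholders.

```python
# Illustrative sketch (not the authors' code): a violin + swarm plot in the
# style of the SRT figures, using synthetic per-participant SRTs (dB SNR).
import matplotlib
matplotlib.use("Agg")  # headless backend so the script runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(0)
conditions = ["Unprocessed", "Algo-1", "Algo-2"]

# Hypothetical SRTs for 15 listeners per condition; the means are invented.
data = pd.DataFrame({
    "condition": np.repeat(conditions, 15),
    "srt_db": np.concatenate(
        [rng.normal(loc=mu, scale=2.0, size=15) for mu in (-2.0, -6.0, -4.0)]
    ),
})

# Violin shows distribution and density; inner box marks IQR and median.
ax = sns.violinplot(data=data, x="condition", y="srt_db", inner="box", cut=0)
# Swarm overlays individual participant scores on each violin.
sns.swarmplot(data=data, x="condition", y="srt_db", color="k", size=3, ax=ax)
ax.set_xlabel("Processing condition")
ax.set_ylabel("SRT (dB SNR)")
plt.tight_layout()
plt.savefig("srt_violin.png")
```

For the listening-effort benefit figures (captions 5 and 6), the same pattern applies with an SNR grouping variable passed as `hue` and the swarm layer omitted, as those captions note.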