Abstract:
We present a variational Bayesian algorithm for joint
speech enhancement and speaker identification that makes use
of speaker-dependent speech priors. Our work is built on the
intuition that speaker-dependent priors would work better than
priors that attempt to capture global speech properties. We derive
an iterative algorithm that exchanges information between the
speech enhancement and speaker identification tasks. With cleaner
speech we are able to make better identification decisions, and
with the speaker-dependent priors we are able to improve speech
enhancement performance. We present experimental results
using the TIMIT data set that confirm the speech enhancement
performance of the algorithm by measuring signal-to-noise ratio
(SNR) improvement and perceptual quality improvement via the
Perceptual Evaluation of Speech Quality (PESQ) score. We also
demonstrate the ability of the algorithm to perform voice activity
detection (VAD). The experimental results also demonstrate that
speaker identification accuracy is improved.
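
As an illustration of the SNR improvement metric mentioned above, the following is a minimal sketch that assumes the common definition: SNR in dB is the ratio of clean-signal energy to residual-error energy, and the improvement is the difference between the SNR of the enhanced signal and that of the noisy input. The abstract does not state the exact formula used, so the function names `snr_db` and `snr_improvement_db` and the toy signals are hypothetical and chosen only for illustration.

```python
import math

def snr_db(clean, estimate):
    """SNR in dB of `estimate` measured against the `clean` reference.

    Assumed definition: 10 * log10(sum(clean^2) / sum((clean - estimate)^2)).
    """
    signal_power = sum(s * s for s in clean)
    noise_power = sum((s - e) ** 2 for s, e in zip(clean, estimate))
    if noise_power == 0:
        return math.inf  # estimate matches the reference exactly
    return 10.0 * math.log10(signal_power / noise_power)

def snr_improvement_db(clean, noisy, enhanced):
    """Output SNR minus input SNR, in dB (positive means enhancement helped)."""
    return snr_db(clean, enhanced) - snr_db(clean, noisy)

# Toy signals (hypothetical, not TIMIT data): a sinusoid with an
# alternating-sign disturbance of amplitude 0.5 before "enhancement"
# and amplitude 0.1 after it.
clean = [math.sin(0.1 * n) for n in range(200)]
noisy = [s + 0.5 * (-1) ** n for n, s in enumerate(clean)]
enhanced = [s + 0.1 * (-1) ** n for n, s in enumerate(clean)]

gain = snr_improvement_db(clean, noisy, enhanced)
print(f"SNR improvement: {gain:.2f} dB")
```

In this toy case the residual energy drops by a factor of 25, so the improvement equals 10 log10(25) dB regardless of the clean signal's energy. PESQ, by contrast, requires a reference implementation of the ITU-T P.862 procedure and is not sketched here.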