The Interspeech 2024 Challenge on Speech Processing Using Discrete Units
Representing speech and audio signals in discrete units has become a compelling alternative to traditional high-dimensional feature vectors. Numerous studies have highlighted the efficacy of discrete units in various applications such as speech compression and restoration, speech recognition, and speech generation. To foster exploration in this domain, we introduce the Interspeech 2024 Challenge, which focuses on new speech processing benchmarks using discrete units. It encompasses three pivotal tasks, namely multilingual automatic speech recognition, text-to-speech, and singing voice synthesis, and aims to assess the potential applicability of discrete units in these tasks. This paper outlines the challenge designs and baseline descriptions. We also collate baseline and selected submission systems, along with preliminary findings, offering valuable contributions to future research in this evolving field.
Further reading
- Access the paper on arXiv.org