Audio Anti-Spoofing Detection: A Survey

The availability of smart devices has led to an exponential increase in multimedia content. However, rapid advancements in deep learning have given rise to sophisticated algorithms capable of manipulating or creating fake multimedia content, known as Deepfakes. Audio Deepfakes pose a significant threat by producing highly realistic voices, thus facilitating the spread of misinformation. To address this issue, numerous audio anti-spoofing detection challenges have been organized to foster the development of anti-spoofing countermeasures. This survey paper presents a comprehensive review of every component within the detection pipeline, including algorithm architectures, optimization techniques, application generalizability, evaluation metrics, performance comparisons, available datasets, and open-source availability. For each aspect, we conduct a systematic evaluation of recent advancements and discuss existing challenges. We also explore emerging research topics in audio anti-spoofing, including partial spoofing detection, cross-dataset evaluation, and adversarial attack defence, and propose some promising research directions for future work. This survey paper not only identifies the current state of the art to establish strong baselines for future experiments but also guides future researchers on a clear path toward understanding and enhancing audio anti-spoofing detection mechanisms.
