wavlm
Self-Supervised Speech Pre-training and Representation Learning Toolkit
Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
🦜 Synthetic Voice Detection
This repository combines `WavLM`, a self-supervised speech representation model from Microsoft, with `MSDD` (Multi-Scale Diarization Decoder), a speaker diarization model from NVIDIA.