speech-detection
Cross-platform VAD & Audio Event Detection toolkit — Python (PyPI) + TypeScript (npm) + C API. DFSMN models ~2MB, 200x real-time. Runs everywhere: native, browser (WASM), Node.js.
Automagically synchronize subtitles with video.
CNN-based audio segmentation toolkit. Allows to detect speech, music, noise and speaker gender. Has been designed for large scale gender equality studies based on speech time per gender.
Synchronize your subtitles with mahcine learning
Language-agnostic synchronization of subtitles with video via speech detection.