NOVA-VAD, a lightweight and explainable voice activity detector (VAD), achieves 93% accuracy on noisy audio from the UrbanSound8K dataset, outperforming WebRTC (58%), Pyannote (62%), and Silero (87%). It depends only on scikit-learn, requires no GPU, and reports feature importances and confidence scores as plain-English explanations.
NOVA-VAD beats Silero, Pyannote, and WebRTC on noisy audio with 93% accuracy
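
To make the scikit-learn-only, explainable design described in the summary more concrete, here is a minimal sketch of how a frame-level detector could expose a confidence score and feature importances in plain English. It is not NOVA-VAD's actual implementation: the three features, the random forest classifier, the synthetic "speech" and "noise" frames, and the `explain` helper are all assumptions chosen for illustration.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical feature set; the real NOVA-VAD features are not given in the source.
FEATURE_NAMES = ["log_energy", "zero_crossing_rate", "spectral_centroid"]

rng = np.random.default_rng(0)


def frame_features(frame, sample_rate=16000):
    """Compute three simple per-frame acoustic features."""
    log_energy = np.log(np.mean(frame ** 2) + 1e-10)
    zero_crossing_rate = np.mean(np.abs(np.diff(np.sign(frame))) > 0)
    spectrum = np.abs(np.fft.rfft(frame))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    spectral_centroid = np.sum(freqs * spectrum) / (np.sum(spectrum) + 1e-10)
    return np.array([log_energy, zero_crossing_rate, spectral_centroid])


def synthetic_frame(is_speech, n=400, sample_rate=16000):
    """Toy stand-in for real audio: harmonic tone plus noise vs. noise only."""
    noise = 0.1 * rng.standard_normal(n)
    if not is_speech:
        return noise
    t = np.arange(n) / sample_rate
    f0 = rng.uniform(100, 250)  # rough pitch range of a voiced sound
    voiced = sum(np.sin(2 * np.pi * k * f0 * t) / k for k in range(1, 5))
    return 0.5 * voiced + noise


# Build a small synthetic dataset: label 1 = speech-like, 0 = noise only.
labels = rng.integers(0, 2, size=400)
X = np.array([frame_features(synthetic_frame(bool(y))) for y in labels])

# Train on the first 300 frames, evaluate on the remaining 100 (CPU only).
clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(X[:300], labels[:300])

proba = clf.predict_proba(X[300:])[:, 1]  # per-frame speech probability
accuracy = clf.score(X[300:], labels[300:])
importances = dict(zip(FEATURE_NAMES, clf.feature_importances_))


def explain(p, importances):
    """Turn a speech probability and global importances into a sentence."""
    top = max(importances, key=importances.get)
    verdict = "speech" if p >= 0.5 else "no speech"
    confidence = max(p, 1 - p)
    return (f"{verdict} ({confidence:.0%} confident); "
            f"most influential feature overall: {top}")


print(f"held-out accuracy on synthetic frames: {accuracy:.2f}")
print(explain(proba[0], importances))
```

The accuracy printed here reflects only the easily separable synthetic data and says nothing about the benchmark figures quoted above. The point of the sketch is that a tree ensemble gives both `predict_proba` and `feature_importances_` for free, which is one plausible route to confidence scores and explanations without a GPU.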