Sound demos for "Speech Denoising in the Waveform Domain with Self-Attention"

Authors: Zhifeng Kong, Wei Ping, Ambrish Dantrey, Bryan Catanzaro

We present audio samples for the causal CleanUNet model proposed in "Speech Denoising in the Waveform Domain with Self-Attention". The CleanUNet used here has N=5 self-attention blocks in the bottleneck layer and is trained with an L1 loss plus a high-band STFT loss. We compare CleanUNet to other state-of-the-art models, including the FAIR-denoiser and FullSubNet. The official PyTorch implementation can be found at this link.
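To make the "L1 plus high-band STFT" objective more concrete, here is a minimal, dependency-free sketch of how such a combined loss could be computed on two waveforms. It is an illustration under stated assumptions, not the official implementation: the function names, the single STFT resolution, the plain magnitude difference, the `stft_weight` parameter, and the choice of "upper half of the bins" as the high band are all hypothetical simplifications chosen for readability.

```python
import cmath
import math


def stft_magnitudes(signal, frame_len=64, hop=16):
    """Magnitude STFT with a periodic Hann window, as a list of frames x bins."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / frame_len)
              for n in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + n] * window[n] for n in range(frame_len)]
        bins = []
        # Naive DFT over the non-negative frequencies (0 .. Nyquist).
        for k in range(frame_len // 2 + 1):
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            bins.append(abs(acc))
        frames.append(bins)
    return frames


def l1_loss(clean, denoised):
    """Mean absolute error between the two waveforms."""
    return sum(abs(c - d) for c, d in zip(clean, denoised)) / len(clean)


def high_band_stft_loss(clean, denoised, frame_len=64, hop=16):
    """Mean absolute magnitude difference over the upper half of the bins.

    Assumption: "high band" means frequencies from a quarter of the sample
    rate up to Nyquist (4 kHz to 8 kHz for 16 kHz audio).
    """
    mag_clean = stft_magnitudes(clean, frame_len, hop)
    mag_denoised = stft_magnitudes(denoised, frame_len, hop)
    first_bin = frame_len // 4  # bin index at a quarter of the sample rate
    total, count = 0.0, 0
    for frame_c, frame_d in zip(mag_clean, mag_denoised):
        for k in range(first_bin, len(frame_c)):
            total += abs(frame_c[k] - frame_d[k])
            count += 1
    return total / count


def combined_loss(clean, denoised, stft_weight=1.0):
    """L1 waveform loss plus a weighted high-band STFT loss (hypothetical weight)."""
    return l1_loss(clean, denoised) + stft_weight * high_band_stft_loss(clean, denoised)
```

In this sketch, an error concentrated above a quarter of the sample rate raises the STFT term much more than an equally loud low-frequency error, which is the intuition behind restricting the spectral term to the high band. A practical training setup would use a framework's batched STFT and typically several window sizes rather than this naive DFT.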

Speech Denoising on the DNS (2020) Dataset

Keyboard / Mechanical noise

Noisy	CleanUNet (ours)	FAIR-denoiser	FullSubNet	Clean (reference)

Dog barking

Noisy	CleanUNet (ours)	FAIR-denoiser	FullSubNet	Clean (reference)

Human talking

Noisy	CleanUNet (ours)	FAIR-denoiser	FullSubNet	Clean (reference)

Indoor noise

Noisy	CleanUNet (ours)	FAIR-denoiser	FullSubNet	Clean (reference)

Street noise

Noisy	CleanUNet (ours)	FAIR-denoiser	FullSubNet	Clean (reference)

Shrill noise

Noisy	CleanUNet (ours)	FAIR-denoiser	FullSubNet	Clean (reference)

Wind noise

Noisy	CleanUNet (ours)	FAIR-denoiser	FullSubNet	Clean (reference)