LEARNING INTERPRETABLE FILTERS IN WAV-UNET FOR SPEECH ENHANCEMENT - Equipe Signal, Statistique et Apprentissage
Conference paper, year: 2023

LEARNING INTERPRETABLE FILTERS IN WAV-UNET FOR SPEECH ENHANCEMENT

Félix Mathieu
  • Role: Author
  • PersonId : 1367933
  • IdRef : 276549813
Thomas Courtat
  • Role: Author

Abstract

Owing to their performance, deep neural networks have become a dominant method in nearly all modern audio processing applications. They can be used to estimate some parameters or hyperparameters of a model or, in some cases, the entire model in an end-to-end fashion. Although deep learning can reach state-of-the-art performance, such models also suffer from inherent weaknesses: they usually remain complex and, to a large extent, non-interpretable. For instance, the internal filters used in each layer are chosen in an ad hoc manner, with only a loose relation to the nature of the processed signal. In this paper we propose an approach to learn interpretable filters within a specific neural architecture, which allows a better understanding of the network's behaviour and reduces its complexity. We validate the approach on a speech enhancement task and show that the gain in interpretability does not degrade the performance of the model.
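The paper's exact filter parameterization is not reproduced here. As a minimal sketch of the general idea behind interpretable learned filters (in the spirit of sinc-parameterized front-ends such as SincNet, not necessarily the method of this paper), each convolution kernel can be constrained to a band-pass filter described by only two physically meaningful parameters, its low and high cutoff frequencies, which would be the learnable quantities in a neural layer. All names and parameter choices below are illustrative assumptions:

```python
import numpy as np

def sinc_bandpass_bank(low_hz, high_hz, kernel_size=101, sample_rate=16000):
    """Build a bank of band-pass FIR kernels, each fully described by two
    cutoff frequencies. In a neural layer, those cutoffs would be the
    learnable parameters, so every filter remains interpretable.
    (Illustrative sketch; hypothetical names, not the paper's implementation.)
    """
    t = np.arange(kernel_size) - (kernel_size - 1) / 2  # centered time axis
    window = np.hamming(kernel_size)                    # reduce spectral leakage
    filters = []
    for lo, hi in zip(low_hz, high_hz):
        lo_n, hi_n = lo / sample_rate, hi / sample_rate  # normalized cutoffs
        # ideal band-pass = difference of two low-pass sinc kernels
        band = 2 * hi_n * np.sinc(2 * hi_n * t) - 2 * lo_n * np.sinc(2 * lo_n * t)
        filters.append(band * window)
    return np.stack(filters)

# two filters: roughly 100-400 Hz and 500-1500 Hz
bank = sinc_bandpass_bank(low_hz=[100, 500], high_hz=[400, 1500])
```

Each row of `bank` is one kernel that can be convolved with the raw waveform like any learned convolution, but unlike a free-form kernel it can be read directly as "a band-pass filter between these two frequencies".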
Main file

MATHIEU_ICASSP_2023-2.pdf (466.8 KB)
Origin: Files produced by the author(s)

Dates and versions

hal-04048829, version 1 (28-03-2023)

Identifiers

  • HAL Id: hal-04048829, version 1

Cite

Félix Mathieu, Thomas Courtat, Gael Richard, Geoffroy Peeters. LEARNING INTERPRETABLE FILTERS IN WAV-UNET FOR SPEECH ENHANCEMENT. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), Jun 2023, Rhodes, Greece. ⟨hal-04048829⟩
79 views
298 downloads
