Combining Multi-Task Learning and Multi-Channel Variational Auto-Encoders to Exploit Datasets with Missing Observations - Application to Multi-Modal Neuroimaging Studies in Dementia

The joint modeling of neuroimaging data across multiple datasets requires the consistent analysis of high-dimensional and heterogeneous information in the presence of often non-overlapping sets of views across data samples (e.g. imaging data, clinical scores, biological measurements). This analysis is complicated by missing information across datasets, which can take two forms: missing at random (MAR), when the absence of a view is unpredictable and does not depend on the dataset (e.g. due to data corruption); and missing not at random (MNAR), when a specific view is absent by design for a specific dataset. To take advantage of the increased variability and sample size obtained by pooling observations from many cohorts, while coping with the ubiquitous problem of missing information, we propose a multi-task generative latent-variable model in which the common variability across datasets stems from the estimation of a shared latent representation across views. Our formulation retrieves a consistent latent representation common to all views and datasets, even in the presence of missing information. Simulations on synthetic data show that our method identifies a common latent representation of multi-view datasets, even when the compatibility across datasets is minimal. When jointly analyzing multi-modal neuroimaging and clinical data from real independent dementia studies, our model mitigates the absence of modalities without discarding any available information. Moreover, the common latent representation inferred with our model can be used to define robust classifiers that gather the combined information across different datasets. In both synthetic and real data experiments, our model compared favorably to state-of-the-art benchmark methods, providing a more powerful exploitation of multi-modal observations with missing views.
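
The idea of a latent representation shared across views, trained only on the views each dataset actually provides, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes linear encoders and decoders, a unit-variance Gaussian likelihood, and the encoder mean in place of a sampled latent code. All names (`masked_multichannel_loss`, `observed`, `enc_W`, `dec_W`, `enc_log_var`) are hypothetical. Each observed view encodes into the shared latent space, every observed view is reconstructed from that code, and missing views contribute nothing to the loss.

```python
import numpy as np

def gaussian_kl(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dims."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var, axis=-1)

def masked_multichannel_loss(views, observed, enc_W, enc_log_var, dec_W):
    """Mean per-sample loss (reconstruction + KL) over observed views only.

    views[c]       : (n_samples, d_c) array; rows of missing views hold placeholders
    observed       : (n_samples, C) boolean mask, True where view c is available
    enc_W[c]       : (d_c, latent_dim) linear encoder mean (assumption)
    enc_log_var[c] : (latent_dim,) log-variance of q(z | x_c) (assumption)
    dec_W[c]       : (latent_dim, d_c) linear decoder (assumption)
    """
    n, n_views = observed.shape
    total = np.zeros(n)
    for i in range(n_views):                      # encoding view
        mu = views[i] @ enc_W[i]                  # latent mean from view i
        kl = gaussian_kl(mu, enc_log_var[i])
        rec = np.zeros(n)
        for j in range(n_views):                  # decoding view
            err = 0.5 * np.sum((views[j] - mu @ dec_W[j]) ** 2, axis=1)
            rec += np.where(observed[:, j], err, 0.0)   # skip missing targets
        total += np.where(observed[:, i], rec + kl, 0.0)  # skip missing encoders
    # Average over the encoding views actually available for each sample.
    return np.mean(total / np.maximum(observed.sum(axis=1), 1))

# Toy setting: dataset A has views 0 and 1, dataset B has views 1 and 2,
# so view 1 is the only one the two datasets share.
rng = np.random.default_rng(0)
dims, latent_dim, n = [4, 3, 5], 2, 6
views = [rng.normal(size=(n, d)) for d in dims]
observed = np.array([[True, True, False]] * 3 + [[False, True, True]] * 3)
enc_W = [0.1 * rng.normal(size=(d, latent_dim)) for d in dims]
enc_log_var = [np.zeros(latent_dim) for _ in dims]
dec_W = [0.1 * rng.normal(size=(latent_dim, d)) for d in dims]

loss = masked_multichannel_loss(views, observed, enc_W, enc_log_var, dec_W)
```

Because the mask is applied both to the encoding view and to the reconstruction target, the values stored in place of a missing view have no influence on the loss, so no sample needs to be discarded or imputed before training.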

Keywords

Multi Task Learning; Missing Data; Variational Autoencoders; Multimodal Data Analysis; OPAL-Meso

Domains

Statistics [stat]; Machine Learning [cs.LG]

Main file

elsarticle-template-harv.pdf (8.09 MB)

Origin: Files produced by the author(s)

Contributor: Luigi Antelmi

https://inria.hal.science/hal-03114888

Submitted on: Tuesday, January 19, 2021, 11:47:30

Last modified on: Friday, April 5, 2024, 14:38:02

Long-term archiving on: Tuesday, April 20, 2021, 19:19:31

Dates and versions

hal-03114888, version 1 (19-01-2021)

hal-03114888, version 2 (06-05-2021)

Identifiers

HAL Id: hal-03114888, version 1

Cite

Luigi Antelmi, Nicholas Ayache, Philippe Robert, Federica Ribaldi, Valentina Garibotto, et al. Combining Multi-Task Learning and Multi-Channel Variational Auto-Encoders to Exploit Datasets with Missing Observations - Application to Multi-Modal Neuroimaging Studies in Dementia. 2021. ⟨hal-03114888v1⟩
