Specializing Static and Contextual Embeddings in the Medical Domain Using Knowledge Graphs: Let's Keep It Simple - Information, Langue Ecrite et Signée Accéder directement au contenu
Communication Dans Un Congrès Année : 2022

Specializing Static and Contextual Embeddings in the Medical Domain Using Knowledge Graphs: Let's Keep It Simple

Résumé

Domain adaptation of word embeddings has mainly been explored in the context of retraining general models on large specialized corpora. While this usually yields good results, we argue that knowledge graphs, which are used less frequently, could also be utilized to enhance existing representations with specialized knowledge. In this work, we aim to shed some light on whether such knowledge injection could be achieved using a basic set of tools: graph-level embeddings and concatenation. To that end, we adopt an incremental approach where we first demonstrate that static embeddings can indeed be improved through concatenation with in-domain node2vec representations. Then, we validate this approach on contextual models and generalize it further by proposing a variant of BERT that incorporates knowledge embeddings within its hidden states through the same process of concatenation. We show that this variant outperforms plain retraining on several specialized tasks, then discuss how this simple approach could be improved further. Both our code and pre-trained models are open-sourced for future research. In this work, we conduct experiments that target the medical domain and the English language.
Fichier principal
Vignette du fichier
ElBoukkouri_LOUHI2022.pdf (1.27 Mo) Télécharger le fichier
Origine : Fichiers éditeurs autorisés sur une archive ouverte
licence : CC BY - Paternité

Dates et versions

hal-04046746 , version 1 (26-03-2023)

Licence

Paternité

Identifiants

  • HAL Id : hal-04046746 , version 1

Citer

Hicham El Boukkouri, Olivier Ferret, Thomas Lavergne, Pierre Zweigenbaum. Specializing Static and Contextual Embeddings in the Medical Domain Using Knowledge Graphs: Let's Keep It Simple. International Workshop on Health Text Mining and Information Analysis (LOUHI), Dec 2022, Abu Dhabi (online), United Arab Emirates. ⟨hal-04046746⟩
66 Consultations
22 Téléchargements

Partager

Gmail Facebook X LinkedIn More