TDAC, the First Time-Domain Astrophysics Corpus: Analysis and First Experiments on Named Entity Recognition - Laboratoire d'Informatique pour la Mécanique et les Sciences de l'Ingénieur
Conference paper, Year: 2022

TDAC, the First Time-Domain Astrophysics Corpus: Analysis and First Experiments on Named Entity Recognition

Abstract

The growing interest in time-domain astronomy over the last decades has substantially increased the number of published observation reports, overwhelming astrophysicists' capacity to read, analyze, and classify this information. Because detected astronomical events are short-lived, information characterizing new phenomena must be communicated and analyzed very rapidly so that other observatories can react and conduct follow-up observations. This paper introduces TDAC, a Time-Domain Astrophysics Corpus, which is the first corpus based on astrophysical observation reports. We also present our NLP experiments on named entity recognition, based on our own annotations and on annotations from the WIESP DEAL shared task.
Main file
Alkan_WIESP2022.pdf (356.24 KB)
Origin: Publisher files allowed on an open archive
License: CC BY - Attribution

Dates and versions

hal-04046837, version 1 (26-03-2023)

License

Attribution

Identifiers

  • HAL Id: hal-04046837, version 1

Cite

Atilla Kaan Alkan, Cyril Grouin, Fabian Schüssler, Pierre Zweigenbaum. TDAC, the First Time-Domain Astrophysics Corpus: Analysis and First Experiments on Named Entity Recognition. Workshop on Information Extraction from Scientific Publications, Nov 2022, Taipei (Online), Taiwan. ⟨hal-04046837⟩
