Multi-disciplinary Research: Open Science Data Lake
Abstract
Open Science aims to foster interdisciplinary exchange among researchers through knowledge sharing and open data. However, such exchange requires sharing data across different research domains, and there is currently no simple software solution to this problem. Although the data lake is well suited to the variety and volume of data found in the Open Science context, it must be adapted to handle (1) data accompanied by metadata whose model depends on the domain and community of origin, (2) the coexistence of open and closed data within the same open data management platform, and (3) a wide diversity of pre-existing research data management platforms. We propose the Open Science Data Lake (OSDL), which adapts the data lake to this particular context and enables interoperability with pre-existing research data management platforms. We propose a functional architecture that integrates multi-model metadata management, virtual integration of externally stored (meta)data, and security mechanisms to manage the openness of platforms and data. We also propose an open-source, plug-and-play technical architecture that makes adoption as easy as possible. We set up a proof-of-concept experiment to evaluate our solution with different users from the research community and show that OSDL can meet the needs of transparent multidisciplinary data research.
Origin: Files produced by the author(s)