
Towards a FAIRer future

In a recent study, Assistant Professor Agneta Ghose from Aalborg University presents a workflow for the FAIRification of research data, which can serve as broad inspiration for Danish researchers aiming to strengthen their research through better data management practices.
26/03/2024 10:03
Image: Agneta Ghose
Photo: Agneta Ghose

"The guidelines and recommendations I outline in the article draw upon the experiences and practices of other researchers, which I have attempted to adapt to the LCA domain. In this way, the article serves as an example of how we as researchers can - and should - draw inspiration from good data management solutions across research domains." - Agneta Ghose, Assistant Professor, Aalborg University.

Our models are only as good as the data behind them

Agneta Ghose conducts research in Life Cycle Assessment (LCA), a tool used to assess the environmental impact of products and processes from cradle to grave. According to Ghose, LCA is garnering increasing attention as consumers and businesses focus on the environmental impacts of various products, placing high demands on the handling of the data that underpins research:

"With the significant focus on our results, it is important to clarify the data upon which our modeling systems rely, for ultimately, the quality of our models depends on the data behind them." - Agneta Ghose, Assistant Professor, Aalborg University.

As a researcher, she spends much of her time locating and accessing datasets and converting them into usable formats. She regularly encounters challenges related to inadequate data management, such as missing metadata. According to her, such challenges are common and impede and complicate the research process. Her interest was piqued when she first heard about the FAIR principles at a data management workshop.

Making Data Great Again: the FAIR Data Principles

Following the workshop, she began to investigate how data management is handled within the LCA domain and in other research fields, and in January 2024 she published the article "Can LCA be FAIR? Assessing the status quo and opportunities for FAIR data sharing" in the International Journal of Life Cycle Assessment.

"I wanted to examine how we as practitioners share our data and what standards guide us. What infrastructure do we have to support data sharing? And based on this: what opportunities exist to adopt better data management principles for our research?" - Agneta Ghose.

In the article, she maps out how - and to what extent - LCA data is currently shared and made accessible to other researchers and stakeholders in accordance with FAIR principles. Additionally, she analyzes the guidelines and infrastructure available to support data sharing within LCA research and presents a series of recommendations for better data management practices.

The FAIR principles, introduced in 2016, focus on making data findable, accessible, interoperable, and reusable across different scientific domains. The implementation of FAIR principles has been supported by initiatives such as the European Open Science Cloud (EOSC) and similar international projects aimed at promoting data sharing across disciplines and geographical boundaries. At the request of the Ministry of Education and Research, DeiC has developed a national strategy for data management based on FAIR principles, which has formed the basis for the implementation of FAIR data in Denmark since 2021.

Findability: Data should be easy to find for both humans and machines. This is achieved through the use of unique and persistent identifiers (PIDs), clear metadata, and appropriate placement in searchable registries or databases.

Accessibility: Data should be easily accessible to users, whether they are public or restricted. This entails data being open, or there being clear and reasonable access conditions, such as license agreements or permissions.

Interoperability: Data should be able to be combined and integrated with other data, regardless of origin or format. This requires uniform data formats, standardized terminologies and protocols, and clear relationships between data elements.

Reusability: Data should be easy to reuse for various purposes, including reproducing research results, integrating into new studies, or using in different contexts. This requires data to be well-structured, documented, and free from unnecessary limitations or barriers.
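As a rough illustration, the four principles above can be turned into a very simple checklist for a dataset description. The sketch below is an assumption-laden toy, not an official FAIR assessment: the field names (`pid`, `access_conditions`, `format`, `license`) and the function `fair_gaps` are hypothetical, and the set of formats treated as interoperable only reflects the formats named in this article.

```python
# Formats treated as machine-readable in this sketch (an illustrative
# assumption based on the formats named in the article, not an official list).
INTEROPERABLE_FORMATS = {"ecospold2", "ilcd", "json-ld"}


def fair_gaps(record):
    """Return a list of FAIR gaps found in a dataset description.

    `record` is a plain dict with hypothetical keys:
    pid, access_conditions, format, license.
    """
    gaps = []
    if not record.get("pid"):
        gaps.append("findable: no persistent identifier")
    if not record.get("access_conditions"):
        gaps.append("accessible: access conditions not stated")
    if (record.get("format") or "").lower() not in INTEROPERABLE_FORMATS:
        gaps.append("interoperable: format is not machine-readable")
    if not record.get("license"):
        gaps.append("reusable: no license")
    return gaps


# A dataset shared only as a PDF supplement, without its own identifier.
pdf_supplement = {
    "pid": None,
    "access_conditions": "open",
    "format": "PDF",
    "license": "CC BY 4.0",
}

for gap in fair_gaps(pdf_supplement):
    print(gap)
```

A real assessment is more nuanced than a yes/no check per principle, but even a checklist like this makes missing identifiers and licenses visible before publication.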

To achieve these goals, support is needed through a so-called FAIR ecosystem, which includes various elements such as data management plans (DMPs), data repositories, technological support, data policies and standards, and data producers/users.

Ghose particularly highlights data management plans as an important element. Rather than just being a record of data storage and backup information, DMPs should be living documents containing information about all data and related results in a research project. They should be regularly updated and comply with FAIR principles by being open and accessible, containing essential metadata, and being stored in interoperable formats in trusted repositories.

So, how's it going? A deep dive into LCA data sharing

To assess the current practice of data sharing, Ghose examined 25 peer-reviewed LCA articles, and the study reveals several challenges regarding FAIR. Out of the 25 publications, only one contained a separate PID for the dataset, which is crucial for its findability. Additionally, only seven of the 25 studies were published with an open access license, limiting their accessibility to other researchers. The study also identifies several challenges regarding data format interoperability. Only three studies shared life cycle inventory (LCI) data in an interoperable format, while others shared data in PDF or Word documents. The results underscore the need for clear guidelines for sharing LCA data to ensure their FAIRness.
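Expressed as shares of the 25 reviewed articles, the counts above correspond to 4%, 28% and 12%. The short calculation below simply reproduces that arithmetic from the figures reported in the text; the variable names are illustrative.

```python
# Counts reported for the 25 reviewed LCA articles.
TOTAL_ARTICLES = 25
counts = {
    "separate PID for the dataset": 1,
    "open access license": 7,
    "LCI data in an interoperable format": 3,
}

# Convert each count into a percentage of the reviewed articles.
shares = {label: 100 * n / TOTAL_ARTICLES for label, n in counts.items()}

for label, share in shares.items():
    print(f"{label}: {share:.0f}% of articles")
```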

Furthermore, the study highlights the need to develop standards for LCA data sharing. Currently, there are no overarching standards specifying a procedure for sharing LCA data. While the ISO standard offers technical specifications for reporting LCI data, it does not cover the principles of FAIR data sharing. LCA data formats such as EcoSpold2 and ILCD would need to extend their templates to ensure compliance with FAIR principles.

Workflow for Sharing LCA Data according to FAIR Principles: a step-by-step guide

Based on the study's results, Ghose recommends the following workflow:

Image: FAIR workflow
Photo: https://link.springer.com/article/10.1007/s11367-024-02280-3

An overview of the FAIR data ecosystem. From Ghose (2024), "Can LCA be FAIR? Assessing the status quo and opportunities for FAIR data sharing", International Journal of Life Cycle Assessment.

  1. Data collection: Collect data from industrial activities or existing LCA studies.
  2. Data labeling: Add relevant labels to data and define relationships between them using domain-specific terms.
  3. Use of machine-readable formats: Share data in common LCA formats such as EcoSpold2 or ILCD. Also use the JSON-LD format so that the data can be published as semantic (linked) data.
  4. Metadata: Describe the dataset using the metadata descriptors of GLAD (the Global LCA Data Access network) to enhance data accessibility and discovery.
  5. License: Add a license to the dataset to ensure proper reuse. Consider different licenses, but remember that restrictive licenses can hinder reuse.
  6. Data publication: Publish the FAIR-adapted LCA data in a reliable digital repository such as Zenodo or Figshare.
  7. Use and acknowledgment of FAIR data: Encourage the reuse of FAIR data and acknowledge contributions from those supporting FAIR data.
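To make steps 3 to 6 more concrete, here is a minimal sketch of what a machine-readable dataset description in JSON-LD could look like. It is not taken from Ghose's article: it assumes the general-purpose schema.org `Dataset` vocabulary rather than GLAD's descriptors, the helper `build_dataset_record` is hypothetical, and the DOI and all field values are placeholders.

```python
import json


def build_dataset_record(name, description, identifier, license_url,
                         file_format, keywords):
    """Return a JSON-LD description of a dataset using schema.org terms.

    Assumption: schema.org's Dataset type is used for illustration only;
    a real LCA dataset would typically also carry domain-specific metadata.
    """
    return {
        "@context": "https://schema.org/",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "identifier": identifier,       # persistent identifier (step 6)
        "license": license_url,         # explicit license (step 5)
        "encodingFormat": file_format,  # machine-readable format (step 3)
        "keywords": list(keywords),     # descriptive labels (steps 2 and 4)
    }


record = build_dataset_record(
    name="Example life cycle inventory dataset",
    description="Hypothetical LCI data for an illustrative industrial process.",
    identifier="https://doi.org/10.xxxx/placeholder",  # placeholder, not a real DOI
    license_url="https://creativecommons.org/licenses/by/4.0/",
    file_format="application/ld+json",
    keywords=["LCA", "life cycle inventory"],
)

# Serialize the record so it can be deposited alongside the data files.
print(json.dumps(record, indent=2))
```

Repositories such as Zenodo assign the persistent identifier at publication time, so in practice the `identifier` field would be filled in once the dataset has been deposited.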

Ghose emphasizes that the workflow is largely inspired by best practices from other research areas and can therefore also be used by researchers outside the LCA domain.

"I'm not a data management expert, but a researcher trying to find better data management practices that can strengthen the domain I work in. And the solutions I present in the article are things I've learned from other articles on FAIR data management and adapted to LCA, so I also believe that other researchers can be inspired by them. Data management is of crucial importance in all research domains. And as the research world becomes increasingly digitized, the challenges we face become more and more similar. Therefore, there is also reason to look for solutions and seek inspiration across research areas and to have a better dialogue." - Agneta Ghose, Assistant Professor, Aalborg University.

Towards a FAIR future – How can DeiC contribute?

As an educator, Agneta Ghose makes a point of emphasizing the importance of good data management practices when teaching young researchers. However, she also points out that while certain solutions can be handled by individual researchers, others require greater effort in the form of introducing standards and domain-specific repositories. She also calls for increased focus on competence and technology development, as well as strategic funding and recognition of best practice to ensure knowledge sharing and progress in the field.

While establishing standards and research-specific repositories can be addressed within each research domain, Agneta Ghose emphasizes DeiC's role in providing the necessary infrastructure, for example for data management plans, and in acting as a knowledge broker.

"I believe that DeiC can play an important role in contributing solutions at a more overarching level. Among other things, by identifying examples of good solutions and making them available in the form of workflows and guidelines for researchers across research areas and by offering support in terms of how best to utilize the resources available." - Agneta Ghose, Assistant Professor, Aalborg University.