Peter Wittenburg, Research Data Alliance (RDA)


Despite a number of exceptions, data practices in general do not allow easy integration and re-purposing of data to extract new knowledge. A European survey based on about 120 intensive interactions with experts from different disciplines and different types of institutions indicates that much legacy data is still being created that can only be made re-usable by investing large sums in curation. It also indicates that senior researchers are aware of the inefficiency of current data-intensive research, but hesitate to invest because, in general, we lack guidance through a huge solution space and we lack data professionals who could change practices.

On the other hand, we see in the realm of Open Science and Open Data that the pressure to repurpose data (and also tools), and thus to enable innovation, is increasing. The question is then how we can accelerate agreement finding towards best practices, which would help to make data work much more efficient and thus reduce costs. Science can play a pioneering role again, since the large companies simply have their business in mind, just as they did when the basics of the Internet were discussed some decades ago.

Science is increasingly global, since the challenges are global and many scientific questions can only be answered when a global perspective is taken. Despite traditional structures that are still being maintained, science is also increasingly interdisciplinary. At the same time, trends have shown clearly that many data issues (including analytics) are not specific to any one discipline. Successful agreement forming therefore needs to cross boundaries between countries and disciplines, which may make it a difficult and time-consuming process.

The Research Data Alliance (RDA) has therefore been established to speed up agreement forming across disciplines and national boundaries. Its key elements are a “bottom-up process driven by practitioners”, “results to overcome specific barriers within 18 months” and “rough consensus”. These principles have led to the first 5 concrete outputs within the first 20 months of its life, and to a few other activities such as “Data Fabric”, which tries to place the various activities on a common landscape to show their relationships and to harmonize them. Still, RDA is a very young initiative, and adopting the main principles of action of the Internet community does not guarantee success. But it is a chance that we need to make use of.

The talk will discuss data principles such as those of the G8 and their implications, as well as widely agreed common trends, and will sketch the components and services required to turn principles and trends into practice. Against this background I will illustrate the first concrete RDA results.
Trained as an Electrical Engineer with a specialization in Digital Signal Processing and Computer Science, I worked as Technical Director of the Max Planck Institute for Psycholinguistics from 1976 to 2012, where I was responsible for the methodological and technological facilities and for innovation. At this MPI all kinds of state-of-the-art methods and technologies were applied to understand how the human brain processes and acquires languages.
From 2012 to 2014 I acted as director of the newly founded language archive, developing professional software technology and setting up the world's largest archive of language material.
In 2014 I joined the Max Planck Center for Data and Computation as Senior Advisor and focused my work on data infrastructure and the Research Data Alliance, of which I was a founding member.
From September 2015 on I will coordinate the European RDA project. From 1988 until 2014 I was a member of the central IT Advisory Board of the Max Planck Society, and since 2000 I have participated in many European and national projects in leading roles.


