RDMF12: notes from breakout group 3 (Crossing Boundaries)

Martin Donnelly | 02 December 2014

"Crossing boundaries: How do we cope with crossing institutional boundaries to communicate with systems we can’t control?"

This breakout group was proposed and chaired by Jez Cope, Research Data Support Manager at Imperial College London. Jez began by outlining his motivation for proposing the topic, namely a recent conversation with an astrophysicist who needs to process data originating from multiple facilities, in different parts of the world, and the challenges presented in capturing this in an institutional data catalogue.

Part of the challenge was knowing what internal systems were in use and relevant. One of the first things we did in this RDMF event was create a crowdsourced list of data-relevant systems, and collectively we came up with 100 of them. This was by no means exhaustive, and it is a huge challenge for a single institutional data manager to develop and maintain a coherent and accurate picture of the changing landscape of systems used within – and beyond – his or her institution.

Some frustration was expressed that new systems are still being designed and rolled out without interoperability being ‘baked in’ from the start. To an extent this is understandable when commercial vendors want to lock their customers in to a particular solution, but it’s annoying when publicly-backed systems such as Researchfish don’t have the flexibility to pass information automatically to and from other systems. In library systems, workarounds and manual workflows have been developed to overcome these problems, but they remain an unnecessary irritant and a drain on resources. The REF software and EPrints plugins were cited as examples of this.

Furthermore, the use of standards and APIs helps to mitigate the risk of future change, and these should be encouraged (or demanded) wherever possible. One problem with the procurement process is that the person making the ultimate decision over which system to invest in often has a single perspective, and so long as the software meets his or her direct needs, little additional thought is given to the other systems which will need to interface with it in the future!

We spoke a little about the researchers’ perspective. They tend to see this as a problem only if it means they get hassled to produce information or, worse, to produce and enter it more than once. A certain amount of this can be taken care of by institutional/library support staff, but there is a question mark over whether this is the best use of library staff time. The general consensus was that only the researchers themselves can produce sufficiently high-quality metadata, so incompatible (or insufficiently interoperable) subsystems are actually quite a serious problem.

We came up with a list of three concrete outcomes that could emerge from the discussion:

  1. A checklist of software functionality that libraries/institutional data coordinators can provide to senior managers to help them choose products. This would help to formulate requirements and subsequently match them against potential solutions, potentially using Tim Berners-Lee’s ‘5 star’ Open Data scheme to rate offerings. We did note that Jisc’s single procurement framework for CRIS systems is a start, at least.
  2. We identified a potential role for the RDA to provide guidance on implementation of standards for data governance and management, in keeping with its consensus-driven approach. This would require all institutions to sign up voluntarily to a code of conduct, specifically committing to working with owners of existing standards instead of tweaking them to fit their own special use cases! (We also wondered whether there might be a role for DCC and/or Jisc to do the lobbying/advocacy work with the funders and publishers, and whether this might fit into the proposed RDRDS 2?)
  3. And finally, a concrete outcome that we can all start on immediately: building relationships with other support services within our institutions BEFORE we need to ask them for something. Find some contacts at your level and invite them for coffee and a chat; try to understand what they do and help them understand what you do! We noted that system silos and human silos are different problems, but very much related.
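
As a rough illustration of how the checklist in the first outcome might be put to work, here is a minimal Python sketch. It is not something the group produced. The requirement names and the two candidate systems are hypothetical, and applying the cumulative levels of the ‘5 star’ scheme (paraphrased in the comments) to what a system can export is an assumption made purely for the example.

```python
from dataclasses import dataclass, field

# Paraphrase of the five cumulative levels of the '5 star' Open Data scheme.
# Using them to describe a system's export capability is an assumption.
STAR_CRITERIA = [
    "open_licence",         # 1 star: available on the web under an open licence
    "machine_readable",     # 2 stars: structured, machine-readable data
    "non_proprietary",      # 3 stars: non-proprietary format
    "uses_uris",            # 4 stars: URIs used to identify things
    "links_to_other_data",  # 5 stars: linked to other data for context
]


@dataclass
class Candidate:
    """A system under consideration and the features it claims to offer."""
    name: str
    features: set = field(default_factory=set)


def star_rating(candidate):
    """Count stars cumulatively: stop at the first criterion not met."""
    stars = 0
    for criterion in STAR_CRITERIA:
        if criterion not in candidate.features:
            break
        stars += 1
    return stars


def unmet_requirements(candidate, requirements):
    """Return the checklist items the candidate does not satisfy, sorted."""
    return sorted(set(requirements) - candidate.features)


# Hypothetical institutional checklist (illustrative names only).
REQUIREMENTS = ["documented_api", "bulk_export", "standard_metadata_schema"]

# Hypothetical offerings (not real products).
system_a = Candidate("System A", {"open_licence", "machine_readable",
                                  "non_proprietary", "documented_api",
                                  "bulk_export"})
system_b = Candidate("System B", {"open_licence", "uses_uris",
                                  "documented_api"})

for system in (system_a, system_b):
    print(system.name, star_rating(system),
          unmet_requirements(system, REQUIREMENTS))
```

Because the levels are cumulative, a system that skips a lower level gets no credit for a higher one, which is why the hypothetical System B scores only one star despite using URIs.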

The major selling points of these outcomes would be genuine efficiencies in terms of time and money. But widespread cooperation is by no means guaranteed, as it’s not in vendors’ interests to make it easy to chop and change between systems. They want you locked in.

(Report by Martin Donnelly, DCC)