
Workshops

All workshops took place on Monday 6 December 2010

Digital Curation 101 Lite

Presenters: Joy Davidson, Sarah Jones, Martin Donnelly, DCC, UK.

Overview:

Earlier this year the National Science Foundation (NSF) announced that it plans to mandate the provision of a two-page data management plan with all new funding applications. The NSF is not alone in its efforts to improve accountability for data management, and Research Councils and funding bodies around the world are starting to seek evidence that adequate and appropriate provisions for data management and curation have been considered from the outset of any new funded activity.

The DCC’s Digital Curation 101 Lite workshop will provide researchers with an overview of the key issues they need to consider and help them to develop and implement sound data management and curation plans. This one-day workshop will provide an introduction to data management and curation, the range of activities and roles that should be considered when planning and implementing new research projects, and an overview of tools that can assist with management and curation activities. A key aim for these workshops is to share data management and curation experiences and to foster critical thinking among participants. Real-life examples and lessons learned through the ongoing JISC 07/09 Research Data Management Programme projects will be presented throughout the course wherever possible. A key feature of this particular workshop will be to introduce participants to the new DCC Data Management Plan (DMP) Template and to explore how it can be used to help meet funding bodies’ data management plan requirements.

Aims: Participants in this workshop will:

  • explore how data management and curation can support and safeguard research
  • explore the digital curation lifecycle and how it relates to a range of stakeholders
  • identify the processes and activities involved in good practice for research data management
  • be aware of the free services and tools available to help them manage and curate their data

Audience: The target audience for this workshop is researchers and those who support research activity (data managers, information specialists) with funding body data management and curation mandates to fulfil. A key goal will be the integration of these communities of practice to share their experiences and to identify where, when and how they can best cooperate to meet data curation challenges.

Format: Presentations and group exercises

Length of workshop: Full-day

Cost: $85

CURATEcamp: An Unconference on Digital Curation Tools

Presenters: Declan Fleming, Michael J. Giarlo, Patricia Hswe, Pennsylvania State University & University of California, San Diego

Overview: As the community of digital curation practitioners has grown, so has the need for collaboration and community. A small number of communities have been formed around digital curation, a few of which focus on the technical aspects of the practice. Extant communities address the implementation and support needs of specific curation platforms, without a broader focus on common services and potential points of intersection. There is, however, a rich ecosystem of tools, practices, and standards around these platforms, and some that require no such platforms, which have the potential to benefit the wider community of practitioners. CURATEcamp is an unconference-style workshop for practitioners of digital curation to share best practices and discuss tools and technologies in a free-form and highly interactive forum. Topics of interest might include identifiers, versioning, transfer, packaging, object structure, file system usage, archiving/storage, metadata standards/vocabularies, discovery, and interoperability. The unconference format ensures that all participants are actively engaged in the workshop and gives everyone an opportunity to contribute. Activities may include roundtable discussions, presentations, whiteboard sessions, collaborative software development, and whatever else emerges from the collective creativity of participants.

Aims: CURATEcamp is an opportunity to build a community of practice around curation tools, which bridges system-specific gaps that have formed in the community. It will encourage discussion about curation tools and practices across software-, project-, and institution-specific boundaries, and attempt to identify best practices and points of collaboration across these boundaries.  The community that CURATEcamp nurtures is intended to persist beyond the end of the IDCC, so another point of discussion will be around how to maintain connections between face-to-face gatherings. The informal approach of CURATEcamp might also serve as a way to model knowledge sharing for the curator community, not unlike what occurs at BarCamp events, which are loosely structured but highly productive participatory sessions.

Audience: CURATEcamp will be of interest to digital curation practitioners (curators and technologists alike), especially those who have been using and building tools and architectures, and digital curators with experience assessing or evaluating curation tools and services.

Format: Unconference (defined on Wikipedia as "a facilitated, participant-driven conference centered on a theme"). The facilitator will use a wiki in advance of the session to engage participants and jot down discussion ideas. A hashtag will be suggested so that participants can broadcast the session via social media as it occurs and so involve virtual participants.

Length of workshop: Full-day

Cost: $85

Improving researchers’ competency in information handling and data management through a collaborative approach

Presenters: Joy Davidson, Digital Curation Centre (DCC); Stéphane Goldstein, Research Information Network (RIN); Simon Hodson, JISC

Overview: In 2008, the Research Information Network published a report, Mind the Skills Gap, which concluded that training for researchers in the UK on information-handling and management is uncoordinated and generally not based on any systematic assessment of needs. The report called for better coordination between relevant organisations and interest groups to ensure that professional development programmes are provided for researchers.

The Working Group on Information-Handling, co-ordinated by RIN, has been established to promote greater coordination and a more strategic approach nationally with regard to the provision of such training for HE researchers.

This workshop will demonstrate how the organisations on the Working Group (BAILER, British Library, CILIP, DCC, HEA, RIN, RLUK, SCONUL, UKCGE, UUK, Vitae) are working in collaboration on projects and activities to co-ordinate and lead development in this area. The workshop will focus particularly on two themes: exploring how the new Researcher Development Framework (RDF) addresses information handling and data management, and integrating the RDF into tools, resources and information for staff who support and train researchers in information handling and the development of information literacy.

Topics covered

  • A brief overview of the activities of the RIN Working Group to date in creating an ‘information handling and data management lens’ for the RDF.
  • The use of the RDF to highlight the importance of information handling and data management skills (information literacy) as an integral and indispensable part of the research process, for instance by mapping existing good-practice examples of information handling training against the RDF.
  • The use of the RDF as a benchmarking tool for graduate level data handling skills and comprehension courses, and as a means of identifying convergence with graduate level information science courses.
  • The prospects and effectiveness of collaborative working through other project examples, including initiatives by JISC and SCONUL.

Aims

Participants will:

  • Develop a greater understanding and awareness of the composition and activities of the Working Group as a potential resource and source of information.
  • Share ideas and existing practice in researcher training in information handling and data management, and contribute to the development of an ‘information handling and data management lens’ for the RDF which could be transferred to other specialist areas.
  • Understand how the RDF might contribute towards benchmarking and improving information handling and data management courses for researchers at different stages of their careers.

Length of workshop: Half day (morning)

Cost: $50

Introduction to the Data Curation Profile

Presenters: Scott Brandt & Jacob Carlson, Purdue University

The Data Curation Profile was developed by librarians at Purdue University and researchers at the Graduate School of Library and Information Science at the University of Illinois as a component of a two-year research project, supported by the Institute of Museum and Library Services, on identifying faculty needs in sharing their data. The Profiles were developed as a means of capturing requirements for specific data generated by a single scientist/scholar or lab, based on their directly reported needs and preferences for these data. The Data Curation Profile tool that grew out of this research is designed for use by librarians and other information professionals as a means of launching discussions with researchers about data generated and used in areas that may be published, shared, and archived for re-use and dissemination. Completed Profiles have a variety of potential uses, including informing policies and practices surrounding the curation of a particular data set and helping to plan the development of data services that directly address researcher needs.

Aims: The specific objectives of this workshop are:

  • To provide attendees with a context of data curation issues from the perspective of single-PI research.
  • To introduce the Data Curation Profile as a tool, discussing how it was developed and what its components are.
  • To discuss the application of the Data Curation Profile tool to generate profiles of researchers' data and their needs relating to the data.
  • To explore how completed profiles could be used for individual, institutional, and research purposes.

Audience: This workshop is aimed at practitioners who need or want to participate in efforts to address data as a valuable research output, particularly for dissemination and/or repository collection.  It will benefit frontline librarians/information specialists who want to be proactive with, or who need to react to, researchers who encounter problems and ask for help dealing with data. It will benefit repository managers who may be called on to work with research data as a supplementary material or as a collection within their institutional repository.  It will also benefit those who have to extend their activities to include additional approaches to dealing with research data.

Format: Presentation with discussion

Programme

12.30 Lunch
13.30 Welcome and Introductions
13.45 Data in the single PI/small lab context
14.10 Overview of Data Curation Profiles research project
15.00 Break
15.20 Data Curation Profiles Toolkit
16.10 Conducting a Profile
16.45 Current and potential use for Profiles
17.30 Close

Cost: $50

Scaling-up to Integrated Research Data Management

Organisers: Manjula Patel & Liz Lyon, DCC & UKOLN, University of Bath.

Overview: Structural Science incorporates a number of disciplines including Chemistry, Physics, Materials, Earth, Life, Medical, Engineering, and Technology. Within these disciplinary communities, scientific research is conducted at a range of scales, from small laboratory equipment through institutional installations to large-scale facilities such as the synchrotron facilities at CERN, the DIAMOND Light Source (DLS) and ISIS, based at the Science and Technology Facilities Council (STFC). With improvements in technology there is an increasing demand to make raw, processed and derived data available for validation and reanalysis, which means that these types of data must be managed alongside the final results data.

It is, however, apparent that many research teams capture, manage, discuss and disseminate their data in relative isolation, with highly fragmented data infrastructures and poorly integrated software applications. In addition, a low awareness of data curation and preservation issues leads to data loss and reduced productivity.

On the other hand, large centralised facilities have a responsibility to provide a data management infrastructure for their users and have spent considerable effort designing and implementing such systems. The outcome is that each large-scale facility has its own, often insular, approach to data management, resulting in vast ‘data silos’.

This workshop, organised by the JISC-funded I2S2 (Infrastructure for Integration in Structural Sciences) Project, aims to explore and highlight a variety of approaches currently under investigation for alleviating the data management problems that result from working at differing scales of science and across organisational boundaries.

The presentations will cover a number of different types of data including chemistry, earth science, climate change and earthquake simulation data.

Aims: The purpose of this workshop is to explore a variety of issues relating to scale and integration in research data management, from science conducted at the local bench-top level to large-scale facilities such as those at CERN and STFC.

Audience: The workshop will be of relevance to a variety of stakeholders interested in ways to improve research data management over differing scales of science and across organisational boundaries; this includes individual research scientists and large-scale facilities managers, as well as computing services and funding agencies.

Format: Presentations, panel, discussion, networking opportunities

Length of workshop: Half day (afternoon)

Programme

12.30 Lunch
13.30 Welcome and Introductions, Liz Lyon, DCC & UKOLN, University of Bath
13.35 Integrated research data management in the Structural Sciences, Manjula Patel, I2S2 Project, DCC & UKOLN
14.00 A Federated Repository for large scientific datasets, Steve Androulakis, Monash University
14.30 Data: A legacy of NEES, Shirley Dyke, Professor of Mechanical Engineering and Civil Engineering & Director of the Intelligent Infrastructure Systems Laboratory, Purdue University
15.00 Tea and Coffee break
15.20 Integrating Data Management into Climate Change Science Research, Bruce E. Wilson, ORNL Climate Change Science Institute
15.50 DataONE: Preserving Data and Enabling Innovation in the Biological and Environmental Sciences, William Michener, Professor and Director of e-Science Initiatives for University Libraries, University of New Mexico
16.20 Panel/Discussion, facilitated by Liz Lyon, DCC & UKOLN, University of Bath
16.50 Conclusion & Closing Remarks, Simon Hodson, JISC MRD Programme Manager

Cost: $50
