By David Lamb (Big Communications), Claudio Prandoni (Consorzio Pisa Ricerche), and Joy Davidson (DCC)

Published: April 2009

1. Introduction

CASPAR (Cultural, Artistic and Scientific knowledge for Preservation, Access and Retrieval) is a four-year, multi-partner, EU-funded initiative (IST-2005-2.5.10) that aims to research, implement, and disseminate innovative solutions for digital preservation based on the OAIS reference model (ISO 14721:2002). The project is investigating digital preservation challenges and validating potential CASPAR solutions from the perspective of three distinct domains: scientific, cultural heritage, and creative arts. The CASPAR consortium has been established to ensure adequate coverage of each of the three specialist areas and also includes commercial partners and information preservation experts.

2. CASPAR architecture

CASPAR is working to design and develop a distributed, OAIS-based archive system which provides access to a suite of flexible, sustainable, and interchangeable digital preservation services. The following basic tenets have guided the overall development of the CASPAR architecture:

  • OAIS compliance: CASPAR components should be compliant with the OAIS Reference Model, the main standard of reference in digital preservation.
  • Technology neutrality: The CASPAR solution should be technology-neutral. This means that the CASPAR preservation environment will be implementable using new technologies as they emerge.
  • Loosely-coupled architecture: CASPAR should adopt a distributed, very loosely coupled, highly asynchronous architecture in which each key component can be deployed on a different platform and framework without depending on the others. This means that each key component is self-contained and portable.
  • Domain independence: The CASPAR framework should be applicable to multiple domains/contexts, including both public and private organisations, with very little additional effort.
  • Preservation of intelligibility and knowledge dependencies: CASPAR should offer methodological and technological solutions in order to preserve information and knowledge, not just the "bits", in particular by means of Semantic Representation Information.
  • Preservation of authenticity and digital rights: The CASPAR framework should be able to guarantee the integrity and identity of the information preserved as well as the protection of digital rights.
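
The tenet on knowledge dependencies can be made concrete with a small sketch. The Python below is not CASPAR code: the dependency map, the item names, and the function `missing_repinfo` are all hypothetical, chosen only to illustrate how one might work out which Representation Information a community still lacks before it can understand a data object.

```python
from typing import Dict, List, Set

# Hypothetical dependency map: each item lists the Representation
# Information (RepInfo) a reader needs in order to understand it.
DEPENDENCIES: Dict[str, List[str]] = {
    "observations.dat": ["binary layout description", "units glossary"],
    "binary layout description": ["data description language manual"],
    "units glossary": [],
    "data description language manual": [],
}


def missing_repinfo(item: str, known: Set[str],
                    deps: Dict[str, List[str]]) -> Set[str]:
    """Return every RepInfo item needed for `item` that is not in `known`."""
    missing: Set[str] = set()
    seen: Set[str] = set()
    stack = list(deps.get(item, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if current in known:
            # Already understood by the community, so its own
            # dependencies do not need to be followed.
            continue
        missing.add(current)
        stack.extend(deps.get(current, []))
    return missing


# Assumed knowledge base of an example community.
community_knowledge = {"units glossary"}
gap = missing_repinfo("observations.dat", community_knowledge, DEPENDENCIES)
print(sorted(gap))
```

In this toy model, preserving "more than the bits" amounts to storing the items in `gap` alongside the data object, so the chain of dependencies ends in something the community already knows.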

3. Functionality

To ensure that each of the above tenets is achieved and that the six functional components of the OAIS model are addressed, the CASPAR Architecture Team have defined eleven key components for the CASPAR Overall Component Architecture and Component Model. These key components are:

  1. Registry/Repository of Representation Information (RRORI): This component provides persistent storage and retrieval of Representation Information (RepInfo) and Preservation Description Information (PDI) in a centralised registry/repository. It builds upon previous work carried out by the Digital Curation Centre.
  2. Knowledge Manager (KM): This component supports a number of high-level knowledge management services for digital information preservation systems, all based on Semantic Web technologies. Specific services aim to support the definition of Designated Communities, to facilitate the development of new Representation Information, to identify missing Representation Information, and to assist in searching for relevant data objects.
  3. Preservation Orchestration Manager (POM): The POM's main functions are to manage the subscriptions of Data Holders and their registered interests; to collect information about changes in the Designated Community knowledge base; to accept notifications from Data Preservers for specific events/topics; and to identify and send alerts to experts able to solve Representation Information gaps. The POM utilises a basic Publish-Subscribe model.
  4. Representation Information (RepInfo) Toolbox: The RepInfo Toolbox provides support for the creation, maintenance, and reuse of Representation Information.
  5. Preservation Data Store (PDS): The PDS provides built-in support for both bit and logical preservation at the storage level. This component understands elements of the OAIS Archival Information Package (AIP), such as the Data Object, Representation Information, and Preservation Description Information sub-components, and works to associate each with its own Representation Information.
  6. Data Access Manager and Security (DAMS): The CASPAR access control model is based on the definition, enforcement, and evaluation of access control policies. For each resource, an access control policy may be declared binding users to specific permissions for particular objects.
  7. Digital Rights Manager (DRM): The DRM component defines, distributes, enforces, verifies, and preserves digital rights on content and services. The main challenge lies in enabling the users of tomorrow to make use of copyrighted works of today while complying with existing legal restrictions and guaranteeing protection to rights holders.
  8. Finding Aids: This component provides information discovery services and is split into two basic areas: the Finding Registry and the Finding Manager. The Finding Registry supports the publication and discovery of Finding Managers, in the same way a UDDI server supports the publication and discovery of Web Services. The Finding Manager supports the management of Descriptive Information (DescInfo) and is bound to a Data Definition Language for defining the managed DescInfo and to a Query Language for querying the managed DescInfo. This choice makes the architecture independent from any specific technology or de facto standard and achieves data and query language implementation independence.
  9. Virtualisation Toolbox: Virtualisation utilises encapsulation to hide detail and provide a simplified interface to an array of underlying technological applications. Because the range of rapidly evolving technologies in use is hidden from the higher-level applications that rely on them, virtualisation offers benefits for preservation. However, virtualisation interfaces themselves can become obsolete and therefore require adequate description for long-term usability. This component enables the capture of such 'virtualisation descriptions' as RepInfo for deposit in RRORI.
  10. Packaging: The Package Manager's main functions are the construction of information packages, the un-packaging of information packages, enabling access to and the manipulation of information objects, and the validation and storage of information packages.
  11. Authenticity Management: CASPAR's Authenticity Management tools monitor and manage protocols and procedures across the custody chain in order to maintain and verify authenticity in terms of identity and integrity of the digital objects.
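
The Publish-Subscribe model mentioned for the Preservation Orchestration Manager can be sketched in a few lines of Python. This is an illustration only, not the CASPAR POM interface: the class `Orchestrator`, its methods, and the topic strings are hypothetical names introduced for the example.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

# A handler receives the topic and the message text.
Handler = Callable[[str, str], None]


class Orchestrator:
    """Toy publish-subscribe broker (hypothetical, not the CASPAR POM API)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Handler) -> None:
        """Register a handler, e.g. on behalf of a Data Holder or an expert."""
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, message: str) -> int:
        """Send a notification to every subscriber; return how many got it."""
        handlers = self._subscribers.get(topic, [])
        for handler in handlers:
            handler(topic, message)
        return len(handlers)


inbox: List[str] = []
pom = Orchestrator()

# An expert registers interest in RepInfo gaps for one (assumed) topic.
pom.subscribe("repinfo-gap/astronomy", lambda t, m: inbox.append(f"{t}: {m}"))

# A Data Preserver reports an event; only matching subscribers are alerted.
delivered = pom.publish("repinfo-gap/astronomy", "Viewer for format X is obsolete")
ignored = pom.publish("repinfo-gap/music", "No subscribers for this topic")
print(delivered, ignored, inbox)
```

The publisher never refers to a subscriber directly, which is what keeps the parties loosely coupled. A real deployment would presumably deliver messages asynchronously rather than by a direct function call.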
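
The access control model described for DAMS, in which a policy declared for a resource binds users to specific permissions, might look like the following minimal sketch. The `AccessPolicy` class, the deny-by-default rule, and the user and permission names are assumptions made for illustration; the source does not specify how CASPAR expresses or evaluates its policies.

```python
from dataclasses import dataclass, field
from typing import Dict, Set


@dataclass
class AccessPolicy:
    """Hypothetical policy for one resource: user -> granted permissions."""
    resource: str
    grants: Dict[str, Set[str]] = field(default_factory=dict)

    def allow(self, user: str, permission: str) -> None:
        """Definition step: bind a user to a permission on this resource."""
        self.grants.setdefault(user, set()).add(permission)

    def is_permitted(self, user: str, permission: str) -> bool:
        """Evaluation step: deny unless the permission was explicitly granted."""
        return permission in self.grants.get(user, set())


policy = AccessPolicy(resource="aip-0001")
policy.allow("curator", "read")
policy.allow("curator", "update")
policy.allow("guest", "read")

print(policy.is_permitted("curator", "update"))
print(policy.is_permitted("guest", "update"))
```

Enforcement would then consist of calling `is_permitted` before any operation on the resource and refusing the operation when it returns `False`.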

4. Selected Implementations

CASPAR is testing preservation scenarios and strategies within three distinct testbed domains in order to validate its conceptual model and architectural solutions. Testbed participants include:

In addition to developing the components listed above, the CASPAR project team is working to inform and influence the development of an EU-wide infrastructure to support digital preservation through its involvement in the related PARSE.Insight project and its membership of the Alliance for Permanent Access to the Records of Science.

The CASPAR project is also actively involved in progressing the development of an ISO standard for audit and certification of repositories of digital information through its leadership of the MOIMS-Repository Audit and Certification Working Group.

5. Additional Resources
