Archivematica – Open Source System for Digital Archiving

Summary: The paper offers a basic overview of the Archivematica system, which is intended to support long-term protection and archiving of digital information in conformance with the OAIS reference model. It has been developed by the Canadian company Artefactual Systems Inc. and is available as a free open source solution. A growing number of institutions use or plan to use Archivematica as a low-cost way to address urgent digital preservation problems without massive investment in expensive commercial solutions.

Keywords: digital preservation, digital archiving, Archivematica, OAIS, low cost solution

RNDr. Miroslav Bartošek, CSc., Masaryk University, Institute of Computer Science, Botanická 68a, 602 00 Brno, Czech Republic

The article was written within the CESNET Development Fund research project “Pilot project for low-barrier approach to digital preservation (LTP-Pilot)”, project No. 516R1/2014[1][2]

1.   Introduction

Until recently, long-term digital preservation was the exclusive domain of large institutions such as national libraries and national archives, which usually had the necessary mandates, finances and expert resources. These institutions often focused on building large, monolithic systems based on rather costly commercial solutions (e.g. Rosetta from Ex Libris). However, advances in theory and practice, along with the growing need to address digital archiving in smaller institutions, led to the realization that even institutions with limited resources can begin creating their own solutions and need not wait for what the large institutions can offer. Archivematica is one of the new long-term preservation systems that have emerged in recent years and support this trend.

1.1  About Archivematica

Archivematica[3] is a freely available open source system supporting long-term digital preservation. It has been developed since 2008 by the Canadian company Artefactual Systems Inc. in collaboration with academic and memory institutions. The impulses for developing Archivematica were (a) demand for a low-cost long-term preservation solution[4] and (b) the availability of a large variety of open source tools supporting specific digital preservation tasks, which were nevertheless not interconnected in a comprehensive system easily usable by the wider community of digital curators. The declared objective of Archivematica is to provide archivists and librarians who have limited technical and financial capacities with the tools, methodologies and confidence to start digital archiving on their own.

A prototype of Archivematica was designed to verify the idea[5] that an open source long-term preservation system can be created by mapping available tools to OAIS processes[6]. The system was initially developed for the City of Vancouver Archives and for the International Monetary Fund. Other institutions and the wider community of users became involved later. The beta version was released in early 2009, and the first production version appeared two years later. The latest version as of the submission date of this paper is version 1.4, released in May 2015.

Archivematica integrates a set of freely available tools and uses them in the complex processing of digital objects from submission and ingest into the archive to providing access to end users according to the OAIS model and other standards and recommendations. To implement digital preservation functionalities Archivematica uses the micro-services approach: each micro-service represents a partial step in the preservation process and is usually implemented by some of the available tools. Micro-services are chained to workflows representing functions of the OAIS model. The entire system can be user-controlled and monitored via a web interface. The workflows and tools used for individual steps can be changed and replaced. This makes the system flexible, responding to different needs and technological changes in digital file creation, management and preservation.

1.2  Threats to digital information, digital preservation

Digital information is subject to a number of risks that critically threaten its long-term availability and usability. Unlike paper documents, which can survive and pass on authentic information for a very long time without any special treatment, digital objects cannot survive without continuous intervention. The main risks include the limited lifetime of storage media, the complexity of digital objects that cannot be rendered without intricate and innovative intervention, and especially technological progress, a byproduct of which is the obsolescence of technology and information and thus the inaccessibility of information. It is not possible to maintain the long-term availability, usability, integrity and authenticity of digital documents without active, systematic action.

Digital preservation includes basically two different and complementary levels: physical preservation of the original digital files (or set of bits, hence also the bit-level preservation); and logical preservation which means preservation of the ability to read and understand the information contained in a digital object.

Physical preservation refers to protection against loss of the digital object alone or part thereof (spontaneous media degradation, intentional or unintentional deletion or content alteration, loss or destruction of media due to a crash or natural disaster). Physical preservation is ensured in particular by creating multiple copies of files stored in multiple geographically distant locations and by regular monitoring of their integrity.
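
The integrity monitoring of multiple copies described above can be illustrated with a minimal sketch. It is not part of Archivematica (which, as noted later, leaves bit-level preservation to the storage layer); the function names and the choice of SHA-256 are assumptions made for illustration only.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_damaged_copies(copies: list[Path], expected: str) -> list[Path]:
    """Return the copies that are missing or whose checksum differs from `expected`.

    `expected` is the checksum recorded when the object was first stored
    (hypothetical bookkeeping, kept outside the copies themselves).
    """
    damaged = []
    for copy in copies:
        if not copy.is_file() or sha256_of(copy) != expected:
            damaged.append(copy)
    return damaged
```

A damaged or missing copy found this way would then be restored from one of the copies that still matches the recorded checksum.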

Logical preservation concerns protection against the inability to access a preserved object (due to technological changes, a device needed to read the media is no longer available; there is no software for decoding the file format; there is no operating system or hardware platform that would run the software) or to understand its informational content (loss of context etc.). Logical preservation can be supported by storing the digital object along with well-maintained and accurate supporting information, i.e. metadata. Moreover, it is also necessary to carry out preservation actions to ensure the legibility of the original information (migration to a new file format, emulation of the original computing environment, etc.). Logical preservation may change the original bits, but this is considered worthwhile in order to preserve the readability and clarity of the content.

Archivematica focuses on supporting the processes of logical preservation, i.e. preservation of the information content, its readability and understandability.

1.3  OAIS functional and informational model

OAIS – Open Archival Information System (ISO 14721) is the reference model and the crucial standard used today for implementing long-term archiving systems. According to the OAIS model[7], the digital content to be preserved is submitted by its creator to the archive as a Submission Information Package (SIP). The submission package is then transformed into an Archival Information Package (AIP), which is stored in secure physical storage and managed by the digital archive on the basis of established preservation practices and strategies. Digital content is made available to the end user through a Dissemination Information Package (DIP) – see Figure 1.

Figure 1: The OAIS Reference model scheme

The OAIS Functional Model encompasses six functional entities: Ingest (receipt of information from the creator and its preparation for insertion into the archive); Archival Storage (ensuring long-term storage and protection of information); Data Management (managing descriptive metadata of archived objects together with administrative metadata for both the operation of the archive and search); Preservation Planning (preserving the archived objects with respect to ongoing changes in the external environment and technology); Administration (management, coordination and operation of the archive); Access (providing access to archived objects to end users).

2.   OAIS model implementation in Archivematica

The OAIS functional model has been translated by the Archivematica creators into a set of user scenarios. Using those scenarios the specific workflows were devised and implemented in the system. Digital information passes through a series of transformations during processing and the original digital content might be modified and enhanced (the unchanged digital original is always kept as well).

The main function and goal of Archivematica is to process submitted digital data (called a Transfer in Archivematica terminology) into SIPs, which are then ready for ingest into the archive and transformation into archival packages (AIPs) intended for long-term storage. In parallel with the creation of AIPs, the creation of access packages (DIPs) can also be set up. Archivematica focuses primarily on creating the best possible AIP[8]. What happens to the packages afterwards is largely outside its scope; it relies on other, external systems for that. For example, to access a DIP users can either use AtoM, an independent component supplied with the system, or integrate Archivematica with external data management and access systems used by the organization (e.g. an institutional repository).

Figure 2: OAIS information packages as seen in Archivematica workflow[9].

2.1  Transfer

The term "transfer"[10] is used in two different meanings. On the one hand, it refers to a set of submitted data and metadata (files and directories) to be archived in Archivematica. On the other hand, it refers to the process prior to Ingest itself (i.e. pre-ingest), in which a SIP is generated from the submitted data.

The preparation and conduct of the Transfer process depends on the type of digital content and the procedures established in the particular institution. Typically it may include putting files into an appropriate folder structure, creating descriptive metadata for those files and adding other metadata such as copyright agreements, access restrictions, etc. There are several predefined modes for submitting the data for Transfer in Archivematica, but it is also possible to create and implement customized structures for Transfer.

A SIP is created through a sequence of steps (micro-services) such as extraction of compressed files, normalisation of file names, virus scanning, checksum generation and validation, assignment of unique identifiers, format identification, metadata extraction and others. One data set submitted to the digital archive can yield one or many SIPs. Conversely, one SIP can contain one or many sets of submitted data. The system also supports the "backlog" functionality, i.e. delayed processing of incomplete Transfers (a frequent procedure in the archival community).

2.2  Ingest

SIPs go through further processing during the ingest operation. For example, new metadata can be added and validated (such as descriptive metadata in Dublin Core, preservation metadata in PREMIS, technical metadata etc.) and optical character recognition can be performed. More importantly, normalisation can be conducted (if configured). This means converting the digital content to a more suitable archival format, chosen according to the input format. At the same time Archivematica can also generate representations in other file formats for access purposes. The original versions of digital objects are always stored along with the normalised versions. Normalisation is followed by other processes involving the creation of detailed input documentation, integration of newly generated metadata into a METS document (see 3.2), content and metadata indexing etc. Archivematica offers predefined ingest procedures depending on the type or form of the data and the completeness of the description of the ingested digital content. The administrator may, however, modify these or define new ones.

The ingest process is completed by creating the AIP and storing it in archival storage. When required, the DIP can also be generated during ingest and stored in an access system. Data, metadata and any accompanying information forming an AIP are encapsulated in a single package created according to the BagIt standard (see 3.2).

2.3  Archival Storage

Archivematica stores all data and information packages (transfer, SIP, AIP, DIP) as files in a file system[11]. To remain independent of any specific physical data storage, it uses a separate component called the Storage Service, which provides an interface to any archival storage. The administrator can configure the Storage Service so that data are stored according to the needs of the organization. Storage may be a local or remote file system (e.g. NFS), networked storage such as LOCKSS, cloud storage, etc. Multiple repositories can be configured for different data types simultaneously within a single system. Archivematica does not address bit preservation (backups, multiple copies, integrity checks, recovery after catastrophic events, etc.); it leaves this to the storage system itself.

All AIPs in the archival storage are indexed (using the ElasticSearch server) so they can be searched and retrieved in a limited way, at either the package level or the level of individual objects. It is also possible to search at the AIC level (an Archival Information Collection is an information unit that brings together a set of logically interrelated AIPs). In justified cases AIPs can be removed from the archival storage using a controlled removal procedure (but it is not possible to delete individual files from an AIP).

2.4  Preservation Planning

Archivematica uses a two-pronged preservation strategy: normalisation conducted during ingest, and keeping and preserving the original files to support future preservation actions such as format migration or emulation. Normalisation is based on the identification of file formats and their significant properties, and on format policies that specify the target file format, the type of action, the tools used and the procedures followed when creating AIPs and DIPs. Target formats for normalisation are selected using criteria such as current community recommendations, openness of the format specification, availability of open source tools for creating and rendering the format, format licensing, patent restrictions and others. Administrators of Archivematica can configure their preferred file formats and normalisation processes at any time as needed.

A crucial part of Archivematica is the FPR – Format Policy Registry, centrally managed by the Archivematica producer. The FPR specifies and continuously updates format-oriented procedures recommended on the basis of the current state of knowledge and best practice in digital preservation (the system administrator always has the option to modify and enhance these centrally managed procedures in the local copy of the registry). The FPR is available through an API and is shared not only by all organizations using Archivematica, but also by other institutions and projects. It is connected with the PRONOM[12] registry. The use of other format registries such as UDFR (Unified Digital Format Registry) or the Planets Core Registry is also planned.
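
The relationship between centrally managed format policies and local modifications can be sketched as a simple lookup. This is only an illustration of the idea, not the actual FPR data model or API: the rule table, the MIME-type keys, the tool names and the function name are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class FormatRule:
    target_format: str  # format the file would be normalised into
    tool: str           # tool that would perform the conversion (placeholder name)


RuleKey = Tuple[str, str]  # (source format, purpose: "preservation" or "access")

# Hypothetical centrally maintained recommendations.
CENTRAL_RULES: Dict[RuleKey, FormatRule] = {
    ("image/jpeg", "preservation"): FormatRule("image/tiff", "image-converter"),
    ("image/jpeg", "access"): FormatRule("image/jpeg", "image-converter"),
    ("audio/x-wav", "access"): FormatRule("audio/mpeg", "audio-converter"),
}


def resolve_rule(
    source_format: str,
    purpose: str,
    local_rules: Optional[Dict[RuleKey, FormatRule]] = None,
) -> Optional[FormatRule]:
    """Return the rule to apply: a local override if present, else the central rule.

    Returns None when no policy exists, in which case only the original
    file would be kept.
    """
    key = (source_format, purpose)
    if local_rules and key in local_rules:
        return local_rules[key]
    return CENTRAL_RULES.get(key)
```

An institution that prefers a different preservation target would pass its own `local_rules` mapping, leaving the shared recommendations untouched for every other format.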

Institutions may use the FPR as a tool for supporting and updating local processes as part of their broader concepts and strategies for digital preservation. Users are free to determine their own procedures based on institutional LTP policies or on tools available for preservation planning, such as PLATO[13]. However, the current version of Archivematica does not address the creation and implementation of generic preservation plans.

2.5  Access

Archivematica was designed to support integration with external systems already used by institutions for data storage, data management and access wherever possible. Therefore it is intended more as a back-end supplement to manage preservation tasks than a data management and access system of its own.  Archivematica customers have the option to continue to use their existing systems and integrate Archivematica with them to "only" support long-term archiving processes.

Access versions of digital objects, packed together with other information in DIPs, can be generated during ingest of data into Archivematica. DIPs are then imported into an external access system and made available to users through it. Archivematica provides tools for basic metadata synchronization between archival storage and external access systems. Currently, there are two approaches to providing access to archived information. Firstly, the AtoM system was developed by the Archivematica creators to address the needs of the community of archivists. Secondly, users can connect their own access systems to Archivematica. There are various pilot projects in which Archivematica was connected to systems such as Archivists' Toolkit, CONTENTdm, DSpace or Fedora (Islandora)[14].

2.6  Administration – Dashboard

The user interface to Archivematica, used for managing the processes and the system configuration, is called a Dashboard. It is a web application which provides the following functionalities to a user/administrator:

  • configure the system,
  • prepare and ingest new content into the digital archive,
  • monitor and manage ingest processes, usually by configuring and choosing from available options (dropdown menus),
  • edit and enhance metadata,
  • deal with user requests for providing AIPs,
  • report on preservation planning,
  • report different statistics and operations running within the system (in a very rudimentary form).

Archivematica functions are organized, in accordance with the OAIS model, into the modules described in the paragraphs above; the individual tabs of the Dashboard correspond to these modules, i.e. Transfer, Ingest, Archival Storage, Planning Preservation[15], Access, Administration (see Figure 3).

Figure 3: Dashboard – user interface of Archivematica

When Archivematica performs various operations, the Dashboard displays a list of the micro-services currently in use and generates alerts if manual intervention by the administrator is necessary, for example to choose a variant of the next ingest step or an option for resolving an error. However, individual processes can be configured to proceed automatically, so manual intervention is mostly not needed[16].

2.7  Management of archival packages (AIPs)

The current version of Archivematica creates high-quality and robust AIPs, but it provides only a minimum of the tools needed for their long-term management. For example, over a longer period of time it may be necessary to modify the content of AIPs in connection with the migration of obsolete file formats to new ones, or to update the metadata stored in an AIP. New versions of Archivematica should deliver at least partial improvements in this direction. Awaited functionalities include versioning of information packages and the possibility of AIP re-ingest[17]. This should make it possible to perform both minor AIP updates (such as adding a file that was missing in the original SIP) and extensive large-scale changes (such as periodic migration of normalised file formats).

Other features not yet supported are replication of AIPs to multiple geographically distributed repositories and periodic integrity checks of these packages[18]. Users of the current version of Archivematica must use external systems and tools for these tasks.

3.   Other features of Archivematica

3.1  Archivematica as a software and micro-services

Archivematica is developed in the Python programming language. Archivematica’s code, development environment and documentation are freely available under the AGPL 3.0 (GNU Affero General Public License) and Creative Commons licenses. The system can be installed in an Ubuntu environment. Alternatively, it can be deployed as a virtual appliance with a bundled Xubuntu Linux distribution and a set of open source software tools. With an appropriate virtualization application (e.g. Oracle VirtualBox, VMware Player), a virtual machine running Archivematica can run on any hardware platform and operating system, including conventional desktop computers. The disk image used for the virtual machine can also be used to create a bootable USB drive or DVD, or for direct installation of Archivematica on physical hardware such as servers and workstations.

As mentioned earlier, Archivematica uses the concept of micro-services. This means that the information packages submitted into the system are processed step by step by individual micro-services, pipelined in such a way that the output of one is the input of the next. Each micro-service usually consists of several steps (jobs) and is implemented as a combination of Archivematica scripts and one or more freely available software tools. Each of the pre-installed tools can be replaced (at least theoretically) by another one without compromising the functioning of the system as a whole[19].
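
The chaining principle can be sketched in a few lines of Python. This is a simplified illustration, not Archivematica's actual implementation: the package is modelled as a plain dictionary, and the two step functions (`clean_filenames`, `identify_formats`) are hypothetical stand-ins for micro-services that would normally call external tools.

```python
import re
from pathlib import PurePosixPath
from typing import Callable, Dict, List

Package = Dict[str, object]                 # simplified information package
MicroService = Callable[[Package], Package]  # each step maps a package to a package


def clean_filenames(package: Package) -> Package:
    """Replace characters outside A-Z, a-z, 0-9, '.', '_' and '-' with underscores."""
    cleaned = [re.sub(r"[^A-Za-z0-9._-]", "_", name) for name in package["files"]]
    return {**package, "files": cleaned}


def identify_formats(package: Package) -> Package:
    """Crude stand-in for a format identification tool: uses the file extension."""
    formats = {name: PurePosixPath(name).suffix.lower() for name in package["files"]}
    return {**package, "formats": formats}


def run_pipeline(package: Package, services: List[MicroService]) -> Package:
    """Feed the output of each micro-service into the next and log what ran."""
    for service in services:
        package = service(package)
        package = {**package, "log": package.get("log", []) + [service.__name__]}
    return package


result = run_pipeline(
    {"files": ["my report (final).PDF"]},
    [clean_filenames, identify_formats],
)
print(result["files"], result["formats"], result["log"])
```

Because every step has the same signature, replacing `identify_formats` with a function that wraps a different identification tool requires no change to the other steps or to `run_pipeline`, which is the flexibility the micro-service design aims for.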

In the initial analysis and the development of user scenarios based on the OAIS functional model, the Archivematica developers identified 24 original micro-services, which they grouped into 9 process categories[20]:

Process category          Micro-services
1. receiveSIP             verifyChecksum
2. reviewSIP              extractPackage, assignIdentifier, parseManifest, cleanFilename
3. quarantineSIP          lockAccess, virusCheck
4. appraiseSIP            identifyFormat, validateFormat, extractMetadata, decidePreservationAction
5. prepareAIP             gatherMetadata, normalizeFiles, createPackage
6. reviewAIP              decideStorageAction
7. storeAIP               writePackage, replicatePackage, auditFixity, readPackage, updatePackage
8. provideDIP             uploadPackage, updateMetadata
9. monitorPreservation    updatePolicy, migrateFormat

The scope and specifications of micro-services are constantly enhanced and refined during Archivematica development, so the current list of micro-services is much wider and more sophisticated.

Archivematica architecture utilizing micro-services implemented by freely available tools is shown in Figure 4.

Figure 4: Archivematica system architecture[21].

3.2  Standards

Archivematica uses a number of open de facto standards for metadata, identifiers and integration of the information. The most important are:

BagIt[22] – specifies a method of packaging folders and files into single packages for the purposes of long-term preservation or data exchange. Checksum information is generated and stored for each file in the package, which simplifies integrity assurance. The BagIt standard is used for AIP packaging, and BagIt packages created by other systems can be submitted into Archivematica as a Transfer.
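
The packaging idea behind BagIt can be illustrated with a minimal sketch that writes a payload directory, a declaration file and a checksum manifest, and then re-checks the manifest. The function names are hypothetical, the declared version number and the choice of SHA-256 are assumptions, and optional tag files are omitted, so this is not a complete implementation of the specification.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def make_bag(bag_dir: Path, payload: Dict[str, bytes]) -> None:
    """Write payload files under data/ plus bagit.txt and a SHA-256 manifest."""
    data_dir = bag_dir / "data"
    data_dir.mkdir(parents=True, exist_ok=True)
    manifest_lines = []
    for relative_name, content in sorted(payload.items()):
        target = data_dir / relative_name
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_bytes(content)
        checksum = hashlib.sha256(content).hexdigest()
        manifest_lines.append(f"{checksum}  data/{relative_name}")
    # Bag declaration; the version shown here is an assumption.
    (bag_dir / "bagit.txt").write_text(
        "BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n",
        encoding="utf-8",
    )
    (bag_dir / "manifest-sha256.txt").write_text(
        "\n".join(manifest_lines) + "\n", encoding="utf-8"
    )


def failed_files(bag_dir: Path) -> List[str]:
    """Return manifest entries whose file is missing or no longer matches."""
    failed = []
    manifest = (bag_dir / "manifest-sha256.txt").read_text(encoding="utf-8")
    for line in manifest.splitlines():
        if not line.strip():
            continue  # skip blank lines (e.g. an empty manifest)
        expected, relative_path = line.split(maxsplit=1)
        file_path = bag_dir / relative_path
        if (
            not file_path.is_file()
            or hashlib.sha256(file_path.read_bytes()).hexdigest() != expected
        ):
            failed.append(relative_path)
    return failed
```

Because the manifest travels inside the package, any system that later receives the bag can repeat the check in `failed_files` without access to the system that created it.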

METS (Metadata Encoding and Transmission Standard)[23] – a standard for encapsulating the metadata (descriptive, administrative, structural) and source files of a structured digital object. Archivematica uses the METS standard to group the metadata of archival objects in a single XML file. The METS file with all metadata records, together with the content files, constitutes the AIP.

PREMIS (PREservation Metadata: Implementation Strategies)[24] – an archival metadata standard that provides a data dictionary for recording information about changes to archived objects in the course of preservation. It also records information about events related to the archived object (for example ingest into the archival system, virus checks performed, format conversions, fixity checks etc.), the agents associated with these events (people, software, institutions) and the technical characteristics of the archived objects (including information about file format, size and resolution). Archivematica generates PREMIS metadata for preserved objects and adds it to the METS file that describes these objects.
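
How a PREMIS event can be embedded in a METS document is sketched below using only the Python standard library. The structure is deliberately simplified and has not been validated against the METS or PREMIS schemas; the PREMIS version 2 namespace, the ID values, the "local" identifier type and the function names are assumptions for illustration, and the output is not claimed to match what Archivematica itself produces.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
PREMIS_NS = "info:lc/xmlschema/premis-v2"  # assumed: PREMIS version 2 namespace

ET.register_namespace("mets", METS_NS)
ET.register_namespace("premis", PREMIS_NS)


def premis_event(event_id: str, event_type: str, when: str, outcome: str) -> ET.Element:
    """Build a minimal PREMIS event element (simplified, not schema-validated)."""
    event = ET.Element(f"{{{PREMIS_NS}}}event")
    identifier = ET.SubElement(event, f"{{{PREMIS_NS}}}eventIdentifier")
    ET.SubElement(identifier, f"{{{PREMIS_NS}}}eventIdentifierType").text = "local"
    ET.SubElement(identifier, f"{{{PREMIS_NS}}}eventIdentifierValue").text = event_id
    ET.SubElement(event, f"{{{PREMIS_NS}}}eventType").text = event_type
    ET.SubElement(event, f"{{{PREMIS_NS}}}eventDateTime").text = when
    info = ET.SubElement(event, f"{{{PREMIS_NS}}}eventOutcomeInformation")
    ET.SubElement(info, f"{{{PREMIS_NS}}}eventOutcome").text = outcome
    return event


def wrap_in_mets(event: ET.Element) -> ET.Element:
    """Place the event in a METS administrative metadata section."""
    mets = ET.Element(f"{{{METS_NS}}}mets")
    amd_sec = ET.SubElement(mets, f"{{{METS_NS}}}amdSec", {"ID": "amdSec_1"})
    digiprov = ET.SubElement(amd_sec, f"{{{METS_NS}}}digiprovMD", {"ID": "digiprovMD_1"})
    wrap = ET.SubElement(digiprov, f"{{{METS_NS}}}mdWrap", {"MDTYPE": "PREMIS:EVENT"})
    xml_data = ET.SubElement(wrap, f"{{{METS_NS}}}xmlData")
    xml_data.append(event)
    return mets


document = wrap_in_mets(
    premis_event("event-0001", "virus check", "2015-05-01T12:00:00+00:00", "Pass")
)
print(ET.tostring(document, encoding="unicode"))
```

Each further preservation action on the object (a format conversion, a fixity check) would add another event of this kind, so the METS file accumulates the object's processing history.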

UUID (Universally Unique Identifier)[25] – a standard for the unique identification of information objects in distributed systems without any central coordination. A UUID is a 128-bit value (usually written as a 36-character string: 32 hexadecimal digits and four hyphens) generated so that identifiers are globally unique. Archivematica uses UUIDs to identify all objects, including files, processes and storage locations.
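
The properties just described can be checked with Python's standard `uuid` module. The use of random (version 4) UUIDs here is an assumption made for the example; the text above does not state which UUID version Archivematica generates.

```python
import uuid

identifier = uuid.uuid4()   # random (version 4) UUID, no central registry needed
text = str(identifier)      # canonical textual form

print(text)
print(len(text))                    # 36 characters: 32 hex digits + 4 hyphens
print(len(identifier.bytes) * 8)    # 128 bits

# The textual form can be parsed back into the same identifier.
assert uuid.UUID(text) == identifier
```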

3.3  System scalability

Archivematica has a client/server architecture that can be configured in different ways to support scalable data processing. To achieve better performance with large-scale data processing, the services can be distributed over multiple nodes (processors). Similarly, in some scenarios users can parallelize the installation of the Archivematica system itself. Institutions can run several systems in parallel: each system can perform a different task (for example when a particular type of task, such as converting many large images, is very resource-consuming), or the systems can work in parallel on the same task (parallel processing of large amounts of data).

3.4  Sustainability and further development

Archivematica is open source software developed and freely distributed with the support of Artefactual Systems Inc. In the beginning the development of the system was also co-financed by UNESCO [2]. Currently, it is supported by the producer company and from other resources, such as customers who sponsor the development of specific functionalities. The sponsored functionality is then available to all users. The community also uses code additions provided by independent developers. Institutions that need technical support with the installation and configuration of the system can order it as an optional paid service from Artefactual Systems Inc.

4.   Summary – what Archivematica is and what it is not

The main Archivematica features can be summarized as follows:

  • Archivematica is a free, open source system developed by Artefactual Systems Inc. with the support of a growing community of users and customers.
  • The system is actively developed – a new version with new functionalities and bug fixes is released several times a year.
  • So far the system cannot be considered a finished product. Some important functions are still missing, and configuring the system for smooth ingest of large data can be difficult.
  • Users can influence the development of the system by sponsoring new functionalities (which are then freely available to all users and are incorporated into further versions of Archivematica) or by adding suggestions to a wish list. As the source code is open, anybody can create and share their own components or adjust the existing code.
  • The system is flexible. It is based on the concept of micro-services using proven open source tools and open standards for implementation of most of the services needed in the archiving workflows and data management.
  • Configurability is a strong point of Archivematica, especially when it comes to configuration of the tools connected in the micro-services. To a large degree the user can configure the system according to his/her specific needs. On the other hand, wide configuration options can pose a barrier for newcomers to digital preservation.
  • A large part of the ingest processing can be automated and manual processing may be minimized.
  • Currently the basic preservation strategy is normalisation (based on format policies) and generating high quality archival packages which can be stored in any repository.
  • Via the FPR – the Format Policy Registry – the system provides updated recommendations based on the current experience and shared knowledge of the preservation community, while permitting local configuration reflecting the specific needs of each institution.
  • Different projects use Archivematica in different maturity phases and in various contexts and environments. Experiences, tools and deployment architectures[26] can be shared. Fewer practical experiences and use cases have been published about larger installations and long-term production deployment.
  • At the time of writing this article, the system still did not provide all the functionalities that could be derived from the OAIS. It focuses on ingest processing and the preparation of AIPs. Integration with external systems is needed to cover other OAIS functional entities (physical storage, preservation planning, active preservation, access for end users etc.).
  • The system provides a low-cost solution for long-term preservation of digital information. Institutions can start this activity now, even with limited resources and finances. However, nothing is for free. Implementing the system in a real-life digital preservation project requires significant effort and experience with the management and configuration of the system, customization to specific local needs and integration with the wider institutional infrastructure.
  • For those who would like to use Archivematica for long-term preservation of their digital data but lack the necessary technical personnel for its implementation, management and maintenance, there are paid hosted services such as Arkivum[27], ArchivesDirect[28] and others.

5.   Conclusion

Archivematica is an open source system supporting long-term digital preservation, which many currently consider to be the most advanced freely available solution. Unlike other solutions that try to cover all functions related to management, preservation and access in one integrated system (for example RODA[29]), Archivematica is intended as a complement to existing infrastructures. Archivematica focuses on the processes and services of long-term preservation and expects to be integrated with available external systems for data management (collection management, physical storage, access).

Archivematica is a relatively young system, developed by Artefactual Systems Inc. since 2008. Its development is not yet finished, and some functionalities inherent in commercial solutions are still missing. But dynamic development, system flexibility and a growing user community give hope to those who are looking for open and promising solutions for projects with limited budgets. Besides quite a number of evaluation projects in the international community and the first production installations, the system is currently being tested and used in pilot projects in the Czech context. The National Digital Archive solution, currently being developed step by step by the Czech National Archives, uses Archivematica as one of its components. The system was also intensively tested in the LTP-Pilot project supported by the CESNET Development Fund. Archivematica is also expected to be used in the ArcLib project under preparation, which should develop a complex solution for long-term preservation of library digital collections (the project was submitted by a group of Czech libraries to the funding call of the NAKI II programme of the Ministry of Culture of the Czech Republic for the years 2016-2020[30]).

Literature

[1] Archivematica [online]. Artefactual Systems Inc., 2015 [cit. 2015-09-28]. Available online at: http://www.archivematica.org/

[2] VAN GARDEREN, Peter and Courtney C. MUMMA. Realizing the Archivematica vision: delivering a comprehensive and free OAIS implementation. In: iPRES2013: proceedings of the 10th International Conference on Preservation of Digital Objects, 3-5 September 2013, Lisbon, Portugal [online]. Lisbon: Biblioteca Nacional de Portugal, 2013 [cit. 2015-09-28]. Available online at: http://purl.pt/24107/1/iPres2013_PDF/Realizing%20the%20Archivematica%20vision%20delivering%20a%20comprehensive%20and%20free%20OAIS%20implementation.pdf

[3] VAN GARDEREN, Peter. Archivematica: Using micro-services and open-source software to deliver a comprehensive digital curation service. In: iPRES2010: 7th International Conference on Preservation of Digital Objects, September 19 – 24, 2010, Vienna, Austria [online]. Vienna, iPress2010, 2010 [cit. 2015-09-28]. Available online at: http://www.ifs.tuwien.ac.at/dp/ipres2010/papers/vanGarderen28.pdf

[4] JORDAN, Mark. Introduction to Archivematica: Material for a workshop on Archivematica. In: GitHub [online]. Apr 30 2014 [cit. 2015-09-28]. Available online at: https://github.com/mjordan/archivematicaworkshop

[5] SCHUMACHER, Jaime et al. From Theory to Action: “Good Enough” Digital Preservation Solutions for Under-Resourced Cultural Heritage Institutions: A Digital POWRR White Paper for the Institute of Museum and Library Service [online]. August 2014 [cit. 2015-09-28]. Available online at: http://commons.lib.niu.edu/handle/10843/13610

[6] LAVOIE, Brian. The Open Archival Information System (OAIS) Reference Model: Introductory Guide (2nd Edition): DPC Technology Watch Report 14-02 October 2014 [online]. Digital Preservation Coalition, 2014 [cit. 2015-09-28]. Available online at: http://dx.doi.org/10.7207/TWR14-02

[7] ČSN ISO 14721. Systémy pro přenos dat a informací z kosmického prostoru – Otevřený archivační informační systém – Referenční model [Space data and information transfer systems – Open archival information system – Reference model]. Praha: Úřad pro technickou normalizaci, metrologii a státní zkušebnictví, 2014. 98 p. Classification code 31 9620.

[8] MITCHAM, Jenny et al. Filling the Digital Preservation Gap: A Jisc Research Data Spring project: Phase One report – July 2015 [online]. University of York, University of Hull, 2015. Available online at: http://dx.doi.org/10.6084/m9.figshare.1481170

[1] The aims of the project were to test the functionality, requirements and constraints of the Archivematica system; to verify its usefulness for the logical long-term preservation of selected documents and collections; and to create basic documentation for system administrators and digital data curators.

 

[2] Jan Hutař and Andrea Byrne from Archives New Zealand kindly read the English translation of the article and suggested a number of improvements. The author wishes to thank them for their help.

[3] http://www.archivematica.org

[4] As an example of this approach, see the project POWRR – Preserving Digital Objects With Restricted Resources, http://commons.lib.niu.edu/handle/10843/13610.

[5] VAN GARDEREN, Peter and Courtney C. MUMMA. Realizing the Archivematica vision: delivering a comprehensive and free OAIS implementation. In: iPRES2013: proceedings of the 10th International Conference on Preservation of Digital Objects, 3-5 September 2013, Lisbon, Portugal

[6] The Open Archival Information System (OAIS) is the reference model for a long-term digital archive. It was created as a recommendation of the international forum Consultative Committee for Space Data Systems in 1999 and standardized in 2002 as the International Standard ISO 14721:2003. An updated version was published in 2012 as ISO 14721:2012 (a Czech translation of this standard was published in 2014). A highly readable and well-informed overview and assessment of OAIS by Brian Lavoie can be found in item [6] of the Literature list.

[7] The OAIS standard encompasses three related models: the OAIS Environment (external entities and the archive's interaction with them), the OAIS Functional Model (core functions of the archive) and the OAIS Information Model (high-level description of the information objects managed by the archive). All entities, relationships and processes are described in detail in the standard.

[8] AIP packages are the key information objects for long-term preservation. Preservation and usability of the original content depend heavily on the quality and completeness of the information contained in AIPs. Each AIP includes not only the information that is subject to archiving (i.e. the Content Data Object), but also a range of supporting information: information necessary for future understanding and presentation of the object (at both the structural and semantic levels), and metadata to support and document protection processes, such as identification, preservation context, the history of changes to the object, proof of integrity and authenticity, access data and more. The structure and content of an AIP are specified at a general level in the OAIS information model. Archivematica's specific AIP implementation is described in the system and user documentation.

[9] The figure is taken from https://github.com/mjordan/archivematicaworkshop

[10] The concept of "Transfer" is not defined by the OAIS reference model. Archivematica introduced it as a supplemental entity based on practical experience and needs of users.

[11] The Archivematica developers justify the use of the file system by its robustness and proven long-term durability in comparison with other types of information management technology. It is also part of their broader preservation strategy: every layer and component of an LTP system is exposed to the risk of technological obsolescence, just like the digital data itself. The fewer complex technology layers, the better.

[12] PRONOM is an online information system about data file formats and their supporting software products. It was developed and is operated by The National Archives of the United Kingdom.

[13] PLATO – The Preservation Planning Tool, http://www.ifs.tuwien.ac.at/dp/plato/intro/

[14] One example is the use of Archivematica as a "dark archive" for the DSpace system: the DSpace repository serves users as a storage and access system, with Archivematica connected as a preservation back-end. For more information see https://www.archivematica.org/wiki/DSpace_integration, https://www.archivematica.org/wiki/DSpace_exports

[15] The Preservation planning tab provides access to the FPR (Format Policy Registry).

[16] Examples of automating Archivematica can be found in https://github.com/mjordan/archivematicaworkshop

[17] See also https://www.archivematica.org/wiki/Development_roadmap:_Archivematica

[18] Periodic integrity checking is planned for a future version of Archivematica.

[19] For example, the current version of the "Scan for viruses" micro-service uses the ClamAV tool to check files in a transfer for viruses. To replace ClamAV with another antivirus product, it is only necessary to modify the scripts that prepare data for the new antivirus tool.

[20] VAN GARDEREN, Peter. Archivematica: Using micro‐services and open‐source software to deliver a comprehensive digital curation service. In: iPRES2010: 7th International Conference on Preservation of Digital Objects, September 19 – 24, 2010, Vienna, Austria

[21] Taken from http://www.ifs.tuwien.ac.at/dp/ipres2010/papers/vanGarderen28.pdf

[22] https://tools.ietf.org/html/draft-kunze-bagit-10

[23] http://www.loc.gov/standards/mets/

[24] http://www.loc.gov/standards/premis/ or the PREMIS introduction by B. Lavoie and R. Gartner: Preservation Metadata (2nd edition), DPC Technology Watch Report 13-03, May 2013, available from http://dx.doi.org/10.7207/TWR13-03.

[25] https://tools.ietf.org/html/rfc4122

[26] The Czech National Archives is developing its own long-term preservation solution built around Archivematica.

[27] http://arkivum.com

[28] http://www.archivesdirect.org

[29] http://www.roda-community.org/

[30] The results of the NAKI II call should be known at the end of 2015.

Nov 02, 2016