Distarnet was an SNF-funded research project that aimed to develop a system for the long-term preservation of digital data in a distributed network, with high redundancy and automatic migration.
The rapidly growing production of digital data, together with its increasing importance and the demand for its longevity, urgently requires systems that provide reliable long-term preservation of digital objects. These systems have to ensure availability, integrity, authenticity, and interpretability over the whole preservation period. That period may last several years, e.g. in business or scientific applications, a human lifetime in medical applications, or a potentially unlimited time span in cultural heritage digital libraries. This means that all kinds of technical problems (network, software, or hardware failures) need to be handled reliably, and that the evolution of data formats has to be supported.
At the same time, systems need to scale with the volume of data to be archived. Thus, long-term digital preservation systems have to be inherently distributed to allow content to be replicated. Institutions with long-term archiving needs have to collaborate in order to build a highly reliable and available, geographically distributed, Internet-based digital archiving system. Whether distributed systems technologies are used to create a small cooperating network of a few institutions with limited resources, or a large network of many nodes that together provide potentially vast amounts of globally distributed resources, the challenge lies in using these resources autonomically, efficiently, and fault-tolerantly without a centralized global coordinator.
We developed novel concepts for a distributed long-term preservation system for digital data, with a focus on the needs of archives, museums, research communities, and the corporate sector. These concepts result from combining distributed, autonomic, and process-oriented computing with requirements from the digital preservation community regarding special system, user, and metadata functionality. They are the main ingredients of the described system model, which consists of a data model and a set of processes. At the data level, the model supports complex data objects, management of collections, annotations, and arbitrary links between digital objects. At the process level, it supports automated processes that provide dynamic replication, consistency checks, and automated recovery of the archived digital objects. These processes behave autonomically, are governed by preservation policies, and run in a fully distributed network without any centralized coordinator. This allows for an efficient and fault-tolerant use of the resources provided in the network.
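To make the process-level idea more concrete, here is a minimal Python sketch of a policy-driven consistency check with automated recovery. It is an illustration only, not the DISTARNET implementation: the names `PreservationPolicy`, `Node`, `min_replicas`, and `enforce_policy` are hypothetical, nodes are simulated as in-memory dictionaries, and a single function stands in for logic that each node would presumably run on its own in a coordinator-free network.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class PreservationPolicy:
    # Hypothetical policy field: number of intact copies that must exist.
    min_replicas: int


@dataclass
class Node:
    # Hypothetical stand-in for an archive node; maps object id -> content.
    name: str
    store: dict = field(default_factory=dict)


def checksum(data: bytes) -> str:
    """Return the SHA-256 digest used here as the integrity fingerprint."""
    return hashlib.sha256(data).hexdigest()


def enforce_policy(object_id, expected_digest, nodes, policy):
    """Verify all replicas, discard corrupted ones, and re-replicate.

    Returns the names of the nodes that hold an intact copy afterwards.
    """
    intact = []
    for node in nodes:
        data = node.store.get(object_id)
        if data is None:
            continue  # this node holds no replica
        if checksum(data) == expected_digest:
            intact.append(node)
        else:
            del node.store[object_id]  # consistency check failed

    if not intact:
        raise RuntimeError("no intact replica left; object cannot be recovered")

    # Recovery: copy from an intact replica until the policy is satisfied.
    source = intact[0].store[object_id]
    for node in nodes:
        if len(intact) >= policy.min_replicas:
            break
        if object_id not in node.store:
            node.store[object_id] = source
            intact.append(node)
    return [node.name for node in intact]


if __name__ == "__main__":
    content = b"archived object"
    nodes = [
        Node("a", {"obj-1": content}),
        Node("b", {"obj-1": content}),
        Node("c", {"obj-1": b"bit rot"}),  # corrupted replica
        Node("d"),                          # no replica yet
    ]
    holders = enforce_policy("obj-1", checksum(content), nodes,
                             PreservationPolicy(min_replicas=3))
    print(holders)
```

In this sketch the corrupted copy on node `c` is dropped and then restored from an intact replica, while node `d` stays empty because the assumed policy of three replicas is already met.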
The prototype implementation of the DISTARNET (DISTributed ARchival NETwork) System, a distributed long-term digital preservation solution, implements the described novel concepts.