How CLOCKSS Works
How we preserve the hard work and knowledge of scholars
As libraries and publishers around the world migrate from print to online-only publications, they need assurance that their investments are protected and preserved for generations to come. CLOCKSS exists to provide these assurances.
As a digital archive, CLOCKSS preserves online scholarly content. This unique service enables authors, libraries, and publishers to be confident that the content they produce and steward will withstand potential technological, economic, environmental, and political disruptions. After a trigger event has occurred, your contributions to the scholarly record will always be available to those who want to access them.
World Leading Digital Archive Technology
Built with proven LOCKSS open-source technology, CLOCKSS preserves scholarly publications in their original formats. The polling-and-repair mechanism ensures the long-term validity of the data. When content is triggered, it is migrated to the latest formats so that it remains usable.
- The LOCKSS technology safeguards against the long-term, well-documented causes of digital loss: human error, computer attacks, and economic and organizational failure.
- Mirror repository sites at 12 major academic institutions around the world guarantee long-term preservation and access. Our approach is resilient to threats from potential technological, economic, environmental, and political challenges.
- CLOCKSS assigns Creative Commons Open Access licenses to all triggered publications to ensure they are always open and available to everyone.
Technical Overview
The following is a step-by-step overview of how content flows into CLOCKSS and is preserved.
Step One
The publisher signs an agreement that gives CLOCKSS preservation rights in perpetuity and provides content to the CLOCKSS system.
To allow CLOCKSS access to the publisher's content, the publisher either enables CLOCKSS to harvest content from its website or places the content on a designated SFTP site.
Step Two
Special CLOCKSS servers located at Rice, Indiana, and Stanford Universities ingest the content the publisher makes available. The content is safe and secure from the time it reaches these ingest machines.
Step Three
The content on each CLOCKSS ingest server is verified to confirm that every copy is identical to all of the others. This establishes the authoritative version of the content to be preserved.
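The comparison described in this step can be illustrated with a minimal sketch. It assumes that each ingest server's copy is available as bytes and that agreement is checked by comparing SHA-256 digests. The server names and the `copies_agree` helper are hypothetical, chosen only for illustration; they are not taken from the CLOCKSS software.

```python
import hashlib


def digest(content: bytes) -> str:
    """Return the SHA-256 hex digest of one copy of the content."""
    return hashlib.sha256(content).hexdigest()


def copies_agree(copies: dict[str, bytes]) -> bool:
    """Return True only when every server holds byte-identical content."""
    digests = {digest(data) for data in copies.values()}
    return len(digests) == 1


# Hypothetical ingest servers, each holding its own copy of one article.
ingest_copies = {
    "ingest-a": b"<article>Example content</article>",
    "ingest-b": b"<article>Example content</article>",
    "ingest-c": b"<article>Example content</article>",
}

if copies_agree(ingest_copies):
    print("All ingest copies match; this version can be treated as authoritative.")
else:
    print("Ingest copies differ; the content needs to be re-collected.")
```

Comparing digests rather than full files means the servers would only need to exchange short fixed-length strings to detect a mismatch.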
Step Four
After the quality of the content on the ingest machines is validated, the twelve CLOCKSS nodes collect the content from them. These nodes are long-term preservation machines that perform the main storage and audit functions.
Step Five
The content is then managed and preserved through a system of audit and repair. The CLOCKSS boxes continually communicate over the internet to audit the content they are preserving. If the content in one CLOCKSS box is damaged or incomplete, that CLOCKSS box will receive repairs of the content based on other CLOCKSS boxes' holdings and/or by referring to the publisher's original content on the ingest servers. This cooperation between the CLOCKSS machines provides unambiguous reassurance that the system is performing its function and that the correct content remains available.
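The idea of audit and repair can be sketched in a few lines. This is a simplified illustration, not the actual LOCKSS polling protocol: it assumes each box's copy is available as bytes, treats the digest held by a strict majority of boxes as correct, and overwrites any disagreeing copy from a box in the majority. The box names and the `audit_and_repair` function are hypothetical.

```python
import hashlib
from collections import Counter


def digest(content: bytes) -> str:
    """Return the SHA-256 hex digest of one copy of the content."""
    return hashlib.sha256(content).hexdigest()


def audit_and_repair(boxes: dict[str, bytes]) -> list[str]:
    """Repair copies that disagree with the majority; return the repaired box names."""
    if not boxes:
        return []

    # Audit: count how many boxes hold each distinct digest.
    votes = Counter(digest(data) for data in boxes.values())
    winning_digest, count = votes.most_common(1)[0]
    if count <= len(boxes) // 2:
        raise ValueError("no majority agreement; cannot choose a repair source")

    # Repair: copy the agreed content over any damaged or incomplete copy.
    good_copy = next(d for d in boxes.values() if digest(d) == winning_digest)
    repaired = []
    for name, data in list(boxes.items()):
        if digest(data) != winning_digest:
            boxes[name] = good_copy
            repaired.append(name)
    return repaired


# Hypothetical boxes; one copy has been damaged.
boxes = {
    "box-1": b"volume 12, issue 3",
    "box-2": b"volume 12, issue 3",
    "box-3": b"volume 12, iss",
}
print(audit_and_repair(boxes))
```

Requiring a strict majority is one possible design choice: it prevents a single damaged copy from being treated as the reference, at the cost of refusing to repair when the boxes are evenly split.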
Step Six
When a trigger event occurs and the CLOCKSS Board decides to release content from the archive, four things happen:
1 - Content is migrated to the newest format so that it can be accessed and used with current technology.
2 - We carry out thorough checks for copyright holders.
3 - Content is made publicly available via the web under a Creative Commons license.
4 - If the content is registered with Crossref, its Digital Object Identifiers (DOIs) are redirected to resolve to the publicly available content.
Step Seven
The released content is now freely available from CLOCKSS. It is also directly available via OpenURL through Crossref, or through either of the following:
1 - Local library link resolvers
2 - CLOCKSS Triggered Content