Since the recent cyber-attack on the British Library, many publishers have reached out with questions about the security of their content. While we cannot disclose every detail of our security protocols, we want to assure you that we are dedicated to the long-term preservation and security of all content entrusted to us. Here is an overview of how CLOCKSS works to protect data and digital assets.
The CLOCKSS Mission: Long-Term Preservation and Access
At the heart of our work is a clear and unwavering mission: to ensure the long-term preservation and access of scholarly content. CLOCKSS is not simply another backup service; it’s a curated, authoritative copy of the original content. Our goal is to make sure that this content remains unchanged and protected—forever.
We understand the immense responsibility involved in preserving this material, and that’s why security is embedded in everything we do. Our use of the award-winning LOCKSS open-source preservation software, developed at Stanford University, is central to ensuring both the integrity and safety of this content.
The Role of LOCKSS Software in Our Security Framework
LOCKSS isn’t just about storing files; it’s about ensuring that content remains intact and protected against threats. The software is designed to actively manage content with bit-level integrity checking. This means that every piece of content entrusted to CLOCKSS is continuously monitored to ensure that nothing is tampered with or altered.
The LOCKSS system operates as a self-healing secure storage network, meaning that if any content is compromised—whether through corruption or unauthorized access—the system automatically detects and corrects the issue. This robust protection keeps content safe from exfiltration or subversion, ensuring it remains as it should be, for the long term.
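To make the idea of bit-level integrity checking concrete, here is a minimal sketch in Python. It is an illustration only, not the LOCKSS implementation: the choice of SHA-256, the function names `digest` and `is_intact`, and the notion of a digest recorded at ingest are all assumptions made for this example.

```python
import hashlib


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of a content object (hash choice is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def is_intact(data: bytes, recorded_digest: str) -> bool:
    """Return True if the stored bytes still match the digest recorded earlier."""
    return digest(data) == recorded_digest


original = b"Article PDF bytes"
recorded = digest(original)

# Flip a single bit in the first byte to simulate silent corruption.
damaged = bytes([original[0] ^ 0x01]) + original[1:]

print(is_intact(original, recorded))  # True
print(is_intact(damaged, recorded))   # False
```

Even a one-bit change produces a different digest, which is why a check of this kind can detect alterations that would be invisible to a human reader.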
If you'd like to learn more about how our preservation system works, the publications listed under References describe it in detail.
Multi-Layered Security: Comprehensive and Adaptable
When it comes to securing the CLOCKSS archive, we take a layered approach. Security doesn’t come from technology alone; it’s also embedded in governance, policy, and social practices. Some layers are well documented in the computer science literature (see References), while others remain private for security reasons. Here are some key layers that protect CLOCKSS content:
1. Governance and Policy
CLOCKSS operates under the guidance of libraries and publishers worldwide, all of whom agree on policies that govern our practices. Our board includes representatives from organizations with in-house security expertise, and our team includes experts for whom security is always a top priority.
2. Secure Storage
All content in the CLOCKSS archive is stored in an actively managed, secure, distributed network of storage nodes. Each node is protected by a unique combination of security measures, offering a multi-faceted approach to securing the data.
3. Distributed Preservation
One of the core strengths of CLOCKSS is the distribution of content across 12 secure storage sites around the globe, hosted by universities and research institutes. This geographic diversity enhances security by reducing the risk of a single point of failure. These sites are in constant communication, so if any content becomes corrupted, it is quickly identified and repaired by other nodes in the network.
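The detect-and-repair behaviour described above can be sketched as a simple majority comparison across sites. This is a deliberately simplified illustration and not the actual LOCKSS polling protocol: the function `repair_by_majority`, the site names, and the use of SHA-256 are hypothetical, and a real network would exchange digests over the network rather than hold every copy in one process.

```python
import hashlib
from collections import Counter


def digest(data: bytes) -> str:
    """Return the SHA-256 hex digest of one replica (hash choice is an assumption)."""
    return hashlib.sha256(data).hexdigest()


def repair_by_majority(replicas: dict[str, bytes]) -> tuple[dict[str, bytes], list[str]]:
    """Overwrite replicas that disagree with the strict majority.

    Returns the repaired replica map and the sorted names of the sites
    that had to be repaired.
    """
    digests = {site: digest(data) for site, data in replicas.items()}
    winner, count = Counter(digests.values()).most_common(1)[0]
    if count * 2 <= len(replicas):
        raise ValueError("no strict majority; replicas need manual review")
    good_copy = next(data for site, data in replicas.items() if digests[site] == winner)
    damaged_sites = sorted(site for site, d in digests.items() if d != winner)
    repaired = {site: good_copy for site in replicas}
    return repaired, damaged_sites


# Twelve hypothetical sites hold the same content; one copy becomes corrupted.
replicas = {f"site-{i:02d}": b"scholarly content" for i in range(1, 13)}
replicas["site-07"] = b"scholarly c0ntent"

repaired, damaged_sites = repair_by_majority(replicas)
print(damaged_sites)  # ['site-07']
```

Because the decision rests on agreement among many independently run sites, an attacker or a hardware fault would have to alter most copies at once to make a bad version win.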
4. Diversity in Security Practices
Each of the 12 preservation sites is managed and protected in unique ways, adding an extra layer of defense. This diversity strengthens the security of the entire system, ensuring that no single vulnerability can compromise the entire archive.
5. Access Controls and Technical Security Measures
CLOCKSS operates as a dark archive, which means that content is protected and managed for long-term preservation and access, but it is not openly accessible to the public. This dark archive model significantly reduces human-related risks to security, such as accidental breaches or unauthorized access.
Security measures within the CLOCKSS archive include the use of SSL certificates, firewalls, network intrusion detection systems, physical security measures, and remote wiping capabilities. Additionally, we use encryption and strong authentication protocols to ensure only authorized users can administer the preservation of the content.
6. Regular Audits and Testing
We cannot afford to become complacent, which is why we commission regular external audits and security tests to assess our defenses. For example, we ran a series of penetration tests with certain security layers temporarily removed. Even without these layers, CLOCKSS withstood the external attacks, demonstrating the resilience of our systems.
7. Pre-Trigger Scans for Malware and Viruses
As a dark archive, CLOCKSS stores and manages files in a state that prevents them from being accessed or executed in a typical browsing environment. This means that if any content contains malware or viruses, these remain locked in a “fossilized” state where they cannot activate. Prior to triggering content for release, we perform special checks to ensure the content is free from such threats.
8. Accreditation and Recognition
CLOCKSS is accredited under the Center for Research Libraries’ TRAC Audit scheme, receiving the highest score ever awarded to a trusted repository. This accreditation considers everything from our technology and technical infrastructure to our security arrangements, validating the strength of our preservation and protection processes.
9. Adapting to Changing Threats
At CLOCKSS, we understand that security is not static; it must evolve in response to emerging threats. The team at Stanford University, led by Thib Guicherd-Callin, actively monitors vulnerabilities and takes immediate action to mitigate risks. For example, in January 2022, a major vulnerability called "PwnKit" (CVE-2021-4034) was disclosed that affected most major Linux distributions, including those used by CLOCKSS systems. We patched all affected machines on the same day the vulnerability was announced, ensuring that no exploits were able to spread.
Conclusion
CLOCKSS is more than just a repository: it’s a secure, self-healing archival vault designed to preserve digital content for generations to come. Our multi-layered security practices, combined with cutting-edge technology and vigilant monitoring, ensure that the content we manage remains protected and secure.
As threats continue to evolve, CLOCKSS remains committed to adapting and strengthening our defenses.
References
Reich, V. A. (2002). Diffused knowledge immortalizes itself: The LOCKSS Program. High Energy Physics Libraries Webzine, 7/2003.
Reich, V., & Rosenthal, D. S. (2001). LOCKSS: A permanent web publishing and access system. D-Lib Magazine, 7(6).
Reich, V., & Rosenthal, D. (2009). Distributed digital preservation: Private LOCKSS networks as business, social, and technical frameworks. Library Trends, 57(3), 461–475.
Rosenthal, D. S. H. (2014). Architectural choices in LOCKSS networks. Library Hi Tech, 32(1), 2–10.
Rosenthal, D. S., Vargas, D. L., Lipkis, T. A., & Griffin, C. T. (2015). Enhancing the LOCKSS digital preservation technology. D-Lib Magazine, 21(9/10), 1–39.
Seadle, M. (2006). A social model for archiving digital serials: LOCKSS. Serials Review, 32(2), 73–77.
Seadle, M. (2010). Archiving in the networked world: LOCKSS and national hosting. Library Hi Tech, 28(4), 710–717.