November 7, 2024

The Blockchain Could Build a Truly ‘Forever’ Internet

Dear Subscriber,

October 2024 saw two major pillars of internet memory fall simultaneously. In fact, these events could be described as a perfect storm for web archiving.

Google completed its retirement of cached pages. The Internet Archive's Wayback Machine, a digital archive of the internet that lets users view past versions of websites, went dark following a series of severe cyberattacks.

This unprecedented disruption has left researchers, journalists and everyday users struggling to access historical web content. Soon enough, questions began to swirl around the vulnerability of our digital memory.

[Image. Source: The Verge.]

See, we like to say that the internet is forever. I remember the public service posts of the ’90s and 2000s reminding people to be careful about what they put online, because it would follow them for the rest of their lives.

But that isn’t quite true. Websites get shut down. Hackers steal information and corrupt code. Any number of things can erase data from the World Wide Web, just as we saw with these two instances in the past month.

That’s why I want to break down what happened. And I want to reveal the solution, and the opportunity, that can be found on the blockchain.

The Data-Deleting Double Whammy

The trouble began when the Internet Archive, the internet's most comprehensive historical repository, faced a series of devastating cyberattacks, including a data breach affecting 31 million user accounts. This forced the organization to temporarily suspend its services, including the invaluable Wayback Machine.

While the Archive has since resumed operations, it has done so in a limited, read-only capacity, as founder Brewster Kahle announced on X.

[Image. Source: X.]

The frustration this outage caused was compounded by Google completing the shutdown of its cached pages feature, a tool that had served as a reliable backup for accessing web content since the early days of the search engine. Ironically, the search engine chose to link to the Wayback Machine as an alternative to its own cache. Naturally, all this has left users with even fewer options for accessing historical web content.

The timing couldn't have been worse. As someone who's spent years covering both traditional tech and blockchain developments, I can't help but see this situation as a stark reminder of the risks inherent in centralized systems. When we rely on a handful of organizations to preserve our digital history, we put all our eggs in a few small baskets.

And that has significant real-world implications. Journalists and readers alike can't verify past statements on major news sites. Researchers have lost access to historical data. Legal professionals can't retrieve potentially crucial evidence. In an age where anyone can make wild claims on social media, the ability to fact-check and maintain historical digital records is vital.

As such, the ripple effects from these developments extend far beyond mere inconvenience. But they also make a strong case for decentralized web archiving.

Blockchain technology, with its fundamental properties of immutability and distributed storage, could offer a more resilient solution. Imagine a system where archived web content is stored across thousands of computers worldwide, with no single point of failure. Each archived page would be time-stamped and cryptographically verified, making it extremely difficult for any single entity to alter or delete historical records without detection. Even if some computers fail or go offline, the rest of the network would keep serving the archive, preserving access to our digital heritage.

Decentralized networks like this already exist in other sectors.
Just think of Render (RENDER, “B”) or Bittensor (TAO, Not Yet Rated) for AI computing. Their ecosystems also rely on a decentralized network of computers, called nodes, to carry out their work.

[Image. Source: ResearchGate.]

However, implementing such a system isn't without challenges. The sheer volume of data involved in web archiving would make storing everything on-chain impractical.

The ideal architecture would comprise multiple integrated layers. At its foundation, a blockchain layer would handle timestamps, hashes and metadata verification. Above this, a distributed storage layer would maintain the actual archived content. Then, an incentive mechanism would drive network participation by rewarding storage providers and verifiers. The final piece would be a governance structure enabling community-driven decisions about archiving priorities and resource allocation.

While a blockchain-based archiving system won't materialize overnight, the current crisis should serve as a wake-up call. We need to seriously reconsider how we preserve our digital history. Several decentralized projects aim to preserve web content, each in its own way, but none fully replicates the comprehensive web page archiving provided by the Internet Archive's Wayback Machine.

In the short term, organizations will likely need to rely on a combination of alternative archiving services and local caching solutions. But looking ahead, the development of decentralized archiving systems should be a priority for the tech community.

The goal shouldn’t be just to create a more resilient archive. This endeavor is about democratizing the preservation of digital history. In a decentralized system, no single entity would have the power to decide what gets preserved or deleted. Or to stop preserving records at all! Rather, these decisions would be made collectively by the community of users and contributors.

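To make the first two layers of that architecture concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a description of any existing project: the names `Record`, `Archive`, `archive_page`, `verify_snapshot` and `verify_ledger` are hypothetical, a plain list stands in for the blockchain layer, and a dictionary stands in for the distributed storage layer. The incentive and governance layers are left out entirely.

```python
import hashlib
import json
from dataclasses import dataclass, asdict

GENESIS = "0" * 64  # placeholder "previous hash" for the very first record


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hex string."""
    return hashlib.sha256(data).hexdigest()


@dataclass(frozen=True)
class Record:
    """Hypothetical ledger entry: metadata only, never the page itself."""
    url: str
    timestamp: int      # Unix seconds, supplied by the caller
    content_hash: str   # SHA-256 of the archived bytes
    prev_hash: str      # hash of the previous record, linking the ledger

    def record_hash(self) -> str:
        payload = json.dumps(asdict(self), sort_keys=True).encode()
        return sha256_hex(payload)


class Archive:
    """Toy two-layer archive (assumption: one process plays every node)."""

    def __init__(self) -> None:
        self.ledger: list[Record] = []        # stand-in for the blockchain layer
        self.storage: dict[str, bytes] = {}   # stand-in for distributed storage

    def archive_page(self, url: str, content: bytes, timestamp: int) -> Record:
        # Store the bulky content off-ledger, keyed by its own hash.
        content_hash = sha256_hex(content)
        self.storage[content_hash] = content
        # Record only the small, verifiable metadata on the ledger.
        prev = self.ledger[-1].record_hash() if self.ledger else GENESIS
        record = Record(url, timestamp, content_hash, prev)
        self.ledger.append(record)
        return record

    def verify_snapshot(self, record: Record) -> bool:
        # The stored bytes must still hash to the value the ledger recorded.
        content = self.storage.get(record.content_hash)
        return content is not None and sha256_hex(content) == record.content_hash

    def verify_ledger(self) -> bool:
        # Each record must point at the hash of the record before it.
        prev = GENESIS
        for record in self.ledger:
            if record.prev_hash != prev:
                return False
            prev = record.record_hash()
        return True
```

In this sketch, changing a stored page makes `verify_snapshot` fail, and rewriting an earlier ledger entry breaks the hash link that `verify_ledger` checks. That is the sense in which records become tamper-evident. A real network would add replication across many nodes plus a consensus mechanism, which this single-process toy does not attempt.
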
The irony shouldn't escape us: In an era where we're sending robots to Mars and building artificial general intelligence, we still can't guarantee the preservation of a simple news article from last week.

The simultaneous failure of our primary web archives isn't just a technical glitch. It's a warning shot across the bow of our digital civilization. When future historians look back at our era, they won't just judge us by what we created, but by what we managed to preserve.

The Internet Archive's vulnerability and Google's casual abandonment of web caching reveal a dangerous truth: We've built our collective digital memory on foundations of sand.

But there is hope. The appeal and promise of a decentralized blockchain solution is spreading beyond crypto circles. Even the U.N. Development Programme says “The future is decentralized.”

Blockchain technology isn't just a solution. It's an imperative. While decentralized systems may seem complex and unwieldy compared to the elegant simplicity of centralized archives, they reflect a fundamental truth about human knowledge: It belongs to everyone, and therefore its preservation should be everyone's responsibility.

After all, if we can't protect our digital past, how can we expect to build a digital future worth preserving?

I foresee that digital recordkeeping will become a new horizon in blockchain development. And considering its wide appeal, from archiving the internet to public and private data preservation, the possibilities for successful first movers in this field are vast.

I’ll be keeping my eyes peeled for those. I suggest you do the same.

Best,
Jurica Dujmovic