The Disappearing Web: a study highlights loss of online content


Roughly 25% of webpages sampled from the period 2013-2023 are no longer available as of October 2023

A new report by the Pew Research Center, published on May 17, 2024, sheds light on a concerning trend: the disappearance of online content. The study, titled When Online Content Disappears, analyzes website snapshots collected between 2013 and 2023 to estimate the volume of webpages no longer accessible.

A Quarter of Webpages Lost: The study reveals that a significant portion of the historical web is no longer accessible. According to the report, roughly 25% of webpages sampled from the period 2013-2023 are no longer available as of October 2023.

Disappearing News and Reference Content: The report highlights the impact on news and reference content. An estimated 23% of news webpages and 21% of webpages from government sites included in the study contained at least one broken link. This suggests a potential loss of valuable historical information and resources.
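A figure like this counts pages, not links: a page is counted once if any of its outbound links fails, however many links it has. The sketch below illustrates that calculation on made-up data. The function names, the sample URLs, and the rule that any 4xx or 5xx status counts as broken are assumptions for illustration only; the report's own criteria may differ.

```python
from typing import Dict, List


def is_broken(status_code: int) -> bool:
    # Assumption: treat any 4xx or 5xx HTTP status as a broken link.
    return status_code >= 400


def share_of_pages_with_broken_link(pages: Dict[str, List[int]]) -> float:
    """Return the fraction of pages with at least one broken outbound link."""
    if not pages:
        return 0.0
    affected = sum(
        1 for statuses in pages.values() if any(is_broken(s) for s in statuses)
    )
    return affected / len(pages)


# Hypothetical data: page URL -> HTTP status codes of its outbound links.
sample_pages = {
    "https://example.com/a": [200, 200, 404],
    "https://example.com/b": [200, 301],
    "https://example.com/c": [500],
    "https://example.com/d": [],
}

share = share_of_pages_with_broken_link(sample_pages)
print(f"{share:.0%} of sample pages contain at least one broken link")
```

In this sample only two of the seven links are broken, yet half of the pages are affected, which is why page-level and link-level figures should not be compared directly.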

Social Media and Content Disappearance: The study also finds that content disappears quickly on social media. According to the report, nearly one in five tweets is no longer publicly visible on the platform within a few months of being posted.

The report identifies several reasons for online content disappearance:

  • Individual Page Deletion: Website owners may choose to delete individual pages due to outdated content, changes in focus, or broken links.
  • Website Discontinuation: Entire websites may shut down or move to a different domain, rendering all their content inaccessible.
  • Technical Issues: Technical problems, such as server failures or changes in website structure, can also lead to content disappearing.

The disappearance of online content can have a number of negative consequences:

  • Loss of Historical Record: Inaccessible webpages can create gaps in the historical record, making it difficult to research past events or trends.
  • Accessibility Challenges: Disappearing content can create accessibility challenges for individuals who rely on online resources for education, research, or government information.
  • Erosion of Trust: The frequent disappearance of content can erode trust in the internet as a reliable source of information.

The Pew Research Center report highlights the importance of web archiving initiatives. Archiving organizations play a crucial role in preserving historical webpages and ensuring long-term access to online content.

The issue of disappearing content raises questions about the long-term viability of the web as a reliable information repository. Further research is needed to develop strategies for preserving online content and ensuring its accessibility for future generations.

The Pew Research Center study sheds light on a significant challenge facing the internet: the disappearance of online content. The loss of historical information, reference materials, and social media content can have a negative impact on research, education, and trust in the web as a whole. Understanding the causes of disappearing content and supporting web archiving initiatives are crucial steps towards ensuring the long-term accessibility and integrity of online information.

Is the Blue Link economy fading? Rise of AI answers and search result shifts
The way users find information online is constantly evolving, and recent developments suggest a potential shift away from the traditional “blue link” search engine results pages (SERPs) dominated by publisher websites.