In 1988, Justin Hall saw the internet for the first time. In January 1994, after reading a New York Times article by John Markoff about Mosaic, the then-new web browser that let you navigate the web click by click and page by page, Hall, then a college freshman, published his “first attempt at hypertext”.
Justin Hall didn't know he was pioneering the age of digital blogging with a piece of original writing laid out on a simple page of blue links and a grey background.
Back then, the web was a vast library with mostly empty shelves. Over the next three decades, we transformed it into a whole digital world by hand, adding blog posts, videos, podcasts, songs, and scientific papers.
The four eras of knowledge
Fast forward to 2022, and a new author arrived. Or rather, a whole family of authors: the LLMs. And they changed everything. They write quickly, they do not sleep, and they can condense a blogger's daydream in seconds.
We still write, but we are no longer alone at the keyboard. We share the desk with systems that can draft poems, summarise and translate a long scientific paper, mimic an accent, or create a thousand variations of the same idea in the time it takes my kettle to boil.
This turning point highlights a remarkable arithmetic. Each era takes roughly one order of magnitude less time than the one before: how long before synthetic content is the prevalent content in our streams?
Forecasts that say most new online content could be machine-generated within a few years are projections, not facts, and the dates are debated. Yet the direction is hard to miss in our feeds.
I submit to you that, within the next decade, commercial ads, product copy, support scripts, and most of our social media posts will be created almost entirely by the LLM family, with humans as gatekeepers and decision makers.
Coding is already being taken over by members of the LLM family, leaving engineers in charge of architecture and integration.
I don't believe that human-made content authoring will vanish entirely. I think it will retreat into the nooks and crannies of the web, into newsletters, journals, group chats, and local archives. It will still exist, but it will be harder to hear over the synthetic chorus.
Signals in the noise: when machines discover
The good news is that synthetic doesn’t always mean derivative. At its best, AI is an engine that accelerates scientific discovery, solving real-life problems and improving our lives.
These are signs that AI is moving from remixing to making, enabling scientific breakthroughs that would otherwise take us years to reach.
The human signal needs a home
Public platforms are excellent for reach, but they do a poor job of providing provenance. Social media favours speed over care.
If the Internet is going to survive an avalanche of AI-generated content, we need spaces where context, accountability, reputation, and authenticity work the way traditional knowledge-based communities and discussion forums have always worked.
Think of professional and trade associations, alumni networks, non-profits, special-interest groups, open-source projects, research consortia, patient and caregiver communities, standards bodies, guilds and unions, member co-operatives, and even neighbourhood associations.
These are places where we can authentically connect, remember, verify, and improve. And AI can contribute significantly without replacing the human voice.
In a private community, threaded discussions, accepted solutions, and linked sources make it clear how a conclusion formed. If AI assists with a summary or translation, that assistance can be labelled and kept auditable.
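As a concrete illustration, such a label could be as small as a structured record stored alongside the post. This is a minimal sketch only; the field names and the AIAssistLabel type are hypothetical, not any platform's actual schema.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class AIAssistLabel:
    """Hypothetical provenance record attached to an AI-assisted post."""
    task: str               # e.g. "summary" or "translation"
    model: str              # which model produced the draft
    human_reviewer: str     # who checked and approved the output
    source_thread: str      # link back to the original discussion
    created_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

# Example: a labelled, auditable AI-assisted summary of a thread
label = AIAssistLabel(
    task="summary",
    model="example-llm-v1",
    human_reviewer="community moderator",
    source_thread="https://example.org/thread/123",
)
print(label)  # stored alongside the post, so the assist stays auditable
```

The point is not the exact schema, but that the record names the task, the model, and the human who signed off, so anyone in the thread can trace how the text came to be.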
AI can bring automation and help turn conversations into durable knowledge: curating them into FAQs, tutorials, and learning paths, and closing the loop with product and policy teams so that the outcomes come back to the thread.
Because private communities usually reward quality of contribution, not volume, we recognise members who document steps, share data, and follow up, and we can replace vanity metrics with accepted solutions, helpful responses, and peer-to-peer resolutions.
The tipping point: the choice is ours to make
But while the signal sharpens in certain places, elsewhere it blurs. Which leads me to the next question, the uncomfortable one: what happens when the discovery engine also becomes the primary author that feeds us information?
When most of what we read and watch is machine made, will we prefer content marked "human made" the way we seek out fair-trade coffee?
Best case: provenance standards and watermarking make origins legible, human–AI co-creation lifts quality, and careful curation earns a premium.
Worst case: content pollution, models trained on their own exhaust, and closed information loops quickly degrading meaning and value. Knowledge becomes commoditized, and high-value work lies behind paywalls.
The outcome depends less on the size of the model and more on governance, our habits, and the rules we manage to put around the LLM family.
It's up to each one of us to ensure the human voice does not become an online rarity. We can keep it front and centre and let the LLM family enhance and amplify it instead of drowning it out.
(1) There is no official “GB per hour” counter for AI-generated content, so I triangulated from public baselines: for example, ~7.5–8.3 million posts per day, a median page weight of 2.3–2.7 MB, and an April 2025 analysis showing ~74% of new pages "contain AI text". From these I extrapolate roughly 540–600 GB per hour of AI-influenced page payload. Adding image-generation telemetry and AI video pushes it to 1–2 TB per hour, so treat this as an order-of-magnitude guide rather than a census.
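For transparency, here is that back-of-envelope arithmetic as a small Python sketch. The baseline figures are the assumptions cited above, and the exact bounds shift slightly depending on which end of each range you take.

```python
# Back-of-envelope estimate of AI-influenced page payload per hour.
# All baseline figures are assumptions taken from the public sources
# cited in the footnote above, not measured telemetry.

POSTS_PER_DAY = (7.5e6, 8.3e6)   # new pages published per day (low, high)
MB_PER_PAGE = (2.3, 2.7)         # median page weight in MB (low, high)
AI_SHARE = 0.74                  # share of new pages containing AI text

def gb_per_hour(posts_per_day: float, mb_per_page: float, ai_share: float) -> float:
    """Convert daily page volume into GB per hour of AI-influenced payload."""
    mb_per_day = posts_per_day * mb_per_page * ai_share
    return mb_per_day / 1000 / 24   # MB -> GB, day -> hour

low = gb_per_hour(POSTS_PER_DAY[0], MB_PER_PAGE[0], AI_SHARE)
high = gb_per_hour(POSTS_PER_DAY[1], MB_PER_PAGE[1], AI_SHARE)

# Prints "~532 to 691 GB per hour": the same ballpark as the
# ~540-600 GB/hour figure quoted in the footnote above.
print(f"~{low:,.0f} to {high:,.0f} GB per hour")
```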
Note: I used AI for topic research, image generation, proof reading and styling. All content was written by a human (me).