AI is cannibalizing itself. And creating more AI.
Artificial intelligence consumption is outpacing the data humans are creating
Artificial intelligence is trained on data largely scraped from the internet. But given the enormous volume of data required to train AI, many models end up consuming other AI-generated data, which can in turn degrade the model as a whole. With AI both producing and consuming data, the internet risks becoming overrun with bots, with far less content produced by humans.
Is AI cannibalization bad?
AI is eating itself. Currently, artificial intelligence is growing at a rapid rate and human-created data needed to train models is running out. "As they trawl the web for new data to train their next models on — an increasingly challenging task — [AI bots are] likely to ingest some of their own AI-generated content, creating an unintentional feedback loop in which what was once the output from one AI becomes the input for another," said The New York Times. "When generative AI is trained on its own content, its output can also drift away from reality." This is known as model collapse.
Still, AI companies have their hands tied. "To develop ever more advanced AI products, Big Tech might have no choice but to feed its programs AI-generated content, or just might not be able to sift human fodder from the synthetic," said The Atlantic. As it stands, synthetic data is necessary to keep up with the growing technology. "Despite stunning advances, chatbots and other generative tools such as the image-making Midjourney and Stable Diffusion remain sometimes shockingly dysfunctional — their outputs filled with biases, falsehoods and absurdities." These inaccuracies then carry through to the next iteration of the AI model.
That is not to say that all AI-generated data is bad. "There are certain contexts where synthetic data can help AIs learn," said the Times. "For example, when output from a larger AI model is used to train a smaller one, or when the correct answer can be verified, like the solution to a math problem or the best strategies in games like chess or Go." Also, experts are working to create synthetic data sets that are less likely to collapse a model. "Filtering is a whole research area right now," Alex Dimakis, a computer scientist at the University of Texas at Austin and a co-director of the National AI Institute for Foundations of Machine Learning, said to The Atlantic. "And we see it has a huge impact on the quality of the models."
Is AI taking over the internet?
The difficulty of training newer artificial intelligence models may point to a larger problem. "AI content is taking over the Internet," and text generated by "large language models is filling hundreds of websites, including CNET and Gizmodo," said Scientific American. AI content is also being created much faster and in larger quantities than human-made content. "I feel like we're kind of at this inflection point where a lot of the existing tools that we use to train these models are quickly becoming saturated with synthetic text," Veniamin Veselovskyy, a graduate student at the Swiss Federal Institute of Technology in Lausanne, said to Scientific American. Images, social media posts and articles created by AI have already flooded the internet.
The monumental amount of AI content on the internet, including tweets by bots, absurd pictures and fake reviews, has given rise to a more sinister belief. The dead internet theory is the "belief that the vast majority of internet traffic, posts and users have been replaced by bots and AI-generated content, and that people no longer shape the direction of the internet," said Forbes. Once just a theory circulating on the forum 4chan in the early 2010s, the belief has gained momentum recently.
Some believe that AI content on the internet goes deeper than just getting social media engagement or training models. "Does the dead internet theory stop at harmless engagement farming?" Jake Renzella, a lecturer and Director of Studies (Computer Science) at UNSW Sydney, and Vlada Rozova, a research fellow in applied machine learning at The University of Melbourne, said in The Conversation. "Or perhaps beneath the surface lies a sophisticated, well-funded attempt to support autocratic regimes, attack opponents and spread propaganda?"
Luckily, experts say that the dead internet theory has not come to fruition yet. "The vast majority of posts that go viral — unhinged opinions, witticisms, astute observations, reframing of the familiar in a new context — are not AI-generated," said Forbes.
Devika Rao has worked as a staff writer at The Week since 2022, covering science, the environment, climate and business. She previously worked as a policy associate for a nonprofit organization advocating for environmental action from a business perspective.