Free AI is over: why ChatGPT is getting dumber

Q: What does serious AI use cost per month in 2026?

For heavy professional use, expect a fixed cost of between 100 and 300 dollars a month per user, spread across two or three providers. For lighter business use, a single Pro subscription of 20 to 30 dollars a month often suffices.

Q: Which AI models are best combined?

Route per task rather than per brand. Perplexity and Gemini for recent information and search work, Claude Code for heavy coding and complex reasoning jobs, Codex CLI for web design and front-end, and Claude Sonnet or GPT-mini for simple tasks. Three or four subscriptions protect you against the silent degradation of one provider.

Last week I asked ChatGPT for help with the kind of piece it had previously drafted for me in a few minutes, but the answer that came back was shorter than expected and flatter than I was used to. At first I thought I had given too little context, so I tried again with more and got the same result. An hour later a notice appeared: I had reached my limit and had been switched to a smaller model.

At that moment I realised something uncomfortable: I had become an addict, and the dealer had just quietly started diluting the doses.

TL;DR

Free AI is quietly getting worse. OpenAI and Anthropic dilute the doses through limits, automatic downgrades to lighter models, and silent retraining that no one announces.
This is not a conspiracy, but a business model reinforced by physical scarcity: power shortages and chip shortages force labs to give their compute to the highest bidder.
Model collapse makes it worse: with 74% of new web pages already AI-generated, the training data of future models gets polluted, and those consequences land mainly on the free tier.
Three concrete actions: accept that AI becomes a fixed cost, route per task across multiple providers, and build your own "thermometer" to measure model degradation.

Anthropic does exactly the same, even if that may sound strange to say out loud. I pay 100 dollars a month for Claude Max, and for the past few months I hit my limit almost every week, sometimes as early as Wednesday. I started planning my workdays around the counter instead of around the content. Last week was quieter and I did not hit the limit, but instead of being happy I wondered whether Anthropic had quietly given back some headroom, or whether I had simply worked less intensively.

That is exactly the type of question you cannot resolve, because the meters are not yours: they belong to the lab. Anthropic openly admits that it actively tweaks Claude's capacity to manage demand (The Register, March 2026). The direction varies: looser last week, stricter before that, next week who knows. No one says it out loud, and you only notice when the counter runs differently.

OpenAI was the dealer. And we knew it.

It is the oldest trick in the book: first free stuff to get you acquainted, good stuff too, so good that you do not want to go back. Then you see a few small changes that are not big enough to complain about (a limit that pops up, a question that no longer gets picked up, an answer that comes out thinner), and by then you are already on the subscription. And when that is no longer enough either, there is a Pro tier for 200 dollars a month, and above that another layer for those who really want to get work done.

This is not a conspiracy. This is a business model.

OpenAI's free tier now accepts 10 messages per five hours on the top model (OpenAI Help Center, 2026), after which you automatically fall back to a mini version that can do noticeably less. GPT-4o, the model that millions of users fell in love with, was quietly removed from ChatGPT on 13 February 2026 (OpenAI Help Center), without an announcement video, without a farewell blog, just a help-desk article. The bill is being divided: those who want a lot pay a lot, and those who pay nothing get a pill bottle with fewer and fewer pills in it.

How to tell the free dose has been diluted

There are a few signals I recognise from my own use and from conversations in my network, and once you know them, you see them everywhere.

The hard limit that was not there a year ago

A year ago free ChatGPT was virtually unlimited for individual questions and some writing, but now it runs on a stopwatch: ten messages, then into a smaller room, and at peak moments that happens sooner than the homepage states. It is like sitting in a restaurant where the portion on the menu is larger than the one that lands on your plate, and no one is measuring it.

The automatic downgrade to lighter models

Model degradation is the silent transition from a top model to a lighter model without the user noticing. When you reach your limit, you are not logged out but imperceptibly switched to a smaller variant. The interface stays the same, the output is different, and anyone who says "but I could still do this yesterday" misses the point: yesterday you did it with a different model than today.

The technical veil around how models think

At Anthropic something similar is going on, but more technical. In early 2026, with Claude Opus 4.6, Anthropic introduced an "effort" setting that lets the model decide for itself how much it thinks, from low to maximum, and for long conversations they added context compaction: a feature that automatically shortens older parts of a conversation as the context window threatens to fill up. Those improvements are technically sound, but you lose something with them — namely the visibility into what is happening under the hood.

The silence around retraining

No lab announces that it has made a model cheaper, but suddenly reports appear from users that answers are less in-depth than two weeks earlier, which is sometimes dismissed as "prompt drift" — the term labs use to say it is you, not them. There is a chance that part of those complaints is indeed user bias, but there is also a chance that labs turned knobs to cut costs, and the problem is that you cannot verify it. That is not innocent. That is structural.

Sora is dead, and that says everything

The clearest confirmation of this whole story is what happened with Sora, because Sora was OpenAI's prestige product for video: announced in February 2024 with clips that made the whole world gape, and followed in December 2025 by a billion-dollar licensing deal with Disney in which characters from Marvel and Star Wars would be built into Sora (Variety, 2026).

On 24 March 2026 came the news that Sora is being shut down: no transition, no alternative, the app gone on 26 April and the API available only until September (OpenAI Help Center). Disney heard less than an hour before the public announcement, which means a billion-dollar deal went up in flames without the partner having any time at all to respond.

Why? According to reporting by TechCrunch, Sora burned roughly a million dollars a day in compute, while the user count collapsed from a peak of a million people to less than half (TechCrunch, 29 March 2026). Sam Altman chose the B2B route because Anthropic was busy winning the software developers and business customers who actually make money, and OpenAI had to free up GPUs for GPT-5 reasoning models that do business work instead of for consumers making deepfake clips of Pikachu.

Read that again carefully: a billion-dollar deal with Disney was given up because consumer video was not profitable. The deal was signed in December 2025 and the shutdown came on 24 March 2026, which means that within roughly three months leadership decided its own prestige branch could not be made to cover its costs and chose fuel for the business route. That is how the numbers now weigh against the story.

The real reason is not in a boardroom, but in a power socket

Here the story gets interesting, because even if OpenAI had been the nicest club in the world, free AI would not have survived within two years. The problem, you see, is physical.

Power is the real ceiling

Data centres are on track for more than 1,000 terawatt-hours of electricity consumption in 2026 (in the worst case towards 1,050), which amounts to the electricity consumption of the whole of Japan (IEA Energy and AI report). Anthropic itself estimates that training a single frontier model in 2028 will require a data centre of 5 gigawatts, which equals the power of a medium-sized city for one single training run (Anthropic, Build AI in America, July 2025).

Chips are the second ceiling

And then there are the chips, because that is where it gets personal. The Nvidia Blackwell B200 was available to rent on spot markets at the end of April 2026 for 4.08 dollars an hour, a rise of 48% in just two months (KuCoin), while delivery times for new Blackwell systems run up to a year and Nvidia has an order book of some 3.6 million units still outstanding (Spheron, 2026).
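To put that rise in perspective, the two figures can be turned back into the price of two months earlier. This is a minimal sketch in Python, assuming the 48% is measured against the earlier spot price; the function name is made up for illustration.

```python
def price_before_rise(current_price: float, rise_pct: float) -> float:
    """Return the starting price, given today's price and the percentage rise."""
    # A rise of rise_pct percent multiplies the old price by (1 + rise_pct / 100),
    # so dividing today's price by that factor recovers the old price.
    return current_price / (1 + rise_pct / 100)


earlier = price_before_rise(4.08, 48)
print(f"Implied spot price two months earlier: {earlier:.2f} dollars an hour")
# prints: Implied spot price two months earlier: 2.76 dollars an hour
```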

What scarcity does to your access

What do you do as an AI lab when you do not have enough chips and not enough power? You give the chips first to the customers who pay the most — the B2B customers with multi-year contracts — while the free consumer gets what is left, and "what is left" gets smaller every month. No greed. Just scarcity, and scarcity you do not see because the power grid simply does not work as a marketing message.

Model collapse: the poison is in your own supply chain

There is a second problem that almost no one names, and it has to do with the raw material the whole sector runs on. Model collapse is the quality loss that occurs when AI models are trained on data that is itself largely AI-generated. The internet is filling up rapidly with AI text, and if the next generation of models is trained on an internet that is three-quarters its own output, quality loss occurs in which models lose variation, forget rare patterns and increasingly start to resemble themselves. In April 2025 more than 74% of new web pages were already AI-generated (Wikipedia model collapse).

Those consequences do not land on the Pro tier, because there you pay for the cleanest-trained models. They land on the free tier, on the small models, in the place where quality is least visible.

What it means for you

First the optimistic part: for a great many tasks the current free AI is still fine, because a shopping list, a brainstorm or a summary of an email — that works, and it works well. But if you or your organisation deploy AI for something that matters, then there are three things I seriously want to pass on to you.

1. Accept that the free phase is over

Do not pretend you do not need a budget for AI, because you do, just as you have a budget for power, internet, telephony and software. AI is going to become a fixed cost instead of an experiment, and if you do not accept that now, in six months you will be wondering why your AI output no longer matches what you were used to.

2. Stop blindly relying on one provider

If your organisation leans heavily on one model, you are vulnerable to the silent degradation of that model without noticing. I wrote earlier about why blocking shadow AI does not work and how employees always find their way to tools that work, but the other side of that coin is that you need a routing regime.

In my own work I use Perplexity and Gemini for search work where I expect recent data, the latest Codex CLI for web design because it is really good, Claude Code for the heavy coding work and the complex reasoning jobs, and Sonnet 4.6 for simple tasks I do not want to run through a more expensive model. I also make sure my claude.md is current but not so extensive that it eats up the context window itself. A routing regime is not a gimmick — it is your insurance against a provider that suddenly turns other knobs.
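For readers who prefer to write such a routing regime down, here is a minimal sketch in Python. The task categories, the `ROUTES` table and the `pick_provider` helper are illustrative assumptions, not an existing tool. The provider names are simply the ones mentioned above, and only search work has a second option because that is the only task for which two providers are named.

```python
from enum import Enum


class Task(Enum):
    SEARCH = "search"      # recent information and search work
    CODING = "coding"      # heavy coding and complex reasoning
    FRONTEND = "frontend"  # web design and front-end
    SIMPLE = "simple"      # tasks that do not justify a top model


# Ordered preferences per task: the first entry is the default,
# any later entries are fallbacks. Extend the lists with your own choices.
ROUTES = {
    Task.SEARCH: ["Perplexity", "Gemini"],
    Task.CODING: ["Claude Code"],
    Task.FRONTEND: ["Codex CLI"],
    Task.SIMPLE: ["Sonnet 4.6"],
}


def pick_provider(task, unavailable=frozenset()):
    """Return the first preferred provider that is not marked unavailable."""
    for provider in ROUTES[task]:
        if provider not in unavailable:
            return provider
    # No option left: this task has no fallback yet and needs a manual decision.
    return None


print(pick_provider(Task.SEARCH))                  # prints: Perplexity
print(pick_provider(Task.SEARCH, {"Perplexity"}))  # prints: Gemini
```

A task that returns `None` when its only provider is marked unavailable is exactly the dependency this section warns about: it shows where a second option is still missing.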

3. Build your own thermometer

That is easier than it sounds. I do it myself with a fixed prompt that I run again every two weeks on my main model: an instruction that requires counting ("write exactly 60 words"), an instruction that requires reasoning ("give three counterarguments, each in one sentence"), and an instruction that requires initiative ("end with one follow-up question to the author"). I always send the same piece of text as input and save the output with a date.

What you will see is recognisable: in periods of scarcity the word count suddenly no longer matches, you get two counterarguments instead of three, or the follow-up question becomes more generic. I see it come back in my own two-weekly test, and the variation in output is my most honest gauge of what is changing behind the scenes: no technical benchmark, no Python knowledge, just a folder with dates.
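None of this needs code, but for readers who would rather let a script do the counting, the thermometer could be sketched in Python roughly as follows. The layout it expects is an assumption, not something the prompt guarantees: the 60-word text comes first as its own paragraph, the counterarguments follow as numbered lines ("1. ..."), and the follow-up question is the last line. The names `Reading`, `take_reading` and `save_reading` are hypothetical.

```python
import re
from dataclasses import dataclass
from datetime import date
from pathlib import Path


@dataclass
class Reading:
    word_count: int           # words in the first paragraph (target: 60)
    counterarguments: int     # numbered lines such as "1. ..." (target: 3)
    ends_with_question: bool  # last non-empty line ends with "?"

    def passed(self) -> bool:
        return (self.word_count == 60
                and self.counterarguments == 3
                and self.ends_with_question)


def take_reading(output: str) -> Reading:
    """Measure one model answer against the three fixed instructions."""
    # Assumed layout: paragraphs are separated by blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", output) if p.strip()]
    lines = [ln.strip() for ln in output.splitlines() if ln.strip()]
    first = paragraphs[0] if paragraphs else ""
    # A counterargument is assumed to start with "1." or "1)" and so on.
    numbered = [ln for ln in lines if re.match(r"^\d+[.)]\s+\S", ln)]
    return Reading(
        word_count=len(first.split()),
        counterarguments=len(numbered),
        ends_with_question=bool(lines) and lines[-1].endswith("?"),
    )


def save_reading(output: str, folder: Path, day: date) -> Path:
    """Store the raw answer under its ISO date so the folder sorts by time."""
    folder.mkdir(parents=True, exist_ok=True)
    path = folder / f"{day.isoformat()}.txt"
    path.write_text(output, encoding="utf-8")
    return path
```

The script only checks the countable parts. Whether the follow-up question has become more generic still takes a human reading of the saved files side by side.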

The free phase of AI was not a gift, but an investment of billions to make you dependent. That worked, including on me, and now the payback period begins. That does not have to be a disaster, but looking away is also a choice: do you keep reacting to what the labs give you, or do you decide for yourself what you use and why? I choose the latter, not because I love paying, but because dependence on something you cannot see or measure is the most expensive thing there is.

Frequently asked questions

Is free AI really over?

For individual, simple tasks such as a summary, a shopping list or a short brainstorm, free AI still works fine. But for work that matters — content with your name on it, decisions that affect your organisation, code that runs in production — the free phase is practically over. The combination of tighter limits, automatic downgrades to lighter models and silent retraining makes the output on the free tier less reliable than a year ago, and that direction is reinforced by physical scarcity of power and chips.

How do I know if my ChatGPT or Claude has got worse?

You do not know for sure, because the labs do not publish quality metrics per user. What you can do is build your own "thermometer": a fixed prompt with measurable elements (write exactly 60 words, give three counterarguments, ask one follow-up question) that you run the same way every two weeks, with the same input. Save the output with a date. If the word count suddenly deviates, or the number of counterarguments does not match, you see that the output has changed without any announcement.

Why was Sora shut down while a billion-dollar Disney deal was running?

OpenAI burned an estimated million dollars a day in compute for Sora while the user count halved, and Anthropic was busy winning business software developers where there is far more money than in consumer video. By shutting down Sora, GPUs could be freed up for GPT-5 reasoning models that do business work. It was a choice between prestige (the Disney deal) and cash flow (B2B customers), and that choice makes clear which way OpenAI is heading.

What is model collapse and why does it affect me?

Model collapse is the quality loss that occurs when AI models are trained on data that is itself largely AI-generated. Models lose variation, forget rare patterns and increasingly resemble themselves. Because 74% of new web pages were already AI-generated in 2025, each new training round pollutes the data of future models further. Pro-tier models get the cleanest data, but free models get what is left, so that is where the effect lands first.

What does serious AI use cost per month in 2026?

It depends on what you do, but for heavy professional use (coding, content creation, research) expect a fixed cost of between 100 and 300 dollars a month per user, spread across two or three providers. For lighter business use, a single Pro subscription of 20 to 30 dollars often suffices. That sounds like a lot, but measured against the hours it saves, it usually costs less than a single hour of an external consultant.

Which AI models are best combined?

Route per task rather than per favourite brand. For recent information and search work, Perplexity and Gemini work well because they actively integrate web search. For heavy coding and complex reasoning jobs, Claude Code is strong right now. For web design and front-end work, OpenAI's latest Codex CLI is excellent. For simple tasks that do not justify the top model, you can use Claude Sonnet 4.6 or GPT-mini. Three or four subscriptions sounds like a lot, but it protects you against the silent degradation of one provider.

Sources

The Register, March 2026: Anthropic tweaks usage limits
OpenAI Help Center: ChatGPT free tier FAQ
OpenAI Help Center: GPT-4o retirement
Variety, 2026: OpenAI shutting down Sora
OpenAI Help Center: Sora discontinuation
TechCrunch, 29 March 2026: Why OpenAI really shut down Sora
IEA: Energy and AI report
Anthropic: Build AI in America, July 2025
KuCoin: Nvidia GPU rental prices surge
Spheron, 2026: GPU shortage 2026
Wikipedia: Model collapse