Build Journal

The ChatGPT Subscription Trap: Stuck Between Tiers With 1.1 Billion Tokens

I am burning through tokens faster than any single ChatGPT plan was designed for, but I have not made a penny from this yet. The subscription math for multi-agent orchestration does not add up, and I am in the gap between tiers with no clear exit.

2026-05-01 · 5 min read

The number that stopped me.

Since 13 April, I have pushed 171.8 million tokens through ChatGPT alone. Total usage across my account: 1,092.2 million tokens, or roughly 1.1 billion.
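For a sense of scale, here is a rough sketch of the daily run rate behind the ChatGPT figure. It assumes the window runs from 13 April 2026 to the post date of 1 May 2026; the year and the end date are assumptions, and the function and constant names are illustrative only.

```python
from datetime import date

# Figures quoted in the post, in millions of tokens.
CHATGPT_TOKENS_M = 171.8
TOTAL_TOKENS_M = 1092.2

# Assumption: the window runs from 13 April 2026 to the post date.
WINDOW_START = date(2026, 4, 13)
WINDOW_END = date(2026, 5, 1)


def daily_rate_m(tokens_m: float, start: date, end: date) -> float:
    """Average tokens per day, in millions, over the given window."""
    days = (end - start).days
    if days <= 0:
        raise ValueError("end must be after start")
    return tokens_m / days


if __name__ == "__main__":
    rate = daily_rate_m(CHATGPT_TOKENS_M, WINDOW_START, WINDOW_END)
    print(f"{rate:.1f}M ChatGPT tokens per day")
    print(f"{TOTAL_TOKENS_M / 1000:.1f} billion tokens in total")
```

Under those assumptions the ChatGPT share alone works out to several million tokens every day, which is the context for the rate limits described next.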

I am on the Plus subscription. Twenty pounds a month.

The limits are not theoretical anymore. I hit them regularly. Mid-orchestration, mid-evaluation, mid-build — the rate limit arrives and everything stalls. My agents sit waiting. Tokens queue up. Context windows that were carefully loaded evaporate because the session dies before the work completes.

So I asked ChatGPT directly: should I upgrade? Business or Pro?

The answer was nuanced, but the subtext was clear: neither plan was designed for what I am actually doing.

What "pushing limits" actually looks like.

My setup is not one person chatting with one model. I run a multi-agent system on Hermes Agent. ChatGPT Codex 5.3 is the orchestrator — it generates prompts for other models, evaluates their responses, decides routing, and stitches results together. The other models — DeepSeek v4, GLM 5.1, Nemotron — handle specialised tasks. About 47.9% of my inference runs locally on qwen3.5:9b via Ollama.

ChatGPT itself accounts for only 16.4% of my token usage, and with the latest release of ChatGPT 5.5 I am directing it more towards orchestration. The direct work — long-context reasoning, code generation, planning passes, and content drafting — is becoming less and less central.

This is not a usage pattern that fits neatly into a pricing table.

The tier trap.

The Plus plan gives me GPT-5, Codex, and the standard toolset for £20/month. But the rate limits are tuned for single-user conversational workflows. When I chain five agent calls in sequence, each with its own context window, the rate limiter sees a flood.
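A minimal sketch of why a chained run looks like a flood. It uses a generic sliding-window limiter with made-up numbers (3 calls per 60 seconds); OpenAI does not publish how the ChatGPT limiter works, so this illustrates the general technique and is not a model of the real one. The class and function names are hypothetical.

```python
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` within any `window_s`-second window."""

    def __init__(self, max_calls: int, window_s: float) -> None:
        self.max_calls = max_calls
        self.window_s = window_s
        self._stamps = deque()  # timestamps of allowed calls, oldest first

    def allow(self, now: float) -> bool:
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window_s:
            self._stamps.popleft()
        if len(self._stamps) < self.max_calls:
            self._stamps.append(now)
            return True
        return False


def run_chain(limiter: SlidingWindowLimiter, start: float,
              calls: int, gap_s: float) -> list:
    """Fire `calls` requests `gap_s` seconds apart; return which were allowed."""
    return [limiter.allow(start + i * gap_s) for i in range(calls)]


if __name__ == "__main__":
    # Hypothetical limit: 3 calls per 60 seconds.
    human = run_chain(SlidingWindowLimiter(3, 60.0), 0.0, calls=5, gap_s=30.0)
    agents = run_chain(SlidingWindowLimiter(3, 60.0), 0.0, calls=5, gap_s=2.0)
    print("conversational pace:", human)
    print("five chained agent calls:", agents)
```

The same five requests pass at a conversational pace and get cut off when an orchestrator fires them back to back. The limiter cannot tell a chain of agents from one very impatient user.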

Pro comes in two levels, £80/month (5x) and £200/month (20x). It raises most rate limits and adds "extended thinking": longer reasoning chains, deeper context. But the top tier at £200 per month for something I have not monetised yet? That is £2,400 per year on a bet.

Business (£30/user/month, minimum two users) is cheaper per seat than Pro, though the two-seat minimum makes it £60/month in practice. It adds admin overhead and team management features I do not need, and it still has usage caps, just higher ones. It is designed for teams sharing a workspace, not one person running an agent swarm.
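The tier arithmetic, written out. The monthly prices are the ones quoted above; the dictionary keys and helper names are illustrative only.

```python
# Monthly prices in GBP as quoted in the post.
PLANS = {
    "Plus": 20,
    "Pro 5x": 80,
    "Pro 20x": 200,
    "Business (2 seats)": 30 * 2,  # £30 per user, minimum two users
}


def annual_cost(monthly_gbp: int) -> int:
    """Twelve months at the quoted monthly price."""
    return monthly_gbp * 12


def multiple_of_plus(monthly_gbp: int) -> float:
    """How many Plus subscriptions the same money would buy."""
    return monthly_gbp / PLANS["Plus"]


if __name__ == "__main__":
    for name, monthly in PLANS.items():
        print(f"{name}: £{monthly}/month, £{annual_cost(monthly)}/year, "
              f"{multiple_of_plus(monthly):.0f}x the Plus price")
```

Laid out this way, the jump is easier to see: the top Pro level costs ten times what Plus does, and nothing sits between £20 and £60 a month.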

There is no "power user" tier. There is no "I run AI infrastructure through your chat interface" plan. The gap between Plus and Pro is a canyon, and I am standing in the middle of it.

The API escape hatch that is not open yet.

The obvious answer is: move to the API. Pay per token, scale precisely, with limits tied to what you spend rather than to a fixed plan. I know this. Every technical argument points there.

But the API path requires a different economic model. You pay upfront and hope the output generates revenue. I have not made any money from this yet. Not a single pound. The blog does not have monetisation. The consulting pipeline is not built. The product ideas are still in the workshop.

So the API is not an escape hatch — it is a second trap. One where the meter runs continuously whether or not anything ships.
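A sketch of the break-even question between a flat subscription and a running meter. The per-million-token price here is a placeholder, not a real API price; real pricing varies by model, splits input and output tokens, and is billed in a different currency. All names are hypothetical.

```python
# Placeholder only: NOT a real API price. Substitute a blended
# per-million-token figure for the models actually in use.
PLACEHOLDER_PRICE_PER_M_GBP = 1.0


def api_cost_gbp(tokens_m: float, price_per_m_gbp: float) -> float:
    """Metered cost of `tokens_m` million tokens at a flat blended price."""
    return tokens_m * price_per_m_gbp


def break_even_tokens_m(subscription_gbp: float, price_per_m_gbp: float) -> float:
    """Millions of tokens at which metered cost equals the subscription fee."""
    if price_per_m_gbp <= 0:
        raise ValueError("price must be positive")
    return subscription_gbp / price_per_m_gbp


if __name__ == "__main__":
    price = PLACEHOLDER_PRICE_PER_M_GBP
    print(f"171.8M tokens would cost £{api_cost_gbp(171.8, price):.2f}")
    print(f"£200 buys {break_even_tokens_m(200, price):.0f}M tokens")
```

Below the break-even volume the meter is cheaper; above it the subscription wins. Either way, the API bill grows with every token, while the subscription stays flat.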

The desktop Codex gap.

There is another variable I cannot even test yet: the desktop version of Codex. OpenAI has shipped a desktop Codex experience with deeper system integration — file access, terminal control, persistent workspaces. But there is no Ubuntu version. I am on Ubuntu 24.04. The desktop Codex does not exist for my operating system. (If you are on Windows or macOS, I can only recommend trying it. It beats Claude by miles.)

This matters because desktop Codex might change the orchestration equation. If it handles multi-step agent workflows more efficiently, if it reduces the token overhead of context reloading, the Pro tier might justify itself. But I cannot test that hypothesis. I am locked out of the experiment by my choice of operating system.

What I am actually deciding.

This is not really about Plus versus Pro versus Business. It is about whether I am building infrastructure or running a hobby.

If this is a hobby, Plus is the right tier. Hit the limits, wait, come back later. No shame in that.

If this is infrastructure — if I am genuinely building a multi-agent system that produces value — then £200/month is cheap compared to the value of uninterrupted compute. The question is whether I believe that value exists yet.

I have pushed 1.1 billion tokens through this system. I have built memory architectures, automated publishing pipelines, design systems, benchmark frameworks. The output is real. But the revenue is zero.

That is the gap. Not between subscription tiers. Between output and income.

The number I keep coming back to.

1,092.2 million tokens. That is not a casual user. That is someone running serious compute through a consumer plan. According to ChatGPT, with "frontier routing, local fallback, judge architecture… [I am] already operating more like an inference engineer than a typical 'ChatGPT subscriber.'"

OpenAI probably looks at my account and sees an anomaly. A Plus subscriber behaving like an enterprise customer. Someone who should have been on Pro months ago.

But I look at the same number and see something else: proof that I am building something real. The question is whether I am ready to pay what it actually costs.

I am still on Plus. The rate limits hit again while I was drafting this.

Found this useful? Follow Raf VRS (@Raf_VRS) on X for the build journal behind Hard Interference: the token counts, the subscription maths, and the reality of pushing consumer AI tools past their design limits.

☕ Support the build journal on Ko-fi: ko-fi.com/rafvrs

#BuildJournal #MultiAgent #ChatGPT #AICosts #HermesAgent