Build Journal

Weekly Usage Report — Week 3 (Apr 20–26): 530 Million Accounted Tokens for £9.24

Week 3: 449.3M visible tokens plus 80.8M cached tokens, for 530.1M total accounted Hermes tokens across 2,288 sessions. Opus-equivalent API cost: about £6,543.

2026-04-27 · 5 min read

530 million accounted tokens in a single week once cached context is included. That's not a typo. The visible input/output work alone was 449.3M tokens, and the full footprint is larger — for the price of a Pret subscription. And this time, the boundaries are right.

This is Week 3 of my ongoing transparency series. Every Monday, I publish exactly what my AI agent consumed and what it cost. No rounding. No spin. Just honest numbers from Mission Control.

Token accounting

This report separates visible prompt/completion tokens from cached context. Visible tokens show fresh input/output work; cached tokens show repeated context reused during long agent sessions. Together, they show the full model-traffic footprint for the week.

Visible tokens (input + output): 449,253,667 (449.3M)
Cached tokens (cache-read/write): 80,815,872 (80.8M)
Total accounted tokens: 530,069,539 (530.1M)
Sessions: 2,288
Input tokens: 445,459,185
Output tokens: 3,794,482
Total cost: £9.24/week
Opus-equivalent API cost: approximately £6,543
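
The totals in this block can be re-derived from the three raw counters. A minimal sketch, assuming the input, output, and cached figures above are the only inputs; the variable names are mine, not anything Mission Control exposes:

```python
# Raw counters copied from the accounting block above.
input_tokens = 445_459_185
output_tokens = 3_794_482
cached_tokens = 80_815_872

# Visible tokens are fresh input plus output; cached context is added on top.
visible_tokens = input_tokens + output_tokens
total_accounted = visible_tokens + cached_tokens

# Input-to-output ratio for the whole week.
io_ratio = input_tokens / output_tokens

print(f"visible tokens:  {visible_tokens:,}")
print(f"total accounted: {total_accounted:,}")
print(f"I/O ratio:       {io_ratio:.0f}:1")
```

The two sums reproduce the 449,253,667 visible and 530,069,539 total figures, and the ratio rounds to the 117:1 quoted in the week-over-week table.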

The week in one picture

This is the headline version of Week 3: 449.3M visible tokens (530.1M with cached context), 2,288 sessions, and £9.24 in subscription route cost, with the curve getting ridiculous against per-token API pricing.

Weekly Usage Report Week 3 — 530 million accounted tokens for £9.24 compared with per-token pricing

Top visible model routes

Model	Type	Share of visible route tokens	Cost
GLM-5.1	Cloud (OAuth)	40%	£4.62/wk
Qwen 3.5 9B	Local (Ollama)	35%	Free
GPT-5.3 Codex	Cloud (OAuth)	24%	£4.62/wk

Model shares are visible-route estimates, not shares of the 530.1M cache-inclusive accounted total.

Qwen 3.5 9B jumped from 25% to 35% of the visible route mix this week — the local model carried more of the fresh input/output work. GLM-5.1 still led the heavy-context visible sessions, and GPT-5.3 Codex handled coding tasks. Cached context is included in the accounting block above, but not distributed across these route estimates.

Daily Breakdown

Mon Apr 20: 268 sessions, 85,257,674 visible (85.3M) + 3,529,472 cached (3.5M) = 88,787,146 total accounted tokens (88.8M), 16.8% of the week; cache share 4.0%, visible share 96.0%. Work note: Multi-agent delegation, mission control dashboard, notetaking, blog drafting, Telegram conversations.
Tue Apr 21: 309 sessions, 84,669,012 visible (84.7M) + 1,979,264 cached (2.0M) = 86,648,276 total accounted tokens (86.6M), 16.3% of the week; cache share 2.3%, visible share 97.7%. Work note: Hard Interference Demo category links, Mission Control Memory tab build, CSS styling fixes.
Wed Apr 22: 317 sessions, 100,740,103 visible (100.7M) + 1,895,424 cached (1.9M) = 102,635,527 total accounted tokens (102.6M), 19.4% of the week; cache share 1.8%, visible share 98.2%. Work note: Peak day. Blog icon generation, blog structure design, logo/favicon work.
Thu Apr 23: 332 sessions, 73,488,850 visible (73.5M) + 1,964,928 cached (2.0M) = 75,453,778 total accounted tokens (75.5M), 14.2% of the week; cache share 2.6%, visible share 97.4%. Work note: 40 blog image variants generated, pending.md todo system, memory compression fixes.
Fri Apr 24: 377 sessions, 43,465,639 visible (43.5M) + 25,424,000 cached (25.4M) = 68,889,639 total accounted tokens (68.9M), 13.0% of the week; cache share 36.9%, visible share 63.1%. Work note: Ko-fi donation page, Cloudflare custom domains research, detailed hosting cost analysis, logo SVG, skill extraction.
Sat Apr 25: 355 sessions, 48,041,981 visible (48.0M) + 42,309,760 cached (42.3M) = 90,351,741 total accounted tokens (90.4M), 17.0% of the week; cache share 46.8%, visible share 53.2%. Work note: Stale project pipeline recovery, context-loss-recovery skill, terminal sessions.
Sun Apr 26: 330 sessions, 13,590,408 visible (13.6M) + 3,713,024 cached (3.7M) = 17,303,432 total accounted tokens (17.3M), 3.3% of the week; cache share 21.5%, visible share 78.5%. Work note: Quietest day. Mostly automated cron (315/330 sessions) and ComfyUI research. I took the kids for a hike and spent the evening finalising all the posts.
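
For anyone who wants to check the percentages, here is a rough sketch of how the per-day shares fall out of the raw counts. The tuple layout and the `summarise` helper are illustrative assumptions, not how the dashboard stores its data:

```python
# (label, sessions, visible_tokens, cached_tokens), copied from the daily lines above.
DAYS = [
    ("Mon Apr 20", 268, 85_257_674, 3_529_472),
    ("Tue Apr 21", 309, 84_669_012, 1_979_264),
    ("Wed Apr 22", 317, 100_740_103, 1_895_424),
    ("Thu Apr 23", 332, 73_488_850, 1_964_928),
    ("Fri Apr 24", 377, 43_465_639, 25_424_000),
    ("Sat Apr 25", 355, 48_041_981, 42_309_760),
    ("Sun Apr 26", 330, 13_590_408, 3_713_024),
]


def summarise(days):
    """Return per-day totals, share of the week, cache share, and session average."""
    week_total = sum(visible + cached for _, _, visible, cached in days)
    rows = []
    for label, sessions, visible, cached in days:
        total = visible + cached
        rows.append({
            "day": label,
            "total": total,
            "week_share_pct": round(100 * total / week_total, 1),
            "cache_share_pct": round(100 * cached / total, 1),
            "avg_visible_per_session": round(visible / sessions),
        })
    return rows


rows = summarise(DAYS)
for row in rows:
    print(
        f"{row['day']}: {row['total']:,} total, "
        f"{row['week_share_pct']}% of week, "
        f"{row['cache_share_pct']}% cached, "
        f"{row['avg_visible_per_session']:,} visible tokens per session"
    )
```

The daily totals add up to the 530,069,539 accounted tokens and the session counts to 2,288, so the breakdown reconciles with the accounting block.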

The Price Comparison

What would 449M tokens cost on per-token pricing?

Claude Opus 4.6: £5,573 → 603x my cost
Gemini 2.5 Pro: £1,279 → 138x
Claude Sonnet 4: £1,115 → 121x
GPT-5.3 Codex (per-token): £553 → 60x
DeepSeek Chat: £100 → 11x
GPT-4o mini: £55 → 6x

On Opus per-token pricing, this single week would cost £5,573. That's about £290,000 a year (52 × £5,573). For one person's AI usage. I paid £9.24.

The Opus multiplier climbed from 502x to 603x this week — not because Opus got more expensive, but because my token volume keeps growing while the bill stays flat. That's the subscription advantage compounding.
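
The multipliers are just each per-token estimate divided by the flat weekly bill. A small sketch, taking the pound figures in the list above as given (how those estimates were priced is not reproduced here, and the dictionary is only an illustration):

```python
WEEKLY_COST_GBP = 9.24

# Per-token cost estimates for the week's visible tokens, copied from the list above.
PER_TOKEN_ESTIMATES_GBP = {
    "Claude Opus 4.6": 5573,
    "Gemini 2.5 Pro": 1279,
    "Claude Sonnet 4": 1115,
    "GPT-5.3 Codex (per-token)": 553,
    "DeepSeek Chat": 100,
    "GPT-4o mini": 55,
}

# How many times more each option would cost than the flat subscription.
multipliers = {
    model: round(cost / WEEKLY_COST_GBP)
    for model, cost in PER_TOKEN_ESTIMATES_GBP.items()
}

# Annualised Opus figure, assuming 52 identical weeks.
opus_annual_gbp = PER_TOKEN_ESTIMATES_GBP["Claude Opus 4.6"] * 52

for model, factor in multipliers.items():
    print(f"{model}: {factor}x")
print(f"Opus, annualised: £{opus_annual_gbp:,}")
```

Because the denominator stays fixed at £9.24, every multiplier scales directly with token volume, which is why the Opus figure moves week to week even when list prices do not.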

Notable Events

Wednesday Apr 22 — 100.7M Tokens

The week's biggest day. Blog icon generation for the ChatGPT image prompt, a full blog audit across all posts, and extensive logo/favicon work. 317 sessions averaging 318K tokens each. The I/O ratio hit 180:1 — the agent consumed massive context windows (reading full post files, design specs) while producing focused edits.

Monday Apr 20 — 85.3M Tokens

A strong start to the week. Heavy multi-agent delegation across Dade, Coder, and Plague for mission control dashboard work, blog writing, and Telegram conversations. The I/O ratio of 182:1 shows deep context work — typical of multi-step agent orchestration.

Tuesday Apr 21 — 84.7M Tokens

Hard Interference demo day. Category link wiring, Mission Control Memory tab build, CSS styling passes, and auto light/dark mode implementation. Nearly matched Monday's volume with 309 sessions at 274K average.

Week-over-Week Comparison

Metric	Week 2 (Apr 14–20)	Week 3 (Apr 20–26)	Change
Total tokens	378.3M	449.3M	+18.8%
Total sessions	1,146	2,288	+99.7%
Cost	£9.24	£9.24	0%
Effective rate	£0.024/M	£0.021/M	-16%
I/O ratio	188:1	117:1	Shift

Note: Week 3 uses correct Mon–Sun boundaries (Apr 20–26). Previous weeks had offset boundaries, so exact comparisons are approximate.

Sessions nearly doubled. Tokens grew 19%. Cost didn't budge. The effective rate dropped about 16% (from roughly £0.024/M to £0.021/M on visible tokens) because the fixed £9.24 now covers 19% more tokens. More sessions don't mean more cost; they mean the agent is doing more things, not bigger things. The I/O ratio shifted from 188:1 to 117:1, which reflects more interactive work (terminal sessions, shorter tasks) alongside the usual deep-context operations.
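
The effective rate is the flat weekly cost divided by tokens in millions, so it depends on which token total goes in the denominator. A hedged sketch using the rounded totals from the table plus the cache-inclusive total from the accounting block; whether Week 2's 378.3M includes cached context is not stated, so the like-for-like comparison below is an assumption:

```python
WEEKLY_COST_GBP = 9.24

week2_tokens_m = 378.3          # Week 2 total from the table, in millions
week3_visible_m = 449.3         # Week 3 visible tokens, in millions
week3_accounted_m = 530.1       # Week 3 cache-inclusive total, in millions
week2_sessions, week3_sessions = 1_146, 2_288


def rate_per_million(cost_gbp, tokens_m):
    """Flat cost spread over a token volume, in pounds per million tokens."""
    return cost_gbp / tokens_m


def pct_change(old, new):
    """Relative change from old to new, as a percentage."""
    return 100 * (new - old) / old


rate_w2 = rate_per_million(WEEKLY_COST_GBP, week2_tokens_m)
rate_w3_visible = rate_per_million(WEEKLY_COST_GBP, week3_visible_m)
rate_w3_accounted = rate_per_million(WEEKLY_COST_GBP, week3_accounted_m)

print(f"Week 2 rate:                  £{rate_w2:.3f}/M")
print(f"Week 3 rate (visible):        £{rate_w3_visible:.3f}/M")
print(f"Week 3 rate (cache-inclusive): £{rate_w3_accounted:.3f}/M")
print(f"Token change:   {pct_change(week2_tokens_m, week3_visible_m):+.1f}%")
print(f"Session change: {pct_change(week2_sessions, week3_sessions):+.1f}%")
print(f"Rate change (visible): {pct_change(rate_w2, rate_w3_visible):+.1f}%")
```

With a fixed bill, the rate change is fully determined by the token change: 19% more tokens works out to a rate roughly 16% lower.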

The Stack

Component	Cost	Type
GLM-5.1 (cloud)	£4.62/wk	OAuth subscription
GPT-5.3 Codex (cloud)	£4.62/wk	OAuth subscription
Qwen 3.5 9B (local)	£0	Local Ollama
Gemma 4 31B (cloud)	£0	Free tier
MiniMax M2.7 (cloud)	£0	Free tier
Total	£9.24/wk	£480/year

No API keys. No per-token billing. No surprise invoices.

The Bottom Line

Week 3: 449M tokens. 2,288 sessions. £9.24.

Sessions nearly doubled. Tokens grew 19%. Rate dropped about 16%. The subscription advantage compounds: every additional token makes the flat rate more absurd compared to per-token pricing.

603x cheaper than Opus. That's not a discount. That's a fundamentally different model of computing.

Found this useful? 👉 Follow @Raf_VRS for more transparent AI insights that put you in control of your hardware. 👉 Support the work: ko-fi.com/rafvrs

#VRSComputing #ModelBenchmarking #TokenUsage #AIAgents #CostTransparency