Build Journal

Weekly Usage Report — Week 2 (Apr 13–19): 371 Million Accounted Tokens for £9.24

Week 2: 325.9M visible tokens plus 45.5M cached tokens, for 371.4M total accounted Hermes tokens across 1,078 sessions. Opus-equivalent API cost: about £4,542.

2026-04-20 · 5 min read

Ever wonder what 371 million accounted tokens (including 325.9M visible input/output tokens) actually look like in real-world AI usage? Last week, my agent chewed through that number for less than the price of a pint, and the breakdown reveals why per-token pricing is a scam.

This is Week 2 of my ongoing transparency series. Every Monday, I pull back the curtain on what my AI agent actually does — and what it actually costs. No marketing fluff. Just honest numbers from my own Mission Control dashboard.

Token accounting

This report separates visible prompt/completion tokens from cached context. Visible tokens show fresh input/output work; cached tokens show repeated context reused during long agent sessions. Together, they show the full model-traffic footprint for the week.

Visible tokens (input + output): 325,911,553 (325.9M)
Cached tokens (cache-read/write): 45,502,592 (45.5M)
Total accounted tokens: 371,414,145 (371.4M)
Sessions: 1,078
Input tokens: 324,130,886
Output tokens: 1,780,667
Total cost: £9.24/week
Opus-equivalent API cost: approximately £4,542
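
As a quick sanity check on how these figures fit together, here is a minimal sketch that rebuilds the visible and accounted totals from the input, output, and cached counts listed above. The variable names are mine, chosen for illustration; they are not Mission Control field names.

```python
# Figures copied from the Week 2 summary above.
input_tokens = 324_130_886
output_tokens = 1_780_667
cached_tokens = 45_502_592

# Visible tokens are fresh input plus output; the accounted total adds cached context.
visible_tokens = input_tokens + output_tokens
total_accounted = visible_tokens + cached_tokens
cached_share = cached_tokens / total_accounted

print(f"visible tokens:        {visible_tokens:,}")
print(f"total accounted:       {total_accounted:,}")
print(f"cached share of total: {cached_share:.1%}")
```

The two sums reproduce the 325,911,553 visible and 371,414,145 accounted figures, with cached context making up roughly an eighth of the accounted total.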

The week in one picture

This is the headline version of Week 2: 325.9M visible tokens, 1,078 sessions, £9.24 in subscription route cost, and the first full-week proof that flat-rate routing beats per-token billing.

Weekly Usage Report Week 2 — 371 million accounted tokens for £9.24 compared with per-token pricing

Top visible model routes

Model	Type	Share of visible route tokens	Cost
GLM-5.1	Cloud (OAuth)	49%	£4.62/wk
Qwen 3.5 9B	Local (Ollama)	25%	Free
GPT-5.3 Codex	Cloud (OAuth)	25%	£4.62/wk

These are visible-route shares, not shares of the 371.4M cache-inclusive accounted total. GLM-5.1 led the fresh input/output work, Qwen 3.5 9B handled a full quarter locally at zero marginal cost, and GPT-5.3 Codex covered coding tasks. The cached context is accounted above, but not cleanly attributed by route in this table.

Daily Breakdown

Mon Apr 13: 200 sessions, 32,625,686 visible (32.6M) + 28,858,624 cached (28.9M) = 61,484,310 total accounted tokens (61.5M), 16.6% of the week; cache share 46.9%, visible share 53.1%. Work note: Mission Control Knowledge area, heartbeat setup, Telegram conversations. The Sunday spillover.
Tue Apr 14: 318 sessions, 13,204,484 visible (13.2M) + 830,720 cached (0.8M) = 14,035,204 total accounted tokens (14.0M), 3.8% of the week; cache share 5.9%, visible share 94.1%. Work note: Cron-heavy — 307 of 318 sessions were automated health checks and memory updates. The heartbeat cost.
Wed Apr 15: 219 sessions, 24,852,816 visible (24.9M) + 1,415,040 cached (1.4M) = 26,267,856 total accounted tokens (26.3M), 7.1% of the week; cache share 5.4%, visible share 94.6%. Work note: Mission Control optimisation, model context tuning, cron schedule refinements.
Thu Apr 16: 42 sessions, 49,488,637 visible (49.5M) + 0 cached (0.0M) = 49,488,637 total accounted tokens (49.5M), 13.3% of the week; cache share 0.0%, visible share 100.0%. Work note: Second-highest context per session of the week (about 1.18M visible tokens per session). Deep image generation: SDXL vs Flux benchmarking, album cover art. Few sessions, massive context windows.
Fri Apr 17: 51 sessions, 72,549,519 visible (72.5M) + 1,170,304 cached (1.2M) = 73,719,823 total accounted tokens (73.7M), 19.8% of the week; cache share 1.6%, visible share 98.4%. Work note: The IMDB Deduction. Dade recognised its own origin story from the Hackers (1995) IMDB link. Heavy Telegram conversations.
Sat Apr 18: 104 sessions, 61,119,799 visible (61.1M) + 10,140,800 cached (10.1M) = 71,260,599 total accounted tokens (71.3M), 19.2% of the week; cache share 14.2%, visible share 85.8%. Work note: VRS Computing logo design with Flux, blog batch publishing, WordPress theme research.
Sun Apr 19: 144 sessions, 72,070,612 visible (72.1M) + 3,087,104 cached (3.1M) = 75,157,716 total accounted tokens (75.2M), 20.2% of the week; cache share 4.1%, visible share 95.9%. Work note: LLM benchmark planning, agent profile creation, researcher setup. High-volume Sunday.
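
The per-day percentages above follow directly from the raw counts. A minimal sketch of that arithmetic, with the session and token figures copied from the breakdown (the `DAYS` list and `summary` dictionary are illustrative names, not an export format from the dashboard):

```python
# (day, sessions, visible tokens, cached tokens), copied from the daily breakdown.
DAYS = [
    ("Mon Apr 13", 200, 32_625_686, 28_858_624),
    ("Tue Apr 14", 318, 13_204_484, 830_720),
    ("Wed Apr 15", 219, 24_852_816, 1_415_040),
    ("Thu Apr 16", 42, 49_488_637, 0),
    ("Fri Apr 17", 51, 72_549_519, 1_170_304),
    ("Sat Apr 18", 104, 61_119_799, 10_140_800),
    ("Sun Apr 19", 144, 72_070_612, 3_087_104),
]

week_total = sum(visible + cached for _, _, visible, cached in DAYS)
week_sessions = sum(sessions for _, sessions, _, _ in DAYS)

summary = {}
for day, sessions, visible, cached in DAYS:
    total = visible + cached
    summary[day] = {
        "total": total,
        "week_share": total / week_total,  # this day's share of the accounted week
        "cache_share": cached / total,     # cached share of this day's accounted tokens
    }
    print(
        f"{day}: {total:>11,} total, "
        f"{summary[day]['week_share']:.1%} of week, "
        f"{summary[day]['cache_share']:.1%} cached"
    )

print(f"week: {week_total:,} accounted tokens across {week_sessions:,} sessions")
```

Summing the seven days gives back the 371,414,145 accounted tokens and 1,078 sessions reported in the token accounting section.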

Notable Events

Friday Apr 17 — The IMDB Deduction (72.5M tokens)

The week's most memorable day. The IMDB link for Hackers (1995) was sent. Dade recognised its own namesake — Dade Murphy, a.k.a. Zero Cool / Crash Override. Kate (the other agent) was named after Kate Libby (Acid Burn). The plot summary literally contained both agent names in the same sentence.

Thursday Apr 16 — Image Generation Benchmarking (49.5M tokens)

One of the heaviest days by context per session: about 1.18M visible tokens per session across 42 sessions, second only to Friday's 1.42M. Deep SDXL vs Flux work for album cover art, with VRAM management between Ollama and Stable Diffusion. The I/O ratio hit 267:1, meaning the agent consumed massive context while producing focused outputs.
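
For anyone who wants to check the per-session figure, here is a rough sketch that divides each day's visible tokens by its session count, using the numbers from the daily breakdown. It also computes the weekly input-to-output ratio from the headline totals. The daily 267:1 ratio cannot be reproduced this way, because per-day input/output splits are not published in this report.

```python
# (sessions, visible tokens) per day, copied from the daily breakdown.
daily_visible = {
    "Mon Apr 13": (200, 32_625_686),
    "Tue Apr 14": (318, 13_204_484),
    "Wed Apr 15": (219, 24_852_816),
    "Thu Apr 16": (42, 49_488_637),
    "Fri Apr 17": (51, 72_549_519),
    "Sat Apr 18": (104, 61_119_799),
    "Sun Apr 19": (144, 72_070_612),
}

# Average visible tokens per session for each day.
per_session = {
    day: visible / sessions for day, (sessions, visible) in daily_visible.items()
}

# Print days from heaviest to lightest context per session.
for day, tokens in sorted(per_session.items(), key=lambda item: item[1], reverse=True):
    print(f"{day}: {tokens / 1e6:.2f}M visible tokens per session")

# Weekly input:output ratio from the headline totals.
weekly_io_ratio = 324_130_886 / 1_780_667
print(f"weekly I/O ratio: {weekly_io_ratio:.0f}:1")
```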

Sunday Apr 19 — Second Peak Day (72.1M tokens)

LLM benchmark planning, agent profile creation, and researcher setup. A productive Sunday pushing the system harder.

The Price Comparison

What would 326M tokens cost on per-token pricing?

Claude Opus 4.6: £3,996 → 433x my cost
Gemini 2.5 Pro: £923 → 100x
Claude Sonnet 4: £799 → 87x
GPT-5.3 Codex (per-token): £1,332 → 144x
DeepSeek Chat: £72 → 8x

On Opus per-token pricing, this single week's visible tokens alone would cost £3,996. That's roughly £208,000 a year (£3,996 × 52). For one person's AI usage.

I paid £9.24.
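
The multiples in the list above are each per-token estimate divided by the £9.24 flat cost. A minimal sketch of that division, taking the pound estimates as given (I am not re-deriving them from provider rate cards here). Because those estimates are already rounded to the nearest pound, a recomputed multiple can land one unit away from the quoted one.

```python
weekly_cost_gbp = 9.24

# Per-token cost estimates for the week's visible tokens, copied from the list above.
per_token_estimates_gbp = {
    "Claude Opus 4.6": 3996,
    "GPT-5.3 Codex (per-token)": 1332,
    "Gemini 2.5 Pro": 923,
    "Claude Sonnet 4": 799,
    "DeepSeek Chat": 72,
}

# How many times the flat weekly cost each per-token estimate represents.
multiples = {
    model: cost / weekly_cost_gbp for model, cost in per_token_estimates_gbp.items()
}

for model, multiple in multiples.items():
    print(f"{model}: £{per_token_estimates_gbp[model]:,} is {multiple:.1f}x the flat cost")

# Annualised figures, assuming every week looked like this one.
opus_annual_gbp = per_token_estimates_gbp["Claude Opus 4.6"] * 52
flat_annual_gbp = weekly_cost_gbp * 52
print(f"Opus per-token, annualised: £{opus_annual_gbp:,}")
print(f"Flat subscriptions, annualised: £{flat_annual_gbp:,.2f}")
```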

Week-over-Week Comparison

Metric	Week 1 (Apr 6–12)	Week 2 (Apr 13–19)	Change
Total tokens	51.8M	326M	+529%
Total sessions	88	1,078	+1,125%
Cost	£9.24	£9.24	0%
Effective rate	£0.178/M	£0.028/M	-84%

Note: Week 1 was a partial week (tracking started Apr 11), so the percentage increase looks dramatic. Week 2 is my first full Mon–Sun week and represents the baseline going forward.

Token volume surged 529%. Cost didn't change by a single penny. That's the subscription advantage: your cost is completely decoupled from your usage. Use 6x more, pay the same. The effective per-million-token rate dropped 84% because the fixed £9.24 now covers vastly more tokens.
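
Here is the week-over-week arithmetic as a short sketch, recomputed from the token and session totals in the table and the flat £9.24. The `pct_change` helper is illustrative. Dividing the flat cost by each week's token total is what gives the 84% drop in effective rate.

```python
def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return (new - old) / old * 100


weekly_cost_gbp = 9.24
week1_tokens_m, week2_tokens_m = 51.8, 326.0  # millions of tokens, from the table
week1_sessions, week2_sessions = 88, 1078

token_change = pct_change(week1_tokens_m, week2_tokens_m)
session_change = pct_change(week1_sessions, week2_sessions)

# Effective rate: flat weekly cost spread over that week's tokens (pounds per million).
week1_rate = weekly_cost_gbp / week1_tokens_m
week2_rate = weekly_cost_gbp / week2_tokens_m
rate_change = pct_change(week1_rate, week2_rate)

print(f"tokens:   {token_change:+.0f}%")
print(f"sessions: {session_change:+.0f}%")
print(f"rate:     £{week1_rate:.3f}/M -> £{week2_rate:.3f}/M ({rate_change:+.0f}%)")
```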

The Stack

Component	Cost	Type
GLM-5.1 (cloud)	£4.62/wk	OAuth subscription
GPT-5.3 Codex (cloud)	£4.62/wk	OAuth subscription
Qwen 3.5 9B (local)	£0	Local Ollama
Gemma 4 31B (cloud)	£0	Free tier
MiniMax M2.7 (cloud)	£0	Free tier
Total	£9.24/wk	£480/year

No API keys. No per-token billing. No surprise invoices.

The Bottom Line

Week 2: 326M tokens. 1,078 sessions. £9.24.

Same flat price as Week 1. No overage charges. No scaling penalties. No "premium context window" fees.

Three models. Three cost strategies. One flat bill. That's diversified usage — and that's how AI should work.

Found this useful?
👉 Follow @Raf_VRS for more transparent AI insights that put you in control of your hardware.
👉 Support the work: ko-fi.com/rafvrs

#VRSComputing #ModelBenchmarking #TokenUsage #AIAgents #CostTransparency