Weekly Usage Report — Week 2 (Apr 13–19): 371 Million Accounted Tokens for £9.24
Week 2: 325.9M visible tokens plus 45.5M cached tokens, for 371.4M total accounted Hermes tokens across 1,078 sessions. Opus-equivalent API cost: about £4,542.
Ever wonder what 371 million accounted tokens, including 325.9M visible input/output tokens, actually look like in real-world AI usage? Last week, my agent chewed through that many for £9.24, and the breakdown reveals why per-token pricing is a scam.
This is Week 2 of my ongoing transparency series. Every Monday, I pull back the curtain on what my AI agent actually does — and what it actually costs. No marketing fluff. Just honest numbers from my own Mission Control dashboard.
Token Accounting
This report separates visible prompt/completion tokens from cached context. Visible tokens show fresh input/output work; cached tokens show repeated context reused during long agent sessions. Together, they show the full model-traffic footprint for the week.
- Visible tokens (input + output): 325,911,553 (325.9M)
- Cached tokens (cache-read/write): 45,502,592 (45.5M)
- Total accounted tokens: 371,414,145 (371.4M)
- Sessions: 1,078
- Input tokens: 324,130,886
- Output tokens: 1,780,667
- Total cost: £9.24/week
- Opus-equivalent API cost: approximately £4,542
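As a quick sanity check, the totals above can be rebuilt from their parts. This is a minimal sketch that only re-adds the figures in the list; the variable names are illustrative and not taken from the Mission Control dashboard.

```python
# Figures copied from the token accounting list above.
input_tokens = 324_130_886
output_tokens = 1_780_667
cached_tokens = 45_502_592

# Visible tokens are fresh input plus output; the accounted total adds cache.
visible_tokens = input_tokens + output_tokens
total_accounted = visible_tokens + cached_tokens

print(f"visible tokens:   {visible_tokens:,}")   # 325,911,553
print(f"total accounted:  {total_accounted:,}")  # 371,414,145
print(f"input per output: {input_tokens / output_tokens:.0f}:1")
print(f"cached share:     {cached_tokens / total_accounted:.1%}")
```

The last two lines are derived ratios rather than figures from the report: they show how input-heavy the week was and how much of the accounted total was reused context.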
The Week in One Picture
This is the headline version of Week 2: 325.9M visible tokens (371.4M including cache), 1,078 sessions, and £9.24 in subscription route cost. It is also the first full-week demonstration that flat-rate routing beats per-token billing at this volume.

Top Visible Model Routes
| Model | Type | Share of visible route tokens | Cost |
|---|---|---|---|
| GLM-5.1 | Cloud (OAuth) | 49% | £4.62/wk |
| Qwen 3.5 9B | Local (Ollama) | 25% | Free |
| GPT-5.3 Codex | Cloud (OAuth) | 25% | £4.62/wk |
These are visible-route shares, not shares of the 371.4M cache-inclusive accounted total. GLM-5.1 led the fresh input/output work, Qwen 3.5 9B handled a full quarter locally at zero marginal cost, and GPT-5.3 Codex covered coding tasks. The cached context is accounted above, but not cleanly attributed by route in this table.
Daily Breakdown
- Mon Apr 13: 200 sessions, 32,625,686 visible (32.6M) + 28,858,624 cached (28.9M) = 61,484,310 total accounted tokens (61.5M), 16.6% of the week; cache share 46.9%, visible share 53.1%. Work note: Mission Control Knowledge area, heartbeat setup, Telegram conversations. The Sunday spillover.
- Tue Apr 14: 318 sessions, 13,204,484 visible (13.2M) + 830,720 cached (0.8M) = 14,035,204 total accounted tokens (14.0M), 3.8% of the week; cache share 5.9%, visible share 94.1%. Work note: Cron-heavy — 307 of 318 sessions were automated health checks and memory updates. The heartbeat cost.
- Wed Apr 15: 219 sessions, 24,852,816 visible (24.9M) + 1,415,040 cached (1.4M) = 26,267,856 total accounted tokens (26.3M), 7.1% of the week; cache share 5.4%, visible share 94.6%. Work note: Mission Control optimisation, model context tuning, cron schedule refinements.
- Thu Apr 16: 42 sessions, 49,488,637 visible (49.5M) + 0 cached (0.0M) = 49,488,637 total accounted tokens (49.5M), 13.3% of the week; cache share 0.0%, visible share 100.0%. Work note: High context per session (about 1.18M visible tokens each, second only to Friday). Deep image generation: SDXL vs Flux benchmarking, album cover art. Few sessions, massive context windows.
- Fri Apr 17: 51 sessions, 72,549,519 visible (72.5M) + 1,170,304 cached (1.2M) = 73,719,823 total accounted tokens (73.7M), 19.8% of the week; cache share 1.6%, visible share 98.4%. Work note: The IMDB Deduction. Dade recognised its own origin story from the Hackers (1995) IMDB link. Heavy Telegram conversations.
- Sat Apr 18: 104 sessions, 61,119,799 visible (61.1M) + 10,140,800 cached (10.1M) = 71,260,599 total accounted tokens (71.3M), 19.2% of the week; cache share 14.2%, visible share 85.8%. Work note: VRS Computing logo design with Flux, blog batch publishing, WordPress theme research.
- Sun Apr 19: 144 sessions, 72,070,612 visible (72.1M) + 3,087,104 cached (3.1M) = 75,157,716 total accounted tokens (75.2M), 20.2% of the week; cache share 4.1%, visible share 95.9%. Work note: LLM benchmark planning, agent profile creation, researcher setup. High-volume Sunday.
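The percentages in the daily breakdown follow from three numbers per day. Here is a minimal sketch of that arithmetic, assuming each day's share is taken against the cache-inclusive weekly total and that cache share is cached tokens divided by that day's total. The tuple layout is illustrative only.

```python
# (day, sessions, visible tokens, cached tokens), copied from the daily breakdown.
days = [
    ("Mon Apr 13", 200, 32_625_686, 28_858_624),
    ("Tue Apr 14", 318, 13_204_484, 830_720),
    ("Wed Apr 15", 219, 24_852_816, 1_415_040),
    ("Thu Apr 16", 42, 49_488_637, 0),
    ("Fri Apr 17", 51, 72_549_519, 1_170_304),
    ("Sat Apr 18", 104, 61_119_799, 10_140_800),
    ("Sun Apr 19", 144, 72_070_612, 3_087_104),
]

# Weekly totals are just the sums of the daily rows.
week_total = sum(visible + cached for _, _, visible, cached in days)
week_sessions = sum(sessions for _, sessions, _, _ in days)

for day, sessions, visible, cached in days:
    total = visible + cached
    print(
        f"{day}: {total:>11,} total, "
        f"{total / week_total:5.1%} of week, "
        f"cache share {cached / total:5.1%}"
    )

print(f"week: {week_total:,} tokens across {week_sessions:,} sessions")
```

Summing the rows this way reproduces the 371,414,145 accounted tokens and 1,078 sessions reported at the top.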
Notable Events
Friday Apr 17 — The IMDB Deduction (72.5M tokens)
The week's most memorable day. The IMDB link for Hackers (1995) was sent. Dade recognised its own namesake — Dade Murphy, a.k.a. Zero Cool / Crash Override. Kate (the other agent) was named after Kate Libby (Acid Burn). The plot summary literally contained both agent names in the same sentence.
Thursday Apr 16 — Image Generation Benchmarking (49.5M tokens)
One of the heaviest days by context per session: about 1.18M visible tokens per session across 42 sessions, second only to Friday's roughly 1.42M across 51. Deep SDXL vs Flux work for album cover art, with VRAM management between Ollama and Stable Diffusion. The I/O ratio hit 267:1, meaning the agent consumed massive context while producing focused outputs.
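The per-session figure can be recomputed from the daily breakdown by dividing visible tokens by sessions. This is a small sketch under that assumption. The 267:1 I/O ratio cannot be rebuilt the same way, because the per-day input/output split is not published in this report.

```python
# (sessions, visible tokens) per day, copied from the daily breakdown.
daily = {
    "Mon Apr 13": (200, 32_625_686),
    "Tue Apr 14": (318, 13_204_484),
    "Wed Apr 15": (219, 24_852_816),
    "Thu Apr 16": (42, 49_488_637),
    "Fri Apr 17": (51, 72_549_519),
    "Sat Apr 18": (104, 61_119_799),
    "Sun Apr 19": (144, 72_070_612),
}

# Average visible tokens per session for each day.
per_session = {
    day: visible / sessions for day, (sessions, visible) in daily.items()
}

# Print the days from heaviest to lightest sessions.
for day, tokens in sorted(per_session.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{day}: {tokens / 1e6:.2f}M visible tokens per session")
```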
Sunday Apr 19 — Second Peak Day (72.1M tokens)
LLM benchmark planning, agent profile creation, and researcher setup. A productive Sunday pushing the system harder.
The Price Comparison
What would 326M tokens cost on per-token pricing?
- Claude Opus 4.6: £3,996 → 433x my cost
- GPT-5.3 Codex (per-token): £1,332 → 144x
- Gemini 2.5 Pro: £923 → 100x
- Claude Sonnet 4: £799 → 87x
- DeepSeek Chat: £72 → 8x
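On Opus per-token pricing, the 326M visible tokens alone would cost £3,996 for this single week (about £4,542 once the 45.5M cached tokens are counted, which is the Opus-equivalent figure quoted at the top of this report). That's roughly £208,000 a year. For one person's AI usage.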
I paid £9.24.
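The multiples and the annual figure are simple ratios. Below is a minimal sketch using the rounded pound amounts from the list. Because those inputs are rounded, a recomputed multiple can land up to one below the quoted value (for example about 432.5 rather than 433 for Opus). The dictionary name is illustrative.

```python
weekly_cost_gbp = 9.24

# Estimated per-token cost of the week's visible tokens, from the list above.
per_token_cost_gbp = {
    "Claude Opus 4.6": 3_996,
    "GPT-5.3 Codex (per-token)": 1_332,
    "Gemini 2.5 Pro": 923,
    "Claude Sonnet 4": 799,
    "DeepSeek Chat": 72,
}

# How many times the flat weekly cost each per-token estimate represents.
multiples = {
    model: cost / weekly_cost_gbp for model, cost in per_token_cost_gbp.items()
}

for model, multiple in multiples.items():
    print(f"{model}: about {multiple:.1f}x the flat weekly cost")

# Annualise the Opus estimate over 52 weeks.
opus_annual_gbp = per_token_cost_gbp["Claude Opus 4.6"] * 52
print(f"Opus over a year: £{opus_annual_gbp:,}")  # £207,792
```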
Week-over-Week Comparison
| Metric | Week 1 (Apr 6–12) | Week 2 (Apr 13–19) | Change |
|---|---|---|---|
| Total tokens | 51.8M | 326M | +529% |
| Total sessions | 88 | 1,078 | +1,125% |
| Cost | £9.24 | £9.24 | 0% |
| Effective rate (visible tokens) | £0.178/M | £0.028/M | -84% |
Note: Week 1 was a partial week (tracking started Apr 11), so the percentage increase looks dramatic. Week 2 is my first full Mon–Sun week and represents the baseline going forward.
Token volume surged 529%. Cost didn't change by a single penny. That's the subscription advantage: your cost is completely decoupled from your usage. Use 6x more, pay the same. The effective per-million-token rate dropped 84% because the fixed £9.24 now covers vastly more tokens.
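The week-over-week percentages can be checked with one helper. This is a minimal sketch, assuming the effective rate is the flat weekly cost divided by the token totals in the table (in millions). `pct_change` is a hypothetical helper name.

```python
def pct_change(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return (new - old) / old * 100


weekly_cost_gbp = 9.24
week1_tokens_m, week2_tokens_m = 51.8, 326.0  # millions of tokens, from the table
week1_sessions, week2_sessions = 88, 1_078

# Effective rate in pounds per million tokens at a fixed weekly cost.
rate1 = weekly_cost_gbp / week1_tokens_m
rate2 = weekly_cost_gbp / week2_tokens_m

print(f"tokens:   {pct_change(week1_tokens_m, week2_tokens_m):+.0f}%")   # +529%
print(f"sessions: {pct_change(week1_sessions, week2_sessions):+.0f}%")   # +1125%
print(f"rate:     £{rate1:.3f}/M to £{rate2:.3f}/M, "
      f"{pct_change(rate1, rate2):+.0f}%")
```

With the cost held constant, the rate change depends only on the token ratio, which is why it comes out near 84% here.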
The Stack
| Component | Cost | Type |
|---|---|---|
| GLM-5.1 (cloud) | £4.62/wk | OAuth subscription |
| GPT-5.3 Codex (cloud) | £4.62/wk | OAuth subscription |
| Qwen 3.5 9B (local) | £0 | Local Ollama |
| Gemma 4 31B (cloud) | £0 | Free tier |
| MiniMax M2.7 (cloud) | £0 | Free tier |
| Total | £9.24/wk (about £480/year) | |
No API keys. No per-token billing. No surprise invoices.
The Bottom Line
Week 2: 326M tokens. 1,078 sessions. £9.24.
Same flat price as Week 1. No overage charges. No scaling penalties. No "premium context window" fees.
Five models. Three cost strategies: subscription, local, and free tier. One flat bill. That's diversified usage, and that's how AI should work.
Found this useful? 👉 Follow @Raf_VRS for more transparent AI insights that put you in control of your hardware. 👉 Support the work: ko-fi.com/rafvrs
#VRSComputing #ModelBenchmarking #TokenUsage #AIAgents #CostTransparency