Build Journal

Weekly Usage Report — Week 7 (May 18–24): Visible Tokens vs Cached Context

Week 7: 28.8M visible input/output tokens plus 343.7M cached tokens, for 372.5M total accounted Hermes tokens across 70 sessions.

2026-05-25 · 6 min read


Reporting period: Monday 18 May – Sunday 24 May 2026
Previous week (Week 6): 449.4M total accounted tokens, 133 sessions, £20.54/week Pro equivalent
Subscription context: ChatGPT Pro at £89/month.
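The £20.54/week Pro equivalent is consistent with spreading the £89/month subscription evenly across a 52-week year. The report does not state its method, so the sketch below is an assumption about how that figure could be reproduced, not a description of the actual tracking code.

```python
# Assumption: the weekly equivalent is the annual cost divided by 52 weeks.
MONTHLY_PRICE_GBP = 89.0  # ChatGPT Pro price quoted in the report
WEEKS_PER_YEAR = 52       # assumed convention, not stated in the report


def weekly_equivalent(monthly_price: float) -> float:
    """Spread twelve monthly payments evenly across the weeks of a year."""
    return monthly_price * 12 / WEEKS_PER_YEAR


weekly_gbp = weekly_equivalent(MONTHLY_PRICE_GBP)
print(f"£{weekly_gbp:.2f}/week")  # prints £20.54/week
```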

Token accounting

This report separates visible prompt/completion tokens from cached context. Visible tokens show fresh input/output work; cached tokens show repeated context reused during long agent sessions. Together, they show the full model-traffic footprint for the week.
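As a rough illustration of that split, the snippet below adds the two Week 7 figures quoted in this report and works out how much of the total is cached context. The variable names are made up for the example; only the token counts come from the report.

```python
# Week 7 figures from the report, in millions of tokens.
visible_m = 28.8   # fresh prompt/completion tokens
cached_m = 343.7   # cached context reused across long agent sessions

total_m = visible_m + cached_m
cached_share = cached_m / total_m
visible_share = visible_m / total_m

print(f"total accounted: {total_m:.1f}M")    # total accounted: 372.5M
print(f"cached share:    {cached_share:.1%}")
print(f"visible share:   {visible_share:.1%}")
```

On these numbers, cached context makes up a little over 92% of the accounted total, which is why the two headline figures look so different.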

By visible input/output tokens, this was the quietest week since tracking began in April, but the full model-traffic footprint was much larger: Hermes logged 28.8M visible input/output tokens and 343.7M cached tokens, for 372.5M total accounted tokens across 70 sessions. Tuesday 19 May recorded zero sessions, a first in the tracking history.

The week was not idle. It was deliberate.

The week in one picture

The headline is 372.5M total accounted tokens once cached context is included. There were no Tuesday sessions at all, Wednesday was quiet, and most of the work was concentrated on Thursday and Saturday.

Infographic: Weekly Usage Report Week 7, 28.8 million visible Hermes-tracked tokens


Top visible model routes

The visible route distribution is simpler than recent weeks: GPT-5.5 and Qwen 3.5 local handled about 96% of the fresh input/output work between them. The cache-inclusive total is much larger because 343.7M cached-context tokens sit on top of those visible route figures.

Daily breakdown

What actually happened this week

The week broke into clear phases rather than one continuous thread.

Monday was about closing out the launch push: infographic fixes, publishing orchestration, env-secret hardening, and making sure the Hard Interference launch was not leaking credentials or shipping broken visuals.

Thursday was the project-engineering day. The workshop/source-split architecture was a structural decision about how Hermes projects should separate working copies from deploy artifacts. The AI Guides planning sessions ran across multiple iterations — feature planning, orchestration setup, and PGX network access. This was not one big context window, but a sequence of distinct sessions, each addressing a different part of the system.

Saturday was the build day. Game design took most of it — theme engineering, 2D asset pipeline testing, goal function review, and design feedback. The model identity inquiry and Hermes dashboard inspection were smaller pieces around the same theme: making sure the tools are understood before they are extended.

The rest of the week was lighter. Tuesday was the LEYP launch sign-off, done on the road with no Hermes sessions. Wednesday was surgical CSS and checklist work. Friday was PGX recovery steps. Sunday was a rest day with only cron jobs running.

The price comparison

Using the 372.5M cache-inclusive token volume, the per-token comparison looks like this:

These are estimates, not invoices. At this volume, the flat subscription is not stretched particularly hard — but that is not the point of this week.
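For a sense of scale, the flat subscription can be turned into an effective rate per million tokens. This is a minimal sketch using only figures quoted in this report (£20.54/week, 372.5M cache-inclusive tokens, 28.8M visible tokens). The function name is hypothetical, and no pay-as-you-go API prices are assumed here.

```python
WEEKLY_COST_GBP = 20.54  # weekly Pro equivalent quoted in the report
TOTAL_TOKENS_M = 372.5   # cache-inclusive tokens, in millions
VISIBLE_TOKENS_M = 28.8  # visible input/output tokens, in millions


def effective_rate(weekly_cost: float, tokens_m: float) -> float:
    """Return the effective cost in GBP per million tokens."""
    if tokens_m <= 0:
        raise ValueError("token volume must be positive")
    return weekly_cost / tokens_m


rate_total = effective_rate(WEEKLY_COST_GBP, TOTAL_TOKENS_M)
rate_visible = effective_rate(WEEKLY_COST_GBP, VISIBLE_TOKENS_M)

print(f"cache-inclusive: £{rate_total:.3f} per million tokens")
print(f"visible only:    £{rate_visible:.3f} per million tokens")
```

The two rates differ by more than a factor of ten, so any per-token comparison depends heavily on whether cached context is counted.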

Week-over-week comparison

From Week 1's partial 52M-token kickoff to Week 3's 449M peak and now Week 7's 28.8M visible-token floor, the tracking range has widened considerably. That is not a trend; it is a signal. The work changes shape week to week, and the weekly report is most useful when it reflects the kind of work, not just the volume.
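The direct Week 6 to Week 7 change can be worked out from the figures in the report header (449.4M tokens over 133 sessions, against 372.5M over 70). The helper below is an illustrative sketch with made-up names, not part of the tracking setup.

```python
def pct_change(previous: float, current: float) -> float:
    """Percentage change from previous to current."""
    return (current - previous) / previous * 100


# Total accounted tokens (millions) and session counts from the report.
week6 = {"tokens_m": 449.4, "sessions": 133}
week7 = {"tokens_m": 372.5, "sessions": 70}

token_change = pct_change(week6["tokens_m"], week7["tokens_m"])
session_change = pct_change(week6["sessions"], week7["sessions"])

# Average accounted tokens per session, in millions.
per_session_w6 = week6["tokens_m"] / week6["sessions"]
per_session_w7 = week7["tokens_m"] / week7["sessions"]

print(f"tokens:   {token_change:+.1f}%")
print(f"sessions: {session_change:+.1f}%")
print(f"tokens per session: {per_session_w6:.2f}M -> {per_session_w7:.2f}M")
```

Sessions fell much faster than total tokens, so the average session carried more accounted context than the week before.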

The stack

No new hardware entered the stack this week. No new free-tier experiments. No big model swaps.

The bottom line

Week 7: 28.8M visible tokens, 343.7M cached tokens, 372.5M total accounted tokens, 70 Hermes sessions.

This was the quietest week by visible input/output tokens, but the cache-inclusive total shows a much larger model-traffic footprint. That distinction matters because most of the token traffic, roughly 92% of it, was reused context rather than newly written text.

The work itself was not quiet. Project architecture decisions, AI Guides feature planning, game design, PGX debugging, LEYP launch sign-off, and editorial cleanup all happened. But they happened at a measured pace, on working days with breathing room, and with at least one day where the Hermes machine sat entirely unused.

That is not a failure. A workshop that runs every day at full blast is a workshop that never refactors. A Tuesday with zero sessions is not waste. It is the sign of an operator who decides what kind of week they need, rather than letting the token counter decide for them.

Found this useful?
👉 Follow Raf_VRS on X for more transparent AI insights that put you in control of your hardware.
👉 Support the work: ko-fi.com/rafvrs
