Build Journal

The 12 Million Token Mistake

A healthcheck cron job running every 5 minutes through an LLM session was burning 12 million tokens per day to execute 'curl localhost:3000'. The most expensive 3-line bash script ever written.

2026-04-15 · 3 min read

Discovery

I was looking at the Mission Control live activity feed and noticed something: the last 10 events were all from cron. Every single one. No actual work being done -- just automated jobs checking in.

Then I looked at the numbers.

The offender

There was a cron job called mission-control-healthcheck-auto-restart that did exactly this:

#!/usr/bin/env bash
status="$(curl -s -o /dev/null -w '%{http_code}' http://127.0.0.1:3000)"
if [[ "$status" == "200" ]]; then
  echo "OK mission-control healthy (200)"
  exit 0
fi
# ...restart logic...

That's it. A curl command and a conditional. Three lines of bash.

But it was running through an LLM session every 5 minutes.

Each run created a new Hermes agent session that:

Loaded the full system prompt (~5K tokens)
Parsed the user instruction
Executed the terminal command
Formulated a response
Returned status

Each run cost approximately 22,680 tokens.

Let me do the math:

288 runs per day (every 5 minutes, 24 hours)
22,680 tokens per run
= 6,531,840 tokens per day

But wait, there's more

The daily-memory-refresh cron was also using cloud models when it didn't need to:

Running every 30 minutes on glm-5.1:cloud
48 runs per day × 22K tokens = ~1,056,000 tokens per day

Total from just these two jobs: ~7.6 million tokens per day.

For comparison, an entire day of actual work (coding, research, building) might use 5-10 million tokens. These cron jobs were using as many tokens as the actual productive work.

The fix

Three changes, massive impact:

1. Healthcheck → system crontab (zero tokens)

# Replaced the Hermes cron job with a plain system crontab entry:
*/5 * * * * /home/klb/.hermes/scripts/mission-control-watchdog.sh >> /tmp/mission-control-watchdog-cron.log 2>&1

Same script, same frequency, zero LLM tokens. The bash script doesn't need an AI to run curl and check if the server is up.

2. Memory refresh → 1 hour interval + local model

Schedule: every 30m → every 60m (halved the runs)
Model: glm-5.1:cloud → qwen3.5:9b (local, zero cost)
Token reduction: from ~1M/day to ~0 (local model = no cloud tokens)

3. The rule

Simple checks (healthchecks, pings, bash scripts) must NEVER use LLM tokens. If a cron job must use an LLM for a trivial task, route to a local model. Prefer system crontab for pure shell tasks.

This rule is now baked into the system memory.

The principle

Automation without cost awareness is just expensive manual work.

The healthcheck was "automated" -- but it was automating an LLM to run a bash script. That's not automation, that's paying a consultant to press a button for you. The automation should be the button press itself, not the consultation.

Savings summary

Change	Before	After	Tokens saved/day
Healthcheck → system cron	~6.5M	0	6.5M
Memory refresh → 1h + local	~1.1M	0 (local)	1.1M
Total	~7.6M	0	7.6M

That's roughly 75% of the daily cron token budget eliminated in two changes.

The Numbers — At a Glance

The 12 Million Token Mistake — before and after cost dashboard

View full-size infographic

What 12 million tokens/day would cost on Opus

Using Opus 4.6 API pricing assumptions (£12/M input + £60/M output), 12 million tokens per day gets expensive fast:

Best-case (all input): £144/day (about £4,320/month)
Conservative mixed estimate (90% input / 10% output): £202/day (about £6,048/month)
In GBP at 0.80 FX: roughly £4,320 to £6,048 per month

That is real money burned by a healthcheck loop.

This is exactly why it's important to start small and grow with the system, instead of diving in on hype and scaling waste.

Found this useful? 👉 Follow @Raf_VRS for more Build Journal updates 👉 Support the work: ko-fi.com/rafvrs