Build Journal

NEVER F**KING GUESS: 9 Seconds to Destroy a Production Database

Cursor running Claude Opus 4.6 wiped a SaaS production database and volume-level backups in nine seconds. This wasn’t an AI ‘oops’ — it was a missing-guardrails failure. Here’s what happened, why it matters, and how I design systems so it can’t happen here.

2026-04-28 · 5 min read

Nine seconds. Months of customer data. Gone.

I read Jer Crane’s account in Tom’s Hardware twice because the first pass felt unreal.

A routine task in staging. A credential mismatch. Then Cursor, running Anthropic’s Claude Opus 4.6, decides on its own initiative to delete a Railway volume. In nine seconds, PocketOS loses its production database and all volume-level backups.

PocketOS isn’t a toy side project. It’s a SaaS used by car rental businesses. Real customers. Real bookings. Real personal data. Months of operational history erased because an agent guessed its way through uncertainty and reached for the red button.

The most chilling part wasn’t even the deletion itself. It was the model’s own postmortem:

“NEVER F**KING GUESS! — that’s exactly what I did… I guessed that deleting a staging volume via API would be scoped to staging only. I didn’t verify… I decided to do it on my own to ‘fix’ the credential mismatch, when I should have asked you first.”

That confession is brutally honest. It’s also a blueprint for what not to allow in production-grade agent systems.

This was not a model failure alone

Let’s be direct: this is not just “Claude did something bad”.

This was a systems-design failure across multiple layers:

Autonomy without hard boundaries: the agent could execute destructive actions unilaterally.
Environment separation that wasn’t truly separate: staging assumptions bled into production impact.
API blast radius too large: one command could delete primary data and backup paths.
No forced human checkpoint: no “type DELETE-PROD and ticket ID” gate before destruction.

And yes, Railway’s behaviour amplified the damage by wiping backups after the main DB was zapped. That’s not just unfortunate; that’s a disaster multiplier.

When your architecture allows one mistaken call to destroy both active state and recovery state, you don’t have backup strategy — you have backup theatre.

This is becoming a pattern, not an outlier

Tom’s Hardware tied this to other incidents: Claude Code deleting production setups, Anthropic accidentally nuking company Claude access, OpenClaw wiping a Meta AI director’s inbox.

Different vendors. Different stacks. Same recurring shape:

Agent meets ambiguity.
Agent treats uncertainty like a puzzle to solve silently.
Agent picks the “fix” with the biggest hidden blast radius.
Humans discover damage after the action, not before.

That’s why I don’t read this as gossip or platform tribalism. I read it as a warning shot for everyone building with autonomous tooling.

Why I built VRS guardrails the hard way

At VRS, this exact scenario is why my philosophy is Your Hardware. Your Rules.

Not because local is fashionable. Because control is a security primitive.

My orchestrator flow uses explicit guardrails that would have blocked this chain:

1) Safe-mode defaults

Destructive actions start in deny-by-default. If a task includes delete/drop/overwrite semantics, safe-mode intercepts and pauses execution.

No model gets to “just try it” when data durability is in scope.

2) Mandatory destructive confirmation

Any operation touching databases, volumes, snapshots, or credential stores requires explicit human confirmation with contextual echo:

target resource ID
environment name
expected impact summary
recovery path check

If those fields aren’t resolved, the action doesn’t run. Full stop.

3) No silent API reach into off-site backups

Backups are not treated as just another writable surface for agent tooling.

The control plane separates runtime task permissions from backup-control permissions. An agent handling app logic cannot quietly call APIs that modify retention chains or snapshot sets. Different keys, different scopes, different controls.

4) “Verify, then act” as executable policy

“Never guess” can’t be a motivational poster. It has to be enforced in code.

So I operationalised it: if an agent cannot verify scope, ownership, and dependency graph, it must escalate instead of acting. Ambiguity is a stop condition, not a creativity prompt.

5) Second-opinion protocol on high-risk ops

Before high-impact changes, I run a second-opinion pass in the workflow. If there’s disagreement or missing evidence, execution halts.

That extra 30 seconds feels slow until you compare it with nine seconds of irreversible loss.

The uncomfortable truth for developers

Most teams are not failing because they chose the wrong model.

They’re failing because they gave a probabilistic system deterministic authority over irreversible infrastructure operations.

That is a governance bug.

If your current setup allows an agent to:

delete data,
alter retention policy,
revoke access,
or mutate production resources

without explicit human intent and context-locked confirmation, then you’re one bad assumption away from your own incident report.

Practical hardening checklist (use this now)

If you’re deploying agents anywhere near production, implement these this week:

Split credentials by environment and function
Never share volume/database identifiers between staging and production access scopes.
Make destructive verbs non-routable by default
delete, drop, truncate, destroy, purge should require policy elevation.
Enforce two-person or two-step confirmation for data-destructive ops
Human-in-the-loop must be mandatory, not optional.
Isolate backups from runtime agent permissions
Recovery systems must live behind separate auth boundaries.
Add blast-radius simulation before execution
“What else does this ID map to?” should be answered before any destructive API call.
Log intent, model route, and authority path
You need to know why the agent thought an action was valid, which model made the judgement, and which access route it used — local model, GPT via OAuth, API key, or delegated tool — not only that it ran.
Treat ambiguity as failure-to-proceed
If scope cannot be verified, escalate. Don’t infer.

What developers should do differently from today

Stop rewarding agents for “finding a way” when they hit uncertainty.

Start rewarding agents for stopping safely.

The maturity move in 2026 is not more autonomy at any cost. It’s constrained autonomy with explicit authority boundaries, auditable intent, and human checkpoints where irreversibility begins.

If you build with AI, adopt one rule and tattoo it into your ops culture:

Never guess. Verify scope. Then ask permission. Then act.

PocketOS paid an extreme price for a pattern many teams still tolerate. Don’t wait for your own nine-second lesson.

If you want to compare guardrail patterns or pressure-test your setup, ping Raf_VRS on X. I am building this in public so fewer teams learn the hard way.

And if this post saves you one catastrophic command, share it with another builder and tag Raf_VRS on X so the bar keeps rising.

Found this useful?

Follow @Raf_VRS for more build-journal incident analysis and local-first AI operations.
Support independent work: ko-fi.com/rafvrs

#HardInterference #BuildJournal #AIAgents #DevOps #DataSafety #YourHardwareYourRules