
Safe Mode for Local Files: Keeping Sensitive Prompts on Your Machine

I built SAFE mode switches so sensitive local-file work can be routed through local models only. When SAFE is on, cloud fallbacks are removed, web access is disabled, and the confirmation includes an audit fingerprint showing what mode was applied, when, and on which host.

2026-04-18 · 6 min read

The question that forced the feature

The uncomfortable question was simple:

If I ask an AI agent to read a local file, where does that file content actually go?

That sounds obvious until you look at a real agent setup. The assistant might use a local terminal tool, but the reasoning model could still be running in the cloud. It might have a cloud fallback. It might have web search enabled. It might delegate a subtask to another model provider. It might summarise the result into a memory file, then a later task might send that summary somewhere else.

For public blog drafts, that risk is manageable. The content is going public anyway. But for local files — configs, logs, business notes, source code, customer information, tokens, private drafts — "probably fine" is not a security model.

So I added a hard switch: SAFE mode.

What SAFE mode is for

SAFE mode is the mode I use when the instruction is effectively:

Read this local thing, but do not let its contents leave this machine.

That means the request should be handled by local tooling and local models only. No OpenRouter. No Ollama Cloud. No ChatGPT/Codex fallback. No Tavily web search. No browser tool quietly opening a remote page. No delegation to another hosted model because the first model struggled.

The target behaviour is boring on purpose:

  1. Read the local file locally.
  2. Process it with a local model.
  3. Return the answer locally.
  4. Do not send the file contents to a remote model provider.

That is the whole point. SAFE mode is not about making the agent more capable. It is about making it less slippery.

The switches

I wrapped the routing change in two operator commands:

safe-on is the lockdown path. In the original routing design, it switched Hermes into strict-local mode:

model.default: qwen3.5:9b
model.provider: custom
model.base_url: http://localhost:11434/v1
fallback_providers: []
smart_model_routing.enabled: false
privacy.strict_local: true
web.backend: disabled

It also narrowed the CLI toolset to the local-safe subset:

clarify
code_execution
file
memory
skills
terminal
todo

That matters because model routing is only half the problem. If web, browser, image generation, vision, TTS, delegation, cron, and session search are still available, the agent has too many ways to move data away from the local privacy boundary. SAFE mode removes those remote-capable routes from the active tool surface.
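A minimal sketch of the two halves described above, applying the strict-local settings and narrowing the toolset. It is not the actual safe-on script: it assumes the config is a nested dictionary addressed by the dotted keys shown earlier, and the names `set_dotted` and `apply_safe_on` are hypothetical.

```python
from copy import deepcopy

# Settings copied from the strict-local listing above, keyed by dotted path.
STRICT_LOCAL_OVERRIDES = {
    "model.default": "qwen3.5:9b",
    "model.provider": "custom",
    "model.base_url": "http://localhost:11434/v1",
    "fallback_providers": [],
    "smart_model_routing.enabled": False,
    "privacy.strict_local": True,
    "web.backend": "disabled",
}

# The local-safe CLI toolsets listed above.
LOCAL_SAFE_TOOLSETS = {
    "clarify", "code_execution", "file", "memory", "skills", "terminal", "todo",
}


def set_dotted(config: dict, dotted_key: str, value) -> None:
    """Set config["a"]["b"] = value for the key "a.b", creating dicts as needed.

    Assumes every intermediate value is a dict (hypothetical config layout).
    """
    *parents, leaf = dotted_key.split(".")
    node = config
    for name in parents:
        node = node.setdefault(name, {})
    node[leaf] = value


def apply_safe_on(config: dict, active_toolsets: list) -> tuple:
    """Return a strict-local copy of config and the narrowed toolset list."""
    locked = deepcopy(config)  # leave the caller's config untouched
    for key, value in STRICT_LOCAL_OVERRIDES.items():
        set_dotted(locked, key, deepcopy(value))
    # Keep only toolsets on the allowlist; everything remote-capable drops out.
    kept = [name for name in active_toolsets if name in LOCAL_SAFE_TOOLSETS]
    return locked, kept
```

The toolset filter is an allowlist rather than a blocklist, so a tool added later stays out of SAFE mode until it is deliberately marked local-safe.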

safe-off restores the normal cloud-enabled working mode when the sensitive task is finished.

Local-only should be checkable

The important part is that SAFE mode gives a visible, checkable state.

When I turn it on, the confirmation is supposed to say more than "done". It includes the mode, the local model, and the next operational step:

✅ SAFE MODE LOCKED
mode_fingerprint: mode=strict-local ts_utc=2026-05-09T01:23:45Z host=alienware
Local model: qwen3.5:9b @ localhost:11434
Next step: run /restart so tool restrictions apply.

That mode_fingerprint line is deliberately machine-readable:

mode_fingerprint: mode=<strict-local|cloud-glm> ts_utc=<ISO-8601 UTC> host=<hostname>

It answers three questions every time:

  1. What mode did I apply? strict-local, or cloud-glm for the cloud-enabled mode.
  2. When did I apply it? UTC timestamp, so logs line up across systems.
  3. Where did I apply it? Hostname, so the audit trail makes sense if more than one machine is involved.

That turns a privacy toggle from a vague promise into an audit line.
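Because the line has a fixed shape, it can be produced and parsed with a few lines of code. The sketch below follows the format shown above; the function names are hypothetical, and it assumes the hostname contains no spaces.

```python
import re
import socket
from datetime import datetime, timezone

VALID_MODES = ("strict-local", "cloud-glm")

# Mirrors: mode_fingerprint: mode=<...> ts_utc=<ISO-8601 UTC> host=<hostname>
FINGERPRINT_RE = re.compile(
    r"^mode_fingerprint: mode=(?P<mode>strict-local|cloud-glm) "
    r"ts_utc=(?P<ts_utc>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z) "
    r"host=(?P<host>\S+)$"
)


def make_fingerprint(mode, now=None, host=None):
    """Build one fingerprint line; defaults to the current UTC time and this host."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    now = now or datetime.now(timezone.utc)
    host = host or socket.gethostname()
    # Convert to UTC so the trailing "Z" is always accurate.
    ts = now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"mode_fingerprint: mode={mode} ts_utc={ts} host={host}"


def parse_fingerprint(line):
    """Return a dict with mode, ts_utc and host, or None if the line does not match."""
    match = FINGERPRINT_RE.match(line.strip())
    return match.groupdict() if match else None
```

Rejecting unknown modes at write time keeps the log limited to the two values the format allows, so a parser never has to guess.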

Why local files need a separate mode

Local files are where the real risk lives.

A normal chat prompt might be harmless. A local-file prompt can accidentally contain everything the file holds.

The dangerous bit is not only the final answer. It is the input context. If a cloud model sees the file content, that content has already left the machine.

That is why I do not want a soft convention like "try to use local models for private files." I want a switch that changes the routing layer before the task starts.

What SAFE mode blocks

In strict-local mode, the intended boundary is:

| Capability      | Normal mode              | SAFE mode               |
|-----------------|--------------------------|-------------------------|
| Primary model   | Cloud model allowed      | Local Ollama model only |
| Fallback models | Cloud fallback allowed   | No fallbacks            |
| Smart routing   | Can choose another model | Disabled                |
| Web search      | Available                | Disabled                |
| Browser access  | Available                | Disabled                |
| Delegation      | Available                | Disabled                |
| File reads      | Local tools              | Local tools             |
| Terminal work   | Local shell              | Local shell             |

The practical effect is simple: if I ask the agent to inspect a local config file, the file content should be handled on the workstation by the local model, not packaged into a prompt for a hosted API.
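One way to make that boundary testable is a check that lists every setting falling outside it. This is a sketch under the same assumption of a nested-dictionary config with the dotted keys shown earlier; `get_dotted`, `safe_boundary_violations`, and the loopback host set are hypothetical names, not part of any real tool.

```python
from urllib.parse import urlparse

# Hosts treated as "this machine" for the purposes of the check.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def get_dotted(config, dotted_key, default=None):
    """Read config["a"]["b"] for the key "a.b"; return default if any part is missing."""
    node = config
    for name in dotted_key.split("."):
        if not isinstance(node, dict) or name not in node:
            return default
        node = node[name]
    return node


def safe_boundary_violations(config):
    """Return a list of reasons the config is outside the SAFE boundary (empty if none)."""
    problems = []
    host = urlparse(get_dotted(config, "model.base_url", "") or "").hostname
    if host not in LOOPBACK_HOSTS:
        problems.append(f"model endpoint is not local: {host!r}")
    if get_dotted(config, "fallback_providers", []):
        problems.append("fallback providers are configured")
    # Missing keys count as violations below, so the check fails closed.
    if get_dotted(config, "smart_model_routing.enabled", True) is not False:
        problems.append("smart model routing is not disabled")
    if get_dotted(config, "privacy.strict_local", False) is not True:
        problems.append("privacy.strict_local is not true")
    if get_dotted(config, "web.backend") != "disabled":
        problems.append("web backend is not disabled")
    return problems
```

An empty list means the config matches the SAFE column of the table. It says nothing about the running session, which only picks up the config after a restart.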

The status check

There is also a status wrapper:

~/.local/bin/safe-status

That reads local config only and redacts sensitive fields before printing anything. It reports the current model provider, model name, base URL, whether strict-local privacy is enabled, whether web is disabled, how many fallbacks exist, and which CLI toolsets are active.

The status check is intentionally local-only. A privacy status command that calls the network to check privacy would be comedy, and not the good kind.
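The redact-then-report behaviour can be sketched as follows. This is not the real safe-status wrapper: the sensitive-key markers, the `"***"` placeholder, and both function names are assumptions made for illustration, and the config is again assumed to be a nested dictionary.

```python
# Substrings that mark a config key as sensitive (an assumed, deliberately broad list).
SENSITIVE_MARKERS = ("key", "token", "secret", "password")


def redact(value, key=""):
    """Return a copy of value with anything stored under a sensitive key masked."""
    if isinstance(value, dict):
        return {k: redact(v, k) for k, v in value.items()}
    if isinstance(value, list):
        return [redact(item, key) for item in value]
    if any(marker in str(key).lower() for marker in SENSITIVE_MARKERS):
        return "***"
    return value


def status_report(config, toolsets):
    """Summarise the fields the status check reports, using only local data."""
    clean = redact(config)  # redact first so nothing sensitive can reach the output
    model = clean.get("model", {})
    return {
        "provider": model.get("provider"),
        "model": model.get("default"),
        "base_url": model.get("base_url"),
        "strict_local": clean.get("privacy", {}).get("strict_local", False),
        "web_disabled": clean.get("web", {}).get("backend") == "disabled",
        "fallback_count": len(clean.get("fallback_providers", [])),
        "toolsets": sorted(toolsets),
    }
```

Nothing here opens a socket: the report is built entirely from values already on disk or in memory, which is the property the status check depends on.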

The limitation

SAFE mode is a routing guard, not a magic force field.

If I explicitly paste a secret into a public website, the switch cannot save me. If I run a shell command that uploads a file, that is still a shell command. If I turn SAFE mode on but do not restart the session, the old tool surface may still be active until the restart applies the new restrictions.

That is why the confirmation includes the boring but vital line:

Next step: run /restart so tool restrictions apply.

The switch changes config. The restart makes the running agent live inside that config.

Why I added fingerprints

Before the fingerprint line, the mode switch was useful but weak for incident review. I could say "I turned safe mode on," but the logs did not have a clean, parseable proof line.

With fingerprints, every toggle leaves a small audit trail:

mode_fingerprint: mode=strict-local ts_utc=2026-05-09T01:23:45Z host=alienware
mode_fingerprint: mode=cloud-glm ts_utc=2026-05-09T01:45:02Z host=alienware

That makes it much easier to reconstruct later which mode was applied, when, and on which host.

For a solo setup, that might sound overbuilt. It is not. The whole point of automation is that the boring checks happen every time, even when I am tired.
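As an illustration of that kind of review, the sketch below reads fingerprint lines out of a log and answers "which mode was last applied on this host before a given time?". The function names are hypothetical, and it assumes each fingerprint sits on its own log line in the format shown earlier.

```python
import re
from datetime import datetime, timezone

LINE_RE = re.compile(r"mode_fingerprint: mode=(\S+) ts_utc=(\S+) host=(\S+)")


def load_toggles(log_lines):
    """Extract (timestamp, mode, host) tuples from log lines, sorted by time."""
    toggles = []
    for line in log_lines:
        match = LINE_RE.search(line)
        if not match:
            continue  # not a fingerprint line
        mode, ts, host = match.groups()
        try:
            when = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
        except ValueError:
            continue  # skip lines with a malformed timestamp
        toggles.append((when, mode, host))
    return sorted(toggles)


def mode_at(toggles, when, host):
    """Return the mode most recently applied on host at or before when, else None."""
    current = None
    for ts, mode, toggle_host in toggles:
        if toggle_host == host and ts <= when:
            current = mode
    return current


log = [
    "mode_fingerprint: mode=strict-local ts_utc=2026-05-09T01:23:45Z host=alienware",
    "mode_fingerprint: mode=cloud-glm ts_utc=2026-05-09T01:45:02Z host=alienware",
]
toggles = load_toggles(log)
query = datetime(2026, 5, 9, 1, 30, tzinfo=timezone.utc)
print(mode_at(toggles, query, "alienware"))  # strict-local
```

The fingerprint records when the config was switched, not when the restart took effect, so a task that ran between a toggle and its restart still needs a closer look.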

The rule I now use

If a task involves sensitive local files, I want SAFE mode first.

Not halfway through. Not after the agent has already read the file. Before the prompt touches the content.

The workflow is:

/safe-on
/restart
ask the local-file question
/safe-status if unsure
/safe-off when finished
/restart

That gives me a clean boundary: local files stay local, local models do the work, and cloud tools only come back after I deliberately restore them.

That is the trust model I want for a personal AI workstation.


Found this useful? Follow @Raf_VRS for more practical AI guides.
