Safe Mode for Local Files: Keeping Sensitive Prompts on Your Machine
I built SAFE mode switches so sensitive local-file work can be routed through local models only. When SAFE is on, cloud fallbacks are removed, web access is disabled, and the confirmation includes an audit fingerprint showing what mode was applied, when, and on which host.
The question that forced the feature
The uncomfortable question was simple:
If I ask an AI agent to read a local file, where does that file content actually go?
That sounds obvious until you look at a real agent setup. The assistant might use a local terminal tool, but the reasoning model could still be running in the cloud. It might have a cloud fallback. It might have web search enabled. It might delegate a subtask to another model provider. It might summarise the result into a memory file, then a later task might send that summary somewhere else.
For public blog drafts, that risk is manageable. The content is going public anyway. But for local files — configs, logs, business notes, source code, customer information, tokens, private drafts — "probably fine" is not a security model.
So I added a hard switch: SAFE mode.
What SAFE mode is for
SAFE mode is the mode I use when the instruction is effectively:
Read this local thing, but do not let its contents leave this machine.
That means the request should be handled by local tooling and local models only. No OpenRouter. No Ollama Cloud. No ChatGPT/Codex fallback. No Tavily web search. No browser tool quietly opening a remote page. No delegation to another hosted model because the first model struggled.
The target behaviour is boring on purpose:
- Read the local file locally.
- Process it with a local model.
- Return the answer locally.
- Do not send the file contents to a remote model provider.
That is the whole point. SAFE mode is not about making the agent more capable. It is about making it less slippery.
The switches
I wrapped the routing change in two operator commands:
~/.local/bin/safe-on
~/.local/bin/safe-off
safe-on is the lockdown path. In the original routing design, it switched Hermes into strict-local mode:
model.default: qwen3.5:9b
model.provider: custom
model.base_url: http://localhost:11434/v1
fallback_providers: []
smart_model_routing.enabled: false
privacy.strict_local: true
web.backend: disabled
It also narrowed the CLI toolset to the local-safe subset:
clarify
code_execution
file
memory
skills
terminal
todo
That matters because model routing is only half the problem. If web, browser, image generation, vision, TTS, delegation, cron, and session search are still available, the agent has too many ways to move data away from the local privacy boundary. SAFE mode removes those remote-capable routes from the active tool surface.
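A minimal sketch of what the lockdown step amounts to, assuming the agent config is loaded as a nested dictionary. The dotted keys and values are the ones listed above; the `cli.toolsets` key, the helper names, and the in-memory approach are assumptions for illustration, not the actual safe-on script.

```python
import copy

# Overrides from the strict-local routing design, written as dotted paths.
STRICT_LOCAL_OVERRIDES = {
    "model.default": "qwen3.5:9b",
    "model.provider": "custom",
    "model.base_url": "http://localhost:11434/v1",
    "fallback_providers": [],
    "smart_model_routing.enabled": False,
    "privacy.strict_local": True,
    "web.backend": "disabled",
}

# The local-safe CLI toolset subset.
LOCAL_SAFE_TOOLSETS = [
    "clarify", "code_execution", "file", "memory", "skills", "terminal", "todo",
]


def set_dotted(config, dotted_key, value):
    """Set config['a']['b'] = value for dotted_key 'a.b', creating dicts as needed."""
    *parents, leaf = dotted_key.split(".")
    node = config
    for name in parents:
        if not isinstance(node.get(name), dict):
            node[name] = {}
        node = node[name]
    node[leaf] = value


def apply_strict_local(config):
    """Return a locked-down copy of config; the input is left untouched."""
    locked = copy.deepcopy(config)
    for key, value in STRICT_LOCAL_OVERRIDES.items():
        set_dotted(locked, key, copy.deepcopy(value))
    # Assumption: "cli.toolsets" is a hypothetical location for the toolset list.
    set_dotted(locked, "cli.toolsets", list(LOCAL_SAFE_TOOLSETS))
    return locked
```

Returning a copy rather than mutating in place keeps the previous config available, which is one plausible way for a restore step to put the cloud-enabled settings back.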
safe-off restores the normal cloud-enabled working mode when the sensitive task is finished.
Local-only should be checkable
The important part is that SAFE mode gives a visible, checkable state.
When I turn it on, the confirmation is supposed to say more than "done". It includes the mode, the local model, and the next operational step:
✅ SAFE MODE LOCKED
mode_fingerprint: mode=strict-local ts_utc=2026-05-09T01:23:45Z host=alienware
Local model: qwen3.5:9b @ localhost:11434
Next step: run /restart so tool restrictions apply.
That mode_fingerprint line is deliberately machine-readable:
mode_fingerprint: mode=<strict-local|cloud-glm> ts_utc=<ISO-8601 UTC> host=<hostname>
It answers three questions every time:
- What mode did I apply? strict-local or cloud-enabled.
- When did I apply it? UTC timestamp, so logs line up across systems.
- Where did I apply it? Hostname, so the audit trail makes sense if more than one machine is involved.
That turns a privacy toggle from a vague promise into an audit line.
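Because the line has a fixed shape, it can be produced and parsed with a few lines of code. The sketch below follows the format shown above; the function names are hypothetical, and it assumes the two mode values in the template are the only ones in use.

```python
import re
import socket
from datetime import datetime, timezone

TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"

FINGERPRINT_RE = re.compile(
    r"^mode_fingerprint: mode=(?P<mode>strict-local|cloud-glm) "
    r"ts_utc=(?P<ts_utc>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z) "
    r"host=(?P<host>\S+)$"
)


def make_fingerprint(mode, now=None, host=None):
    """Build the audit line; defaults to the current UTC time and local hostname."""
    now = now or datetime.now(timezone.utc)
    host = host or socket.gethostname()
    ts = now.astimezone(timezone.utc).strftime(TS_FORMAT)
    return f"mode_fingerprint: mode={mode} ts_utc={ts} host={host}"


def parse_fingerprint(line):
    """Return {'mode', 'ts_utc', 'host'} or None if the line does not match."""
    match = FINGERPRINT_RE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["ts_utc"] = datetime.strptime(fields["ts_utc"], TS_FORMAT).replace(
        tzinfo=timezone.utc
    )
    return fields
```

Parsing returns None for anything that does not match exactly, so a malformed or hand-edited line is rejected rather than silently accepted as proof.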
Why local files need a separate mode
Local files are where the real risk lives.
A normal chat prompt might be harmless. A local-file prompt can accidentally contain everything:
- .env paths and API key names
- stack traces with usernames and internal URLs
- database dumps
- private notes
- unreleased blog drafts
- customer details
- business strategy
- source code that is not public yet
The dangerous bit is not only the final answer. It is the input context. If a cloud model sees the file content, that content has already left the machine.
That is why I do not want a soft convention like "try to use local models for private files." I want a switch that changes the routing layer before the task starts.
What SAFE mode blocks
In strict-local mode, the intended boundary is:
| Capability | Normal mode | SAFE mode |
|---|---|---|
| Primary model | Cloud model allowed | Local Ollama model only |
| Fallback models | Cloud fallback allowed | No fallbacks |
| Smart routing | Can choose another model | Disabled |
| Web search | Available | Disabled |
| Browser access | Available | Disabled |
| Delegation | Available | Disabled |
| File reads | Local tools | Local tools |
| Terminal work | Local shell | Local shell |
The practical effect is simple: if I ask the agent to inspect a local config file, the file content should be handled on the workstation by the local model, not packaged into a prompt for a hosted API.
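The routing rows of that table can be turned into an explicit check. This is a sketch under assumptions: it takes the config as a nested dictionary using the key names from the strict-local settings, treats a missing key as a violation, and the function name is hypothetical. It covers routing only, not the tool surface.

```python
from urllib.parse import urlparse

LOOPBACK_HOSTS = {"localhost", "127.0.0.1", "::1"}


def strict_local_violations(config):
    """Return a list of reasons the config is NOT strict-local (empty means OK)."""
    problems = []

    base_url = config.get("model", {}).get("base_url", "")
    if urlparse(base_url).hostname not in LOOPBACK_HOSTS:
        problems.append(f"model.base_url is not loopback: {base_url!r}")

    if config.get("fallback_providers") != []:
        problems.append("fallback_providers is not an empty list")

    if config.get("smart_model_routing", {}).get("enabled") is not False:
        problems.append("smart_model_routing.enabled is not false")

    if config.get("privacy", {}).get("strict_local") is not True:
        problems.append("privacy.strict_local is not true")

    if config.get("web", {}).get("backend") != "disabled":
        problems.append("web.backend is not disabled")

    return problems
```

Failing closed on missing keys is a deliberate choice in this sketch: a config that says nothing about fallbacks should not be assumed to have none.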
The status check
There is also a status wrapper:
~/.local/bin/safe-status
That reads local config only and redacts sensitive fields before printing anything. It reports the current model provider, model name, base URL, whether strict-local privacy is enabled, whether web is disabled, how many fallbacks exist, and which CLI toolsets are active.
The status check is intentionally local-only. A privacy status command that calls the network to check privacy would be comedy, and not the good kind.
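As a rough illustration of that report, the sketch below builds the same fields from an already-loaded config dictionary and makes no network calls. Loading the file is left out because the config path and format are not stated here; `cli.toolsets` is a hypothetical key, and stripping credentials from the base URL is one example of redaction rather than the wrapper's actual rule set.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_credentials(url):
    """Drop any user:password@ part so it is never printed."""
    parts = urlsplit(url)
    if parts.username is None and parts.password is None:
        return url
    host = parts.hostname or ""
    if ":" in host:  # IPv6 literal needs its brackets back
        host = f"[{host}]"
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


def safe_status(config):
    """Summarise routing state from local config only; no network access."""
    model = config.get("model", {})
    return {
        "provider": model.get("provider"),
        "model": model.get("default"),
        "base_url": strip_credentials(model.get("base_url") or ""),
        "strict_local": config.get("privacy", {}).get("strict_local") is True,
        "web_disabled": config.get("web", {}).get("backend") == "disabled",
        "fallback_count": len(config.get("fallback_providers") or []),
        # Assumption: hypothetical location of the active CLI toolsets.
        "toolsets": list(config.get("cli", {}).get("toolsets", [])),
    }
```

The report picks out named fields instead of dumping the whole config, so an API key stored elsewhere in the file is never printed.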
The limitation
SAFE mode is a routing guard, not a magic force field.
If I explicitly paste a secret into a public website, the switch cannot save me. If I run a shell command that uploads a file, that is still a shell command. If I turn SAFE mode on but do not restart the session, the old tool surface may still be active until the restart applies the new restrictions.
That is why the confirmation includes the boring but vital line:
Next step: run /restart so tool restrictions apply.
The switch changes config. The restart makes the running agent live inside that config.
Why I added fingerprints
Before the fingerprint line, the mode switch was useful but weak for incident review. I could say "I turned safe mode on," but the logs did not have a clean, parseable proof line.
With fingerprints, every toggle leaves a small audit trail:
mode_fingerprint: mode=strict-local ts_utc=2026-05-09T01:23:45Z host=alienware
mode_fingerprint: mode=cloud-glm ts_utc=2026-05-09T01:45:02Z host=alienware
That makes it much easier to reconstruct what happened later:
- Was the agent in local-only mode before it read the file?
- Was cloud mode restored afterwards?
- Did the change happen on the right machine?
- Do the timestamps line up with the sensitive task?
For a solo setup, that might sound overbuilt. It is not. The whole point of automation is that the boring checks happen every time, even when I am tired.
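The review questions above reduce to "which mode was last applied on this host before a given time?". A small sketch of that lookup, assuming the fingerprint lines were captured into a text log; the function names are hypothetical.

```python
import re
from datetime import datetime, timezone

LINE_RE = re.compile(r"mode_fingerprint: mode=(\S+) ts_utc=(\S+) host=(\S+)")


def parse_toggles(log_text):
    """Extract (timestamp, mode, host) tuples from a log, sorted by time."""
    toggles = []
    for mode, ts, host in LINE_RE.findall(log_text):
        when = datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)
        toggles.append((when, mode, host))
    return sorted(toggles)


def mode_at(toggles, when, host):
    """Return the last mode applied on host at or before when, or None if unknown."""
    active = None
    for ts, mode, toggle_host in toggles:
        if toggle_host == host and ts <= when:
            active = mode
    return active
```

A fingerprint records when the config changed, not when the session restarted, so a result of strict-local here only shows the switch had been applied; whether the running agent had picked it up still depends on the restart.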
The rule I now use
If a task involves sensitive local files, I want SAFE mode first.
Not halfway through. Not after the agent has already read the file. Before the prompt touches the content.
The workflow is:
/safe-on
/restart
ask the local-file question
/safe-status if unsure
/safe-off when finished
/restart
That gives me a clean boundary: local files stay local, local models do the work, and cloud tools only come back after I deliberately restore them.
That is the trust model I want for a personal AI workstation.
Found this useful? Follow @Raf_VRS for more practical AI guides.