AI Guides

Private by Default: Local AI That Transcribes, Summarises, and Drafts — Then Deletes Everything

Receive an audio file. Transcribe it locally. Summarise the key points. Draft a formal response. Delete every trace. No cloud, no API, no third party ever sees your data. Here's how I did it — and why you should care.

2026-04-20 · 4 min read

Your most sensitive conversations are being harvested right now. Every time you send audio to Otter, Rev, or Google, you're handing over your data to be mined, analysed, and potentially leaked. I refuse to accept that as the price of convenience.

Last week, I built a workflow that turns that paranoia into peace of mind. It takes under two minutes from audio file to actionable letter — and leaves zero digital footprint. Here's exactly how I did it, and why you should care.

The problem with sending your audio to the cloud

You've just finished a sensitive phone call. Maybe it's a legal consultation. Maybe it's a confidential business discussion. Maybe it's something you simply don't want sitting on someone else's server, waiting to be scraped, analysed, or leaked.

Your options used to be:

Type it up yourself — slow, error-prone, tedious
Send it to Otter, Rev, or Google — fast, but your audio is now on their servers, processed by their AI, subject to their privacy policy (which they can change whenever they like)

Neither option is acceptable when the content matters.

What I built instead

Here's the workflow. It happened last week, and it took under two minutes from start to finish:

1. Receive the audio file

The audio arrived as a file in a local chat — our AI agent sitting on the machine, not in a browser, not connected to any cloud service. The file landed directly on the local disk. No upload button. No third-party link. No S3 bucket.

2. Transcribe locally with faster-whisper

The agent loaded faster-whisper — the C++/CUDA-optimised reimplementation of OpenAI's Whisper — and ran it directly on the GPU. A 25-minute recording was processed in under two minutes. The output was a full timestamped transcript, saved temporarily to disk.

No API key. No cloud endpoint. The audio file never left the machine.

3. Summarise the key points

The raw transcript was passed to the local language model. It identified the core topics, extracted action items, flagged decisions made, and produced a concise summary. All running locally. All without a single packet leaving the network.

4. Draft a formal response letter

With the summary in hand, the same local LLM drafted a polished, formal response letter — structured, professional, ready to send. Review, tweak, regenerate as needed. No usage quotas. No data retention. No "we store your prompts for 30 days" policy.

5. Delete everything

Once the output was confirmed, the agent deleted the audio file, the transcript, and the draft. Not moved to trash. Deleted. Securely, irreversibly, gone. The only thing that remains is whatever you chose to keep.

Why this matters

Privacy by design. No audio or text is uploaded, scanned, or stored by any external party. Your sensitive conversations stay on your machine.

Speed. Local GPU inference beats round-trip latency to cloud APIs, especially for longer files. It processed a 25-minute recording faster than most cloud services finish their queue.

Independence. No service outages. No pricing changes. No privacy policy shifts. Your workflow doesn't depend on anyone else's business model.

Practical power. The same stack that transcribed a call can summarise research interviews, draft legal notes, turn voice memos into actionable tickets — all offline, all private, all yours.

The numbers

Here's what we actually measured:

Step	Duration	Tool
Transcribe 25min audio	~2 minutes	faster-whisper (small, CPU)
Summarise transcript	~5 seconds	Local LLM
Draft formal letter	~10 seconds	Local LLM
Delete all files	Instant	Local filesystem

Total time from audio file to finished letter: under 3 minutes. Total data sent to the cloud: zero bytes.

You can do this today

You don't need a data centre. You don't need a cloud subscription. You need:

A machine with a GPU (even an RTX 3060 will do — or run on CPU if you're patient)
faster-whisper installed (pip install faster-whisper)
A local LLM running via Ollama or llama.cpp
An agent like Hermes to orchestrate the workflow

That's it. Three open-source tools. No accounts. No API keys. No monthly fees.

Your audio. Your transcript. Your machine.

Keep your data where it belongs — on your hardware.

Found this useful? 👉 Follow @Raf_VRS for more private AI workflows that put you in control of your hardware
👉 Support the work: ko-fi.com/rafvrs

Stop Scrolling. Start Building. #LocalAI #PrivateAI #Transcription