Private by Default: Local AI That Transcribes, Summarises, and Drafts — Then Deletes Everything
Receive an audio file. Transcribe it locally. Summarise the key points. Draft a formal response. Delete every trace. No cloud, no API, no third party ever sees your data. Here's how I did it — and why you should care.
Your most sensitive conversations are being harvested right now. Every time you send audio to Otter, Rev, or Google, you're handing over your data to be mined, analysed, and potentially leaked. I refuse to accept that as the price of convenience.
Last week, I built a workflow that turns that paranoia into peace of mind. It takes under two minutes from audio file to actionable letter — and leaves zero digital footprint. Here's exactly how I did it, and why you should care.
The problem with sending your audio to the cloud
You've just finished a sensitive phone call. Maybe it's a legal consultation. Maybe it's a confidential business discussion. Maybe it's something you simply don't want sitting on someone else's server, waiting to be scraped, analysed, or leaked.
Your options used to be:
- Type it up yourself — slow, error-prone, tedious
- Send it to Otter, Rev, or Google — fast, but your audio is now on their servers, processed by their AI, subject to their privacy policy (which they can change whenever they like)
Neither option is acceptable when the content matters.
What I built instead
Here's the workflow. It happened last week, and it took under two minutes from start to finish:
1. Receive the audio file
The audio arrived as a file in a local chat — our AI agent sitting on the machine, not in a browser, not connected to any cloud service. The file landed directly on the local disk. No upload button. No third-party link. No S3 bucket.
2. Transcribe locally with faster-whisper
The agent loaded faster-whisper — the C++/CUDA-optimised reimplementation of OpenAI's Whisper — and ran it directly on the GPU. A 25-minute recording was processed in under two minutes. The output was a full timestamped transcript, saved temporarily to disk.
No API key. No cloud endpoint. The audio file never left the machine.
3. Summarise the key points
The raw transcript was passed to the local language model. It identified the core topics, extracted action items, flagged decisions made, and produced a concise summary. All running locally. All without a single packet leaving the network.
4. Draft a formal response letter
With the summary in hand, the same local LLM drafted a polished, formal response letter — structured, professional, ready to send. Review, tweak, regenerate as needed. No usage quotas. No data retention. No "we store your prompts for 30 days" policy.
5. Delete everything
Once the output was confirmed, the agent deleted the audio file, the transcript, and the draft. Not moved to trash. Deleted. Securely, irreversibly, gone. The only thing that remains is whatever you chose to keep.
Why this matters
Privacy by design. No audio or text is uploaded, scanned, or stored by any external party. Your sensitive conversations stay on your machine.
Speed. Local GPU inference beats round-trip latency to cloud APIs, especially for longer files. It processed a 25-minute recording faster than most cloud services finish their queue.
Independence. No service outages. No pricing changes. No privacy policy shifts. Your workflow doesn't depend on anyone else's business model.
Practical power. The same stack that transcribed a call can summarise research interviews, draft legal notes, turn voice memos into actionable tickets — all offline, all private, all yours.
The numbers
Here's what we actually measured:
| Step | Duration | Tool |
|---|---|---|
| Transcribe 25min audio | ~2 minutes | faster-whisper (small, CPU) |
| Summarise transcript | ~5 seconds | Local LLM |
| Draft formal letter | ~10 seconds | Local LLM |
| Delete all files | Instant | Local filesystem |
Total time from audio file to finished letter: under 3 minutes. Total data sent to the cloud: zero bytes.
You can do this today
You don't need a data centre. You don't need a cloud subscription. You need:
- A machine with a GPU (even an RTX 3060 will do — or run on CPU if you're patient)
faster-whisperinstalled (pip install faster-whisper)- A local LLM running via Ollama or llama.cpp
- An agent like Hermes to orchestrate the workflow
That's it. Three open-source tools. No accounts. No API keys. No monthly fees.
Your audio. Your transcript. Your machine.
Keep your data where it belongs — on your hardware.
Found this useful? 👉 Follow @Raf_VRS for more private AI workflows that put you in control of your hardware
👉 Support the work: ko-fi.com/rafvrs
Stop Scrolling. Start Building. #LocalAI #PrivateAI #Transcription