Hermes on the Thin Client: Installing an AI Agent on an £80 Laptop

An £80 HP thin client will not run useful local models, but it can still host a full personal agent with local PC access, memory, cloud models and a path to the PGX.

2026-04-17 · 10 min read

The Machine

This is not a powerhouse. This is not even a house. This is the shed out back.

Spec	HP 14-bs057	My main rig (Alienware Aurora)
CPU	Intel Celeron N3060 (2C/2T, 1.6 GHz burst 2.48 GHz)	Intel Core Ultra 7 265KF (20T)
RAM	8 GB DDR3L (upgraded from 4 GB)	64 GB DDR5
Storage	1 TB 5400 RPM HDD	2 TB NVMe + 2 TB scratch
GPU	Intel HD 400 (integrated)	RTX 5070 Ti 16 GB
Era	2016	2025
Used price	~£80 (+ £15 RAM upgrade)	~£1,800

The Celeron N3060 scores roughly 200 on PassMark single-thread. For context, my Core Ultra 7 scores 4,200. That's a 21x gap.

Why do this? Because most people don't have an RTX 5070 Ti. Most people have this — a hand-me-down laptop gathering dust. If Hermes can run here, it can run anywhere. And that matters for adoption.

Constraints We're Working With

8 GB RAM (upgraded from 4 GB): Ubuntu Server idles at ~500 MB, and a clean install reports roughly 6.5 GB available for Hermes and tools. That is workable, as long as Chrome isn't running alongside it
Local model selection works, local inference does not — ChatGPT could add a tiny Ollama model as a selectable local option in Hermes, but the model's ~40k context window and slow response time made it unusable once Hermes sent its normal agent context
1 TB 5400 RPM HDD — plenty of space for Ubuntu Server, Hermes, logs, memory and project files, but much slower than an SSD for Python startup and local model loading
USB 3.1 ports — can expand storage externally if needed, ideally with an SSD if this becomes a permanent always-on box
The CPU is the real bottleneck — not RAM. Celeron N3060 at 200 PassMark means slow Python startup, slow local processing, and very slow local inference; cloud inference hides most of that pain
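
To see how another hand-me-down machine compares with the constraints above, a small preflight check can print total RAM, CPU thread count and whether each disk is rotational. This is a sketch that assumes a Linux system with /proc and /sys available; the function names are illustrative helpers and not part of Hermes.

```shell
#!/bin/sh
# Preflight sketch: summarise the hardware constraints discussed above.
# All function names here are hypothetical helpers, not Hermes commands.

# Convert KiB (the unit used in /proc/meminfo) to GiB with one decimal place.
kib_to_gib() {
    awk -v kib="$1" 'BEGIN { printf "%.1f\n", kib / 1048576 }'
}

# Map the kernel's rotational flag (1 = spinning disk, 0 = SSD/flash) to a label.
disk_kind() {
    case "$1" in
        1) echo "hdd" ;;
        0) echo "ssd" ;;
        *) echo "unknown" ;;
    esac
}

preflight() {
    total_kib=$(awk '/^MemTotal:/ { print $2 }' /proc/meminfo)
    echo "RAM total: $(kib_to_gib "$total_kib") GiB"
    echo "CPU threads: $(nproc)"
    for flag in /sys/block/sd*/queue/rotational /sys/block/nvme*/queue/rotational; do
        [ -r "$flag" ] || continue          # skip glob patterns that matched nothing
        dev=${flag#/sys/block/}
        echo "Disk ${dev%%/*}: $(disk_kind "$(cat "$flag")")"
    done
}

# Only run the live check where /proc/meminfo exists (i.e. on Linux).
if [ -r /proc/meminfo ]; then
    preflight
fi
```

The reported RAM total will come out a little under the nominal figure, because the kernel and integrated graphics reserve some memory.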

The Goal

Get Hermes Agent running end-to-end on this laptop:

✅ CLI chat working via ChatGPT OAuth / GPT-5.5
✅ Cloud model access through an Ollama subscription as a second route
✅ Telegram gateway connected
✅ At least basic tools (terminal, file, web)
✅ A realistic local-model test, including the failure modes
✅ Document every step, every error, every workaround

Step-by-Step Install Guide

Step 1: Choose the OS

Ubuntu Desktop 24.04 LTS is the default, but its desktop environment alone eats ~800 MB, which is a lot to give up on an 8 GB machine. Three options:

OS	RAM at idle	Pros	Cons
Ubuntu Desktop 24.04	~1.5 GB	GUI, browser for testing	Tight with 8 GB
Ubuntu Server 24.04	~500 MB	Maximum RAM for Hermes	No GUI, SSH only
Lubuntu 24.04	~800 MB	Lightweight GUI	Smaller community

Recommendation: Ubuntu Server 24.04 LTS. With 8 GB we have more headroom than the original 4 GB would have allowed, but we're still not running a desktop. This is a headless agent that talks to you via Telegram. Every megabyte counts toward responsiveness.

Step 2: Install Ubuntu Server

Download Ubuntu Server 24.04 LTS ISO from ubuntu.com/download/server

Flash to USB with Balena Etcher or dd:

sudo dd if=ubuntu-24.04-live-server-amd64.iso of=/dev/sdX bs=4M status=progress && sync

Boot the HP from USB (press F9 at startup for boot menu, F10 for BIOS)
Install with minimal packages — no LAMP, no snaps you don't need
Set hostname to something memorable (e.g., hermes-thinclient)
Enable SSH during install — you'll need it
Create your user, set a strong password
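
Before the dd step above, it is worth checking the download against Ubuntu's published SHA256SUMS file, because a corrupted ISO tends to fail partway through the install. A minimal sketch, assuming GNU sha256sum and a SHA256SUMS file saved next to the ISO; verify_iso is an illustrative helper name.

```shell
#!/bin/sh
# Compare an ISO's SHA-256 hash with the matching entry in a SHA256SUMS file.
# verify_iso is a hypothetical helper; it returns success only on an exact match.
verify_iso() {
    iso=$1
    sums=$2
    # SHA256SUMS lines look like "<hash> *<filename>"; strip the optional '*'.
    expected=$(awk -v f="$(basename "$iso")" \
        '{ name = $2; sub(/^\*/, "", name); if (name == f) print $1 }' "$sums")
    actual=$(sha256sum "$iso" | awk '{ print $1 }')
    [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Usage (file names assumed to match what you downloaded):
# verify_iso ubuntu-24.04-live-server-amd64.iso SHA256SUMS && echo "ISO OK"
#
# Also confirm which device is the USB stick before running dd:
# lsblk -d -o NAME,SIZE,MODEL,TRAN
```

Getting /dev/sdX wrong overwrites whatever disk it points at, so the lsblk check is the more important of the two.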

Step 3: First Boot — Free Up Resources

After first login via SSH:

# Update everything
sudo apt update && sudo apt upgrade -y

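# Remove snap packages you don't need (saves ~200 MB RAM)
# lxd may not be installed on every image; a "not installed" error here is harmless
sudo snap remove lxd
# Remove old, disabled snap revisions left behind by updates
sudo snap list --all | awk '/disabled/{print $1, $3}' | while read -r name rev; do sudo snap remove "$name" --revision="$rev"; done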

# Check free RAM
free -h

You should see roughly 6.5 GB available after boot on a clean Server install with 8 GB RAM.
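
If you would rather have a pass/fail answer than read the free -h table by eye, the same number can be pulled from /proc/meminfo. This is a sketch; the 6000 MiB threshold is my own assumption based on the ~6.5 GB figure above, and the helper names are illustrative.

```shell
#!/bin/sh
# Print MemAvailable in MiB from a meminfo-style file (default: /proc/meminfo).
mem_available_mib() {
    awk '/^MemAvailable:/ { print int($2 / 1024) }' "${1:-/proc/meminfo}"
}

# Succeed if the first argument (MiB available) is at least the second (MiB required).
has_headroom() {
    [ "$1" -ge "$2" ]
}

if [ -r /proc/meminfo ]; then
    avail=$(mem_available_mib)
    # 6000 MiB is an assumed threshold, not a Hermes requirement.
    if has_headroom "$avail" 6000; then
        echo "OK: ${avail} MiB available"
    else
        echo "Low: ${avail} MiB available; check for leftover services"
    fi
fi
```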

Step 4: Install Dependencies

Hermes needs Python 3.10+, pip, and a few system packages:

# Python (Ubuntu 24.04 ships 3.12 — perfect)
python3 --version  # Should show 3.12.x

# pip and venv
sudo apt install -y python3-pip python3-venv

# Build tools (needed for some pip packages)
sudo apt install -y build-essential git curl

# FFmpeg (optional, for voice features)
sudo apt install -y ffmpeg
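
On a distro other than Ubuntu 24.04 the Python version is worth checking before going further. A minimal sketch of the 3.10+ requirement as a shell test; python_ok is an illustrative helper name, and it assumes a plain dotted version string such as 3.12.3.

```shell
#!/bin/sh
# Succeed if a dotted version string (e.g. "3.12.3") is at least 3.10.
python_ok() {
    major=${1%%.*}        # text before the first dot
    rest=${1#*.}          # text after the first dot
    minor=${rest%%.*}     # second component
    [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 10 ]; }
}

if command -v python3 >/dev/null 2>&1; then
    ver=$(python3 -c 'import platform; print(platform.python_version())')
    if python_ok "$ver"; then
        echo "Python $ver is new enough"
    else
        echo "Python $ver is too old for Hermes (needs 3.10+)"
    fi
fi
```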

Step 5: Install Hermes

Use the official install script:

curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash

This will:

Clone Hermes to ~/.hermes/hermes-agent/
Create a virtual environment
Install Python dependencies
Add hermes to your PATH
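
If piping a remote script straight into bash makes you uneasy, the same install can be done in two steps: download first, read it, then run it. A sketch using the URL from the command above; fetch_installer and the /tmp path are illustrative choices of mine.

```shell
#!/bin/sh
# Download a script to a local file instead of piping it straight into bash.
# fetch_installer is a hypothetical helper name.
fetch_installer() {
    url=$1
    dest=$2
    curl -fsSL "$url" -o "$dest"
}

# Usage (URL is the one from the install step above):
# fetch_installer https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh /tmp/hermes-install.sh
# less /tmp/hermes-install.sh      # read what it will do
# bash /tmp/hermes-install.sh      # run it once you are happy
```

The end result should be the same as the one-liner; the only difference is that you get to see the four steps listed above before they happen.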

The pip install step compiles some dependencies from source. With 8 GB of RAM it should not run out of memory, but the Celeron is slow, so expect 5-10 minutes for the install regardless. A small swap file is cheap insurance:

# Create swap space (1 GB swap file — insurance, not essential with 8 GB)
sudo fallocate -l 1G /swapfile
sudo chmod 600 /swapfile
sudo mkswap /swapfile
sudo swapon /swapfile

# Make it permanent (skip if the entry is already in /etc/fstab)
grep -q '^/swapfile ' /etc/fstab || echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab

# Verify
free -h  # Should show 1G swap

With 8 GB RAM + 1 GB swap, pip install should complete without running out of memory. It will still be slow on the Celeron.
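
One thing to check before creating the swap file: the Ubuntu Server installer may already have set up swap of its own (often a file such as /swap.img), in which case a second one is optional. A sketch that reads /proc/swaps to find out; swap_total_kib is an illustrative helper name.

```shell
#!/bin/sh
# Sum the Size column (KiB) of a /proc/swaps-style file; prints 0 if none is active.
swap_total_kib() {
    awk 'NR > 1 { total += $3 } END { print total + 0 }' "${1:-/proc/swaps}"
}

if [ -r /proc/swaps ]; then
    kib=$(swap_total_kib)
    if [ "$kib" -gt 0 ]; then
        echo "Swap already active: $((kib / 1024)) MiB"
    else
        echo "No swap active; creating /swapfile as described above makes sense"
    fi
fi
```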

Step 6: Configure a Cloud Provider

Since local models are selectable but not practical on this hardware, we need a cloud brain. The cheapest viable options:

Provider	Model	Cost	Setup
ChatGPT / OpenAI Codex	GPT-5.5 via OAuth	Uses your ChatGPT subscription allowance	`hermes auth add openai-codex --type oauth`
Ollama subscription	31 cloud models	Subscription-backed cloud model access	`OLLAMA_API_KEY` / Ollama Cloud config
OpenRouter	Any	Pay-per-token	`OPENROUTER_API_KEY`
Z.AI / GLM	glm-5.1:cloud	Free tier available	`GLM_API_KEY`
DeepSeek	deepseek-chat	Very cheap	`DEEPSEEK_API_KEY`

Run the setup wizard:

hermes setup

Choose your provider, enter your API key when prompted. For this thin client, the tested winner is GPT-5.5 via ChatGPT OAuth. It gives you the full Hermes agent experience — tools, memory, terminal access and personal context — without asking the HP laptop to be the model host. An Ollama subscription is the other useful route here: not because this laptop can run local models well, but because it gives access to 31 cloud models that work far better than forcing inference onto a Celeron.
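
For the API-key providers in the table, the wizard is the simplest route. If you prefer to script the setup, the keys are ordinary KEY=value lines in an env file. The sketch below updates one key without duplicating it and keeps the file private; set_env_var is a hypothetical helper, and the ~/.hermes/.env path in the usage line is my assumption about where Hermes reads it, so check your install first.

```shell
#!/bin/sh
# Set KEY=value in an env file, replacing any existing line for that key.
# set_env_var is a hypothetical helper, not a Hermes command.
set_env_var() {
    file=$1
    key=$2
    value=$3
    touch "$file"
    chmod 600 "$file"                               # API keys should not be world-readable
    tmp=$(mktemp)
    grep -v "^${key}=" "$file" > "$tmp" || true     # keep every other line
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    cat "$tmp" > "$file"                            # rewrite in place, keeping permissions
    rm -f "$tmp"
}

# Usage (path is an assumption; the key name comes from the table above):
# set_env_var "$HOME/.hermes/.env" OPENROUTER_API_KEY "your_key_here"
```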

Step 7: Test CLI Chat

hermes chat -q "Say hello in one word."

On the Celeron N3060, expect a 3-10 second delay for the first response (network round-trip plus local startup). Inference itself runs in the cloud, so the local cost is mostly the cold start of the Python runtime on first invocation, which the slow CPU and 5400 RPM HDD make worse.
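
To put your own numbers on the cold-start delay, time the same command twice in a row. A sketch assuming GNU date, which Ubuntu ships; time_ms is an illustrative helper name.

```shell
#!/bin/sh
# Run a command and print its wall-clock duration in milliseconds.
# Relies on GNU date's %N (nanoseconds). The command's output and exit
# status are discarded; only the elapsed time is reported.
time_ms() {
    start=$(date +%s%N)
    "$@" >/dev/null 2>&1 || true
    end=$(date +%s%N)
    echo $(( (end - start) / 1000000 ))
}

# Usage: the first call pays the cold-start cost; the second should benefit
# from files already sitting in the page cache.
# time_ms hermes chat -q "Say hello in one word."
# time_ms hermes chat -q "Say hello in one word."
```

Both figures include the network round-trip to the model, so the difference between the two runs is the interesting part.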

If it works, you have a functioning AI agent. On an £80 laptop. In a shed.

Step 7b: Test Local Model Selection Honestly

This was the important experiment. ChatGPT could add a small Ollama model as a local selection inside Hermes. On paper, that sounds like the dream: old laptop, local model, private agent.

In practice, the model only had a roughly 40k context window, and even a direct ollama run query involved a long wait. The first Hermes query then hit the same old Ollama service problem: the GPT agent started a second Ollama service, the two services fought over the runtime, and the request produced no response.

After fixing the Ollama service conflict, the likely remaining problem was simpler: Hermes sends enough startup context, tools, memory and instructions that the local model timed out before producing a useful answer.
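
The duplicate-service conflict described above is easy to check for before pointing Hermes at a local model: see whether something is already listening on Ollama's default port, 11434, before anything starts another server. A sketch that parses ss output; count_listeners is an illustrative helper name.

```shell
#!/bin/sh
# Count LISTEN sockets on a given port, reading `ss -ltn` output from stdin.
count_listeners() {
    awk -v port="$1" \
        '$1 == "LISTEN" && $4 ~ (":" port "$") { n++ } END { print n + 0 }'
}

# 11434 is Ollama's default port.
if command -v ss >/dev/null 2>&1; then
    n=$(ss -ltn | count_listeners 11434)
    if [ "$n" -gt 0 ]; then
        echo "Something is already listening on 11434; do not start a second Ollama"
    else
        echo "Port 11434 is free"
    fi
fi

# To see every running Ollama process with its command line:
# pgrep -a ollama
```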

That is not a failure of the project. It is the result we needed.

This laptop is not the brain. It is the always-on body for the agent. It can host Hermes, expose tools to the local machine, keep memory, run jobs, and let you test what a real personal agent feels like. The model can live somewhere else: ChatGPT OAuth, Ollama Cloud, OpenRouter, or — next — a stronger local box like the PGX.

Step 8: Connect Telegram

This turns the thin client into an always-available personal agent you message from your phone:

# Set your Telegram bot token
hermes config edit  # Add to .env:
# TELEGRAM_BOT_TOKEN=your_token_here
# TELEGRAM_ALLOWED_USERS=your_telegram_user_id

# Run the setup wizard for gateway
hermes gateway setup

Then install as a systemd service so it starts on boot:

hermes gateway install
hermes gateway start

Verify it's running:

hermes gateway status
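
If the gateway starts but the bot never answers, the token is the first thing to rule out. The sketch below does a loose format check locally, then shows how to ask Telegram's Bot API directly via its getMe method. token_looks_valid is an illustrative helper, and its pattern is a rough sanity check of my own rather than an official specification.

```shell
#!/bin/sh
# Loose format check for a Telegram bot token: digits, a colon, then a long
# secret part. This only catches obvious paste errors and placeholders.
token_looks_valid() {
    printf '%s\n' "$1" | grep -Eq '^[0-9]+:[A-Za-z0-9_-]{30,}$'
}

# Usage, with TELEGRAM_BOT_TOKEN exported in the current shell:
# token_looks_valid "$TELEGRAM_BOT_TOKEN" || echo "Token looks malformed"
#
# A live check against the Bot API (returns JSON containing "ok":true on success):
# curl -fsS "https://api.telegram.org/bot${TELEGRAM_BOT_TOKEN}/getMe"
```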

Step 9: Enable Only the Tools You Need

On 8 GB RAM, we have reasonable headroom but the CPU is still the bottleneck. Be selective:

hermes tools

Recommended toolset for the thin client:

Toolset	Keep?	Reason
terminal	✅	Core — run commands
file	✅	Read/write files
web	✅	Search, extract (lightweight)
memory	✅	Persistent notes
skills	✅	Load procedures
clarify	✅	Ask user questions
cronjob	✅	Scheduled tasks
browser	❌	Too heavy — Chromium needs 500 MB+ and will crawl on Celeron
image_gen	❌	No GPU
tts	⚠️	Works but slow — Edge TTS is lightweight
vision	⚠️	Needs API key (no local model)
code_execution	⚠️	Useful but Python sandbox uses RAM

Step 10: Verify Everything Works

Send yourself a message on Telegram. Ask Hermes to:

Run uname -a — confirms tools work
Search the web for "Celeron N3060 benchmark" — confirms web access
Save a note to memory — confirms persistence

If all three work, you're done.

What About More Upgrades?

We started at 4 GB, already upgraded to 8 GB. Here's what else could help:

Upgrade	Cost	Impact
SATA SSD upgrade	~£20-£30	Faster boot, Python startup, logs and swap than the current 1 TB 5400 RPM HDD
USB 3.0 external SSD (500 GB)	~£30	Fast project storage without opening the case
PGX / stronger local box	Serious money	Use the HP as the mobile/control surface and let the PGX do the heavy local inference

Total practical investment so far: £80 (laptop) + £15 RAM upgrade = £95 for a dedicated, always-on AI agent box. Add an SSD later if you want snappier local system performance, but do not buy one expecting it to turn the Celeron into a useful local LLM machine.

The Honest Take

This machine is at the low end of what can run Hermes, but with 8 GB RAM it's genuinely workable as an agent host. The Celeron N3060 is the bottleneck, not the RAM — and cloud inference hides most of that pain:

Slow first response (3-5s cold start, 1-2s warm) — noticeably better than 4 GB would have been
Local inference is not practical — local model selection worked, but the context window, latency and timeout behaviour made it unusable inside Hermes
Cloud inference works — GPT-5.5 via ChatGPT OAuth behaved as intended, and Ollama subscription access to 31 cloud models gives another practical model pool
Comfortable RAM — can run browser OR code_execution, just not both simultaneously
Fan noise — the Celeron will spin up under sustained use, but it's a quiet machine overall
It's always on — £0 in cloud compute when you're asleep, unlike a VPS

The real win: £95 total cost (laptop + RAM) for a personal agent host with full access to a local PC: terminal, files, memory, scheduled jobs, and messaging. The model does not need to live on the laptop for the agent to be useful.

That also makes this a mobile platform for the next step: connect the thin client to the PGX and let the bigger machine handle the heavy local inference while the HP remains the cheap, portable control surface.

That's not a toy. That's infrastructure. Or at least the beginning of it.

Next Steps

Run the install and document the first real errors
Test ChatGPT OAuth / GPT-5.5 as the working model route
Test local Ollama model selection and confirm why it is not practical on this hardware
Test 24-hour uptime stability
Connect the HP thin client to the PGX as a mobile/control platform
Write the follow-up with real PGX-connected results

Found this useful? Follow @Raf_VRS for more from the VRS Computing trenches.