Build Journal

Day 1: The Box Arrives

The Alienware box arrived: RTX 5070 Ti, 64GB RAM, Ubuntu 24.04, Ollama, and the first gap between “it runs” and “it runs well.”

2026-04-11 · 2 min read

The hardware

It started simply enough: a pre-built Alienware Aurora ACT1250 sitting on the desk. The specs looked solid on paper:

Component	Spec
CPU	Intel Core Ultra 7 265KF
Memory	64GB DDR5
GPU	NVIDIA RTX 5070 Ti (16GB VRAM)
Storage (OS)	Stock NVMe (came with the machine)
Storage (test bench)	Samsung 990 Evo Plus 2TB PCIe
OS	Ubuntu 24.04 LTS

The extra Samsung 990 Evo Plus was intentional -- a dedicated 2TB NVMe drive for unboxing and testing AI models, tools, and experiments without risking the main OS partition. When you're pulling 5-15GB model files and running destructive benchmarks, you want a scratch disk that doesn't share a filesystem with /home. It also means wiping and starting fresh is a 30-second operation, not an afternoon of backup anxiety.

The dream? Run production-grade AI locally and stop paying cloud providers per token. The machine was barely warm when I started asking the real question:

Can a single consumer GPU actually replace cloud AI?

Installing Ollama

The first step was Ollama -- the easiest way to get local models running. One curl pipe, a few minutes, and I had a working LLM server on localhost:11434.

curl -fsSL https://ollama.ai/install.sh | sh
ollama pull gemma4:e4b

The first time a model responds from your own hardware is a strange feeling. It's fast -- really fast. No network round trip, no API key, no billing. Just you and the silicon.
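For anyone who wants to talk to that local server from a script rather than the CLI, here is a minimal sketch using only the Python standard library. It assumes Ollama's /api/generate endpoint on the default port and the model tag pulled above; the helper names build_payload and generate are my own, not part of Ollama.

```python
import json
import urllib.request

# Default local address from the post; the /api/generate path is assumed here.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(model: str, prompt: str) -> dict:
    """Build a non-streaming request body for the generate endpoint."""
    return {"model": model, "prompt": prompt, "stream": False}


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send one prompt to the local server and return the generated text."""
    body = json.dumps(build_payload(model, prompt)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL,
        data=body,  # supplying data makes this a POST
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read())["response"]
```

With the server running, generate("gemma4:e4b", "Say hello in five words.") should return the model's reply as a plain string. Setting "stream" to False keeps the sketch simple by asking for one JSON object instead of a stream of chunks.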

The first reality check

But then things got complicated fast:

Some models don't fit in 16GB VRAM -- you need to be strategic
Model names are confusing (what even is gemma4:e4b?)
There's no obvious way to compare models for your actual use case
The default Ollama API has quirks (more on that in a later post)
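The "does it fit in 16GB" question can be roughed out with back-of-the-envelope arithmetic before downloading anything. The sketch below is a rule of thumb, not a measurement: it assumes weight size is roughly parameter count times bits per weight, and the 2 GB overhead for context cache and runtime is a placeholder I picked for illustration. Real usage depends on context length and the runtime.

```python
def weight_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate size of the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billions * bits_per_weight / 8


def fits_in_vram(
    params_billions: float,
    bits_per_weight: float,
    vram_gb: float = 16.0,      # the RTX 5070 Ti in this build
    overhead_gb: float = 2.0,   # assumed headroom, purely illustrative
) -> bool:
    """Return True if the estimated weights plus overhead fit in VRAM."""
    estimated = weight_size_gb(params_billions, bits_per_weight) + overhead_gb
    return estimated <= vram_gb


# An 8B model at 4 bits per weight is about 4 GB of weights.
print(weight_size_gb(8, 4))   # 4.0
# A 70B model at 4 bits is about 35 GB of weights, well past 16 GB.
print(fits_in_vram(70, 4))    # False
```

Treat a True here as "worth trying" rather than a guarantee; a model that fits on paper can still spill to system RAM once the context grows.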

The biggest realisation: "it responds" and "it responds well" are very different things. A model that takes 78 seconds to answer is technically working. It's also completely unusable for interactive workflows.
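One way to put numbers on that gap is to read the timing fields that come back with a response. This is a sketch under assumptions: it expects a response dictionary with eval_count, eval_duration and total_duration, with durations in nanoseconds, as I understand Ollama's non-streaming responses to report them. The 10-second budget is an arbitrary threshold of mine, not a standard.

```python
def tokens_per_second(response: dict) -> float:
    """Generated tokens divided by generation time (durations in ns)."""
    seconds = response["eval_duration"] / 1e9
    if seconds == 0:
        return 0.0
    return response["eval_count"] / seconds


def total_seconds(response: dict) -> float:
    """Wall-clock time for the whole request, in seconds."""
    return response["total_duration"] / 1e9


def is_interactive(response: dict, budget_seconds: float = 10.0) -> bool:
    """True if the request finished within the (assumed) latency budget."""
    return total_seconds(response) <= budget_seconds


# Hypothetical numbers shaped like the 78-second case above.
sample = {
    "eval_count": 100,
    "eval_duration": 2_000_000_000,
    "total_duration": 78_000_000_000,
}
print(tokens_per_second(sample))  # 50.0
print(is_interactive(sample))     # False
```

The sample shows why a single metric misleads: generation speed can look healthy while total time is dominated by model loading or prompt processing.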

What I wanted

I wasn't building a chatbot. I wanted an AI agent -- something that could:

Read files, run commands, edit code
Make decisions about which model to use for which task
Run automated cron jobs without burning tokens
Stay secure and local by default
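As a small illustration of what "local by default" could mean in code, the sketch below refuses any endpoint that is not on the loopback interface. It is my own guess at one possible guard, not the approach this journal ends up using, and the function name is_local_endpoint is hypothetical.

```python
import ipaddress
from urllib.parse import urlparse


def is_local_endpoint(url: str) -> bool:
    """Return True only if the URL's host is localhost or a loopback IP."""
    host = urlparse(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Any other hostname is treated as non-local.
        return False


print(is_local_endpoint("http://localhost:11434"))      # True
print(is_local_endpoint("https://api.example.com/v1"))  # False
```

An agent could run a check like this before every model call, so that sending data off the machine has to be an explicit choice rather than a misconfigured URL.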

That last point became way more important than I expected. But that's the next post.
