Build Journal

How ChatGPT Images 2.0 Finally Got Our Logo Right (After 50+ Failed Attempts)

After 50+ failed attempts across Stable Diffusion, Flux, and Claude, ChatGPT Images 2.0 nailed the VRS logo in just 8 prompts — then reverse-engineered prompts for every other model.

2026-04-22 · 10 min read

For months I've been on a quest to nail the VRS Computing logo. I wanted something that felt both cerebral and tangible: a brain woven from circuits, with a CPU at its heart proudly displaying the VRS initials. I tried over fifty variations across different image generation models: Stable Diffusion 1.5, Flux (schnell), Claude's image capabilities, even earlier versions of ChatGPT's image tool. Each attempt fell short: the brain looked like a blob, the circuits felt pasted on, the text came out garbled, or the overall vibe missed the "Stop Scrolling. Start Building." mark.

Then my internal content curation app (the name's under wraps for now; if you're interested, follow @Raf_VRS, where it'll be released first) surfaced OpenAI's "Introducing ChatGPT Images 2.0" announcement. Sceptical but hopeful, I dove in.

My first prompt was simple: I want a front version of a brain made of circuits and nodes. in the middle a cpu with engraved letter VRS is visible. make it transparent

Within eight prompts — yes, just eight — ChatGPT Images 2.0 delivered the exact version you see below. The breakthrough? Its text editing capability. For the first time, an AI image model reliably rendered "VRS COMPUTING" beneath the emblem without morphing the letters into hieroglyphs. The brain's lobes retained their organic feel while the circuit overlays looked like they'd grown there naturally. The CPU centrepiece, complete with the engraved VRS, sat crisp against a clean white background.

The Full Benchmark

Every model was tested against the same brief: generate a front-facing brain made of circuits with a VRS CPU at its centre.

REF — ChatGPT Images 2.0 (Human Iterative, 8 Prompts)

ChatGPT Images 2.0 — the gold standard

Score: 🏆 8.6/10. Best overall — excellent text and crisp detail. The benchmark every other result is measured against. Not a perfect 10 though: the brain is slightly more stylised than organic, the white background limits logo versatility, and the high detail complexity limits small-scale logo use.

T1 — Flux.1-schnell (Direct, ChatGPT Generic Prompt)

Flux schnell — correct text but no transparent background

Score: 8.1/10. Text was correct: "VRS COMPUTING" rendered cleanly on the first try. Brain shape (9/10) and CPU centrepiece (9.5/10) are both excellent, and the brain circuitry looks sharp and integrated. The missing transparent background (4/10) is the main drag, and it's critical for a logo that needs to work on any surface.

T2 — SD 1.5 (Direct, ChatGPT SD1.5 Prompt)

SD 1.5 result — gibberish text and triple CPU chips

Score: 4/10. The brain silhouette from the side profile is recognisable (8/10 shape), earning slightly more credit than a total failure. But text came out as gibberish (0/10), the "circuits" are tangled neural wires rather than integrated PCB traces (5/10), and there's no clear CPU centrepiece (2/10). Fundamentally failed to produce a usable logo result for comparison.

T4 — Claude → Flux (Claude Writes Prompt, Flux Renders)

Claude→Flux — a language model's prompt beats dedicated image generators

Score: 7.6/10. Best seeds produce great results — brain shape (8.5/10), CPU (9/10), text (8/10) all strong. Dark opaque background (4/10) and the mechanical/less-organic brain stylisation are the main drags. Seed variation can cause brain↔cloud confusion.

T5 — Claude.ai Native (Human Chat Interface)

Claude.ai native output — cloud hybrid, flat and jagged

Score: 5.2/10. The brain shape reads as blobby (5/10) — more jellyfish than cerebellum. Circuits are overlaid rather than integrated (7/10). Credit where it's due — Claude.ai native captured all three required elements: the brain shape, the CPU centrepiece, and readable "VRS COMPUTING" text. But "COMPUTING" is faint and blurry (4/10 text rendering), and the overall execution is flat and jagged. Solid white background limits versatility.

T6 — Nemotron → Flux (Nemotron Writes Prompt, Flux Renders)

Nemotron→Flux — a text model's prompt beat dedicated image generators

Score: 7.3/10. 🤯 Best local score! Brain shape (8.5/10), CPU (9/10), and text (7.5/10) are all strong — remarkably close to the ChatGPT reference. Main drags: opaque dark background (3/10) and the high complexity limits small-scale logo use. Some seeds produce cloud shapes instead of brains, but the best result is impressive.

T7 — SDXL (Local GPU)

SDXL — stronger than SD1.5 but still struggles with text

Score: 3.2/10. This isn't an isolated brain logo — it's a full humanoid head illustration with a translucent face, shoulders, and atmospheric purple glow. The brain shape (8/10) is recognisable inside the head, but the "circuits" are more atmospheric neural lines (6/10) than integrated PCB traces. No CPU centrepiece (0/10), no "VRS COMPUTING" text (0/10), and a full illustrated background (1/10). Fundamentally missed the brief. ChatGPT's suggestion that SDXL would be "much stronger than SD1.5" proved optimistic — it's arguably worse for a logo task.

T8 — FLUX.1-dev GGUF Q8_0 (Local GPU)

FLUX.1-dev — best local model result

Score: 7/10. The highest-quality local result on paper — brain shape (8.5/10), CPU (8/10), and text (7/10) are all solid. But a text artefact "#53AB7" appeared alongside "VRS COMPUTING" on the chip, and circuit integration is concentrated near the CPU rather than throughout the brain folds (6.5/10). Dark opaque background (3/10). With stronger prompt engineering (ALL CAPS for key terms), text rendering approached ChatGPT quality. 50-step generation took ~60 seconds on an RTX 5070 Ti.

T9 — Grok (X Web Chat)

Grok output — decent composition, weak text rendering

Score: 8/10. Much stronger than initially assessed. Dense, well-integrated circuitry throughout the brain (9/10) — circuits feel like they're actually building the brain structure, not just layered on top. CPU centrepiece is clear (8.5/10) with readable "VRS" and "COMPUTING" text (8/10). The checkerboard pattern in the background suggests possible transparency (6.5/10). The only real drag is the high complexity which may limit small-scale logo use.

T10 — Copilot / Microsoft (Web Chat, ChatGPT Prompt)

Copilot output — over-saturated, messy circuits

Score: 8.2/10. Massively underrated on first pass. Clean, symmetrical, and well-structured — the brain shape (8/10) has a recognisable two-hemisphere contour, circuit traces are well integrated and radiate naturally from the CPU (8.5/10), and the metallic CPU centrepiece is sharp and dominant (9/10). "VRS" and "COMPUTING" text are both legible (8/10). The solid white background (5/10) limits versatility but the execution is nowhere near "over-saturated and messy" — it's one of the cleanest results in the whole benchmark.

T11 — Gemini (Google Web Chat)

Gemini output — clean but bland, missed circuitry detail

Score: 5/10. Clean and well-composed but bland. Circuitry detail was largely absent — more illustration than logo. The rectangular (non-square) format and JPG output (no transparency) limit logo versatility. It captured the brain shape decently but without the circuit-level detail or strong CPU focus that defines the benchmark.

B1 — Flux.1-schnell 1:1 Reproduction (ChatGPT Prompt Verbatim)

Flux 1:1 reproduction — CLIP truncation killed the detail

Score: 6/10. ChatGPT's full reproduction prompt was too detailed. Flux's CLIP text encoder truncates input at 77 tokens, which most likely cost the render its detail, and the result scored lower than the shorter generic prompt. Less is more with Flux.
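A rough way to catch this before rendering is to estimate a prompt's length against the 77-token CLIP window. The sketch below is an approximation only: it counts words and punctuation marks rather than running the real CLIP tokenizer, so it tends to undercount. `estimate_tokens` and `fits_clip_window` are hypothetical helper names, and the two reserved slots assume the start and end markers count toward the 77.

```python
import re

CLIP_WINDOW = 77  # context length of the CLIP text encoder
RESERVED = 2      # assumption: start/end markers occupy two of those slots


def estimate_tokens(prompt: str) -> int:
    """Rough estimate: one token per word or punctuation mark (tends to undercount)."""
    return len(re.findall(r"\w+|[^\w\s]", prompt))


def fits_clip_window(prompt: str) -> bool:
    """True if the estimated token count fits inside the usable CLIP window."""
    return estimate_tokens(prompt) <= CLIP_WINDOW - RESERVED


# Illustrative prompts, not the ones used in the benchmark.
short_prompt = "front-facing brain made of circuits, central CPU engraved VRS, transparent background"
long_prompt = ", ".join(["intricate metallic circuitry with glowing nodes"] * 20)

print(estimate_tokens(short_prompt), fits_clip_window(short_prompt))
print(estimate_tokens(long_prompt), fits_clip_window(long_prompt))
```

If a prompt fails the check, trimming adjectives and putting the must-have elements (brain, CPU, text) first is the cheapest thing to try.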

The Reverse Prompting Comparison

Here's the key evidence. Both Claude and Nemotron wrote prompts for Flux — and both prompts beat the handcrafted approach.

Claude-generated prompt (T4):

Front-facing symmetrical human brain logo made of intricate metallic circuitry and glowing neural pathways, transparent background, detailed circuit traces in brushed steel with hundreds of luminous nodes glowing in electric violet #534AB7 with blue highlights, central CPU microchip with polished steel bevelled frame and corner screws, "VRS" engraved in large letters on chip with "COMPUTING" in smaller text below, perfectly centred typography, hyper detailed sci-fi branding, premium technology aesthetic, 8K resolution, professional logo design

Negative: asymmetrical, cropped, blurry text, misspelled words, organic brain tissue, pink/red colours, cluttered background, low resolution, amateur design, cartoon style, hand-drawn, sketchy lines, uncentred text, missing typography, realistic photography, people, faces, body parts other than brain

Nemotron-generated prompt (T6):

front-facing symmetrical human brain made of intricate metallic circuitry, hundreds of luminous nodes and glowing neural pathways in electric violet #534AB7 with blue highlights, transparent background, central brushed steel CPU microchip with bevelled frame and corner screws, large engraved "VRS" on chip, smaller "COMPUTING" perfectly centred below, hyper detailed, premium sci‑fi branding aesthetic, high resolution, crisp vector‑style lines

Negative: low quality, blurry, noisy, jpeg artefacts, misaligned or extra text, background elements, gradients, shadows, watermark, logo clutter, oversaturation, dull colours, asymmetry, missing circuitry, missing nodes, low detail, distorted proportions

Notice how different the styles are. Claude writes like a creative director — rich adjectives, brand-oriented phrasing. Nemotron writes like an engineer — concise, technical, specification-focused. Same brief, two entirely different languages for Flux.
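One way to make the style difference concrete is to split each prompt into its comma-separated clauses and compare lengths. This is a minimal sketch with the two positive prompts copied from above; `clause_stats` is a hypothetical helper, and word count is only a crude proxy for style.

```python
CLAUDE_PROMPT = (
    "Front-facing symmetrical human brain logo made of intricate metallic "
    "circuitry and glowing neural pathways, transparent background, "
    "detailed circuit traces in brushed steel with hundreds of luminous "
    "nodes glowing in electric violet #534AB7 with blue highlights, "
    "central CPU microchip with polished steel bevelled frame and corner "
    'screws, "VRS" engraved in large letters on chip with "COMPUTING" in '
    "smaller text below, perfectly centred typography, hyper detailed "
    "sci-fi branding, premium technology aesthetic, 8K resolution, "
    "professional logo design"
)

NEMOTRON_PROMPT = (
    "front-facing symmetrical human brain made of intricate metallic "
    "circuitry, hundreds of luminous nodes and glowing neural pathways in "
    "electric violet #534AB7 with blue highlights, transparent background, "
    "central brushed steel CPU microchip with bevelled frame and corner "
    'screws, large engraved "VRS" on chip, smaller "COMPUTING" perfectly '
    "centred below, hyper detailed, premium sci-fi branding aesthetic, "
    "high resolution, crisp vector-style lines"
)


def clause_stats(prompt: str) -> dict:
    """Count comma-separated clauses and whitespace-separated words."""
    clauses = [c.strip() for c in prompt.split(",") if c.strip()]
    words = prompt.split()
    return {
        "clauses": len(clauses),
        "words": len(words),
        "words_per_clause": round(len(words) / len(clauses), 1),
    }


for name, prompt in [("Claude", CLAUDE_PROMPT), ("Nemotron", NEMOTRON_PROMPT)]:
    print(name, clause_stats(prompt))
```

Both prompts break into ten clauses, but Claude's uses more words per clause, which fits the creative-director versus engineer reading.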

Scorecard Summary

#	Model	Method	Score	Key Finding
REF	ChatGPT Images 2.0	Human iterative (8 prompts)	🏆 8.6/10	Best overall. Excellent text and detail. Slightly stylised brain; white bg
T1	Flux.1-schnell (direct)	ChatGPT generic prompt	8.1/10	Strong all-round. Text correct! No transparent bg
T2	SD 1.5 (direct)	ChatGPT SD1.5 prompt	4/10	Text gibberish. Tangled wires not circuits. Failed comparison
T4	Claude→Flux	Claude writes prompt → Flux renders	7.6/10*	Best seeds great, some = cloud shape. Dark bg
T5	Claude.ai native	Human chat interface	5.2/10	Brain+CPU+text all present. Blobby brain, flat execution
T6	Nemotron→Flux	Nemotron writes prompt → Flux renders	7.3/10*	🤯 Best local score! Some seeds = cloud. Dark bg
T7	SDXL	SDXLPipeline (local GPU)	3.2/10	Full head illustration. No CPU, no text. Missed brief
T8	FLUX.1-dev	GGUF Q8_0 (local GPU)	7/10	Solid but "#53AB7" text artefact. Circuits concentrated near CPU
T9	Grok (X)	Web chat	8/10	Dense integrated circuits. Readable text. Possible transparency
T10	Copilot (Microsoft)	Web chat (ChatGPT prompt)	8.2/10	Clean, strong CPU, readable text. One of the cleanest results
T11	Gemini (Google)	Web chat	5/10	Clean but bland, missed circuitry. Rectangular JPG, no transparency
B1	Flux.1-schnell (1:1)	ChatGPT reproduction prompt	6/10	CLIP truncation at 77 tokens killed detail

*Best seed only. Seed variation causes brain↔cloud confusion.
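For anyone who wants to re-sort or filter the scorecard, the sketch below holds the overall scores from the table as plain data and ranks them. The numbers are copied from the table above; the labels and the `ranked` helper are illustrative only.

```python
# Overall scores from the scorecard (out of 10).
# Entries marked "best seed" correspond to the starred rows.
SCORES = {
    "REF ChatGPT Images 2.0": 8.6,
    "T1 Flux.1-schnell (direct)": 8.1,
    "T2 SD 1.5 (direct)": 4.0,
    "T4 Claude->Flux (best seed)": 7.6,
    "T5 Claude.ai native": 5.2,
    "T6 Nemotron->Flux (best seed)": 7.3,
    "T7 SDXL": 3.2,
    "T8 FLUX.1-dev": 7.0,
    "T9 Grok (X)": 8.0,
    "T10 Copilot (Microsoft)": 8.2,
    "T11 Gemini (Google)": 5.0,
    "B1 Flux.1-schnell (1:1)": 6.0,
}


def ranked(scores: dict) -> list:
    """Return (label, score) pairs sorted from highest to lowest score."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


for place, (label, score) in enumerate(ranked(SCORES), start=1):
    print(f"{place:>2}. {label:<32} {score:.1f}")
```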

The Key Finding

The model that CAN'T draw beat the model that CAN — but web chat models matched it.

Nemotron (text-only, free, 120B MoE) wrote a Flux prompt that scored 7.3/10 — higher than:

SD 1.5 directly (4/10 — complete failure)
Claude's native image generation (5.2/10)
SDXL locally (3.2/10 — another failure)
ChatGPT's 1:1 reproduction prompt (6/10 — truncated)

But the revised scoring revealed two surprises:

Copilot (8.2/10) and Grok (8/10) both outscored the Nemotron→Flux pipeline entirely on their own — no prompt engineering, no text model intermediary, just a single web chat prompt.
The Reverse Prompting advantage is real for local pipelines, but the free web chat interfaces deliver competitive results with zero setup.

This supports the Reverse Prompting concept: text models are better at describing images than image models are at generating them from short prompts. It also shows that the best web chat image generators have closed the gap significantly.

Secondary Findings

CLIP 77-token truncation — Flux schnell's CLIP text encoder caps at 77 tokens (its second encoder, T5, accepts longer input). ChatGPT's 1:1 reproduction prompt was too detailed, overran the CLIP window, and scored LOWER than the shorter generic prompt. Less is more with Flux.
Seed variance — The same prompt on different seeds can produce brain OR cloud shapes. Reproducibility requires seed selection AND prompt tuning.
Claude API can't generate images — Only claude.ai has image generation. The API is text-only. Scored 5.2/10 natively.
FLUX.1-dev GGUF Q8_0 — A solid local result (7/10), but the "#53AB7" text artefact on the chip and circuit integration limited to the CPU area mean it's not the runaway winner it first appeared. With stronger prompt engineering (ALL CAPS for key terms), text rendering approached ChatGPT quality. The 50-step generation took ~60 seconds on an RTX 5070 Ti.
ChatGPT was honest about SD1.5 — It warned text generation is weak, recommended generating without text then overlaying. Confirmed by my test.
Prompt style matters by model — Flux loves keyword-dense prompts, SD1.5 needs weighted tokens, Claude wants structured intent and hierarchy, and Nemotron prefers technical, concise phrasing. Same brief, four different languages.
Web chat models are competitive — Copilot (8.2/10) and Grok (8/10) both outscored the local Nemotron→Flux pipeline with zero prompt engineering. The first-pass assessment massively underrated them, likely because the initial focus was on local, reproducible pipelines. The free web chat tools have quietly become viable options for logo generation.
ChatGPT isn't perfect, yet — Despite being the clear winner at 8.6/10, the reference image has a slightly more stylised/less-organic brain, and its high detail complexity limits small-scale logo use. The gap between ChatGPT and the best free alternatives (Copilot at 8.2, Grok at 8) is only 0.4-0.6 points, much narrower than the "gold standard" label suggests. Where it does excel is conversational editing, which can change specific parts of an image while leaving the rest alone.
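The seed-variance finding suggests treating seed choice as a small search. The sketch below is a generic best-of-N loop, not the pipeline used for this benchmark: `render` and `score` are hypothetical callables you would supply yourself (for example a Flux call and a manual or automated rating), and the stand-ins at the bottom exist only so the example runs without a GPU.

```python
from typing import Callable, Iterable, Tuple, TypeVar

Image = TypeVar("Image")


def best_seed(
    prompt: str,
    seeds: Iterable[int],
    render: Callable[[str, int], Image],
    score: Callable[[Image], float],
) -> Tuple[int, float]:
    """Render the prompt once per seed and return (seed, score) for the best result."""
    best = None
    for seed in seeds:
        image = render(prompt, seed)
        value = score(image)
        # Keep the first seed that reaches the highest score seen so far.
        if best is None or value > best[1]:
            best = (seed, value)
    if best is None:
        raise ValueError("no seeds supplied")
    return best


# Stand-ins (assumptions for illustration): even seeds "draw" a brain,
# odd seeds "draw" a cloud, mimicking the brain/cloud confusion.
def fake_render(prompt: str, seed: int) -> str:
    return "brain" if seed % 2 == 0 else "cloud"


def fake_score(image: str) -> float:
    return 1.0 if image == "brain" else 0.0


seed, value = best_seed("brain made of circuits", range(1, 6), fake_render, fake_score)
print(seed, value)
```

Recording the winning seed alongside the prompt is what makes a good result repeatable later.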

What This Means for Independent AI

You don't need a £20/month subscription to create professional logos. With reverse prompting, a free text model can write prompts that get you 80-90% of the way there on local hardware. The last 10% — that final polish — is where a tool like ChatGPT Images 2.0 earns its subscription.

But here's the thing: most builders don't need that last 10%. Most builders need a logo that works on a website, a business card, and a T-shirt. The Nemotron→Flux pipeline delivers that for £0.00 on hardware you already own.
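The 80-90% figure can be sanity-checked against the scorecard by dividing each score by the ChatGPT reference score. This is a minimal sketch using the overall scores reported above; treating the score ratio as "percent of the way there" is one reading of the claim, not a definition from the benchmark, and `percent_of_reference` is a hypothetical helper.

```python
REFERENCE = 8.6  # ChatGPT Images 2.0 overall score

# Overall scores for the Flux-based pipelines, copied from the scorecard.
pipelines = {
    "Claude->Flux (best seed)": 7.6,
    "Nemotron->Flux (best seed)": 7.3,
    "FLUX.1-dev GGUF Q8_0": 7.0,
}


def percent_of_reference(score: float, reference: float = REFERENCE) -> int:
    """Express a score as a whole-number percentage of the reference score."""
    return round(100 * score / reference)


for name, score in pipelines.items():
    print(f"{name}: {percent_of_reference(score)}% of the reference score")
```

All three land between 80% and 90% of the reference score, with Nemotron→Flux at about 85%.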

Stop Scrolling. Start Building.

And remember: never settle for "almost right". Keep prompting.

Found this useful? 👉 Support the work: ko-fi.com/rafvrs

#IndependentAI #ChatGPTImages #StopScrollingStartBuilding