Opinion·6 min read

Fable 5 is back — and the arena receipts are worth reading

Anthropic’s Mythos-class model is usable again after the launch crunch. What 18 one-shot coding challenges, 3 compiled Godot games, and the community votes actually say about it.


For a stretch after launch, Fable 5 was the model everyone argued about and almost nobody could actually run. Capacity was rationed, rate limits hit fast and hard, and half the takes you read were written by people who had gotten maybe four responses out of it. I hit the limits constantly myself — this site literally got rebuilt in a session that started on Fable, got throttled onto Sonnet, and came back to Fable to finish.

That crunch is easing. Fable 5 is usable again for normal work, which means it is finally time to judge it on outputs instead of vibes. Conveniently, judging models on outputs is the only thing this site does.

What Fable 5 actually is

Fable 5 is the first model in Anthropic’s Mythos class, a tier that sits above Opus. It is also the most expensive Claude you can buy: $50 per million output tokens, double Opus 4.8 and more than three times Sonnet. One more detail people miss: in our roster of 19 variants, Fable is the only family with no thinking-effort dial. Opus and Sonnet each ship in variants from low through max effort; Fable ships one configuration and that is what you get.

The receipts: 21 challenges, same prompt as everyone else

Fable 5 ran 21 of our head-to-head challenges: 18 one-shot single-file builds (landing pages, arcade games, a raw-WebGL racer, a CHIP-8 emulator) and 3 full Godot 4 games built through an agentic pipeline. Every single-file entry came from the exact same one-shot prompt every other model received — the prompt is published under the Details panel of each task.

The token appetite is the first thing you notice, and it is not the simple story you expect. On the particle-physics sandbox, Fable burned 233,555 generation tokens where Opus 4.8 used 47,656, almost five times as many. On the CHIP-8 emulator it was 161,702 vs 54,822, roughly three times as many. At $50 per million, that appetite is real money.

But it is not uniformly hungry, and this is the part I find genuinely interesting: on the one-button cave copter, Fable used 47,687 tokens against Opus’s 125,395. On the launch-and-upgrade game: 38,573 vs 85,191. On castle defense: 61,313 vs 115,838. Fable seems to spend where the problem is hard and stay lean where it is not. Whether that judgment is worth the price premium is exactly the kind of question a blind vote answers better than a benchmark chart.
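If you want to turn those token counts into dollars, here is a rough sketch. It rests on two assumptions of mine: that generation tokens are billed at the output-token rate, and that Opus 4.8 costs $25 per million, which I derive only from the "double Opus" figure above. The names `RUNS`, `output_cost`, and `compare` are made up for this example, and input tokens are ignored entirely.

```python
# Assumed list prices in USD per million output tokens.
FABLE_PRICE = 50.0             # stated in this post
OPUS_PRICE = FABLE_PRICE / 2   # assumption: derived from "double Opus 4.8"

# (Fable tokens, Opus 4.8 tokens) as quoted in this post.
RUNS = {
    "particle-physics sandbox": (233_555, 47_656),
    "CHIP-8 emulator": (161_702, 54_822),
    "one-button cave copter": (47_687, 125_395),
    "launch-and-upgrade game": (38_573, 85_191),
    "castle defense": (61_313, 115_838),
}


def output_cost(tokens: int, price_per_million: float) -> float:
    """Output-only cost in USD; ignores input tokens and any discounts."""
    return tokens / 1_000_000 * price_per_million


def compare(fable_tokens: int, opus_tokens: int) -> dict:
    """Token ratio plus the assumed output cost for each model."""
    return {
        "token_ratio": fable_tokens / opus_tokens,
        "fable_usd": output_cost(fable_tokens, FABLE_PRICE),
        "opus_usd": output_cost(opus_tokens, OPUS_PRICE),
    }


for name, (fable, opus) in RUNS.items():
    r = compare(fable, opus)
    print(f"{name:26s} {r['token_ratio']:.2f}x tokens  "
          f"${r['fable_usd']:.2f} vs ${r['opus_usd']:.2f}")
```

With a 2x price gap, the break-even is simple: Fable comes out cheaper on output only when it uses fewer than half the tokens Opus does. Under these assumptions the cave copter and the launch-and-upgrade game land below that line, and castle defense lands just above it.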

The Godot runs impressed me most

Three challenges are not single files at all — they are real compiled Godot 4 WebAssembly games from an agentic pipeline where the model writes engine code, compiles, smoke-tests, and fixes its own build until validation passes. Fable rebuilt the marble platformer in 22 turns for $5.73 of full-pipeline cost and the endless runner in 17 turns for $5.45. The FPS gallery run is my favorite footnote: the generation process itself exited with an error flag — and the game it left behind was complete and passed validation cleanly anyway.
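For readers who have not seen an agentic build loop, the shape is roughly the following. This is a minimal sketch, not the pipeline's actual code: `run_pipeline`, `BuildResult`, and the two callables are hypothetical stand-ins for "ask the model for code" and "compile, smoke-test, validate", and the turn cap is an arbitrary choice of mine.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class BuildResult:
    ok: bool   # True once the build compiles and validation passes
    log: str   # compiler or smoke-test output fed back to the model


def run_pipeline(
    write_code: Callable[[Optional[str]], str],
    build_and_validate: Callable[[str], BuildResult],
    max_turns: int = 30,
) -> Optional[int]:
    """Loop until validation passes; return the turn count, or None on giving up."""
    feedback: Optional[str] = None
    for turn in range(1, max_turns + 1):
        source = write_code(feedback)         # model writes or repairs the code
        result = build_and_validate(source)   # compile, smoke-test, validate
        if result.ok:
            return turn
        feedback = result.log                 # next turn sees what broke
    return None


def demo() -> Optional[int]:
    """Toy run: a fake model whose third attempt passes a fake validator."""
    attempts = []

    def fake_model(feedback: Optional[str]) -> str:
        attempts.append(feedback)
        return f"attempt {len(attempts)}"

    def fake_validator(source: str) -> BuildResult:
        ok = source == "attempt 3"
        return BuildResult(ok=ok, log="" if ok else "smoke test failed")

    return run_pipeline(fake_model, fake_validator)


print(demo())
```

In this sketch, success is decided by the validator's verdict on the artifact alone. That is one way a run like the FPS gallery could report an error from the generation process and still leave behind a game that validates.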

As I write this

Fable leads the community tally on several of the original challenges — on the NEON BREAKER brick-breaker it currently holds more votes than any Opus variant. Tallies are live and move as people vote, so check the header numbers rather than trusting a blog post.

The honest gap

The 22 newest challenges (the fluid solver, cloth sim, traffic model, spreadsheet engine, chess with minimax, and friends) have not been run with Fable yet. That batch went to Opus, Sonnet, and Haiku first, and a Sonnet 5 backfill is landing right now. So no, you cannot yet see Fable hand-roll Navier-Stokes next to Opus. When those runs happen, they will show up in the same arena with the same rules.

My take

Is Fable 5 worth $50 per million output tokens? For boilerplate, obviously not — Haiku exists and is shockingly competent for a tenth of the price. But on the tasks where every model gets the same one shot and most of them ship something broken, Fable’s hit rate on the hard stuff is why I keep it in the default lineup. That is an opinion. The whole point of this site is that you do not have to take it.

Go run the comparison yourself: open the coding arena, pick Fable 5 against whatever you believe in, hit Compare blind, and vote before the labels reveal. The tally in the header is everyone’s votes combined — including the ones that disagree with me.
