Wrapping my head around Gas Town

I read Steve Yegge’s post announcing Gas Town, an LLM orchestrator which allows you to manage dozens of Claude Code instances at once while they make independent progress towards some stated goals. It was a wild ride, but I believe I see the promise.

Context

Two pieces of context seem useful before diving in. First, I worked with Steve briefly at Google. I remember him working on “stone tools”, a project to bring the features of an IDE to old editors like Emacs and Vim. This was 3 years before LSPs happened. All of which is to say, I think Steve has some degree of future sight, and I think that’s in play here. His Kool-Aid is good. I’ve drunk it before and likely will again.

Second, I’m at level 7 on Steve’s “Evolution of the Programmer” model, which he defines as:

10+ agents, hand-managed. You are starting to push the limits of hand-management.

I had some familiarity with beads prior to this based on a previous post of his. For those not familiar, it’s a git-based ticketing system that’s made to work with LLMs as a (the?) primary user. Inside, it’s a jsonl file with a line for each issue, plus a reasonable terminal interface.
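To make the “one line per issue” idea concrete, here’s a minimal sketch of reading such a file in Python. The field names (`id`, `title`, `status`) and the sample records are my own illustrative assumptions, not the actual beads schema.

```python
import io
import json

# Hypothetical sample data: these field names are illustrative assumptions,
# not the real beads schema.
SAMPLE_JSONL = """\
{"id": "demo-1", "title": "Add telemetry for attribute usage", "status": "open"}
{"id": "demo-2", "title": "Fix Jira component names", "status": "closed"}

{"id": "demo-3", "title": "Update k8s platform docs", "status": "open"}
"""


def load_issues(stream):
    """Parse one JSON object per non-blank line."""
    issues = []
    for line in stream:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        issues.append(json.loads(line))
    return issues


def open_issue_ids(issues):
    """Return ids of issues whose (assumed) status field is 'open'."""
    return [issue["id"] for issue in issues if issue.get("status") == "open"]


issues = load_issues(io.StringIO(SAMPLE_JSONL))
print(open_issue_ids(issues))
```

A plausible appeal of the format is that each issue lives on its own line, so a change to one issue shows up as a one-line diff in git.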

I would consider myself a significant user of Claude Code (~60k LOC / month for the past several months), but not a particularly advanced one. I can make my own /slash-commands and skills, but I’ve not played with --teleport or --resume functionality at all.

At the time of writing, I have 9 active threads of work, each with an associated agent. They’re not working continuously, but that’s not by choice. It’s a function of support and tooling, which Gas Town intends to solve. For visibility, those threads are:

Churn through available data to compile my year in review, including things I should be publicly writing about. (idle)
https://github.com/justinabrahms/llm-session-sharer/ - waiting on me to say “go implement the next spec” that’s up for PR
Convert a small service from Go to Python because too few people know Go at my job
Gas Town trial, pending work to be given to it
Service which will call 2 downstreams and compare return values. Pending review from a colleague
Documentation updates for our internal k8s platform
Iterating on automating SOX audit log reporting
Fix the component names in a Jira project so I don’t have to hand-edit like 30 of them
Add telemetry to an existing service for an attribute whose usage frequency we don’t know

There are more things I could be working on. We’ll get to that.

First impressions

I stood up Gas Town and managed to complete a little actual work with it (single rig, no convoy), and it broke badly enough that I had to restart it once. To its credit, it did start working right where it left off when I brought it back up.

My first impression was “These metaphors are all wrong”. I don’t know the Mad Max universe very well. I don’t understand what the underlying meaning is. So, I had an LLM generate a decoder ring. It’s pretty good.

Gas Town Term	Business Equivalent	Description
Town	Organization/Enterprise	The entire workspace containing all projects and resources
Mayor	Executive/Project Manager	Coordinates work across teams, makes strategic decisions, delegates but doesn’t implement
Rig	Business Unit/Product Team	Self-contained project with its own resources, workflows, and team members
Polecat	Individual Contributor/Contractor	Ephemeral worker that executes specific tasks, disposable and replaceable
Witness	Team Lead/Supervisor	Monitors worker health, handles lifecycle management, nudges stuck workers
Refinery	CI/CD Pipeline/Integration Team	Handles merging completed work, resolves conflicts, maintains quality gates
Beads	Tickets/Work Items	Discrete trackable units of work (like Jira tickets or GitHub issues)
Convoy	Sprint/Project Batch	Grouped related work items moving through the pipeline together
Hook	Inbox/Assignment Queue	Where work lands for a specific worker - their personal task assignment
Molecule/Wisp	SOP/Runbook	Policy documents and checklists that guide common workflows
GUPP	SLA/Work Guarantee	“If work is assigned to you, you execute it” - no waiting for confirmation
Capability Ledger	Performance Record/Track Record	Permanent history of completed work demonstrating reliability

When trying to run it, the next big hurdle is how you actually start it up. Without a firm mental model of the system, it’s hard to grok. As best I understand it, you have a central coordination tool called the “mayor” (PM). It will generate work as beads (tickets) and sling (delegate) it to various rigs (teams) to do the work.
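Here’s how I’d sketch that mental model in Python. This is a toy, not Gas Town’s implementation: every class and method name is hypothetical, chosen only to mirror the vocabulary in the table, and putting the hook on the rig rather than on an individual worker is a simplification.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Bead:
    """A ticket: one discrete unit of work."""
    id: str
    title: str


@dataclass
class Rig:
    """A team/project. Simplification: one hook (work queue) per rig."""
    name: str
    hook: deque = field(default_factory=deque)

    def work(self):
        """Execute everything on the hook, oldest first; return finished ids."""
        done = []
        while self.hook:
            done.append(self.hook.popleft().id)
        return done


class Mayor:
    """Coordinator: creates beads and delegates them, implements nothing."""

    def __init__(self, rigs):
        self.rigs = {rig.name: rig for rig in rigs}
        self._next_id = 1

    def sling(self, rig_name, title):
        """Create a bead and place it on the named rig's hook."""
        bead = Bead(id=f"bead-{self._next_id}", title=title)
        self._next_id += 1
        self.rigs[rig_name].hook.append(bead)
        return bead


docs = Rig("docs")
audit = Rig("audit")
mayor = Mayor([docs, audit])
mayor.sling("docs", "Update k8s platform docs")
mayor.sling("audit", "Automate SOX audit log report")
mayor.sling("docs", "Fix broken links")

docs_done = docs.work()
audit_done = audit.work()
print(docs_done, audit_done)
```

The point of the toy is the division of labor: the mayor only creates and routes work, and each rig drains its own queue without asking for confirmation.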

The issues I see

As I think through how my own workflow would need to change to accommodate a tool like this, a few issues come up.

Attention is finite

In my own multi-tab Claude sessions, monitoring the state of the workers takes too much effort. I’m cycling to each tab and checking in on it. There’s a frenetic energy to it that feels like plate spinning. I think having a centralized UI / coordinator to understand the state of the complex system is important. I don’t know that Gas Town’s interface (or really any LLM-only interface) is the one I’d choose, but the need is certainly there.

Work generation

As mentioned in the Gas Town article:

Aside from just keeping Gas Town on the rails, probably the hardest problem is keeping it fed. It churns through implementation plans so quickly that you have to do a LOT of design and planning to keep the engine fed.

This actually introduces tension into my current process. While I have what my wife calls a “multi-threaded brain”, it doesn’t think particularly far in advance. To make a system like Gas Town work, I’ll need to spend real effort planning something that amounts to a roadmap. I’m looking at tools like OpenSpec to help with this. The roadmap we need isn’t just any roadmap either.

One of the biggest struggles with my current work is that while there are many things that both can and should change, changing all of them at once would be massively destabilizing. Figuring out what the future could or should look like, then sequencing it so that it’s stable for the system and digestible by the humans, is tough!

Beyond that, as the cost of change goes down, we lose an important filter. I think it can become too easy to add to a product, just like making it “too easy” to push content to the entire world has had widespread consequences (positive and negative). This will require a greater degree of stewardship than I think we’re prepared for as an industry.

Visibility into the workstream

Let’s say we get the thing running and a giant pile of beads (tickets) generated for it. We don’t actually have good tooling in this space for tracking all of the roadmapping stuff a project management org usually does. For each of our rigs (teams), do they have enough work? When will they run out? Are they stuck? What are the challenges? What’s the efficacy of the changes they’re making? How can they be made more efficient? What areas of the product should we be working on, but are actually under-investing in?
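The first few of those questions are at least mechanically answerable if the tickets carry enough metadata. Here’s a minimal Python sketch under heavy assumptions: the record fields (`rig`, `status`, `updated`), the status values, the sample data, and the 24-hour staleness threshold are all made up for illustration and aren’t taken from beads or Gas Town.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Hypothetical records: rig names, statuses, and timestamps are invented.
NOW = datetime(2026, 1, 10, 12, 0)  # arbitrary fixed "now" for a repeatable demo
BEADS = [
    {"rig": "docs", "status": "open", "updated": NOW - timedelta(hours=1)},
    {"rig": "docs", "status": "in_progress", "updated": NOW - timedelta(hours=30)},
    {"rig": "audit", "status": "closed", "updated": NOW - timedelta(hours=2)},
]


def summarize(beads, now, stale_after=timedelta(hours=24)):
    """Count queued beads and stale in-progress beads for each rig."""
    report = defaultdict(lambda: {"queued": 0, "stuck": 0})
    for bead in beads:
        row = report[bead["rig"]]  # creates a zeroed row on first sight of a rig
        if bead["status"] == "open":
            row["queued"] += 1
        elif bead["status"] == "in_progress" and now - bead["updated"] > stale_after:
            row["stuck"] += 1
    return dict(report)


report = summarize(BEADS, NOW)
for rig, row in sorted(report.items()):
    flag = "needs work" if row["queued"] == 0 else "ok"
    print(f"{rig}: queued={row['queued']} stuck={row['stuck']} ({flag})")
```

This only covers “do they have enough work?” and “are they stuck?”. The efficacy and investment questions need product-level signals that a ticket file alone won’t give you.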

Where from here?

I think there’s really something to this. I think it’s a paradigm shift that will be hard for the industry as it currently exists to handle. Like XP, CD or similar transformational practices, I suspect we’ll end up with a watered down version that’s more accessible to the broader populace. I think that version will still be a net win.

As for me, I’m focusing on three things:

Get better at breaking larger tasks into smaller tasks for the LLMs.
Experiment with orchestration frameworks to better understand what’s possible. Gas Town is too complex today, but with refinement, I think it’s a very big unlock.
Talk with more of my product counterparts to see if we can supercharge their idea factory in an LLM-army compatible way.