The future of AI agents looks a lot like 2010.

The most useful thing I’ve done for my agent setup wasn’t a new model or a better prompt.

It was adding a kanban board.

The problem with just letting the agent run

The conversation around AI agents is almost entirely about orchestration — tool use, memory layers, multi-agent routing. Almost none of it is about managing the work.

I had the same gap. I’d open a Claude Code session, pick something to build, and let the agent run. Sometimes it worked. Often it drifted. Context would bleed between sessions. I’d come back three days later with no idea what state things were in or why something had been built the way it was.

The problem wasn’t the agent. It was that I’d stripped away every management structure from the workflow and expected nothing to change.

The kanban board

Three kinds of Markdown file and an Obsidian plugin.

Board file — the Obsidian Kanban plugin renders this as a board. Three columns: Backlog, In Progress, Done. Each card is a wikilink to a story file. Nothing else.

## Backlog
- [ ] [[CVR-019-bedrock-card-rendering|CVR-019 Replace card rendering]]
- [ ] [[CVR-020-dashboard-rollout|CVR-020 Dashboard full rollout]]

## In Progress

## Done
- [x] [[PRJ-008-deploy-pipeline|PRJ-008 Deploy pipeline refactor]]
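
Because the board is plain Markdown, a script (or the agent) can read it without going through Obsidian. Here is a minimal sketch, assuming the file contains only the `##` column headings and wikilink card lines shown above. The names `parse_board` and `CARD` are illustrative, and any extra frontmatter or settings the plugin writes is simply ignored.

```python
import re

# Matches a card line such as "- [ ] [[CVR-019-slug|CVR-019 Title]]".
# The "|label" part of the wikilink is optional.
CARD = re.compile(
    r"^- \[(?P<mark>[ xX])\] "
    r"\[\[(?P<target>[^|\]]+)(?:\|(?P<label>[^\]]+))?\]\]"
)


def parse_board(text):
    """Return {column name: [card dict, ...]} in the order columns appear."""
    columns = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            # A level-2 heading opens a new column.
            current = line[3:].strip()
            columns[current] = []
            continue
        match = CARD.match(line)
        if match and current is not None:
            columns[current].append({
                "story": match.group("target"),
                "label": match.group("label") or match.group("target"),
                "checked": match.group("mark").lower() == "x",
            })
    return columns


# Sample board copied from the example above.
BOARD = """\
## Backlog
- [ ] [[CVR-019-bedrock-card-rendering|CVR-019 Replace card rendering]]
- [ ] [[CVR-020-dashboard-rollout|CVR-020 Dashboard full rollout]]

## In Progress

## Done
- [x] [[PRJ-008-deploy-pipeline|PRJ-008 Deploy pipeline refactor]]
"""

board = parse_board(BOARD)
for column, cards in board.items():
    print(column, [card["story"] for card in cards])
```

Each card's `story` value is the wikilink target, which doubles as the story file's name without the extension.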

Story files (CVR-NNN-slug.md) — one per card. Here’s one, abridged:

## CVR-019 — Replace card rendering
Goal: cards render from the new classifier output; old path deleted.
Hard gates:
- [ ] all golden cases render identically
- [ ] old renderer removed, not flagged off
Blocker: none
Work log:
- session 1 — classifier clean in shadow mode, opening render work

Hard gates are the definition of done: the story doesn’t move to Done until every one is checked. When something can’t proceed, the card stays in Backlog marked ⚠️ BLOCKED. The agent doesn’t silently skip or abandon the story; it marks the card, explains the blocker, and stops.
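
The hard-gate rule is simple enough to check mechanically. Here is a rough sketch, assuming a story file lays out its gates as in the abridged example above: a `Hard gates:` line followed by checkbox items. The function names are hypothetical, and treating a story with no gates as not done is my assumption, not something the setup specifies.

```python
import re

# A checkbox line such as "- [ ] all golden cases render identically".
GATE = re.compile(r"^- \[(?P<mark>[ xX])\] (?P<text>.+)$")


def hard_gates(story_text):
    """Return [(gate text, is_checked), ...] from the 'Hard gates:' section."""
    gates = []
    in_section = False
    for raw in story_text.splitlines():
        line = raw.strip()
        if line.lower().startswith("hard gates"):
            in_section = True
            continue
        if in_section:
            match = GATE.match(line)
            if match:
                gates.append((match.group("text"), match.group("mark").lower() == "x"))
            elif line:
                # The first non-empty, non-checkbox line ends the section.
                break
    return gates


def open_gates(story_text):
    """Gates that are still unchecked."""
    return [text for text, done in hard_gates(story_text) if not done]


def can_move_to_done(story_text):
    """True only if the story has at least one gate and all are checked."""
    gates = hard_gates(story_text)
    return bool(gates) and all(done for _, done in gates)


# Sample story body, abridged from the example above.
STORY = """\
Goal: cards render from the new classifier output; old path deleted.
Hard gates:
- [ ] all golden cases render identically
- [ ] old renderer removed, not flagged off
Blocker: none
"""

print(can_move_to_done(STORY))
print(open_gates(STORY))
```

A check like this could run before any card is moved, so a story with open gates is refused rather than quietly promoted.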

I ask Claude to tackle the next story. Stories move to In Progress as the agent works — and come back to me when they need a human call. The board can’t lie if Claude is the only thing that moves cards.

Before a story is marked Done, there’s a context update step. The work log and the project’s CLAUDE.md are where to capture what was decided and why, not just what was built. Finishing the step with zero checked boxes is not valid. Durable learnings don’t survive on vibes.

Nothing gets worked on without a story.

The team

The team is a Markdown file. Named roles, called at specific points in the process.

The most useful one is /refinement-team-deep, the planning council. Six sequential agents: Critic, Contrarian, Tech Lead, Software Architect, Engineering Manager, then a Revised Planner who synthesises all five reviews. Each reviewer addresses a flaw the previous one didn’t.

Concrete example: I was redesigning a campaign analysis pipeline. The council flagged that shipping the new classification layer and the rendering change together would make it impossible to isolate regressions. The revised plan split them into separate stories — ship the classifier first in shadow mode, verify it on known cases, only then open the rendering work.

The cost was two minutes of latency. The output was two new stories instead of one, with clearer hard gates on each.
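
The shape of the council is a sequential pipeline: each reviewer sees the plan plus every earlier review, and the final role sees them all. Here is a sketch of that control flow only, not of the actual skill. `ask` is a hypothetical callable standing in for whatever sends a prompt to a role and returns its reply, and the prompt wording is invented for illustration.

```python
from typing import Callable, List, Tuple

# The five reviewing roles, in the order named above.
REVIEWERS = [
    "Critic",
    "Contrarian",
    "Tech Lead",
    "Software Architect",
    "Engineering Manager",
]

# Hypothetical interface: (role, prompt) -> reply text.
Ask = Callable[[str, str], str]


def run_council(plan: str, ask: Ask) -> Tuple[List[Tuple[str, str]], str]:
    """Run the reviewers in sequence, then ask for a revised plan."""
    notes: List[Tuple[str, str]] = []
    for role in REVIEWERS:
        # Each reviewer sees what earlier reviewers already said,
        # so it can focus on a flaw nobody has raised yet.
        earlier = "\n".join(f"{r}: {n}" for r, n in notes) or "(none yet)"
        prompt = (
            f"Plan:\n{plan}\n\n"
            f"Earlier reviews:\n{earlier}\n\n"
            "Raise a flaw that has not been covered yet."
        )
        notes.append((role, ask(role, prompt)))

    reviews = "\n".join(f"{r}: {n}" for r, n in notes)
    revised = ask(
        "Revised Planner",
        f"Plan:\n{plan}\n\nReviews:\n{reviews}\n\nWrite the revised plan.",
    )
    return notes, revised


# Usage with a stub in place of a real model call.
def echo(role: str, prompt: str) -> str:
    return f"note from {role}"


notes, revised = run_council("Ship classifier and rendering together", echo)
print([role for role, _ in notes])
print(revised)
```

Running the roles in sequence rather than in parallel is what lets each one build on the last, and it is also where the latency comes from.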

Orchestrator — reads the board, routes work, breaks epics into stories. Default mode.

Tech Lead — called before anything architectural ships.

Designer — /ux-review before writing a line of markup.

Critic — /critique for a fast single pass. Returns PROCEED / REVISE / ABORT in under a minute.
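
A three-way verdict is easy to act on programmatically, as long as a fuzzy reply can't pass for a decision. Here is a small sketch, assuming the Critic's reply contains exactly one of the three keywords in upper case. The `Verdict` enum and `parse_verdict` are illustrative names, not part of the skill.

```python
import re
from enum import Enum


class Verdict(Enum):
    PROCEED = "PROCEED"
    REVISE = "REVISE"
    ABORT = "ABORT"


_KEYWORD = re.compile(r"\b(PROCEED|REVISE|ABORT)\b")


def parse_verdict(reply: str) -> Verdict:
    """Extract the single verdict keyword from a reply, or fail loudly."""
    found = set(_KEYWORD.findall(reply))
    if len(found) != 1:
        # No keyword, or conflicting keywords: refuse to guess.
        raise ValueError(f"expected exactly one verdict, found {sorted(found)}")
    return Verdict(found.pop())


print(parse_verdict("Verdict: REVISE. Split the rendering change into its own story."))
```

Raising on an ambiguous reply mirrors the board's rule for blockers: stop and surface the problem instead of quietly picking a path.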

Retrospective

The next SDLC (software development life cycle) primitive I’m working on is the retro. Two experiments in progress:

- A weekly review skill: a structured moment to step back, reflect on how the week went, and carry the useful parts forward.
- An agent retro skill: pattern analysis across sessions, covering which skills I reach for, which tasks get repeated or abandoned, and where the gaps are.
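
The "which skills I reach for" part of the agent retro could start as a simple tally. This is a rough sketch under loud assumptions: that session transcripts are available as plain text and that a skill shows up as a slash command such as /critique. The regex is a heuristic of my own that tries to skip file paths, and none of the names here come from the actual skill.

```python
import re
from collections import Counter

# A slash command: "/" not preceded by a word character, "/" or ".",
# then a lowercase name, not followed by more path or name characters.
SKILL = re.compile(r"(?<![\w/.])/([a-z][a-z0-9-]+)(?![\w/-])")


def skill_usage(transcripts):
    """Return (total invocations per skill, sessions using each skill)."""
    total = Counter()
    sessions = Counter()
    for text in transcripts:
        names = SKILL.findall(text)
        total.update(names)          # every invocation counts
        sessions.update(set(names))  # at most once per session
    return total, sessions


# Invented transcripts, for illustration only.
transcripts = [
    "ran /critique then /refinement-team-deep on the plan",
    "used /critique while editing /usr/bin/tool",
    "/ux-review first, /critique after, /critique again",
]

total, sessions = skill_usage(transcripts)
for name, count in total.most_common():
    print(name, count, sessions[name])
```

Comparing the two counters separates a skill used heavily in one session from one used a little in every session, which is closer to what a retro wants to know.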

Both are still rough. But the retro is the one SDLC ritual whose job is turning mistakes into process — the thing agents can’t yet do for themselves.

Where this is going

Backlog, stories, acceptance criteria, roles, reviews. We’ve been doing this for fifteen years. AI agents didn’t invent a new way of working — they just gave us a new team to manage.

That’s the bet I’m making: the patterns from SDLC transfer, and the closer we follow them, the better agent-driven development gets. The overhead disappears. The discipline doesn’t.

Open problems remain. Context doesn’t fully survive session boundaries. Decision rationale evaporates from transcripts. Shared context across sessions and collaborators is unsolved. At AI Bar Camp Berlin last week, every room circled back to the same questions. Nobody has cracked them.

My guess is that the answers look a lot like the questions — better story handoffs, better session summaries, better ways to make the work log the source of truth. Not new paradigms. Deeper investment in the old ones. But I’m genuinely not sure.

The kanban board is embarrassingly low-tech. That’s the point.