The ticket is the thinking

Writing the ticket is the work. Everything after that is execution.

That sounds backwards. In most workflows, the ticket is the lightweight wrapper around the real work. In mine, the ticket is where design decisions get made, edge cases get named, and verification gets written. The quality of what the agent produces is bounded by the quality of the spec you hand it. A vague brief gets vague software.

This is the system. Tickets become tests. Tests become context. Context shapes the next ticket.


Every task is a ticket before it’s a task

Before I open an editor, I write a spec.

Not a description. A document: what the user sees, what happens in the database, what the edge cases are, what “done” looks like. Then I open Claude Code.

The spec does two things. It forces the thinking before implementation starts. And it draws a hard line around scope.

Drift is the real enemy. “While I’m in here, I should also fix X” is where most of the bugs I’ve introduced came from. When something adjacent comes up mid-session, I open a new ticket. The current one stays clean.


Every bug becomes its own ticket

The instinct when you find a bug is to fix it immediately. You’re in the codebase. You can see the cause. Just fix it.

The problem: a bug found in passing is usually not fully understood. You see the symptom. You have a theory. Fix it inline and you fix it under time pressure, without documenting the failure mode, without thinking through verification.

Real example. Testing a bulk order on Biztrix — an NFC business card platform I built with a friend — I noticed every card in a bulk PDF was showing the same base design instead of per-card designs. I could see what was probably wrong. The instinct was to fix it.

Instead I opened BIZ-57.

Writing the ticket forced actual diagnosis. There were two separate bugs. First, async design generation was called with void (fire-and-forget) inside a Vercel after() hook, so Vercel was killing the function before it completed. Second, card ordering during regen used created_at timestamps instead of the code field, making the order non-deterministic. Two bugs, two root causes, documented before a line of code changed.
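A minimal sketch of both failure patterns, not the actual Biztrix code. The Card shape and every function name here are hypothetical; only the created_at and code fields come from the ticket. The `work` callback stands in for whatever runs inside the after() hook. Sorting on created_at is presumably unstable because cards from one bulk order can share a timestamp, which is an assumption on my part.

```typescript
// Hypothetical card shape: only `code` and `created_at` come from the ticket.
interface Card {
  code: string;
  created_at: string; // ISO timestamp
}

// Bug pattern: `void work()` starts the promise and returns immediately,
// so the host is free to tear the function down before the work finishes.
async function fireAndForget(work: () => Promise<void>): Promise<void> {
  void work(); // not awaited: completion is not guaranteed
}

// Fix pattern: the returned promise only resolves once the work is done,
// so whatever awaits this function also waits for generation to finish.
async function awaited(work: () => Promise<void>): Promise<void> {
  await work();
}

// Deterministic ordering: sort on the unique `code` field rather than on
// `created_at`, which can tie when cards are created in the same batch.
function orderByCode(cards: Card[]): Card[] {
  return cards
    .slice() // copy so the caller's array is not mutated
    .sort((a, b) => (a.code < b.code ? -1 : a.code > b.code ? 1 : 0));
}
```

With identical timestamps, orderByCode still returns the same sequence on every run, which is the property the PDF page order depends on.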

Writing the ticket also flagged that there was no automated test covering this. Immediately after BIZ-57 closed, I opened BIZ-58: a Playwright test with three scenarios designed to catch these failure modes if they regressed.

The bug became a ticket. The ticket became a fix. The fix became a test. The test now runs before every deploy.


The kanban is a view, not the system

I have a kanban board. Backlog, In Progress, Done. Useful for a read on what’s moving. But the board isn’t the system — the .md files are.

Every ticket is a markdown file: frontmatter for status, then a spec. Context, scope, what to build, how to verify. The board is generated from those files by a sync daemon. If the board disappeared, I’d lose the view, not the knowledge.

The files survive. I can grep them, count them, reference them from other tickets. The board is a snapshot. The files are a record.
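To make "the board is generated from the files" concrete, here is a rough sketch of the idea. It is not the real sync daemon. The status values, the Ticket shape, and the function names are assumptions; the only thing taken from the text above is that each file has frontmatter carrying a status, followed by a spec.

```typescript
// Assumed status values, mirroring the three board columns.
type Status = "backlog" | "in-progress" | "done";
const STATUSES: readonly Status[] = ["backlog", "in-progress", "done"];

interface Ticket {
  id: string;
  status: Status;
  spec: string;
}

// Parse one ticket file: a `---` delimited frontmatter block, then the spec.
function parseTicket(id: string, markdown: string): Ticket {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(markdown);
  if (!match) throw new Error(`${id}: missing frontmatter`);
  const [, header = "", body = ""] = match;

  // Read simple `key: value` lines; anything else in the header is ignored.
  const fields = new Map<string, string>();
  for (const line of header.split(/\r?\n/)) {
    const idx = line.indexOf(":");
    if (idx > 0) fields.set(line.slice(0, idx).trim(), line.slice(idx + 1).trim());
  }

  const status = STATUSES.find((s) => s === fields.get("status"));
  if (!status) throw new Error(`${id}: unknown or missing status`);
  return { id, status, spec: body.trim() };
}

// Derive the board from the tickets. The board holds ids only;
// the knowledge stays in the files.
function buildBoard(tickets: Ticket[]): Record<Status, string[]> {
  const board: Record<Status, string[]> = { backlog: [], "in-progress": [], done: [] };
  for (const t of tickets) board[t.status].push(t.id);
  return board;
}
```

Because buildBoard is a pure function of the parsed files, deleting the board costs nothing: it can be rebuilt from the tickets at any time.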


Tickets accumulate

After a wave of features ships, I run a retro. Not a retrospective about feelings — a pass through the closed tickets to extract what they collectively imply about the product’s correctness.

After roughly 38 Biztrix tickets closed, I opened biztrix-product-evals-continuous-qa-goldens. The spec was direct:

As of 2026-05-22, many product behaviors have been shipped. None of these are covered by regression tests. Any future change could silently break them.

That ticket read the closed history and produced two outputs: new tests for evals/smoke.ts (HTTP checks that run before every deploy), and additions to evals/product-evals.md (a manual golden checklist for behaviors that need a browser and human eyes to verify).
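For a sense of what an HTTP check in a file like evals/smoke.ts could look like, here is a small sketch. The real file is not shown in this post, so the SmokeCheck shape, the function names, and the injected `get` function are all assumptions. The comparison is kept separate from the network call so it can be tested without a server.

```typescript
// Hypothetical description of one smoke check.
interface SmokeCheck {
  name: string;
  path: string;
  expectStatus: number;
  expectBodyIncludes?: string;
}

// Pure comparison: returns a failure message, or null when the check passes.
function evaluate(check: SmokeCheck, status: number, body: string): string | null {
  if (status !== check.expectStatus) {
    return `${check.name}: expected HTTP ${check.expectStatus}, got ${status}`;
  }
  if (check.expectBodyIncludes !== undefined && !body.includes(check.expectBodyIncludes)) {
    return `${check.name}: body is missing "${check.expectBodyIncludes}"`;
  }
  return null;
}

// The HTTP client is injected, so the runner does not assume a specific one.
type Get = (url: string) => Promise<{ status: number; body: string }>;

// Run every check in order and collect failures; an empty array means "deploy".
async function runSmoke(baseUrl: string, checks: SmokeCheck[], get: Get): Promise<string[]> {
  const failures: string[] = [];
  for (const check of checks) {
    const res = await get(baseUrl + check.path);
    const failure = evaluate(check, res.status, res.body);
    if (failure) failures.push(failure);
  }
  return failures;
}
```

A deploy script would call runSmoke with a `get` that wraps the platform's HTTP client and exit non-zero when the returned array is not empty.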

The next retro came after BIZ-39–48. It opened:

The previous evals round covered up to roughly BIZ-38. Since then the following tickets shipped and are not yet covered by any test or golden…

Then listed every feature with specific smoke tests and goldens to write for each one.

The structure: tickets ship → retro reads the history → goldens are written → goldens go into the harness → the harness protects future tickets.
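The "retro reads the history" step can be pictured as a coverage diff. The sketch below is one hypothetical way to do it and relies on a convention this post does not claim: that smoke tests and goldens mention the ticket id they cover. The ClosedTicket shape and the function names are assumptions too.

```typescript
// Hypothetical record for a shipped ticket.
interface ClosedTicket {
  id: string;
  title: string;
}

// Escape characters that have a special meaning inside a regular expression.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Return the closed tickets whose id appears nowhere in the eval sources.
// Word boundaries keep "BIZ-4" from matching inside "BIZ-40".
function uncoveredTickets(closed: ClosedTicket[], evalSources: string[]): ClosedTicket[] {
  const haystack = evalSources.join("\n");
  return closed.filter((ticket) => {
    const mention = new RegExp(`\\b${escapeRegExp(ticket.id)}\\b`);
    return !mention.test(haystack);
  });
}
```

The result is the to-do list for the retro ticket: each uncovered id needs a smoke test, a golden, or an explicit decision that it needs neither.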

Each wave makes the harness stronger. Nothing evaporates.


Evals feed back into execution

The feedback loop closes here.

New tickets now start with context that includes the smoke test suite and the product-evals checklist. Implementation gets evaluated against existing goldens. Verification criteria in new tickets are written with the harness in mind.

BIZ-58 made this explicit. The context section included a table inherited from BIZ-57:

| Failure | Symptom | Root cause |
| --- | --- | --- |
| design_front_path = null on all cards | Every PDF page shows identical base design | void generatePerCardDesigns() got killed before completing |
| Sharp crash (float dimensions) | Same as above (silent) | qrConfig.width is a float; Math.round() missing |
| Wrong page order | Card 2 appears on PDF page 1 | code ASC sort used; reverted to created_at |

Known failure modes, now executable specs. The system self-improves: bugs teach the goldens, goldens constrain the next implementation, implementation produces new bugs and new goldens.
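As an illustration of "failure modes as executable specs", here is a sketch that turns each documented failure into a check. It is not the BIZ-58 Playwright test. The CardRow and QrConfig shapes, the height field, and the function names are assumptions; design_front_path, qrConfig.width, and the Math.round() fix come from the failure table. The expected page order is passed in by the caller, so the sketch takes no position on which sort key is correct.

```typescript
// Hypothetical row and config shapes for illustration.
interface CardRow {
  code: string;
  design_front_path: string | null;
}

interface QrConfig {
  width: number;
  height: number; // assumed field; the ticket only names width
}

// Float-dimension failure: round to whole pixels before handing the
// config to the image library.
function toIntegerDimensions(qr: QrConfig): QrConfig {
  return { width: Math.round(qr.width), height: Math.round(qr.height) };
}

// Check the other two known failure modes and describe each violation.
function findRegressions(
  cards: CardRow[],
  pageOrder: string[],
  expectedOrder: string[]
): string[] {
  const problems: string[] = [];
  // A null path means per-card generation never finished for that card.
  for (const card of cards) {
    if (card.design_front_path === null) {
      problems.push(`${card.code}: design_front_path is null`);
    }
  }
  // Compare the order of codes on the PDF pages with the expected order.
  if (pageOrder.join(",") !== expectedOrder.join(",")) {
    problems.push(
      `page order ${pageOrder.join(",")} does not match expected ${expectedOrder.join(",")}`
    );
  }
  return problems;
}
```

An empty result means none of the known failure modes has come back; any entry names the failure directly, which is the point of writing them down first.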


What this looks like day to day

A session starts with a ticket. I read the spec, check the verification criteria, open Claude Code. Anything out of scope gets its own ticket. When the work is done, the smoke tests run.

Periodically — not on a schedule, when a wave feels closed — I run a retro. I open a ticket for the retro itself. It reads the closed ticket history and asks: what behaviors shipped that aren’t tested? What failure modes do we now know that aren’t documented? What should future-me know that current-me learned?

The output goes into evals/. The evals folder is part of the context for the next session.

None of this requires a methodology. It requires one discipline: write the ticket before doing the thing. Everything else follows from that.

The ticket is the thinking. The thinking is the work.