Armin Ronacher is quickly becoming one of my favorite sources for AI developer workflow advice. It started with this video where he uses Claude Code to solve two actual issues in the minijinja templating engine.
This post is a set of notes from two recent videos:
- A 3-hour screencast where Claude tries to build a Sentry clone.
- A talk packed with advice on how to make agentic coding work.
Screencast - Claude Vibe Codes a Sentry clone
This video is 3h23, no audio. I found it oddly entertaining and ended up skimming through the entire thing.
- Sentry-clone Claude mostly works on its own for the whole thing. Armin doesn't prompt it much.
- 14:00: Armin uses another Claude Code instance to create an `agent-watcher` tool to follow the changes the main agent makes.
- Agent-watcher Claude starts by struggling with `uv init`:
  "I see it created a subdirectory... Let me move files." [1]
  It goes on for quite a while, credit for persistence I guess:
  "the actual files I created might be in a different location. Let me find them..."
- 24:26: Armin opens VSCode, sees the AI created both `agent-watcher` and `agent_watcher`... kills it, `rm -rf *` 💥.
- 1:15:28: Fun bit, Armin notices a bug with the agent watcher occasionally printing garbage to the terminal. He prompts Claude with:
  "the display kinda screwed up. This is what I saw: <pastes broken display>"
  Claude's response:
- 2:54:50: Armin is testing on port 3000 while, in the background, Claude opens a window on port 3002 using Playwright. Then it goes and tries to debug the signup process on its own. That's using the Playwright MCP (see `.mcp.json` in the repo).
- The Sentry clone is not quite working.
The repository shows that Armin is also using a customized Claude setup (see the `commands/` folder) to have it use "subagents" (each gets its own context window).
Some more details on this issue.
Talk - Agentic Coding: The Future of Software Development with Agents
The talk echoes some of the points made in his blog post. There is a lot in this talk.
General remarks:
- Armin is very excited about this and feels it is a paradigm shift.
- He has had successful outputs from Claude Code runs lasting more than 4 hours.
- 100% of the people he knows who are hooked on this run Claude in "yolo mode" all the time. [2]
- Claude Code de-emphasizes the role of the editor. You review, and program much less.
- Use cases he mentions: investigating issues, setting up and debugging CI (the agent uses `gh` to create draft PRs), creating a presentation (the one from the video)...
Quality of the dev environment is key:
- Works best with stable languages: Go, PHP, basic Python. Low ecosystem churn helps (breaking changes between library versions trip up LLMs).
- Long function names > namespaces.
- Conflicting patterns in the codebase are not good for agents.
- Log everything to one big file and tell the agent how to read it (a few lines at first, more if needed).
- Speed matters for agent iteration; the agent might kill a slow tool!
- Your dev commands should be helpful when misused (~defensive programming).
  Cool: Go test caching. No arguments needed; only the relevant tests run.
- Reduce friction by enabling quick tool creation and execution: tell Claude where it should place its bespoke tools.
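One possible reading of "helpful when misused", sketched in Python: a dev command entry point that answers a wrong invocation with what to run instead, so the agent can correct itself in one step. The command names and the `resolve` helper are hypothetical, not taken from the talk.

```python
import difflib

# Hypothetical dev commands for this sketch; the talk does not list any.
COMMANDS = {
    "test": "run the test suite",
    "serve": "start the dev server",
    "logs": "print the last lines of the combined log",
}


def resolve(command: str) -> tuple[bool, str]:
    """Return (ok, message). On misuse, the message says what to run instead."""
    if command in COMMANDS:
        return True, f"running '{command}': {COMMANDS[command]}"
    # Suggest the closest known command rather than failing with a bare error.
    close = difflib.get_close_matches(command, list(COMMANDS), n=1)
    hint = f" Did you mean '{close[0]}'?" if close else ""
    available = ", ".join(sorted(COMMANDS))
    return False, f"unknown command '{command}'.{hint} Available: {available}"
```

The point of the design is that the error message alone carries enough information for the next attempt to succeed, without the agent having to read the tool's source.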
MCP is not good for coding agents: [3]
- He finds MCP servers pollute the context, and CLIs just work better (Claude can use them in scripts, which you can also run yourself if needed).
- Armin only uses the Playwright MCP at this time.
Tip - Unified logging:
- Combine `console.log` + server logs + everything else
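A minimal sketch of the single-log-file idea, assuming a Python backend and the standard `logging` module. The file path, the `get_logger` and `tail` helpers, and the source tags are all hypothetical; how browser `console.log` output reaches the server is left out.

```python
import logging
from collections import deque
from pathlib import Path

LOG_PATH = Path("logs/combined.log")  # hypothetical location


def get_logger(source: str, path: Path = LOG_PATH) -> logging.Logger:
    """Return a logger that appends to the shared file, tagged with its source."""
    path.parent.mkdir(parents=True, exist_ok=True)
    logger = logging.getLogger(f"combined.{source}")
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    if not logger.handlers:  # avoid duplicate handlers on repeated calls
        handler = logging.FileHandler(path, encoding="utf-8")
        handler.setFormatter(
            logging.Formatter(f"%(asctime)s [{source}] %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


def tail(path: Path = LOG_PATH, lines: int = 20) -> list[str]:
    """Return only the last `lines` lines, so the agent reads a little at a time."""
    with path.open(encoding="utf-8") as f:
        return [line.rstrip("\n") for line in deque(f, maxlen=lines)]
```

Every process writes to the same file with a source tag, and the agent is told to start with `tail()` and ask for more lines only when needed.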
Multi-process guidance: make it clear what processes should be running, provide healthcheck endpoints/easy status access.
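As a rough illustration of "easy status access", here is a sketch of a status command that probes the ports the expected processes should be listening on. It uses a plain TCP connect as a simpler stand-in for real healthcheck endpoints; the service names and ports are placeholders.

```python
import socket

# Hypothetical service list: names and ports are placeholders for this sketch.
EXPECTED = {"web": 3000, "api": 8000}


def is_listening(port: int, host: str = "127.0.0.1", timeout: float = 0.5) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def status_report(expected: dict[str, int] = EXPECTED) -> str:
    """One line per expected process, so the agent sees what is up or down."""
    rows = []
    for name, port in expected.items():
        state = "up" if is_listening(port) else "DOWN"
        rows.append(f"{name:<10} port {port:<6} {state}")
    return "\n".join(rows)
```

Printing `status_report()` gives the agent a one-glance answer to "what should be running, and is it?" before it starts debugging.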
Something Armin explores - synchronization points:
Goal: make async operations observable.

    from .agentsupport import reached
    reached(point="event-preprocessing-done")

    make await POINT=event-preprocessing-done

The agent can have one thing running in the background, do some other work, and then wait from the outside until the process reaches this point.
Armin mentions this isn't perfect; he'd like something that runs 'in lockstep, like a debugger'. I don't fully understand what it enables. Discussion is at 31:06.
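To make the synchronization-point idea concrete for myself, here is a guess at how such a helper could work, using marker files and polling. This is not Armin's `agentsupport` module, whose internals aren't shown in the talk; the marker directory, the `await_point` function and its parameters are all assumptions.

```python
import time
from pathlib import Path

# Hypothetical marker directory; the real agentsupport module may work differently.
SYNC_DIR = Path(".agent-sync")


def reached(point: str, sync_dir: Path = SYNC_DIR) -> None:
    """Called by the application: record that `point` has been reached."""
    sync_dir.mkdir(parents=True, exist_ok=True)
    (sync_dir / point).write_text(str(time.time()))


def await_point(point: str, timeout: float = 30.0,
                sync_dir: Path = SYNC_DIR, poll: float = 0.05) -> bool:
    """Called from outside: block until `point` is reached or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    marker = sync_dir / point
    while True:
        if marker.exists():
            marker.unlink()  # consume the marker so the next wait starts fresh
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll)
```

A `make await POINT=...` target could then be a thin wrapper around `await_point`, failing when it returns `False` so the agent knows the point was never reached.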
Preserving context is key:
- Prevent the agent from spelunking: tools to navigate the codebase efficiently, tail 20 lines of combined logs...
- Consider sub tasks/sub agents to conserve context.
- When you need to `/compact`, you've lost: "from that moment, everything is random". Armin aborts and starts from scratch.
Example: a `make go-methods` tool. It lists all the methods that exist (with grep), which saves the AI a lot of time and tokens navigating.
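The talk's version uses grep; as an illustration of the same idea in Python, here is a sketch that walks a directory and lists Go method declarations with a regular expression. The regex is a simplification that only handles single-line `func (recv) Name(` declarations, and the `go_methods` function name is made up.

```python
import re
from pathlib import Path

# Matches Go method declarations such as:
#   func (s *Server) Handle(w http.ResponseWriter) error {
METHOD_RE = re.compile(
    r"^func\s+\(\s*(\w+\s+)?\*?([\w.]+)(\[[^\]]*\])?\s*\)\s+(\w+)\s*\("
)


def go_methods(root: Path) -> list[str]:
    """List `file:line: Receiver.Method` for every method found under `root`."""
    found = []
    for path in sorted(root.rglob("*.go")):
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), 1):
            m = METHOD_RE.match(line)
            if m:
                rel = path.relative_to(root).as_posix()
                found.append(f"{rel}:{lineno}: {m.group(2)}.{m.group(4)}")
    return found
```

The output is compact enough to hand to the agent in one go, instead of having it open files one by one to discover what exists.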
Conclusion
There's something refreshing about the quality of the dev environment being key to the success of agents.
Happy hacking 🤖