Multitasking with Coding Agents

A note about developing with LLMs / agents, and authorship responsibility.

I’ve been experimenting a lot with this recently: multiple agents helping me tackle up to 4 tasks at once, each in its own git worktree.
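
For anyone who hasn't used worktrees this way, here is a rough sketch of the setup, assuming one branch and one sibling directory per task. The `agent/` branch prefix, the directory naming, and the `main` base branch are all illustrative choices, not a convention from my repos. The function only builds the `git worktree add` commands; it doesn't run them.

```python
def worktree_commands(repo_name: str, tasks: list[str], base: str = "main") -> list[list[str]]:
    """Build one `git worktree add` command per task (nothing is executed here)."""
    commands = []
    for task in tasks:
        branch = f"agent/{task}"            # hypothetical branch naming scheme
        path = f"../{repo_name}-{task}"     # sibling directory next to the main checkout
        # `git worktree add -b <new-branch> <path> <commit-ish>` creates the
        # branch from `base` and checks it out in its own directory.
        commands.append(["git", "worktree", "add", "-b", branch, path, base])
    return commands


if __name__ == "__main__":
    for cmd in worktree_commands("app", ["email-fallback", "e2e-tests"]):
        print(" ".join(cmd))
```

Each agent then gets pointed at its own directory, so none of them share a working tree with the others.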

For instance, yesterday I was working on:

Adding fallback logic to associate emails that have incomplete References headers with the correct inquiry, and sending debug Slack notifications with diagnostics
Updating the UI around our Inquiry’s Date question to help steer the guest toward specific available dates when availability is low, to reduce email back-and-forth
Sending denormalized event statuses on usage records when the status changes, to improve analytics and exclude certain events
Adding Playwright to my CDK pipeline to run full E2E tests (pulled out the big gun, Claude Code, for this one 👊🏼)
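
To give a flavor of the first task, here is a minimal sketch of what a fallback chain for email threading could look like. It is not the code from my repo: the lookup dictionaries, the function names, and the order of the fallbacks (References, then In-Reply-To, then sender plus normalized subject) are assumptions for illustration.

```python
import re
from email import message_from_string
from email.message import Message
from email.utils import parseaddr
from typing import Optional

MESSAGE_ID = re.compile(r"<[^<>\s]+>")


def normalize_subject(subject: str) -> str:
    """Strip leading reply/forward prefixes so 'Re: Fwd: Hello' matches 'hello'."""
    stripped = re.sub(r"^\s*((re|fwd?)\s*:\s*)+", "", subject, flags=re.IGNORECASE)
    return stripped.strip().lower()


def find_inquiry(
    msg: Message,
    by_message_id: dict[str, str],                  # hypothetical index: Message-ID -> inquiry id
    by_sender_subject: dict[tuple[str, str], str],  # hypothetical index: (sender, subject) -> inquiry id
) -> tuple[Optional[str], str]:
    """Return (inquiry_id, strategy_used); inquiry_id is None when nothing matches."""
    # 1. Normal path: any known Message-ID in References, most recent first.
    for mid in reversed(MESSAGE_ID.findall(msg.get("References", ""))):
        if mid in by_message_id:
            return by_message_id[mid], "references"
    # 2. Fallback: In-Reply-To often survives when References is truncated or missing.
    for mid in MESSAGE_ID.findall(msg.get("In-Reply-To", "")):
        if mid in by_message_id:
            return by_message_id[mid], "in-reply-to"
    # 3. Last resort: same sender and same subject once reply prefixes are stripped.
    sender = parseaddr(msg.get("From", ""))[1].lower()
    key = (sender, normalize_subject(msg.get("Subject", "")))
    if key in by_sender_subject:
        return by_sender_subject[key], "sender-subject"
    return None, "unmatched"


if __name__ == "__main__":
    raw = (
        "From: Guest <guest@example.com>\n"
        "Subject: Re: Dates for June\n"
        "In-Reply-To: <abc@mail.example.com>\n"
        "\n"
        "Hi"
    )
    print(find_inquiry(message_from_string(raw), {"<abc@mail.example.com>": "inq_1"}, {}))
```

Returning the strategy name alongside the match is the part that makes the debug notifications useful: anything other than the normal References path is worth a diagnostic message.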

Feels like having a small team working in parallel. Obviously, there are zillions of differences between this and a team, but there is one gigantic one I want to harp on:

Those agents have no responsibility for what they’re writing.

When you’re a lead or a senior on a team, code review becomes in many cases your primary responsibility.

But on any well-functioning team I’ve been a part of, the individuals creating the PRs are the ones who continue to own their changes. The reviewer will do their best, but generally can never hold all the context for the area being changed, and so can’t think of “all the things”. Checklists help embody the due diligence required, but the biggest guarantee is the experience and reputation of that developer.

This is obviously tweaked a bit for open source, and it will continue to be tweaked as devs put forward 100% AI-coded work (and are praised for it), but that responsibility piece is still real.

And fundamentally, this is a context problem. For developers and code reviewers, that responsibility matters because you trust the author to have held the codebase’s context in their head to the best of their ability, and therefore trust the quality of their work (in conjunction with the testing methodology, since we know holding all of that context is impossible in the absolute sense).

With LLMs, it is also a context problem, one that they’re getting better and better at solving, and at a rate staggeringly faster than humans ever could. So when you see an AI produce a silly result working on one of your 4 worktrees, you _can_ say “silly Claude, how could you,” but ultimately, the better way to think about this is how you could have fed it more complete context to produce a better result.

I mean, I could see a world, or at least specific repos, so well covered by tests and evals that the responsibility piece is less important, and a PR’s quality is judged purely on fitness scores. This would obviously have some huge advantages…
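
To make “fitness scores” a bit more concrete, here is a purely hypothetical sketch of a merge gate. The inputs (test pass rate, coverage, an aggregate eval score), the weights, and the threshold are all made-up assumptions, not something I run today.

```python
from dataclasses import dataclass


@dataclass
class FitnessReport:
    tests_passed: int
    tests_total: int
    coverage: float    # fraction of lines covered, 0.0 to 1.0
    eval_score: float  # aggregate eval score, 0.0 to 1.0


def fitness(report: FitnessReport, weights: tuple[float, float, float] = (0.5, 0.2, 0.3)) -> float:
    """Weighted blend of pass rate, coverage, and eval score (weights are illustrative)."""
    pass_rate = report.tests_passed / report.tests_total if report.tests_total else 0.0
    w_tests, w_coverage, w_eval = weights
    return w_tests * pass_rate + w_coverage * report.coverage + w_eval * report.eval_score


def can_auto_merge(report: FitnessReport, threshold: float = 0.9) -> bool:
    """Gate a PR on its fitness score alone, with no human owner in the loop."""
    # A single failing test blocks the merge regardless of the blended score.
    if report.tests_passed < report.tests_total:
        return False
    return fitness(report) >= threshold


if __name__ == "__main__":
    print(can_auto_merge(FitnessReport(100, 100, coverage=0.9, eval_score=0.9)))
```

The catch is that a gate like this is only as trustworthy as the tests and evals behind it, which is the same context problem again.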

The fun thing about all of this is that traditional agile thinking would have you minimize work in progress, even across an entire team. I love having this turned upside down. I’m sure PMs do too! Provided delivery is still occurring for your customer, of course. I could talk a lot more about barrels vs ammo here, but that’s for another day.

This type of work is great for the non-customer-facing features, like the E2E testing task, for instance. Your PM really doesn’t want you to spend time on this, and it’s obviously not free, but if you can use a bit of extra horsepower to do more multitasking while still delivering on the PM’s features, that’s a win-win. Provided the quality is there, of course.

But for now, basically, you can’t delegate responsibility to LLMs. Not yet at least.

Barrels vs Ammo — Keith Rabois Stanford talk: https://www.youtube.com/watch?v=6fQHLK1aIBs