To loop or not to loop: figuring out the context behind agent usage and automation patterns

10 Jun, 2026

There was a lot of discourse around Peter's recent tweet. I think the main driver behind this was most people not taking the time to think about Peter's context before comparing it with their own. In this post, I will talk about this "context".

Here’s your monthly reminder that you shouldn’t be prompting coding agents anymore.

You should be designing loops that prompt your agents.
— Peter Steinberger 🦞 (@steipete) June 7, 2026

Things are already mentally destabilising these days because of AI progress and new model releases, and I think a lot of people's cognitive dissonance comes from not considering the background context behind how other people use agents.

People become curious (or ignore things) when they see something different from their experience and perception of doing things. When they see something wildly different, they get triggered.

Peter Steinberger has been one such character (or trigger, if you will), mainly because of the number of tokens he has spent on his projects and how vocal he has been about it. Unless you have been living under a rock (that is, you are not on Twitter), you probably know that he worked on OpenClaw and was vocal about how he shipped stuff without reviewing it and just made sure things worked (even though there were lots of bugs). "Ship at inference speed." Eventually he got acquihired by OpenAI, and OpenClaw is its own foundation kind of thing. Now he has an infinite supply of tokens and he likes to shove it in the face of us peasants who are not in a frontier lab lol.

Some people say that these are glimpses into the future (but the future is measured in months, not years) and I think that's a good way to think about this. An alternative way to imagine this is: if you had lots of tokens at lower cost, what could you do with them?

If you want to scale yourself, then you need to keep finding opportunities to automate chunks of your work. It might be a necessity or it might not. Then there's an element of craft: some people took pride in or loved writing code by hand. I took pride in micro-managing my agent with intentional prompts. Let's briefly go through how people have scaled automation over the last year or so.

The pattern of usage over the last year, since Claude Code got good

At each step, people have identified that "ok, the model is doing good enough at this, the blocks of code look correct. I don't need to look anymore." Basically, as accuracy increased, there were diminishing returns to manually reviewing, so it just made sense to lay back and automate that step (provided there are behaviours that can be automated). You have to take a trust fall and move to the next step. The pattern has evolved like this:

Micro-managing the agent with prompts and doing your tasks, then leaving threads of agents to work for 20ish minutes (more common with Codex / GPT-5 series than Claude; Claude was always more of an interactive agent until Opus 4.7, after which they have tried to shift towards long-horizon workflows).
Writing a rubric and semi-automating the work with oversight until the task is done.
Doing the same but with a big agent with a big bad rubric and a few sub-agent threads spawned like sub-processes, with their own context/forks and probably personalities.
Doing the same with a bigger, smarter agent and lots of sub-agents (with access to their own tools and maybe an IDE). Don't say it. Ok, I will say it: RLM.
Dynamic workflows, which scale this idea both vertically and horizontally. We will probably have something soon which lets the sub-agents communicate with each other too (the agent swarms idea)
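
To make the "sub-agent threads with their own context/forks" step concrete, here is a minimal sketch, not how any particular tool implements it. `sub_agent` and `fan_out` are hypothetical names, and the sub-agent is a stub where a real one would call a model. The only point it shows is that each thread works on its own copy of the context.

```python
import copy
from concurrent.futures import ThreadPoolExecutor


def sub_agent(task: str, context: dict) -> str:
    """Hypothetical sub-agent: a real one would call a model with `context`."""
    context["notes"].append(f"working on {task}")  # mutates only its own fork
    return f"{task}: done"


def fan_out(tasks: list[str], shared_context: dict, max_workers: int = 4) -> list[str]:
    """Run one sub-agent per task, each on a deep copy of the shared context."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(sub_agent, task, copy.deepcopy(shared_context))
            for task in tasks
        ]
        # Collect in submission order so results line up with `tasks`.
        return [future.result() for future in futures]


shared = {"spec": "build the billing page", "notes": []}
results = fan_out(["api", "frontend", "tests"], shared)
print(results)          # ['api: done', 'frontend: done', 'tests: done']
print(shared["notes"])  # [] because every sub-agent wrote to its own fork
```

Forking the context keeps sub-agents from stepping on each other. The swarm idea would need the opposite: some shared channel they can all write to.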

At a certain point, post the Opus 4.5 and GPT-5.2-codex era, we started seeing longer-horizon workflows. This was the result of both post-training and tooling efforts. Models got smarter (less compounding error over a trajectory), plus inference-time compute (OpenAI has been more into this than Claude). As the models got smarter in each iteration, we saw that the models did the same task in less time, and then had more time to proceed to the next level of task.

We also got very strong reviewer agents with fewer false positives/hallucinations (particularly from OpenAI), so you could afford to have very little oversight over the code.
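
The "when can I stop looking" decision can be written as a back-of-the-envelope check. This is my own simplification, not a measured model: it assumes a review catches every error, and the numbers below are made up. `review_is_worth_it` is a hypothetical helper.

```python
def review_is_worth_it(error_rate: float,
                       cost_of_missed_error: float,
                       review_cost: float) -> bool:
    """True when the expected loss from skipping review exceeds the review cost.

    Assumption: a manual review catches every error, which is optimistic.
    All three inputs must use the same unit (minutes, dollars, whatever).
    """
    expected_loss_without_review = error_rate * cost_of_missed_error
    return expected_loss_without_review > review_cost


# Made-up numbers: a missed error costs 100 units, a review costs 10.
print(review_is_worth_it(0.30, 100, 10))  # True: a sloppy model, keep reviewing
print(review_is_worth_it(0.02, 100, 10))  # False: accurate enough to skip
```

As the error rate drops, the left side shrinks while the review cost stays the same, which is the diminishing return that pushes people to the next step.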

All this led us to stuff like /goal, /automation, /dynamic workflow, /loop (it's like a cron job to repeat a task until some goal is achieved, a new feature in Claude Code). I particularly liked the time-based element of automation because a lot of software engineering tasks in production systems involve going through stuff periodically and then doing something about it if there is some high-entropy behaviour: babysitting agents, ML training runs, deployments, checking logs on a routine, etc.
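
The general shape of "repeat a task on a timer until some goal is achieved" looks roughly like this. It is a sketch of the idea only, not how /loop is implemented. `run_loop` and `fake_agent_step` are hypothetical, and the agent call is a stub that returns canned statuses.

```python
import time
from typing import Callable


def run_loop(step: Callable[[], str],
             goal_met: Callable[[str], bool],
             interval_s: float = 0.0,
             max_iterations: int = 10) -> tuple[bool, int]:
    """Call `step` repeatedly until `goal_met` accepts its output.

    Returns (succeeded, iterations_used). `max_iterations` is the budget
    that stops a loop whose goal is never reached.
    """
    for iteration in range(1, max_iterations + 1):
        output = step()
        if goal_met(output):
            return True, iteration
        time.sleep(interval_s)  # wait before checking again, cron-style
    return False, max_iterations


# Hypothetical stand-in for "prompt the agent and read back the status".
_statuses = iter(["tests failing", "tests failing", "tests passing"])


def fake_agent_step() -> str:
    return next(_statuses)


done, iterations = run_loop(fake_agent_step, lambda out: out == "tests passing")
print(done, iterations)  # True 3
```

The two things you actually design are the goal check and the budget. The prompt inside `step` matters less once those are right.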

Ralph Loops and Gastown were early manifestations of these ideas, but models weren't as good back then.

The loop Peter was talking about is what people have been referring to as an orchestrator system for like 2-3 years now. When you implement an orchestrator for your personal building, your aim is to write a super detailed spec and then write the skills and rubrics and the evals so that the orchestrator can operate with minimal oversight.
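
Here is a toy version of that orchestrator idea: the rubric does the checking, the worker is retried, and only what never passes comes back to you. Every name here (`RubricItem`, `orchestrate`, `fake_worker`) is hypothetical, and the string checks stand in for real evals.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class RubricItem:
    name: str
    check: Callable[[str], bool]


def rubric_score(output: str, rubric: list[RubricItem]) -> float:
    """Fraction of rubric items that the output satisfies."""
    return sum(item.check(output) for item in rubric) / len(rubric)


def orchestrate(tasks, worker, rubric, threshold=1.0, max_attempts=3):
    """Retry each task until it passes the rubric; escalate what never does."""
    accepted: dict[str, str] = {}
    needs_human: list[str] = []
    for task in tasks:
        for attempt in range(1, max_attempts + 1):
            output = worker(task, attempt)
            if rubric_score(output, rubric) >= threshold:
                accepted[task] = output
                break
        else:  # no attempt passed the rubric
            needs_human.append(task)
    return accepted, needs_human


rubric = [
    RubricItem("has implementation", lambda out: "def " in out),
    RubricItem("has a test", lambda out: "assert" in out),
]


def fake_worker(task: str, attempt: int) -> str:
    """Stub agent: improves on 'add login' at attempt 2, never on anything else."""
    if task == "add login" and attempt >= 2:
        return "def login(): ...\nassert login() is None"
    if task == "add login":
        return "def login(): ..."
    return "TODO"


accepted, needs_human = orchestrate(
    ["add login", "redesign onboarding"], fake_worker, rubric
)
print(list(accepted))  # ['add login']
print(needs_human)     # ['redesign onboarding']
```

The oversight you keep is exactly the `needs_human` list, so the quality of the spec and rubric decides how short that list gets.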

A particular usage of /loop that I liked recently:

What drives people's usage patterns, and pointers to evaluate the context behind them

You have to work based on your own situation/context, and with the way technology is moving forward, tilting towards automation is the right direction. Different people have different situations/contexts, which heavily influence the way they operate. You will have to consider cost, time, and how much soul and intent you are willing to dilute in exchange for automation.

I think there are 4-5 main drivers of what leads people to do what they are doing. You can use these as pointers to evaluate someone's context or your own context.

Amount of work - You would want maximum automation when you have a shitload of work to do. This has been my experience over the last 2 years of working with agents. Whenever I have had a lot of work (or a deadline), I came up with novel ways to automate or at least parallelise my workflows. Last year, for example, I was working with multiple clients and had to do lots of work (I was manually micro-managing a lot, but I was able to delegate short threads of tasks that ran for 10-20ish minutes). I used several agents for reviews as well.

If you are a single maintainer and you have most of the context for some project and many people are dependent on it, then it just makes sense to automate (while burning a lot of tokens). If you are a small team working on an early product, then you will find yourself in this position too (I have been for the last few months).

Most people don't have this amount of work. The other case is when you are working on your own project or startup and you have to move very fast in the market in an attempt to find PMF (provided you have the direction clear; once the implementation detail or a short-term roadmap is clear, it's easier to plan these things).

Time - Deadlines. Naturally, when you are under a deadline, your mind focuses on completing the task. You become more output shaped than process shaped, and in such moments you will find yourself thinking ahead and automating the shit out of things. I have been the most innovative and efficient when I have been under a deadline.

Type of task and stage of task - Some scenarios where automation can really help:

Working on the same codebase building tonnes of features (lots of demand or finding PMF), where you have the basic foundation set up so most of the code part is integration.
You have some rubric and you want to make bespoke software that is spiritually similar.
Tasks where models are really good - frontend and backend (Codex for example is god, whereas Claude has been really good for meta-prompting, frontend, writing, doc generation).
How good the models are determines the level of oversight and trust fall you can take. If the model is not good at a task, then you have no alternative other than providing oversight / micro-managing the agent.

Stage of task matters too. If you are bootstrapping, working on something cutting edge, or working on an initial foundation on which everything else will piggyback, you need oversight. If you are doing some kind of optimisation based on your taste (in product engineering), or something that requires lots of human context, then you can't really outsource it. Tasks with high intent density are the hardest to automate.

Cost - Big labs and companies have tonnes of budget, but the labs in particular have the advantage of running their own inference and even more advanced models internally. They have essentially an unlimited supply of tokens, so the employees/researchers naturally need to figure out the best automations or workflows. This can help them free up their time for more creativity or running multiple experiments. You might have heard of the tonnes of usage at OpenAI and the like by researchers. I presume they use models to read lots of new data (no prompt cache), babysit jobs, debug through traces and all, and run multiple of these since they must have multiple experiments going.

They are also rapidly working on the product side - take Anthropic for example. In their case, it's faster to build something and get feedback than to do all the steps before building to figure out PMF. Just test on prod, get feedback, and then double down on that (also yeah, the product taste of their product people matters, but you get my point).

Similarly, at bigger companies, heavily funded places, or more AI-native places, you can have a good token supply, so it makes sense to speed up development. Leadership will see everyone speeding up, so you might start facing pressure to speed up soon if you are not already facing it.

In Peter's case, I guess he was rich enough, too bored, and liked shipping more than the process of coding.

To summarise, if you combine all of these, it's a personal demand and supply equation fused with incentives. If you make more than you burn, then it's justifiable and makes sense to automate parts of your work and scale yourself.

Just don't be the type of person who spends most of their time automating their setup.
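
If you want to put numbers on "make more than you burn", a rough version looks like this. It is my own simplification with made-up figures: `automation_net_value` is a hypothetical helper, and it prices the time you spend on setup at the same rate as the time you save.

```python
def automation_net_value(hours_saved: float, hourly_value: float,
                         token_cost: float, setup_hours: float) -> float:
    """Value gained from automation minus what it burned.

    Assumption: setup time is priced at the same hourly value as saved time.
    """
    gained = hours_saved * hourly_value
    burned = token_cost + setup_hours * hourly_value
    return gained - burned


# Made-up numbers: value in dollars, time in hours.
print(automation_net_value(10, 50, 120, 2))  # 280: worth automating
print(automation_net_value(3, 50, 120, 6))   # -270: you mostly automated your setup
```

The second case is the person from the line above: the setup hours eat more than the automation ever gives back.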

I like a couple of quotes from Cat Wu on Lenny's Podcast about this:

Thriving in an AI-driven world — 1:07:49

I think AI gives everybody a ton more leverage than they used to. And so I would push you towards — anytime you realize that you're doing some manual task multiple times, think about how you can use Claude Code, Cowork or other AI tools to automate that for you. Most people have creative parts of their job that they absolutely love and then tedious parts of their job that they really hate doing. I think the beauty of AI is that it can do those tedious parts for you. It can learn from every time that you've done that manual task and generalize and then run it automatically, so that you can focus on the creative parts.

So I think my like immediate push for people is figure out the repetitive parts that you can pass to Claude. Iterate on those automations until the success rate is very high, and then focus on: okay, what more can you be doing for your team, for your product, for your company that people haven't had the bandwidth to pick up so far?... If AI can take care of the grunt work, then you have this extra 20% time now that you might not have before.

Underfunding forces automation — 24:35

if you have Claude, you can just use that to automate a lot of work... another principle is just encouraging people to go faster. So if you can do something today, you should just do it today... if you want to go faster, a really good way to do that is to just have Claude do more stuff.