The Orchestrator I Didn't Build

What I learned when I stopped telling agents what to do

@vieko

Chris Tate, a bona fide rockstar at Vercel, wrote “How to Get Out of Your Agent's Way” and it was an unlock for me.[1] One idea in particular rewired how I work with agents: define outcomes, not procedures.

His argument is blunt: over-instruction makes agents worse. Not slightly worse--more detail means more fragility. Give a destination, not directions. Then stop.

That last part is the hard part. Then stop. Most people don't. I didn't, for a while.

I was writing prompts that read like pseudocode. Create this file. Import this library. Add a route that accepts these parameters. Check the database. Return this shape. The agent would follow my plan exactly, which felt productive--until the plan had a gap. Then the agent would stop. Or worse, it would fill the gap with something that technically matched my instructions but missed the point entirely.

I was building Forge, a tool that turns specs into agent runs, when I read Tate's article. Planning phases. Multi-step coordination. Task pipelines. The kind of architecture everyone reaches for when they assume the hard problem is getting agents to follow complex instructions reliably.

Reading Tate made me delete half of it.

Not because orchestration is always wrong, but because it was the wrong default. I didn't need a system that told the agent what to do. I needed a system that could tell me, objectively, whether the outcome was met.

No orchestration framework. No multi-agent pipeline. No explicit planning phase.

The agent decides everything: how to break down the work, which files to touch, how to debug when something breaks, what approach to try next. My job is to define the destination clearly enough that “done” is unambiguous.
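
To make the shift concrete, here is roughly what it looks like side by side. Neither prompt is from a real project; the file names and acceptance criteria are invented for illustration.

  // Procedural: a plan disguised as a prompt. Any gap in my plan becomes the agent's gap.
  const proceduralPrompt = `
    Create src/routes/health.ts.
    Import the db client from src/lib/db.ts.
    Add a GET /health route that queries the database.
    Return { status: "ok" } as JSON on success.`;

  // Outcome: a destination plus an unambiguous definition of "done".
  // Which files, which route, which approach: all of that is the agent's call.
  const outcomePrompt = `
    Add a health check endpoint.
    Done means: it reports whether the database is reachable,
    the type checker and build pass, and the test suite (including
    a new test for the unreachable-database case) is green.`;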

Nondeterminism

Here's what surprised me: the process is completely nondeterministic.

Run the same outcome twice and you'll often get different routes to the same place. Different file organization. Different implementation choices. Different debugging strategies. Different ways of reasoning about the problem.

You'd think that would make it unreliable.

It doesn't--because the result is checked by the system, not by the agent's self-assessment.

The build passes or it doesn't. The tests run or they don't. The type checker is happy or it isn't.
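
In practice, the checks are commands the system runs, not questions the agent answers. A minimal sketch, assuming a Node/TypeScript project where pnpm typecheck, pnpm build, and pnpm test are the project's real checks; swap in whatever yours are.

  import { spawnSync } from "node:child_process";

  // External verification: run the project's own checks and trust only exit codes.
  // The agent's self-assessment never enters into it.
  function verifyExternally(): { ok: boolean; errors: string[] } {
    const checks: [string, string[]][] = [
      ["pnpm", ["typecheck"]],
      ["pnpm", ["build"]],
      ["pnpm", ["test"]],
    ];

    const errors: string[] = [];
    for (const [cmd, args] of checks) {
      const run = spawnSync(cmd, args, { encoding: "utf8" });
      if (run.status !== 0) {
        // Nonzero exit is a failure, regardless of what the agent claims.
        errors.push(`${cmd} ${args.join(" ")} failed:\n${run.stdout ?? ""}${run.stderr ?? ""}`);
      }
    }
    return { ok: errors.length === 0, errors };
  }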

The agent can wander however it wants, as long as it arrives at the destination.

Nondeterminism in the path. Determinism in the result.

The pattern

The outcome is the contract. The agent is the implementation. Verification is the proof.

  • Define the outcome--acceptance criteria, constraints, edge cases. What success means, not how to get there.
  • Let the agent work--one agent, one prompt, full autonomy. No choreography, no procedural scripts disguised as prompts.
  • Verify externally--typecheck, build, tests, CI. Not “the agent says it's done.” Verification is independent.
  • Feed back failures--give the agent the errors. Let it decide what to change. Retry a limited number of times.

You can swap the internals however you want--different agents, different models, different prompting styles, different tooling--but if you want predictable results from nondeterministic systems, that boundary matters.
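
Here is a minimal sketch of that boundary as a loop. runAgent and verify are placeholders, not real SDK calls; the point is the shape. The agent owns the path, verification owns the verdict, and failures go back in as plain errors.

  type Verification = { ok: boolean; errors: string[] };

  async function runOutcome(
    outcome: string,
    runAgent: (prompt: string) => Promise<void>,  // one agent, one prompt, full autonomy
    verify: () => Promise<Verification>,          // external checks: typecheck, build, tests
    maxRetries = 3,                               // the retry cap
  ): Promise<Verification> {
    let prompt = outcome;
    let result: Verification = { ok: false, errors: ["never verified"] };

    for (let attempt = 0; attempt <= maxRetries; attempt++) {
      await runAgent(prompt);     // the agent decides how: files, approach, debugging
      result = await verify();    // the system decides whether: pass or fail
      if (result.ok) return result;

      // Feed back failures: hand the raw errors to the agent and let it decide what to change.
      prompt = `${outcome}\n\nVerification failed:\n${result.errors.join("\n")}`;
    }
    return result;                // retry cap reached; surface the last failure
  }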

Why it works (and how it fails)

When this pattern breaks down, it's almost always for one of three reasons:

1) The “outcome” is secretly a procedure. If your prompt is a checklist of implementation steps, you've traded autonomy for compliance. You might still get working code, but you've prevented the agent from exploring better approaches--and you've made failure modes harder to recover from.

2) Verification is weak or fuzzy. If “verify” means eyeballing a diff, you don't have a loop--you have a review habit. The whole point is that the system can say “no” in a way the agent can't talk its way around.

3) You batch too much. One outcome at a time means more prompts but faster progress. When you stack ten outcomes into one request, failures compound and it becomes unclear which change broke what. Keep the unit small enough that verification provides real confidence.

Applied: Forge

Forge runs this pattern end-to-end.

Each outcome becomes a spec. Forge turns that spec into an outcome-focused prompt, makes a single Agent SDK call, and then runs system verification. Failures get fed back as errors. Success gets saved as an artifact.

Over time I added the practical necessities: a retry cap, per-spec cost tracking, a live display, and parallel execution via a worker pool. The pattern didn't change--only the ergonomics around it.
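
For context, here is a plausible shape for one of those specs. This is not Forge's actual format; the field names are mine, written down from the description above.

  interface OutcomeSpec {
    id: string;
    outcome: string;        // the destination, in plain language
    constraints: string[];  // e.g. "no new dependencies", "keep the public API stable"
    verify: string[];       // commands the system runs; all must exit 0
    maxRetries: number;     // retry cap before the spec is marked failed
    budgetUsd?: number;     // optional per-spec cost ceiling
  }

  const healthCheck: OutcomeSpec = {
    id: "health-endpoint",
    outcome: "Add a health check endpoint that reports whether the database is reachable.",
    constraints: ["no new dependencies"],
    verify: ["pnpm typecheck", "pnpm build", "pnpm test"],
    maxRetries: 3,
    budgetUsd: 2,
  };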

I set out to build an orchestrator. I ended up building a verification boundary.

Tate showed me that autonomy comes from outcomes. What surprised me was the next part: autonomy becomes predictable when you pair it with external verification. You don't need a system that forces the agent down a deterministic path. You need a system that makes “done” objective.

I stopped trying to tell agents what to do.

I started building verification boundaries.



Forge illustration by Jon Romero Ruiz.

[1] Chris Tate, “How to Get Out of Your Agent's Way”--patterns for autonomous agents, including “Define Outcomes, Not Procedures,” plus supporting ideas like sandboxing, explicit state, and cost-awareness.