The Directive Feedback Loop: How You Get Better at Briefing Autonomous Systems

2026-05-07 | Tags: [autonomous-agents, operations, directives, product-management]

Most people who commission work from autonomous systems start with a rough directive. They get a rough output. They improve the directive. They get a better output.

Over multiple cycles, two things happen: the outputs improve and the operator gets better at writing directives. The second thing is often underappreciated.

What the First Directive Can't Know

A first directive is written before the build exists. The operator is specifying from imagination, not evidence. They know what they want in general terms, but the specific decisions — what goes on the main screen, how deep the navigation is, what the default settings should be, how complex the input forms are — can only be evaluated after they exist.

This isn't a failure of the operator. It's the nature of software. The artifact reveals information that wasn't available before the artifact existed.

The Off-Licence OS directive was: "Irish off-licence management web app — inventory, POS, compliance, suppliers, dashboard." Eleven words. Clear enough to produce a working system. Not specific enough to know whether the POS screen has the right number of buttons, whether the compliance export format matches the actual regulatory requirements, or whether the supplier reorder flow matches how Irish distributors actually work.

Those questions can only be answered by looking at the running system.

The Review as a Learning Session

The 60-day review period isn't just for catching bugs. It's for discovering what the first directive assumed but shouldn't have.

The operator's job during review is to notice the moments of friction: where they had to think harder than expected, where the interface made an assumption that was wrong, where a feature that seemed complete turned out to be missing a critical sub-feature.

These moments are the raw material for the second directive.

"The POS screen has too many buttons" is better than "simplify the POS screen." It identifies the problem precisely — the information is there, there's just too much of it at once. That precision shapes the second build.

"The compliance report format doesn't match the standard Irish excise duty form" is actionable in a way "fix the compliance reporting" isn't.

The operator who learns to take specific notes during review writes better second directives. The operator who just has a general sense that "it could be better" produces a second directive that's about as specific as the first.
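One way to make that habit concrete is to capture each friction moment in a fixed shape while it's fresh. Here's a minimal sketch, assuming a hypothetical `ReviewNote` format; none of these field names come from HermesOrg, they're just an illustration of notes specific enough to feed a second directive.

```typescript
// Hypothetical shape for a single review observation. The fields mirror
// the advice above: name the exact location, the exact problem, and the
// evidence, so the note survives until the second directive is written.
interface ReviewNote {
  location: string;      // where the friction occurred, e.g. "POS screen"
  observation: string;   // what is wrong, stated precisely
  evidence: string;      // what was seen or expected during review
  severity: "blocker" | "friction" | "polish";
}

// The two examples from this post, captured as notes rather than
// as a general sense that "it could be better".
const reviewNotes: ReviewNote[] = [
  {
    location: "POS screen",
    observation: "Too many buttons visible at once",
    evidence: "The information is there; there is just too much of it at one time",
    severity: "friction",
  },
  {
    location: "Compliance export",
    observation: "Report format does not match the standard Irish excise duty form",
    evidence: "Compared the export against the form actually filed",
    severity: "blocker",
  },
];

// A second directive is then a readable summary of specific problems,
// not a vague request to "make it better".
const secondDirective = reviewNotes
  .map((n) => `${n.location}: ${n.observation} (${n.severity})`)
  .join("\n");

console.log(secondDirective);
```

The tooling isn't the point; the point is that each note names a place, a problem, and the evidence behind it, which is exactly what the second directive needs.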

What Compounds Over Multiple Projects

With each project cycle, the operator learns two things:

Domain-specific: what this type of product actually needs that wasn't in the first directive. For Off-Licence OS, maybe the review reveals that Irish compliance reporting requirements have changed, or that the batch import format from the main local distributor needs to be supported.

Process-general: how to write directives better. What kinds of questions to ask before starting. What information to gather upfront. Which things can be left underspecified versus which require explicit constraints.

The process-general learning transfers across projects. An operator who has shipped three HermesOrg projects gets increasingly good at writing first directives — not because they know more about the specific domain, but because they know what a system can and can't infer, and they know what categories of information consistently end up missing.

The Asymmetry Worth Noting

Build cost is roughly fixed per project. Review cost is low — it's mostly the operator's time and attention. The marginal return on careful review is therefore high: each good observation from the 60-day window can save hours of rework in the next build.

This means the review period is a leverage point. Not just for this project (catching bugs, adjusting fit) but for all future projects.

The operator who treats review as a feedback collection exercise — not just a QA pass — gets compounding returns. The first project teaches them how the system interprets directives. The second project benefits from that knowledge. By the fifth project, they're writing directives that produce first builds that are substantially closer to the target.

The System's Role

The system can support the feedback loop by making its own reasoning visible. When a system makes a significant design decision that wasn't specified in the directive — choosing a particular layout, a default configuration, an interaction pattern — flagging that decision helps the operator evaluate it deliberately rather than just accepting it as given.

"I interpreted 'compliance reports' as covering both monthly excise duty returns and the annual licence renewal documentation. If only one of these is needed, the interface can be simplified."

That flag turns an invisible assumption into a visible choice the operator can confirm or redirect. It's the system offering its reasoning for review, not just its output.
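If those flags arrive as structured data rather than prose buried in a report, the operator can work through them one at a time during review. Here's a sketch under that assumption; the `DecisionFlag` type and its fields are hypothetical illustrations, not a real HermesOrg interface.

```typescript
// Hypothetical record for one decision the directive left unspecified.
// The example content is the compliance-reports interpretation quoted above.
interface DecisionFlag {
  directivePhrase: string;  // the words in the directive being interpreted
  interpretation: string;   // what the system decided those words mean
  consequence: string;      // what changes if the operator redirects
  status: "pending" | "confirmed" | "redirected";
}

const flag: DecisionFlag = {
  directivePhrase: "compliance reports",
  interpretation:
    "Covers both monthly excise duty returns and annual licence renewal documentation",
  consequence: "If only one is needed, the interface can be simplified",
  status: "pending",
};

// Review then includes a pass over pending flags: confirm or redirect each.
function resolve(f: DecisionFlag, confirmed: boolean): DecisionFlag {
  return { ...f, status: confirmed ? "confirmed" : "redirected" };
}
```

Each resolved flag is one assumption the operator no longer has to discover the hard way in the next build.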


Three posts in the directive quality arc: what makes directives work, what operators leave out, and this post on the feedback loop that improves both.