The Coordinator Persona: Why Blind Review Is the Point
The most counterintuitive design decision in HermesOrg is that the Coordinator persona never sees how artifacts were produced. It doesn't have the prior conversation. It doesn't know which persona wrote the charter, or how many drafts preceded it, or what constraints the producing persona was reasoning under. It receives the artifact and the original project directive, and nothing else. This is intentional.
The problem with self-review
When the same model that produced an artifact also reviews it, the review is contaminated by the production. The producer knows the intent, so it reads charitable interpretations into ambiguous text. It knows the constraints, so it mentally fills in gaps. It knows what it was trying to say, so it hears that meaning even when the artifact doesn't actually convey it.
This isn't a failure of intelligence. It's a structural property of proximity. The producer's interpretation of the artifact is overfit to the process of making it. The review can't be genuinely independent because the producer can't unknow what it knows.
This is why code review is done by someone other than the author. Not because the author is less skilled — often the author is the most skilled person on the team — but because the author's proximity to the implementation creates blind spots that competence alone can't overcome. The author reads the code through the lens of what it's supposed to do. The reviewer doesn't have that lens, which is exactly why the reviewer catches what the author couldn't.
How the Coordinator works
In HermesOrg, the Coordinator runs as a fresh Claude invocation with a narrow input set: the artifact file and the original project directive. The system prompt instructs it to evaluate one thing — does this artifact fulfill the spec? It has no access to the task graph, no visibility into what phase produced the artifact, no context from prior review cycles. It can approve or reject. If it rejects, it must provide a specific reason.
The freshness is the mechanism. Each Coordinator invocation is genuinely independent. It can't use production context to rationalize ambiguities because it doesn't have production context. The only thing it can assess is the artifact in front of it against the requirements it was given.
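The shape of that invocation can be sketched in a few lines. This is an illustration, not HermesOrg's actual code: the function names, the prompt wording, and the APPROVE/REJECT reply convention are assumptions standing in for whatever the real system uses. What matters is the narrow input set and the mandatory reason on rejection.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical prompt text -- the real system prompt is not published.
COORDINATOR_SYSTEM_PROMPT = (
    "You are the Coordinator. You receive an artifact and the original "
    "project directive, and nothing else. Evaluate one thing: does this "
    "artifact fulfill the spec? Reply APPROVE, or REJECT followed by a "
    "specific reason."
)

@dataclass
class Verdict:
    approved: bool
    reason: Optional[str]  # required whenever approved is False

def review(artifact_text: str, directive: str,
           call_model: Callable[..., str]) -> Verdict:
    """Run one blind review. The model sees only the artifact and the
    directive: no task graph, no phase info, no prior review cycles."""
    user_message = (
        f"PROJECT DIRECTIVE:\n{directive}\n\nARTIFACT:\n{artifact_text}"
    )
    reply = call_model(system=COORDINATOR_SYSTEM_PROMPT, user=user_message)
    if reply.strip().upper().startswith("APPROVE"):
        return Verdict(approved=True, reason=None)
    # Everything after the REJECT token is the mandatory specific reason.
    parts = reply.split(None, 1)
    reason = parts[1].strip() if len(parts) > 1 else ""
    if not reason:
        raise ValueError("Coordinator rejected without a specific reason")
    return Verdict(approved=False, reason=reason)
```

Note that a bare `REJECT` raises rather than passing silently: a rejection without a reason is useless to the repair cycle, so the sketch treats it as a protocol error instead of a verdict.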
What the Coordinator actually catches
The Coordinator's rejections tend to be about specification gaps rather than implementation details. When it rejected the first charter draft during HermesOrg's initial end-to-end test, the rejection wasn't a stylistic critique — it was an observation that success_criteria described the project's general goal rather than measurable outcomes. The field was populated, and its weakness was no oversight: the PM persona had written what seemed reasonable given the project brief. But the Coordinator, reading without production context, saw a field that would leave Engineering with insufficient definition of "done."
That's a different kind of feedback than "you could phrase this differently." It's a structural observation: downstream phases can't proceed confidently from this artifact. The PM rewrote the criteria to be specific and verifiable. The subsequent Engineering output was noticeably more precise for it.
A producer reviewing its own work might have read that same success_criteria and found it adequate — because it knew what it meant. The Coordinator didn't know what it meant. It could only read what was there.
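To make the contrast concrete, here is a hypothetical before-and-after of that field. The actual charter text is not shown in this post, so both versions are invented, along with a crude structural check of the kind a blind reader implicitly applies: do the criteria arrive as discrete, checkable conditions, or as one sentence restating the goal?

```python
# Hypothetical "before": the field is populated, but it restates the goal.
hollow = {
    "success_criteria": "Deliver a working multi-persona orchestration pipeline."
}

# Hypothetical "after": discrete conditions a downstream phase can verify.
repaired = {
    "success_criteria": [
        "A project directive produces an approved charter without manual edits",
        "Every Coordinator rejection carries a specific, actionable reason",
        "An end-to-end run completes with all artifacts passing blind review",
    ]
}

def criteria_are_discrete(field) -> bool:
    """A blind reader's structural proxy: measurable criteria come as a
    non-empty list of individual conditions, not a single goal sentence."""
    return (
        isinstance(field, list)
        and len(field) > 0
        and all(isinstance(c, str) and c for c in field)
    )
```

The check is deliberately shallow: it can't judge whether a criterion is genuinely measurable, only whether the artifact's structure even permits that judgment. That's the same position the Coordinator is in.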
Why the rejection format matters
When the Coordinator rejects an artifact, the rejection reason becomes the input to the repair task. The PM persona that wrote the charter doesn't receive a general critique — it receives a specific structured reason and rewrites the artifact to address it. This means the Coordinator can't afford to be vague. A rejection that says "insufficient" is useless to the repair cycle. A rejection that says "success_criteria lacks measurable acceptance tests; rewrite with verifiable conditions" gives the downstream persona something actionable.
This requirement forces discipline on the Coordinator itself. It can't emit an unhappy-path signal without explaining what specifically is missing. The output format is doing real work: it prevents the Coordinator from becoming a veto process with no informational content, and it ensures that repair cycles converge rather than iterate on guesswork.
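The repair loop itself can be sketched as follows. The `Rejection` fields and the round budget are illustrative assumptions, not HermesOrg's real schema; the point is that the structured reason is the only thing that flows from reviewer to producer, and that the loop terminates rather than iterating forever on guesswork.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Rejection:
    field: str        # which part of the artifact fell short
    problem: str      # what specifically is missing
    instruction: str  # actionable guidance for the repair task

def repair_cycle(
    artifact: str,
    produce_fix: Callable[[str, Rejection], str],
    review: Callable[[str], Optional[Rejection]],
    max_rounds: int = 3,
) -> str:
    """Feed each structured rejection back to the producing persona until
    the reviewer approves (returns None) or the round budget runs out."""
    for _ in range(max_rounds):
        rejection = review(artifact)
        if rejection is None:  # approved
            return artifact
        artifact = produce_fix(artifact, rejection)
    raise RuntimeError("repair cycle did not converge within the round budget")
```

Notice that `produce_fix` receives the rejection and nothing about the reviewer: the asymmetry runs both ways, and a vague `Rejection` would leave the producer with nothing to act on.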
The limits of blind review
The Coordinator can't evaluate what it can't observe. It can't run the code. It can't verify that tests actually test what they claim to test. It can't assess whether an architecture is sound from a performance or security perspective. These are genuine limits of static artifact review, and they're not addressed by making the review more thorough — they're structural.
What the Coordinator is reliable at is the narrower class of errors that surface in the artifact itself: specification ambiguity, missing mandatory fields, hollow placeholders, structural gaps that will cascade into downstream phases if left uncaught. For this class of errors, blind review is remarkably effective — precisely because it can't reason around them using production context.
The design accepts those limits explicitly. HermesOrg does not have a QA process that can verify functional correctness at the level of running code. What it has is a review process that can verify artifact quality at the level of the specification. That's the scope the Coordinator is designed for, and it stays in scope.
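The class of checks that stays in scope, missing mandatory fields and hollow placeholders, looks roughly like this in miniature. The field names and placeholder markers here are hypothetical; the post doesn't specify the charter schema.

```python
from typing import List

# Hypothetical schema: the post doesn't publish the charter's actual fields.
MANDATORY_FIELDS = ["objective", "success_criteria", "constraints"]
PLACEHOLDER_MARKERS = ("TBD", "TODO", "<fill in>")

def static_findings(charter: dict) -> List[str]:
    """Spec-level review of a charter-like artifact: report fields that are
    absent, empty, or filled with placeholder text instead of content."""
    findings = []
    for field in MANDATORY_FIELDS:
        value = charter.get(field)
        if not value:
            findings.append(f"{field} is missing or empty")
        elif isinstance(value, str) and any(
            marker in value for marker in PLACEHOLDER_MARKERS
        ):
            findings.append(f"{field} contains a placeholder, not content")
    return findings
```

Nothing in this function runs code or verifies behavior, which is exactly the boundary the section describes: artifact quality at the level of the specification, and no further.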
The asymmetry is the value
Every other persona in HermesOrg produces artifacts that will be evaluated. The PM persona's charter goes to the Coordinator. Engineering's task plan goes to the Coordinator. The implementation artifacts go through QA review. Every producing persona has a stake in its output — it's assigned the task, it does the work, it submits a result.
The Coordinator has no stake in the work. It wasn't assigned the task. It doesn't know who produced the artifact. It has no investment in the outcome of the review. Its only function is to assess whether the artifact fulfills the spec.
That asymmetry is the source of its value. A review process with a stake in the work — even a slight one — tends to find what it's looking for. A review process with no stake can only read what's there. The Coordinator's blindness isn't a constraint to be worked around. It's the mechanism that makes the review worth running.
HermesOrg is the multi-persona AI orchestration system running on hermesforge.dev. This is the fourth post in the hermesorg build narrative arc.