A Human-Gated, Audit-Logged Multi-Agent Pipeline

Context

A nonprofit animal shelter runs its operations from dozens of policy documents covering feeding, intake, and medical care. The library had grown organically over many years, and formatting and references had drifted away from a single standard, the way most working libraries do. The shelter wanted to start using AI tools as they became available. Those tools would read this library, so the quality of their output, and the shelter's ability to adopt new AI workflows, depends heavily on the quality of that source data. The library needed to be cleaned up and reviewed against regulations and industry code first, and the shelter did not have the staff hours to do that across its full library.

Approach

The work ran in three phases: plan, execute, and validate, with a human review gate between phases.

Planning broke the cleanup into stages, one kind of change per stage, covering formatting, citations, terminology, and the like. Any stage that would change the meaning or wording of a document was routed to a human reviewer before it ran.
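
The routing rule above can be sketched in a few lines. This is a minimal illustration, not the pipeline's actual code: the Stage class, the route function, and the choice of which stages are marked as altering wording are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One kind of change, applied across the library (hypothetical model)."""
    name: str
    alters_wording: bool  # True if the stage can change meaning or wording


def route(stages):
    """Split a plan into stages that may run directly and stages that
    must wait for a human reviewer before they run."""
    auto, gated = [], []
    for stage in stages:
        (gated if stage.alters_wording else auto).append(stage.name)
    return auto, gated


# Illustrative plan; the flags below are assumptions, not the real plan.
plan = [
    Stage("formatting", alters_wording=False),
    Stage("citations", alters_wording=False),
    Stage("terminology", alters_wording=True),
]

auto, gated = route(plan)
print("runs without review:", auto)
print("needs reviewer approval:", gated)
```

Keeping one kind of change per stage is what makes a single flag per stage sufficient: a reviewer approves or rejects a category of edit, not a mixed batch.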

Execution ran the approved stages across the library in parallel. Every change was written to an audit log before the file was touched. After the change was written, the file was read back and compared to what the stage was supposed to produce.

Validation put the original and the revised document side by side, showed the differences, and reported them to a reviewer. A document was not finished until a person signed off on it.
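
A side-by-side comparison of this kind can be approximated with Python's standard difflib module. The sketch below is one possible way to build a reviewer-facing change report; the change_report function and the sample policy text are invented for illustration and are not taken from the shelter's library.

```python
import difflib


def change_report(original: str, revised: str, name: str) -> str:
    """Return a unified diff of the two versions, or an empty string
    when nothing changed."""
    diff = difflib.unified_diff(
        original.splitlines(keepends=True),
        revised.splitlines(keepends=True),
        fromfile=f"{name} (original)",
        tofile=f"{name} (revised)",
    )
    return "".join(diff)


# Made-up sample text, used only to show the shape of the report.
original = "Feeding\nDogs are fed 2x daily.\n"
revised = "Feeding\nDogs are fed twice daily.\n"

report = change_report(original, revised, "feeding-policy")
print(report)
```

An empty report is a useful signal in its own right: it tells the reviewer that a stage made no change to that document.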

Design

The pipeline used Docling to parse each document into a structured representation that preserved its headings, tables, lists, and reading order. A normalized document came out of the pipeline with the same structure it went in with.

The audit log was written before any file was modified. Each file was read back after the write and checked against the intended change. If a run was interrupted, the log was the record of what had been planned and what had completed. At the end of the engagement, the log matched what was on disk.
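
The log-first, read-back pattern described above might look roughly like the following. This is a sketch under stated assumptions: the JSON-lines log format, the event names ("planned", "completed", "mismatch"), and the use of a SHA-256 hash for the comparison are all choices made for the example, not details confirmed by the engagement.

```python
import hashlib
import json
import os
import tempfile
from pathlib import Path


def sha256(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def append_entry(log_path: Path, entry: dict) -> None:
    """Append one JSON line and force it to disk before returning."""
    with log_path.open("a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")
        log.flush()
        os.fsync(log.fileno())


def apply_change(log_path: Path, doc_path: Path, new_text: str, stage: str) -> bool:
    expected = sha256(new_text)
    base = {"doc": doc_path.name, "stage": stage, "sha256": expected}
    # 1. Record the intended change before the file is touched.
    append_entry(log_path, {"event": "planned", **base})
    # 2. Write the change.
    doc_path.write_text(new_text, encoding="utf-8")
    # 3. Read the file back and compare it to what was intended.
    verified = sha256(doc_path.read_text(encoding="utf-8")) == expected
    append_entry(log_path, {"event": "completed" if verified else "mismatch", **base})
    return verified


# Demonstration in a temporary directory with a made-up document.
with tempfile.TemporaryDirectory() as tmp:
    log_path = Path(tmp) / "audit.jsonl"
    doc_path = Path(tmp) / "intake.txt"
    doc_path.write_text("Intake  procedure\n", encoding="utf-8")
    ok = apply_change(log_path, doc_path, "Intake procedure\n", "formatting")
    events = [json.loads(line)["event"]
              for line in log_path.read_text(encoding="utf-8").splitlines()]

print(ok, events)
```

Because the "planned" entry is flushed to disk before the write, a crash between steps 1 and 3 leaves a planned entry with no matching completion, which is exactly the case an interrupted run needs to detect.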

Human review sat on every change that altered wording. The pipeline produced the draft. A person approved or rejected it. Nothing in the library was modified without that approval. The shelter kept control of its own documents.
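
As a rough illustration of that rule, the gate can be modeled as a function that returns the proposed text only when a reviewer has explicitly approved it. The Draft class, the decision values, and the sample text are hypothetical names and data chosen for this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Draft:
    doc: str
    current_text: str
    proposed_text: str
    # "approved", "rejected", or None while the review is still pending.
    decision: Optional[str] = None


def resolve(draft: Draft) -> str:
    """Return the text that belongs in the library for this draft.
    Anything other than an explicit approval keeps the current text."""
    if draft.decision == "approved":
        return draft.proposed_text
    return draft.current_text


# Made-up sample wording, used only to exercise the gate.
draft = Draft("intake.txt", "Hold strays 3 days.", "Hold stray animals for three days.")
pending_text = resolve(draft)    # no decision yet: unchanged
draft.decision = "approved"
approved_text = resolve(draft)   # reviewer approved: proposed wording

print(pending_text)
print(approved_text)
```

Defaulting to the current text means a missing or unrecognized decision can never result in a modification, which is the safe failure mode for a gate of this kind.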

Documents moved through the pipeline independently, in parallel. A consolidation pass at the end handled findings that cut across multiple files. The library advanced as a set instead of one file at a time, which shortened the calendar and produced consistent results across the whole collection.
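
One way to picture the per-document fan-out followed by a consolidation pass is sketched below, using a thread pool from the standard library. The process and consolidate functions, the "Sec. N" citation pattern, and the three sample documents are all assumptions made for illustration; the real pipeline's stages and findings are not specified here.

```python
import re
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

SECTION = re.compile(r"Sec\. (\d+)")  # hypothetical citation pattern


def process(item):
    """Handle one document independently: normalize whitespace and
    collect the code sections it cites."""
    name, text = item
    normalized = " ".join(text.split())
    return name, normalized, set(SECTION.findall(normalized))


def consolidate(results):
    """Cross-file pass: report sections cited by more than one document."""
    by_section = defaultdict(list)
    for name, _, sections in results:
        for section in sections:
            by_section[section].append(name)
    return {s: sorted(names) for s, names in by_section.items() if len(names) > 1}


# Made-up library contents.
library = {
    "feeding.txt": "See  Sec. 12 for storage.",
    "intake.txt": "Quarantine per Sec. 12 and Sec. 30.",
    "medical.txt": "Records per Sec. 30.",
}

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(process, library.items()))

shared = consolidate(results)
print(shared)
```

The per-document step needs no knowledge of other files, which is what makes it safe to run in parallel; only the consolidation pass sees the whole set.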

Outcome

The shelter received a normalized, compliance-reviewed policy library synced to the shared drive its staff works from. Each document came with a change report for the operations manager and a set of compliance notes tying the procedure to the relevant sections of the state administrative code, with flags on language worth updating. A full audit log accompanied the delivery.

The cleaned library is now the foundation for the next phases of AI adoption at the shelter, including AI-assisted asset development and an AI knowledge base for volunteers and staff.

What this demonstrates

These platform mechanics apply to any production workload where autonomous agents make changes you have to stand behind.

Cost-bounded parallel orchestration. Documents moved through the pipeline independently and in parallel, against a fixed work plan that capped the scope of each stage. Parallelism shortened the calendar without letting the run expand into open-ended work.
Governance gates as policy. Any stage that altered wording was routed to a human reviewer before it ran. The pipeline enforced the gate: it deferred the action to a person and waited for an approve-or-reject decision before proceeding.
The audit log as source of truth. Every change was written to the log before the file was touched, and each file was read back and checked against the intended change after the write. At the end of the engagement the log matched what was on disk, so the record, not the agent's report, was the authority on what happened.
Durable resume after interruption. Because the log recorded what had been planned and what had completed, an interrupted run could pick up from its checkpoints instead of starting over. That is durable execution: checkpointed resume that survives a process dying mid-run.
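
To make the resume idea concrete, the sketch below replays a log and returns the work that was planned but never completed. The JSON-lines format, the "planned" and "completed" event names, and the sample entries are assumptions made for this example, not the pipeline's actual log schema.

```python
import json


def pending_work(log_lines):
    """Replay the log and return (doc, stage) pairs that were planned
    but have no matching completion, in the order they were planned."""
    planned, completed = [], set()
    for line in log_lines:
        entry = json.loads(line)
        key = (entry["doc"], entry["stage"])
        if entry["event"] == "planned":
            if key not in planned:
                planned.append(key)
        elif entry["event"] == "completed":
            completed.add(key)
    return [key for key in planned if key not in completed]


# Made-up log from a run that died after planning the second change.
log_lines = [
    '{"event": "planned", "doc": "feeding.txt", "stage": "formatting"}',
    '{"event": "completed", "doc": "feeding.txt", "stage": "formatting"}',
    '{"event": "planned", "doc": "intake.txt", "stage": "formatting"}',
]

todo = pending_work(log_lines)
print(todo)
```

A restarted run would redo only the entries in todo, re-verifying each file against its intended change instead of trusting whatever the interrupted process left on disk.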