
RAG assurance — source provenance, index governance, retrieval controls, grounding evaluation.

Retrieval-augmented generation has become the default production pattern for regulated AI. It also carries an evidence surface that classical model governance did not account for. This article describes the controls that make a RAG system defensible at examination: corpus lineage, index governance, retrieval controls, grounding evaluation.

Published 23 April 2026 · RegCore.AI

The corpus is the live risk surface.

In a classical model, the data-governance story ended at training and validation. In a RAG system, the story continues into every query. The retrieval corpus — whatever its representation, from a vector index to a structured search layer to a hybrid store — is a live artifact. Its contents change. Its source documents change. Its chunk boundaries are re-derived. The system’s behaviour changes with it, quietly and continuously. The supervisory question is no longer only whether the model is fit for purpose; it is whether the corpus is, on the day the question is asked. That shift is what makes retrieval the new governance surface.

Source provenance — the corpus ledger a regulator will read.

The first artifact every RAG programme has to produce is a source-level ledger for the corpus. Every document in the index traces to an origin: a controlled repository, an approved system-of-record, a licensed feed, a consented user input. The ledger records the acquisition date, the licence or consent basis, the owner, the retention posture, the redaction steps applied, and the chunking and embedding regime used to index it. The discipline is unspectacular. It is also the single artifact that lets a deployer respond to a deletion request, a licence challenge, a supervisory question about what the system is reading, and an incident that originates in the corpus rather than the model. The ledger survives reindexing, survives model replacement, survives vendor change. It is the spine.
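As an illustration, a single ledger entry can be as small as a typed record. The sketch below is in Python; the field names are illustrative rather than drawn from any standard, and the helper stands in for whatever query interface the programme actually exposes.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class CorpusLedgerEntry:
    """One row of the source-level corpus ledger. Field names are illustrative."""
    document_id: str                 # stable identifier that survives reindexing
    origin: str                      # controlled repository, system-of-record, licensed feed, consented input
    acquired_on: date                # acquisition date
    licence_basis: str               # licence or consent basis for inclusion
    owner: str                       # accountable owner of the source document
    retention: str                   # retention posture, e.g. "delete-on-request"
    redactions: tuple[str, ...] = () # redaction steps applied before indexing
    chunking_regime: str = ""        # e.g. "512-token chunks, 64-token overlap"
    embedding_model: str = ""        # embedding model and version used to index

def entries_from_origin(ledger: list[CorpusLedgerEntry], origin: str) -> list[CorpusLedgerEntry]:
    """Answer a deletion request or licence challenge by origin, without touching index internals."""
    return [e for e in ledger if e.origin == origin]
```

The direction of the query is the point: a deletion request or licence challenge is answered from the ledger, and the index is then updated from that answer, never the other way round.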

Index governance — the operational controls on the live store.

The vector store is a production system, and its operational posture has to match. Access is controlled; writes are audited; ingestion pipelines are identified and monitored. Reindexing is a change event that is planned, reviewed and recorded. Index metadata — embedding model version, chunk size, retrieval parameters — is tracked with the artifacts it produced, so that a reviewer can reconstruct why a particular retrieval behaved the way it did on a particular date. Every post-incident review we have seen eventually lands on a vector store that was treated as an unmonitored appliance; the deployer that treats it as a governed system keeps the programme auditable.
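A minimal way to make reindexing a recorded change event is an append-only log keyed by timestamp. The sketch below is an assumption about layout (JSON lines, illustrative field names), but the property that matters, reconstructing the index configuration in force on a given date, falls out of the append-only structure.

```python
import json
from datetime import datetime, timezone

def record_reindex(log_path: str, *, embedding_model: str, chunk_size: int,
                   retrieval_params: dict, reviewer: str, reason: str) -> None:
    """Append one reindex event to the change log. Reindexing is a change event, not a side effect."""
    event = {
        "recorded_at": datetime.now(timezone.utc).isoformat(),
        "embedding_model": embedding_model,    # pinned version, never "latest"
        "chunk_size": chunk_size,
        "retrieval_params": retrieval_params,  # e.g. top_k, similarity threshold
        "reviewer": reviewer,                  # who reviewed and approved the change
        "reason": reason,                      # why the reindex happened
    }
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(event) + "\n")

def index_state_on(log_path: str, as_of_iso: str) -> dict | None:
    """Reconstruct the index configuration in force at a given UTC instant.
    ISO 8601 timestamps in a single timezone compare correctly as strings."""
    state = None
    with open(log_path, encoding="utf-8") as f:
        for line in f:
            event = json.loads(line)
            if event["recorded_at"] <= as_of_iso:
                state = event  # last event at or before the instant wins
    return state
```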

Retrieval controls — the decisions made between query and context.

The retrieval layer is where most RAG-specific failure modes live. Poisoning — adversarial or malformed content inserted into the corpus. Drift — the shift in retrieval behaviour as the corpus evolves. Cross-tenant leakage — the inadvertent surfacing of content from one tenancy into another. Sensitivity bypass — the retrieval of content outside the sensitivity level a request was entitled to see. The controls that answer these are straightforward to name and non-trivial to operate: signed ingestion, access-control enforcement at the retrieval gate, sensitivity-aware routing, drift detection on retrieval distributions, and a kill-switch that can suspend retrieval while leaving the rest of the system running. The evidence these controls produce is the evidence a supervisor reads at examination.
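Two of those controls, access-control enforcement at the retrieval gate and the kill-switch, can be sketched as a filter that sits between the store and the prompt. The sensitivity labels and dataclass shapes below are illustrative, and a production system would enforce tenancy at the store as well, not only at the gate.

```python
from dataclasses import dataclass

# Illustrative sensitivity ordering; real deployments have their own classification scheme.
SENSITIVITY = {"public": 0, "internal": 1, "restricted": 2}

@dataclass
class Chunk:
    tenant_id: str
    sensitivity: str
    text: str

class RetrievalGate:
    """Filters candidate chunks between the store and the prompt."""
    def __init__(self) -> None:
        self.enabled = True  # kill-switch: suspends retrieval while the rest of the system runs

    def filter(self, tenant_id: str, clearance: str, candidates: list[Chunk]) -> list[Chunk]:
        if not self.enabled:
            return []  # retrieval suspended; the model answers without retrieved context
        ceiling = SENSITIVITY[clearance]
        return [
            c for c in candidates
            if c.tenant_id == tenant_id                 # no cross-tenant leakage
            and SENSITIVITY[c.sensitivity] <= ceiling   # no sensitivity bypass
        ]
```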

Grounding evaluation — the measurement that binds output to source.

The capability the deployer must be able to prove is grounding: the fidelity of the model’s output to the retrieved source. Grounding evaluation is not a one-off validation exercise; it is a continuous measurement loop that samples outputs, compares them to retrieved context, scores them for faithfulness, and routes failures for review. The methods are well documented: retrieval precision, context recall, citation accuracy, hallucination rate against held-out sets. The discipline is to treat grounding as a live metric rather than a release-gate check. Drift in grounding is often the first signal that the corpus has shifted, or the retrieval parameters have been mistuned, or the model has been updated in a way the deployer did not fully understand.
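The loop itself is small. The sketch below samples live traffic, scores each output against its retrieved context and routes low scorers for review; score_faithfulness is a placeholder for whatever scorer the programme uses (an NLI model, an LLM judge, a citation checker), and the sample rate and threshold are arbitrary assumptions.

```python
import random
from dataclasses import dataclass
from typing import Callable

@dataclass
class Interaction:
    query: str
    context: str  # the retrieved context the model was given
    output: str   # what the model produced

def grounding_sample(
    traffic: list[Interaction],
    score_faithfulness: Callable[[str, str], float],  # (output, context) -> score in [0, 1]
    sample_rate: float = 0.05,
    threshold: float = 0.8,
) -> dict:
    """Sample live traffic, score outputs against retrieved context, route failures for review."""
    sampled = [i for i in traffic if random.random() < sample_rate]
    scores, for_review = [], []
    for item in sampled:
        s = score_faithfulness(item.output, item.context)
        scores.append(s)
        if s < threshold:
            for_review.append(item)  # below threshold: route to human review
    mean = sum(scores) / len(scores) if scores else None
    return {"sampled": len(sampled), "mean_grounding": mean, "for_review": for_review}
```

The number to chart over time is mean_grounding: a sustained fall is the drift signal described above, and it tends to arrive before anyone notices the corpus has moved.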

The assurance file — one artifact, many reviewers.

The RAG-assurance file that comes out of the above is a small set of portable artifacts: a RAG manifest that declares the system architecture, a corpus provenance ledger, a grounding-evaluation report, and a change log for the index. This is the artifact set the programme has to be able to produce on demand against the EU AI Act’s post-market monitoring expectations, the NIST AI RMF’s Measure and Manage functions, OSFI E-23’s ongoing-monitoring clauses, and the disclosure posture the revised Product Liability Directive now implies. It is also the artifact set that supports an internal incident review, an auditor’s walk-through and an underwriter’s pre-bind evaluation.
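As a sketch, the manifest can be one declarative file that names the architecture and points at the other three artifacts. Every name and path below is hypothetical; the point is that a single small document binds the set together.

```python
import json

# A minimal RAG manifest. Structure and field names are illustrative,
# not drawn from any standard; system and model names are hypothetical.
manifest = {
    "system": "claims-assistant",
    "architecture": {
        "retrieval_store": "hybrid",                # vector index plus structured search
        "embedding_model": "embedder-v3 (pinned)",  # hypothetical, version-pinned
        "generator": "llm-v7 (pinned)",             # hypothetical, version-pinned
    },
    "artifacts": {
        "corpus_ledger": "artifacts/corpus_ledger.jsonl",
        "grounding_report": "artifacts/grounding_report.json",
        "index_change_log": "artifacts/index_changes.jsonl",
    },
}
print(json.dumps(manifest, indent=2))
```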

Our RAG assurance playbook walks through the four phases — corpus lineage, index governance, retrieval controls, grounding evaluation — and produces the artifacts the firm maintains as the system changes. Retrieval is where most regulated AI lives now. The governance has to match.

Make the retrieval layer defensible.

Corpus provenance, index governance, retrieval controls, grounding evaluation — the artifacts a supervisor, auditor and underwriter all read from the same file.