Playbook · April 2026

The OSFI E-23 Vendor Playbook: How fintechs answer Big 5 bank AI questionnaires.

When a Big 5 bank evaluates your AI product under OSFI Guideline E-23, the vendor questionnaire arrives with specific asks: Model documentation. Validation evidence. Monitoring. Governance. Most fintechs are not ready. This playbook shows you what the banks are asking, what artifacts satisfy the ask, and how the governance architecture maps to each question.

12-minute read · Updated April 2026 · Practitioner-authored
Context

What OSFI E-23 requires from your bank client.

OSFI Guideline E-23 is the operationalized successor to E-13 and becomes enforceable on May 1, 2027. It requires that federally regulated financial institutions (FRFIs) — Canada’s Big 5 banks among them — extend their Model Risk Management programs to every material model in use, including models operated on their behalf by third-party vendors. The definition of “model” is explicitly broad: statistical models, machine-learning models, and AI systems that drive or inform material financial decisions are all in scope.

In practice that means a documented model inventory, structured model documentation, independent validation, ongoing monitoring, and Board-level governance reporting against every material model. The obligation does not stop at the bank’s perimeter. Material vendor AI models are brought inside the bank’s E-23 governance program through vendor risk management — the bank cannot outsource its model-risk obligation to procurement. Every assurance a bank gives OSFI about a third-party model rests on evidence the vendor produces, maintains, and makes legible to the bank’s second-line-of-defence (2LOD) team.

The scope cascades explicitly to vendor AI models used in credit decisioning, fraud detection, anti-money-laundering triage, suitability assessment, claims processing, and regulated customer communication. If your product sits in any of those paths inside a Big 5 bank, you are inside the E-23 perimeter — whether or not OSFI regulates you directly. The 2027 enforcement date is widely misread as a 2027 problem; in practice, banks are already evaluating AI vendor onboarding decisions against the evidence posture they expect to hold on day one of enforcement, plus a margin for remediation. The practical procurement window closed before the regulatory one.

The vendor questionnaire is the artifact through which this cascade arrives at your inbox. It is not a generic procurement form. It is structured so that the bank’s 2LOD team can lift your responses directly into the bank’s own model risk inventory with minimal restructuring. The rest of this playbook maps the questionnaire to the artifacts that satisfy it. See the OSFI E-23 regulation page for the full timeline and obligation map, and the underlying cascade analysis for the procurement-cycle implications.

The Questionnaire

Ten questions every bank vendor-risk team will ask — and what satisfies each.

These questions are not aspirational. They appear in actual 2025–2026 Big 5 vendor assessments reviewed and responded to by our practice.

1. What is the intended use of the model, and what material decisions does it inform?
   Satisfied by: Model Card — intended use, decision type, and materiality classification, stated in a form the bank’s 2LOD can drop into its own model inventory.

2. What is the data provenance of training data, and how is quality assured?
   Satisfied by: Data lineage documentation plus a Data Quality Attestation covering sources, transformations, sampling, and refresh cadence.

3. How is the model validated independently from the development team?
   Satisfied by: Independent Validation Report conforming to the SR 11-7 “effective challenge” standard — benchmarks, sensitivity analysis, outcomes analysis, and validator sign-off.

4. What are the known limitations and failure modes?
   Satisfied by: Model Card “Known Limitations” section plus an AI Incident Response plan that classifies, escalates, and notifies by severity.

5. What monitoring program is in place, and who reviews it?
   Satisfied by: Monitoring Program with specific metrics, thresholds, cadence, a named 2LOD reviewer, and captured evidence that escalations actually occurred when triggered.

6. How are human oversight checkpoints implemented at decision time?
   Satisfied by: HITL Gate architecture with pending_approval state documentation — the gate blocks commit; it does not review after the fact.

7. How is model access controlled, and how are changes tracked?
   Satisfied by: Deployment Readiness Gate plus a CI/CD audit trail binding every model change to an identified approver and a validation artifact.

8. What happens when the model is retired or replaced?
   Satisfied by: Model Decommissioning Plan and data retention schedule covering shutoff criteria, shadow-running, and disposition of production data.

9. How is bias evaluated, and what fairness testing was performed?
   Satisfied by: Bias Assessment Report plus Agent Card (constraints, escalation logic) and, where applicable, adverse-action explanation coverage.

10. What governance structure owns this model post-deployment?
    Satisfied by: Governance Operating Model plus an AIRSA use case inventory record naming the owner, reviewer, inherent risk rating, and next review date.

The questionnaire is an evidence request, not an essay prompt. A response framed in marketing language — “we take governance seriously,” “robust monitoring,” “enterprise-grade controls” — is read by a 2LOD reviewer as a null answer. The six artifacts below are the answer set. Every question on the list above resolves to one or more of them.
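Because every question resolves to one or more artifacts, coverage can be checked mechanically before submission. A minimal sketch in Python — the question numbers and artifact names follow the list above; the mapping and the `missing_artifacts` helper are illustrative, not a bank-supplied schema:

```python
# Map each questionnaire item to the artifacts that answer it
# (numbers and artifact names follow the list above; illustrative only).
QUESTION_TO_ARTIFACTS = {
    1: ["Model Card"],
    2: ["Model Card", "Data Quality Attestation"],
    3: ["Independent Validation Report"],
    4: ["Model Card", "AI Incident Response Plan"],
    5: ["Monitoring Program"],
    6: ["HITL Gate Architecture"],
    7: ["Deployment Readiness Gate", "CI/CD Audit Trail"],
    8: ["Model Decommissioning Plan"],
    9: ["Bias Assessment Report", "Agent Card"],
    10: ["Governance Operating Model", "AIRSA Record"],
}

def missing_artifacts(available: set[str]) -> dict[int, list[str]]:
    """Return, per question, the artifacts still missing from a package."""
    gaps: dict[int, list[str]] = {}
    for question, needed in QUESTION_TO_ARTIFACTS.items():
        absent = [name for name in needed if name not in available]
        if absent:
            gaps[question] = absent
    return gaps
```

Running this against the artifact set you actually hold turns "are we ready for the questionnaire?" into a list of named gaps per question.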

The Artifacts

Six artifacts that answer every vendor-risk question.

Each artifact is structured evidence a regulator or 2LOD reviewer can parse, diff, and cite — not a one-off document deliverable.

Artifact 01

Model Card

Intended use, training data, validation, performance, known limitations, and monitoring thresholds. Structured to the Google / Partnership on AI format but adapted for OSFI E-23 evidence production. Maintained under version control alongside the code that implements the model, so that every production deployment has a Model Card tied to its commit.

Addresses questions 1, 2, 3, 4
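Keeping the Model Card under version control next to the model code makes completeness checkable in CI. A minimal sketch, assuming the card is stored as structured data — the section list and `validate_model_card` helper are illustrative, adapted from the description above, not a prescribed E-23 schema:

```python
import re

# Required Model Card sections in this sketch (an assumption based on
# the artifact description above, not OSFI-mandated field names).
REQUIRED_SECTIONS = [
    "intended_use", "training_data", "validation",
    "performance", "known_limitations", "monitoring_thresholds",
]

def validate_model_card(card: dict) -> list[str]:
    """Return a list of problems; an empty list means the card is complete."""
    problems = [f"missing section: {s}"
                for s in REQUIRED_SECTIONS if not card.get(s)]
    # Every deployed card must be pinned to the commit that built the model,
    # so each production deployment has a Model Card tied to its commit.
    commit = card.get("source_commit", "")
    if not re.fullmatch(r"[0-9a-f]{7,40}", commit):
        problems.append("source_commit must be a git commit hash")
    return problems
```

Wired into the deployment pipeline, a non-empty problem list fails the build — which is what keeps the card from drifting away from the model it describes.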
Artifact 02

Agent Card

Agent rationale documentation — constraints, escalation logic, decision boundaries, tool scopes, and fairness considerations for agentic AI. A new artifact class that sits above the Model Card and describes how the agent actually uses models in production. Required for any customer-facing or decision-bearing agentic system.

Addresses questions 4, 6, 9
Artifact 03

HITL Gate Architecture

pending_approval state documentation covering who gates, when, under what conditions, with what audit trail. Satisfies E-23’s human oversight requirements by blocking commit, not by reviewing afterward. Implemented as code paths — not as a policy statement referencing a review dashboard.

Addresses questions 6, 7
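The blocking property is the whole point: commit must be unreachable until an identified human approves. A minimal state-machine sketch of the pattern — the state names other than pending_approval, and the `Decision`/`approve`/`commit` helpers, are illustrative:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class Decision:
    """A model-driven decision that must pass the gate before it commits."""
    payload: dict
    state: str = "pending_approval"        # the gate's blocking state
    audit_trail: list = field(default_factory=list)

def approve(decision: Decision, reviewer: str) -> None:
    """Record who gated, and when, then unblock the decision."""
    decision.audit_trail.append({
        "event": "approved",
        "by": reviewer,
        "at": datetime.now(timezone.utc).isoformat(),
    })
    decision.state = "approved"

def commit(decision: Decision) -> dict:
    """Commit is only reachable from 'approved' — the gate blocks;
    it does not review after the fact."""
    if decision.state != "approved":
        raise PermissionError(f"cannot commit from state {decision.state!r}")
    decision.state = "committed"
    return decision.payload
```

A policy statement can claim this behavior; a code path like this demonstrates it in the walkthrough a 2LOD reviewer will ask for.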
Artifact 04

Independent Validation Report

Second-line-of-defence validation per SR 11-7’s “effective challenge” standard — benchmarking, sensitivity analysis, outcomes analysis, and documented challenge of the development team’s assumptions. Written to establish that the validation was genuine and independent, not a self-attestation.

Addresses questions 3, 5
Artifact 05

AIRSA Record

AI Use Case Inventory entry carrying governance status, inherent risk rating, residual risk rating, controls mapping, named owner, named reviewer, and next review date. The single source of truth the bank’s 2LOD team can lift into its own inventory during assessment.

Addresses question 10
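As structured data, an AIRSA entry can also drive the review cadence automatically. A minimal sketch — the field names and the `AIRSARecord` class are illustrative, chosen to mirror the fields listed above:

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class AIRSARecord:
    """One AI use-case inventory entry (field names are illustrative)."""
    use_case: str
    owner: str
    reviewer: str
    inherent_risk: str         # e.g. "low" / "medium" / "high"
    residual_risk: str
    controls: list[str]
    next_review: date

    def is_review_overdue(self, today: date) -> bool:
        """An overdue review is exactly the drift a 2LOD reviewer looks for."""
        return today > self.next_review
```

A nightly job over the inventory that flags overdue records is a cheap way to keep "next review date" an operational fact rather than a stale field.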
Artifact 06

Governance Operating Model

Cadence of reviews, escalation paths, Board reporting, regulatory intelligence, and incident response. The ongoing motion after initial deployment — the thing that keeps every other artifact fresh and defensible. Without it, the other five artifacts drift within a quarter.

Addresses questions 5, 8, 10
The Path

90 days from questionnaire to submission-ready.

Fintechs that start governance at questionnaire receipt are late. Start now; treat the next questionnaire as a dry run.

The path below assumes an engineering team with access to production AI systems, at least one senior reviewer available for governance work, and an executive sponsor who can clear the path for independent validation. With those three in place, 90 days produces a submission-ready artifact package for the highest-materiality models and the scaffolding to extend coverage across the rest of the portfolio.

Days 0–30

Inventory & Assessment

  • AIRSA inventory of production AI.
  • Initial materiality classification.
  • Gap analysis against E-23 requirements.
  • Identify bank clients and their questionnaire history.
Days 31–60

Artifact Build

  • Model Cards for top-materiality models first.
  • HITL gate architecture formalization.
  • Independent validation scoping.
  • Agent Cards for customer-facing agents.
Days 61–90

Governance Operating Model

  • Review cadence, escalation paths, Board reporting format.
  • Incident response playbook.
  • Documentation package assembly.
  • Dry-run against a template Big 5 questionnaire.
What Goes Wrong

Five failure modes we see repeatedly.

Across fintech vendor engagements in 2025 and 2026, the same handful of failure patterns surface again and again — independent of product category, team size, or regulator exposure. Each one looks defensible from inside the organization and transparently thin from the 2LOD reviewer’s seat.

  1. Treating the questionnaire as a one-time exercise.

    The questionnaire repeats at every major model change and at each renewal cycle. If your artifacts aren’t generated by the same pipeline that deploys the model, they drift — and the next audit finds the drift. Treat governance as a pipeline output, not as a pre-sales deliverable.

  2. Documentation in slideware, not artifact form.

    A 40-slide deck does not satisfy E-23. A structured Model Card under version control does. Banks’ 2LOD teams prefer machine-readable evidence because they can diff it, cross-reference it, and cite it in their own internal memos — the deck gets politely filed and then re-requested in artifact form.
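"Diffable" is a concrete property, not a metaphor: two versions of a structured artifact can be compared field by field. A minimal sketch of such a comparison, assuming artifacts are held as flat dictionaries (the `diff_evidence` helper and field names are illustrative):

```python
def diff_evidence(old: dict, new: dict) -> dict:
    """Field-level diff of two structured artifact versions (e.g. two
    Model Card revisions) — the comparison a slide deck cannot support."""
    changed = {}
    for key in old.keys() | new.keys():
        if old.get(key) != new.get(key):
            changed[key] = (old.get(key), new.get(key))
    return changed
```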

  3. Governance theater without operational gates.

    A “governance committee” that meets quarterly does not gate deployments. A HITL gate with pending_approval state that sits in the code path does. Policy without enforcement is performance, and an experienced 2LOD reviewer can distinguish the two in a single walkthrough.

  4. Validation by the development team.

    SR 11-7’s “effective challenge” — inherited by E-23 in spirit — requires independence. Self-validation is a finding waiting to happen. For highest-materiality models, expect the bank to additionally require a named external validator on top of internal independence.

  5. No incident response plan for AI.

    AI systems fail differently from deterministic software — they degrade, drift, hallucinate, and occasionally produce plausible but wrong outputs at scale. If you don’t have a classification scheme, escalation path, and regulator-notification protocol specific to AI incidents, expect a regulatory ask after the first incident — and a procurement freeze during the response.
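A classification scheme does not need to be elaborate to be defensible; it needs to be explicit and executable. A deliberately simple sketch — the severity tiers, escalation routes, and input signals are illustrative assumptions, not E-23 text:

```python
# Severity tiers and escalation routes are illustrative, not E-23 text.
ESCALATION = {
    "sev1": ["halt model", "notify bank client", "prepare regulator notice"],
    "sev2": ["page on-call model owner", "open incident review"],
    "sev3": ["log and batch for weekly governance review"],
}

def classify_incident(customer_impacting: bool, wrong_at_scale: bool) -> str:
    """Map two coarse signals to a severity tier; a real scheme would
    add more signals, but the shape — inputs in, tier out, route
    looked up — is the point."""
    if customer_impacting and wrong_at_scale:
        return "sev1"
    if customer_impacting or wrong_at_scale:
        return "sev2"
    return "sev3"
```

The value is that the tier and its escalation path are decided before the incident, so the response during one is a lookup, not a debate.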

Frequently Asked

Common questions from fintech teams.

Does OSFI E-23 apply to my fintech directly?

E-23 applies to FRFIs. It applies to you through your bank client: any material AI model you operate for a bank becomes the bank’s responsibility under E-23 and cascades to you through vendor risk. In practice, you will be asked for E-23-aligned documentation before any material deployment, and the bank will treat your responses with the same rigor it applies to its internal models. Vendors that treat E-23 as “not our regulator” are often surprised by how specific and how enforceable the bank’s vendor-risk posture has become in 2026.

What if my model is low-materiality?

Materiality classification is a bank call, not yours. Low-materiality models still require baseline documentation — a Model Card, some validation, basic monitoring — and the same use case can be reclassified upward as the bank’s understanding of the deployment context matures. Err toward higher artifact rigor from day one; upgrading documentation after a materiality reclassification is far more expensive than starting there.

How does E-23 interact with SR 11-7 if my bank client operates in the US?

E-23 inherits the architecture of SR 11-7 (sound development, independent validation, monitoring, governance) and is newer and more explicit on AI/ML. If your bank is a cross-border SIFI, expect alignment to both frameworks — the artifact set is largely the same, but documentation templates and naming conventions differ. A well-built evidence package for one will cover roughly 80% of the other, but the remaining 20% matters: Canadian banks expect OSFI-language artifacts and US-regulated entities expect SR 11-7 nomenclature.

Do Agent Cards replace Model Cards for agentic AI?

No. Agent Cards complement Model Cards. Model Cards describe the underlying model — training data, performance, limitations. Agent Cards describe how the agent uses models, applies constraints, escalates, and handles edge cases — tool scopes, fallback behavior, termination conditions. For agentic AI in banking, both are expected, and the absence of an Agent Card for an agentic system is an increasingly common finding in 2LOD review.

Who signs off as the independent validator?

Inside a bank, the 2LOD Model Risk Management group. For vendors, the bank’s 2LOD MRM will independently validate your model — you provide the evidence package. Some banks additionally require a named, independent third-party validator (for example, a qualified actuarial or quant firm) for highest-materiality models. Your own independent validator, whoever that is, must be structurally independent of the team that developed the model — reporting lines matter as much as the methodology.

How long does an OSFI E-23 readiness program take?

90 days to a submission-ready artifact package (see the phases above) assumes existing AI inventory clarity and engineering capacity. Organizations starting from zero (no AIRSA, no Model Cards, no HITL architecture) should plan 120–180 days for first full submission-ready state. The fastest path begins with the three or four highest-materiality models, produces their artifacts end-to-end, and uses that package as the template for the rest of the portfolio — not with a horizontal sweep across every model at once.

Engage

Ready to build your vendor-risk package?

We work alongside your engineering team to produce the artifacts — Model Cards, Agent Cards, HITL architecture, AIRSA, Governance Operating Model — within 90 days of kickoff.

Request a Briefing · Explore Services