What OSFI E-23 Actually Requires from Fintech Vendors Selling AI to Banks
Most of the discussion around OSFI's Model Risk Management guideline treats it as a bank obligation. The more immediate consequence for the Canadian AI market is what banks are already asking of their fintech vendors — and what that procurement posture will look like by mid-2027.
Why OSFI E-23 becomes a vendor problem
OSFI Guideline E-23 is the Office of the Superintendent of Financial Institutions' Model Risk Management guideline. It becomes enforceable May 1, 2027 for federally regulated financial institutions, and it explicitly covers AI and machine learning models within the definition of “model”. The guideline requires documented model risk management programs — inventory, documentation, independent validation, ongoing monitoring, and Board-level governance — for every material model in use.
The detail that most fintech vendors have not yet internalized is the scoping paragraph on third-party models. E-23 requires federally regulated institutions to extend their model risk management programs to material models operated by vendors, not just to models built in-house. The bank cannot outsource a model risk obligation to its procurement department. If a fintech's AI system performs a material function inside a Canadian bank — credit adjudication, fraud detection, anti-money-laundering triage, customer communication, adviser assistance — the bank is required to bring that model inside its E-23 governance perimeter.
In practice, that means the bank's second-line-of-defence (2LOD) risk team must evaluate the vendor's model documentation, validation methodology, and monitoring program against the same standard the bank applies to its internal models. The request arrives through vendor risk management, but the underlying expectation is E-23 alignment.
The cascade is already happening
The May 2027 enforcement date is often read as a 2027 problem. The procurement cycle tells a different story. Canadian banks are now making AI vendor onboarding decisions that must hold up under E-23 the moment it becomes enforceable. A fintech signing an enterprise agreement in mid-2026 is being evaluated against the evidence posture the bank expects to have by mid-2027, plus a margin for remediation. The practical procurement window closed earlier than the regulatory one.
OSFI's April 2026 Annual Risk Outlook reinforced this direction by reintroducing non-bank financial institution risk and explicitly naming fintechs as systemic risk vectors. The supervisory signal to banks is unambiguous: the governance perimeter extends to material AI vendors, and the banks are accountable for that extension.
The vendor risk questionnaire package
The documentation request that arrives from a bank's vendor risk team is not generic. It is structured around a specific evidence set, assembled so that the bank's 2LOD team can drop the vendor's artifacts into the bank's own model risk inventory with minimal restructuring. The package typically covers:
- Model Card — intended use, model class, training data sources and provenance, validation methodology and results, known limitations, performance characteristics, monitoring thresholds, and decommissioning criteria (a minimal machine-readable sketch follows this list).
- Independent validation evidence — documentation of validation that was performed by a party independent of model development, with sufficient detail to establish that the validation was genuine and not a self-attestation exercise.
- Monitoring program documentation — the specific metrics monitored in production, thresholds that trigger review, escalation pathways, and retraining or rollback protocols.
- Human-in-the-Loop (HITL) evidence — where the vendor claims human oversight, the architecture of that oversight: where the gate sits, what states exist, how approvals are recorded, and what the audit trail looks like.
- Data lineage and provenance — the path from source data through transformation and into model inputs, in a form that supports the bank's own data governance program.
- Incident response documentation — classification, escalation, and regulatory notification protocols for AI incidents affecting the bank's customers or operations.
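What the Model Card portion of this package looks like in machine-readable form varies by vendor. As one minimal sketch, assuming a Python dataclass kept under version control and serialized for the bank's intake process; the field names mirror the evidence items above and are illustrative, not an OSFI-prescribed schema:

```python
# Minimal, illustrative Model Card structure. Field names mirror the
# evidence items above; they are assumptions, not an OSFI-prescribed schema.
from dataclasses import dataclass, asdict
import json

@dataclass
class ModelCard:
    model_id: str
    version: str
    intended_use: str
    model_class: str                         # e.g. "gradient-boosted trees"
    training_data_sources: list[str]         # provenance references, not raw data
    validation_method: str                   # who validated, how, and when
    validation_results: dict[str, float]     # headline metrics from the report
    known_limitations: list[str]
    monitoring_thresholds: dict[str, float]  # metric name -> review trigger
    decommissioning_criteria: str

    def to_json(self) -> str:
        """Serialize for version control alongside the model code."""
        return json.dumps(asdict(self), indent=2)

# Illustrative instance; every value below is hypothetical.
card = ModelCard(
    model_id="fraud-triage-scorer",
    version="2.4.1",
    intended_use="Rank card transactions for fraud-analyst review; advisory only.",
    model_class="gradient-boosted trees",
    training_data_sources=["txn_history_2021_2024 (lineage ref TX-0142)"],
    validation_method="Independent validation by risk analytics; report VR-118.",
    validation_results={"auroc": 0.91, "ks": 0.62},
    known_limitations=["Underperforms on card-not-present volumes below 1k/day."],
    monitoring_thresholds={"psi": 0.2, "auroc_floor": 0.85},
    decommissioning_criteria="Two consecutive quarters below auroc_floor.",
)
print(card.to_json())
```

A structure like this has one practical advantage over a prose document: it diffs. A 2LOD reviewer can see exactly which fields changed between model versions, which is the provenance property the next paragraphs describe.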
None of these artifacts are optional in a mature vendor package. A vendor that produces a Model Card but cannot evidence independent validation is asking the bank's 2LOD team to take a self-attestation at face value — which, under E-23, they cannot.
A Model Card written retroactively to close a project gate does not establish the provenance chain that E-23 requires. The artifacts that survive bank 2LOD review are those produced as operational outputs of a running governance process — not those assembled after the fact from scattered logs and slide decks.
What “material” actually means in practice
E-23 applies to material models, and the definition of materiality does real work. Models embedded in credit adjudication, fraud detection, AML transaction monitoring, suitability assessment, claims processing, and regulated customer communications will almost always be treated as material. Models that merely summarize internal documents for an operations analyst may not be. The bank's 2LOD team makes the materiality call, and vendors should assume that any model touching a customer-affecting decision, a regulated process, or a reportable control is in scope.
For fintechs selling general-purpose agentic systems, this has a subtle implication: the same product can be material in one deployment context and immaterial in another. The governance artifacts need to be produced at a level of specificity that allows the bank to assess materiality per use case, not per vendor.
HITL gates as architectural, not aspirational
Most vendor responses to “do you have human oversight?” point to a review dashboard or a post-hoc alerting channel. Neither satisfies the expectation that 2LOD review teams apply under E-23. A Human-in-the-Loop gate, in the sense the bank's risk team means, is a structural checkpoint at which the AI system enters a pending_approval state and waits for a human decision before an action is committed. The gate blocks execution; it does not review it after the fact.
The architectural distinction matters because the bank needs to evidence human oversight to its regulator. A dashboard that shows an analyst reviewed a decision after it was committed does not establish that human oversight governed the decision — it establishes that a human observed the decision. Vendors whose HITL claim rests on policy language rather than code paths will find that distinction surfaced during 2LOD review.
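To make the distinction concrete, here is a minimal sketch of a blocking gate, assuming an in-process approval queue. The state names follow the description above; the class, method names, and audit record format are illustrative:

```python
# Minimal HITL gate sketch: the action does not execute until a human
# decision is recorded. State names follow the text above; the queue,
# approver identity, and audit sink are illustrative assumptions.
import datetime
import enum
import uuid

class GateState(enum.Enum):
    PENDING_APPROVAL = "pending_approval"
    APPROVED = "approved"
    REJECTED = "rejected"

class HITLGate:
    def __init__(self, audit_log: list):
        self._pending: dict[str, dict] = {}
        self._audit_log = audit_log  # append-only record for 2LOD evidence

    def request(self, action: str, payload: dict) -> str:
        """The AI system calls this instead of executing the action directly."""
        gate_id = str(uuid.uuid4())
        self._pending[gate_id] = {"action": action, "payload": payload,
                                  "state": GateState.PENDING_APPROVAL}
        self._record(gate_id, GateState.PENDING_APPROVAL, actor="system")
        return gate_id  # nothing has executed yet

    def decide(self, gate_id: str, approver: str, approve: bool) -> None:
        """Human decision. Only an APPROVED state releases execution."""
        entry = self._pending[gate_id]
        entry["state"] = GateState.APPROVED if approve else GateState.REJECTED
        self._record(gate_id, entry["state"], actor=approver)

    def execute_if_approved(self, gate_id: str, executor) -> bool:
        entry = self._pending[gate_id]
        if entry["state"] is not GateState.APPROVED:
            return False  # the gate blocks: no side effect without approval
        executor(entry["action"], entry["payload"])
        return True

    def _record(self, gate_id: str, state: GateState, actor: str) -> None:
        self._audit_log.append({
            "gate_id": gate_id,
            "state": state.value,
            "actor": actor,
            "at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        })
```

The ordering is the point. The side effect lives behind execute_if_approved, so the audit trail shows oversight preceding the action rather than observing it, and a reviewer can verify that claim by reading the code path rather than the policy document.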
Monitoring as a program, not a dashboard
Ongoing monitoring is one of the areas where vendor answers most frequently diverge from the bank's expectation. E-23 expects a documented monitoring program with defined metrics, thresholds, review cadences, and escalation pathways. “We monitor model performance” is not a program. The specific questions a bank's 2LOD team will ask include: which metrics are tracked, at what frequency, against what thresholds, by whom, with what escalation when thresholds are breached, and with what evidence that those escalations actually occurred when triggered.
Monitoring also has to cover drift — data drift, concept drift, and performance drift — in a form that the bank can cross-reference against its own production telemetry. A monitoring story that exists only in a vendor console, with no export pathway into the bank's observability stack, will be flagged as a dependency the bank cannot independently verify.
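As a sketch of what one check in such a program can look like, here is a data-drift test using the population stability index (PSI), a common drift metric for scoring models. The 0.2 threshold and the escalation record format are assumptions a real program would define per model in the Model Card:

```python
# One drift check from a monitoring program: population stability index (PSI)
# between a reference (validation-time) score distribution and production
# scores. The 0.2 threshold and the escalation record are illustrative.
import math

def psi(reference: list[float], production: list[float], bins: int = 10) -> float:
    """PSI over quantile bins of the reference distribution."""
    ref_sorted = sorted(reference)
    # Bin edges at reference quantiles.
    edges = [ref_sorted[int(len(ref_sorted) * i / bins)] for i in range(1, bins)]

    def fractions(values: list[float]) -> list[float]:
        counts = [0] * bins
        for v in values:
            idx = sum(v > e for e in edges)  # which bin v falls into
            counts[idx] += 1
        # Floor at a tiny fraction so the log term is defined for empty bins.
        return [max(c / len(values), 1e-6) for c in counts]

    ref_f, prod_f = fractions(reference), fractions(production)
    return sum((p - r) * math.log(p / r) for r, p in zip(ref_f, prod_f))

PSI_THRESHOLD = 0.2  # assumed review trigger; set per model in practice

def check_drift(reference, production, escalations: list) -> None:
    value = psi(reference, production)
    if value > PSI_THRESHOLD:
        # Record that the escalation pathway fired, not just that a
        # dashboard turned red.
        escalations.append({"metric": "psi", "value": round(value, 4),
                            "threshold": PSI_THRESHOLD,
                            "action": "notify model owner; open review ticket"})

escalations: list[dict] = []
check_drift([0.1 * i for i in range(100)],        # validation-time scores
            [0.1 * i + 3.0 for i in range(100)],  # heavily shifted production scores
            escalations)
assert escalations, "the large shift should breach the PSI threshold"
```

The escalations list, exported somewhere the bank can read it, is the artifact reviewers ask for: not the metric value itself, but evidence that a breach triggered the documented pathway.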
The 6–12 month bank procurement timeline
Bank procurement cycles for AI vendors now routinely run six to twelve months end-to-end, and much of that timeline is dominated by vendor risk assessment rather than commercial negotiation. A fintech that presents a complete evidence package at the start of the process can compress that timeline significantly. A fintech that treats documentation as a late-stage deliverable will find the commercial conversation outpacing the evidence review, with the bank circling back on open questions for months.
The asymmetry matters. A slower vendor decision has option value for the bank: it can evaluate alternatives, build internal capability, or defer. The fintech has no such option. Every month of cycle time is revenue deferral and, increasingly, runway pressure.
What fintechs should be building now
For fintechs whose product roadmap touches Canadian banks, the work that pays back fastest is the work that produces audit-grade evidence as a continuous output rather than an assembled artifact. Concretely:
- A Model Card and, where applicable, an Agent Card for every material model or agent system, maintained under version control alongside the code that implements the model.
- Independent validation — either internal but structurally independent of model development, or external — with documentation sufficient for a 2LOD reviewer to assess the rigor of the validation itself.
- A written monitoring program, operational in production, with captured evidence that the monitoring is running and that escalations have occurred when thresholds were breached.
- HITL gates implemented as code — pending states, approval capture, audit trail — not as policy language referencing a review dashboard.
- Data lineage documentation that tracks training data, validation data, and any production retrieval sources through to model inputs (see the sketch after this list).
- An incident response playbook that maps to the bank's own escalation and regulatory notification expectations.
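On the lineage item, here is a minimal sketch of a lineage record, assuming content hashes link each step so the chain can be independently verified. The structure, field names, and commit references are illustrative, not a prescribed format:

```python
# Illustrative lineage record: each step hashes its input and output so the
# path from source data to model input can be independently verified.
# Structure, field names, and commit references are assumptions.
import datetime
import hashlib
import json

def content_hash(payload: bytes) -> str:
    return "sha256:" + hashlib.sha256(payload).hexdigest()

def lineage_step(step: str, input_ref: str, output_bytes: bytes,
                 transform_ref: str) -> dict:
    return {
        "step": step,
        "input": input_ref,          # hash or reference of the upstream artifact
        "output": content_hash(output_bytes),
        "transform": transform_ref,  # e.g. a git commit of the transform code
        "at": datetime.datetime.now(datetime.timezone.utc).isoformat(),
    }

# Hypothetical two-step chain: ingest, then a PII-masking clean step whose
# input reference is the hash of the ingest output.
raw = b"raw transaction extract bytes"
cleaned = b"deduplicated, PII-masked extract bytes"
chain = [
    lineage_step("ingest", "source:core_banking_export", raw,
                 "etl repo, commit a1b2c3"),
    lineage_step("clean", content_hash(raw), cleaned,
                 "masking repo, commit d4e5f6"),
]
print(json.dumps(chain, indent=2))
```

Because each step's input reference is the previous step's output hash, a reviewer can replay the chain and confirm that the model's training inputs actually descend from the declared sources.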
These artifacts are not produced most efficiently by writing them as documents. They are produced most efficiently by building the governance process that generates them — and by treating every document as a downstream output of that process.
How RegCore.AI approaches this
Our AI Governance practice was built inside Canadian Big 5 bank governance perimeters. The Model Cards, Agent Cards, HITL gate architectures, and RAIOps operational frameworks we implement are the same artifact structures that satisfied Big 5 bank 2LOD governance review — not template adaptations. When a fintech client asks what a Canadian bank will require from them under E-23, the answer is grounded in direct experience of what those vendor risk assessment processes look like from the inside.
A Regulatory Readiness Assessment establishes current-state posture against E-23, identifies the highest-priority gaps, and produces a sequenced remediation roadmap. For fintechs already in bank procurement cycles, the assessment doubles as a structured response to the bank's vendor risk questionnaire.
Further reading
- What OSFI Is Signalling Through Its AI Governance Workshops
How OSFI's FIFAI supervisory themes sit alongside the binding E-23 obligations.
- Three AI Governance Gaps That Existing Tools Cannot Close
Why observability, policy-in-prompts, and post-hoc alerting fall short of E-23 evidence expectations.
- FINTRAC 2026 Amendments and CIRO Consolidation
The AML and investment-dealer AI governance requirements that now sit alongside OSFI's perimeter.
- Our Services — AI Governance
Model Cards, Agent Cards, HITL gates, RAIOps, and the AI Evaluation Framework as operational systems.
Ready to assess your OSFI E-23 readiness?
Every engagement begins with a Regulatory Readiness Assessment that establishes current-state posture, identifies priority gaps, and produces a sequenced remediation roadmap.
Request an Assessment