Third-Party Model Governance¶

Purpose

Governance framework for managing third-party AI model providers — covering vendor assessment, version pinning, deprecation handling, and fallback architecture.

Most Blueprint projects are deployers: they consume third-party models via API rather than training their own. This creates a different set of risks from model ownership — dependency risks, deprecation risks, vendor lock-in, and liability chain issues.

1. Vendor Assessment Checklist¶

Before integrating any third-party model provider, evaluate:

Technical Assessment¶

Criterion	What to check	Minimum requirement
Version stability	Does the provider offer pinnable model versions?	Named versions (`gpt-4-0613`) not just `gpt-4`
Deprecation notice	Minimum advance notice before version end-of-life	≥ 6 months notice (12 months for High Risk)
SLA uptime	Contractual availability guarantee	≥ 99.5% for production
Latency guarantees	p99 response time under load	Document baseline; verify in load test
Data processing location	Where is data processed?	Must satisfy GDPR data residency if EU data
Audit logs	Can you access per-request logs?	Required for High Risk (Art. 12 EU AI Act)
Fine-tuning ownership	Who owns fine-tuned model weights?	Your organisation must retain ownership

Compliance Assessment¶

Criterion	What to check
GPAI classification	Is the vendor model GPAI with systemic risk? (>10²⁵ FLOPs) — see EU AI Act §4
GPAI Code of Practice	Has the vendor signed the Code of Practice? (conformity presumption)
Data retention policy	Does the vendor retain prompts/outputs? For how long?
Sub-processor disclosure	Are third-party sub-processors disclosed? (GDPR Art. 28)
Incident notification	Will the vendor notify you of model incidents? Under what SLA?
Liability clause	Does the vendor accept liability for model defects? Capped at what amount?

2. Contract Requirements for AI Vendors¶

Minimum Contract Clauses¶

Include these clauses in every vendor AI contract (adapt to local law):

1. Model versioning:  Provider shall offer pinnable versioned endpoints.
                      Deprecated versions remain accessible for [X] months after EOL notice.

2. Deprecation SLA:   Provider gives ≥ [6/12] months written notice before endpoint removal.
                      Provider maintains deprecated endpoint for ≥ 90 days after EOL date.

3. Data processing:   Prompts and outputs are not used for model training without explicit consent.
                      Data retained for ≤ [X] days; audit log available on request.

4. Incident response: Provider notifies customer within 24h of model behaviour incident
                      that materially affects output quality or safety.

5. Audit access:      Provider grants access to per-request logs for [retention period].

6. Liability:         Provider accepts product liability for defects in model output
                      up to [amount / 12 months fees]. AI Act compliance documentation
                      available on request.

Open-source models

Open-source models hosted by your organisation (self-hosted Llama, Mistral, etc.) move the "vendor" liability to you — you become the provider under PLD. Apply the same governance as Scenario B in AI Liability.

3. Version Pinning Strategy¶

Never use floating model aliases in production (gpt-4, claude-3, gemini-pro). Floating aliases silently change behaviour when the provider updates the underlying model.

Pinning Rules¶

Environment	Rule	Example
Development	Floating alias permitted	`claude-sonnet-4`
Staging	Pinned version required	`claude-sonnet-4-6`
Production	Pinned version required	`claude-sonnet-4-6`

Version Management Procedure¶

Track active versions in the Model Registry (see Model Governance §3)
Monitor provider release notes — subscribe to vendor changelog RSS or email
Test new versions on staging with full Golden Set before promoting to production
Document version change as a configuration change (peer review required, see Model Governance §6)
Update Model Card with new version and test results

4. Model Deprecation Playbook¶

When a vendor announces a model end-of-life:¶

Immediately (within 1 week of notice):

Log deprecation notice in Model Registry with EOL date
Assess impact: which production systems use the deprecated version?
Create a migration ticket with priority based on EOL proximity

Within 30 days of notice:

Evaluate successor model: run Golden Set test against candidate replacement
Compare outputs: flag regressions in key metrics (factuality, task completion, bias)
Document evaluation results in Validation Report addendum

At least 30 days before EOL:

Complete staging migration and validation
Update Hard Boundaries and system prompts for any behavioural changes
Guardian review if High Risk (behavioural changes may require re-certification)
Schedule production cutover

Production cutover:

Follow standard deployment procedure (Go-Live Plan)
Monitor output quality for 2 weeks post-migration
Archive deprecated model configuration for traceability

Hard deadline risk

If a vendor removes a model endpoint without adequate notice or before your migration is complete, you need a fallback. See §5.

5. Multi-Vendor Fallback Architecture¶

Single-vendor dependency is an operational risk. For Limited Risk and above, design fallback capability:

Fallback Tiers¶

Tier	Trigger	Fallback action
Primary	Normal operation	Primary vendor model (pinned version)
Fallback	Primary API error rate > 5% over 5 min, or deprecation	Secondary vendor model with pre-validated prompt
Degraded	Both vendors unavailable	Rule-based or cached response; human handoff

Fallback Implementation Checklist¶

Fallback model identified and validated against Golden Set
Prompt parity: primary prompt adapted for fallback model (different models need different prompts)
Routing logic implemented with automatic failover (e.g. circuit breaker pattern)
Fallback triggers monitored and alerting configured
Fallback activation logged (for audit trail and incident review)
Human notification triggered when degraded mode activates

Vendor Concentration Risk¶

For High Risk AI systems processing >10,000 requests/day, assess vendor concentration:

Primary vendor market share > 60%? Document concentration risk in Risk Pre-Scan
Consider contracts with ≥ 2 providers to avoid monopoly leverage
Self-hosted open-source model as degraded-mode fallback removes API dependency entirely

6. Vendor Governance Reviews¶

Quarterly Vendor Review¶

Item	Check
Version status	All production pinned versions still supported?
Upcoming deprecations	Any EOL notices received? Migration tickets open?
Incident history	Any vendor incidents in past quarter? Impact assessed?
Cost optimisation	Usage within budget; newer smaller model for same task?
Compliance	Vendor GPAI Code of Practice status updated?
Contract renewal	Approaching renewal — renegotiate SLA clauses?

Was this page helpful? Give feedback