Third-Party Model Governance¶
Purpose
Governance framework for managing third-party AI model providers — covering vendor assessment, version pinning, deprecation handling, and fallback architecture.
Most Blueprint projects are deployers: they consume third-party models via API rather than training their own. This creates a different set of risks from model ownership — dependency risks, deprecation risks, vendor lock-in, and liability chain issues.
1. Vendor Assessment Checklist¶
Before integrating any third-party model provider, evaluate:
Technical Assessment¶
| Criterion | What to check | Minimum requirement |
|---|---|---|
| Version stability | Does the provider offer pinnable model versions? | Named versions (gpt-4-0613) not just gpt-4 |
| Deprecation notice | Minimum advance notice before version end-of-life | ≥ 6 months notice (12 months for High Risk) |
| SLA uptime | Contractual availability guarantee | ≥ 99.5% for production |
| Latency guarantees | p99 response time under load | Document baseline; verify in load test |
| Data processing location | Where is data processed? | Must satisfy GDPR data residency if EU data |
| Audit logs | Can you access per-request logs? | Required for High Risk (Art. 12 EU AI Act) |
| Fine-tuning ownership | Who owns fine-tuned model weights? | Your organisation must retain ownership |
Compliance Assessment¶
| Criterion | What to check |
|---|---|
| GPAI classification | Is the vendor model GPAI with systemic risk? (>10²⁵ FLOPs) — see EU AI Act §4 |
| GPAI Code of Practice | Has the vendor signed the Code of Practice? (conformity presumption) |
| Data retention policy | Does the vendor retain prompts/outputs? For how long? |
| Sub-processor disclosure | Are third-party sub-processors disclosed? (GDPR Art. 28) |
| Incident notification | Will the vendor notify you of model incidents? Under what SLA? |
| Liability clause | Does the vendor accept liability for model defects? Capped at what amount? |
2. Contract Requirements for AI Vendors¶
Minimum Contract Clauses¶
Include these clauses in every vendor AI contract (adapt to local law):
1. Model versioning: Provider shall offer pinnable versioned endpoints.
Deprecated versions remain accessible for [X] months after EOL notice.
2. Deprecation SLA: Provider gives ≥ [6/12] months written notice before endpoint removal.
Provider maintains deprecated endpoint for ≥ 90 days after EOL date.
3. Data processing: Prompts and outputs are not used for model training without explicit consent.
Data retained for ≤ [X] days; audit log available on request.
4. Incident response: Provider notifies customer within 24h of model behaviour incident
that materially affects output quality or safety.
5. Audit access: Provider grants access to per-request logs for [retention period].
6. Liability: Provider accepts product liability for defects in model output
up to [amount / 12 months fees]. AI Act compliance documentation
available on request.
Open-source models
Open-source models hosted by your organisation (self-hosted Llama, Mistral, etc.) move the "vendor" liability to you — you become the provider under PLD. Apply the same governance as Scenario B in AI Liability.
3. Version Pinning Strategy¶
Never use floating model aliases in production (gpt-4, claude-3, gemini-pro). Floating aliases silently change behaviour when the provider updates the underlying model.
Pinning Rules¶
| Environment | Rule | Example |
|---|---|---|
| Development | Floating alias permitted | claude-sonnet-4 |
| Staging | Pinned version required | claude-sonnet-4-6 |
| Production | Pinned version required | claude-sonnet-4-6 |
Version Management Procedure¶
- Track active versions in the Model Registry (see Model Governance §3)
- Monitor provider release notes — subscribe to vendor changelog RSS or email
- Test new versions on staging with full Golden Set before promoting to production
- Document version change as a configuration change (peer review required, see Model Governance §6)
- Update Model Card with new version and test results
4. Model Deprecation Playbook¶
When a vendor announces a model end-of-life:¶
Immediately (within 1 week of notice):
- Log deprecation notice in Model Registry with EOL date
- Assess impact: which production systems use the deprecated version?
- Create a migration ticket with priority based on EOL proximity
Within 30 days of notice:
- Evaluate successor model: run Golden Set test against candidate replacement
- Compare outputs: flag regressions in key metrics (factuality, task completion, bias)
- Document evaluation results in Validation Report addendum
At least 30 days before EOL:
- Complete staging migration and validation
- Update Hard Boundaries and system prompts for any behavioural changes
- Guardian review if High Risk (behavioural changes may require re-certification)
- Schedule production cutover
Production cutover:
- Follow standard deployment procedure (Go-Live Plan)
- Monitor output quality for 2 weeks post-migration
- Archive deprecated model configuration for traceability
Hard deadline risk
If a vendor removes a model endpoint without adequate notice or before your migration is complete, you need a fallback. See §5.
5. Multi-Vendor Fallback Architecture¶
Single-vendor dependency is an operational risk. For Limited Risk and above, design fallback capability:
Fallback Tiers¶
| Tier | Trigger | Fallback action |
|---|---|---|
| Primary | Normal operation | Primary vendor model (pinned version) |
| Fallback | Primary API error rate > 5% over 5 min, or deprecation | Secondary vendor model with pre-validated prompt |
| Degraded | Both vendors unavailable | Rule-based or cached response; human handoff |
Fallback Implementation Checklist¶
- Fallback model identified and validated against Golden Set
- Prompt parity: primary prompt adapted for fallback model (different models need different prompts)
- Routing logic implemented with automatic failover (e.g. circuit breaker pattern)
- Fallback triggers monitored and alerting configured
- Fallback activation logged (for audit trail and incident review)
- Human notification triggered when degraded mode activates
Vendor Concentration Risk¶
For High Risk AI systems processing >10,000 requests/day, assess vendor concentration:
- Primary vendor market share > 60%? Document concentration risk in Risk Pre-Scan
- Consider contracts with ≥ 2 providers to avoid monopoly leverage
- Self-hosted open-source model as degraded-mode fallback removes API dependency entirely
6. Vendor Governance Reviews¶
Quarterly Vendor Review¶
| Item | Check |
|---|---|
| Version status | All production pinned versions still supported? |
| Upcoming deprecations | Any EOL notices received? Migration tickets open? |
| Incident history | Any vendor incidents in past quarter? Impact assessed? |
| Cost optimisation | Usage within budget; newer smaller model for same task? |
| Compliance | Vendor GPAI Code of Practice status updated? |
| Contract renewal | Approaching renewal — renegotiate SLA clauses? |