Third-Party Model Governance

Purpose

Governance framework for managing third-party AI model providers — covering vendor assessment, version pinning, deprecation handling, and fallback architecture.

Most Blueprint projects are deployers: they consume third-party models via API rather than training their own. This creates a different set of risks from model ownership: dependency on the vendor, deprecation, vendor lock-in, and liability-chain issues.


1. Vendor Assessment Checklist

Before integrating any third-party model provider, evaluate:

Technical Assessment

Criterion | What to check | Minimum requirement
--- | --- | ---
Version stability | Does the provider offer pinnable model versions? | Named versions (gpt-4-0613) not just gpt-4
Deprecation notice | Minimum advance notice before version end-of-life | ≥ 6 months notice (12 months for High Risk)
SLA uptime | Contractual availability guarantee | ≥ 99.5% for production
Latency guarantees | p99 response time under load | Document baseline; verify in load test
Data processing location | Where is data processed? | Must satisfy GDPR data residency if EU data
Audit logs | Can you access per-request logs? | Required for High Risk (Art. 12 EU AI Act)
Fine-tuning ownership | Who owns fine-tuned model weights? | Your organisation must retain ownership
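
To make the SLA uptime criterion concrete, an availability percentage can be converted into a downtime budget. The sketch below is illustrative only: the function name and the 30-day period are assumptions, and a real contract defines its own measurement period and exclusions.

```python
def allowed_downtime_hours(uptime_pct: float, period_days: float = 30.0) -> float:
    """Hours of downtime permitted per period at a given uptime percentage.

    period_days defaults to a 30-day month (an assumption; check how the
    vendor's SLA defines its measurement window).
    """
    if not 0.0 <= uptime_pct <= 100.0:
        raise ValueError("uptime_pct must be between 0 and 100")
    return (100.0 - uptime_pct) / 100.0 * period_days * 24.0


if __name__ == "__main__":
    # Budget implied by the 99.5% production minimum in the table above.
    print(f"{allowed_downtime_hours(99.5):.2f} hours per 30 days")
```

At the 99.5% minimum this works out to roughly 3.6 hours of permitted downtime per 30-day month, which is a useful figure to compare against the tolerance of the system that depends on the model.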

Compliance Assessment

Criterion | What to check
--- | ---
GPAI classification | Is the vendor model a GPAI model with systemic risk (training compute > 10²⁵ FLOPs)? See EU AI Act §4
GPAI Code of Practice | Has the vendor signed the Code of Practice? (Adherence can be used to demonstrate compliance with GPAI obligations; it is not a formal presumption of conformity.)
Data retention policy | Does the vendor retain prompts/outputs? For how long?
Sub-processor disclosure | Are third-party sub-processors disclosed? (GDPR Art. 28)
Incident notification | Will the vendor notify you of model incidents? Under what SLA?
Liability clause | Does the vendor accept liability for model defects? Capped at what amount?

2. Contract Requirements for AI Vendors

Minimum Contract Clauses

Include these clauses in every vendor AI contract (adapt to local law):

1. Model versioning:  Provider shall offer pinnable versioned endpoints.
                      Deprecated versions remain accessible for [X] months after EOL notice.

2. Deprecation SLA:   Provider gives ≥ [6/12] months written notice before endpoint removal.
                      Provider maintains deprecated endpoint for ≥ 90 days after EOL date.

3. Data processing:   Prompts and outputs are not used for model training without explicit consent.
                      Data retained for ≤ [X] days; audit log available on request.

4. Incident response: Provider notifies customer within 24h of model behaviour incident
                      that materially affects output quality or safety.

5. Audit access:      Provider grants access to per-request logs for [retention period].

6. Liability:         Provider accepts product liability for defects in model output
                      up to [amount / 12 months fees]. AI Act compliance documentation
                      available on request.

Open-source models

Open-source models hosted by your organisation (self-hosted Llama, Mistral, etc.) shift the "vendor" liability to you: you become the provider under the Product Liability Directive (PLD). Apply the same governance as Scenario B in AI Liability.


3. Version Pinning Strategy

Never use floating model aliases (gpt-4, claude-3, gemini-pro) in production. A floating alias silently changes behaviour whenever the provider points it at a new underlying model.

Pinning Rules

Environment | Rule | Example
--- | --- | ---
Development | Floating alias permitted | claude-sonnet-4
Staging | Pinned version required | claude-sonnet-4-6
Production | Pinned version required | claude-sonnet-4-6
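
The pinning rules above can be enforced mechanically, for example in a CI check on deployment configuration. The following is a minimal sketch under stated assumptions: `APPROVED_PINNED_VERSIONS` is a hypothetical allowlist that would be populated from the Model Registry, and `is_model_allowed` is an illustrative name, not an existing Blueprint function.

```python
# Hypothetical allowlist; in practice this would be read from the Model Registry.
# The two entries are the pinned examples used elsewhere on this page.
APPROVED_PINNED_VERSIONS = frozenset({"gpt-4-0613", "claude-sonnet-4-6"})

# Only development may use floating aliases, per the pinning rules table.
FLOATING_ALLOWED_ENVIRONMENTS = frozenset({"development"})


def is_model_allowed(environment: str, model_id: str) -> bool:
    """Return True if model_id may be configured in the given environment.

    Unknown environments are treated like production (fail closed).
    """
    if environment.strip().lower() in FLOATING_ALLOWED_ENVIRONMENTS:
        return True
    return model_id in APPROVED_PINNED_VERSIONS


if __name__ == "__main__":
    for env, model in [
        ("development", "claude-sonnet-4"),
        ("staging", "claude-sonnet-4-6"),
        ("production", "claude-sonnet-4"),
    ]:
        status = "ok" if is_model_allowed(env, model) else "REJECTED"
        print(f"{env:12s} {model:20s} {status}")
```

An allowlist is used here rather than pattern-matching on model names, because vendors do not share a naming scheme that reliably distinguishes a pinned version from an alias.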

Version Management Procedure

  1. Track active versions in the Model Registry (see Model Governance §3)
  2. Monitor provider release notes — subscribe to vendor changelog RSS or email
  3. Test new versions on staging with full Golden Set before promoting to production
  4. Document version change as a configuration change (peer review required, see Model Governance §6)
  5. Update Model Card with new version and test results

4. Model Deprecation Playbook

When a vendor announces a model end-of-life:

Immediately (within 1 week of notice):

  • Log deprecation notice in Model Registry with EOL date
  • Assess impact: which production systems use the deprecated version?
  • Create a migration ticket with priority based on EOL proximity

Within 30 days of notice:

  • Evaluate successor model: run Golden Set test against candidate replacement
  • Compare outputs: flag regressions in key metrics (factuality, task completion, bias)
  • Document evaluation results in Validation Report addendum
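
The comparison step above can be sketched as a simple metric diff between the current model and the candidate. This is an illustration, not a prescribed method: the metric names, the 0.02 tolerance, and the convention that every score is normalised so that higher is better (including the bias score) are all assumptions.

```python
from __future__ import annotations


def flag_regressions(
    baseline: dict[str, float],
    candidate: dict[str, float],
    tolerance: float = 0.02,
) -> dict[str, float]:
    """Return {metric: drop} for metrics where the candidate scores worse
    than the baseline by more than `tolerance`.

    Assumes every metric is scored so that higher is better.
    """
    regressions: dict[str, float] = {}
    for metric, base_score in baseline.items():
        if metric not in candidate:
            raise KeyError(f"candidate results are missing metric '{metric}'")
        drop = base_score - candidate[metric]
        if drop > tolerance:
            regressions[metric] = drop
    return regressions


if __name__ == "__main__":
    # Made-up Golden Set scores, for illustration only.
    current = {"factuality": 0.92, "task_completion": 0.88, "bias": 0.95}
    successor = {"factuality": 0.93, "task_completion": 0.81, "bias": 0.94}
    for metric, drop in flag_regressions(current, successor).items():
        print(f"regression in {metric}: -{drop:.2f}")
```

Any flagged metric would then be written up in the Validation Report addendum together with the decision taken.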

At least 30 days before EOL:

  • Complete staging migration and validation
  • Update Hard Boundaries and system prompts for any behavioural changes
  • Guardian review if High Risk (behavioural changes may require re-certification)
  • Schedule production cutover

Production cutover:

  • Follow standard deployment procedure (Go-Live Plan)
  • Monitor output quality for 2 weeks post-migration
  • Archive deprecated model configuration for traceability
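
The playbook deadlines follow directly from two dates: when the notice was received and when the version reaches EOL. The sketch below derives them so they can be attached to the migration ticket. The function and key names are illustrative assumptions.

```python
from datetime import date, timedelta


def deprecation_milestones(notice: date, eol: date) -> dict:
    """Derive playbook deadlines from the notice date and the EOL date."""
    if eol <= notice:
        raise ValueError("EOL date must fall after the notice date")
    return {
        "impact_assessed_by": notice + timedelta(weeks=1),       # "Immediately"
        "successor_evaluated_by": notice + timedelta(days=30),   # "Within 30 days of notice"
        "staging_validated_by": eol - timedelta(days=30),        # "At least 30 days before EOL"
    }


def is_short_notice(notice: date, eol: date) -> bool:
    """True when the staging deadline falls before the evaluation deadline,
    i.e. the notice period is too short to run the playbook as written."""
    m = deprecation_milestones(notice, eol)
    return m["staging_validated_by"] < m["successor_evaluated_by"]


def monitoring_ends(cutover: date) -> date:
    """End of the 2-week post-migration monitoring window."""
    return cutover + timedelta(weeks=2)


if __name__ == "__main__":
    for name, due in deprecation_milestones(date(2025, 1, 1), date(2025, 7, 1)).items():
        print(f"{name}: {due.isoformat()}")
```

When `is_short_notice` returns True, the standard schedule cannot be met and the fallback path becomes the primary mitigation.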

Hard deadline risk

If a vendor removes a model endpoint without adequate notice or before your migration is complete, you need a fallback. See §5.


5. Multi-Vendor Fallback Architecture

Single-vendor dependency is an operational risk. For systems classified Limited Risk and above, design a fallback capability:

Fallback Tiers

Tier | Trigger | Fallback action
--- | --- | ---
Primary | Normal operation | Primary vendor model (pinned version)
Fallback | Primary API error rate > 5% over 5 min, or deprecation | Secondary vendor model with pre-validated prompt
Degraded | Both vendors unavailable | Rule-based or cached response; human handoff

Fallback Implementation Checklist

  • Fallback model identified and validated against Golden Set
  • Prompt parity: primary prompt adapted for fallback model (different models need different prompts)
  • Routing logic implemented with automatic failover (e.g. circuit breaker pattern)
  • Fallback triggers monitored and alerting configured
  • Fallback activation logged (for audit trail and incident review)
  • Human notification triggered when degraded mode activates
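
The routing logic in the checklist can be sketched as a sliding-window error-rate trigger plus a tier selector. This is a simplified stand-in for a full circuit breaker (it has no half-open probing state). The class and function names, the `min_requests` guard, and the injectable `clock` are assumptions made for illustration; the 5% threshold and 5-minute window come from the Fallback Tiers table.

```python
import time
from collections import deque


class ErrorRateTrigger:
    """Trips when the error rate over a sliding time window exceeds a threshold."""

    def __init__(self, threshold=0.05, window_seconds=300.0,
                 min_requests=20, clock=time.monotonic):
        self.threshold = threshold
        self.window_seconds = window_seconds
        self.min_requests = min_requests  # assumption: avoid tripping on tiny samples
        self._clock = clock
        self._events = deque()  # (timestamp, failed) pairs, oldest first

    def record(self, failed: bool) -> None:
        """Record the outcome of one request to the primary vendor."""
        now = self._clock()
        self._events.append((now, failed))
        self._evict(now)

    def _evict(self, now: float) -> None:
        # Drop events that have aged out of the window.
        cutoff = now - self.window_seconds
        while self._events and self._events[0][0] < cutoff:
            self._events.popleft()

    def tripped(self) -> bool:
        self._evict(self._clock())
        total = len(self._events)
        if total < self.min_requests:
            return False
        failures = sum(1 for _, failed in self._events if failed)
        return failures / total > self.threshold


def choose_tier(primary_tripped: bool, secondary_available: bool) -> str:
    """Map vendor health onto the tiers from the Fallback Tiers table."""
    if not primary_tripped:
        return "primary"
    if secondary_available:
        return "fallback"
    return "degraded"  # rule-based or cached response; human handoff
```

In use, the caller would invoke `record()` after every primary request, consult `choose_tier()` before each call, and log every tier change so that fallback activation remains auditable.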

Vendor Concentration Risk

For High Risk AI systems processing >10,000 requests/day, assess vendor concentration:

  • Primary vendor market share > 60%? Document concentration risk in Risk Pre-Scan
  • Consider contracts with ≥ 2 providers to avoid monopoly leverage
  • Self-hosted open-source model as degraded-mode fallback removes API dependency entirely

6. Vendor Governance Reviews

Quarterly Vendor Review

Item | Check
--- | ---
Version status | Are all pinned production versions still supported?
Upcoming deprecations | Any EOL notices received? Migration tickets open?
Incident history | Any vendor incidents in the past quarter? Impact assessed?
Cost optimisation | Is usage within budget? Could a newer, smaller model handle the same task?
Compliance | Has the vendor's GPAI Code of Practice status been re-checked?
Contract renewal | Is renewal approaching? Should SLA clauses be renegotiated?