
Managing AI Risk at the Org Level

AI systems introduce risk categories that traditional software governance does not cover. This module maps the five risk categories, explains how to set risk appetite, and distinguishes real risk management from risk theatre.

Layer 1: Surface

AI systems create risks that traditional software governance was not designed to manage. A conventional software system does what its code says. An AI system does what its model and data predict, and that prediction can be wrong in ways that are hard to anticipate, hard to reproduce, and hard to audit.

There are five categories of risk an organisation must manage:

  1. Model risk: The AI produces wrong, biased, or harmful outputs. This includes hallucinations, discriminatory outputs, and confidently incorrect recommendations.
  2. Operational risk: The AI system fails, is slow, is unavailable, or behaves inconsistently in ways that disrupt operations.
  3. Reputational risk: The AI system fails publicly (a viral screenshot, a news story, a social media incident), damaging trust in the brand.
  4. Compliance risk: The AI system violates laws or regulations: data protection, anti-discrimination, financial advice, medical advice, or sector-specific rules.
  5. Strategic risk: The organisation becomes dependent on an external AI provider that changes pricing, restricts access, or ceases to operate.

Good risk management is not about preventing all failures. It is about knowing which failures are acceptable and which are not, and building appropriate controls for the ones that are not.

Why it matters

Organisations that treat AI risk as a one-time project approval exercise discover the problem when a failure occurs in production: typically at the worst possible time, with the highest possible visibility.

Production Gotcha

AI risk registers are typically completed at project initiation and never updated. Production AI systems change continuously: model versions are upgraded, prompts are edited, new tools are added, and each change potentially introduces new risks. Risk management must be continuous, not a one-time approval gate.

The assumption: β€œWe approved this system at launch, so it’s covered.” The reality: the system you approved no longer exists; it has been updated dozens of times since.


Layer 2: Guided

Risk appetite by use case

Not every AI application carries the same risk. Set your risk appetite explicitly for each use case category:

| Use case type | Model risk tolerance | Compliance exposure | Recommended oversight |
|---|---|---|---|
| Internal productivity tools (summarisation, drafting) | Medium: errors are caught by human reviewer | Low | Minimal: basic output review |
| Customer-facing informational (FAQ, search) | Low: errors reach customers directly | Medium | Output guardrails + monitoring |
| Customer-facing advisory (recommendations, advice) | Very low: incorrect advice has real consequences | High | Human review checkpoint for high-stakes outputs |
| Automated decision-making (approvals, scoring) | Very low: no human in the loop | Very high | Regulatory compliance required; human oversight often mandated |
| Internal process automation (document processing, data entry) | Low: downstream processes depend on accuracy | Medium | Validation of outputs before use |
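One way to make a risk-appetite table like this operational is to encode it as a lookup that deployment tooling can query before a use case ships. A minimal sketch, assuming the category keys and oversight strings below (which mirror the table but are otherwise illustrative names):

```python
# Risk appetite per use case category, mirroring the table above.
# Category keys are illustrative; adapt them to your own taxonomy.
RISK_APPETITE = {
    "internal_productivity": {
        "model_risk_tolerance": "medium", "compliance_exposure": "low",
        "oversight": "basic output review"},
    "customer_informational": {
        "model_risk_tolerance": "low", "compliance_exposure": "medium",
        "oversight": "output guardrails + monitoring"},
    "customer_advisory": {
        "model_risk_tolerance": "very low", "compliance_exposure": "high",
        "oversight": "human review checkpoint for high-stakes outputs"},
    "automated_decision": {
        "model_risk_tolerance": "very low", "compliance_exposure": "very high",
        "oversight": "regulatory compliance required; human oversight often mandated"},
    "internal_automation": {
        "model_risk_tolerance": "low", "compliance_exposure": "medium",
        "oversight": "validation of outputs before use"},
}

def required_oversight(use_case: str) -> str:
    """Return the recommended oversight for a use case category.

    An unknown category defaults to the strictest regime rather than none:
    failing closed is the safer error for a governance lookup.
    """
    entry = RISK_APPETITE.get(use_case)
    if entry is None:
        return "regulatory compliance required; human oversight often mandated"
    return entry["oversight"]
```

Failing closed on unknown categories is the important design choice: a use case nobody classified should get the most oversight, not the least.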

Governance structure

Who approves AI use cases? Who owns incident response? Who monitors production? These are governance questions that need answers before systems ship, not after.

A minimal governance structure:

Use Case Approval
β”œβ”€β”€ Who: Cross-functional review (legal, compliance, product, engineering)
β”œβ”€β”€ What: Use case brief, risk assessment, proposed controls, eval evidence
└── Gate: Approval required before customer-facing deployment

Incident Response
β”œβ”€β”€ Owner: Designated on-call engineer + product manager
β”œβ”€β”€ Process: Detect β†’ assess severity β†’ mitigate or disable β†’ communicate β†’ post-mortem
└── Trigger: Any output that causes user harm, regulatory exposure, or significant reputational risk

Production Monitoring
β”œβ”€β”€ Owner: Engineering team with assigned on-call rotation
β”œβ”€β”€ Signals: Quality score drift, guardrail trigger rate, error rate, user complaint volume
└── Review cadence: Weekly dashboard review + automated alerts
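The incident-response flow above can be sketched as a severity triage. The severity labels and the rule that only critical incidents disable the feature are illustrative assumptions, not part of the source process:

```python
from enum import Enum

class Severity(Enum):
    LOW = 1       # cosmetic or isolated output issue
    HIGH = 2      # user harm, regulatory exposure, or reputational risk possible
    CRITICAL = 3  # user harm combined with regulatory exposure

def triage(user_harm: bool, regulatory_exposure: bool, reputational_risk: bool) -> Severity:
    """Assess severity from the three incident triggers named in the process."""
    if user_harm and regulatory_exposure:
        return Severity.CRITICAL
    if user_harm or regulatory_exposure or reputational_risk:
        return Severity.HIGH
    return Severity.LOW

def respond(severity: Severity) -> list[str]:
    """Detect β†’ assess severity β†’ mitigate or disable β†’ communicate β†’ post-mortem."""
    steps = ["detect", f"assess: {severity.name}"]
    # Illustrative rule: only critical incidents take the feature offline.
    steps.append("disable feature" if severity is Severity.CRITICAL else "mitigate")
    steps += ["communicate to stakeholders", "schedule post-mortem"]
    return steps
```

The point of encoding the flow is not automation for its own sake: it forces the team to agree, before an incident, on which conditions justify disabling the feature.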

The difference between risk management and risk theatre

Risk management is an ongoing operational process. It changes the system’s behaviour. Examples:

  • Output guardrails that block harmful content before it reaches users
  • Eval suites that catch regressions before they ship
  • Automated alerts that fire when quality metrics degrade
  • Quarterly risk register reviews that update assessments based on what happened

Risk theatre is a process that generates documentation but does not change outcomes. Examples:

  • A 40-page risk assessment produced at launch and never read again
  • A review committee that approves systems based on documentation completeness, not on evidence of actual testing
  • β€œAI ethics principles” published on the website but not embedded in any decision process
  • A compliance sign-off that checks whether the system was described correctly, not whether it behaves correctly

A practical risk register format

# AI Risk Register entry β€” pseudocode
from dataclasses import dataclass
from datetime import date
from typing import Literal

RiskLevel = Literal["low", "medium", "high", "critical"]

@dataclass
class RiskEntry:
    system: str
    risk_category: str           # model / operational / reputational / compliance / strategic
    risk_description: str
    likelihood: RiskLevel
    impact: RiskLevel
    controls: list[str]          # what is actually preventing or reducing this risk
    residual_risk: RiskLevel     # risk remaining after controls
    owner: str                   # the person accountable for this risk
    last_reviewed: date
    next_review: date
    open_actions: list[str]      # specific things that need to happen to reduce risk further

# Example entry
example = RiskEntry(
    system="customer-support-ai",
    risk_category="model",
    risk_description="AI provides incorrect product information that misleads customers",
    likelihood="medium",
    impact="high",
    controls=[
        "RAG retrieval from verified product catalogue only",
        "Output guardrail blocks responses with unsupported claims",
        "Eval suite includes 50 product information questions; threshold 92%",
    ],
    residual_risk="low",
    owner="product-manager-support",
    last_reviewed=date(2026, 3, 24),
    next_review=date(2026, 6, 24),
    open_actions=[
        "Add eval cases for discontinued products",
        "Test guardrail against new product category launched in Q2",
    ],
)
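A register in this format can be queried mechanically, which is what makes review dates enforceable rather than aspirational. A sketch of an overdue-review check, using a trimmed-down entry type (in practice you would use the full RiskEntry above; the systems and dates here are made up):

```python
from dataclasses import dataclass
from datetime import date

# Minimal fields needed for review scheduling; illustrative subset of RiskEntry.
@dataclass
class ReviewItem:
    system: str
    owner: str
    next_review: date

def overdue(register: list[ReviewItem], today: date) -> list[ReviewItem]:
    """Entries whose scheduled review date has passed, oldest first."""
    late = [e for e in register if e.next_review < today]
    return sorted(late, key=lambda e: e.next_review)

register = [
    ReviewItem("customer-support-ai", "product-manager-support", date(2026, 6, 24)),
    ReviewItem("invoice-extraction", "ops-lead", date(2026, 3, 1)),
]
print([e.system for e in overdue(register, date(2026, 7, 1))])
# β†’ ['invoice-extraction', 'customer-support-ai']
```

Wiring a check like this into a weekly job turns the `next_review` field from documentation into an alert.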

Vendor concentration risk

If your primary LLM provider increases pricing by 40%, has a multi-day outage, or discontinues your chosen model, what happens?

Mitigation strategies:

| Risk event | Mitigation |
|---|---|
| Pricing increase | Negotiate enterprise contract with price protection; abstract vendor behind internal interface to ease switching |
| Outage | Graceful degradation: AI features fail safely (fall back to non-AI version); SLA with financial penalties |
| Model deprecation | Pin to specific model version; subscribe to deprecation notices; document upgrade process |
| Data policy change | Track vendor policy changes; maintain contracts that protect your data rights |
| Vendor failure | For mission-critical use cases: multi-vendor architecture or self-hosted fallback |
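Two of these mitigations, abstracting the vendor behind an internal interface and degrading gracefully on outage, combine naturally into one pattern. A sketch with placeholder providers (the function names and fallback message are illustrative, not real vendor SDK calls):

```python
from typing import Protocol

class Completion(Protocol):
    def __call__(self, prompt: str) -> str: ...

def with_fallback(primary: Completion, fallback: Completion) -> Completion:
    """Internal interface: callers never import a vendor SDK directly.

    If the primary provider fails, degrade gracefully instead of
    surfacing a vendor outage to the user.
    """
    def complete(prompt: str) -> str:
        try:
            return primary(prompt)
        except Exception:
            return fallback(prompt)
    return complete

# Placeholder providers for illustration.
def flaky_vendor(prompt: str) -> str:
    raise TimeoutError("vendor outage")

def non_ai_fallback(prompt: str) -> str:
    return "AI assistance is temporarily unavailable; showing standard search results."

complete = with_fallback(flaky_vendor, non_ai_fallback)
print(complete("summarise this ticket"))
```

Because application code only ever calls the internal `complete` interface, swapping the primary provider after a pricing change touches one module, not every feature.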

Layer 3: Deep Dive

The regulatory landscape

AI regulation is evolving rapidly and varies by jurisdiction and sector. As of early 2026, the key regulatory frameworks to be aware of are:

  • EU AI Act: Risk-based classification system. High-risk AI uses (certain employment decisions, credit scoring, biometric identification, critical infrastructure) face mandatory conformity assessments, human oversight requirements, and transparency obligations.
  • GDPR and equivalent data protection laws: AI systems that process personal data for automated decision-making face specific requirements including explainability and the right to human review.
  • Sector-specific regulations: Financial services (fair lending, model risk management guidelines), healthcare (FDA oversight for clinical decision support), and employment (EEOC guidance on AI in hiring) each add sector-specific requirements.

Compliance risk management for AI is not a one-time legal review. It requires ongoing monitoring of both the system’s behaviour and the regulatory environment, with the ability to update the system as requirements evolve.

Building a continuous risk management practice

One-time risk reviews fail because AI systems change continuously and the risk landscape shifts. A continuous practice requires:

Change management integration: Every change to a production AI system (a prompt update, a model version change, a new tool, a new data source) triggers a lightweight risk reassessment. Not a full approval process; a structured checklist that takes under 30 minutes.
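That lightweight reassessment can be as simple as a checklist keyed by change type. The change types and questions below are illustrative, assuming a taxonomy matching the changes listed above:

```python
# Checklist questions per change type; unanswered questions block the change.
REASSESSMENT_CHECKLIST = {
    "prompt_update": [
        "Did the eval suite pass at or above threshold?",
        "Does the change alter tone, scope, or refusal behaviour?"],
    "model_version_change": [
        "Were evals re-run on the new version?",
        "Are guardrails validated against the new model?"],
    "new_tool": [
        "What is the worst action this tool can take?",
        "Is tool use logged and rate-limited?"],
    "new_data_source": [
        "Is the source verified and access-controlled?",
        "Does it introduce personal or regulated data?"],
}

def reassessment_questions(change_types: list[str]) -> list[str]:
    """Collect the checklist items triggered by a production change.

    An unrecognised change type escalates to full review rather than
    passing silently.
    """
    questions: list[str] = []
    for change in change_types:
        questions.extend(REASSESSMENT_CHECKLIST.get(
            change, ["Unrecognised change type: escalate to full review"]))
    return questions
```

A change request then carries its triggered questions with it, which keeps the reassessment structured without reopening the full approval gate.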

Production monitoring as risk intelligence: The signals from production monitoring are not just operational: they are risk signals. A spike in guardrail trigger rate may indicate a new attack pattern. A quality score drop may indicate model drift. A cluster of user complaints may indicate a failure mode the risk register missed.
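A guardrail-rate spike of the kind described can be caught with a simple ratio check against a baseline window; the threshold multiple and the sample counts below are illustrative:

```python
def trigger_rate(triggered: int, total: int) -> float:
    """Guardrail trigger rate over a time window."""
    return triggered / total if total else 0.0

def spike_alert(baseline_rate: float, current_rate: float, ratio: float = 3.0) -> bool:
    """Alert when the current rate exceeds the baseline by a multiple.

    A spike is a risk signal (possibly a new attack pattern or a content
    shift), not just operational noise.
    """
    return current_rate >= baseline_rate * ratio

# The scenario from this section: a 0.5% baseline jumping to 4.5%.
baseline = trigger_rate(50, 10_000)    # 0.5%
current = trigger_rate(450, 10_000)    # 4.5%
print(spike_alert(baseline, current))  # a ninefold increase trips the alert
```

Ratio-based alerting adapts as the baseline drifts, which matters because a fixed absolute threshold goes stale as the system and its traffic change.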

Regular risk register reviews: Quarterly at minimum. Review open actions, update likelihood and impact assessments based on what has happened, and add new risks discovered through monitoring.

Post-mortem integration: When an AI system causes a significant incident, the post-mortem should feed back into the risk register. What risk was not anticipated? What control failed? What new control is needed?

The model risk management parallel

Financial services regulators have developed model risk management (MRM) frameworks over decades for quantitative models used in trading, credit, and operations. The principles translate well to AI systems:

  • Models must be validated by an independent function before deployment
  • Model performance must be monitored in production against expected benchmarks
  • Material changes to models require revalidation
  • Model inventory must be maintained with ownership, purpose, and status

Organisations outside financial services can adopt these practices without the regulatory mandate. The discipline they create, treating models as governed assets, not code, is exactly what AI production systems need.


Managing AI Risk at the Org Level: Check your understanding

Q1

Six months after a customer-facing AI system was approved and launched, the engineering team upgrades the underlying model version, edits the system prompt to fix a tone issue, and adds a new retrieval source for product information. No governance review is triggered. What risk does this represent?

Q2

An organisation's primary LLM provider announces a 40% price increase and reduces the deprecation notice period from 12 months to 8 weeks for older model versions. Which risk category does this represent?

Q3

A company produces a comprehensive 60-page AI risk assessment at project launch, gets it signed off by legal and compliance, and files it. No further reviews occur. The system ships and runs in production for two years. Is this adequate risk management?

Q4

An AI system that assists with employee performance reviews is being deployed at a large organisation. Which risk category demands the most immediate and specific governance attention?

Q5

A production monitoring dashboard shows a sudden spike in the AI system's guardrail trigger rate: from 0.5% of requests to 4.5% over 48 hours. What is the most appropriate first interpretation of this signal?