Safety & Compliance
Prompt injection, supply chain risk, guardrails, EU AI Act โ what you need to ship responsibly.
The AI Threat Landscape
Every LLM application has a multi-layer attack surface: model, context, tools, memory, and outputs. Understanding what attackers want and what they can do is the prerequisite to building defences that actually hold. This module maps the threat landscape and establishes why defence in depth is not optional.
Prompt Injection
Prompt injection is the most prevalent attack class in LLM applications. It takes two forms: direct injection from user input, and indirect injection through retrieved documents or tool results. Both exploit the same root cause: the model cannot distinguish instructions from data when they share the same channel.
Jailbreaking and Policy Bypass
Jailbreaking is the attempt to get a model to produce output that its alignment training or system prompt prohibit. No defence is permanent: the arms race between jailbreak techniques and countermeasures is ongoing. This module covers the attack taxonomy and the multi-layer defences that reduce, but never eliminate, the risk.
Data Privacy and PII
LLM systems create new PII leakage vectors that traditional data protection controls do not cover: model memorisation, cross-user context leakage, and RAG pipelines that pull in customer records without scrubbing. This module covers detection, scrubbing, retention, and the vendor agreements that govern what happens to your data.
Guardrails Architecture
Guardrails are controls on inputs, outputs, or both: classifiers, validators, and policy checks that run independently of the model. Designing a guardrails architecture means choosing which controls to apply, how to layer them for coverage and performance, and how to calibrate them so false positives do not kill legitimate use.
Supply Chain Security
The AI supply chain, base model, fine-tuning data, adapters, Python packages, and API keys, has more attack surfaces than teams typically consider. A .pkl file is executable code. An unverified model weight can contain backdoors. This module covers the controls that keep your AI system trustworthy from training data to production inference.
Regulatory Landscape
The regulatory environment for AI is moving quickly. The EU AI Act introduced risk tiers and mandatory requirements. GDPR has always applied to automated decision-making. The US has the NIST AI RMF. This module maps the landscape for a B2B SaaS product using LLMs: what you likely need to document, what you need to avoid, and where you need legal counsel.
Incident Response for AI Systems
An AI incident is not a software incident: it involves model misbehaviour, safety violations, or data leakage, each with distinct root causes and remediation paths. This module covers detection, containment, investigation, and post-mortem structure for AI-specific incidents, and the one logging investment that makes all of it possible.