AI Security Academy
Print edition

Model Gateways and Secure AI Platform Engineering

A technical enterprise course for platform, DevOps, SRE, cloud security, AI infrastructure, and security architecture teams building model gateways, provider routing, policy enforcement, telemetry, RAG boundaries, and agent execution controls.

Build the approved path for safe AI adoption.

Course thesis

AI adoption becomes risky when every team connects directly to model providers, stores prompts differently, manages keys inconsistently, logs sensitive data accidentally, or gives agents uncontrolled tool access.

The durable enterprise solution is not only policy. It is an approved platform path that makes safe model access easier than unmanaged AI use.

Audience

This course is for platform engineers, DevOps engineers, SRE teams, cloud security teams, AI infrastructure teams, internal developer platform teams, security architects, product security teams, AppSec teams, SecOps teams, engineering managers, AI governance teams, and developer experience teams.

Learning outcomes

Learners will be able to:

  • map AI platform threat models
  • design approved provider routing and access patterns
  • explain model gateway architecture and control points
  • manage secrets, keys, quotas, budgets, and rate limits
  • design prompt and context logging with redaction
  • enforce policy and data classification at model access points
  • protect RAG, vector store, and tenant boundaries
  • control agent tool execution and approvals
  • design telemetry, observability, and incident response for AI platforms
  • create a secure AI gateway platform plan

\pagebreak

# Module 1: AI Platform Threat Model

Secure AI platform engineering starts with a threat model.

Key points

  • AI platform threat modeling maps model access, data flows, secrets, retrieval, agents, logs, policy points, and evidence.
  • Assets include provider accounts, API keys, prompts, context, vector stores, logs, tools, telemetry, and audit exports.
  • Risks include shadow AI, data exposure, key sprawl, policy bypass, unsafe agent action, and observability gaps.
  • Trust boundaries exist between users, applications, gateways, providers, retrieval systems, logs, agents, and tools.
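
One way to make trust boundaries reviewable is to record data flows as data and flag any flow that crosses a boundary with no recorded control. The sketch below is illustrative only: the zone names, the `Flow` type, and the `uncontrolled_crossings` helper are assumptions made for this example, not part of any standard threat-modeling tool.

```python
from dataclasses import dataclass

# Hypothetical zone assignment: each component belongs to one trust zone.
ZONES = {
    "developer_ide": "user",
    "product_api": "application",
    "model_gateway": "gateway",
    "model_provider": "provider",
    "vector_store": "retrieval",
}


@dataclass(frozen=True)
class Flow:
    source: str
    target: str
    data: str
    controls: tuple = ()  # e.g. ("authn", "redaction"); empty means none recorded


def uncontrolled_crossings(flows):
    """Return flows that cross a trust boundary with no recorded control."""
    return [
        f for f in flows
        if ZONES[f.source] != ZONES[f.target] and not f.controls
    ]


flows = [
    Flow("developer_ide", "model_provider", "source code"),
    Flow("product_api", "model_gateway", "customer prompt", ("authn", "policy")),
    Flow("model_gateway", "model_provider", "redacted prompt", ("redaction",)),
]

for finding in uncontrolled_crossings(flows):
    print(f"{finding.source} -> {finding.target}: {finding.data}")
```

In this example only the direct connection from a developer tool to a provider is flagged, which is the shadow AI pattern described above.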

Practice

Create a threat model for an organization with developer AI tools, product model calls, RAG, agents, and no central gateway.

\pagebreak

# Module 2: Approved Provider Routing and Access Patterns

Model access should follow approved paths.

Key points

  • The approved path must be safer and easier than unmanaged direct provider use.
  • Access patterns include direct provider access, model gateway access, managed cloud model services, local model runtimes, and hybrid routing.
  • Routing can depend on app, team, environment, role, data class, capability, latency, cost, region, and risk tier.
  • Exceptions need an owner, an expiration date, compensating controls, and a migration plan.
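
The routing inputs and the exception fields listed above can be sketched as a lookup table plus a time-boxed override. This is a minimal illustration, not a reference design: the route names, use-case labels, and the `RouteException` and `select_route` names are hypothetical, and a real policy would consider more inputs (region, latency, cost, risk tier).

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical routing table: (use case, data class) -> approved access pattern.
ROUTES = {
    ("public_content", "public"): "managed_cloud_model",
    ("internal_summarization", "internal"): "model_gateway",
    ("customer_rag", "confidential"): "model_gateway",
    ("coding_assistance", "internal"): "model_gateway",
}


@dataclass(frozen=True)
class RouteException:
    app: str
    owner: str
    expires: date
    route: str
    controls: tuple
    migration_plan: str


def select_route(app, use_case, data_class, today, exceptions=()):
    """Return the approved route, honoring unexpired exceptions; deny by default."""
    for exc in exceptions:
        if exc.app == app and today <= exc.expires:
            return exc.route
    return ROUTES.get((use_case, data_class), "deny")


legacy = RouteException(
    app="legacy-notebook",
    owner="data-science-lead",
    expires=date(2025, 6, 30),
    route="direct_provider",
    controls=("no customer data", "monthly key rotation"),
    migration_plan="move to gateway SDK",
)

print(select_route("legacy-notebook", "experiments", "internal", date(2025, 6, 1), [legacy]))
print(select_route("legacy-notebook", "experiments", "internal", date(2025, 7, 1), [legacy]))
```

The second call falls back to the default because the exception has expired and no approved route matches, which shows why an expiration date keeps exceptions from becoming permanent.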

Practice

Design an approved provider routing policy for public content generation, internal summarization, customer-data RAG, coding assistance, and agentic workflows.

\pagebreak

# Module 3: Model Gateway Architecture

A model gateway is a control point for model access.

Key points

  • A model gateway is useful when it enforces, observes, routes, and records important decisions.
  • Components include client, identity, authorization, policy engine, routing layer, redaction, telemetry, and audit.
  • The request lifecycle should authenticate, classify, enforce policy, route, log, and emit evidence.
  • A gateway does not replace secure application design.
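
The request lifecycle above can be sketched as a single handler that runs each stage in order. Everything here is an assumption made for illustration: the token store, the email-based classifier, the endpoint names, and the `handle` function are hypothetical stand-ins for real identity, classification, and routing services.

```python
import re
from dataclasses import dataclass

# Hypothetical credential store: token -> application identity.
APP_TOKENS = {"tok-support-bot": "support-bot"}
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


@dataclass(frozen=True)
class ModelRequest:
    token: str
    prompt: str


def handle(request, evidence_log):
    """Run one request through the lifecycle; return (route, prompt) or None."""
    # 1. Authenticate the caller.
    app = APP_TOKENS.get(request.token)
    if app is None:
        evidence_log.append({"app": "unknown", "decision": "block", "reason": "unauthenticated"})
        return None

    # 2. Classify: a crude stand-in that treats email addresses as confidential.
    data_class = "confidential" if EMAIL.search(request.prompt) else "internal"

    # 3. Enforce policy: confidential content is redacted before it leaves.
    decision = "allow_with_redaction" if data_class == "confidential" else "allow"
    prompt = request.prompt
    if decision == "allow_with_redaction":
        prompt = EMAIL.sub("[EMAIL]", prompt)

    # 4. Route by data class.
    route = "private_endpoint" if data_class == "confidential" else "default_endpoint"

    # 5 and 6. Log metadata only and emit evidence of the decision.
    evidence_log.append({
        "app": app,
        "data_class": data_class,
        "decision": decision,
        "route": route,
        "prompt_chars": len(request.prompt),
    })
    return route, prompt


log = []
print(handle(ModelRequest("tok-support-bot", "Email jane@example.com about renewal"), log))
print(log[-1])
```

The evidence record carries the decision and the prompt length but not the prompt itself, so the gateway can prove what it did without creating a second copy of the sensitive content.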

Practice

Design a model gateway architecture for an enterprise SaaS company.

\pagebreak

# Module 4: Secrets, Keys, Quotas, and Rate Limits

Model access creates credential, cost, and abuse risk.

Key points

  • Provider keys should not become the platform. Handing raw provider keys to each team is not a managed access path.
  • Prefer application identity and gateway-held provider credentials.
  • Quotas can apply by application, team, environment, tenant, user, model, provider, use case, token budget, request count, and cost.
  • Rate limits can reduce provider instability, runaway agents, retry storms, and tenant unfairness.
  • Key rotation should be designed and rehearsed before an incident forces it.
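
Rate limits and quotas solve different problems: a rate limit smooths bursts, while a quota caps total spend for a scope. The sketch below shows one common approach to each, a token bucket and a simple token budget. The class names and numbers are hypothetical, and the caller passes the current time explicitly so the behavior is easy to test.

```python
from dataclasses import dataclass, field


@dataclass
class TokenBucket:
    """Rate limit: allows bursts up to `capacity`, refilled at a steady rate."""
    capacity: float
    refill_per_second: float
    updated_at: float = 0.0
    tokens: float = field(init=False)

    def __post_init__(self):
        self.tokens = self.capacity  # start with a full bucket

    def allow(self, now, cost=1.0):
        # Refill for the time elapsed since the last call, capped at capacity.
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_second)
        self.updated_at = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


@dataclass
class TokenBudget:
    """Quota: a hard cap on model tokens for one scope, such as a team per month."""
    limit: int
    used: int = 0

    def charge(self, tokens):
        # Reject the whole request if it would exceed the remaining budget.
        if self.used + tokens > self.limit:
            return False
        self.used += tokens
        return True


bucket = TokenBucket(capacity=2, refill_per_second=1.0)
budget = TokenBudget(limit=1000)

print([bucket.allow(now=0.0) for _ in range(3)])  # third call exceeds the burst
print(bucket.allow(now=1.0))                      # one token refilled after 1 second
print(budget.charge(800), budget.charge(300))     # second charge would exceed the limit
```

In a gateway, one bucket and one budget would typically exist per scope (application, tenant, or agent run), so that a runaway agent or retry storm exhausts only its own allowance.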

Practice

Design key, quota, and rate-limit controls for sandbox, staging, production assistant, summarization jobs, and agent workflows.

\pagebreak

# Module 5: Prompt, Context, Logging, and Redaction

AI platforms need visibility, but visibility can create data exposure.

Key points

  • Treat prompt and context logs as sensitive data until they are classified, minimized, redacted, retention-limited, and access-controlled.
  • Logging levels include metadata-only, redacted content, sampled content, and full content by exception.
  • Default gateway logs should prioritize safe metadata.
  • Full content logging should be off by default and enabled only with explicit approval.
  • Debugging should use scoped temporary logging where possible.
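
The logging levels above can be sketched as a record builder that defaults to metadata only and refuses full content without an approval reference. The patterns, field names, and the `build_log_record` function are assumptions for illustration; real redaction needs far broader detection than two regular expressions, and sampled logging is omitted for brevity.

```python
import re
from enum import Enum

# Crude illustrative detectors; a real redactor would cover many more data types.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
LONG_DIGITS = re.compile(r"\b\d{9,}\b")


class LogLevel(Enum):
    METADATA_ONLY = "metadata_only"
    REDACTED_CONTENT = "redacted_content"
    FULL_CONTENT = "full_content"


def redact(text):
    """Replace email addresses and long digit runs with placeholders."""
    return LONG_DIGITS.sub("[NUMBER]", EMAIL.sub("[EMAIL]", text))


def build_log_record(app, model, prompt, level=LogLevel.METADATA_ONLY, approval_id=None):
    """Build a log record; content is included only at the higher levels."""
    record = {
        "app": app,
        "model": model,
        "prompt_chars": len(prompt),
        "log_level": level.value,
    }
    if level is LogLevel.REDACTED_CONTENT:
        record["prompt"] = redact(prompt)
    elif level is LogLevel.FULL_CONTENT:
        if not approval_id:
            raise PermissionError("full content logging requires an approval reference")
        record["prompt"] = prompt
        record["approval_id"] = approval_id
    return record


prompt = "Refund jane@example.com card 4111111111111111"
print(build_log_record("support-bot", "model-a", prompt))
print(build_log_record("support-bot", "model-a", prompt, LogLevel.REDACTED_CONTENT))
```

Making metadata-only the default argument means a caller has to opt in to content logging, which matches the safe-default posture described in the key points.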

Practice

Design a logging and redaction policy for public generation, internal summarization, customer-data RAG, agent execution, and incident debugging.

\pagebreak

# Module 6: Policy Enforcement and Data Classification

Model gateways become valuable when they enforce policy.

Key points

  • Data classification becomes useful when the platform can act on it.
  • Policy decisions include allow, allow with redaction, route, require approval, block, and log only.
  • Unknown data should receive conservative handling.
  • A policy matrix maps use case, data class, environment, model path, logging, redaction, approval, and evidence.
  • Every policy decision should emit safe evidence.
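
A policy matrix can be sketched as a table keyed by use case, data class, and environment, with a conservative fallback for anything that does not match. The rows, labels, and the `decide` function below are hypothetical examples, not recommended values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PolicyRow:
    decision: str
    model_path: str
    logging: str
    approval_required: bool


# Hypothetical matrix: (use case, data class, environment) -> handling.
MATRIX = {
    ("marketing_content", "public", "production"):
        PolicyRow("allow", "external_provider", "redacted_content", False),
    ("support_rag", "confidential", "production"):
        PolicyRow("allow_with_redaction", "gateway_private_route", "metadata_only", False),
    ("records_analysis", "regulated", "production"):
        PolicyRow("require_approval", "gateway_private_route", "metadata_only", True),
}

# Unknown or unmatched data gets the most conservative handling.
CONSERVATIVE_DEFAULT = PolicyRow("block", "none", "metadata_only", True)


def decide(use_case, data_class, environment):
    """Return the matching policy row plus a safe evidence record."""
    key = (use_case, data_class, environment)
    row = MATRIX.get(key, CONSERVATIVE_DEFAULT)
    evidence = {
        "use_case": use_case,
        "data_class": data_class,
        "environment": environment,
        "decision": row.decision,
        "matched_rule": key in MATRIX,
    }
    return row, evidence


row, evidence = decide("pasted_text", "unknown", "production")
print(row.decision, evidence)
```

The evidence record contains only labels and the decision, never the classified content, so it can be exported for audit without further redaction.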

Practice

Build a policy matrix for public marketing content, engineering notes, customer support RAG, source code analysis, regulated records, and unknown pasted data.

\pagebreak

# Module 7: RAG, Vector Store, and Tenant Boundaries

RAG security is platform security.

Key points

  • A RAG answer is only as safe as the retrieval boundary that selected its context.
  • RAG assets include source documents, ingestion jobs, embedding services, vector stores, metadata filters, retrieval APIs, citations, and lifecycle workflows.
  • Retrieval must enforce tenant, workspace, role, and document-level access before context is sent to a model.
  • Vector store risks include missing metadata, filter bypass, stale indexes, citation leaks, and shared index confusion.
  • RAG evidence helps prove that retrieval is bounded.
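
The retrieval boundary can be sketched as an authorization check applied to every candidate chunk before any text reaches the model. The `Chunk` and `Caller` types and the `bounded_context` function are hypothetical. This sketch filters after retrieval for clarity; a production system would usually also push the tenant and workspace filter into the vector query itself and keep a check like this as defense in depth.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    doc_id: str
    tenant_id: str
    workspace_id: str
    allowed_roles: frozenset
    deleted: bool
    text: str


@dataclass(frozen=True)
class Caller:
    tenant_id: str
    workspace_id: str
    role: str


def authorized(chunk, caller):
    """A chunk is usable only if tenant, workspace, role, and lifecycle all pass."""
    return (
        not chunk.deleted
        and chunk.tenant_id == caller.tenant_id
        and chunk.workspace_id == caller.workspace_id
        and caller.role in chunk.allowed_roles
    )


def bounded_context(candidates, caller, limit=3):
    """Filter candidates before prompting and return safe retrieval evidence."""
    kept = [c for c in candidates if authorized(c, caller)]
    selected = kept[:limit]
    evidence = {
        "tenant_id": caller.tenant_id,
        "candidates": len(candidates),
        "denied": len(candidates) - len(kept),
        "returned_doc_ids": [c.doc_id for c in selected],
    }
    return selected, evidence


candidates = [
    Chunk("d1", "tenant-a", "ws-1", frozenset({"admin", "standard"}), False, "Plan limits"),
    Chunk("d2", "tenant-b", "ws-9", frozenset({"admin"}), False, "Other tenant contract"),
    Chunk("d3", "tenant-a", "ws-1", frozenset({"admin"}), False, "Admin-only pricing"),
    Chunk("d4", "tenant-a", "ws-1", frozenset({"standard"}), True, "Deleted note"),
]

context, evidence = bounded_context(candidates, Caller("tenant-a", "ws-1", "standard"))
print(evidence)
```

The evidence stores document IDs and counts rather than chunk text, which supports the point that retrieval evidence should prove the boundary held without copying the documents.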

Practice

Design RAG tenant controls for a SaaS platform with multiple tenants, workspaces, admin and standard users, confidential documents, deleted documents, and shared vector infrastructure.

\pagebreak

# Module 8: Agent Tool Execution Controls

Agent platforms need execution controls.

Key points

  • Agent security depends on tool design, permission boundaries, approval gates, execution limits, and audit evidence.
  • A tool registry should capture owner, purpose, side effects, permissions, approvals, data class, limits, and evidence.
  • Approval gates must be meaningful and risk-based.
  • Execution controls include allowlisted tools, scoped identities, tenant-aware permissions, human approval, timeouts, retry budgets, cost budgets, audit logging, and kill switches.
  • Agent evidence must make actions reconstructable without unnecessarily storing sensitive payloads.
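
A tool registry and its execution controls can be sketched as a small authorization step that runs before every tool call. The tool names, owners, limits, and the `AgentRun` class are hypothetical, and only a subset of the controls listed above (allowlist, approval gate, call limit, kill switch, audit) is shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    owner: str
    side_effect: str        # "read", "write", or "external"
    requires_approval: bool
    max_calls_per_run: int


# Hypothetical registry; tools not listed here cannot be called at all.
REGISTRY = {
    t.name: t for t in (
        Tool("search_account", "support-platform", "read", False, 20),
        Tool("draft_reply", "support-platform", "write", False, 10),
        Tool("send_message", "support-platform", "external", True, 3),
    )
}


class AgentRun:
    def __init__(self, run_id):
        self.run_id = run_id
        self.killed = False   # kill switch for this run
        self.calls = {}       # tool name -> allowed call count
        self.audit = []       # one entry per authorization decision

    def authorize(self, tool_name, approved_by=None):
        """Decide whether a tool call may proceed and record the decision."""
        tool = REGISTRY.get(tool_name)
        if self.killed:
            decision = "deny_kill_switch"
        elif tool is None:
            decision = "deny_unregistered"
        elif self.calls.get(tool_name, 0) >= tool.max_calls_per_run:
            decision = "deny_limit"
        elif tool.requires_approval and approved_by is None:
            decision = "needs_approval"
        else:
            decision = "allow"
            self.calls[tool_name] = self.calls.get(tool_name, 0) + 1
        # Audit records the action and decision, not the tool payload.
        self.audit.append({
            "run_id": self.run_id,
            "tool": tool_name,
            "decision": decision,
            "approved_by": approved_by,
        })
        return decision


run = AgentRun("run-1")
print(run.authorize("search_account"))
print(run.authorize("send_message"))
print(run.authorize("send_message", approved_by="support-lead"))
```

Because every decision is appended to the audit list, including denials, the run can be reconstructed afterwards without storing message bodies or account data.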

Practice

Design execution controls for an internal support agent that can search account data, draft replies, create tasks, send messages, and escalate incidents.

\pagebreak

# Module 9: Observability, Telemetry, and Incident Response

AI platforms need operational visibility.

Key points

  • AI observability supports debugging, cost management, abuse detection, governance, and incident response.
  • Telemetry categories include model usage, provider routing, policy decisions, redaction decisions, blocked requests, quota events, retrieval events, tool calls, approvals, eval failures, latency, errors, and cost.
  • Alerts should focus on signals tied to security, reliability, cost, abuse, or customer impact.
  • Incident response should triage, contain, investigate, remediate, add regression tests, and produce evidence.
  • Incident evidence can itself be sensitive.
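
A telemetry schema and alert rules can be sketched together: a small event type with safe metadata fields, and a function that raises alerts only for signals tied to security or abuse. The field names, thresholds, and the `evaluate_alerts` function are assumptions for illustration, not a recommended schema.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class GatewayEvent:
    app: str
    provider: str
    decision: str       # "allow", "allow_with_redaction", or "block"
    redaction_ok: bool
    latency_ms: int
    cost_usd: float


def evaluate_alerts(events, block_ratio_threshold=0.5, min_requests=4):
    """Return (alert_name, app) pairs for redaction failures and high block ratios."""
    alerts = []
    per_app = defaultdict(list)
    for event in events:
        per_app[event.app].append(event)
        # Any redaction failure is alert-worthy on its own.
        if not event.redaction_ok:
            alerts.append(("redaction_failure", event.app))
    for app, app_events in per_app.items():
        blocked = sum(1 for e in app_events if e.decision == "block")
        # Require a minimum volume so a single blocked request does not page anyone.
        if len(app_events) >= min_requests and blocked / len(app_events) > block_ratio_threshold:
            alerts.append(("high_block_ratio", app))
    return alerts


events = [
    GatewayEvent("notebook", "provider-a", "block", True, 40, 0.0),
    GatewayEvent("notebook", "provider-a", "block", True, 38, 0.0),
    GatewayEvent("notebook", "provider-a", "block", True, 41, 0.0),
    GatewayEvent("notebook", "provider-a", "allow", True, 900, 0.02),
    GatewayEvent("support-bot", "provider-b", "allow_with_redaction", False, 1200, 0.05),
]

print(evaluate_alerts(events))
```

The events carry no prompt or response content, so the same records can feed dashboards, cost reports, and incident timelines without widening access to sensitive data.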

Practice

Design telemetry, alerts, and incident response for model gateway requests, customer-data RAG, agent tool execution, blocked policy decisions, and redaction failures.

\pagebreak

# Module 10: Capstone Secure AI Gateway Platform

The final deliverable is a secure AI gateway platform plan.

Required sections

  • AI platform threat model
  • approved access paths
  • provider routing policy
  • gateway architecture
  • identity and authorization model
  • key management and rotation
  • quota and budget controls
  • prompt logging and redaction policy
  • data classification policy matrix
  • RAG and vector store boundary design
  • agent tool execution controls
  • telemetry schema
  • alerting plan
  • incident response playbook
  • exception workflow
  • developer experience plan
  • rollout roadmap
  • evidence and audit plan

Practice

Build a secure AI gateway platform plan for an organization with developer coding assistance, public content generation, internal summarization, customer-data RAG, support agents, product AI features, and experimental notebooks.

\pagebreak

# Appendix A: Quick Checklists

Model gateway architecture

  • Client or SDK available.
  • Caller authentication implemented.
  • Application authorization implemented.
  • Provider routing implemented.
  • Policy engine integrated.
  • Data classification available.
  • Redaction path defined.
  • Quotas enforced.
  • Rate limits enforced.
  • Telemetry emitted.
  • Audit export supported.
  • Incident controls available.

RAG boundaries

  • Source inventory complete.
  • Ingestion metadata complete.
  • Tenant ID propagated.
  • Workspace or account scope enforced.
  • Role scope enforced.
  • Document classification applied.
  • Query-time authorization enforced.
  • Citation authorization checked.
  • Deleted documents removed or blocked.
  • Revoked documents removed or blocked.
  • Cross-tenant canary tests run.
  • Retrieval evidence stored safely.

Agent execution

  • Tool registry exists.
  • Tool owners assigned.
  • Tool side effects classified.
  • Tool permissions scoped.
  • Approval gates defined.
  • Retry limits defined.
  • Timeouts defined.
  • Cost budgets defined.
  • Tenant-aware authorization enforced.
  • Audit events emitted.
  • Kill switch available.

\pagebreak

# Appendix B: Sample Prompt Templates

AI platform threat model

Create an AI platform threat model.

Organization context: [context]

Known AI use cases: [use cases]

Known model access paths: [access paths]

Known data classes: [data classes]

Known RAG or agent systems: [systems]

Output:

  • assets
  • actors
  • model access paths
  • data flows
  • trust boundaries
  • likely threats
  • existing controls
  • missing controls
  • evidence needed
  • owner map

Secure AI gateway platform plan

Build a secure AI gateway platform plan.

Organization: [organization]

Current state: [current state]

Use cases: [use cases]

Constraints: [constraints]

Output:

  • threat model
  • approved access paths
  • provider routing matrix
  • gateway architecture
  • secrets and rotation model
  • quota and budget controls
  • prompt logging and redaction policy
  • data classification policy matrix
  • RAG tenant boundary controls
  • agent tool execution controls
  • telemetry schema
  • alert plan
  • incident response playbook
  • exception workflow
  • developer experience plan
  • rollout roadmap

\pagebreak

# Final Message

Secure AI platform engineering is about paved paths.

Do not only tell teams what not to do. Give them approved model access, safe defaults, useful SDKs, observable policy decisions, controlled agent execution, and evidence that supports trust.

Build the platform path that makes safe AI adoption the easiest path.