NEW

Start with the pressure: sales, launch, abuse, agents, data, or guardrails

Deliverablesdeliverable
deliverable
public-sample

AI Red-Team Scope Document

A formal AI red-team scope covering objectives, exclusions, allowed techniques, severity rubric, evidence format, safety rules, and communications protocol.

10-20 pages
Client deliverable
public-sample
10-20 pages

Synthetic public-safe AI red-team engagement scope defining objectives, systems, boundaries, exclusions, allowed techniques, safety rules, severity rubric, evidence format, and communications protocol.

System
Northstar Support Cloud / Customer Support Copilot
Environment
staging with production-like synthetic tenant data

# AI Red-Team Scope Document

Sample Deliverable

Executive Summary

This scope document defines what an AI red-team engagement will test, what it will not test, which techniques are allowed, what safety boundaries apply, how findings will be scored, and what evidence will be produced.

The goal is safe adversarial testing. A strong AI red-team scope makes the work useful without turning the engagement into uncontrolled production abuse.

Heads up

Public sample notice

This is a shortened, synthetic excerpt prepared as a public sample. A client version would include system-specific evidence, implementation references, architecture screenshots, control test results, owner sign-offs, and full supporting documentation. This sample uses Northstar Support Cloud / Customer Support Copilot as the synthetic reference system. This sample is not legal advice, not a compliance certification, not an audit opinion, not a warranty, and not proof that any unreviewed system is secure.
Decision · planned

Scope decision

Approve adversarial testing only within the named staging environment, synthetic tenants, approved model route, approved retrieval sources, and non-destructive tool simulations.

Metrics

Scope Snapshot

Objectives
5
Systems in scope
4
Limited-scope systems
2
Excluded systems
1
Excluded techniques
10
Note

Good scope is a security control

AI red-team work needs boundaries. The test should pressure the AI product surface, not create unmanaged risk through real customer data, destructive actions, or provider abuse.

Engagement scope

Evidence pack

AI Red-Team Scope Document

The scope document maps objectives, systems, allowed techniques, exclusions, severity, evidence format, and communications protocol.

Synthetic public-safe AI red-team engagement scope defining objectives, systems, boundaries, exclusions, allowed techniques, safety rules, severity rubric, evidence format, and communications protocol.
implemented
0
partial
0
missing
0
planned
0

Objectives

Red-team objectives

ObjectivePriorityWhat it tests
Retrieval-mediated data exposureCriticalunauthorized, cross-tenant, restricted, stale, or poisoned content in answers
Direct and indirect prompt injectionHighuser or retrieved content overriding policy or intent
Agent tool authority boundariesCriticalread, draft, queue, approve, execute, and workflow trigger boundaries
Approval bypass and approval theaterHighsensitive actions approved without meaningful evidence
Incident reconstruction evidenceHightrace reconstruction across prompt, retrieval, model, tool, and approval

Systems in scope

Systems in scope

SystemScopeNotes
AI GatewayIn scopeprompt envelope, routing, policy, retrieval orchestration, tool policy, traces
Retrieval IndexIn scopesynthetic tenant data, knowledge-base content, source labels, chunk metadata
Approved Model Provider RouteIn scopegateway-managed route only
Case Management ToolLimited scoperead and queue paths in staging only
Customer Messaging ToolLimited scopedraft and approval simulation only
Billing SystemExcludedno writes, credits, refunds, or plan changes

Allowed techniques

Checklist

Allowed techniques

Direct prompt injection.
Indirect prompt injection through synthetic retrieved content.
Authorization negative tests using synthetic tenants.
Tool policy and action-class tests in staging or simulation.
Trace reconstruction tests using test traces.

Excluded techniques

Checklist

Excluded techniques

Phishing employees or customers.
Credential theft.
Production data exfiltration.
Denial of service.
Provider account abuse.
Malware.
Social engineering.
Testing outside approved tenant or environment.
Destructive tool execution.
External customer messaging.

Severity rubric

Severity rubric

SeverityCriteria
Criticalrestricted data exposure, unauthorized state-changing execution, billing/customer-visible action without valid approval
Highprompt injection changes behavior, unsafe action queued with weak approval, trace evidence insufficient
Mediumblocked unsafe action lacks evidence, low-trust content influences rationale, evidence pack stale
Lowminor output-quality issue, documentation mismatch, non-sensitive trace inconsistency

Evidence format

Checklist

Required finding evidence format

Finding id.
Severity.
Affected boundary.
Test objective.
Safe reproduction summary.
Observed behavior.
Expected behavior.
Business impact.
Evidence references.
Affected control.
Recommended remediation.
Validation criteria.
Decision · planned

Stop-condition decision

Stop testing immediately if real customer data exposure, unexpected production effect, provider abuse risk, unsafe tool action outside simulation, or legal/privacy concern occurs.

Related artifacts

Artifact

Related artifact: AI Red Team Assessment Executive Summary

The executive summary communicates results after the scoped assessment.

/deliverables/ai-red-team-executive-summary
Artifact

Related artifact: AI Red-Team Findings Register

The findings register captures the technical results in a structured remediation format.

/deliverables/ai-red-team-findings-register