aisecurity.llc
AI SECURITY ENGINEERING
Los Angeles, CA · Athens, GR
hello@aisecurity.llc
Legal Agreement · Negotiation Draft
AI Red Team Rules of Engagement
Rules of engagement for authorized AI red-team validation, including targets, test windows, allowed techniques, prohibited actions, safety controls, evidence handling, escalation paths, and stop conditions.
aisecurity.llc Addendum to Statement of Work · Negotiation Draft
Required caveat: This template is provided for transparency and scoping. It is not legal advice and does not replace a final reviewed and signed agreement. All bracketed placeholders must be completed and confirmed before testing begins. Testing must not start until this document is signed by both parties.
1. Engagement Summary
| Field | Value |
|---|---|
| Client | [CLIENT_ENTITY_NAME] |
| Provider | aisecurity.llc |
| Engagement Name | [RED TEAM ENGAGEMENT TITLE] |
| Related SOW | [SOW_REFERENCE] |
| Test Window | [START DATE] to [END DATE] |
| Client Security Contact | [NAME · EMAIL · PHONE] |
| Provider Test Lead | David Wolf · hello@davidwolf.org |
2. Authorized Targets
Testing is authorized against the following systems, endpoints, and interfaces only:
- System: [DESCRIBE — e.g., internal LLM-powered chat endpoint]
- Endpoint: [URL / IP RANGE / ENVIRONMENT — e.g., staging.internal/api/chat]
- Model or provider: [e.g., OpenAI GPT-4o via internal proxy]
- RAG corpus: [e.g., internal document store — read-only access]
- Agent workflow: [DESCRIBE AGENT BEHAVIOR AND TOOL ACCESS]
- Other: [DESCRIBE]
Testing is explicitly limited to the above. Any system, endpoint, account, or data source not listed here is out of scope.
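One way to keep test tooling inside this boundary is a default-deny scope check that runs before any request is sent. The following is a minimal sketch, not part of the agreement: the `AUTHORIZED_TARGETS` mapping and the `is_in_scope` helper are hypothetical names, and the host and path are taken from the example placeholder above rather than from a real engagement. It does not normalize path tricks such as `..` segments, so it complements the signed scope rather than replacing review.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist for illustration; in practice, copy hosts and
# path prefixes from the completed Section 2 entries.
AUTHORIZED_TARGETS = {
    "staging.internal": ("/api/chat",),
}


def is_in_scope(url: str) -> bool:
    """Return True only if the URL's host and path match an authorized target."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    prefixes = AUTHORIZED_TARGETS.get(host)
    if prefixes is None:
        return False  # host not listed: out of scope by default
    path = parts.path or "/"
    # Match the prefix exactly or as a parent path segment, so that
    # "/api/chatter" does not slip through as a match for "/api/chat".
    return any(
        path == prefix or path.startswith(prefix.rstrip("/") + "/")
        for prefix in prefixes
    )
```

Anything the check rejects is treated as out of scope until the target list in this section is amended in writing.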
3. Authorized Attack Techniques
The following attack technique families are authorized:
- Prompt injection (direct and indirect)
- Jailbreak and policy bypass attempts
- Context window manipulation
- RAG corpus poisoning simulation (read-only, no write access unless separately authorized)
- Tool misuse and function-call abuse
- Unsafe action path exploration
- Output handling failures
- Data exfiltration via model output
- Agent goal hijacking
- Multimodal input abuse (if applicable)
- Other: [DESCRIBE]
Prohibited Actions
The following are explicitly prohibited regardless of technical capability:
- Accessing, modifying, exfiltrating, or destroying production data from any source not listed in Section 2
- Attacking systems, endpoints, or accounts not listed in Section 2
- Social engineering of Client personnel unless separately authorized
- Denial of service or load testing unless separately authorized
- Introducing persistent artifacts (backdoors, modified prompts, etc.) into any environment
- Sharing test artifacts, prompts, or findings outside the named delivery contacts
- Publishing, disclosing, or referencing Client systems or findings without written consent
4. Test Windows and Scheduling
| Window | Dates / Times |
|---|---|
| Primary test window | [DATE RANGE] |
| Authorized testing hours | [e.g., Mon–Fri 09:00–18:00 PT, or "anytime"] |
| Blackout dates | [DATES TO AVOID — e.g., product launches, major releases] |
| Coordination cadence | [DAILY STANDUP / ASYNC / END OF WINDOW ONLY] |
Provider will notify Client security contact at the start and end of each active testing session.
5. Evidence and Data Handling
- All test artifacts (prompts, outputs, screenshots, logs) are treated as confidential Client information.
- Evidence will be stored in a [SECURE LOCATION — e.g., encrypted shared folder, dedicated vault] accessible only to named delivery contacts.
- Raw test outputs containing sensitive data will be redacted before inclusion in reports.
- Evidence will be retained for [X DAYS / until final delivery] and then deleted or returned per Client instruction.
- Provider will not use test artifacts to train AI models or for any purpose beyond this engagement.
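As an illustration of the redaction step, the sketch below replaces matches of a few sensitive-data patterns with labeled placeholders before text is copied into a report. It is an assumption-laden example: the `REDACTION_PATTERNS` table and the `redact` helper are hypothetical, the two patterns shown (email addresses and US Social Security number formats) are examples only, and pattern matching alone will miss sensitive data that has no fixed shape, so manual review is still needed.

```python
import re

# Illustrative patterns only; the real list depends on what data the
# engagement can expose and should be agreed with the Client.
REDACTION_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text: str) -> str:
    """Replace each match with a labeled placeholder such as [REDACTED:EMAIL]."""
    for label, pattern in REDACTION_PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{label}]", text)
    return text
```

Labeled placeholders keep the report readable, since a reviewer can still see what kind of data the model exposed without seeing the data itself.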
6. Emergency Stop Procedure
If testing causes unexpected production impact, uncovers a critical zero-day, or exposes regulated data:
- Provider immediately stops all test activity and notifies Client security contact via [PHONE / SIGNAL / SECURE EMAIL].
- Client security contact confirms receipt within [X MINUTES].
- Both parties jointly assess impact before testing resumes.
- If Client cannot be reached within [X MINUTES], Provider keeps all testing halted until contact is re-established.
Emergency contact (Client): [NAME · PHONE · BACKUP EMAIL]
Emergency contact (Provider): David Wolf · hello@davidwolf.org
7. Reporting
| Deliverable | Due Date |
|---|---|
| Daily or session status notes | [ONGOING / END OF EACH SESSION] |
| Preliminary findings (critical severity) | [WITHIN X HOURS of discovery] |
| Draft report | [DATE] |
| Client review period | [X BUSINESS DAYS] |
| Final report | [DATE] |
| Readout session | [DATE or TBD] |
Reports will include: attack scenarios executed, findings with evidence, severity and exploitability notes, and mitigation guidance.
8. Severity and Escalation
| Severity | Response |
|---|---|
| Critical | Immediate notification to Client security contact. Pause testing pending acknowledgment. |
| High | Notification within [X hours]. Continue testing unless Client requests pause. |
| Medium / Low / Info | Include in report. No pause required. |
Severity is assessed by Provider based on exploitability and impact against the AI system and any downstream systems.
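The escalation table can be encoded as a small lookup so that test tooling applies the same response every time a finding is logged. This is a minimal sketch under stated assumptions: the `Severity`, `Response`, and `response_for` names are hypothetical, and the notification deadline for High findings is left out because it is still a placeholder in this draft.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    CRITICAL = "critical"
    HIGH = "high"
    MEDIUM = "medium"
    LOW = "low"
    INFO = "info"


@dataclass(frozen=True)
class Response:
    notify_client: bool  # notify the Client security contact ahead of the report
    pause_testing: bool  # stop testing until the Client acknowledges


def response_for(severity: Severity) -> Response:
    """Map a finding's severity to the response agreed in the table above."""
    if severity is Severity.CRITICAL:
        return Response(notify_client=True, pause_testing=True)
    if severity is Severity.HIGH:
        # Testing continues by default; the Client may still request a pause.
        return Response(notify_client=True, pause_testing=False)
    # Medium, Low, and Info findings go into the report with no pause.
    return Response(notify_client=False, pause_testing=False)
```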
9. Caveats
- This engagement is time-bound and scenario-scoped. It does not guarantee exhaustive vulnerability discovery.
- Findings are limited to the authorized targets and attack techniques listed above.
- Test results reflect the state of the system at the time of testing. Changes after the engagement may affect the validity of findings.
- No certification, compliance approval, or formal audit opinion is provided.
- Public disclosure of findings requires written agreement from both parties.
10. Signatures
Testing must not begin until both parties have signed.
Client (authorizing test against listed targets):
Signature: ______________________________
Name: ______________________________
Title: ______________________________
Date: ______________________________
Provider (aisecurity.llc):
Signature: ______________________________
Name: David Wolf
Title: Principal
Date: ______________________________
These materials are provided for transparency and scoping. They are not legal advice and do not replace a final signed agreement. Consult qualified legal counsel before execution.