AIPSA Academy Lab · 45 min · Practitioner · Defend

AI Logging & Forensics Lab

Evaluate AI telemetry coverage by analyzing real guardrail denial logs and SOC event streams. Identify forensic gaps, correlate events across log sources, and produce a chain-of-custody and investigatability assessment.

Mission

Primary objective

Analyze both log sources. Classify each event, correlate across sources, identify what evidence is present and what is missing, and write an investigatability assessment that tells the security team whether they can answer the questions an incident response requires.

Brief

Scenario

Post-incident AI telemetry review

A security operations team has flagged anomalous activity in an AI assistant deployment. You have access to two log sources: a guardrail denial log (freeform) and a SOC event stream (structured JSONL). Within each source, the events span at most 15 minutes. Your job is to determine whether this constitutes an incident, what happened, and whether the logs are sufficient to support a full forensic investigation.

Objectives

Classify AI security events from guardrail and SOC log formats.
Correlate events across log sources to reconstruct incident timelines.
Identify forensic gaps: what information is missing that an investigation would require.
Produce a chain-of-custody document and investigatability assessment.

Prerequisites

Understand basic log formats: structured JSON events and freeform log lines.
Review MITRE ATT&CK and ATLAS event classification concepts.
Complete the RAG Security Lab or review AI pipeline architectures.

Expected signals

prompt injection attempt
PII exposure
tool abuse
token spike anomaly
guardrail denial
forensic gap

Prepare

Reading materials

AIPSA Handbook · Ch 7

Chapter 7 — Data Exposure and Privacy

PII in prompt and retrieval context, cross-tenant data leakage, training data exposure, prompt log privacy, data minimization, and retention controls.

2.2 MB

AIPSA Handbook · Ch 10

Chapter 10 — Logging and Telemetry

Prompt/response/tool-call log requirements, trace correlation, PII-safe telemetry, abuse monitoring signal design, and the minimum log surface for AI forensics.

2.2 MB

AIPSA Handbook · Ch 11

Chapter 11 — Detection Engineering

Building detectors for prompt injection, jailbreaks, credential exposure, anomalous tool calls, and AI-specific abuse patterns using telemetry pipelines.

2.3 MB

AIPSA Handbook · Ch 12

Chapter 12 — Incident Response

AI incident classification, containment playbooks, prompt/tool-call forensics, rollback procedures, customer notification, and post-incident control improvements.

2.3 MB

AIPSA Field Guide · Ch 9

Privacy and Data Protection in AI Systems

Customer data usage, training policy, retention, prompt/log privacy, PII redaction, data minimization, data residency, and privacy controls for AI systems.

~2 MB

AIPSA Field Guide · Ch 12

Incident Response and AI Observability

AI incident detection, prompt/response/tool-call logs, traceability, abuse monitoring, alerting, forensics, containment, rollback, and post-incident learning.

~2 MB

Do not rebuild

Reusable source assets

fixture · static/read-only

Guardrail denial log

Freeform log file capturing guardrail trigger events with timestamps, request IDs, detected patterns, and applied policies. Four events covering injection detection, PII redaction, and request blocking.

llm-attack-range/ai-security-range/fixtures/logs/guardrail-denials.log

fixture · static/read-only

SOC event stream

Structured JSONL SOC events: WAF block, guardrail trigger, tool abuse anomaly, and token spike. Every event includes a timestamp and an event type; source IP, user ID, payload snippet, and MITRE ID appear only on some events.

llm-attack-range/ai-security-range/fixtures/simulation-logs/soc-events.jsonl

Sample inputs

Guardrail denial log · log

From fixtures/logs/guardrail-denials.log. Note: the log captures policy trigger and action but does NOT include the full prompt, user identity, session context, or downstream tool calls.

2026-05-02T10:00:01Z [INFO] Request received: id=req_123
2026-05-02T10:00:02Z [WARN] Guardrail trigger: Prompt Injection detected. Pattern: "Ignore previous instructions". Policy: P01_STRICT
2026-05-02T10:00:02Z [ERROR] Request req_123 blocked. Action: 403 Forbidden.
2026-05-02T10:05:15Z [WARN] Guardrail trigger: PII leak attempt. Field: SSN. Policy: P02_PII_REDACT

SOC event stream · text

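From fixtures/simulation-logs/soc-events.jsonl. These events are structured JSON, but field coverage varies. In this sample only the WAF_BLOCK event carries a MITRE ID (AML.T0051, an ATLAS technique); the GUARDRAIL_TRIGGER, TOOL_ABUSE_ANOMALY, and TOKEN_SPIKE events have no MITRE mapping. Is that a gap?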

{"timestamp": "2026-05-02T11:00:00Z", "event_type": "WAF_BLOCK", "source_ip": "192.168.1.50", "payload_snippet": "Ignore previous instructions", "mitre_id": "AML.T0051"}
{"timestamp": "2026-05-02T11:05:00Z", "event_type": "GUARDRAIL_TRIGGER", "user_id": "user_821", "rule": "PII_REDACTION", "action": "REDACTED"}
{"timestamp": "2026-05-02T11:10:00Z", "event_type": "TOOL_ABUSE_ANOMALY", "tool_id": "python_repl", "command": "os.system('rm -rf /')", "status": "BLOCKED"}
{"timestamp": "2026-05-02T11:15:00Z", "event_type": "TOKEN_SPIKE", "user_id": "user_999", "tokens": 128000, "anomaly_score": 0.95}

Track progress

Lab steps

Classify each log event

Read both log sources. For each event, classify it by type (prompt injection attempt, PII exposure, tool abuse, DoS/resource exhaustion, policy violation) and severity (low/medium/high/critical). Note which events have MITRE IDs and which do not.

Evidence prompt: Build a table: timestamp, source, event type, severity, MITRE ID (or 'none'). Flag any event that is missing a classification field you would expect.
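As a starting point for the table, the sketch below normalizes both sample inputs into rows with the requested columns. The regular expression, the "TBD" severity placeholder, and the function names are assumptions made for illustration; severity is deliberately left for the analyst to assign, since neither log records it.

```python
import json
import re

# Sample lines copied from the two fixtures shown in "Sample inputs".
GUARDRAIL_LOG = """\
2026-05-02T10:00:01Z [INFO] Request received: id=req_123
2026-05-02T10:00:02Z [WARN] Guardrail trigger: Prompt Injection detected. Pattern: "Ignore previous instructions". Policy: P01_STRICT
2026-05-02T10:00:02Z [ERROR] Request req_123 blocked. Action: 403 Forbidden.
2026-05-02T10:05:15Z [WARN] Guardrail trigger: PII leak attempt. Field: SSN. Policy: P02_PII_REDACT
"""

SOC_STREAM = """\
{"timestamp": "2026-05-02T11:00:00Z", "event_type": "WAF_BLOCK", "source_ip": "192.168.1.50", "payload_snippet": "Ignore previous instructions", "mitre_id": "AML.T0051"}
{"timestamp": "2026-05-02T11:05:00Z", "event_type": "GUARDRAIL_TRIGGER", "user_id": "user_821", "rule": "PII_REDACTION", "action": "REDACTED"}
{"timestamp": "2026-05-02T11:10:00Z", "event_type": "TOOL_ABUSE_ANOMALY", "tool_id": "python_repl", "command": "os.system('rm -rf /')", "status": "BLOCKED"}
{"timestamp": "2026-05-02T11:15:00Z", "event_type": "TOKEN_SPIKE", "user_id": "user_999", "tokens": 128000, "anomaly_score": 0.95}
"""

# Assumed line shape: "<timestamp> [<LEVEL>] <message>".
LINE_RE = re.compile(r"^(\S+) \[(\w+)\] (.+)$")


def guardrail_rows(text):
    """Turn freeform guardrail lines into table rows."""
    rows = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip blank or unrecognized lines
        timestamp, _level, message = match.groups()
        rows.append({
            "timestamp": timestamp,
            "source": "guardrail",
            "event_type": message.split(".")[0],  # first sentence as a label
            "severity": "TBD",   # analyst judgment, not present in the log
            "mitre_id": "none",  # the guardrail log carries no MITRE IDs
        })
    return rows


def soc_rows(text):
    """Turn JSONL SOC events into table rows."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        event = json.loads(line)
        rows.append({
            "timestamp": event["timestamp"],
            "source": "soc",
            "event_type": event["event_type"],
            "severity": "TBD",
            "mitre_id": event.get("mitre_id", "none"),
        })
    return rows


if __name__ == "__main__":
    for row in guardrail_rows(GUARDRAIL_LOG) + soc_rows(SOC_STREAM):
        print(" | ".join(row.values()))
```

Using `event.get("mitre_id", "none")` makes the missing mapping explicit in every row instead of silently dropping the column.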

Correlate events across sources

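The guardrail log and SOC stream record similar activity, but check the timestamps before assuming the windows overlap: the guardrail samples are stamped 10:00 to 10:05Z and the SOC samples 11:00 to 11:15Z. Determine whether any events in the two sources describe the same underlying activity. What is the timeline? Is there a user or session ID that connects events across sources?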

Evidence prompt: Describe any correlations you found. If the same event appears in both sources, what additional detail does each source add? What is still unknown about the actor or session?
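Because the two samples share no request, user, or session ID, any correlation has to rest on weaker evidence such as matching indicators and timing. The sketch below pairs events by a hand-assigned `category` label and reports the time difference. The category labels and the `candidate_pairs` helper are hypothetical, and a pair produced this way is a lead to investigate, not proof that two records describe the same request.

```python
from datetime import datetime, timezone


def parse_ts(value):
    """Parse the 'YYYY-MM-DDTHH:MM:SSZ' timestamps used in both fixtures."""
    parsed = datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


# Hand-normalized events. "category" is an analyst-assigned label (assumption),
# not a field that exists in either log.
GUARDRAIL = [
    {"timestamp": "2026-05-02T10:00:02Z", "category": "prompt_injection",
     "indicator": "Ignore previous instructions"},
    {"timestamp": "2026-05-02T10:05:15Z", "category": "pii",
     "indicator": "SSN"},
]
SOC = [
    {"timestamp": "2026-05-02T11:00:00Z", "category": "prompt_injection",
     "event_type": "WAF_BLOCK"},
    {"timestamp": "2026-05-02T11:05:00Z", "category": "pii",
     "event_type": "GUARDRAIL_TRIGGER"},
]


def candidate_pairs(guardrail, soc):
    """Pair events that share a category and report SOC-minus-guardrail seconds."""
    pairs = []
    for g_event in guardrail:
        for s_event in soc:
            if g_event["category"] != s_event["category"]:
                continue
            delta = parse_ts(s_event["timestamp"]) - parse_ts(g_event["timestamp"])
            pairs.append((g_event["timestamp"], s_event["event_type"],
                          delta.total_seconds()))
    return pairs


if __name__ == "__main__":
    for g_ts, soc_type, seconds in candidate_pairs(GUARDRAIL, SOC):
        print(f"{g_ts} <-> {soc_type}: {seconds:.0f}s apart")
```

A near-constant offset between paired events could point to clock or timezone skew between the sources, or to unrelated activity that happens to look similar. The logs alone cannot distinguish the two, which is itself worth recording in the correlation notes.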

Identify forensic gaps

An incident investigation needs to answer: Who? What? When? How? Impact? For each unanswered question, identify what log field or log source is missing. Common gaps: no session token in the guardrail log, no full prompt captured, no user identity on the WAF block, no completion log to see what the model returned.

Evidence prompt: List each forensic question and whether the current logs can answer it. For gaps: name the missing field or log source.
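One way to make the gap list systematic is to map each forensic question to the fields that could help answer it, then check which of those fields an event actually carries. In the sketch below the mapping is an assumption: `session_id`, `prompt`, and `completion` are hypothetical field names that appear in neither fixture, and the presence of a field is only a rough proxy for being able to answer the question.

```python
import json

# Hypothetical mapping from forensic question to candidate fields.
QUESTION_FIELDS = {
    "Who": ["user_id", "session_id", "source_ip"],
    "What": ["prompt", "payload_snippet", "command"],
    "When": ["timestamp"],
    "How": ["tool_id", "rule", "mitre_id"],
    "Impact": ["completion", "action", "status"],
}


def unanswered(event):
    """Return the questions for which the event has none of the candidate fields."""
    return [
        question
        for question, fields in QUESTION_FIELDS.items()
        if not any(field in event for field in fields)
    ]


# Two events copied from the SOC event stream sample.
WAF_BLOCK = json.loads(
    '{"timestamp": "2026-05-02T11:00:00Z", "event_type": "WAF_BLOCK", '
    '"source_ip": "192.168.1.50", "payload_snippet": "Ignore previous instructions", '
    '"mitre_id": "AML.T0051"}'
)
TOKEN_SPIKE = json.loads(
    '{"timestamp": "2026-05-02T11:15:00Z", "event_type": "TOKEN_SPIKE", '
    '"user_id": "user_999", "tokens": 128000, "anomaly_score": 0.95}'
)

if __name__ == "__main__":
    for event in (WAF_BLOCK, TOKEN_SPIKE):
        print(event["event_type"], "-> no candidate field for:", unanswered(event))
```

Under this mapping the WAF block has no candidate field for Impact, and the token spike has none for What, How, or Impact. A source IP also counts toward Who here even though it does not identify a user, so treat each "answered" question as a prompt for closer reading.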

Write the investigatability assessment

Produce a structured assessment: can this set of logs support a full incident investigation? Rate each of the five forensic dimensions (attribution, timeline, impact, containment evidence, chain of custody) as sufficient, partial, or insufficient. Fill in the evidence artifact builder below.

Evidence prompt: Fill in all required fields in the evidence artifact builder. The "Overall log sufficiency" field determines your overall finding.
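The five dimension ratings can be kept in a small structure so that the overall finding is derived consistently. The sketch below assumes a "weakest dimension wins" roll-up, which is one defensible convention and not a rule stated by the lab; the `Rating` enum and `overall` function are hypothetical names.

```python
from enum import IntEnum


class Rating(IntEnum):
    INSUFFICIENT = 0
    PARTIAL = 1
    SUFFICIENT = 2


# The five forensic dimensions named in this step.
DIMENSIONS = (
    "attribution",
    "timeline",
    "impact",
    "containment evidence",
    "chain of custody",
)


def overall(ratings):
    """Roll up per-dimension ratings; assumes the weakest dimension decides."""
    missing = [name for name in DIMENSIONS if name not in ratings]
    if missing:
        raise ValueError(f"unrated dimensions: {missing}")
    return min(ratings[name] for name in DIMENSIONS)


if __name__ == "__main__":
    # Placeholder values for illustration only, not the expected lab answer.
    example = {name: Rating.SUFFICIENT for name in DIMENSIONS}
    example["impact"] = Rating.PARTIAL
    print(overall(example).name)
```

Raising on an unrated dimension keeps an incomplete assessment from silently producing an overall rating.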

Submission draft

Evidence artifact builder

Forensic Log Analysis

Document event classifications, cross-source correlations, forensic gaps, and your investigatability assessment. This artifact supports post-incident review and telemetry improvement planning.

Event classification table*

Overall log sufficiency*

Cross-source correlation*

Forensic gap analysis*

Chain of custody notes*

Telemetry improvement recommendations

Reference

Framework mappings

NIST AI RMF

MANAGE · Incident response and risk treatment

MITRE ATLAS

AML.T0051 · LLM Prompt Injection

OWASP LLM Top 10

LLM02 · Sensitive Information Disclosure

Self-assessment

Scoring checklist

Correctly classifies all events with severity and MITRE IDs (20 pts)

Every event must have a type, severity, and MITRE ID (or explicit 'none'). Missing classifications for the TOKEN_SPIKE event must be flagged.

Correlates events across both log sources (20 pts)

Must identify which events in the guardrail log and SOC stream describe the same activity, and note what additional context each source provides.

Identifies specific forensic gaps by question (Who/What/When/How/Impact) (25 pts)

Generic 'logs are incomplete' is insufficient. Must name specific missing fields: e.g., 'no session token', 'no completion log', 'no user identity on WAF event'.

Addresses chain of custody and log integrity (15 pts)

Must consider whether the logs can prove the events occurred: are they signed, timestamped with an authoritative source, and tamper-evident?

Recommends specific telemetry improvements (20 pts)

Must propose at least two concrete improvements: specific fields to add, sources to instrument, or retention policies to change.
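For the chain-of-custody criterion in the checklist, a common minimum is to record a cryptographic hash of each log file at collection time so later copies can be checked against it. The sketch below uses SHA-256 from Python's standard library; the record layout and the `custody_record` and `verify` helpers are assumptions, and a hash only shows that a copy has not changed since collection. It does not show that the log was authentic or signed at its source.

```python
import hashlib
from datetime import datetime, timezone


def custody_record(name, data, collected_by):
    """Build a custody entry for one evidence artifact (hypothetical layout)."""
    return {
        "artifact": name,
        "sha256": hashlib.sha256(data).hexdigest(),
        "size_bytes": len(data),
        "collected_by": collected_by,
        # Collector's clock in UTC; not an authoritative timestamp source.
        "collected_at": datetime.now(timezone.utc).isoformat(),
    }


def verify(record, data):
    """Check that the bytes still match the hash recorded at collection."""
    return hashlib.sha256(data).hexdigest() == record["sha256"]


if __name__ == "__main__":
    # Stand-in bytes; in practice, read the fixture file in binary mode.
    evidence = b"2026-05-02T10:00:01Z [INFO] Request received: id=req_123\n"
    record = custody_record("guardrail-denials.log", evidence, "analyst")
    print(record["sha256"], verify(record, evidence))
```

Storing the record somewhere the log producer cannot write to is what makes the check meaningful.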

Explore

Related tools

Incident Response Lab

Continue from log analysis into full incident response — classification, containment, and after-action reporting.

RAG Security Lab

Review RAG pipeline authorization — a common source of the data exposure events visible in these logs.

Ecosystem tools

Langfuse

Trace and log evidence across prompt, retrieval, and tool calls.

TruLens

Evaluation and trace analysis for investigation-ready evidence.

Arize Phoenix

LLM observability and analysis for forensic review.
