What Companies Really Mean by
“AI Security Engineer”
A structured analysis of 293,846+ job descriptions across 5,350 companies — validated by direct practitioner surveys, arXiv research momentum, open-source builder activity, public adversary vocabulary, framework intelligence, and industry media signals. The most comprehensive benchmark of the AI security engineering labor market available.
Required reading for CISOs scoping programs, hiring managers writing requisitions, recruiters sourcing candidates, and practitioners navigating a discipline being defined in real time.
A new security discipline is being staffed before it has been defined. AI Security Engineering has crossed from experimentation into institutional demand — but the organizational infrastructure required to absorb that demand is running years behind. This report documents the gap between what companies say they need and what the market can actually deliver, drawn from 293,846+ job descriptions, direct practitioner surveys, academic research, open-source builder activity, industry media coverage, and vulnerability intelligence. Across the report's independent signal layers, the conclusion is the same.
Five numbers that define the 2026 market
Number
What it means
What this means for your next hire
290×
Growth in AI security hiring from 2022 to 2026. Every position is calibrated for senior experience in a discipline that did not exist four years ago.
Every experienced AI security candidate has 2–4 competing offers right now. Scoped single-domain roles close 2–3× faster than chimera specs. Over-scoped roles sit open.
39:1
Legacy compliance frameworks (GDPR, HIPAA, SOC 2) vs AI-native governance frameworks (NIST AI RMF, EU AI Act, ISO 42001) in hiring language. The compliance reflex is structural, not incidental.
Your JD is almost certainly calibrated for a compliance program manager, not an AI security engineer. If it mentions GDPR but not NIST AI RMF, you're hiring the wrong profile.
8:1
Traditional security tool vocabulary (detection, SIEM, AppSec scanners) vs AI-native evaluation and observability tools in the same hiring corpus. The tooling stack is calibrated for compliance audit, not AI behavioral risk.
If your JD requires Splunk but not LangSmith, you're screening for infrastructure security, not AI behavioral risk. You will select for the wrong profile.
0.24%
Share of all 2026 postings that contain agentic attack surface language (prompt injection, function calling, tool calling). Agentic deployments are at scale; agentic security hiring is at near-zero.
If you've shipped an AI agent, you have agentic security exposure right now. The hire you need doesn't exist at scale in the market — fractional or contract coverage is likely the only near-term option.
57:1
Job postings that reference agentic attack surfaces vs postings with agentic control design language. The market is learning the vocabulary of the problem faster than it is staffing for the solution.
Your JD can name the risk — but hiring for the control architect who can actually solve it requires a different archetype than what most postings describe. See the decision tree in §04.
The consequence of these five numbers is not subtle: companies are posting role language that implies team-shaped capability while budgeting and interviewing as if they are hiring one specialist contributor. The result is a hiring market defined by Chimera Specs — one salary, five professions — and a discipline being invented at the top of the org chart with almost no junior entry pathways. Governance language has outrun evidence language. Compliance vocabulary has crowded out AI-native control language. And the hiring market is running three years behind the deployment curve on the agentic attack surfaces that matter most.
Three structural shifts would change this trajectory. First: separate ownership domains from hiring units — one role, one primary function, one accountability domain. Second: require evidence artifacts at the point of governance obligation creation — every compliance framework reference in a job description should come paired with a named evidence artifact the hire will produce. Third: invest in practical assessment infrastructure now — adversarial AI labs, scenario-based evaluation environments, and archetype-specific interview rubrics are not future-state investments; they are the prerequisite for building any repeatable hiring signal in a discipline without shared standards. The organizations that solve these three problems first will not just hire better — they will define the operating model that the rest of the market eventually adopts.
These conclusions are not derived from a single source. The hiring corpus is the primary evidence layer, but it does not stand alone. Primary practitioner survey research confirms the ownership gap: 22% of respondents report no clear AI security owner in their organization. Self-reported program maturity averages 2.1/5 — squarely in the emerging band, consistent with executive awareness without engineering delivery. 48% of practitioners recognize AI Security Engineering as an emerging distinct discipline, yet are watching their organizations hire as if it is a subspecialty of AppSec. The most cited risk — No AI asset inventory (34% of respondents) — is precisely the class of threat that agentic deployment creates and that 0.24% of job descriptions currently screen for. Meanwhile, arXiv research shows academic output in agentic attack surfaces accelerating faster than any other AI security domain. GitHub builders are funding the same taxonomy with open-source activity. Industry media is amplifying governance framing to board audiences who interpret it as compliance progress rather than control absence. And public vulnerability disclosures — 1,458 AI-relevant CVEs, 3 confirmed in CISA KEV — confirm that the attacks are not hypothetical: the exploits exist and are being used. The hiring market, the practitioners, the researchers, the builders, the press, and the CVE feed are all observing the same structural condition from different vantage points — and all of them are describing a gap between deployment and defense.
294K+
Job descriptions analyzed
5,350 companies · ATS data 2013–2026
22%
Report no clear AI security owner
Primary survey · cross-persona signal
9
Role archetypes defined
With boundary conditions and 90-day outputs
39:1
Legacy vs AI-native frameworks
GDPR (5.5K) vs NIST AI RMF (91) — the compliance reflex is structural
"The hiring market, the practitioners, the researchers, the builders, the press, and the CVE feed are all observing the same structural condition from different vantage points — and all of them are describing a gap between deployment and defense."
Key findings — where the signals converge
Finding #1 · Agentic Attack Surface
Agentic deployments are live; agentic security hiring is near-zero.
511 job postings reference agentic attack surfaces in 2026. Only 9 reference agent control design language — a 57:1 gap. arXiv confirms agentic attack surface research is the fastest-accelerating academic domain. The hire the market needs doesn't exist at scale.
Finding #2 · Compliance Displacement
Legacy compliance frameworks are crowding out AI-native engineering requirements.
39:1 ratio of legacy (GDPR, HIPAA, SOC 2) vs AI-native (NIST AI RMF, EU AI Act, ISO 42001) framework mentions in 2026 hiring language. Organizations are hiring compliance program managers for AI security engineering roles — and selecting for the wrong archetype.
Finding #3 · LLM Framework Vulnerability Density
AI/ML product and framework vulnerabilities are the dominant disclosed attack surface.
1,458 AI-relevant CVEs classified from 26K+ total records. The top bucket — ai ml framework library vulnerability — accounts for 378 disclosures. 3 appear in CISA KEV (active exploitation confirmed). The vulnerabilities exist; the roles to remediate them are barely hired.
Finding #4 · Entry Pipeline Collapse
290× hiring growth with zero junior pathways is building a structural staffing collapse.
AI security hiring grew 290× from 2022 to 2026. Every role targets senior experience in a discipline that did not exist four years ago. There is no OSCP equivalent for AI security, no standardized assessment infrastructure, no certifiable junior curriculum. The pipeline cannot refill without deliberate investment now.
Finding #5 · Signal Convergence Confirms the Structural Diagnosis
Six independent data sources — none sharing methodology or institutional source — describe the same gap.
The ATS corpus (employer language), practitioner survey (first-person experience), arXiv (academic attention), GHArchive (builder behavior), media (editorial coverage), and CVE disclosures (exploit pressure) are each independently classified against the same AI security taxonomy. Each produces a different slice of evidence. All six describe the same structural gap between deployment and defense. When independent systems converge this consistently, the convergence is the argument — not any single signal.
The 15 Findings
Fifteen market findings derived from job description signal analysis across 293,846+ postings. Each finding is a reusable market concept supported by corpus evidence, with explicit claim boundaries and audience-specific action implications.
0.24% of postings mention agentic controls; trajectory steep from near-zero
Add prompt injection and function calling scenarios to interview loops now
Finding 01 · Talent & role-design crisis
The Frankenstein Role
AI Security Engineer postings increasingly bundle five historically separate capability families into one requisition — at one salary.
AI feature velocity, buyer scrutiny, and compliance pressure converged before organizational design matured. Companies needed someone who could simultaneously own product security, model governance, adversarial testing, regulatory evidence, and agentic controls. They wrote that into one job description instead of designing a team.
6.79
Highest avg role breadth score
Government and Defense — 2026
5.21
Cross-corpus average breadth score
Across all 294K+ job descriptions analyzed
100K
Jobs in highest-volume industry
Manufacturing Industrial and OT — 2026
The corpus-level average role breadth score is 5.21 — across all 294K+ analyzed postings. Government and Defense leads all industries at 6.79, followed by Retail and Ecommerce at 6.29. Financial Services — with 38.8K jobs, the largest single scoped-security market — sits at 5.44. Manufacturing leads in volume (100K postings) but has the weakest AI-native specificity. Even Telecommunications, the most conservative sector, posts roles averaging 3.73 capability families. No industry is writing clean, scoped AI security roles.
"No industry is writing clean, scoped AI security roles. The Frankenstein pattern is not a failure of individual job descriptions — it is a failure of organizational role architecture across an entire discipline."
What leaders are misreading
Leaders treat this as a talent shortage problem alone. It is primarily a role-design failure. The scarcity is not of capable people — it is of organizations willing to do the architectural work of separating ownership domains before opening requisitions.
Failure mode if unaddressed
Chronic mis-hiring, scope collapse after onboarding, low tenure stability, and perpetual re-opening of the same role.
What this changes now
Define one primary ownership domain per role before writing the job description.
Require every AI security requisition to name the capability family it sits in.
Use role-architecture workshops before recruiting — not after.
Finding 02 · Title/substance mismatch
Skill Washing
AI-labeled security titles often outpace the AI-specific control, testing, and evidence language inside the same posting.
Title demand outpaced discipline standardization. Recruiting teams adopted AI prefix language as a market signal before organizations developed meaningful AI-specific control requirements to pair with it. The title changed; the job description content did not.
100K
Jobs in largest AI hiring market
Manufacturing Industrial and OT
38.8K
Jobs in second-largest market
Financial Services
Manufacturing Industrial and OT leads by a wide margin at 100K AI security-labeled roles in 2026 — but this sector also has the weakest AI-native specificity signals. Financial Services posts 38.8K such roles and Government/Defense 30.5K, sectors where legacy compliance language dominates. The AI label is applied broadly; the AI-specific control substance is applied narrowly.
"The AI label is applied to the title; the AI-specific control substance is applied to almost nothing. The gap between these two numbers is not a measurement artifact — it is an organizational design failure at industry scale."
What leaders are misreading
AI in the title is treated as proof of AI security scope. It is not. Title density is a market demand signal, not a capability evidence signal.
Failure mode if unaddressed
Teams hire for legacy security profiles while believing they have staffed AI risk. The coverage gap is invisible until an incident makes it legible.
What this changes now
Rewrite requisitions around named controls, evidence artifacts, and execution outputs.
Screen for role-language specificity in candidates, not keyword density.
Train recruiting teams to distinguish AI-labeled roles from AI-specific roles.
Finding 03 · Team-shaped requirements
The Unicorn Index
The market prices one role while describing team-level capability breadth. Every role family is affected — none is exempt.
Immediate pressure to cover product, governance, and customer assurance simultaneously led hiring managers to compress team-shaped requirements into single-contributor budgets.
6.53
Highest breadth — Data Security roles
2026 role family
6.42
AI Security-specific roles
2026 role family
5.90
Lowest of top 8 role families
Identity Security
Data Security roles lead with a 6.53 average breadth score, followed by Application Security at 6.49 and Product Security at 6.44. AI Security-specific roles sit at 6.42 — high, but not uniquely high. The unicorn problem is structural across security hiring, and AI has not made it worse — it has made it visible.
"Every security role family sits above 5.9 on the breadth scale. The Unicorn problem is not uniquely an AI security problem — AI has simply made it impossible to ignore. You cannot hire your way out of a role architecture failure."
What leaders are misreading
Compensation is treated as the only lever. Scope architecture is the lever that actually matters. Paying more for the unicorn does not change the impossibility of the spec.
Failure mode if unaddressed
Open roles stay unfilled or are filled with scope collapse: the hire arrives, renegotiates scope in week four, and the role re-opens within eighteen months.
What this changes now
Scope every security role to one primary ownership domain before posting.
Use phased staffing plans — not omnibus hires — for multi-domain coverage.
Align compensation explicitly to stated breadth assumptions.
Finding 04 · Systems reasoning shift
The Probability Pivot
AI security demands a cognitive shift from deterministic defect reasoning to probabilistic systems reasoning — but neither hiring loops nor interview rubrics have adapted to evaluate it.
Traditional security is built on deterministic failure logic: a buffer overflows or it does not. A CVE is present or patched. AI systems fail probabilistically: the same prompt produces different outputs across temperature settings, context window states, and retrieval results. An adversarial input may work 40% of the time, not 100%. A control that reduces attack success from 80% to 20% is meaningful even though it does not eliminate the vector. This is an entirely different reasoning mode — and it is not what standard AppSec or penetration-testing interview loops are designed to evaluate.
511
Jobs with agentic attack surface language — 2026
Up from 48 in 2025 — a 10× year-over-year jump
9
Jobs with agent control design language — 2026
The gap between attack surface awareness and control architecture is 57:1
0
Standardized interview rubrics for probabilistic AI failure reasoning
The cognitive shift is named in research; it is not yet codified in hiring practice
Agentic attack surface signals appeared in effectively zero job postings before 2024. By 2025: 48 postings. By 2026: 511 — a 10× year-over-year jump in a single year. The same 2026 dataset shows only 9 postings with meaningful agent control language — a 57:1 gap between attack surface awareness and control design language. This gap is not a data artifact. It reflects a genuine organizational pattern: teams recognize the agentic attack surface exists before they know how to architect controls for it. Interview loops still screen for static AppSec knowledge — deterministic vulnerability identification, CVE analysis, exploit reproduction. None of these evaluate the probabilistic tradeoff reasoning that AI security actually requires.
What leaders are misreading
Security interview performance on traditional vulnerability questions is used as a proxy for AI security capability. Strong AppSec candidates may fail at probabilistic reasoning; strong probabilistic reasoners may not have AppSec backgrounds. The screen is measuring the wrong variable.
Failure mode if unaddressed
Hiring systematically selects for deterministic-thinking security candidates in roles that require probabilistic-reasoning AI security judgment. The discipline builds a talent cohort optimized for the wrong problem class.
What this changes now
Add explicit probabilistic failure-mode scenarios to every AI security interview loop.
Evaluate tradeoff reasoning and risk quantification under uncertainty — not only exploit identification.
Define "acceptable control posture" for AI systems in probabilistic terms before designing interview criteria.
Finding 05 · Governance-to-execution gap
The Evidence Gap
Governance language appears before engineering evidence language in AI security hiring. Organizations can describe the policy obligation but cannot yet describe the proof.
Policy and framework adoption moved faster than productionized control instrumentation. Boards demanded governance narratives; governance teams wrote policies; engineering teams were not yet staffed to produce the evidence that would validate those policies.
5.5K
GDPR job postings in 2026
0.9% contain evidence-producing control language
91
NIST AI RMF job postings in 2026
13.2% contain evidence-producing control language
60:1
GDPR-to-NIST-AI-RMF job ratio
Volume disparity between legacy and AI-native governance frameworks
Framework adoption in AI security hiring — 2026
Cyan = legacy compliance & privacy frameworks · Violet = AI-native governance frameworks. GDPR (5,461 jobs, 0.9% with evidence language) vs NIST AI RMF (91 jobs, 13.2% with evidence language) — the inverse specificity paradox.
GDPR appears in 5.5K job postings — the single largest framework signal in 2026. Only 0.9% of those contain evidence-producing language such as control attestations or telemetry requirements. HIPAA: 3.1K postings, 0% evidence language. SOC 2: 1.9K postings, 0% evidence language. The signal inverts with AI-native frameworks: NIST AI RMF appears in just 91 postings, but 13.2% contain evidence-producing language. ISO/IEC 42001: 15 postings, 13.3% evidence language. Where organizations are deliberately staffing AI-native governance, they write more rigorous requirements — but almost none are doing so yet.
What leaders are misreading
Policy completion is interpreted as risk reduction. Governance programs that produce narrative confidence without operational proof are not risk reduction programs — they are risk documentation programs.
Failure mode if unaddressed
Governance programs accumulate policy artifacts while control behavior remains unmeasured. The board believes the posture is improving; the actual posture is static.
What this changes now
Map every governance obligation to a named evidence artifact at design time.
Require telemetry, eval outputs, and remediation closure as deliverables — not documentation.
Treat evidence quality as a board-reportable KPI, not an engineering detail.
Finding 06 · Delegated action risk
Agentic Anarchy
Agent security is delegated action security. The market is still framing it as chatbot security — a category error with operational consequences.
Tool-calling and workflow automation expanded AI blast radius from response quality to action execution. A chatbot that says the wrong thing is a reputational risk. An agent that takes the wrong action is an operational one. The control architectures for these two scenarios are fundamentally different.
511
Jobs with agentic attack surface signals
2026 — from effectively zero in 2023
9
Jobs with agent control language
2026 — the actual control gap
0.21%
Share of 2026 jobs with agentic signals
Out of 242K analyzed
Agentic attack surface signals — job mention frequency
Count of job postings mentioning each agentic attack surface. Function calling, prompt injection, and tool calling are the dominant emerging signals.
The agentic attack surface vocabulary is present but sparse. Function Calling appears in 278 postings; Prompt Injection in 258; Tool Calling in 236. Combined, these three represent under 800 job postings against a 2026 corpus of 242K — less than 0.35%. Meanwhile, agentic deployments are accelerating across every vertical. The hiring market is running approximately three years behind the deployment curve on agentic control vocabulary.
"A chatbot that says the wrong thing is a reputational risk. An agent that takes the wrong action is an operational one. The control architectures for these scenarios are not the same."
What leaders are misreading
Prompt-layer defense is treated as sufficient control architecture for agentic systems. It is not. Prompt hardening addresses what the model says. Action authorization addresses what the agent does — and action authorization is almost entirely absent from the hiring vocabulary.
Failure mode if unaddressed
Permitted agent actions become high-impact misuse pathways. Authorization, rollback, and audit capabilities are absent at launch because no one owned them.
What this changes now
Design action authorization as a first-class control — not an afterthought.
Require rollback, telemetry, and approval logic for every delegated agent workflow.
Threat-model delegated action paths before release, not after incident.
Finding 07 · Mid-market exposure gap
The vCISO Vacuum
Many organizations are too small to hire the AI security unicorn but too exposed to defer. Mid-market companies are the gap the market has not designed for.
AI exposure emerges before dedicated staffing maturity. A 200-person SaaS company shipping AI features faces the same model behavior, agentic control, and data-boundary obligations as a 20,000-person enterprise — without the headcount budget to staff a full AI security program. The discipline has no designed operating model for this segment.
100K
Manufacturing Industrial and OT jobs
2026 — largest AI security hiring market, lowest role specificity scores
38.8K
Financial Services jobs
2026 — high governance volume, unicorn compression at mid-market
0
Designed operating models for fractional AI security
The market has produced role archetypes, not service delivery models
AI security job volume by industry — 2026
Hiring volume by sector. High-volume markets with low role specificity (Manufacturing) have the most acute vCISO vacuum — exposure without staffing architecture to address it.
The vCISO vacuum is most acute at the intersection of three conditions: organizations large enough to face real AI security exposure, too small to hire a dedicated AI security team, and operating in sectors without established third-party support models. Manufacturing leads on volume (100K postings) but has the weakest AI-specific control language — which signals that most of those organizations are posting aspirational language without the internal expertise to execute it. Financial Services faces a different version: the governance vocabulary exists, but mid-market firms (500–5,000 employees) cannot absorb a unicorn AI security hire at the compensation level required. MSSPs and fractional AI security services represent the structurally correct answer to this gap — a managed capability model where the expertise is amortized across clients, and the operating model is designed for staged delivery rather than a single hire.
"A 200-person SaaS company shipping AI features faces the same agentic control and data-boundary obligations as a 20,000-person enterprise — without the headcount to staff a response. The market has not designed an operating model for this gap. That is an MSSP opportunity, not a hiring problem."
What leaders are misreading
Staffing constraints are treated as justification for deferral. They are the reason to adopt a managed or fractional operating model — not the reason to defer risk exposure.
Failure mode if unaddressed
AI exposure accumulates without named controls or evidence artifacts. When an incident occurs, the organization discovers simultaneously that it has no controls, no evidence of prior posture, and no internal expertise to respond.
What this changes now
Define a staged AI security operating model before hiring: what do you need coverage on in 90 days, 6 months, and 12 months?
Use fractional or MSSP AI security support as a bridge — not a permanent deferral.
Establish evidence artifacts and internal ownership before the first full-time AI security hire arrives.
Finding 08 · Execution translation failure
Boardroom-to-Backlog Gap
Executive AI risk narratives fail to translate into named engineering controls, accountable owners, and evidence artifacts — the machinery to answer the board's AI risk questions does not exist.
Board pressure on AI risk arrived before organizations built the execution infrastructure to respond to it. The result is a performative loop: the board asks, the CISO narrates, governance teams document, and engineers are not yet staffed or empowered to produce the evidence artifacts that would validate any of it. The gap between the boardroom question and the backlog item with an owner is where AI risk exposure actually lives.
8.6K
Privacy framework hiring signal
2026 — largest governance category in the hiring corpus
295
AI Governance-specific job postings
2026 — 29× smaller than Privacy framework hiring
0
Standard formats for AI control evidence artifacts
The board is asking questions; the industry has no agreed format for the answer
Framework category job volume — 2026
Privacy and Compliance framework hiring dominates. AI Governance hiring (295 jobs) is 29× smaller than Privacy framework hiring — but AI Governance roles write more rigorous evidence language when they do appear.
The boardroom-to-backlog gap is visible in two ways in the hiring data. First: Privacy framework hiring (8.6K jobs) and Compliance hiring (3.8K jobs) dwarf AI Governance hiring (295 jobs) — organizations are staffing compliance narrative, not AI control execution. Second: fewer than 1% of Privacy and Compliance framework job postings contain evidence-producing language. The hiring architecture reflects the governance architecture: policy is produced; proof is not. This is not a CISO failure — it is an organizational design failure. The board deck exists; the backlog item with a named owner and an evidence requirement does not.
What leaders are misreading
Strategy articulation is treated as execution readiness. Board confidence in the CISO's AI risk narrative is confused with board visibility into control posture. These are different things, and the difference will surface in the next incident.
Failure mode if unaddressed
Risk narratives repeat unchanged across four to eight board cycles. Each cycle the narrative grows more detailed; the underlying control posture remains unmeasured. When an incident occurs, the organization discovers it cannot demonstrate what it claimed to the board.
What this changes now
Require every AI risk statement presented to the board to map to at least one named backlog item with an accountable owner.
Set evidence artifact requirements — telemetry, eval output, remediation closure — at strategy planning time, not after audits.
Build a control evidence scorecard reviewed at the same cadence as board reporting.
Finding 09 · Assessment maturity lag
Skills Validation Gap
The market demands AI security engineering capability before it has standardized practical evaluation pathways.
Role demand accelerated before assessment models matured. There are no standardized AI security certifications, no widely-accepted lab environments for AI security skill demonstration, and no practical exam pathway equivalent to OSCP or equivalent for AI-specific attack surfaces. Organizations are running generic security interviews against AI security requirements — selecting candidates through loops calibrated for a different discipline.
30.8K
ML and AI Engineering — 2026
Largest role family in AI security-adjacent hiring
2.5K
Cyber Defense — 2026
Second largest AI security-adjacent hiring pool
0
Standardized practical AI security assessments
No OSCP equivalent exists for prompt injection, agentic control, or RAG security
AI security job volume by security role family — 2026
Each role family requires distinct assessment approaches. ML/AI Engineering hiring uses engineering screens calibrated for model building, not model security. Governance/GRC hiring uses policy screens calibrated for compliance, not control evidence.
The skills validation gap is structural. Organizations are hiring across 13+ security role families under AI security labels, each with distinct competency profiles that generic interview loops cannot differentiate. A standard AppSec interview tests for vulnerability identification in code — it does not evaluate agentic control design, RAG boundary enforcement, or probabilistic failure-mode reasoning. There is no OSCP equivalent for AI security. No standardized adversarial AI lab examination. No practical certification pathway that tests whether a candidate can exploit a RAG retrieval pipeline, bypass a prompt injection filter, or design a delegated-action authorization policy under adversarial conditions. The emerging solution pathway is scenario-based practical assessment: hands-on cyber range environments where candidates demonstrate real judgment — not recall — against deployed AI system configurations. Exercises that mirror real attack scenarios: prompt injection chains against agentic workflows, RAG boundary manipulation to extract context-window data, model behavior evaluation under distribution-shift attacks, authorization logic design for tool-calling systems. This mirrors how traditional security assessment evolved from certification to practice-based evaluation — a shift the AI security discipline is only beginning to make. The organizations that standardize practical AI security assessment infrastructure first will build the only reliable hiring signal in the market, and create the first reusable evaluation standard for the discipline.
"There is no OSCP equivalent for AI security. No standardized adversarial AI lab. No practical exam for prompt injection exploitation, agentic control design, or RAG boundary enforcement. The market is hiring at scale for a discipline the assessment ecosystem cannot yet validate."
What leaders are misreading
Credential density (certifications, degree programs, tool familiarity) is treated as competency evidence. It is a prior-discipline filter, not an AI security evaluation. The skills that matter most — ambiguity tolerance, probabilistic failure reasoning, control design under non-deterministic conditions — are not assessed by any current credential pathway.
Failure mode if unaddressed
Selection quality becomes noisy and non-repeatable across interviewers. False positive rate is high, false negative rate is equally high (good candidates rejected for the wrong signals). Scope collapse occurs after onboarding because the interview loop measured credentials, not capability.
What this changes now
Build role-archetype-specific interview rubrics before opening any AI security requisition — one rubric per archetype, scored criteria, not generic security questions.
Replace recall-based technical screens with adversarial AI lab exercises: prompt injection exploitation chains, RAG boundary manipulation, agentic authorization design under attack conditions.
Define a practical skills demonstration standard before the first screen — name the specific attack scenarios and control design tasks a candidate must demonstrate, not just the credentials they must hold.
Finding 10 · Lifecycle control deficit
Model Supply Chain Blind Spot
Model provenance, artifact integrity, dependency management, and deployment gates are systematically under-specified in AI security role language.
Organizations focus first on model behavior and user interaction risks. The assumption is that if the model behaves correctly at inference time, the system is secure. Supply chain compromise does not attack inference — it attacks the artifact pipeline upstream of it.
54
Data Poisoning signal jobs
Supply chain attack — model training corruption signal in 2026 postings
33
Model artifact / weights signal jobs
Lifecycle integrity signal — dramatically under-represented vs runtime signals
278
Prompt Injection signal jobs
Runtime signal — 5× more prominent than supply chain signals in the same corpus
Attack surface signal job mention frequency
Frequency of attack surface vocabulary in AI security job postings. Model supply chain signals (model weights, data poisoning) lag behind runtime and agentic signals.
The attack surface vocabulary in AI security hiring is dominated by runtime and agentic signals: function calling, prompt injection, tool calling. Model supply chain signals — model weights, data poisoning, artifact integrity — appear far less frequently, suggesting that lifecycle security ownership is not yet a primary hiring criterion. Organizations actively building AI security programs are staffing runtime controls; lifecycle controls remain an afterthought. This creates a specific risk class: an attacker who poisons the training dataset, corrupts model weights during packaging, or injects backdoors into a fine-tuning pipeline bypasses all runtime controls because the compromise occurred upstream of every inference-time defense.
"Runtime controls address what happens when the model runs. They do not address what happens when the model was built from a poisoned dataset, packaged from a compromised artifact, or deployed through a backdoored fine-tuning pipeline. The supply chain attack surface is outside every runtime security dashboard in the current stack."
What leaders are misreading
Runtime controls are treated as complete security posture. They address what happens when the model runs. They do not address what happens when the model was built, packaged, or deployed from a compromised artifact.
Failure mode if unaddressed
Silent supply chain risk accumulates outside visible incident pathways. When it surfaces, it bypasses all runtime controls because the compromise occurred upstream.
What this changes now
Add model lifecycle and provenance control ownership into every AI security role scope.
Require release-gate evidence for model artifact changes.
Include dependency and artifact integrity checks in security review processes.
Finding 11 · Talent supply crisis
Entry-Level Extinction
AI Security Engineering is being invented at the top of the org chart. The market is hiring senior-only into an unproven discipline, with almost no junior pathways.
Immediate risk pressure and budget constraints favor experienced hiring language. Organizations need someone who can own the domain on day one. The consequence is a discipline with no talent pipeline — which is sustainable for approximately one hiring cycle.
39.3K
AI security-adjacent jobs in 2026
Up from 7,833 in 2025
7.8K
Jobs in 2025
5× growth year-over-year
134
Jobs in 2022
The starting point four years ago
AI security-adjacent job posting volume — by year
Year-over-year growth of AI security role postings. Explosive recent growth with no junior pipeline creates a structural supply problem.
The growth trajectory is extraordinary: 134 postings in 2022, 526 in 2024, 7.8K in 2025, 39.3K in 2026. This is a 290× expansion in four years. Every position in this expansion is calibrated for senior experience in a discipline that did not exist four years ago. The pipeline problem is not hypothetical — it is already present, and the organizations hiring aggressively today are consuming the very talent pool they will depend on in three years.
"290× growth in four years. Every position calibrated for senior experience in a discipline that did not exist four years ago. The pipeline problem is not coming — it is already here."
What leaders are misreading
Senior-only staffing appears efficient in the short term. It consumes available experienced candidates without building the pipeline that will replace them. The efficiency is borrowed from the future.
Failure mode if unaddressed
Future mid-level talent pipeline collapses. Organizations that did not invest in junior pathways will face a structural shortage of experienced AI security engineers at exactly the moment when they have the budget and maturity to hire them.
What this changes now
Create explicit junior-to-mid transition pathways in AI security programs now.
Define apprentice-supportable scopes tied to measurable first-year outputs.
Pair senior hires with skills-transfer mandates as a condition of the role.
Finding 12 · Role language confusion
The Red Team Misnomer
"AI red team" is used as a catch-all for governance reviews, product assessments, platform controls, and abuse testing — diluting the term to meaninglessness.
The phrase carries market credibility and executive legibility. It is used as shorthand for broad AI risk work because it is understood by leadership and valued in the market, regardless of what the actual role delivers.
201
AI Security-specific roles
2026 — highest avg breadth score of any bucket
28.8K
Software Engineering Roles
2026 comparison cohort
3.9K
Traditional Security Roles
2026 comparison cohort
Job volume by role classification bucket — 2026
AI Security-specific roles represent a small fraction of total security hiring but carry the highest average role breadth score — the clearest signal of the Frankenstein problem concentrated at the AI-labeled tier.
AI Security-specific roles number just 201 in the 2026 dataset against 3,894 traditional security roles and 28,768 software engineering roles. But those 201 roles carry the highest average Frankenstein score — 6.52 — of any role bucket. Organizations using precise AI security language are simultaneously writing the most over-scoped roles. The "AI red team" label is a primary driver: it is applied to adversarial prompt testing, product risk assessment, governance review, platform security architecture, and abuse testing — interchangeably, in the same posting, against a single hire. A real AI red team exercise is a hands-on adversarial evaluation against a deployed model or agent system: crafting inputs that elicit unsafe behavior under realistic operational conditions, testing authorization boundaries at inference time, exploiting RAG retrieval pipelines to extract out-of-scope context, mapping tool-calling attack paths through multi-step agentic workflows. This requires lab environments, reproducible finding formats, and evaluators who understand probabilistic failure modes — not governance documentation skills. What most "AI red team" postings actually describe is risk program management or product security review with an adversarial framing — a categorically different function requiring a categorically different profile.
"A real AI red team exercise means crafting adversarial inputs against a deployed model, testing authorization boundaries at inference time, and exploiting RAG retrieval pipelines under realistic attack conditions. Most 'AI red team' postings describe governance review with a red-team brand applied to it."
What leaders are misreading
Label precision is assumed from title vocabulary. "Red team" is assumed to mean active adversarial evaluation. It frequently means risk review with a red-team brand. The candidate who gets hired is selected for governance fluency; the organization then discovers it cannot conduct an adversarial AI exercise.
Failure mode if unaddressed
Organizations build "AI red team" programs that produce governance documentation rather than adversarial findings. The program exists; the adversarial capability does not. Security posture remains unmeasured at exactly the layer the label promised to test.
What this changes now
Require every "AI red team" role posting to name at least three specific adversarial exercise types the hire will execute — prompt injection chains, RAG exfiltration, agentic authorization bypass, jailbreak evaluation.
Separate governance review roles from active adversarial testing roles explicitly: different job family, different interview loop, different success criteria, different budget line.
Evaluate AI red team candidates with live adversarial exercises in realistic lab environments — not scenario recall, not governance case studies, not generic AppSec challenges.
Finding 13 · Legacy framework dominance
The Compliance Reflex
Legacy compliance frameworks dominate AI security hiring language by 39:1 versus AI-native governance frameworks — GDPR alone outweighs all AI-native frameworks combined.
Regulatory readiness programs were already funded and staffed when AI security emerged as a hiring need. Organizations reached for the framework vocabulary they already had — GDPR, HIPAA, SOC 2, FedRAMP — rather than developing AI-native governance language.
EU AI Act · NIST AI RMF · MITRE ATLAS · ISO/IEC 42001 — 2026
39:1
Legacy-to-AI-native ratio
GDPR alone (5.5K) vs all AI-native frameworks (317)
Named framework adoption in AI security hiring — 2026
Cyan = legacy compliance and privacy frameworks · Violet = AI-native governance frameworks. GDPR (5,461) alone exceeds EU AI Act + NIST AI RMF + MITRE ATLAS + ISO 42001 combined (317) by ~17×.
Named framework data makes the disparity concrete: GDPR appears in 5.5K job postings, HIPAA in 3.1K, SOC 2 in 1.9K, FedRAMP in 1.2K, PCI DSS in 694. Against this: EU AI Act appears in 189 postings, NIST AI RMF in 91, MITRE ATLAS in 22, ISO/IEC 42001 in 15. The 39:1 ratio is a direct measure of which vocabulary organizations actually use when staffing AI security. Legacy compliance frameworks produce familiar evidence artifacts — audit reports, attestation letters — that organizations already know how to generate. AI-native governance frameworks require evidence types most organizations cannot yet produce.
"GDPR in 5,461 postings. NIST AI RMF in 91. Organizations are using the vocabulary they already own to staff a risk they do not yet understand."
What leaders are misreading
Framework density in hiring language is interpreted as AI security readiness. A posting requiring GDPR and SOC 2 expertise is not an AI security posting. It is a compliance posting with an AI prefix.
Failure mode if unaddressed
Teams ship AI systems with governance narratives built on legacy framework compliance, while leaving AI-specific control surfaces — model behavior, agent authorization, prompt security — unmeasured and unevidenced.
What this changes now
Require at least one AI-native governance framework reference for every legacy compliance requirement in AI security roles.
Add agentic control and evaluation language as mandatory criteria in AI security requisitions.
Use framework mix as a hiring-quality diagnostic in your AI security program assessment.
Finding 14 · Incumbent tooling lock-in
The Tool Incumbency Trap
Traditional security tooling appears in AI security hiring at 8:1 versus AI-native evaluation and observability tooling — the tooling stack mirrors the compliance reflex exactly.
Organizations procure through existing trust paths and repurpose familiar tooling rather than adopting AI-specific testing stacks. Each individual decision is rational. The collective result is a security program whose tool vocabulary is calibrated for compliance audit, not AI behavioral risk.
3.7K
Detection & Response tool jobs
Splunk, Sigma, Falco, Elastic — 2026 corpus
552
AI-native eval + observability jobs
LLM observability (431) + model eval (94.0) + guardrails (27.0)
8:1
Traditional vs AI-native tool ratio
Category-level signal: detection/AppSec language vs LLM evaluation language
The named tool picture is unambiguous: Splunk appears in 1.1K AI security job postings, Falco in 485, Semgrep in 141, Snyk in 137. Against these: LangSmith in 260, Langfuse in 145, Ragas in 49, DeepEval in 25. At the category level the gap is larger — Detection and Response tooling totals 3.7K job mentions versus 552 for all AI-native evaluation and observability tools combined — 8:1. Splunk detects attacker behavior in logs. It does not evaluate model output distribution, detect prompt injection at inference time, or assess delegated-action authorization posture. These are structurally different capabilities requiring different tooling — and that tooling is almost absent from AI security hiring language.
What leaders are misreading
Tool familiarity is treated as AI risk coverage. SIEM and AppSec scanners address what attackers do to infrastructure. They do not address what AI systems do under adversarial prompt conditions or what agents do when tool-call authorization is absent.
Failure mode if unaddressed
Security programs build audit-ready portfolios with familiar tooling while systematically underinvesting in AI behavioral evaluation. The coverage gap is structurally invisible inside existing security tool dashboards — it will surface first as an incident.
What this changes now
Audit your AI security tooling explicitly against AI-specific threat coverage, not compliance or SIEM coverage.
Require at least one AI-native evaluation or observability tool in every AI security program's tooling baseline.
Track your traditional-to-AI-native tool ratio as a program maturity diagnostic — not just tool count.
Finding 15 · Early but accelerating risk surface
The Agentic Surface Emergence
Prompt injection, function calling, and tool calling security signals are still under 0.3% of all postings, but rising quickly from a near-zero baseline.
Deployment of tool-calling and function-calling systems is outpacing hiring-market adaptation. AI agents are being built and shipped; the hiring market does not yet have language for the controls those agents require.
278
Function Calling signal jobs
2026 — top agentic signal
258
Prompt Injection signal jobs
2026
1205
Total agentic surface signal jobs
Across all 9 tracked surfaces
Agentic attack surface signal job frequency
Total job postings containing each agentic attack surface signal. Combined total is under 1,300 mentions against 294K+ analyzed postings — but the growth rate is the signal.
Combined agentic attack surface signals total 1205 job mentions — approximately 0.24% of the analyzed corpus. Function Calling leads at 278, Prompt Injection at 258, Tool Calling at 236. Jailbreak (75), Model Drift (64), and Data Poisoning (54) follow. These numbers look small. They are the leading edge of a surface that is growing faster than the hiring market can name it. The appropriate response is not to wait until the numbers are large — by then the deployment gap will be measured in years, not months.
"Agentic systems in production are already at scale. Agentic security hiring is at near-zero. The deployment-to-hiring gap is measured in years, not quarters."
What leaders are misreading
Low absolute share is interpreted as low urgency. The urgency signal is in the deployment-to-hiring gap, not the absolute numbers. Agentic systems in production are already at scale; agentic security hiring is at near-zero.
Failure mode if unaddressed
Delegated-action controls lag deployment and become latent high-impact risk. By the time the hiring market normalizes agentic security vocabulary, the first generation of agentic systems will have been operating for years without appropriate controls.
What this changes now
Treat delegated-action authorization and rollback controls as near-term hiring priority — not future-state.
Add prompt injection and function calling scenarios to AI security interview loops now.
Instrument authorization, rollback, and audit evidence for all current agent workflows.
Vertical Intelligence
Ten industry verticals analyzed across AI security hiring signal, role breadth, and control specificity. Each profile represents the pattern visible in job description language — not company-level security posture.
Financial Services
5.44
breadth score 38.8K jobs · 505 cos
Signal: Control frameworks, model risk validation, regulatory vocabulary, auditability.
Blind spot: Weak AI-specific testing workflows and tooling language.
Hire for: Governance Evidence Lead + AI Product Security Engineer pairing.
Hire for: AI AppSec Engineer + Agent Security Engineer with operations maturity.
ScaleDetectionOperations
Role Architecture Canon
Nine distinct AI security engineering archetypes, each with explicit mission scope, trigger conditions, boundary definitions, danger signals, and first-90-day deliverables. Use these as role design inputs — not job description templates.
Which archetype do you need?
Answer one question about your primary use case. The right archetype follows directly.
Your primary use case
The one question to ask
Right archetype
Shipping AI features in a product
Are AI features secure before they reach production?
AI Product Security Engineer
Embedding AI security in the dev process
Do developers know how to write AI-safe code?
AI AppSec Engineer
Proving your AI systems are exploitable (before attackers do)
Can you run an adversarial exercise against your deployed model today?
AI Red Team Engineer
Securing AI agents that take real actions
Who owns authorization and rollback for delegated-action workflows?
Agent Security Engineer
Controlling what data RAG can surface
Can a query return data the user isn't authorized to see?
RAG Security Engineer
Producing evidence for governance obligations
Can you show the board a control that proves the policy works?
Governance Evidence Lead
Securing model development and deployment
Who owns artifact integrity and training pipeline security?
ML Security Engineer
Converting model risk into security controls
Do model risk assessments produce enforceable control requirements?
Model Risk Security Partner
Defining AI security standards across teams
Is there a shared trust model and control ownership map across all AI teams?
AI Security Architect
Risk domain ownership matrix
● Primary domain · ○ Contributing · — Not in scope. Use to identify coverage gaps and avoid ownership conflicts.
Archetype
Prompt Security
RAG / Retrieval
Agent / Action Auth
Model Lifecycle
Gov Evidence
Adversarial Testing
Product / SDLC
Architecture
AI Product Security Engineer
●
●
○
—
○
○
●
○
AI AppSec Engineer
●
○
—
—
—
●
●
—
AI Red Team Engineer
●
●
●
—
—
●
—
—
Agent Security Engineer
●
—
●
—
○
●
○
○
RAG Security Engineer
○
●
—
—
○
●
—
—
Governance Evidence Lead
—
—
—
○
●
—
—
○
ML Security Engineer
—
—
○
●
○
—
○
○
Model Risk Security Partner
—
○
—
●
●
○
—
—
AI Security Architect
○
○
○
○
●
—
○
●
AI Product Security Engineer
High demandSenior IC
Hire this if
You're shipping AI features and security reviews happen after merge, not before.
Secure AI-enabled product capabilities from design through release and post-release operation.
Boundary
Does not own enterprise-wide governance strategy or model lifecycle outside product scope.
Anti-pattern to avoid
Overloaded with policy ownership and customer-assurance narrative without implementation authority.
First 90-day outputs
Product threat model set, AI feature control backlog, release-gate checklist, customer assurance pack.
Danger signals in candidates
Claims broad AI governance ownership with no shipped product features; confuses adversarial testing with code review.
Interview question
Walk me through how you would threat-model a new AI feature that uses RAG to answer user questions about account data.
Strong answer
Identifies data boundary risks at retrieval, context injection at the prompt layer, and authorization failures on what data the RAG can surface — then produces a control backlog item, not just a risk list.
Weak answer
Describes generic OWASP Top 10 threats applied to the API layer without addressing the RAG-specific attack surface.
AI AppSec Engineer
High demandMid to Senior IC
Hire this if
You have AI features in the SDLC but your AppSec process has no AI abuse cases.
Integrate AI abuse patterns and controls into secure SDLC practice.
Boundary
Does not own broad AI governance program design.
Anti-pattern to avoid
Measured on generic vulnerability volume instead of AI-specific control outcomes.
Focuses only on code-level vulnerabilities; conflates static analysis with behavioral testing; no AI abuse-case library.
Interview question
What is the difference between a prompt injection vulnerability and an input validation vulnerability? How would you test for each?
Strong answer
Explains that prompt injection exploits the model's inability to distinguish instruction from data; designs a test sending adversarial instruction-breaking inputs, not just malformed strings.
Weak answer
Treats prompt injection as a type of XSS or SQL injection with LLM-specific syntax.
AI Red Team Engineer
Scarce — criticalSenior IC / Staff
Hire this if
You've built AI systems and need to know whether they're exploitable before adversaries do.
Execute adversarial evaluation against AI systems and corresponding controls.
Boundary
Does not own governance reviews or architecture assessments by default.
Anti-pattern to avoid
Labeled "red team" but scoped as policy review or general risk management.
First 90-day outputs
Adversarial scenario suite, reproducible finding format, retest protocol, control hardening recommendations.
Danger signals in candidates
Cannot name three specific adversarial exercise types; focuses on policy review; no hands-on adversarial lab experience.
Interview question
Describe a multi-step adversarial scenario you would run against a deployed RAG system to exfiltrate out-of-scope context.
Strong answer
Describes a prompt injection chain that escalates from query reformulation to system-role override, then retrieval boundary bypass to extract out-of-scope context; includes a reproducible finding format and retest protocol.
Weak answer
Describes "testing for jailbreaks" without a structured adversarial methodology or reproducible format.
Agent Security Engineer
Critical — emergingSenior IC / Staff
Hire this if
You have AI agents taking real actions — file writes, API calls, data access — and nobody has designed the authorization layer.
Secure delegated-action pathways for tool-calling and autonomous workflows.
Boundary
Does not own all conversational safety policy and UX moderation concerns.
Anti-pattern to avoid
Confined to prompt-layer defenses while action authorization is left undefined.
Treats RAG security as prompt engineering; can't explain retrieval boundary enforcement; no index access control design experience.
Interview question
How would you test whether a RAG system can be induced to surface documents a querying user isn't authorized to see?
Strong answer
Designs tests using queries that embed system-level keywords or other-user context; checks whether retrieved documents respect the same access controls as direct document access; tests for prompt-embedded retrieval manipulation.
Weak answer
Suggests adding output-side guardrails; doesn't test the retrieval boundary itself.
Governance Evidence Lead
High — regulatory pressureSenior IC / Manager
Hire this if
Your board or regulator is asking for AI risk evidence and your team produces policy documents, not control proof.
Translate governance requirements into verifiable engineering evidence.
Boundary
Does not own implementation of every control — owns the evidence standard.
Anti-pattern to avoid
Assigned reporting responsibility without authority to enforce evidence quality.
Produces policies without engineering evidence; can't explain what an evidence artifact is; treats compliance attestation as equivalent to control proof.
Interview question
What is the difference between a policy, a control, and evidence? Give one concrete example of each for an AI governance program.
Strong answer
Policy: "Model outputs shall be monitored for harmful content." Control: "LLM observability tool logs all outputs with flagging." Evidence: "Monthly eval report showing flag rate, sample review, and remediation closure." Distinct and hierarchical.
Weak answer
Conflates policy language with evidence; calls a risk register "evidence"; cannot name a specific evidence artifact format.
ML Security Engineer
Growing — nicheSenior IC
Hire this if
You're building or fine-tuning models in-house and haven't addressed artifact integrity, training-data provenance, or deployment gate security.
Secure model development, packaging, deployment, and serving pathways.
Boundary
Does not own enterprise risk narrative or board reporting by default.
Anti-pattern to avoid
Reduced to inference-endpoint hardening while artifact lifecycle remains unmanaged.
First 90-day outputs
Model lifecycle control map, artifact integrity checks, deployment gate definitions, monitoring baselines.
Danger signals in candidates
Focused only on inference security; no MLOps pipeline knowledge; can't describe artifact integrity checks or model provenance verification.
Interview question
How would you design a release gate for a fine-tuned model being promoted from development to production?
Strong answer
Defines checks for: training data provenance, artifact hash verification, eval suite pass/fail, access control audit on who changed model weights, lineage documentation; mentions rollback protocol.
Weak answer
Describes a code review process; doesn't address model artifact integrity, training pipeline controls, or lifecycle-specific security requirements.
Model Risk Security Partner
High in regulated verticalsSenior IC / Director
Hire this if
You have a model risk or governance function but it doesn't produce enforceable security control requirements.
Convert model risk language into enforceable security control requirements.
Boundary
Does not own standalone product engineering roadmap.
Anti-pattern to avoid
Trapped in governance language without implementation pathways.
Can't translate a risk statement into a specific control requirement; conflates model performance risk with model security risk; produces risk ratings without remediation timelines.
Interview question
You've determined that a customer-facing credit decision model has a 12% adversarial perturbation success rate under targeted attacks. What do you do next?
Strong answer
Quantifies business impact threshold for acceptable perturbation rate; designs a monitoring requirement with a named threshold and escalation path; proposes control options with tradeoffs; writes the control into the risk register as a remediation requirement.
Weak answer
Reports the finding to the risk committee without a control design or remediation timeline.
AI Security Architect
High at scaleStaff / Principal
Hire this if
You're building multiple AI systems across teams with no shared trust model, control ownership map, or architecture standard.
Define secure reference architecture across AI systems, data paths, and delegated-action components.
Boundary
Does not own day-to-day control operations.
Anti-pattern to avoid
Architecture ownership without decision rights over implementation standards.
First 90-day outputs
Architecture baseline, trust-boundary definitions, control ownership map, design-review rubric.
Danger signals in candidates
Produces architecture diagrams without control ownership assignments; can't distinguish reference architecture from specific system design; no cross-team implementation authority.
Interview question
You've been asked to define a secure reference architecture for an enterprise that will build 30 AI-powered features across 12 product teams. Where do you start?
Strong answer
Starts with trust boundary mapping and shared control surfaces (auth, logging, model serving); identifies platform-level vs product-level controls; produces a control ownership map before any architecture diagram; designs the design-review process.
Weak answer
Jumps immediately to technology selection (which LLM, which vector DB); doesn't establish ownership or governance before building.
Boardroom-to-Backlog Playbook
Four audiences. Each with a distinct failure mode and a distinct set of decisions that change posture. The common thread: evidence artifacts are the connective tissue between strategy and execution.
Audience
CISO
Common mistakes
Assuming AI security ownership is implicit in existing team charters
Accepting policy language without evidence pathways
Treating one hire as a full operating model
Three decisions in 90 days
Assign explicit cross-functional AI security ownership with a named map
Approve role architecture before opening requisitions
Require evidence artifacts in quarterly AI risk reporting — not policy updates
Minimum evidence artifacts
AI security ownership matrix
AI risk-to-backlog register with named owners
Control evidence scorecard reviewed at the same cadence as policy reviews
Early warning signal
Risk narratives repeat in board decks while the control evidence scorecard remains static quarter-over-quarter.
Audience
Hiring Managers
Common mistakes
Writing Chimera Specs — one salary, five professions
Mixing incompatible seniority expectations within a single requisition
Using "AI red team" as undefined shorthand for any AI risk work
Three decisions in 90 days
Define one primary ownership domain per role in a single sentence without conjunctions
Separate required from adjacent capabilities — adjacent belongs in interview context, not requirements
Align the interview loop to the role archetype before the first screen
Minimum evidence artifacts
Role-boundary brief (one page maximum)
Interview rubric by archetype with scored criteria
First-90-day output checklist shared with candidates at offer
Early warning signal
Interview debrief feedback is high-variance across interviewers and not comparable. The loop is measuring different things.
Audience
Recruiters
Common mistakes
Title matching without execution evidence — "AI security engineer" in title is not AI security capability
Over-indexing on tool and framework keyword density
Ignoring adjacent-role transferability from ML engineering, product security, and AppSec
Three decisions in 90 days
Screen for artifact-backed signals — shipped work, not self-reported skills
Use archetype-specific scorecards calibrated with the hiring manager before sourcing
Calibrate weekly on top-of-funnel quality, not volume
Minimum evidence artifacts
Scored screening rubric by role archetype
Candidate evidence log with artifact references
Role-fit risk notes passed to interviewers with each advance
Early warning signal
High top-of-funnel volume with low technical conversion rate at first interview. The screening criteria are not calibrated to the actual role.
Audience
Practitioners
Common mistakes
Presenting broad capability claims without concrete shipped outputs or demonstrated AI-specific work
Under-communicating cross-functional collaboration value in interviews
Relying on traditional security credentials as AI security proxies — OSCP, CEH, and CISSP do not assess AI-specific attack surfaces
Three decisions in 90 days
Publish at least one public-safe execution artifact demonstrating AI security control design judgment (threat model, eval design, authorization architecture)
Build demonstrated competency in one primary archetype through hands-on practice — practical lab environments and adversarial AI exercises before interview season
Practice executive risk translation anchored in technical evidence — the skill that differentiates senior AI security candidates from senior security candidates with an AI prefix
Minimum evidence artifacts
Hands-on AI security exercise completions (adversarial prompting, RAG boundary testing, agentic control design)
Control design examples with outcome context — not just "did X" but "designed X and it changed Y"
Cross-functional delivery proof: evidence you shipped a control that engineering, product, and governance all accepted
Early warning signal
Strong technical screens, weak role-fit decisions. The gap is executive communication and organizational translation — and the inability to demonstrate AI-specific work beyond claimed familiarity.
For service providers & MSSPs
The 15 findings in this report describe a hiring market with a structural gap between exposure and staffing capacity. MSSPs and managed AI security service providers are positioned to fill this gap — but only with service delivery models explicitly designed for AI-specific control surfaces, not extensions of legacy compliance audit or traditional SOC offerings.
Client Segment
Primary Exposure
Service Model
Mid-market (500–5K employees)
AI feature deployment without dedicated AI security staffing
Fractional AI Security Lead + quarterly control evidence review
AI-native companies (pre-IPO)
Rapid agentic deployment, RAG boundary risk, no governance evidence
Agent Security Engineering retainer + evidence artifact program
Enterprise (existing program)
Compliance-reflex programs missing AI-native control coverage
AI security program gap assessment + AI-native framework overlay
Regulated industries (FSI, Healthcare)
Evidence gap — governance language without evidence artifacts
Governance evidence production + board-reportable control scorecard
The service provider hiring signal: Only 4 job postings in the 2026 corpus explicitly reference security training platform experience. Practical skills demonstration platforms (adversarial AI labs, cyber range environments, hands-on LLM security exercises) represent an emerging assessment infrastructure gap — the first providers to standardize AI security practical assessment will define the evaluation standard for the discipline.
What you wrote vs. what you actually need
Common job description language and what it actually signals about the role you're trying to fill.
What you wrote in the JD
What you actually need
Right archetype
"Own the AI security program end-to-end"
A team charter, not a single role. Define one anchor domain first.
Start with AI Security Architect, then add specialists
"AI red team and governance program lead"
Two incompatible full-time profiles compressed into one requisition.
AI Red Team Engineer + Governance Evidence Lead — separate reqs
"Prompt engineering and AI security testing"
Behavioral red-teaming is not prompt engineering. The skillsets don't overlap.
AI Red Team Engineer with adversarial lab methodology
"GDPR, HIPAA, SOC 2 compliance for AI systems"
Legacy compliance ≠ AI security engineering. You're building a compliance posture, not an AI control posture.
Governance Evidence Lead — or reclassify as a compliance role
"Secure our LLM-powered chatbot"
Depends on "secure" — data access risk vs conversational safety vs agentic actions are different scopes.
RAG Security Engineer (data leak) or Agent Security Engineer (actions)
"ML pipeline security and model governance"
Lifecycle control and risk reporting are separate ownership surfaces.
ML Security Engineer + Model Risk Security Partner — sequential or parallel
"Experience with Splunk and AI threat modeling"
Detection tooling and AI behavioral risk are different coverage planes requiring different profiles.
Two roles: traditional security + AI Product Security Engineer
"5+ years experience in AI security"
The discipline is 3 years old at scale. You're describing a market that doesn't exist yet.
Rewrite as capability-domain experience, not time-in-title
Before you post — 6-item pre-posting checklist
Can you state this role's primary ownership domain in one sentence without using the word "and"?
If you need "and," you have two roles. Split them or choose one.
Have you named the three specific deliverables this hire will produce in the first 90 days?
If you can't name them, the role scope isn't defined yet.
Have you removed experience requirements that describe tenure in a discipline younger than the required years?
AI security at scale is ~3 years old. "7 years in AI security" screens out everyone.
Does your interview loop include at least one AI-specific adversarial scenario or control design exercise?
If not, you're selecting on legacy security credentials, not AI security capability.
Does the JD include at least one AI-native framework reference (NIST AI RMF, EU AI Act, MITRE ATLAS)?
GDPR-only is a compliance posting, not an AI security posting.
Does this role map to a single archetype from the nine defined in §04?
If it spans more than one, it's a Chimera Spec. Redesign before posting.
The 30-day test — what any good hire delivers in month one
1
Has produced one named AI security deliverable — threat model, control design, eval output, or policy-to-control mapping.
2
Has met every team that contributes to or depends on AI security coverage. Can name each team's primary AI security gap.
3
Can name the three highest-priority AI security gaps in the current program — in order, with rationale.
4
Has identified at least one gap in current evidence artifacts for board-reportable AI risk. Has proposed a format to close it.
5
Has proposed a 90-day roadmap with named owners and at least two measurable outputs. Has shared it with the hiring manager.
Role design red flags — quick reference
Any single job description containing three or more of the following signals is exhibiting the Frankenstein pattern. Use this as a pre-posting checklist.
Red Flag Signal
What it indicates
The design fix
Five or more "and" connectives in requirements
Chimera Spec — multiple capability families compressed into one role
Define one primary ownership domain. Move secondary capabilities to "preferred" or adjacent team scope.
"AI red team" without named exercise types
Red Team Misnomer — label applied to undefined scope
Name three specific adversarial exercise types the hire must execute. If you can't name them, you need a governance reviewer, not a red teamer.
GDPR + HIPAA + SOC 2 with no AI-native framework
Compliance Reflex — legacy vocabulary without AI-specific coverage
Add at least one AI-native governance reference (NIST AI RMF, EU AI Act, MITRE ATLAS) or reframe as a compliance role, not AI security.
"Experience with Splunk/SIEM" as primary AI security tool
Tool Incumbency Trap — runtime detection coverage mistaken for AI behavioral risk coverage
Add at least one AI-native evaluation or observability tool requirement. Distinguish infrastructure security from AI behavioral security.
5+ years experience in a discipline that is 2 years old
Unicorn Index — experience requirements impossible to satisfy at stated scope
Rewrite experience requirements around capability domains and demonstrated outputs, not time-in-market.
"Responsible for building and maintaining the AI security program" as a single-contributor role
vCISO Vacuum in reverse — team-shaped mandate on individual budget
Scope the first hire to one anchor deliverable. Use a phased staffing plan, MSSP support, or fractional coverage for the remaining program surface.
AI Security Tools Intelligence
Tool-market mapping complements job-description intelligence by showing the ecosystem teams actually evaluate and deploy. This section provides public-safe aggregate coverage metrics and directional product-landscape context.
149
Tools in manifest
Curated market coverage across AI security workflow categories
21
Distinct categories
Spanning evaluation, observability, governance, red-team, and control tooling
n/a
Sample average rating
Directional quality signal from available rated entries only
The tools layer is ecosystem intelligence, not endorsement. Coverage is designed to support CISO vendor scanning, hiring-manager tooling literacy, and practitioner comparison workflows. For interactive exploration, use the tools directory at /tools.
External Validation Signals
A market thesis derived from a single source is an opinion. This section is the evidence layer: seven independent signal sources — practitioner surveys, academic research, open-source builder activity, industry media, public knowledge codification, vulnerability intelligence, and framework-control intelligence — each arriving at the same conclusion through entirely different mechanisms. The hiring corpus tells you what companies say they need. These signals tell you what researchers are studying, what builders are shipping, what practitioners are experiencing, what the press is amplifying, where exploit pressure is visible in public disclosure, and how control frameworks map in practice. When all seven describe the same structural gap, the convergence is the argument.
Evidence layer at a glance
22%
No clear AI security owner
Survey · 48% recognize it as a distinct discipline · avg maturity 2.1/5
2,730
arXiv papers analyzed
Top term: privacy-preserving · 67 papers
333
GitHub repos tracked
0K+ event proxies · top topic: vulnerability-qa/sast-agent-qa
613K+
Media items classified
37,172 mapped to AI security taxonomy
1,458
AI-relevant vulnerability records
3 CISA KEV overlaps · top bucket ai ml framework library vulnerability
8
Frameworks tracked
42 directional crosswalk mappings
1,458
AI-relevant CVEs/advisories
From 26K+ total records · 3 CISA KEV entries
378
Top domain: ai ml framework library vulnerability
Largest AI security vulnerability classification bucket
Primary Research — Practitioner Survey
The survey layer is the only signal in this report that asks practitioners directly: what are you experiencing? Four survey instruments — CISOs and security leaders, AI security practitioners, recruiters and hiring managers, and adjacent security engineers — plus a flash assessment collected first-person responses across the same dimensions the hiring corpus measures at a distance. The results are striking for how closely they mirror the corpus signal. 22% report no clear AI security owner in their organization — the practitioner's version of the ownership vacuum Finding 07 documents in job description language. Self-reported program maturity sits at 2.1/5 (emerging band), confirming that executive awareness has outrun engineering delivery, not the reverse. 48% of respondents recognize AI Security Engineering as an emerging distinct discipline, yet describe hiring practices built on AppSec and GRC assumptions. The most-cited practitioner risk — No AI asset inventory at 34% — maps directly to the agentic attack surface language in 0.24% of job postings. Practitioners can name the threat. The hiring market has not yet built the role that owns it.
Top practitioner-cited AI security risks — cross-persona survey
arXiv — Research Momentum
Academic research is an independent leading indicator: what researchers publish today becomes practitioner vocabulary in 12–24 months, and hiring language in 24–36. The arXiv signal measures where that leading edge is. A seeded metadata pull across eight AI security taxonomy buckets — analyzing 2,730 papers by title, abstract, and category — surfaces the same domain topology as the hiring corpus, without any shared data. Top matched term: privacy-preserving (67 papers). Largest classified research bucket: prompt and generation security. The most recent month in the dataset (2026-05) shows 1036 papers — part of a sustained acceleration trajectory. This is not coincidence. Researchers are studying the problems that practitioners cite as critical risks and that hiring language is only beginning to name. The gap between academic term frequency and hiring-language adoption is the predictive signal: the concepts with high arXiv velocity but low hiring-corpus density are the next generation of required skills.
arXiv matched term frequency — seeded AI security pull
GHArchive — Builder Ecosystem
Open-source activity is the market's most honest signal: builders allocate time to things they believe matter, independent of employer mandate or press framing. The GHArchive signal tracks 333 classified GitHub repositories across eight AI security domains — 415 total event proxies (stars, forks, pull requests, issues) from Unknown-dominated tooling. The result: active builder ecosystems exist across every AI security taxonomy bucket without exception. No domain is purely academic. The top topic tag is vulnerability-qa/sast-agent-qa, consistent with Finding 06 (Agentic Anarchy) and Finding 15 (The Agentic Surface Emergence) — the threat the hiring market has barely noticed is the one that builders have been building defenses for. Governance/assurance engineering and detection/runtime monitoring show the highest unique-actor density, which also maps to the governance hiring signal dominance documented in Finding 08 (Boardroom-to-Backlog Gap). Builders are ahead of the hiring market in every domain where the hiring market lags. That is the normal order — until organizations start hiring to catch up.
Unique GitHub contributors by AI security domain — GHArchive
Active GHArchive repos by month
GHArchive event type mix (AI-security-scoped stream)
Collaboration intensity by bucket (avg actors per repo)
Review pressure by bucket (review-to-push ratio)
Event concentration risk by bucket (top-10 share)
Classifier evidence strength by bucket
Control artifact signals by bucket
Release cadence by bucket
Media — Industry Coverage
Industry media shapes board-level mental models before any hiring signal reaches them. 613,416 items from aggregated RSS/Atom feeds — major tech media, security outlets, AI lab blogs — classified against the same AI security taxonomy. Of the classified volume (6.1% of total; the remainder is general tech coverage without AI security signal), AI model research and AI cyber defense dominate, with secure AI SDLC and governance following. This distribution explains a specific failure mode documented in Finding 13 (The Compliance Reflex) and Finding 08 (Boardroom-to-Backlog Gap): when boards read about AI security, they read about model capability risks and regulatory frameworks — not about control engineering, evidence artifacts, or ownership structures. The media vocabulary that reaches boardrooms is calibrated for awareness, not for the operational decisions that CISOs and hiring managers actually need to make. The result is governance language that drives policy creation without driving control creation. The hiring corpus reflects the same bias: governance-shaped role language crowding out engineering-shaped role language, in the same proportion media shapes board perception.
Media volume by AI security theme — classified items only
Wikimedia — Knowledge Codification
Public knowledge codification is a discipline-maturity clock. When a concept acquires a Wikipedia article, a Wikidata entity, a taxonomy entry in public knowledge graphs, it has crossed from practitioner jargon into institutional vocabulary. The codification lag — the delay from practice emergence to public encoding — is an independent proxy for how far behind educational infrastructure, recruiting knowledge, and junior candidate preparation sit relative to the frontier. AI security subfields with strong codification coverage signal that knowledge infrastructure exists to train and credential the next generation. Subfields with thin codification are where hiring managers are demanding senior experience for problems that have no curriculum, no certifications, and no standardized vocabulary yet — which is the direct precondition for Finding 11 (Entry-Level Extinction). The discipline cannot build junior pipelines for concepts that are not yet in the knowledge infrastructure that junior candidates use to learn and recruiters use to screen.
Public vulnerability disclosures are the market's most unambiguous signal: when a CVE is published against an AI/ML product or framework, the attack surface is no longer theoretical. This layer aggregates 26K+ records from NIST NVD, GitHub Advisory Database (GHSA), and OSV.dev, classified for AI/ML relevance using a two-stage pipeline: product/package name matching against a 35+ package dictionary, followed by keyword-weighted scoring across 21 semantic buckets from the MITRE ATLAS taxonomy. Of 26K+ total records, 1,458 were classified as AI-relevant at confidence ≥ 0.5. 3 appear in the CISA KEV — confirming active exploitation in the wild. The dominant classification bucket is ai ml framework library vulnerability (378 records), directly mapping to the LLM application frameworks that practitioners cite as their top risk surface and that arXiv research focuses on most intensely.
The vulnerability signal is structurally different from the other signal layers: it does not measure what people say they need, what researchers are studying, or what builders are shipping. It measures where attackers have already found exploitable weaknesses. When the CVE distribution aligns with practitioner-cited risks and arXiv research focus across independent classification taxonomies, the convergence moves from directional to evidential. The exploits exist. The question is whether the organizations deploying AI have staffed the roles capable of remediating them — and the hiring corpus documents the answer.
ATLAS — Public Adversary Vocabulary
MITRE ATLAS is the public matrix for adversary tactics, techniques, mitigations, and case studies. This repository mirrors the curated machine-readable bundle into a dedicated Labs navigator so readers can inspect tactic coverage, technique maturity, case-study distribution, and the Navigator exports directly. The current release contains 16 tactics, 170 techniques, 35 mitigations, and 57 case studies. That is a vocabulary and mapping source, not proof of any organization's internal maturity. Open the live ATLAS navigator for the interactive view.
16
Tactics
Adversary goals in the curated ATLAS bundle
170
Techniques
Technique and sub-technique entries in the current release
35
Mitigations
Public control ideas mapped to technique coverage
57
Case studies
Incident and exercise narratives in the bundle
Framework Intelligence — Control and Mapping Coverage
Framework intelligence adds a different validation lens: not attack pressure, but control-language interoperability. This layer tracks public framework assets across MITRE ATLAS, NIST AI RMF, OWASP LLM Top 10, and related governance references; currently 8 frameworks with 42 directional crosswalk mappings. The coverage split — 3 machine-readable vs 5 document-only frameworks — is itself operationally important. Teams cannot automate control validation where framework assets are only narrative text. Crosswalk density and domain coverage show where organizations can build control traceability today versus where taxonomy translation is still heuristic. This is directional signal, not official equivalence mapping.
Framework crosswalk coverage by control domain
Framework source retrieval status
What convergence means for claim confidence
Each of these seven signal layers is independent. They share a taxonomy, but not a dataset, a methodology, or an institutional source. The hiring corpus comes from employer language. The survey comes from practitioner experience. arXiv comes from academic attention. GHArchive comes from builder behavior. Media comes from editorial selection. Wikimedia comes from knowledge community consensus. Vulnerability intelligence comes from public CVE/advisory disclosures. Framework intelligence comes from public control-framework assets and directional crosswalk analysis. When seven independent systems all map the same topology of problems — agentic surface exposure with no control ownership, governance vocabulary without engineering delivery, discipline emergence without junior pipeline — that convergence is not confirmation bias. It is structural evidence. The central claim of this report is not that the hiring corpus suggests a gap. It is that seven independent systems are measuring the same gap from seven different angles, and the measurements agree.
Glossary
Twelve AI security terms every hiring team needs to know. These are the vocabulary gaps between what hiring managers write and what AI security engineers actually do.
Prompt Injection
An attack where adversarial text in the input causes a model to follow attacker instructions rather than the application's system prompt. Distinct from input validation attacks — it exploits the model's inability to reliably separate instruction from data.
Why it matters: affects any user-facing AI system. Cannot be fully mitigated by output filters alone.
RAG (Retrieval-Augmented Generation)
Architecture where the model retrieves relevant context from an external data store before generating a response. Creates retrieval boundary and data-access control risks that standard chatbot security does not address.
Why it matters: the retrieval layer can surface documents the querying user isn't authorized to see.
Agentic AI
An AI system that takes actions — calls APIs, writes files, executes code, sends messages — rather than only generating text responses. Requires action authorization controls, rollback capability, and audit trails that conversational AI does not require.
Why it matters: a chatbot saying the wrong thing is reputational risk. An agent taking the wrong action is operational risk.
Jailbreak
A prompting technique that causes a model to violate its alignment training or application-level guardrails. Distinct from prompt injection — it exploits the model's own reasoning rather than instruction/data confusion.
Why it matters: jailbreak-resistant systems require architectural controls, not just content filters.
Model Evaluation (Eval)
A structured assessment of model behavior across a defined test suite. The primary evidence artifact for demonstrating that a model meets safety, quality, and security specifications. Not a one-time test — an ongoing process.
Why it matters: the difference between "we believe the model is safe" and "we have evidence the model is safe."
NIST AI RMF
NIST AI Risk Management Framework (2023). The primary US government AI governance standard. Organized around four functions: Govern, Map, Measure, Manage. The AI-native counterpart to NIST 800-53 for traditional IT security.
Why it matters: 91 job postings cite it vs 5,461 for GDPR — the adoption gap tells you exactly how far most programs lag.
MITRE ATLAS
Adversarial Threat Landscape for AI Systems. The AI-native counterpart to MITRE ATT&CK. Catalogs adversarial ML attack techniques including model evasion, data poisoning, and model inversion. Used for adversarial red team exercise design.
Why it matters: if your red team doesn't reference ATLAS, they're probably not running AI-specific exercises.
EU AI Act
European Union AI regulatory framework. Establishes risk classification tiers: unacceptable, high-risk, limited risk, minimal risk. High-risk AI systems face mandatory transparency, documentation, and human oversight requirements.
Why it matters: affects any company serving EU customers. Effective 2024, enforcement rolling through 2026–2027.
Delegated Action
An action taken by an AI agent on behalf of a user or system, via tool calling or function calling. The security-critical concept: delegated actions may be irreversible and require explicit authorization scoping that conversational responses do not.
Why it matters: authorization for delegated actions is the #1 agentic control gap in the 2026 corpus.
Governance Evidence
A verifiable artifact — eval output, telemetry log, remediation closure record — that demonstrates a stated control is operating as claimed. Distinguished from a policy document, which states what should happen but does not prove it does.
Why it matters: boards are asking for evidence; most programs produce policies. The gap is the Governance Evidence Lead's entire mandate.
Frankenstein Role
A job description that bundles five or more historically separate capability families into one requisition at one salary. The primary driver of chronic mis-hiring, scope collapse after onboarding, and low tenure stability in AI security.
Why it matters: the corpus-wide average breadth score is 5.21. Almost no organization is exempt.
Chimera Spec
A job description with team-shaped capability requirements — spanning multiple ownership domains — at individual contributor budget and compensation. Related to the Unicorn Index finding. Not a talent shortage problem; a role-design failure.
Why it matters: the fix is role architecture, not compensation. Paying more for the unicorn doesn't make the spec possible.
Methodology & Claim Boundaries
This section defines what this report can and cannot claim, how each signal layer was constructed, and the limits within which its evidence is reproducible. These are not disclaimers — they are operational guardrails. A finding is only as useful as the clarity about what data produced it and what inferences it can support.
Data sources & multi-signal approach
This report triangulates across the report's independent signal layers: (1) ATS job corpus — 293,846+ job descriptions from 5,350 companies spanning 2013–2026, with primary analysis weight on the 241,553 postings from 2026; (2) Primary survey research — four survey instruments across CISOs, practitioners, recruiters, and adjacent engineers; (3) arXiv research momentum — seeded metadata from 903+ academic papers; (4) GHArchive builder ecosystem — 333 classified repos and 415 scoped event proxies in the current ingest window; (5) Media/news corpus — 776K+ items from aggregated RSS/Atom feeds; (6) Wikimedia knowledge codification — concept maturation tracking via public knowledge artifacts; (7) Vulnerability intelligence — normalized CVE/advisory streams with AI-relevance classification and KEV cross-reference. Each signal layer is independently classified against the same AI security taxonomy.
ATS corpus & primary data collection
The ATS job corpus is based on structured analysis of 293,846+ job descriptions (5,350 companies) collected from major applicant tracking systems (ATS) spanning 2013–2026, with primary analysis weight on the 241,553 postings from the 2026 period where AI security signal density is highest. The corpus includes roles across the full AI security adjacent spectrum: explicitly labeled AI security roles, traditional security roles (AppSec, cloud security, penetration testing), software engineering roles with security signals, and general software engineering roles used as a baseline comparator. Job descriptions were deduplicated by content hash and normalized for signal extraction.
AI-native companies, cybersecurity vendors, and financial services are over-represented in the hiring corpus relative to their share of total employment — reflecting their higher job posting volume and ATS adoption rates. Findings account for this concentration where relevant.
Signal extraction & taxonomy
Signal Domain
What it measures
Coverage
Role breadth (Frankenstein score)
Distinct capability families bundled in a single posting — product security, governance, adversarial testing, lifecycle control, agentic controls
All 294K+ postings
Framework mentions
Named governance, compliance, and AI-native security frameworks cited in role language (GDPR, NIST AI RMF, EU AI Act, MITRE ATLAS, ISO 42001, etc.)
All 294K+ postings
Tool mentions
Named security and AI-native tools cited in role language — detection, AppSec, LLM observability, model evaluation, guardrails
All 294K+ postings with quality filters
Agentic surface signals
Vocabulary specific to agent-layer attack surfaces: prompt injection, function calling, tool calling, RAG boundary, jailbreak, model drift, data poisoning
All 294K+ postings
Evidence language
Governance obligation language vs. evidence-producing language (telemetry requirements, eval output requirements, remediation proof, attestation)
Framework-tagged postings
Seniority distribution
Role-family breadth scores stratified by seniority level
AI security-labeled roles
Tool signal quality note: Tool mention signals use substring dictionary matching. Signals with high false-positive risk (where the tool name appears as a substring of common English words) are excluded from named tool analysis. Category-level tool signals aggregate across multiple tools and are more robust than individual tool name signals.
Claim boundaries
Claim Level
Definition
Use
Public Claim Ready
Direct quantitative signal from job description corpus. Reproducible from the extraction taxonomy.
External report, media, decks. Cite with stated corpus caveats.
Public Claim with Caveat
Corpus pattern analysis. Directional signal, not precise measurement. Individual-company variation is real.
Directional assertions, with explicit "based on hiring language" qualifier.
Internal Hypothesis Only
Inferred or extrapolated beyond corpus evidence. Not validated by job description signal.
Research agenda, further study. Not suitable for external publication.
What this data can support: Market-level patterns in hiring language · Framework and tool adoption signals · Role breadth and capability-family trends · Year-over-year signal growth rates
What this data cannot support: Company-level security maturity assessments · Individual practitioner capability evaluations · Proof that any company has or lacks any specific security control · Claims about actual deployed AI systems or security incidents
Sponsor independence: Research methodology, signal taxonomy, and findings are independent of sponsor involvement. Sponsors receive access to findings; they do not influence finding selection, framing, or data interpretation.
About the authors and editors
Contributor notes for the 2026 report
These bios are intentionally brief. They identify the people who shaped the manuscript and the narrow reason each one is included here.
Authors
Primary manuscript authors and research framing.
Author
David Wolf
Building the operating model, controls, detection, and evidence layer for enterprise AI adoption. Translates market signals and regulatory requirements into engineering controls that actually reduce risk.
Relevance
Led the manuscript's framework, control-plane architecture, and operating-model design.
Author
Alex Eisen
Finds real AI attack paths using applied vulnerability research methodology, adversarial testing, and incident-pattern analysis grounded in Mandiant-grade DFIR and 3.5 years of principal AI security research at ServiceNow.
Relevance
Applied security-research and AI-risk framing to the control-plane sections.
Editors
Editorial review for clarity, precision, and publication-safe language.
Editor
Tim Kerimbekov
Risk translation, governance, and security tool guidance grounded in product and enterprise experience.
Relevance
Reviewed risk language and operating-model guidance for practical clarity.
Editor
Alex Karoulias
Backend engineering and adversarial data modeling.
Relevance
Reviewed technical accuracy, attack surface framing, and security architecture language for precision.
aisecurity.llc research
The State of AI Security Engineering Report 2026
This report provides job-description intelligence and aggregate benchmark signals. Findings reflect role-language evidence, not company-level maturity proof.