Co-author
Alex Eisen
Advises on AI risk, incident response readiness, and research-informed product security priorities.
Relevance
Applied security-research and AI-risk framing to the control-plane sections.
THE STATE OF AI SECURITY ENGINEERING
2026 ANNUAL REPORT
A structured analysis of 293,846+ job descriptions across 5,350 companies — validated by direct practitioner surveys, arXiv research momentum, open-source builder activity, and industry media signals. The most comprehensive benchmark of the AI security engineering labor market available.
Required reading for CISOs scoping programs, hiring managers writing requisitions, recruiters sourcing candidates, and practitioners navigating a discipline being defined in real time.
About the authors and editors
These bios are intentionally brief. They identify the people who shaped the manuscript and the narrow reason each one is included here.
Co-authors
Primary manuscript authors and research framing.
Co-author
Advises on AI risk, incident response readiness, and research-informed product security priorities.
Relevance
Applied security-research and AI-risk framing to the control-plane sections.
Co-author
Strategy, product framing, and advisory translation for teams that need a usable operating model.
Relevance
Shaped report structure, executive translation, and public-safe positioning.
Editors
Editorial review for clarity, precision, and publication-safe language.
Editor
Risk-informed security strategy and operating-model guidance grounded in product and enterprise experience.
Relevance
Reviewed risk language and operating-model guidance for practical clarity.
Editor
Legal and policy coverage for teams that need privacy, security, and terms pages updated without losing contractual precision.
Relevance
Reviewed policy language, contract boundaries, and public-safe wording.
Contents
A new security discipline is being staffed before it has been defined. AI Security Engineering has crossed from experimentation into institutional demand — but the organizational infrastructure required to absorb that demand is running years behind. This report documents the gap between what companies say they need and what the market can actually deliver, drawn from 293,846+ job descriptions, direct practitioner surveys, academic research, open-source builder activity, industry media coverage, and vulnerability intelligence. Across seven independent signal layers, the conclusion is the same.
Five numbers that define the 2026 market
| Number | What it means | What this means for your next hire |
|---|---|---|
| 290× | Growth in AI security hiring from 2022 to 2026. Every position is calibrated for senior experience in a discipline that did not exist four years ago. | Every experienced AI security candidate has 2–4 competing offers right now. Scoped single-domain roles close 2–3× faster than chimera specs. Over-scoped roles sit open. |
| 39:1 | Legacy compliance frameworks (GDPR, HIPAA, SOC 2) vs AI-native governance frameworks (NIST AI RMF, EU AI Act, ISO 42001) in hiring language. The compliance reflex is structural, not incidental. | Your JD is almost certainly calibrated for a compliance program manager, not an AI security engineer. If it mentions GDPR but not NIST AI RMF, you're hiring the wrong profile. |
| 8:1 | Traditional security tool vocabulary (detection, SIEM, AppSec scanners) vs AI-native evaluation and observability tools in the same hiring corpus. The tooling stack is calibrated for compliance audit, not AI behavioral risk. | If your JD requires Splunk but not LangSmith, you're screening for infrastructure security, not AI behavioral risk. You will select for the wrong profile. |
| 0.24% | Share of all 2026 postings that contain agentic attack surface language (prompt injection, function calling, tool calling). Agentic deployments are at scale; agentic security hiring is at near-zero. | If you've shipped an AI agent, you have agentic security exposure right now. The hire you need doesn't exist at scale in the market — fractional or contract coverage is likely the only near-term option. |
| 57:1 | Job postings that reference agentic attack surfaces vs postings with agentic control design language. The market is learning the vocabulary of the problem faster than it is staffing for the solution. | Your JD can name the risk — but hiring for the control architect who can actually solve it requires a different archetype than what most postings describe. See the decision tree in §04. |
The consequence of these five numbers is not subtle: companies are posting role language that implies team-shaped capability while budgeting and interviewing as if they are hiring one specialist contributor. The result is a hiring market defined by Chimera Specs — one salary, five professions — and a discipline being invented at the top of the org chart with almost no junior entry pathways. Governance language has outrun evidence language. Compliance vocabulary has crowded out AI-native control language. And the hiring market is running three years behind the deployment curve on the agentic attack surfaces that matter most.
Three structural shifts would change this trajectory. First: separate ownership domains from hiring units — one role, one primary function, one accountability domain. Second: require evidence artifacts at the point of governance obligation creation — every compliance framework reference in a job description should come paired with a named evidence artifact the hire will produce. Third: invest in practical assessment infrastructure now — adversarial AI labs, scenario-based evaluation environments, and archetype-specific interview rubrics are not future-state investments; they are the prerequisite for building any repeatable hiring signal in a discipline without shared standards. The organizations that solve these three problems first will not just hire better — they will define the operating model that the rest of the market eventually adopts.
These conclusions are not derived from a single source. The hiring corpus is the primary evidence layer, but it does not stand alone. Primary practitioner survey research confirms the ownership gap: 27% of respondents report no clear AI security owner in their organization. Self-reported program maturity averages 2/5 — squarely in the emerging band, consistent with executive awareness without engineering delivery. 57% of practitioners recognize AI Security Engineering as an emerging distinct discipline, yet are watching their organizations hire as if it is a subspecialty of AppSec. The most cited risk — Data leakage via AI (38% of respondents) — is precisely the class of threat that agentic deployment creates and that 0.24% of job descriptions currently screen for. Meanwhile, arXiv research shows academic output in agentic attack surfaces accelerating faster than any other AI security domain. GitHub builders are funding the same taxonomy with open-source activity. Industry media is amplifying governance framing to board audiences who interpret it as compliance progress rather than control absence. And public vulnerability disclosures — 1,458 AI-relevant CVEs, 3 confirmed in CISA KEV — confirm that the attacks are not hypothetical: the exploits exist and are being used. The hiring market, the practitioners, the researchers, the builders, the press, and the CVE feed are all observing the same structural condition from different vantage points — and all of them are describing a gap between deployment and defense.
"The hiring market, the practitioners, the researchers, the builders, the press, and the CVE feed are all observing the same structural condition from different vantage points — and all of them are describing a gap between deployment and defense."
Key findings — where the signals converge
Finding #1 · Agentic Attack Surface
Agentic deployments are live; agentic security hiring is near-zero.
511 job postings reference agentic attack surfaces in 2026. Only 9 reference agent control design language — a 57:1 gap. arXiv confirms agentic attack surface research is the fastest-accelerating academic domain. The hire the market needs doesn't exist at scale.
Finding #2 · Compliance Displacement
Legacy compliance frameworks are crowding out AI-native engineering requirements.
39:1 ratio of legacy (GDPR, HIPAA, SOC 2) vs AI-native (NIST AI RMF, EU AI Act, ISO 42001) framework mentions in 2026 hiring language. Organizations are hiring compliance program managers for AI security engineering roles — and selecting for the wrong archetype.
Finding #3 · LLM Framework Vulnerability Density
AI/ML product and framework vulnerabilities are the dominant disclosed attack surface.
1,458 AI-relevant CVEs classified from 26K+ total records. The top bucket — ai ml framework library vulnerability — accounts for 378 disclosures. 3 appear in CISA KEV (active exploitation confirmed). The vulnerabilities exist; the roles to remediate them are barely hired.
Finding #4 · Entry Pipeline Collapse
290× hiring growth with zero junior pathways is building a structural staffing collapse.
AI security hiring grew 290× from 2022 to 2026. Every role targets senior experience in a discipline that did not exist four years ago. There is no OSCP equivalent for AI security, no standardized assessment infrastructure, no certifiable junior curriculum. The pipeline cannot refill without deliberate investment now.
Finding #5 · Signal Convergence Confirms the Structural Diagnosis
Six independent data sources — none sharing methodology or institutional source — describe the same gap.
The ATS corpus (employer language), practitioner survey (first-person experience), arXiv (academic attention), GHArchive (builder behavior), media (editorial coverage), and CVE disclosures (exploit pressure) are each independently classified against the same AI security taxonomy. Each produces a different slice of evidence. All six describe the same structural gap between deployment and defense. When independent systems converge this consistently, the convergence is the argument — not any single signal.
Fifteen market findings derived from job description signal analysis across 293,846+ postings. Each finding is a reusable market concept supported by corpus evidence, with explicit claim boundaries and audience-specific action implications.
| # | Finding | Key signal | Your next step |
|---|---|---|---|
| 01 | The Frankenstein Role | Avg breadth 6.79 — Gov & Defense leads all industries | Run a role-architecture workshop before opening any AI security req |
| 02 | Skill Washing | 100K+ AI-labeled roles in Manufacturing with lowest AI specificity | Rewrite JD around named controls and evidence artifacts, not title keywords |
| 03 | The Unicorn Index | Every security role family averages 5.9–6.5 capability breadth | Scope each role to one primary ownership domain — use the checklist in §05 |
| 04 | The Probability Pivot | 57:1 agentic attack surface vs control design language in 2026 | Add probabilistic failure-mode scenarios to every AI security interview loop |
| 05 | The Evidence Gap | GDPR 5,461 jobs (0.9% evidence language) vs NIST AI RMF 91 (13.2%) | Map every governance obligation to a named evidence artifact before posting |
| 06 | Agentic Anarchy | 511 agentic attack surface signals vs 9 agent control postings — 2026 | Assign Agent Security Engineer ownership before your next agentic deployment |
| 07 | The vCISO Vacuum | Mid-market AI exposure before full-time staffing maturity — no designed model | Define a staged operating model — fractional or MSSP — before the first hire |
| 08 | Boardroom-to-Backlog Gap | Privacy framework 8,555 jobs vs AI Governance 295 jobs — 29:1 | Require every board AI risk narrative to map to a named backlog item with an owner |
| 09 | Skills Validation Gap | No OSCP equivalent for AI security — generic loops, archetype-specific roles | Build archetype-specific interview rubrics before the first screen — see §04 |
| 10 | Model Supply Chain Blind Spot | Runtime signals dominate; artifact integrity and provenance under-specified | Add model lifecycle and provenance control ownership to every AI security scope |
| 11 | Entry-Level Extinction | 290× growth 2022–2026; zero junior pathways; senior-only into new discipline | Create explicit junior-to-mid transition pathways now — not after the pipeline collapses |
| 12 | The Red Team Misnomer | "AI red team" applied to governance, product, platform, and abuse testing equally | Name three specific adversarial exercise types in every red team posting |
| 13 | The Compliance Reflex | Legacy frameworks 12.4K jobs vs AI-native frameworks 317 — 39:1 | Add one AI-native framework reference for every legacy compliance requirement in your JD |
| 14 | The Tool Incumbency Trap | Detection/AppSec (4.2K jobs) vs AI-native eval + observability (552 jobs) — 8:1 | Audit AI security tooling against AI-specific threat coverage, not compliance coverage |
| 15 | The Agentic Surface Emergence | 0.24% of postings mention agentic controls; trajectory steep from near-zero | Add prompt injection and function calling scenarios to interview loops now |
Finding 01 · Talent & role-design crisis
AI Security Engineer postings increasingly bundle five historically separate capability families into one requisition — at one salary.
AI feature velocity, buyer scrutiny, and compliance pressure converged before organizational design matured. Companies needed someone who could simultaneously own product security, model governance, adversarial testing, regulatory evidence, and agentic controls. They wrote that into one job description instead of designing a team.
Average role breadth score by industry — 2026
Score reflects number of distinct capability families bundled in role language. Higher score = more Frankenstein. Scale: 0–10.
The corpus-level average role breadth score is 5.21 — across all 294K+ analyzed postings. Government and Defense leads all industries at 6.79, followed by Retail and Ecommerce at 6.29. Financial Services — with 38.8K jobs, the largest single scoped-security market — sits at 5.44. Manufacturing leads in volume (100K postings) but has the weakest AI-native specificity. Even Telecommunications, the most conservative sector, posts roles averaging 3.73 capability families. No industry is writing clean, scoped AI security roles.
"No industry is writing clean, scoped AI security roles. The Frankenstein pattern is not a failure of individual job descriptions — it is a failure of organizational role architecture across an entire discipline."
What leaders are misreading
Leaders treat this as a talent shortage problem alone. It is primarily a role-design failure. The scarcity is not of capable people — it is of organizations willing to do the architectural work of separating ownership domains before opening requisitions.
Failure mode if unaddressed
Chronic mis-hiring, scope collapse after onboarding, low tenure stability, and perpetual re-opening of the same role.
What this changes now
Finding 02 · Title/substance mismatch
AI-labeled security titles often outpace the AI-specific control, testing, and evidence language inside the same posting.
Title demand outpaced discipline standardization. Recruiting teams adopted AI prefix language as a market signal before organizations developed meaningful AI-specific control requirements to pair with it. The title changed; the job description content did not.
AI security-labeled job volume by industry — 2026
Job count for roles with AI security signals in title or description. High volume does not imply AI-specific control language depth.
Manufacturing Industrial and OT leads by a wide margin at 100K AI security-labeled roles in 2026 — but this sector also has the weakest AI-native specificity signals. Financial Services posts 38.8K such roles and Government/Defense 30.5K, sectors where legacy compliance language dominates. The AI label is applied broadly; the AI-specific control substance is applied narrowly.
"The AI label is applied to the title; the AI-specific control substance is applied to almost nothing. The gap between these two numbers is not a measurement artifact — it is an organizational design failure at industry scale."
What leaders are misreading
AI in the title is treated as proof of AI security scope. It is not. Title density is a market demand signal, not a capability evidence signal.
Failure mode if unaddressed
Teams hire for legacy security profiles while believing they have staffed AI risk. The coverage gap is invisible until an incident makes it legible.
What this changes now
Finding 03 · Team-shaped requirements
The market prices one role while describing team-level capability breadth. Every role family is affected — none is exempt.
Immediate pressure to cover product, governance, and customer assurance simultaneously led hiring managers to compress team-shaped requirements into single-contributor budgets.
Average role breadth score by security role family — 2026
Breadth score across all roles assigned to each security capability family. The unicorn problem is not confined to AI roles.
Data Security roles lead with a 6.53 average breadth score, followed by Application Security at 6.49 and Product Security at 6.44. AI Security-specific roles sit at 6.42 — high, but not uniquely high. The unicorn problem is structural across security hiring, and AI has not made it worse — it has made it visible.
"Every security role family sits above 5.9 on the breadth scale. The Unicorn problem is not uniquely an AI security problem — AI has simply made it impossible to ignore. You cannot hire your way out of a role architecture failure."
What leaders are misreading
Compensation is treated as the only lever. Scope architecture is the lever that actually matters. Paying more for the unicorn does not change the impossibility of the spec.
Failure mode if unaddressed
Open roles stay unfilled or are filled with scope collapse: the hire arrives, renegotiates scope in week four, and the role re-opens within eighteen months.
What this changes now
Finding 04 · Systems reasoning shift
AI security demands a cognitive shift from deterministic defect reasoning to probabilistic systems reasoning — but neither hiring loops nor interview rubrics have adapted to evaluate it.
Traditional security is built on deterministic failure logic: a buffer overflows or it does not. A CVE is present or patched. AI systems fail probabilistically: the same prompt produces different outputs across temperature settings, context window states, and retrieval results. An adversarial input may work 40% of the time, not 100%. A control that reduces attack success from 80% to 20% is meaningful even though it does not eliminate the vector. This is an entirely different reasoning mode — and it is not what standard AppSec or penetration-testing interview loops are designed to evaluate.
Jobs mentioning agentic attack surfaces — by year
Rapid growth in attack surface language vs near-zero growth in agent control language. The discipline is becoming aware of the problem faster than it is staffing for the solution.
Agentic attack surface signals appeared in effectively zero job postings before 2024. By 2025: 48 postings. By 2026: 511 — a 10× year-over-year jump in a single year. The same 2026 dataset shows only 9 postings with meaningful agent control language — a 57:1 gap between attack surface awareness and control design language. This gap is not a data artifact. It reflects a genuine organizational pattern: teams recognize the agentic attack surface exists before they know how to architect controls for it. Interview loops still screen for static AppSec knowledge — deterministic vulnerability identification, CVE analysis, exploit reproduction. None of these evaluate the probabilistic tradeoff reasoning that AI security actually requires.
What leaders are misreading
Security interview performance on traditional vulnerability questions is used as a proxy for AI security capability. Strong AppSec candidates may fail at probabilistic reasoning; strong probabilistic reasoners may not have AppSec backgrounds. The screen is measuring the wrong variable.
Failure mode if unaddressed
Hiring systematically selects for deterministic-thinking security candidates in roles that require probabilistic-reasoning AI security judgment. The discipline builds a talent cohort optimized for the wrong problem class.
What this changes now
Finding 05 · Governance-to-execution gap
Governance language appears before engineering evidence language in AI security hiring. Organizations can describe the policy obligation but cannot yet describe the proof.
Policy and framework adoption moved faster than productionized control instrumentation. Boards demanded governance narratives; governance teams wrote policies; engineering teams were not yet staffed to produce the evidence that would validate those policies.
Framework adoption in AI security hiring — 2026
Cyan = legacy compliance & privacy frameworks · Violet = AI-native governance frameworks. GDPR (5,461 jobs, 0.9% with evidence language) vs NIST AI RMF (91 jobs, 13.2% with evidence language) — the inverse specificity paradox.
GDPR appears in 5.5K job postings — the single largest framework signal in 2026. Only 0.9% of those contain evidence-producing language such as control attestations or telemetry requirements. HIPAA: 3.1K postings, 0% evidence language. SOC 2: 1.9K postings, 0% evidence language. The signal inverts with AI-native frameworks: NIST AI RMF appears in just 91 postings, but 13.2% contain evidence-producing language. ISO/IEC 42001: 15 postings, 13.3% evidence language. Where organizations are deliberately staffing AI-native governance, they write more rigorous requirements — but almost none are doing so yet.
What leaders are misreading
Policy completion is interpreted as risk reduction. Governance programs that produce narrative confidence without operational proof are not risk reduction programs — they are risk documentation programs.
Failure mode if unaddressed
Governance programs accumulate policy artifacts while control behavior remains unmeasured. The board believes the posture is improving; the actual posture is static.
What this changes now
Finding 06 · Delegated action risk
Agent security is delegated action security. The market is still framing it as chatbot security — a category error with operational consequences.
Tool-calling and workflow automation expanded AI blast radius from response quality to action execution. A chatbot that says the wrong thing is a reputational risk. An agent that takes the wrong action is an operational one. The control architectures for these two scenarios are fundamentally different.
Agentic attack surface signals — job mention frequency
Count of job postings mentioning each agentic attack surface. Function calling, prompt injection, and tool calling are the dominant emerging signals.
The agentic attack surface vocabulary is present but sparse. Function Calling appears in 278 postings; Prompt Injection in 258; Tool Calling in 236. Combined, these three represent under 800 job postings against a 2026 corpus of 242K — less than 0.35%. Meanwhile, agentic deployments are accelerating across every vertical. The hiring market is running approximately three years behind the deployment curve on agentic control vocabulary.
"A chatbot that says the wrong thing is a reputational risk. An agent that takes the wrong action is an operational one. The control architectures for these scenarios are not the same."
What leaders are misreading
Prompt-layer defense is treated as sufficient control architecture for agentic systems. It is not. Prompt hardening addresses what the model says. Action authorization addresses what the agent does — and action authorization is almost entirely absent from the hiring vocabulary.
Failure mode if unaddressed
Permitted agent actions become high-impact misuse pathways. Authorization, rollback, and audit capabilities are absent at launch because no one owned them.
What this changes now
Finding 07 · Mid-market exposure gap
Many organizations are too small to hire the AI security unicorn but too exposed to defer. Mid-market companies are the gap the market has not designed for.
AI exposure emerges before dedicated staffing maturity. A 200-person SaaS company shipping AI features faces the same model behavior, agentic control, and data-boundary obligations as a 20,000-person enterprise — without the headcount budget to staff a full AI security program. The discipline has no designed operating model for this segment.
AI security job volume by industry — 2026
Hiring volume by sector. High-volume markets with low role specificity (Manufacturing) have the most acute vCISO vacuum — exposure without staffing architecture to address it.
The vCISO vacuum is most acute at the intersection of three conditions: organizations large enough to face real AI security exposure, too small to hire a dedicated AI security team, and operating in sectors without established third-party support models. Manufacturing leads on volume (100K postings) but has the weakest AI-specific control language — which signals that most of those organizations are posting aspirational language without the internal expertise to execute it. Financial Services faces a different version: the governance vocabulary exists, but mid-market firms (500–5,000 employees) cannot absorb a unicorn AI security hire at the compensation level required. MSSPs and fractional AI security services represent the structurally correct answer to this gap — a managed capability model where the expertise is amortized across clients, and the operating model is designed for staged delivery rather than a single hire.
"A 200-person SaaS company shipping AI features faces the same agentic control and data-boundary obligations as a 20,000-person enterprise — without the headcount to staff a response. The market has not designed an operating model for this gap. That is an MSSP opportunity, not a hiring problem."
What leaders are misreading
Staffing constraints are treated as justification for deferral. They are the reason to adopt a managed or fractional operating model — not the reason to defer risk exposure.
Failure mode if unaddressed
AI exposure accumulates without named controls or evidence artifacts. When an incident occurs, the organization discovers simultaneously that it has no controls, no evidence of prior posture, and no internal expertise to respond.
What this changes now
Finding 08 · Execution translation failure
Executive AI risk narratives fail to translate into named engineering controls, accountable owners, and evidence artifacts — the machinery to answer the board's AI risk questions does not exist.
Board pressure on AI risk arrived before organizations built the execution infrastructure to respond to it. The result is a performative loop: the board asks, the CISO narrates, governance teams document, and engineers are not yet staffed or empowered to produce the evidence artifacts that would validate any of it. The gap between the boardroom question and the backlog item with an owner is where AI risk exposure actually lives.
Framework category job volume — 2026
Privacy and Compliance framework hiring dominates. AI Governance hiring (295 jobs) is 29× smaller than Privacy framework hiring — but AI Governance roles write more rigorous evidence language when they do appear.
The boardroom-to-backlog gap is visible in two ways in the hiring data. First: Privacy framework hiring (8.6K jobs) and Compliance hiring (3.8K jobs) dwarf AI Governance hiring (295 jobs) — organizations are staffing compliance narrative, not AI control execution. Second: fewer than 1% of Privacy and Compliance framework job postings contain evidence-producing language. The hiring architecture reflects the governance architecture: policy is produced; proof is not. This is not a CISO failure — it is an organizational design failure. The board deck exists; the backlog item with a named owner and an evidence requirement does not.
What leaders are misreading
Strategy articulation is treated as execution readiness. Board confidence in the CISO's AI risk narrative is confused with board visibility into control posture. These are different things, and the difference will surface in the next incident.
Failure mode if unaddressed
Risk narratives repeat unchanged across four to eight board cycles. Each cycle the narrative grows more detailed; the underlying control posture remains unmeasured. When an incident occurs, the organization discovers it cannot demonstrate what it claimed to the board.
What this changes now
Finding 09 · Assessment maturity lag
The market demands AI security engineering capability before it has standardized practical evaluation pathways.
Role demand accelerated before assessment models matured. There are no standardized AI security certifications, no widely-accepted lab environments for AI security skill demonstration, and no practical exam pathway equivalent to OSCP or equivalent for AI-specific attack surfaces. Organizations are running generic security interviews against AI security requirements — selecting candidates through loops calibrated for a different discipline.
AI security job volume by security role family — 2026
Each role family requires distinct assessment approaches. ML/AI Engineering hiring uses engineering screens calibrated for model building, not model security. Governance/GRC hiring uses policy screens calibrated for compliance, not control evidence.
The skills validation gap is structural. Organizations are hiring across 13+ security role families under AI security labels, each with distinct competency profiles that generic interview loops cannot differentiate. A standard AppSec interview tests for vulnerability identification in code — it does not evaluate agentic control design, RAG boundary enforcement, or probabilistic failure-mode reasoning. There is no OSCP equivalent for AI security. No standardized adversarial AI lab examination. No practical certification pathway that tests whether a candidate can exploit a RAG retrieval pipeline, bypass a prompt injection filter, or design a delegated-action authorization policy under adversarial conditions. The emerging solution pathway is scenario-based practical assessment: hands-on cyber range environments where candidates demonstrate real judgment — not recall — against deployed AI system configurations. Exercises that mirror real attack scenarios: prompt injection chains against agentic workflows, RAG boundary manipulation to extract context-window data, model behavior evaluation under distribution-shift attacks, authorization logic design for tool-calling systems. This mirrors how traditional security assessment evolved from certification to practice-based evaluation — a shift the AI security discipline is only beginning to make. The organizations that standardize practical AI security assessment infrastructure first will build the only reliable hiring signal in the market, and create the first reusable evaluation standard for the discipline.
"There is no OSCP equivalent for AI security. No standardized adversarial AI lab. No practical exam for prompt injection exploitation, agentic control design, or RAG boundary enforcement. The market is hiring at scale for a discipline the assessment ecosystem cannot yet validate."
What leaders are misreading
Credential density (certifications, degree programs, tool familiarity) is treated as competency evidence. It is a prior-discipline filter, not an AI security evaluation. The skills that matter most — ambiguity tolerance, probabilistic failure reasoning, control design under non-deterministic conditions — are not assessed by any current credential pathway.
Failure mode if unaddressed
Selection quality becomes noisy and non-repeatable across interviewers. False positive rate is high, false negative rate is equally high (good candidates rejected for the wrong signals). Scope collapse occurs after onboarding because the interview loop measured credentials, not capability.
What this changes now
Finding 10 · Lifecycle control deficit
Model provenance, artifact integrity, dependency management, and deployment gates are systematically under-specified in AI security role language.
Organizations focus first on model behavior and user interaction risks. The assumption is that if the model behaves correctly at inference time, the system is secure. Supply chain compromise does not attack inference — it attacks the artifact pipeline upstream of it.
Attack surface signal job mention frequency
Frequency of attack surface vocabulary in AI security job postings. Model supply chain signals (model weights, data poisoning) lag behind runtime and agentic signals.
The attack surface vocabulary in AI security hiring is dominated by runtime and agentic signals: function calling, prompt injection, tool calling. Model supply chain signals — model weights, data poisoning, artifact integrity — appear far less frequently, suggesting that lifecycle security ownership is not yet a primary hiring criterion. Organizations actively building AI security programs are staffing runtime controls; lifecycle controls remain an afterthought. This creates a specific risk class: an attacker who poisons the training dataset, corrupts model weights during packaging, or injects backdoors into a fine-tuning pipeline bypasses all runtime controls because the compromise occurred upstream of every inference-time defense.
"Runtime controls address what happens when the model runs. They do not address what happens when the model was built from a poisoned dataset, packaged from a compromised artifact, or deployed through a backdoored fine-tuning pipeline. The supply chain attack surface is outside every runtime security dashboard in the current stack."
What leaders are misreading
Runtime controls are treated as complete security posture. They address what happens when the model runs. They do not address what happens when the model was built, packaged, or deployed from a compromised artifact.
Failure mode if unaddressed
Silent supply chain risk accumulates outside visible incident pathways. When it surfaces, it bypasses all runtime controls because the compromise occurred upstream.
What this changes now
Finding 11 · Talent supply crisis
AI Security Engineering is being invented at the top of the org chart. The market is hiring senior-only into an unproven discipline, with almost no junior pathways.
Immediate risk pressure and budget constraints favor experienced hiring language. Organizations need someone who can own the domain on day one. The consequence is a discipline with no talent pipeline — which is sustainable for approximately one hiring cycle.
AI security-adjacent job posting volume — by year
Year-over-year growth of AI security role postings. Explosive recent growth with no junior pipeline creates a structural supply problem.
The growth trajectory is extraordinary: 134 postings in 2022, 526 in 2024, 7.8K in 2025, 39.3K in 2026. This is a 290× expansion in four years. Every position in this expansion is calibrated for senior experience in a discipline that did not exist four years ago. The pipeline problem is not hypothetical — it is already present, and the organizations hiring aggressively today are consuming the very talent pool they will depend on in three years.
"290× growth in four years. Every position calibrated for senior experience in a discipline that did not exist four years ago. The pipeline problem is not coming — it is already here."
What leaders are misreading
Senior-only staffing appears efficient in the short term. It consumes available experienced candidates without building the pipeline that will replace them. The efficiency is borrowed from the future.
Failure mode if unaddressed
Future mid-level talent pipeline collapses. Organizations that did not invest in junior pathways will face a structural shortage of experienced AI security engineers at exactly the moment when they have the budget and maturity to hire them.
What this changes now
Finding 12 · Role language confusion
"AI red team" is used as a catch-all for governance reviews, product assessments, platform controls, and abuse testing — diluting the term to meaninglessness.
The phrase carries market credibility and executive legibility. It is used as shorthand for broad AI risk work because it is understood by leadership and valued in the market, regardless of what the actual role delivers.
Job volume by role classification bucket — 2026
AI Security-specific roles represent a small fraction of total security hiring but carry the highest average role breadth score — the clearest signal of the Frankenstein problem concentrated at the AI-labeled tier.
AI Security-specific roles number just 201 in the 2026 dataset against 3,894 traditional security roles and 28,768 software engineering roles. But those 201 roles carry the highest average Frankenstein score — 6.52 — of any role bucket. Organizations using precise AI security language are simultaneously writing the most over-scoped roles. The "AI red team" label is a primary driver: it is applied to adversarial prompt testing, product risk assessment, governance review, platform security architecture, and abuse testing — interchangeably, in the same posting, against a single hire. A real AI red team exercise is a hands-on adversarial evaluation against a deployed model or agent system: crafting inputs that elicit unsafe behavior under realistic operational conditions, testing authorization boundaries at inference time, exploiting RAG retrieval pipelines to extract out-of-scope context, mapping tool-calling attack paths through multi-step agentic workflows. This requires lab environments, reproducible finding formats, and evaluators who understand probabilistic failure modes — not governance documentation skills. What most "AI red team" postings actually describe is risk program management or product security review with an adversarial framing — a categorically different function requiring a categorically different profile.
"A real AI red team exercise means crafting adversarial inputs against a deployed model, testing authorization boundaries at inference time, and exploiting RAG retrieval pipelines under realistic attack conditions. Most 'AI red team' postings describe governance review with a red-team brand applied to it."
What leaders are misreading
Label precision is assumed from title vocabulary. "Red team" is assumed to mean active adversarial evaluation. It frequently means risk review with a red-team brand. The candidate who gets hired is selected for governance fluency; the organization then discovers it cannot conduct an adversarial AI exercise.
Failure mode if unaddressed
Organizations build "AI red team" programs that produce governance documentation rather than adversarial findings. The program exists; the adversarial capability does not. Security posture remains unmeasured at exactly the layer the label promised to test.
What this changes now
Finding 13 · Legacy framework dominance
Legacy compliance frameworks dominate AI security hiring language by 39:1 versus AI-native governance frameworks — GDPR alone outweighs all AI-native frameworks combined.
Regulatory readiness programs were already funded and staffed when AI security emerged as a hiring need. Organizations reached for the framework vocabulary they already had — GDPR, HIPAA, SOC 2, FedRAMP — rather than developing AI-native governance language.
Named framework adoption in AI security hiring — 2026
Cyan = legacy compliance and privacy frameworks · Violet = AI-native governance frameworks. GDPR (5,461) alone exceeds EU AI Act + NIST AI RMF + MITRE ATLAS + ISO 42001 combined (317) by ~17×.
Named framework data makes the disparity concrete: GDPR appears in 5.5K job postings, HIPAA in 3.1K, SOC 2 in 1.9K, FedRAMP in 1.2K, PCI DSS in 694. Against this: EU AI Act appears in 189 postings, NIST AI RMF in 91, MITRE ATLAS in 22, ISO/IEC 42001 in 15. The 39:1 ratio is a direct measure of which vocabulary organizations actually use when staffing AI security. Legacy compliance frameworks produce familiar evidence artifacts — audit reports, attestation letters — that organizations already know how to generate. AI-native governance frameworks require evidence types most organizations cannot yet produce.
"GDPR in 5,461 postings. NIST AI RMF in 91. Organizations are using the vocabulary they already own to staff a risk they do not yet understand."
What leaders are misreading
Framework density in hiring language is interpreted as AI security readiness. A posting requiring GDPR and SOC 2 expertise is not an AI security posting. It is a compliance posting with an AI prefix.
Failure mode if unaddressed
Teams ship AI systems with governance narratives built on legacy framework compliance, while leaving AI-specific control surfaces — model behavior, agent authorization, prompt security — unmeasured and unevidenced.
What this changes now
Finding 14 · Incumbent tooling lock-in
Traditional security tooling appears in AI security hiring at 8:1 versus AI-native evaluation and observability tooling — the tooling stack mirrors the compliance reflex exactly.
Organizations procure through existing trust paths and repurpose familiar tooling rather than adopting AI-specific testing stacks. Each individual decision is rational. The collective result is a security program whose tool vocabulary is calibrated for compliance audit, not AI behavioral risk.
Named tool adoption in AI security hiring — 2026
Cyan = traditional security tools (SIEM, detection, AppSec) · Violet = AI-native evaluation and observability tools. Combined AI-native total (479) trails combined incumbent total by 4:1.
The named tool picture is unambiguous: Splunk appears in 1.1K AI security job postings, Falco in 485, Semgrep in 141, Snyk in 137. Against these: LangSmith in 260, Langfuse in 145, Ragas in 49, DeepEval in 25. At the category level the gap is larger — Detection and Response tooling totals 3.7K job mentions versus 552 for all AI-native evaluation and observability tools combined — 8:1. Splunk detects attacker behavior in logs. It does not evaluate model output distribution, detect prompt injection at inference time, or assess delegated-action authorization posture. These are structurally different capabilities requiring different tooling — and that tooling is almost absent from AI security hiring language.
What leaders are misreading
Tool familiarity is treated as AI risk coverage. SIEM and AppSec scanners address what attackers do to infrastructure. They do not address what AI systems do under adversarial prompt conditions or what agents do when tool-call authorization is absent.
Failure mode if unaddressed
Security programs build audit-ready portfolios with familiar tooling while systematically underinvesting in AI behavioral evaluation. The coverage gap is structurally invisible inside existing security tool dashboards — it will surface first as an incident.
What this changes now
Finding 15 · Early but accelerating risk surface
Prompt injection, function calling, and tool calling security signals are still under 0.3% of all postings, but rising quickly from a near-zero baseline.
Deployment of tool-calling and function-calling systems is outpacing hiring-market adaptation. AI agents are being built and shipped; the hiring market does not yet have language for the controls those agents require.
Agentic attack surface signal job frequency
Total job postings containing each agentic attack surface signal. Combined total is under 1,300 mentions against 294K+ analyzed postings — but the growth rate is the signal.
Combined agentic attack surface signals total 1205 job mentions — approximately 0.24% of the analyzed corpus. Function Calling leads at 278, Prompt Injection at 258, Tool Calling at 236. Jailbreak (75), Model Drift (64), and Data Poisoning (54) follow. These numbers look small. They are the leading edge of a surface that is growing faster than the hiring market can name it. The appropriate response is not to wait until the numbers are large — by then the deployment gap will be measured in years, not months.
"Agentic systems in production are already at scale. Agentic security hiring is at near-zero. The deployment-to-hiring gap is measured in years, not quarters."
What leaders are misreading
Low absolute share is interpreted as low urgency. The urgency signal is in the deployment-to-hiring gap, not the absolute numbers. Agentic systems in production are already at scale; agentic security hiring is at near-zero.
Failure mode if unaddressed
Delegated-action controls lag deployment and become latent high-impact risk. By the time the hiring market normalizes agentic security vocabulary, the first generation of agentic systems will have been operating for years without appropriate controls.
What this changes now
Ten industry verticals analyzed across AI security hiring signal, role breadth, and control specificity. Each profile represents the pattern visible in job description language — not company-level security posture.
AI security role breadth score by industry — 2026
Government and Defense leads in role breadth (6.79), meaning job descriptions bundle the most capability families. Manufacturing has the highest volume (100K+ jobs). Financial Services has the largest scoped-security hiring market.
Signal: Control frameworks, model risk validation, regulatory vocabulary, auditability.
Blind spot: Weak AI-specific testing workflows and tooling language.
Hire for: Governance Evidence Lead + AI Product Security Engineer pairing.
Model RiskRegulatoryEvidence-firstSignal: RMF, ATO requirements, assurance controls, mission resilience, formal risk language.
Blind spot: Low specificity on contemporary AI evaluation and agentic control practices.
Hire for: AI Security Architect + ML Security Engineer pairing.
Compliance-heavyAssuranceRMF/ATOSignal: PHI controls, HIPAA, clinical-risk vocabulary, sensitive data handling.
Blind spot: Low operational detail on adversarial testing, RAG validation, and safety/security integration.
Hire for: Hybrid profile: privacy control depth + practical AI abuse testing.
PHI/HIPAAClinical RiskPrivacy-denseSignal: Reliability, continuity, OT context, safety and mission-impact language.
Blind spot: Generic LLM-attack narratives have low relevance; lifecycle and operational controls dominate.
Hire for: AI Security Architect with industrial systems depth.
OT/ICSOperational SafetyResilienceSignal: RAG access, agents, model APIs, evals, safety controls, rapid deployment.
Blind spot: Explicit mapping to formal control frameworks and evidence standards.
Hire for: Agent Security Engineer + Governance Evidence Lead early.
AgenticRAGFast-movingSignal: RAG access boundaries, tenant isolation, customer data controls, enterprise trust.
Blind spot: Role inflation mixing product security ownership with governance program design.
Hire for: AI Product Security Engineer + RAG Security Engineer as distinct mandates.
Multi-tenantProduct SecuritySaaS TrustSignal: Evaluation rigor, detection quality, customer assurance, field credibility.
Blind spot: Conflating adversarial testing, product assurance, and solutions-facing roles.
Hire for: Field-deployable AI security experts who can build and explain.
CredibilityAdversarialProduct AssuranceSignal: Governance, decision integrity, fraud risk, compliance obligations.
Blind spot: Low specificity on adversarial testing and practical red-team workflows.
Hire for: Model Risk Security Partner with explicit technical validation support.
Decision RiskFraudModel GovernanceSignal: Fraud prevention, account abuse, customer-data protection, operational detection.
Blind spot: Architecture-level delegated-action controls under-specified.
Hire for: Agent Security Engineer + AI Product Security Engineer collaboration.
Fraud/AbuseCustomer TrustAgenticSignal: Detection, automation, response, network resilience, high-volume control operations.
Blind spot: Under-specified governance-to-product execution pathways.
Hire for: AI AppSec Engineer + Agent Security Engineer with operations maturity.
ScaleDetectionOperationsNine distinct AI security engineering archetypes, each with explicit mission scope, trigger conditions, boundary definitions, danger signals, and first-90-day deliverables. Use these as role design inputs — not job description templates.
AI security job volume by role family — 2026
How current hiring distributes across role families. ML/AI Engineering leads by volume (30K+) but carries lower AI-security specificity. The 9 archetypes below map to these families — each with a clean scope that the market rarely writes cleanly.
Which archetype do you need?
Answer one question about your primary use case. The right archetype follows directly.
| Your primary use case | The one question to ask | Right archetype |
|---|---|---|
| Shipping AI features in a product | Are AI features secure before they reach production? | AI Product Security Engineer |
| Embedding AI security in the dev process | Do developers know how to write AI-safe code? | AI AppSec Engineer |
| Proving your AI systems are exploitable (before attackers do) | Can you run an adversarial exercise against your deployed model today? | AI Red Team Engineer |
| Securing AI agents that take real actions | Who owns authorization and rollback for delegated-action workflows? | Agent Security Engineer |
| Controlling what data RAG can surface | Can a query return data the user isn't authorized to see? | RAG Security Engineer |
| Producing evidence for governance obligations | Can you show the board a control that proves the policy works? | Governance Evidence Lead |
| Securing model development and deployment | Who owns artifact integrity and training pipeline security? | ML Security Engineer |
| Converting model risk into security controls | Do model risk assessments produce enforceable control requirements? | Model Risk Security Partner |
| Defining AI security standards across teams | Is there a shared trust model and control ownership map across all AI teams? | AI Security Architect |
Risk domain ownership matrix
● Primary domain · ○ Contributing · — Not in scope. Use to identify coverage gaps and avoid ownership conflicts.
| Archetype | Prompt Security | RAG / Retrieval | Agent / Action Auth | Model Lifecycle | Gov Evidence | Adversarial Testing | Product / SDLC | Architecture |
|---|---|---|---|---|---|---|---|---|
| AI Product Security Engineer | ● | ● | ○ | — | ○ | ○ | ● | ○ |
| AI AppSec Engineer | ● | ○ | — | — | — | ● | ● | — |
| AI Red Team Engineer | ● | ● | ● | — | — | ● | — | — |
| Agent Security Engineer | ● | — | ● | — | ○ | ● | ○ | ○ |
| RAG Security Engineer | ○ | ● | — | — | ○ | ● | — | — |
| Governance Evidence Lead | — | — | — | ○ | ● | — | — | ○ |
| ML Security Engineer | — | — | ○ | ● | ○ | — | ○ | ○ |
| Model Risk Security Partner | — | ○ | — | ● | ● | ○ | — | — |
| AI Security Architect | ○ | ○ | ○ | ○ | ● | — | ○ | ● |
AI Product Security Engineer
High demandSenior ICSecure AI-enabled product capabilities from design through release and post-release operation.
Boundary
Does not own enterprise-wide governance strategy or model lifecycle outside product scope.
Anti-pattern to avoid
Overloaded with policy ownership and customer-assurance narrative without implementation authority.
First 90-day outputs
Product threat model set, AI feature control backlog, release-gate checklist, customer assurance pack.
AI AppSec Engineer
High demandMid to Senior ICIntegrate AI abuse patterns and controls into secure SDLC practice.
Boundary
Does not own broad AI governance program design.
Anti-pattern to avoid
Measured on generic vulnerability volume instead of AI-specific control outcomes.
First 90-day outputs
AI abuse-case library, secure coding guardrails, review workflow updates, remediation playbook.
AI Red Team Engineer
Scarce — criticalSenior IC / StaffExecute adversarial evaluation against AI systems and corresponding controls.
Boundary
Does not own governance reviews or architecture assessments by default.
Anti-pattern to avoid
Labeled "red team" but scoped as policy review or general risk management.
First 90-day outputs
Adversarial scenario suite, reproducible finding format, retest protocol, control hardening recommendations.
Agent Security Engineer
Critical — emergingSenior IC / StaffSecure delegated-action pathways for tool-calling and autonomous workflows.
Boundary
Does not own all conversational safety policy and UX moderation concerns.
Anti-pattern to avoid
Confined to prompt-layer defenses while action authorization is left undefined.
First 90-day outputs
Delegated-action threat model, authorization matrix, approval and rollback design, audit trail requirements.
RAG Security Engineer
High — growingSenior ICEnforce retrieval integrity, context boundaries, and data-access control in RAG systems.
Boundary
Does not own full enterprise data-governance operations.
Anti-pattern to avoid
Treated as a prompt-engineering specialist instead of a retrieval-control engineer.
First 90-day outputs
Retrieval control architecture, access boundary tests, leakage detection checks, incident triage flow.
Governance Evidence Lead
High — regulatory pressureSenior IC / ManagerTranslate governance requirements into verifiable engineering evidence.
Boundary
Does not own implementation of every control — owns the evidence standard.
Anti-pattern to avoid
Assigned reporting responsibility without authority to enforce evidence quality.
First 90-day outputs
Policy-to-control matrix, artifact taxonomy, reporting cadence, assurance quality gates.
ML Security Engineer
Growing — nicheSenior ICSecure model development, packaging, deployment, and serving pathways.
Boundary
Does not own enterprise risk narrative or board reporting by default.
Anti-pattern to avoid
Reduced to inference-endpoint hardening while artifact lifecycle remains unmanaged.
First 90-day outputs
Model lifecycle control map, artifact integrity checks, deployment gate definitions, monitoring baselines.
Model Risk Security Partner
High in regulated verticalsSenior IC / DirectorConvert model risk language into enforceable security control requirements.
Boundary
Does not own standalone product engineering roadmap.
Anti-pattern to avoid
Trapped in governance language without implementation pathways.
First 90-day outputs
Risk-to-control mappings, validation criteria, escalation thresholds, evidence requirements.
AI Security Architect
High at scaleStaff / PrincipalDefine secure reference architecture across AI systems, data paths, and delegated-action components.
Boundary
Does not own day-to-day control operations.
Anti-pattern to avoid
Architecture ownership without decision rights over implementation standards.
First 90-day outputs
Architecture baseline, trust-boundary definitions, control ownership map, design-review rubric.
Four audiences. Each with a distinct failure mode and a distinct set of decisions that change posture. The common thread: evidence artifacts are the connective tissue between strategy and execution.
Audience
CISO
Common mistakes
Three decisions in 90 days
Minimum evidence artifacts
Early warning signal
Risk narratives repeat in board decks while the control evidence scorecard remains static quarter-over-quarter.
Audience
Hiring Managers
Common mistakes
Three decisions in 90 days
Minimum evidence artifacts
Early warning signal
Interview debrief feedback is high-variance across interviewers and not comparable. The loop is measuring different things.
Audience
Recruiters
Common mistakes
Three decisions in 90 days
Minimum evidence artifacts
Early warning signal
High top-of-funnel volume with low technical conversion rate at first interview. The screening criteria are not calibrated to the actual role.
Audience
Practitioners
Common mistakes
Three decisions in 90 days
Minimum evidence artifacts
Early warning signal
Strong technical screens, weak role-fit decisions. The gap is executive communication and organizational translation — and the inability to demonstrate AI-specific work beyond claimed familiarity.
For service providers & MSSPs
The 15 findings in this report describe a hiring market with a structural gap between exposure and staffing capacity. MSSPs and managed AI security service providers are positioned to fill this gap — but only with service delivery models explicitly designed for AI-specific control surfaces, not extensions of legacy compliance audit or traditional SOC offerings.
| Client Segment | Primary Exposure | Service Model |
|---|---|---|
| Mid-market (500–5K employees) | AI feature deployment without dedicated AI security staffing | Fractional AI Security Lead + quarterly control evidence review |
| AI-native companies (pre-IPO) | Rapid agentic deployment, RAG boundary risk, no governance evidence | Agent Security Engineering retainer + evidence artifact program |
| Enterprise (existing program) | Compliance-reflex programs missing AI-native control coverage | AI security program gap assessment + AI-native framework overlay |
| Regulated industries (FSI, Healthcare) | Evidence gap — governance language without evidence artifacts | Governance evidence production + board-reportable control scorecard |
The service provider hiring signal: Only 4 job postings in the 2026 corpus explicitly reference security training platform experience. Practical skills demonstration platforms (adversarial AI labs, cyber range environments, hands-on LLM security exercises) represent an emerging assessment infrastructure gap — the first providers to standardize AI security practical assessment will define the evaluation standard for the discipline.
What you wrote vs. what you actually need
Common job description language and what it actually signals about the role you're trying to fill.
| What you wrote in the JD | What you actually need | Right archetype |
|---|---|---|
| "Own the AI security program end-to-end" | A team charter, not a single role. Define one anchor domain first. | Start with AI Security Architect, then add specialists |
| "AI red team and governance program lead" | Two incompatible full-time profiles compressed into one requisition. | AI Red Team Engineer + Governance Evidence Lead — separate reqs |
| "Prompt engineering and AI security testing" | Behavioral red-teaming is not prompt engineering. The skillsets don't overlap. | AI Red Team Engineer with adversarial lab methodology |
| "GDPR, HIPAA, SOC 2 compliance for AI systems" | Legacy compliance ≠ AI security engineering. You're building a compliance posture, not an AI control posture. | Governance Evidence Lead — or reclassify as a compliance role |
| "Secure our LLM-powered chatbot" | Depends on "secure" — data access risk vs conversational safety vs agentic actions are different scopes. | RAG Security Engineer (data leak) or Agent Security Engineer (actions) |
| "ML pipeline security and model governance" | Lifecycle control and risk reporting are separate ownership surfaces. | ML Security Engineer + Model Risk Security Partner — sequential or parallel |
| "Experience with Splunk and AI threat modeling" | Detection tooling and AI behavioral risk are different coverage planes requiring different profiles. | Two roles: traditional security + AI Product Security Engineer |
| "5+ years experience in AI security" | The discipline is 3 years old at scale. You're describing a market that doesn't exist yet. | Rewrite as capability-domain experience, not time-in-title |
Role design red flags — quick reference
Any single job description containing three or more of the following signals is exhibiting the Frankenstein pattern. Use this as a pre-posting checklist.
| Red Flag Signal | What it indicates | The design fix |
|---|---|---|
| Five or more "and" connectives in requirements | Chimera Spec — multiple capability families compressed into one role | Define one primary ownership domain. Move secondary capabilities to "preferred" or adjacent team scope. |
| "AI red team" without named exercise types | Red Team Misnomer — label applied to undefined scope | Name three specific adversarial exercise types the hire must execute. If you can't name them, you need a governance reviewer, not a red teamer. |
| GDPR + HIPAA + SOC 2 with no AI-native framework | Compliance Reflex — legacy vocabulary without AI-specific coverage | Add at least one AI-native governance reference (NIST AI RMF, EU AI Act, MITRE ATLAS) or reframe as a compliance role, not AI security. |
| "Experience with Splunk/SIEM" as primary AI security tool | Tool Incumbency Trap — runtime detection coverage mistaken for AI behavioral risk coverage | Add at least one AI-native evaluation or observability tool requirement. Distinguish infrastructure security from AI behavioral security. |
| 5+ years experience in a discipline that is 2 years old | Unicorn Index — experience requirements impossible to satisfy at stated scope | Rewrite experience requirements around capability domains and demonstrated outputs, not time-in-market. |
| "Responsible for building and maintaining the AI security program" as a single-contributor role | vCISO Vacuum in reverse — team-shaped mandate on individual budget | Scope the first hire to one anchor deliverable. Use a phased staffing plan, MSSP support, or fractional coverage for the remaining program surface. |
Tool-market mapping complements job-description intelligence by showing the ecosystem teams actually evaluate and deploy. This section provides public-safe aggregate coverage metrics and directional product-landscape context.
The tools layer is ecosystem intelligence, not endorsement. Coverage is designed to support CISO vendor scanning, hiring-manager tooling literacy, and practitioner comparison workflows. For interactive exploration, use the tools directory at /tools.
A market thesis derived from a single source is an opinion. This section is the evidence layer: seven independent signal sources — practitioner surveys, academic research, open-source builder activity, industry media, public knowledge codification, vulnerability intelligence, and framework-control intelligence — each arriving at the same conclusion through entirely different mechanisms. The hiring corpus tells you what companies say they need. These signals tell you what researchers are studying, what builders are shipping, what practitioners are experiencing, what the press is amplifying, where exploit pressure is visible in public disclosure, and how control frameworks map in practice. When all seven describe the same structural gap, the convergence is the argument.
Evidence layer at a glance
Primary Research — Practitioner Survey
The survey layer is the only signal in this report that asks practitioners directly: what are you experiencing? Four survey instruments — CISOs and security leaders, AI security practitioners, recruiters and hiring managers, and adjacent security engineers — plus a flash assessment collected first-person responses across the same dimensions the hiring corpus measures at a distance. The results are striking for how closely they mirror the corpus signal. 27% report no clear AI security owner in their organization — the practitioner's version of the ownership vacuum Finding 07 documents in job description language. Self-reported program maturity sits at 2/5 (emerging band), confirming that executive awareness has outrun engineering delivery, not the reverse. 57% of respondents recognize AI Security Engineering as an emerging distinct discipline, yet describe hiring practices built on AppSec and GRC assumptions. The most-cited practitioner risk — Data leakage via AI at 38% — maps directly to the agentic attack surface language in 0.24% of job postings. Practitioners can name the threat. The hiring market has not yet built the role that owns it.
Top practitioner-cited AI security risks — cross-persona survey
arXiv — Research Momentum
Academic research is an independent leading indicator: what researchers publish today becomes practitioner vocabulary in 12–24 months, and hiring language in 24–36. The arXiv signal measures where that leading edge is. A seeded metadata pull across eight AI security taxonomy buckets — analyzing 2,730 papers by title, abstract, and category — surfaces the same domain topology as the hiring corpus, without any shared data. Top matched term: privacy-preserving (67 papers). Largest classified research bucket: prompt and generation security. The most recent month in the dataset (2026-05) shows 1036 papers — part of a sustained acceleration trajectory. This is not coincidence. Researchers are studying the problems that practitioners cite as critical risks and that hiring language is only beginning to name. The gap between academic term frequency and hiring-language adoption is the predictive signal: the concepts with high arXiv velocity but low hiring-corpus density are the next generation of required skills.
arXiv matched term frequency — seeded AI security pull
GHArchive — Builder Ecosystem
Open-source activity is the market's most honest signal: builders allocate time to things they believe matter, independent of employer mandate or press framing. The GHArchive signal tracks 2500 classified GitHub repositories across eight AI security domains — 26,662 total event proxies (stars, forks, pull requests, issues) from Unknown-dominated tooling. The result: active builder ecosystems exist across every AI security taxonomy bucket without exception. No domain is purely academic. The top topic tag is vellum-ai/vellum-assistant, consistent with Finding 06 (Agentic Anarchy) and Finding 15 (The Agentic Surface Emergence) — the threat the hiring market has barely noticed is the one that builders have been building defenses for. Governance/assurance engineering and detection/runtime monitoring show the highest unique-actor density, which also maps to the governance hiring signal dominance documented in Finding 08 (Boardroom-to-Backlog Gap). Builders are ahead of the hiring market in every domain where the hiring market lags. That is the normal order — until organizations start hiring to catch up.
Unique GitHub contributors by AI security domain — GHArchive
Active GHArchive repos by month
GHArchive event type mix (AI-security-scoped stream)
Collaboration intensity by bucket (avg actors per repo)
Review pressure by bucket (review-to-push ratio)
Event concentration risk by bucket (top-10 share)
Classifier evidence strength by bucket
Control artifact signals by bucket
Release cadence by bucket
Media — Industry Coverage
Industry media shapes board-level mental models before any hiring signal reaches them. 613,416 items from aggregated RSS/Atom feeds — major tech media, security outlets, AI lab blogs — classified against the same AI security taxonomy. Of the classified volume (6.1% of total; the remainder is general tech coverage without AI security signal), AI model research and AI cyber defense dominate, with secure AI SDLC and governance following. This distribution explains a specific failure mode documented in Finding 13 (The Compliance Reflex) and Finding 08 (Boardroom-to-Backlog Gap): when boards read about AI security, they read about model capability risks and regulatory frameworks — not about control engineering, evidence artifacts, or ownership structures. The media vocabulary that reaches boardrooms is calibrated for awareness, not for the operational decisions that CISOs and hiring managers actually need to make. The result is governance language that drives policy creation without driving control creation. The hiring corpus reflects the same bias: governance-shaped role language crowding out engineering-shaped role language, in the same proportion media shapes board perception.
Media volume by AI security theme — classified items only
Wikimedia — Knowledge Codification
Public knowledge codification is a discipline-maturity clock. When a concept acquires a Wikipedia article, a Wikidata entity, a taxonomy entry in public knowledge graphs, it has crossed from practitioner jargon into institutional vocabulary. The codification lag — the delay from practice emergence to public encoding — is an independent proxy for how far behind educational infrastructure, recruiting knowledge, and junior candidate preparation sit relative to the frontier. AI security subfields with strong codification coverage signal that knowledge infrastructure exists to train and credential the next generation. Subfields with thin codification are where hiring managers are demanding senior experience for problems that have no curriculum, no certifications, and no standardized vocabulary yet — which is the direct precondition for Finding 11 (Entry-Level Extinction). The discipline cannot build junior pipelines for concepts that are not yet in the knowledge infrastructure that junior candidates use to learn and recruiters use to screen.
Vulnerability Intelligence — CVE & Advisory Disclosures
Public vulnerability disclosures are the market's most unambiguous signal: when a CVE is published against an AI/ML product or framework, the attack surface is no longer theoretical. This layer aggregates 26K+ records from NIST NVD, GitHub Advisory Database (GHSA), and OSV.dev, classified for AI/ML relevance using a two-stage pipeline: product/package name matching against a 35+ package dictionary, followed by keyword-weighted scoring across 21 semantic buckets from the MITRE ATLAS taxonomy. Of 26K+ total records, 1,458 were classified as AI-relevant at confidence ≥ 0.5. 3 appear in the CISA KEV — confirming active exploitation in the wild. The dominant classification bucket is ai ml framework library vulnerability (378 records), directly mapping to the LLM application frameworks that practitioners cite as their top risk surface and that arXiv research focuses on most intensely.
The vulnerability signal is structurally different from the other signal layers: it does not measure what people say they need, what researchers are studying, or what builders are shipping. It measures where attackers have already found exploitable weaknesses. When the CVE distribution aligns with practitioner-cited risks and arXiv research focus across independent classification taxonomies, the convergence moves from directional to evidential. The exploits exist. The question is whether the organizations deploying AI have staffed the roles capable of remediating them — and the hiring corpus documents the answer.
Framework Intelligence — Control and Mapping Coverage
Framework intelligence adds a different validation lens: not attack pressure, but control-language interoperability. This layer tracks public framework assets across MITRE ATLAS, NIST AI RMF, OWASP LLM Top 10, and related governance references; currently 8 frameworks with 42 directional crosswalk mappings. The coverage split — 3 machine-readable vs 5 document-only frameworks — is itself operationally important. Teams cannot automate control validation where framework assets are only narrative text. Crosswalk density and domain coverage show where organizations can build control traceability today versus where taxonomy translation is still heuristic. This is directional signal, not official equivalence mapping.
Framework crosswalk coverage by control domain
Framework source retrieval status
What convergence means for claim confidence
Each of these seven signal layers is independent. They share a taxonomy, but not a dataset, a methodology, or an institutional source. The hiring corpus comes from employer language. The survey comes from practitioner experience. arXiv comes from academic attention. GHArchive comes from builder behavior. Media comes from editorial selection. Wikimedia comes from knowledge community consensus. Vulnerability intelligence comes from public CVE/advisory disclosures. Framework intelligence comes from public control-framework assets and directional crosswalk analysis. When seven independent systems all map the same topology of problems — agentic surface exposure with no control ownership, governance vocabulary without engineering delivery, discipline emergence without junior pipeline — that convergence is not confirmation bias. It is structural evidence. The central claim of this report is not that the hiring corpus suggests a gap. It is that seven independent systems are measuring the same gap from seven different angles, and the measurements agree.
Twelve AI security terms every hiring team needs to know. These are the vocabulary gaps between what hiring managers write and what AI security engineers actually do.
This section defines what this report can and cannot claim, how each signal layer was constructed, and the limits within which its evidence is reproducible. These are not disclaimers — they are operational guardrails. A finding is only as useful as the clarity about what data produced it and what inferences it can support.
Data sources & multi-signal approach
This report triangulates across seven independent signal layers: (1) ATS job corpus — 293,846+ job descriptions from 5,350 companies spanning 2013–2026, with primary analysis weight on the 241,553 postings from 2026; (2) Primary survey research — four survey instruments across CISOs, practitioners, recruiters, and adjacent engineers; (3) arXiv research momentum — seeded metadata from 903+ academic papers; (4) GHArchive builder ecosystem — 2,500 classified repos and 26,662 scoped event proxies in the current ingest window; (5) Media/news corpus — 776K+ items from aggregated RSS/Atom feeds; (6) Wikimedia knowledge codification — concept maturation tracking via public knowledge artifacts; (7) Vulnerability intelligence — normalized CVE/advisory streams with AI-relevance classification and KEV cross-reference. Each signal layer is independently classified against the same AI security taxonomy.
ATS corpus & primary data collection
The ATS job corpus is based on structured analysis of 293,846+ job descriptions (5,350 companies) collected from major applicant tracking systems (ATS) spanning 2013–2026, with primary analysis weight on the 241,553 postings from the 2026 period where AI security signal density is highest. The corpus includes roles across the full AI security adjacent spectrum: explicitly labeled AI security roles, traditional security roles (AppSec, cloud security, penetration testing), software engineering roles with security signals, and general software engineering roles used as a baseline comparator. Job descriptions were deduplicated by content hash and normalized for signal extraction.
AI-native companies, cybersecurity vendors, and financial services are over-represented in the hiring corpus relative to their share of total employment — reflecting their higher job posting volume and ATS adoption rates. Findings account for this concentration where relevant.
Signal extraction & taxonomy
| Signal Domain | What it measures | Coverage |
|---|---|---|
| Role breadth (Frankenstein score) | Distinct capability families bundled in a single posting — product security, governance, adversarial testing, lifecycle control, agentic controls | All 294K+ postings |
| Framework mentions | Named governance, compliance, and AI-native security frameworks cited in role language (GDPR, NIST AI RMF, EU AI Act, MITRE ATLAS, ISO 42001, etc.) | All 294K+ postings |
| Tool mentions | Named security and AI-native tools cited in role language — detection, AppSec, LLM observability, model evaluation, guardrails | All 294K+ postings with quality filters |
| Agentic surface signals | Vocabulary specific to agent-layer attack surfaces: prompt injection, function calling, tool calling, RAG boundary, jailbreak, model drift, data poisoning | All 294K+ postings |
| Evidence language | Governance obligation language vs. evidence-producing language (telemetry requirements, eval output requirements, remediation proof, attestation) | Framework-tagged postings |
| Seniority distribution | Role-family breadth scores stratified by seniority level | AI security-labeled roles |
Tool signal quality note: Tool mention signals use substring dictionary matching. Signals with high false-positive risk (where the tool name appears as a substring of common English words) are excluded from named tool analysis. Category-level tool signals aggregate across multiple tools and are more robust than individual tool name signals.
Claim boundaries
| Claim Level | Definition | Use |
|---|---|---|
| Public Claim Ready | Direct quantitative signal from job description corpus. Reproducible from the extraction taxonomy. | External report, media, decks. Cite with stated corpus caveats. |
| Public Claim with Caveat | Corpus pattern analysis. Directional signal, not precise measurement. Individual-company variation is real. | Directional assertions, with explicit "based on hiring language" qualifier. |
| Internal Hypothesis Only | Inferred or extrapolated beyond corpus evidence. Not validated by job description signal. | Research agenda, further study. Not suitable for external publication. |
What this data can support: Market-level patterns in hiring language · Framework and tool adoption signals · Role breadth and capability-family trends · Year-over-year signal growth rates
What this data cannot support: Company-level security maturity assessments · Individual practitioner capability evaluations · Proof that any company has or lacks any specific security control · Claims about actual deployed AI systems or security incidents
Sponsor independence: Research methodology, signal taxonomy, and findings are independent of sponsor involvement. Sponsors receive access to findings; they do not influence finding selection, framing, or data interpretation.
aisecurity.llc research
The State of AI Security Engineering
This report provides job-description intelligence and aggregate benchmark signals. Findings reflect role-language evidence, not company-level maturity proof.