The AI attack surface is every entry point, component, and data flow an adversary can target in an AI system. The average enterprise AI deployment now has 14.3 distinct attack surface components, up from 3.2 in 2023 (Gartner AI Security Survey, 2025). That is a 347% expansion in two years. Traditional application security is necessary but not sufficient. AI introduces new vulnerability classes that most security teams have never tested for.

The McKinsey Lilli breach (February 2026, 46.5 million messages exposed) exploited unauthenticated API endpoints. CVEs issued in 2025 for Claude Code (CVSS 8.7), GitHub Copilot (RCE), and Cursor (arbitrary code execution) prove that the tools used to build AI applications are themselves attack surfaces. This guide maps every component: APIs, RAG pipelines, system prompts, vector stores, agent tools, AI coding assistants, and supply chain dependencies.

AI API Security

AI APIs are the primary external interface for most AI deployments, serving as the gateway between users (or other systems) and the underlying models, data stores, and agent capabilities. They represent the most immediately exploitable component of the AI attack surface.

Unauthenticated Endpoints

The most devastating AI API vulnerability is the simplest: exposing functional endpoints without authentication. The McKinsey Lilli breach was enabled by unauthenticated API endpoints that provided direct access to backend databases and OpenAI API integrations. According to CodeWall’s 2026 AI Security Report, 41% of enterprise AI deployments have at least one unauthenticated API endpoint that exposes sensitive functionality.

Why this happens in AI deployments:

  • AI applications are often built rapidly by data science teams without security engineering review
  • Internal-facing AI tools are later exposed externally without authentication being added
  • Microservice architectures create internal APIs that are inadvertently exposed through misconfigurated API gateways or load balancers
  • Prototype AI chatbots deployed “temporarily” without authentication become permanent fixtures

Testing approach:

  1. Enumerate all API endpoints through discovery (swagger/OpenAPI docs, JavaScript analysis, traffic interception)
  2. Test each endpoint without authentication credentials
  3. Test with invalid, expired, or other users’ credentials
  4. Check for API endpoints that bypass authentication through alternative paths (direct model endpoint access, WebSocket connections, GraphQL introspection)

Defensive recommendations:

  • Require authentication on every endpoint, including internal and “read-only” endpoints
  • Implement API gateway authentication at the perimeter, not at the application level
  • Use short-lived tokens with automatic rotation
  • Conduct regular API inventory audits to identify shadow AI APIs

Rate Limiting and Abuse Prevention

AI API endpoints are expensive to operate (each request consumes GPU compute and incurs per-token costs) and can be exploited for denial of service, cost manipulation, or data extraction at scale.

Common vulnerabilities:

  • No rate limiting on AI inference endpoints (enabling model extraction through massive query volumes)
  • Per-user rate limits that can be bypassed through account creation or credential sharing
  • No cost caps on AI API usage, enabling billing attacks
  • Rate limits applied to the API gateway but not to WebSocket or streaming endpoints

Impact examples:

  • An attacker querying an unprotected embedding endpoint millions of times can extract a functionally equivalent copy of the embedding model
  • Without cost caps, a prompt injection that triggers recursive tool calls can generate thousands of dollars in API charges within minutes
  • Rate-limit-free endpoints enable large-scale data extraction through enumeration queries

API Key and Secret Management

AI applications often require API keys for multiple services: LLM providers (OpenAI, Anthropic), vector databases, embedding services, and tool integrations. Each key represents a potential attack vector.

Common vulnerabilities:

  • API keys hardcoded in client-side code, configuration files, or CI/CD pipelines
  • Shared API keys across environments (development, staging, production)
  • API keys with excessive permissions (admin-level keys used for inference-only applications)
  • No key rotation policy or expiration dates
  • API keys exposed through AI-generated code (LLMs trained on code containing actual API keys)

Real-world example: In the McKinsey Lilli breach, access to unauthenticated API endpoints exposed the OpenAI API integration, potentially allowing the attacker to make API calls using McKinsey’s OpenAI organization credentials — a direct financial and data security exposure.

RAG Pipeline Vulnerabilities

Retrieval-Augmented Generation (RAG) has become the standard architecture for grounding LLM responses in organizational knowledge. However, RAG introduces a complex pipeline — document ingestion, embedding, vector storage, retrieval, and context injection — where each stage presents unique attack vectors.

Cross-Tenant Data Leakage

In multi-tenant RAG deployments, inadequate isolation between tenants can allow one user to access another’s documents through carefully crafted queries.

Attack mechanism:

  1. Attacker crafts queries that are semantically similar to the target tenant’s documents
  2. The vector similarity search retrieves documents from the target tenant’s namespace
  3. The LLM includes retrieved content in its response, leaking cross-tenant data

Why this happens:

  • Vector databases often use namespace or partition-based isolation, which can be misconfigured
  • Embedding models may create similar vectors for semantically related content across tenants
  • Metadata-based filtering (tenant ID) can be bypassed if the filter is applied at the application layer rather than the database layer
  • Shared embedding models mean cross-tenant documents exist in the same vector space

Testing approach:

  1. Create documents with known content in one tenant
  2. Query from a different tenant using semantically similar queries
  3. Test with direct vector IDs if the API exposes them
  4. Test metadata filter bypass techniques

Document Injection and Poisoning

Attackers who can upload or modify documents in a RAG knowledge base can inject content that manipulates AI responses for all users.

Attack scenarios:

  • Injecting documents containing prompt injection payloads that override the system prompt when retrieved
  • Uploading documents optimized to rank highly in similarity searches for specific queries
  • Modifying existing documents to inject misinformation or malicious instructions
  • Uploading documents that trigger downstream vulnerabilities (XSS, SSRF) when their content is included in AI responses

McKinsey Lilli impact: The breach exposed 3.68 million RAG chunks — the entire internal knowledge base. Had an attacker injected poisoned documents before the breach was discovered, those documents could have influenced AI responses for all 57,000 users.

Defensive recommendations:

  • Implement strict access controls on document ingestion pipelines
  • Validate and sanitize all documents before embedding
  • Monitor RAG knowledge base for unauthorized modifications
  • Implement document provenance tracking and integrity verification
  • Use role-based access control for document retrieval, not just ingestion

Embedding Extraction and Inversion

Vector embeddings stored in RAG systems can potentially be extracted and inverted to reconstruct the original documents.

Attack mechanism:

  1. Access the vector database through API or direct database connection
  2. Extract embedding vectors for target documents
  3. Use embedding inversion techniques to reconstruct approximate document content

Research status: Embedding inversion attacks have demonstrated partial reconstruction of source text from embeddings with 60-80% fidelity (Morris et al., 2023). While not perfect, this is sufficient to extract sensitive information, key phrases, and document structure.

McKinsey Lilli impact: 266,000+ OpenAI vector store entries were accessible — each representing embedded representations of internal documents that could potentially be inverted to reconstruct the original content.

System Prompt Extraction and Manipulation

The system prompt is the foundation of an AI application’s behavior, containing its instructions, constraints, persona definition, tool access configurations, and often sensitive implementation details.

Extraction Techniques

System prompts can be extracted through multiple vectors:

Conversational extraction: Asking the model to reveal its instructions through various framings — direct requests, role-playing, translation, summarization, and multi-turn approaches. Research from Perez and Ribeiro (2022) found that basic conversational extraction succeeds against 60-80% of LLM applications.

Behavioral inference: Even when direct extraction fails, the system prompt’s content can be inferred by testing the model’s behavioral boundaries — what it will and won’t do, what format it responds in, what topics it avoids.

Side-channel extraction: Analyzing response patterns, latency variations, and token probabilities to infer system prompt content.

What System Prompts Reveal

Extracted system prompts frequently expose:

  • Internal API endpoint URLs and authentication patterns
  • Database schema information and query patterns
  • Business logic and decision criteria
  • Safety guardrail implementation details (which enable targeted bypass)
  • Tool/function calling configurations
  • Internal role definitions and access control logic
  • References to internal systems, codenames, and proprietary methods

Manipulation Through Injection

Beyond extraction, attackers can manipulate the effective system prompt through:

  • Prompt injection: Overriding or supplementing system instructions through user input
  • Context window displacement: Pushing the system prompt out of the effective context window through long inputs
  • Configuration file injection: In AI coding tools, modifying configuration files that function as system prompts (e.g., .claude files, .cursorrules, Copilot workspace settings)

Vector Store Exposure

Vector databases (Pinecone, Weaviate, Qdrant, Chroma, pgvector) are the backbone of RAG systems, storing embedded representations of an organization’s knowledge base. They represent a high-value target because they contain semantically searchable representations of all documents the AI system has access to.

Direct Database Access

If vector databases are exposed without proper authentication or network isolation, attackers can:

  • Enumerate all stored vectors and metadata
  • Extract embedding vectors for offline analysis
  • Modify or delete vectors to poison the knowledge base
  • Access metadata (document titles, sources, timestamps) that reveals organizational information

McKinsey Lilli: 266,000+ Vector Store Entries

The McKinsey Lilli breach exposed over 266,000 OpenAI vector store entries. These entries represented embedded versions of internal McKinsey documents, client materials, and proprietary research. The vector store access alone — even without the additional 3.68 million RAG chunks and 46.5 million chat messages — represented a massive intellectual property exposure.

Lessons for vector store security:

  • Vector databases must be treated with the same security controls as primary data stores
  • Network isolation is essential — vector databases should not be accessible from the internet
  • Authentication and authorization must be enforced at the database level, not just the application level
  • Encryption at rest and in transit is mandatory for vector stores containing sensitive data
  • Regular access audits should monitor who and what is querying the vector database

Vector Store Enumeration Attacks

Even without direct database access, attackers can enumerate vector store contents through the AI application:

  1. Submit systematically varied queries to map the knowledge base contents
  2. Analyze retrieval results to identify document boundaries and categories
  3. Use semantic similarity to navigate from known to unknown content areas
  4. Extract metadata (document names, sources, dates) from AI responses

AI Coding Tools as Attack Surface

AI coding tools — Claude Code, GitHub Copilot, Cursor, Windsurf, and others — represent a rapidly expanding and fundamentally new attack surface. These tools have direct access to local file systems, can execute commands, modify code, interact with APIs, and operate with the full permissions of the user running them.

The Scale of the Problem

Claude Code is not just a coding tool — companies use it for LinkedIn scraping, CRM automation, financial data processing. Non-developers make up >50% of usage at major companies. This means that AI coding tools with system-level access are being used by individuals who may not understand the security implications of the permissions they are granting.

According to GitHub’s 2025 Octoverse report and McKinsey’s 2024 Global AI Survey, the majority of developers in enterprise environments now use AI coding assistants. These tools operate with the developer’s full system permissions — read/write access to all files, ability to execute arbitrary commands, access to environment variables containing API keys and credentials, and connectivity to internal networks.

CVE-2025-59536: Claude Code Configuration Injection (CVSS 8.7)

Affected versions: Claude Code prior to 1.0.17 Vulnerability type: Prompt injection via configuration files and project context Attack vector: Malicious .claude configuration files, CLAUDE.md instruction files, or crafted project files

Attack scenario:

  1. Attacker creates a repository containing malicious configuration files
  2. Files contain prompt injection payloads embedded in code comments, documentation, or configuration
  3. Victim clones the repository and runs Claude Code
  4. Claude Code ingests the malicious configuration as part of its context
  5. Injected instructions cause Claude Code to execute arbitrary commands, exfiltrate credentials, or modify files

Impact: Arbitrary command execution on the victim’s machine, credential theft, file modification, data exfiltration.

Fix: Claude Code 1.0.17 introduced permission prompts for file operations and command execution, plus sandboxing improvements. However, the fundamental risk — that AI coding tools process untrusted project files as context — remains an architectural challenge.

CVE-2025-53773: GitHub Copilot Remote Code Execution

Vulnerability type: Prompt injection leading to arbitrary code execution Attack vector: Crafted code comments, documentation, or project files

Code comments, docstrings, README files, and issue descriptions can contain prompt injection payloads that manipulate Copilot’s code suggestions. In certain IDE configurations with auto-execute capabilities, malicious code suggestions translate directly to code execution on the developer’s machine.

Broader implication: Every open-source repository, Stack Overflow answer, documentation page, and issue thread that a developer references while using Copilot is a potential injection vector.

CVE-2026-21852: Cursor AI Agent Mode Arbitrary Code Execution

Affected component: Cursor IDE Agent Mode Vulnerability type: Prompt injection enabling arbitrary code execution through AI agent mode Attack vector: Malicious project files, documentation, or .cursorrules configurations

Cursor’s Agent Mode operates with elevated permissions to create, modify, and execute files autonomously. The CVE demonstrated that crafted project files could inject instructions that Agent Mode would follow, leading to arbitrary code execution with the user’s full system permissions.

Key difference from previous CVEs: Cursor’s Agent Mode is explicitly designed to take autonomous actions — creating files, installing packages, running commands — which means successful prompt injection immediately translates to system compromise without any intermediate step.

Supply Chain Risks from AI Tool Configurations

AI coding tools introduce a new supply chain attack vector through configuration files:

ToolConfiguration FileRisk
Claude Code.claude, CLAUDE.mdProject-level instruction injection
Cursor.cursorrules, .cursor/ directoryRule injection, agent mode manipulation
GitHub Copilot.github/copilot-instructions.mdSuggestion manipulation
Windsurf.windsurfrulesAgent behavior override
Aider.aider.conf.ymlTool configuration manipulation

These configuration files are typically:

  • Committed to version control (and thus present in cloned repositories)
  • Not audited by standard security scanning tools
  • Trusted by default by the AI coding tools that read them
  • Written in natural language (making malicious instructions indistinguishable from legitimate ones)

Attack pattern:

  1. Attacker contributes a pull request to an open-source project that adds or modifies an AI tool configuration file
  2. The PR appears to contain helpful AI assistant instructions
  3. The configuration file contains embedded prompt injection that activates when developers use AI tools in the project
  4. Every developer who clones the project and uses the affected AI tool is exposed

Defensive recommendations:

  • Treat AI tool configuration files as security-sensitive (like .bashrc or Makefile)
  • Review AI configuration files in code review with the same scrutiny as code changes
  • Use AI coding tools with minimal permissions and explicit approval for high-risk actions
  • Maintain allow-lists of trusted AI tool configurations
  • Never clone untrusted repositories and immediately run AI coding tools on them

For a detailed analysis of AI coding tools as an attack surface, see the RedTeamPartner.com analysis of AI coding tool security.

AI Agent Frameworks

AI agents — systems that can plan, use tools, browse the web, execute code, and take actions autonomously — represent the highest-risk expansion of the AI attack surface. Each capability granted to an agent is a capability that an attacker can potentially hijack through prompt injection or other manipulation.

Agent Tool-Use Exploitation

When an AI agent has access to tools (APIs, databases, file systems, code execution), prompt injection can trigger unauthorized tool use:

Tool CapabilityExploitation ScenarioImpact
Database queriesPrompt injection triggers data exfiltration via SQLData breach
File system accessInjected instructions read/modify sensitive filesData theft, code tampering
Code executionManipulated agent executes malicious codeSystem compromise
Web browsingAgent navigates to attacker-controlled sitesCredential phishing, malware
Email sendingInjected instructions send phishing or data to attackerSocial engineering, data exfiltration
API callsAgent makes unauthorized calls to internal/external APIsPrivilege escalation, lateral movement

Multi-Agent Chain Vulnerabilities

Complex AI deployments use multiple agents that collaborate on tasks — a planning agent delegates to specialized agents for research, coding, communication, and data analysis. This creates chain-of-trust vulnerabilities:

  1. Prompt injection in one agent’s input can propagate instructions to downstream agents
  2. Output from a compromised agent becomes trusted input for subsequent agents
  3. Each agent in the chain may have different permissions, enabling privilege escalation through the chain
  4. Monitoring and logging may not capture inter-agent communication

Agentic RAG Vulnerabilities

Agentic RAG systems — where an AI agent dynamically decides what documents to retrieve, how to process them, and what actions to take based on retrieval results — compound the risks of both RAG vulnerabilities and agent tool-use:

  • The agent decides which queries to send to the vector database (attackable through query manipulation)
  • Retrieved documents can contain prompt injection that redirects the agent’s subsequent actions
  • The agent may have write access to the knowledge base, enabling self-reinforcing poisoning loops

Full AI Attack Surface Map

The following table provides a complete inventory of AI attack surface components:

ComponentAttack VectorsExample CVE/IncidentRisk Level
User-facing chat interfacePrompt injection, jailbreakingUniversalHigh
AI API endpointsAuthentication bypass, rate limit abuseMcKinsey Lilli (2026)Critical
System promptExtraction, manipulation, overrideUniversalHigh
RAG knowledge baseDocument injection, cross-tenant leakageMcKinsey Lilli (2026)Critical
Vector databaseDirect access, enumeration, extractionMcKinsey (266K+ entries)Critical
Embedding modelExtraction, inversion, adversarial inputsResearch-stageMedium
LLM modelJailbreaking, extraction, poisoningUniversalHigh
Agent toolsUnauthorized tool use, privilege escalationMcKinsey (SQL write access)Critical
AI coding tool configsConfiguration injectionCVE-2025-59536, CVE-2026-21852High
Code suggestionsMalicious code generationCVE-2025-53773High
Training pipelineData poisoning, backdoor injectionResearch + real-worldHigh
Fine-tuning dataTargeted poisoningResearch-stageMedium
Plugin/extension ecosystemMalicious plugins, supply chainMultiple reportsHigh
Monitoring/loggingLog injection, monitoring evasionEmergingMedium
Model hosting infrastructureTraditional infra attacksStandard infra CVEsMedium

Key Takeaways

  1. The AI attack surface has expanded 347% since 2023 (Gartner, 2025), with the average enterprise AI deployment now having 14.3 distinct attack surface components.

  2. Unauthenticated AI API endpoints remain the most immediately exploitable vulnerability, as the McKinsey Lilli breach demonstrated.

  3. RAG pipelines introduce cross-tenant leakage, document poisoning, and embedding extraction risks that traditional application security does not address.

  4. AI coding tools (Claude Code, Copilot, Cursor) are a fundamentally new attack surface category, with critical CVEs demonstrating arbitrary code execution through prompt injection in configuration files and project context.

  5. AI coding tools are used far beyond coding — companies use them for scraping, CRM automation, and data processing, with non-developers comprising over 50% of usage at major enterprises.

  6. Vector stores must be secured with the same rigor as primary databases, as the McKinsey breach’s 266,000+ exposed vector store entries demonstrated.

  7. AI agent frameworks represent the highest-risk attack surface expansion because successful prompt injection translates directly to unauthorized tool use and system compromise.

  8. Defense requires mapping every component of the AI attack surface and applying appropriate controls to each — there is no single solution that addresses all vectors.

Sources and References

  • Gartner. “AI Security Survey: State of Enterprise AI Protection.” 2025.
  • CodeWall. “AI Security Report 2026.” 2026.
  • Olsen, Chris (xyzeva). “McKinsey Lilli: Technical Analysis.” February 28, 2026.
  • Anthropic. “Claude Code Security Advisory: CVE-2025-59536.” 2025.
  • GitHub. “GitHub Copilot Security Advisory: CVE-2025-53773.” 2025.
  • Cursor. “Security Advisory: CVE-2026-21852.” 2026.
  • GitHub. “Octoverse 2025: The State of Open Source and AI.” 2025.
  • Morris, John X. et al. “Text Embeddings Reveal (Almost) As Much As Text.” 2023.
  • Perez, Fábio and Ribeiro, Ian. “Ignore This Title and HackAPrompt.” 2022.
  • OWASP. “OWASP Top 10 for Large Language Model Applications, v2.0.” 2025.
  • MITRE. “ATLAS: Adversarial Threat Landscape for Artificial-Intelligence Systems.” 2025.