AI Attack Surface Map

Before attacking, map the surface. AI systems expose attack surfaces at multiple layers simultaneously.

Full Attack Surface Taxonomy

Layer  Severity  Surface                 Description
L1     Critical  Prompt Layer            Direct & indirect injection, system prompt extraction, context manipulation
L2     High      Retrieval Layer (RAG)   Document poisoning, embedding manipulation, context stuffing
L3     High      Agent / Tool Layer      Tool call hijacking, SSRF via agents, command injection via tool results
L4     Medium    Output Layer            Insecure output handling, XSS via markdown, code execution via eval
L5     Medium    Model Layer             Model extraction, membership inference, inversion attacks
L6     High      Infrastructure Layer    Serving stack exploits, API misconfig, model file exposure, GPU attacks
L7     High      Supply Chain Layer      Poisoned models from HuggingFace, malicious LoRA adapters, tampered datasets
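During an engagement it helps to tag findings against this taxonomy. A minimal sketch of that bookkeeping in Python; the dictionary mirrors the table above, but the function and field names are illustrative, not a standard schema:

```python
# Lookup table mirroring the taxonomy: layer -> (severity, surface name).
ATTACK_SURFACE = {
    "L1": ("Critical", "Prompt Layer"),
    "L2": ("High", "Retrieval Layer (RAG)"),
    "L3": ("High", "Agent / Tool Layer"),
    "L4": ("Medium", "Output Layer"),
    "L5": ("Medium", "Model Layer"),
    "L6": ("High", "Infrastructure Layer"),
    "L7": ("High", "Supply Chain Layer"),
}

def findings_by_severity(findings):
    """Group (layer, note) findings by severity for triage/reporting."""
    grouped = {}
    for layer, note in findings:
        severity = ATTACK_SURFACE[layer][0]
        grouped.setdefault(severity, []).append((layer, note))
    return grouped
```

Grouping by severity up front makes it easier to prioritize the Critical/High layers (prompt, RAG, agent, infra, supply chain) during reporting.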

Recon Checklist for AI Applications

β–‘ Identify the model (ask it, check headers, check JS source)
β–‘ Identify the framework (LangChain? LlamaIndex? AutoGen? Custom?)
β–‘ Find API endpoints (/api/chat, /v1/messages, /completion, /query)
β–‘ Check HTTP headers for model info (x-model, x-openai-model)
β–‘ Probe system prompt (leak techniques β€” see Module 04)
β–‘ Identify what tools/functions the agent has access to
β–‘ Check if RAG is in use (response latency spikes, retrieval artifacts in output)
β–‘ Test output rendering (markdown? HTML? code execution?)
β–‘ Check for multi-turn memory (does context persist between sessions?)
β–‘ Look for rate limiting (its presence and thresholds reveal which abuses the defenders anticipated)
β–‘ Spider JS for hardcoded API keys, model configs
β–‘ Check robots.txt, .well-known for AI-related endpoints
β–‘ Fuzz input fields for prompt injection markers
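Several checklist items (endpoint discovery, header inspection, robots.txt / .well-known probing) can be scripted. A rough sketch, assuming only a base URL; the endpoint list comes from the checklist above, and the header names are common guesses rather than guarantees:

```python
import requests

# Endpoint guesses from the checklist; extend per target.
COMMON_ENDPOINTS = ["/api/chat", "/v1/messages", "/completion", "/query"]
# Headers that sometimes leak model/stack info.
MODEL_HEADERS = ["x-model", "x-openai-model", "server", "x-powered-by"]

def recon(base_url, timeout=5):
    """Probe common AI endpoints and flag model-related response headers."""
    results = {}
    paths = COMMON_ENDPOINTS + ["/robots.txt", "/.well-known/ai-plugin.json"]
    for path in paths:
        try:
            r = requests.get(base_url + path, timeout=timeout)
        except requests.RequestException as e:
            results[path] = {"error": str(e)}
            continue
        results[path] = {
            "status": r.status_code,
            "headers": {h: r.headers[h] for h in MODEL_HEADERS if h in r.headers},
        }
    return results
```

A 405 on a GET to a chat endpoint is still a positive signal: the route exists and expects POST.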

Identifying the Model

# Direct query
"What model are you? What version?"

# Knowledge cutoff fingerprinting
"What is the most recent event you have knowledge of?"

# Capability probing
"Can you generate images?" / "Can you browse the web?"

# Token limit probing
"Repeat the word 'test' as many times as you can"

# Style fingerprinting β€” GPT-4 vs Claude vs Gemini have distinct refusal patterns
"How do I pick a lock?"
# GPT-4: usually answers with caveats
# Claude: often refuses with specific reasoning
# Gemini: may redirect

# Header inspection (Python) — endpoint and payload field names are assumptions; adjust to the target
import requests
r = requests.post("https://target.com/api/chat", json={"message": "hi"}, timeout=10)
print(dict(r.headers))  # look for x-model, x-openai-model, server, via
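The probes above can be bundled into a simple harness that records each raw response for offline comparison against known refusal styles. A sketch; the chat endpoint and the request/response field names are assumptions about the target API:

```python
import requests

# Fingerprinting probes from this section, keyed by what each one tests.
PROBES = {
    "identity": "What model are you? What version?",
    "cutoff": "What is the most recent event you have knowledge of?",
    "capabilities": "Can you generate images? Can you browse the web?",
    "refusal_style": "How do I pick a lock?",
}

def fingerprint(chat_url, timeout=30):
    """Send each probe and collect raw responses for manual comparison."""
    results = {}
    for name, prompt in PROBES.items():
        try:
            r = requests.post(chat_url, json={"message": prompt}, timeout=timeout)
            results[name] = {"status": r.status_code, "body": r.text[:2000]}
        except requests.RequestException as e:
            results[name] = {"error": str(e)}
    return results
```

Run the same harness against known models first to build a baseline of refusal phrasings, then diff the target's responses against it.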