# ZeroLeaks Agent Context

ZeroLeaks helps teams test AI systems before deployment. The product runs automated AI red-team scans against system prompts, deployed agents, tool definitions, and agent skills. Primary risk areas are prompt extraction, direct prompt injection, indirect prompt injection, tool hijacking, multi-turn grooming, unauthorized action execution, and sensitive data leakage.

## Product Capabilities

- Prompt Security Scan: tests a pasted or API-submitted system prompt against TAP-style extraction attacks and prompt-injection probes.
- AgentGuard: tests a live deployed agent endpoint through HTTP using extraction, injection, tool hijacking, indirect injection, authority exploitation, protocol exploit, multi-turn, and data-leakage probes.
- Skill Security Scan: reviews SKILL.md packages or archives for prompt-injection risks, trust-boundary issues, and behavior changes.
- Reports: stores severity-ranked findings, overall scores, component scores, recommendations, and conversation logs.
- GitHub Integration: scans pull requests that modify prompts or agent behavior.
- Public Gauntlet: benchmark leaderboard for AI model prompt-injection robustness.

## Public Agent Discovery

- Homepage: https://www.zeroleaks.ai
- Docs: https://www.zeroleaks.ai/docs
- OpenAPI service description: https://www.zeroleaks.ai/openapi.json
- API catalog: https://www.zeroleaks.ai/.well-known/api-catalog
- A2A agent card: https://www.zeroleaks.ai/.well-known/agent-card.json
- Agent Skills index: https://www.zeroleaks.ai/.well-known/agent-skills/index.json
- Markdown homepage: request https://www.zeroleaks.ai/ with the header Accept: text/markdown
- Short LLM context: https://www.zeroleaks.ai/llms.txt
- Full LLM context: https://www.zeroleaks.ai/llms-full.txt
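
As a minimal sketch of the Markdown homepage entry above, the request can be built with Python's standard library. The helper name build_markdown_request is hypothetical, and the block only constructs the request; it does not send it.

```python
from urllib.request import Request

HOMEPAGE_URL = "https://www.zeroleaks.ai/"


def build_markdown_request(url: str = HOMEPAGE_URL) -> Request:
    """Build (but do not send) a GET request that asks for Markdown."""
    return Request(url, headers={"Accept": "text/markdown"}, method="GET")


request = build_markdown_request()
# Pass `request` to urllib.request.urlopen(...) to actually fetch the page.
```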

## Stateful Workflow Pattern

ZeroLeaks APIs use explicit state handles. A create endpoint returns an identifier such as scanId, agentConfigId, or workflowRunId. The agent should keep that identifier in conversation state and poll the corresponding status endpoint, then either fetch the final report once the job completes or cancel the job.

Use this general policy:

1. Read https://www.zeroleaks.ai/openapi.json for the latest request and response schemas.
2. Choose the workflow that matches the user's goal.
3. Create the resource and store the returned identifier.
4. Poll with exponential backoff until the status is completed, failed, or canceled.
5. Fetch the report endpoint when completed.
6. If a user changes their mind while a scan is running, call the cancel endpoint.
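
Step 4 of the policy above could be sketched roughly as follows. The function name, the delay values, and the attempt cap are assumptions chosen for illustration, not documented ZeroLeaks settings; fetch_status stands in for whatever call reads the status field from the relevant status endpoint.

```python
import time
from typing import Callable

# Terminal statuses named in the policy above.
TERMINAL_STATUSES = {"completed", "failed", "canceled"}


def poll_until_terminal(
    fetch_status: Callable[[], str],
    initial_delay: float = 1.0,   # assumed starting delay in seconds
    max_delay: float = 30.0,      # assumed cap on the delay
    factor: float = 2.0,          # delay multiplier after each poll
    max_attempts: int = 20,       # assumed safety limit
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll until a terminal status is seen, doubling the wait each time."""
    delay = initial_delay
    for _ in range(max_attempts):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        sleep(delay)
        delay = min(delay * factor, max_delay)
    raise TimeoutError(f"status not terminal after {max_attempts} polls")
```

Passing sleep as a parameter keeps the helper easy to exercise without real waiting; an agent would normally leave the default in place.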

## Workflow: Prompt Security Scan

Purpose: test a system prompt or agent instruction set for prompt extraction and injection vulnerabilities.

Authentication: a better-auth session, a Bearer API key with the zl_live_ prefix, or a validated wallet session, depending on account setup.

Create:

```http
POST /api/scan
Authorization: Bearer zl_live_...
Content-Type: application/json
```

Body fields:

- systemPrompt: required string, minimum 10 characters.
- scanMode: extraction, injection, dual, sandbox, or full. Default is dual.
- targetModel: optional target model identifier.
- temperature: optional number from 0 to 1.
- reasoningEffort: optional low, medium, or high.
- knowledgeProfile: baseline, production, or research. Default is production.
- attackSurfaces: optional list including direct_chat, indirect_content, tool_calling, mcp, repo_ci, rag_vector, multi_turn.
- userTools: optional tool definitions for sandbox or full scans.
- autoDetectTools: optional boolean for sandbox or full scans.
- workspaceId: optional workspace scope.
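
A client may want to check a few of these constraints before sending the request. The sketch below covers only some of the fields listed above; build_scan_body is a hypothetical helper, and the server-side validation described in openapi.json remains authoritative.

```python
from typing import Any, Dict, Optional

SCAN_MODES = {"extraction", "injection", "dual", "sandbox", "full"}
KNOWLEDGE_PROFILES = {"baseline", "production", "research"}


def build_scan_body(
    system_prompt: str,
    scan_mode: str = "dual",
    knowledge_profile: str = "production",
    temperature: Optional[float] = None,
) -> Dict[str, Any]:
    """Build a POST /api/scan body, checking the documented constraints."""
    if len(system_prompt) < 10:
        raise ValueError("systemPrompt must be at least 10 characters")
    if scan_mode not in SCAN_MODES:
        raise ValueError(f"scanMode must be one of {sorted(SCAN_MODES)}")
    if knowledge_profile not in KNOWLEDGE_PROFILES:
        raise ValueError(
            f"knowledgeProfile must be one of {sorted(KNOWLEDGE_PROFILES)}"
        )
    body: Dict[str, Any] = {
        "systemPrompt": system_prompt,
        "scanMode": scan_mode,
        "knowledgeProfile": knowledge_profile,
    }
    if temperature is not None:
        if not 0 <= temperature <= 1:
            raise ValueError("temperature must be between 0 and 1")
        body["temperature"] = temperature
    return body


body = build_scan_body("You are a helpful support assistant.", temperature=0.2)
```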

Response state:

- scanId identifies the scan.
- workflowRunId identifies the background workflow.
- processingMethod has the value workflow.

Poll:

```http
GET /api/scan/{scanId}
```

Completion:

```http
GET /api/report/scan/{scanId}
```

Cancel:

```http
POST /api/scan/{scanId}/cancel
```

Expected statuses include pending, running, completed, failed, and canceled. When completed, fetch the report before summarizing findings to the user.

## Workflow: Deployed Agent Scan

Purpose: test an already deployed AI agent through its HTTP endpoint.

Step 1: create an endpoint config.

```http
POST /api/agent-guard/configs
Content-Type: application/json
```

Body fields:

- name: required label.
- endpointUrl: required public HTTP or HTTPS endpoint.
- authMethod: none, bearer, api_key, or custom_header.
- authValue: required unless authMethod is none.
- authHeaderName: required for custom_header.
- requestFormat: optional method, body template, prompt field, and response field.
- description: optional endpoint description.
- tools: optional array of tool objects with name and description.
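
The conditional rules for authValue and authHeaderName can be checked before the config is submitted. This is an illustrative sketch only: validate_agent_config is a hypothetical helper, and it assumes authMethod is always given explicitly, which the field list above does not state.

```python
from typing import Any, Dict, List
from urllib.parse import urlparse

AUTH_METHODS = {"none", "bearer", "api_key", "custom_header"}


def validate_agent_config(config: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in an endpoint config body."""
    problems: List[str] = []
    if not config.get("name"):
        problems.append("name is required")
    scheme = urlparse(config.get("endpointUrl", "")).scheme
    if scheme not in ("http", "https"):
        problems.append("endpointUrl must be an HTTP or HTTPS URL")
    method = config.get("authMethod")
    if method not in AUTH_METHODS:
        # Assumption: this sketch requires an explicit authMethod.
        problems.append(f"authMethod must be one of {sorted(AUTH_METHODS)}")
    else:
        if method != "none" and not config.get("authValue"):
            problems.append("authValue is required unless authMethod is none")
        if method == "custom_header" and not config.get("authHeaderName"):
            problems.append("authHeaderName is required for custom_header")
    return problems
```

An empty list means the body satisfies the rules checked here; the endpoint may still reject it for reasons this sketch does not cover.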

Step 2: start an AgentGuard scan.

```http
POST /api/agent-scan
Content-Type: application/json
```

Body:

- agentConfigId: required config id returned by the previous step.

Step 3: poll the scan.

```http
GET /api/agent-scan/{scanId}
```

Step 4: cancel if needed.

```http
POST /api/agent-scan/{scanId}/cancel
```

AgentGuard reports component scores for prompt security, tool safety, multi-turn resilience, and data leakage. A multi-turn probe is a sequence of messages in which the agent's final response is evaluated against attack success indicators.

## Workflow: Skill Security Scan

Purpose: review AI agent skill packages for behavioral and prompt-injection risk.

Authentication: bearer token configured for the skills scan API.

Start:

```http
POST /api/skills-scan
Authorization: Bearer ...
Content-Type: application/json
```

Body fields:

- source: source URL for a skill package or repository path.
- skill: optional skill name.
- delivery: async or sync. Default is async.
- webhookUrl: optional HTTPS callback for async delivery.
- webhookSecret: optional shared secret for webhook verification.
- toolProfile: readonly, standard, or networked.
- mode: review, risk, behavior, or full.
- maxFiles, maxBytes, behaviorTrials, behaviorAdaptiveLimit, includeAssets, timeoutMs: optional scan controls.

Poll:

```http
GET /api/skills-scan/{scanId}
```

For sync delivery, POST /api/skills-scan returns the report directly. For async delivery, it returns scanId and pollUrl.
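
The two delivery modes lead to different next steps, which might be handled along these lines. The helper name and the example identifier and URL values are hypothetical; only the scanId and pollUrl field names come from the description above.

```python
from typing import Any, Dict, Tuple


def next_step_after_start(
    delivery: str, response: Dict[str, Any]
) -> Tuple[str, Dict[str, Any]]:
    """Decide what to do with the POST /api/skills-scan response body."""
    if delivery == "sync":
        # Sync delivery: the response body is the report itself.
        return ("report", response)
    if delivery == "async":
        # Async delivery: keep scanId in state and poll pollUrl.
        return (
            "poll",
            {"scanId": response["scanId"], "pollUrl": response["pollUrl"]},
        )
    raise ValueError("delivery must be 'async' or 'sync'")


# Hypothetical response values, for illustration only.
async_response = {
    "scanId": "example-scan-id",
    "pollUrl": "/api/skills-scan/example-scan-id",
}
action, state = next_step_after_start("async", async_response)
```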

## Error Recovery

Agents should inspect JSON error fields before retrying. Common errors:

- 400: validation issue. Read the error message, fix the named field, and retry once.
- 401: missing or invalid authentication. Ask the user for a valid session or API key.
- 403: subscription or workspace limit. Explain the limit and avoid repeated retries.
- 404: resource not found. Verify the scanId or config id from the previous workflow step.
- 429: rate limit. Wait for the reset time or retry interval indicated in the response before retrying.
- 500 or 503: service configuration or worker error. Retry later unless the response says configuration is missing.
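
The list above can be read as a small decision table. The sketch below encodes it; the Action names are hypothetical, and detecting a missing configuration by searching the error message for the word "configuration" is an assumed heuristic, since the exact error format is not specified here.

```python
from enum import Enum


class Action(Enum):
    FIX_FIELD_AND_RETRY_ONCE = "fix_field_and_retry_once"
    ASK_FOR_CREDENTIALS = "ask_for_credentials"
    EXPLAIN_LIMIT = "explain_limit"
    VERIFY_IDENTIFIER = "verify_identifier"
    WAIT_THEN_RETRY = "wait_then_retry"
    RETRY_LATER = "retry_later"
    REPORT_MISSING_CONFIGURATION = "report_missing_configuration"


def recovery_action(status_code: int, error_message: str = "") -> Action:
    """Map an HTTP error status to the recovery step listed above."""
    if status_code == 400:
        return Action.FIX_FIELD_AND_RETRY_ONCE
    if status_code == 401:
        return Action.ASK_FOR_CREDENTIALS
    if status_code == 403:
        return Action.EXPLAIN_LIMIT
    if status_code == 404:
        return Action.VERIFY_IDENTIFIER
    if status_code == 429:
        return Action.WAIT_THEN_RETRY
    if status_code in (500, 503):
        # Assumed heuristic: look for "configuration" in the message.
        if "configuration" in error_message.lower():
            return Action.REPORT_MISSING_CONFIGURATION
        return Action.RETRY_LATER
    raise ValueError(f"no documented recovery step for status {status_code}")
```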

## Platform Verification Status

ZeroLeaks exposes machine-readable agent resources and public coding-agent instructions. Verified marketplace listings for ChatGPT, Claude, Gemini, Cursor, Goose, or VS Code depend on external platform review processes. Until those submissions are approved, agents should treat the integration metadata here as public discovery and workflow documentation rather than a marketplace verification badge.