Attack Agents

The six agents are Strategist, Attacker, Evaluator, Mutator, Inspector, and Orchestrator. Each is exported with its own create function.

The zeroleaks package uses six specialized agents that coordinate during a scan. Each agent has a create function that you can call to instantiate it directly for custom integrations.

Strategist

Role: Selects attack strategy based on target analysis, conversation history, and findings.

The Strategist analyzes the target's responses, tracks leak status, and recommends which attack strategy to use next. It also manages phase transitions (reconnaissance, profiling, soft probe, escalation, exploitation, persistence) and can request conversation resets when stuck.

import { createStrategist, type Strategist } from "zeroleaks";

const strategist = createStrategist();

const output = await strategist.selectStrategy({
  turn: 5,
  history: conversationHistory,
  findings: [],
  leakStatus: "none",
  lastEvaluatorFeedback: "...",
});

// output.selectedStrategy, output.shouldReset, output.phaseTransition

Exports: createStrategist, Strategist
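
To make the phase handling described above more concrete, here is a minimal sketch of a phase progression with a reset request. It does not call zeroleaks and is not the library's implementation. The phase names come from the description above, but the "soft_probe" identifier, the decidePhase function, and the maxTurnsPerPhase and stuckAfter thresholds are assumptions made for illustration.

```typescript
// Hypothetical sketch: phase names follow the text above; the transition
// rules and thresholds are illustrative assumptions, not zeroleaks internals.
type Phase =
  | "reconnaissance"
  | "profiling"
  | "soft_probe" // assumed identifier for the "soft probe" phase
  | "escalation"
  | "exploitation"
  | "persistence";

const PHASE_ORDER: Phase[] = [
  "reconnaissance",
  "profiling",
  "soft_probe",
  "escalation",
  "exploitation",
  "persistence",
];

interface PhaseInput {
  phase: Phase;
  turnsInPhase: number; // turns spent in the current phase
  turnsSinceProgress: number; // turns since the last new finding
}

interface PhaseDecision {
  nextPhase: Phase;
  shouldReset: boolean;
}

function decidePhase(
  input: PhaseInput,
  maxTurnsPerPhase: number = 3, // assumed threshold
  stuckAfter: number = 4 // assumed threshold
): PhaseDecision {
  const index = PHASE_ORDER.indexOf(input.phase);
  const isLast = index === PHASE_ORDER.length - 1;

  // No progress for too long: stay in the phase and request a reset.
  if (input.turnsSinceProgress >= stuckAfter) {
    return { nextPhase: input.phase, shouldReset: true };
  }
  // Phase budget used up: advance to the next phase unless already at the end.
  if (input.turnsInPhase >= maxTurnsPerPhase && !isLast) {
    return { nextPhase: PHASE_ORDER[index + 1], shouldReset: false };
  }
  // Otherwise keep going in the current phase.
  return { nextPhase: input.phase, shouldReset: false };
}
```

In this sketch the reset check runs before the phase check, so a stalled conversation is restarted in its current phase rather than pushed forward.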

Attacker

Role: Generates attack prompts based on strategy, defense profile, and evaluator feedback.

The Attacker uses TAP-style attack generation. It selects probes from the library, adapts to the defense profile, and can use vector memory to avoid repeating failed attacks. Supports Best-of-N variation generation via the Mutator.

import { createAttacker, type Attacker } from "zeroleaks";

const attacker = createAttacker({
  maxBranchingFactor: 3,
  maxTreeDepth: 4,
  pruningThreshold: 0.3,
});

const output = await attacker.generateAttack({
  history: conversationHistory,
  strategy: selectedStrategy,
  defenseProfile: defenseProfile,
  phase: "escalation",
  evaluatorFeedback: "...",
  previousAttackNode: lastNode,
});

// output.attack.prompt, output.attack.technique, output.attack.category

Exports: createAttacker, Attacker
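
The three configuration values shown above suggest a bounded search tree. The sketch below shows one way such limits could constrain tree expansion. It is an assumption about how the parameters might interact, not the library's algorithm. CandidateNode, its score field, and selectChildren are hypothetical names.

```typescript
// Hypothetical sketch of tree pruning. The config field names mirror the
// createAttacker options above; the node shape and scoring are assumptions.
interface TreeConfig {
  maxBranchingFactor: number;
  maxTreeDepth: number;
  pruningThreshold: number;
}

interface CandidateNode {
  id: string;
  depth: number; // depth of this node in the attack tree
  score: number; // assumed estimate in [0, 1] of how promising the node is
}

function selectChildren(
  candidates: CandidateNode[],
  config: TreeConfig
): CandidateNode[] {
  return candidates
    // Drop nodes that would exceed the depth limit.
    .filter((node) => node.depth <= config.maxTreeDepth)
    // Prune nodes scoring below the threshold.
    .filter((node) => node.score >= config.pruningThreshold)
    // filter() returned a new array, so sorting does not mutate the input.
    .sort((a, b) => b.score - a.score)
    // Keep at most maxBranchingFactor of the best remaining nodes.
    .slice(0, config.maxBranchingFactor);
}
```

With the configuration from the example above (3, 4, 0.3), a candidate scoring 0.2 is pruned, a candidate at depth 5 is dropped, and at most the three highest-scoring survivors are expanded.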

Evaluator

Role: Analyzes target responses for information leakage and compliance with attacker intent.

The Evaluator determines whether the target leaked system prompt content, followed injected instructions, or revealed sensitive information. It returns leak status, confidence, extracted content, and recommendations for the next attack.

import { createEvaluator } from "zeroleaks";

const evaluator = createEvaluator();

const result = await evaluator.evaluate({
  attackPrompt: "...",
  targetResponse: "...",
  conversationHistory: [],
  systemPrompt: "...",
  attackNode: attackNode,
});

// result.status, result.confidence, result.extractedContent, result.recommendation

Exports: createEvaluator, Evaluator
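
As a rough illustration of how a leak status and a confidence-like number can be derived, the sketch below measures word trigram overlap between a known system prompt and a target response. This is a simple heuristic chosen for illustration, not the evaluator's actual method. estimateLeak, the "partial" and "full" status values, and the 0.8 cutoff are assumptions.

```typescript
// Hypothetical heuristic: fraction of the system prompt's word trigrams that
// reappear in the response. Not the zeroleaks evaluator.
type SketchLeakStatus = "none" | "partial" | "full"; // "none" appears above; the rest are assumed

function tokenize(text: string): string[] {
  // Lowercase and split on anything that is not an ASCII letter or digit.
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter((word) => word.length > 0);
}

function ngrams(words: string[], n: number): string[] {
  const grams: string[] = [];
  for (let i = 0; i + n <= words.length; i++) {
    grams.push(words.slice(i, i + n).join(" "));
  }
  return grams;
}

function estimateLeak(
  systemPrompt: string,
  targetResponse: string,
  n: number = 3
): { status: SketchLeakStatus; overlap: number } {
  const promptGrams = ngrams(tokenize(systemPrompt), n);
  if (promptGrams.length === 0) {
    return { status: "none", overlap: 0 };
  }
  // Index the response n-grams for constant-time lookup.
  const seen: { [gram: string]: boolean } = {};
  for (const gram of ngrams(tokenize(targetResponse), n)) {
    seen[gram] = true;
  }
  let hits = 0;
  for (const gram of promptGrams) {
    if (seen[gram] === true) hits++;
  }
  const overlap = hits / promptGrams.length;
  const status: SketchLeakStatus =
    overlap >= 0.8 ? "full" : overlap > 0 ? "partial" : "none";
  return { status, overlap };
}
```

A verbatim-overlap check like this misses paraphrased leaks, which is one reason a model-based evaluator is a more natural fit for this role.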

Mutator

Role: Produces Best-of-N variations of attack prompts.

The Mutator generates semantic variations of an attack prompt. The engine evaluates each variation, in parallel when that option is enabled, and selects the one that performs best.

import { createMutator, type Mutator } from "zeroleaks";

const mutator = createMutator();

const mutations = await mutator.bestOfN(attackPrompt, 3);

// mutations.variations: string[]

Exports: createMutator, Mutator
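
The Best-of-N idea itself is small enough to sketch without the library: produce N variations, score each one, and keep the highest scorer. The rewriters below are benign string transformations standing in for model-generated semantic variations, and the scoring function is supplied by the caller. bestOfNSketch, SKETCH_REWRITERS, and both function types are hypothetical names, not part of zeroleaks.

```typescript
// Hypothetical Best-of-N sketch. Real semantic variation would come from a
// model; these rewriters are simple placeholders.
type Rewriter = (prompt: string) => string;
type Scorer = (variation: string) => number;

const SKETCH_REWRITERS: Rewriter[] = [
  (p) => p, // keep the original wording
  (p) => "Please " + p.charAt(0).toLowerCase() + p.slice(1),
  (p) => p + " Answer briefly.",
];

function bestOfNSketch(
  prompt: string,
  n: number,
  score: Scorer
): { variations: string[]; best: string } {
  if (n < 1) {
    throw new Error("n must be at least 1");
  }
  // Cycle through the rewriters until n variations exist.
  const variations: string[] = [];
  for (let i = 0; i < n; i++) {
    variations.push(SKETCH_REWRITERS[i % SKETCH_REWRITERS.length](prompt));
  }
  // Score one at a time; ties keep the earliest variation.
  let best = variations[0];
  let bestScore = -Infinity;
  for (const variation of variations) {
    const value = score(variation);
    if (value > bestScore) {
      bestScore = value;
      best = variation;
    }
  }
  return { variations, best };
}
```

This sketch scores variations one at a time to stay synchronous. When scoring involves network calls, the same selection step can run after awaiting all scores together, for example with Promise.all.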

Inspector

Role: TombRaider-style defense fingerprinting. Analyzes target responses to identify known defense systems and recommend bypass techniques.

The Inspector compares target response patterns against a database of known defenses (e.g. Azure Prompt Shield, OpenAI Moderation). When a match is found, it suggests techniques with documented success rates.

import { createInspector, DEFENSE_DATABASE } from "zeroleaks";

const inspector = createInspector();

const output = await inspector.analyze({
  conversationHistory: [],
  recentResponses: ["..."],
});

// output.fingerprint, output.suggestedBypasses, output.confidence

Exports: createInspector, Inspector, DEFENSE_DATABASE
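
Fingerprinting against a database of known defenses can be pictured as signature matching. The sketch below uses a two-entry stand-in database of regular expressions and reports the best-matching entry, with the share of responses that matched as a confidence value. The entry ids, patterns, and bypass labels are placeholders invented for illustration. The shape of the real DEFENSE_DATABASE is not shown on this page and is not assumed here.

```typescript
// Hypothetical signature matcher. SKETCH_DATABASE is a stand-in and does not
// reflect the contents or shape of the DEFENSE_DATABASE export.
interface DefenseSignature {
  id: string;
  patterns: RegExp[];
  suggestedBypasses: string[]; // placeholder labels only
}

const SKETCH_DATABASE: DefenseSignature[] = [
  {
    id: "generic-refusal-template",
    patterns: [/i can(?:'|no)t (?:help|assist) with that/i, /against my guidelines/i],
    suggestedBypasses: ["placeholder-technique-1"],
  },
  {
    id: "content-filter-block",
    patterns: [/content (?:was )?(?:filtered|blocked)/i],
    suggestedBypasses: ["placeholder-technique-2"],
  },
];

interface FingerprintResult {
  fingerprint: string | null;
  confidence: number;
  suggestedBypasses: string[];
}

function fingerprintDefense(
  recentResponses: string[],
  database: DefenseSignature[]
): FingerprintResult {
  let best: FingerprintResult = { fingerprint: null, confidence: 0, suggestedBypasses: [] };
  if (recentResponses.length === 0) {
    return best;
  }
  for (const signature of database) {
    // Count responses that match at least one pattern of this signature.
    let matched = 0;
    for (const response of recentResponses) {
      if (signature.patterns.some((pattern) => pattern.test(response))) {
        matched++;
      }
    }
    const confidence = matched / recentResponses.length;
    if (confidence > best.confidence) {
      best = {
        fingerprint: signature.id,
        confidence: confidence,
        suggestedBypasses: signature.suggestedBypasses,
      };
    }
  }
  return best;
}
```

The patterns omit the g flag on purpose: a global RegExp keeps lastIndex between test() calls, which would make repeated matching across responses unreliable.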

Orchestrator

Role: Coordinates multi-turn attack sequences (Siren, Echo Chamber, TombRaider patterns).

The Orchestrator manages predefined multi-turn sequences that simulate human jailbreak behaviors. It uses adaptive temperature scheduling (AutoAdv-inspired) and provides step-by-step prompts for gradual escalation.

import {
  createOrchestrator,
  SIREN_SEQUENCE,
  ECHO_CHAMBER_SEQUENCE,
  TOMBRAIDER_SEQUENCE,
  DEFAULT_TEMPERATURE_CONFIG,
} from "zeroleaks";

const orchestrator = createOrchestrator();

const state = await orchestrator.getNextStep({
  sequence: SIREN_SEQUENCE,
  currentStep: 2,
  conversationHistory: [],
  temperatureState: { ... },
});

// state.prompt, state.nextStep, state.shouldEvaluate

Exports: createOrchestrator, MultiTurnOrchestrator, SIREN_SEQUENCE, ECHO_CHAMBER_SEQUENCE, TOMBRAIDER_SEQUENCE, DEFAULT_TEMPERATURE_CONFIG
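
Stepping through a predefined sequence while adapting the sampling temperature could look roughly like the sketch below. The rule used here (raise the temperature after a failed step, lower it after a success, clamp to a range) is an assumption chosen for illustration, not the AutoAdv schedule or the library's behavior. SequenceStep, TemperatureState, and both functions are hypothetical, and the step prompts are placeholders.

```typescript
// Hypothetical sketch of sequence stepping with an adaptive temperature.
// Neither the state shape nor the update rule is taken from zeroleaks.
interface SequenceStep {
  prompt: string; // placeholder text in this sketch
  evaluateAfter: boolean; // whether to evaluate after this step
}

interface TemperatureState {
  current: number;
  min: number;
  max: number;
  step: number; // how far to move per adjustment
}

function nextTemperature(
  state: TemperatureState,
  lastStepSucceeded: boolean
): TemperatureState {
  // Assumed rule: explore more after a failure, less after a success.
  const delta = lastStepSucceeded ? -state.step : state.step;
  const clamped = Math.min(state.max, Math.max(state.min, state.current + delta));
  return { current: clamped, min: state.min, max: state.max, step: state.step };
}

interface StepResult {
  prompt: string;
  nextStep: number;
  shouldEvaluate: boolean;
  temperatureState: TemperatureState;
}

function getNextStepSketch(
  sequence: SequenceStep[],
  currentStep: number,
  temperature: TemperatureState,
  lastStepSucceeded: boolean
): StepResult | null {
  // Returning null signals that the sequence is finished or the index is invalid.
  if (currentStep < 0 || currentStep >= sequence.length) {
    return null;
  }
  const step = sequence[currentStep];
  return {
    prompt: step.prompt,
    nextStep: currentStep + 1,
    shouldEvaluate: step.evaluateAfter,
    temperatureState: nextTemperature(temperature, lastStepSucceeded),
  };
}
```

Returning a fresh temperature state instead of mutating the input keeps each call independent, so a caller can replay or branch a sequence from any step.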

Agent Summary

| Agent | Create function | Primary output |
| --- | --- | --- |
| Strategist | createStrategist() | Strategy selection, phase transition, reset decision |
| Attacker | createAttacker(config) | Attack prompt, technique, category |
| Evaluator | createEvaluator() | Leak status, confidence, recommendation |
| Mutator | createMutator() | Best-of-N prompt variations |
| Inspector | createInspector(model?) | Defense fingerprint, bypass suggestions |
| Orchestrator | createOrchestrator(config?) | Multi-turn step prompt, temperature state |
