# Testing Guide

Comprehensive guide for testing the Aragora codebase.
## Quick Start

```bash
# Run all tests
pytest tests/

# Run with verbose output
pytest tests/ -v

# Run specific test file
pytest tests/test_debate_convergence_comprehensive.py

# Run specific test class
pytest tests/test_debate_convergence_comprehensive.py::TestJaccardBackend

# Run specific test
pytest tests/test_debate_convergence_comprehensive.py::TestJaccardBackend::test_identical_texts
```
## Test Tiers

Use `scripts/test_tiers.sh` for common tiers:

| Tier | Command | Notes |
|---|---|---|
| fast | `scripts/test_tiers.sh fast` | Skips slow/load/e2e tests for rapid feedback |
| ci | `scripts/test_tiers.sh ci` | Mirrors the main CI test run |
| lint | `scripts/test_tiers.sh lint` | Black + Ruff checks |
| typecheck | `scripts/test_tiers.sh typecheck` | Mypy checks |
| frontend | `scripts/test_tiers.sh frontend` | Jest/RTL in `aragora/live` |
| e2e | `scripts/test_tiers.sh e2e` | Playwright E2E in `aragora/live` |
## Integration Baseline Runner

Use `scripts/run_integration_baseline.py` for tiered integration baselines that
match CI expectations and provide quick local feedback:

```bash
# Smoke tests (fastest)
python scripts/run_integration_baseline.py smoke

# PR-safe integration baseline (mocked dependencies)
python scripts/run_integration_baseline.py offline

# Knowledge/CDC integration focus
python scripts/run_integration_baseline.py knowledge

# Full integration suite
python scripts/run_integration_baseline.py full
```
Environment controls:

- `ARAGORA_BASELINE_PARALLEL` - worker count (default: auto)
- `ARAGORA_BASELINE_TIMEOUT` - per-test timeout in seconds (default: 60)
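
For example, to run the offline baseline with a fixed worker count and a longer per-test timeout (the specific values here are illustrative):

```bash
ARAGORA_BASELINE_PARALLEL=4 ARAGORA_BASELINE_TIMEOUT=120 \
  python scripts/run_integration_baseline.py offline
```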

## CI Mapping

CI workflows and what they cover:

| Workflow | Purpose |
|---|---|
| `.github/workflows/test.yml` | Pytest matrix + smoke tests + frontend build |
| `.github/workflows/lint.yml` | Black, Ruff, mypy, ESLint, Bandit |
| `.github/workflows/e2e.yml` | Full Playwright E2E + Python E2E harness tests |
| `.github/workflows/integration.yml` | E2E harness, integration, and control plane tests |
| `.github/workflows/load-tests.yml` | Scheduled load tests and memory checks |
## TestFixer Automation

Automated fix loop for the first failing test:

- Workflow: `.github/workflows/testfixer-auto.yml`
- Trigger: on a failed `Tests` workflow run or via manual dispatch
- Requires: `OPENAI_API_KEY` and `ANTHROPIC_API_KEY` secrets
- Output: a PR with the proposed fix, plus a `.testfixer/attempts.jsonl` artifact
- Artifacts: `.testfixer/runs/<timestamp>_<runid>_<sha>/` (stdout/stderr, exit code, env/resources, kernel logs when available)
You can also run it locally:

```bash
aragora testfixer . --test-command "pytest tests/ -q --maxfail=1"

# Custom artifact location or disable diagnostics
aragora testfixer . --artifacts-dir /tmp/testfixer-runs
aragora testfixer . --no-diagnostics
```
## Test Organization

```text
tests/
├── conftest.py                        # Shared fixtures and setup
├── test_*.py                          # Main test files (400+)
├── benchmarks/                        # Performance benchmarks
├── e2e/                               # End-to-end tests with test harness
│   ├── conftest.py                    # E2E-specific fixtures
│   ├── harness.py                     # E2E test harness implementation
│   └── test_full_flow.py              # Full system flow tests
├── integration/                       # Integration tests
│   ├── conftest.py                    # Integration-specific fixtures
│   ├── test_api_workflow.py           # API workflow tests
│   ├── test_debate_lifecycle.py       # Full debate lifecycle
│   └── test_websocket_events.py       # WebSocket event tests
├── skills/                            # Skills system tests
│   ├── builtin/                       # Built-in skill coverage
│   └── test_*.py                      # Registry/loader/base tests
├── security/                          # Security tests
│   ├── test_auth_boundaries.py        # Auth boundary tests
│   ├── test_cors.py                   # CORS handling
│   ├── test_csrf_protection.py        # CSRF protection
│   ├── test_input_validation.py       # Input validation
│   ├── test_rate_limit_enforcement.py # Rate limit enforcement
│   └── test_sql_injection.py          # SQL injection prevention
└── storage/                           # Storage layer tests
    └── test_*.py                      # Database and persistence tests
```
## Running Tests

### Basic Commands

```bash
# Run all tests with coverage
pytest tests/ --cov=aragora --cov-report=html

# Run tests excluding slow tests
pytest tests/ -m "not slow"

# Run only integration tests
pytest tests/integration/ -m integration

# Run with specific timeout
pytest tests/ --timeout=30

# Run in parallel (requires pytest-xdist)
pytest tests/ -n auto

# Run with output capture disabled (see prints)
pytest tests/ -s
```
### Auth/RBAC Checks

By default, handler permission decorators are bypassed during tests. Set
`ARAGORA_TEST_REAL_AUTH=1` to exercise real RBAC checks in a test run.

If RBAC decorators block integration endpoints, mock an `AuthorizationContext`
in the integration tests (see `tests/integration/test_*_api.py`), as sketched below.
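
A minimal sketch of that pattern, assuming a context object with `user_id` and `permissions` attributes; both the attribute names and the patch target are assumptions, so check the real `AuthorizationContext` and handler module before copying:

```python
from unittest.mock import MagicMock, patch

def test_protected_endpoint_with_mocked_auth():
    auth_ctx = MagicMock()
    auth_ctx.user_id = "test-user"
    auth_ctx.permissions = {"debates:read", "debates:write"}

    # Patch wherever the handler resolves its authorization context.
    with patch(
        "aragora.server.handlers.get_authorization_context",  # hypothetical target
        return_value=auth_ctx,
    ):
        ...  # call the endpoint under test
```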

### OpenAPI Route Validation

Validate handler routes against the OpenAPI spec:

```bash
python scripts/validate_openapi_routes.py
```
## Test Markers

Tests use markers to categorize them:

| Marker | Description | Usage |
|---|---|---|
| slow | Tests that take >5 seconds | `-m "not slow"` to skip |
| load | Load/stress tests | `-m load` to run only |
| integration | Integration tests | `-m integration` |
| integration_minimal | Minimal integration baseline (no external services) | `-m integration_minimal` |
| knowledge | Knowledge Mound tests | `-m knowledge` |
| e2e | End-to-end tests | `-m e2e` |
| no_auto_auth | Opt out of autouse auth mocking in handler tests | `-m no_auto_auth` |
```bash
# Run only fast tests
pytest tests/ -m "not slow and not load"

# Run integration tests only
pytest tests/ -m integration

# Run minimal integration baseline
pytest tests/integration/ -m integration_minimal

# Run everything except e2e
pytest tests/ -m "not e2e"
```
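
To apply a marker, decorate the test (standard pytest usage; the marker names are the ones from the table above):

```python
import pytest

@pytest.mark.slow
def test_full_model_convergence():
    """Takes >5 seconds; skipped by -m "not slow"."""
    ...

@pytest.mark.integration
def test_debate_lifecycle_against_services():
    ...
```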

## Environment Variables

Some tests require specific environment configuration:

```bash
# Force specific similarity backend (avoid slow model loading)
ARAGORA_CONVERGENCE_BACKEND=jaccard pytest tests/test_debate_convergence_comprehensive.py

# Set test API keys (use mock values for unit tests)
ANTHROPIC_API_KEY=test-key pytest tests/

# Enable debug logging
ARAGORA_DEBUG=1 pytest tests/ -v
```
## Writing Tests

### Test File Naming

- Test files: `test_<module_name>.py`
- Test classes: `Test<ClassName>`
- Test methods: `test_<behavior_description>`

```python
# tests/test_debate_convergence.py
class TestJaccardBackend:
    def test_identical_texts_return_similarity_of_one(self):
        ...
```
### Using Fixtures

Common fixtures from `conftest.py`:

```python
def test_with_temp_database(temp_db):
    """Uses a temporary SQLite database that's cleaned up after the test."""
    from aragora.ranking.elo import EloSystem

    elo = EloSystem(db_path=temp_db)
    # ... test code

def test_with_mock_storage(mock_storage):
    """Uses a pre-configured mock DebateStorage."""
    debates = mock_storage.list_debates()
    assert len(debates) == 2

def test_with_mock_agents(mock_agents):
    """Uses a list of 3 mock agents: claude, gemini, gpt4."""
    assert len(mock_agents) == 3

def test_requiring_api_keys(mock_api_keys):
    """Sets mock ANTHROPIC_API_KEY, OPENAI_API_KEY, GOOGLE_API_KEY."""
    import os

    assert os.getenv("ANTHROPIC_API_KEY") == "test-anthropic-key"

def test_with_clean_environment(clean_env):
    """All API keys removed for clean-slate testing."""
    import os

    assert os.getenv("ANTHROPIC_API_KEY") is None
```
### Async Tests

Use pytest-asyncio for async tests:

```python
import pytest

@pytest.mark.asyncio
async def test_async_debate():
    """Async tests run via pytest-asyncio."""
    result = await some_async_function()
    assert result is not None

class TestAsyncClass:
    @pytest.mark.asyncio
    async def test_async_method(self):
        """Async methods in classes work too."""
        pass
```
### Mocking External Services

```python
import pytest
from unittest.mock import Mock, patch, AsyncMock

def test_api_call_mocked():
    """Mock external API calls."""
    with patch("aragora.agents.anthropic.AnthropicAgent._call_api") as mock:
        mock.return_value = {"content": "mocked response"}
        # ... test code

@pytest.mark.asyncio
async def test_async_api_mocked():
    """Mock async API calls."""
    with patch("module.async_function", new=AsyncMock(return_value="result")):
        result = await some_function()
        assert result == "result"
```
### Testing Error Handling

```python
import pytest

def test_raises_on_invalid_input():
    """Test that exceptions are raised correctly."""
    with pytest.raises(ValueError, match="Invalid input"):
        function_that_should_raise("bad input")

def test_custom_exception():
    """Test custom exception types."""
    from aragora.exceptions import DebateError

    with pytest.raises(DebateError):
        start_invalid_debate()
```
## Fixtures Reference

### Auto-Used Fixtures

These run automatically for every test:

| Fixture | Purpose |
|---|---|
| `reset_circuit_breakers` | Resets CircuitBreaker state |
| `clear_handler_cache` | Clears handler TTL cache |
| `reset_supabase_env` | Clears Supabase env vars |
| `reset_lazy_globals` | Resets module-level lazy globals |
### Available Fixtures

| Fixture | Description |
|---|---|
| `temp_db` | Temporary SQLite database path |
| `temp_dir` | Temporary directory (Path) |
| `temp_nomic_dir` | Temporary nomic state directory |
| `mock_storage` | Mock DebateStorage with sample data |
| `mock_elo_system` | Mock EloSystem with sample rankings |
| `mock_agent` | Single mock agent |
| `mock_agents` | List of 3 mock agents |
| `mock_environment` | Mock Environment for arena testing |
| `mock_emitter` | Mock event emitter |
| `mock_auth_config` | Mock AuthConfig |
| `handler_context` | Complete handler context dict |
| `elo_system` | Real EloSystem with temp database |
| `continuum_memory` | Real ContinuumMemory with temp database |
| `clean_env` | Clears all API key env vars |
| `mock_api_keys` | Sets mock API keys |
| `sample_debate_messages` | Sample debate message list |
| `sample_critique` | Sample critique dict |
## Coverage

### Running Coverage

```bash
# Basic coverage
pytest tests/ --cov=aragora

# HTML report
pytest tests/ --cov=aragora --cov-report=html
open htmlcov/index.html

# Terminal report
pytest tests/ --cov=aragora --cov-report=term-missing

# Fail if below threshold
pytest tests/ --cov=aragora --cov-fail-under=70

# Coverage for specific modules
pytest tests/ --cov=aragora/debate --cov=aragora/server
```
### Current Coverage Targets

| Module | Target | Priority |
|---|---|---|
| `aragora/debate/` | 80% | Critical |
| `aragora/server/handlers/` | 70% | High |
| `aragora/agents/` | 60% | Medium |
| `aragora/memory/` | 70% | High |
| `aragora/billing/` | 80% | Critical |
## E2E Test Harness

The E2E test harness provides a complete integration-testing environment for exercising full system workflows, including the control plane, task scheduling, and debate orchestration.

### Harness Overview

The harness (`tests/e2e/harness.py`) spins up:

- ControlPlaneCoordinator
- TaskScheduler
- Mock agents with configurable behaviors
- Optional Redis/PostgreSQL connections
- Metrics and tracing support
### Running E2E Tests Locally

```bash
# Run all E2E tests
pytest tests/e2e/ -v

# Run with Redis (requires Redis running locally)
REDIS_URL=redis://localhost:6379 pytest tests/e2e/ -v

# Run with verbose CI-style logging
ARAGORA_CI=true pytest tests/e2e/ -v

# Run specific test class
pytest tests/e2e/test_full_flow.py::TestDebateIntegration -v
```
### CI-Safe E2E Subset

Some E2E suites exercise external services (Redis, third-party connectors) and may require local dependencies. For a stable, CI-friendly subset that stays in-memory, run:

```bash
pytest \
  tests/e2e/test_full_flow.py \
  tests/e2e/test_control_plane_workflows.py \
  tests/e2e/test_debate_crash_recovery.py \
  tests/e2e/test_chat_result_routing.py \
  tests/e2e/test_api_rate_limiting.py \
  tests/e2e/test_complete_user_signup_flow.py -v
```
### Using the Harness in Tests

Basic usage with the context manager:

```python
import pytest

from tests.e2e.harness import e2e_environment

@pytest.mark.asyncio
async def test_task_workflow():
    async with e2e_environment() as harness:
        # Submit a task
        task_id = await harness.submit_task(
            task_type="analysis",
            payload={"input": "test data"},
            required_capabilities=["analysis"],
        )
        # Wait for completion (the harness auto-processes with mock agents)
        result = await harness.wait_for_task(task_id)
        assert result is not None
        assert result.status.value == "completed"
```
Using fixtures from `conftest.py`:

```python
@pytest.mark.asyncio
async def test_with_fixture(e2e_harness):
    # The fixture provides a pre-configured harness with 3 agents
    task_id = await e2e_harness.submit_task("test", {"data": "value"})
    result = await e2e_harness.wait_for_task(task_id)
    assert result is not None
```
### Running Debates Through the Harness

```python
@pytest.mark.asyncio
async def test_debate():
    async with e2e_environment() as harness:
        # Run a debate directly
        result = await harness.run_debate(
            topic="Should we use microservices?",
            rounds=3,
        )
        assert result is not None

        # Or run via the control plane (full task lifecycle)
        result = await harness.run_debate_via_control_plane(
            topic="API design best practices",
            rounds=2,
        )
        assert result["consensus_reached"] is True
```
Custom Agent Configuration
from tests.e2e.harness import E2ETestConfig, MockAgentConfig, e2e_environment
@pytest.mark.asyncio
async def test_custom_agents():
config = E2ETestConfig(
num_agents=5,
agent_capabilities=["code", "review"],
fail_rate=0.1, # 10% simulated failures
)
async with e2e_environment(config) as harness:
# Create additional specialized agent
agent = await harness.create_agent(
agent_id="specialist",
capabilities=["security", "audit"],
)
# Test with specialized capability
task_id = await harness.submit_task(
task_type="audit",
payload={},
required_capabilities=["security"],
)
result = await harness.wait_for_task(task_id)
### Available E2E Fixtures

| Fixture | Description |
|---|---|
| `e2e_harness` | Basic harness with 3 agents, in-memory storage |
| `e2e_harness_with_redis` | Harness with Redis backend |
| `debate_harness` | Debate-focused harness with 4 agents |
| `load_test_harness` | Load-testing harness with 10 agents |
| `harness_config` | Customizable configuration object |
| `mock_agent_factory` | Factory for creating mock agents |
### CI Integration

E2E tests run in CI via `.github/workflows/integration.yml`:

```bash
# What CI runs
pytest tests/e2e/ -v --tb=short --timeout=180

# With environment
REDIS_URL=redis://localhost:6379
ARAGORA_CI=true
```

The harness automatically adjusts timeouts when running in CI:

- Default timeout: 30s (local) / 60s (CI)
- Task timeout: 10s (local) / 30s (CI)
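
A minimal sketch of that kind of CI-aware selection, using the timeouts listed above (illustrative only; the real logic lives in `tests/e2e/harness.py` and may differ):

```python
import os

def pick_timeouts() -> tuple[float, float]:
    """Return (default_timeout, task_timeout) in seconds, widened under CI."""
    in_ci = os.getenv("ARAGORA_CI", "").lower() in {"1", "true", "yes"}
    return (60.0, 30.0) if in_ci else (30.0, 10.0)
```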

### Specialized Harnesses

`DebateTestHarness` - for debate-focused testing:

```python
from tests.e2e.harness import DebateTestHarness

harness = DebateTestHarness()
await harness.start()

# Run tracked debates
await harness.run_debate_with_tracking("Topic 1", rounds=2)
await harness.run_debate_with_tracking("Topic 2", rounds=2)

# Get metrics
rate = harness.get_consensus_rate()
results = harness.get_debate_results()

await harness.stop()
```

`LoadTestHarness` - for load/performance testing:

```python
from tests.e2e.harness import LoadTestHarness

harness = LoadTestHarness()
await harness.start()

# Submit many tasks concurrently
task_ids = await harness.submit_concurrent_tasks(count=100)

# Measure throughput
metrics = await harness.measure_throughput(task_count=50)
print(f"Tasks/second: {metrics['tasks_per_second']}")

await harness.stop()
```
## CI/CD Integration

Tests run automatically in CI:

```yaml
# .github/workflows/test.yml (simplified)
jobs:
  test:
    runs-on: ${{ matrix.os }}
    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest, macos-latest]
        python-version: ["3.10", "3.11", "3.12"]
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python-version }}
      - run: pip install -e ".[dev,research]" && pip install pytest-cov pytest-timeout
      - run: pytest tests/ -v --timeout=60 --cov=aragora --cov-report=xml --cov-report=term-missing -x --tb=short
```
### Test Commands in CI

```bash
# Full test suite
pytest tests/ -v --timeout=60 --cov=aragora --cov-report=xml --cov-report=term-missing -x --tb=short

# CLI smoke (demo)
aragora ask "Smoke test a demo debate" --demo --rounds 1

# Server smoke (non-default ports to avoid collisions)
aragora serve --api-port 8090 --ws-port 8766 --host 127.0.0.1
```
## Debugging Tests

### Using pdb

```bash
# Drop into the debugger on failure
pytest tests/test_file.py --pdb

# Drop into the debugger on the first failure
pytest tests/test_file.py --pdb -x

# Start the debugger at a specific line:
# add `import pdb; pdb.set_trace()` in code, then run with -s
pytest tests/test_file.py -s
```
### Verbose Output

```bash
# Show test names as they run
pytest tests/ -v

# Show full diffs on assertion failures
pytest tests/ -vv

# Show local variables in tracebacks
pytest tests/ --showlocals

# Stop after the first N failures
pytest tests/ --maxfail=3
```
### Logging

```python
import logging

def test_with_logging(caplog):
    """Capture log output during tests."""
    with caplog.at_level(logging.DEBUG):
        function_that_logs()
    assert "expected message" in caplog.text
```
## Performance Testing

### Benchmarks

```bash
# Run benchmarks
pytest tests/benchmarks/ -v

# Run task ROI/quality benchmark harness (offline demo mode)
python benchmarks/task_bench.py --mode demo --profile fast

# Include Pulse trending context (network required)
python benchmarks/task_bench.py --mode demo --profile fast --enable-trending
```

Sample artifacts live in `examples/task_bench/demo`.

```bash
# Run with timing info
pytest tests/ --durations=10

# Profile slow tests (requires the pytest-profiling plugin)
pytest tests/ --profile
```
### Load Tests

```bash
# Run load tests only
pytest tests/ -m load -v

# Example load test
pytest tests/integration/test_server_under_load.py -v
```
## Best Practices

### Test Isolation

- Use fixtures for setup/teardown - don't rely on test order
- Mock external dependencies - don't make real API calls
- Use temporary databases - clean state for each test
- Reset global state - use `autouse` fixtures

### Test Quality

- One assertion per test (when practical)
- Descriptive test names - `test_<action>_<expected_result>`
- Test edge cases - empty inputs, None values, boundaries
- Test error paths - exceptions, invalid inputs

### Performance

- Mock slow operations - API calls, model loading
- Use `@pytest.mark.slow` - for tests >5 seconds
- Parallelize with `-n auto` - when tests are isolated
- Cache expensive fixtures - use `scope="session"` when safe (see the sketch after this list)

## Domain-Specific E2E Tests

In addition to the system-level E2E harness, Aragora includes domain-specific E2E test suites for compliance, privacy, and security features.

### Compliance E2E Tests

Location: `tests/e2e/test_compliance_e2e.py`

Tests SOC 2, GDPR, and audit compliance workflows:

```bash
# Run compliance E2E tests
pytest tests/e2e/test_compliance_e2e.py -v

# Run a specific compliance area
pytest tests/e2e/test_compliance_e2e.py::TestSOC2ReportGeneration -v
pytest tests/e2e/test_compliance_e2e.py::TestGDPRExport -v
pytest tests/e2e/test_compliance_e2e.py::TestRightToBeForgotten -v
```
Coverage includes:
- SOC 2 report generation
- GDPR data export (Article 15)
- Right-to-be-Forgotten (Article 17) with grace periods
- Audit log query and export
- Audit integrity verification

### Privacy E2E Tests

Location: `tests/e2e/test_privacy_e2e.py`

Tests consent management, data retention, and anonymization:

```bash
# Run privacy E2E tests
pytest tests/e2e/test_privacy_e2e.py -v

# Run a specific privacy area
pytest tests/e2e/test_privacy_e2e.py::TestConsentFlow -v
pytest tests/e2e/test_privacy_e2e.py::TestDataRetention -v
pytest tests/e2e/test_privacy_e2e.py::TestAnonymization -v
```
Coverage includes:
- Consent lifecycle (grant, check, revoke)
- Data retention policies (delete, archive, anonymize)
- HIPAA anonymization (redaction, hashing)
- K-anonymity verification (see the sketch after this list)
- Differential privacy validation
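
For the K-anonymity item above, a minimal self-contained check (illustrative; not the project's implementation): a dataset is k-anonymous when every combination of quasi-identifier values appears at least k times.

```python
from collections import Counter

def is_k_anonymous(rows: list[dict], quasi_identifiers: list[str], k: int) -> bool:
    """True if every quasi-identifier combination occurs at least k times."""
    groups = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return all(count >= k for count in groups.values())

rows = [
    {"age_band": "30-39", "zip3": "941", "diagnosis": "A"},
    {"age_band": "30-39", "zip3": "941", "diagnosis": "B"},
    {"age_band": "40-49", "zip3": "100", "diagnosis": "C"},
]
# The lone 40-49/100 record makes this fail for k=2.
assert is_k_anonymous(rows, ["age_band", "zip3"], k=2) is False
```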

### DSAR/ABAC E2E Tests

Location: `tests/e2e/test_dsar_abac_lifecycle.py`

Tests Data Subject Access Requests and Attribute-Based Access Control:

```bash
# Run DSAR/ABAC tests
pytest tests/e2e/test_dsar_abac_lifecycle.py -v

# Run specific areas
pytest tests/e2e/test_dsar_abac_lifecycle.py::TestGDPRArticle15Export -v
pytest tests/e2e/test_dsar_abac_lifecycle.py::TestABACTimeConditions -v
pytest tests/e2e/test_dsar_abac_lifecycle.py::TestABACIPConditions -v
```
Coverage includes:
- GDPR Article 15 data export workflow
- RTBF lifecycle (grace periods, legal holds, cancellation)
- Data portability (machine-readable format)
- Multi-tenant data isolation
- Time-based access control (business hours, date ranges)
- IP-based restrictions (allowlist, CIDR, blocklist); see the sketch after this list
- Resource ownership conditions
- Tag-based access control
- Combined ABAC conditions
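
For the IP-based restrictions above, a minimal sketch of a CIDR allowlist/blocklist check using the standard-library `ipaddress` module (illustrative; the real ABAC condition types live in the Aragora codebase):

```python
from ipaddress import ip_address, ip_network

def ip_allowed(client_ip: str, allowlist: list[str], blocklist: list[str]) -> bool:
    """Deny on any blocklist match; otherwise require an allowlist match."""
    addr = ip_address(client_ip)
    if any(addr in ip_network(cidr) for cidr in blocklist):
        return False
    return any(addr in ip_network(cidr) for cidr in allowlist)

assert ip_allowed("10.0.1.7", ["10.0.0.0/16"], ["10.0.1.0/29"]) is False  # blocked
assert ip_allowed("10.0.2.7", ["10.0.0.0/16"], ["10.0.1.0/29"]) is True   # allowed
```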

### Secrets Rotation E2E Tests

Location: `tests/e2e/test_secrets_rotation_lifecycle.py`

Tests secrets management and rotation for SOC 2 CC6.2 compliance:

```bash
# Run secrets rotation tests
pytest tests/e2e/test_secrets_rotation_lifecycle.py -v

# Run specific areas
pytest tests/e2e/test_secrets_rotation_lifecycle.py::TestSecretRotation -v
pytest tests/e2e/test_secrets_rotation_lifecycle.py::TestVerificationAndRollback -v
pytest tests/e2e/test_secrets_rotation_lifecycle.py::TestComplianceReporting -v
```
Coverage includes:
- Secret registration (API keys, JWT, database, OAuth, encryption keys)
- Full rotation lifecycle (rotate, verify, complete)
- Grace period handling
- Verification and rollback scenarios
- Custom rotation handlers
- Compliance reporting
- Due secrets detection and processing

### Audit Retention E2E Tests

Location: `tests/e2e/test_audit_retention_lifecycle.py`

Tests audit log retention and compliance export for SOC 2 CC6.3 and CC7.1:

```bash
# Run audit retention tests
pytest tests/e2e/test_audit_retention_lifecycle.py -v

# Run specific areas
pytest tests/e2e/test_audit_retention_lifecycle.py::TestRetentionEnforcement -v
pytest tests/e2e/test_audit_retention_lifecycle.py::TestComplianceExport -v
pytest tests/e2e/test_audit_retention_lifecycle.py::TestIntegrityVerification -v
```
Coverage includes:
- Audit entry creation and hash computation
- Retention policy management (90-day, 7-year SOC 2)
- Retention enforcement (removal of expired entries)
- Compliance export (SOC 2 Type II, ISO 27001, Syslog)
- Audit query and filtering (action, actor, time range)
- Hash chain integrity verification (tamper detection); see the sketch after this list
- Sequence number continuity checks
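
A minimal sketch of the hash-chain idea behind that item (illustrative; the real audit entry format and hashing scheme are defined in the codebase): each entry's hash covers its content plus the previous hash, so tampering anywhere breaks verification from that point on.

```python
import hashlib
import json

def entry_hash(data: dict, prev_hash: str) -> str:
    payload = json.dumps(data, sort_keys=True).encode()
    return hashlib.sha256(prev_hash.encode() + payload).hexdigest()

def verify_chain(entries: list[dict]) -> bool:
    """Each entry carries 'data' and 'hash'; recompute and compare in order."""
    prev = "genesis"
    for entry in entries:
        if entry["hash"] != entry_hash(entry["data"], prev):
            return False  # tampering detected at this entry
        prev = entry["hash"]
    return True

# Build a 3-entry chain, then tamper with the middle entry.
entries, prev = [], "genesis"
for i in range(3):
    data = {"seq": i, "action": "debate.update"}
    prev = entry_hash(data, prev)
    entries.append({"data": data, "hash": prev})
assert verify_chain(entries)
entries[1]["data"]["action"] = "tampered"
assert not verify_chain(entries)
```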

### SOC 2 Test Organization

Domain E2E tests map to SOC 2 controls:

| Control | Test File | Key Areas |
|---|---|---|
| CC6.2 Credential Management | `test_secrets_rotation_lifecycle.py` | Key rotation, grace periods |
| CC6.3 Change Management | `test_audit_retention_lifecycle.py` | Config change audit trails |
| CC7.1 System Monitoring | `test_audit_retention_lifecycle.py` | Audit retention, integrity |
| CC7.2 Security Monitoring | `test_compliance_e2e.py` | Audit logs, SOC 2 reports |
| P1-02 Data Subject Rights | `test_dsar_abac_lifecycle.py` | DSAR export, RTBF |
| CC6.1 Access Control | `test_dsar_abac_lifecycle.py` | ABAC conditions |
| P3-01 Privacy Notices | `test_privacy_e2e.py` | Consent management |
### Running All Domain E2E Tests

```bash
# Run all domain-specific E2E tests
pytest tests/e2e/test_compliance_e2e.py tests/e2e/test_privacy_e2e.py \
  tests/e2e/test_dsar_abac_lifecycle.py tests/e2e/test_secrets_rotation_lifecycle.py \
  tests/e2e/test_audit_retention_lifecycle.py -v

# Quick compliance validation
pytest tests/e2e/ -k "compliance or privacy or dsar or secrets or audit" -v --tb=short

# Full E2E suite (system + domain)
pytest tests/e2e/ -v
```
## Troubleshooting

### Common Issues

**Tests hang loading models:**

```bash
# Force fast backend
ARAGORA_CONVERGENCE_BACKEND=jaccard pytest tests/
```

**Tests fail with missing env vars:**

```python
# Use the mock API keys fixture
@pytest.fixture(autouse=True)
def setup_env(mock_api_keys):
    yield
```

**Tests pollute each other:**

```python
# Add to conftest.py
@pytest.fixture(autouse=True)
def reset_state():
    yield
    # Reset after each test
    clear_global_state()
```

**Async tests time out:**

```python
@pytest.mark.asyncio
@pytest.mark.timeout(30)  # Explicit timeout
async def test_slow_async():
    ...
```
### Getting Help

- Check existing test files for patterns
- Review `conftest.py` for available fixtures
- Run with `-v --tb=long` for detailed errors
- Use `--pdb` to drop into the debugger