Skip to main content

API Rate Limits

Audience: This guide is for API consumers who need to understand and work with rate limits. For implementation details and developer documentation, see RATE_LIMITING.md.

Aragora implements rate limiting to ensure fair usage and service stability. This document describes the rate limiting system and how to work within its constraints.

Rate Limit Tiers

Rate limits are based on your subscription tier:

TierRequests/MinuteBurst SizeNotes
Free1060Unauthenticated or free tier
Starter50100Basic paid plan
Professional200400Standard paid plan
Enterprise10002000Custom limits available

What is "Burst Size"?

The burst size allows short spikes above your sustained rate. For example, a free tier user can make up to 60 requests in quick succession, but sustained usage must stay under 10 req/min.

Default Limits

For endpoints without tier-specific limits:

Limit TypeDefaultEnvironment Variable
Per-endpoint60 req/minARAGORA_RATE_LIMIT
Per-IP (global)120 req/minARAGORA_IP_RATE_LIMIT
Burst multiplier2xARAGORA_BURST_MULTIPLIER

Response Headers

Rate-limited responses include these headers:

X-RateLimit-Limit: 60          # Your rate limit
X-RateLimit-Remaining: 45 # Requests remaining in window
X-RateLimit-Reset: 1705123456 # Unix timestamp when limit resets
Retry-After: 30 # Seconds to wait (only on 429)

Rate Limit Responses

When rate limited, the API returns:

HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 30

{
"error": "Rate limit exceeded",
"retry_after": 30,
"limit": 60,
"remaining": 0
}

Best Practices

1. Implement Exponential Backoff

When you receive a 429 response, wait and retry with increasing delays:

import time
import random

def make_request_with_backoff(url, max_retries=5):
for attempt in range(max_retries):
response = requests.get(url)

if response.status_code != 429:
return response

# Get retry delay from header or calculate
retry_after = int(response.headers.get('Retry-After', 0))
if not retry_after:
retry_after = min(60, (2 ** attempt) + random.random())

time.sleep(retry_after)

raise Exception("Max retries exceeded")

2. Monitor Your Usage

Check the X-RateLimit-Remaining header to see how many requests you have left:

response = requests.get(f"\{API_BASE\}/api/debates")
remaining = int(response.headers.get('X-RateLimit-Remaining', 0))

if remaining < 10:
print(f"Warning: Only \{remaining\} requests remaining")

3. Batch Operations When Possible

Instead of making many small requests, use batch endpoints:

# Instead of this (10 requests):
for debate_id in debate_ids:
response = requests.get(f"\{API_BASE\}/api/debates/\{debate_id\}")

# Do this (1 request):
response = requests.post(f"\{API_BASE\}/api/debates/batch", json={
"debate_ids": debate_ids
})

4. Cache Responses

Cache responses that don't change frequently:

  • Debate results (after completion)
  • Agent rankings and ELO scores
  • Historical analytics

5. Use Webhooks for Real-Time Updates

Instead of polling for debate status, register webhooks:

requests.post(f"\{API_BASE\}/api/webhooks", json={
"url": "https://your-app.com/webhook",
"events": ["debate.completed", "debate.verdict"]
})

Per-User Rate Limiting

For authenticated users, Aragora implements per-user rate limiting based on user ID rather than IP address. This provides fairer limits when users share IPs (e.g., corporate networks) and prevents abuse via IP rotation.

Action-Based Limits

Different operations have different limits per authenticated user:

ActionLimit (req/min)Description
default60Default for authenticated requests
debate_create10Creating new debates
vote30Voting on proposals
agent_call120Calling agent APIs
export5Exporting data
admin300Admin operations

How It Works

  1. Authenticated requests are rate-limited by user_id
  2. Unauthenticated requests fall back to IP-based limiting
  3. Each action has its own bucket (limits don't share)
  4. Burst multiplier (2x) allows short spikes

Response Headers

Per-user rate-limited responses include additional context:

X-RateLimit-Limit: 10           # Your limit for this action
X-RateLimit-Remaining: 7 # Requests remaining
X-RateLimit-Reset: 1705123456 # Reset timestamp
X-RateLimit-Action: debate_create # Which action was limited

Decorator Usage

Backend handlers use the @user_rate_limit decorator:

from aragora.server.handlers.utils.rate_limit import user_rate_limit

class DebatesHandler(BaseHandler):
@user_rate_limit(action="debate_create")
def _create_debate(self, handler):
# Limited to 10 debates/minute per user
...

@user_rate_limit(action="vote")
def _submit_vote(self, handler):
# Limited to 30 votes/minute per user
...

Checking Your Status

To see your current rate limit status across all actions:

curl -H "Authorization: Bearer $TOKEN" \
https://api.aragora.ai/api/rate-limit/status

Response:

{
"user_id": "user-123",
"limits": {
"debate_create": { "remaining": 8, "limit": 10, "retry_after": 0 },
"vote": { "remaining": 30, "limit": 30, "retry_after": 0 }
}
}

Endpoint-Specific Limits

Some endpoints have stricter limits to prevent abuse:

EndpointLimitNotes
POST /api/debate10/minDebate creation is expensive
POST /api/gauntlet5/minStress tests are resource-intensive
GET /api/health120/minHigher limit for monitoring
GET /api/debates60/minStandard list endpoint

IP-Based Limits

In addition to tier limits, there's a global per-IP limit of 120 req/min across all endpoints. This prevents a single source from overwhelming the API even with multiple accounts.

Trusted Proxies

If you're behind a load balancer or proxy, configure ARAGORA_TRUSTED_PROXIES so rate limits apply to the real client IP:

export ARAGORA_TRUSTED_PROXIES="10.0.0.0/8,172.16.0.0/12,192.168.0.0/16"

Need Higher Limits?

Enterprise customers can request custom rate limits. Contact support or upgrade your tier at /billing.

Debugging Rate Limits

To check your current rate limit status:

curl -v https://api.aragora.ai/api/health 2>&1 | grep -i ratelimit

Or in your code:

response = requests.get(f"\{API_BASE\}/api/health")
print(f"Limit: {response.headers.get('X-RateLimit-Limit')}")
print(f"Remaining: {response.headers.get('X-RateLimit-Remaining')}")
print(f"Resets at: {response.headers.get('X-RateLimit-Reset')}")