
Rate Limits

Understand and work within ZenSearch API rate limits.

Current Limits

ZenSearch API rate limits are sized to your enterprise deployment and configured as part of your engagement. Defaults are generous; production limits, including burst caps and per-endpoint allowances, are negotiated with your team.

For questions about your current limits or to request a temporary increase, contact your account team or [email protected].

By Endpoint

| Endpoint | Limit |
|---|---|
| /search | Standard |
| /chat | Standard |
| /chat/stream | Standard |
| /agents/*/execute/stream | 50% of standard |
| Bulk operations | 10% of standard |
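
The reduced limits are fractions of whatever standard limit applies to your deployment. A minimal sketch of that arithmetic follows. The function name `effectiveLimit`, the `'bulk'` label, the standard limit of 1000, and rounding down are all assumptions made for illustration; your actual values come from your engagement.

```javascript
// Percentages taken from the table above. The 'bulk' label is hypothetical:
// the table does not name a path for bulk operations.
const LIMIT_PERCENTAGES = [
  { pattern: /^\/agents\/[^/]+\/execute\/stream$/, percent: 50 },
  { pattern: /^bulk$/, percent: 10 },
];

// Returns the per-window limit for a path, given the standard limit.
function effectiveLimit(path, standardLimit) {
  const match = LIMIT_PERCENTAGES.find(({ pattern }) => pattern.test(path));
  const percent = match ? match.percent : 100;
  // Assumption: fractional results round down.
  return Math.floor((standardLimit * percent) / 100);
}

console.log(effectiveLimit('/search', 1000));                    // 1000
console.log(effectiveLimit('/agents/abc/execute/stream', 1000)); // 500
console.log(effectiveLimit('bulk', 1000));                       // 100
```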

Rate Limit Headers

Every response includes rate limit information:

X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 999
X-RateLimit-Reset: 1705766400

| Header | Description |
|---|---|
| X-RateLimit-Limit | Max requests per window |
| X-RateLimit-Remaining | Remaining requests |
| X-RateLimit-Reset | Unix timestamp when limit resets |
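
As a rough illustration, the headers can be read into a small object on the client. This sketch assumes a runtime that provides the Fetch API `Headers` class (browsers, Node.js 18 or later); `parseRateLimitHeaders` is a hypothetical helper name, and the reset value is treated as seconds since the Unix epoch, as the example value suggests.

```javascript
// Extracts rate limit state from a fetch-style Headers object.
// Header lookups on Headers are case-insensitive.
function parseRateLimitHeaders(headers) {
  const resetEpochSeconds = Number(headers.get('X-RateLimit-Reset'));
  return {
    limit: Number(headers.get('X-RateLimit-Limit')),
    remaining: Number(headers.get('X-RateLimit-Remaining')),
    // Assumption: the reset timestamp is in seconds, so convert to ms.
    resetAt: new Date(resetEpochSeconds * 1000),
  };
}

// Example using the header values shown above.
const headers = new Headers({
  'X-RateLimit-Limit': '1000',
  'X-RateLimit-Remaining': '999',
  'X-RateLimit-Reset': '1705766400',
});
const info = parseRateLimitHeaders(headers);
console.log(info.remaining);             // 999
console.log(info.resetAt.toISOString()); // 2024-01-20T16:00:00.000Z
```

With a real request, the same function would receive `response.headers` from `fetch`.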

Rate Limit Response

When you exceed a limit, the API returns the following error body:

{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded. Try again in 45 seconds.",
    "retryAfter": 45
  }
}

HTTP Status: 429 Too Many Requests

Headers:

Retry-After: 45
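
The wait time therefore appears in two places: the `Retry-After` header and the `retryAfter` field of the body. A minimal sketch of reading either one is shown here. The helper name `rateLimitDelayMs`, the plain-object response shape with lowercase header keys, and the 1-second fallback are assumptions; only the seconds form of `Retry-After` shown above is handled.

```javascript
// Returns the number of milliseconds to wait after a 429 response,
// or null when the response is not rate limited.
function rateLimitDelayMs({ status, headers = {}, body = {} }) {
  if (status !== 429) return null;
  const fromHeader = Number(headers['retry-after']);
  const fromBody = Number(body.error && body.error.retryAfter);
  // Prefer the header, then the body; ignore missing or invalid values.
  const seconds = [fromHeader, fromBody].find(
    (n) => Number.isFinite(n) && n >= 0
  );
  // Assumption: wait 1 second when neither source gives a usable value.
  return (seconds ?? 1) * 1000;
}

console.log(rateLimitDelayMs({ status: 429, headers: { 'retry-after': '45' } }));       // 45000
console.log(rateLimitDelayMs({ status: 429, body: { error: { retryAfter: 45 } } }));    // 45000
console.log(rateLimitDelayMs({ status: 200 }));                                         // null
```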

Handling Rate Limits

Exponential Backoff

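// Resolve after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Call `fn`, retrying on HTTP 429 with delays of 1s, 2s, 4s, ...
async function requestWithBackoff(fn, maxRetries = 5) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await fn();
    } catch (error) {
      // Only rate limit errors are retried; everything else propagates.
      if (error.status !== 429) throw error;
      // Skip the wait after the final attempt, since no retry follows it.
      if (i < maxRetries - 1) await sleep(Math.pow(2, i) * 1000);
    }
  }
  throw new Error('Max retries exceeded');
}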
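
The delay schedule can also be isolated into a pure function, which makes it easy to add a cap and random jitter. Neither a cap nor jitter is described in this page; they are common additions that keep many clients from retrying at the same instant. The names `backoffDelayMs` and `jitteredDelayMs` and the 30-second cap are assumptions.

```javascript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped.
function backoffDelayMs(attempt, { baseMs = 1000, maxMs = 30000 } = {}) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// "Full jitter": a random delay between 0 and the capped backoff delay.
// `random` is injectable so the result can be made deterministic in tests.
function jitteredDelayMs(attempt, options, random = Math.random) {
  return Math.floor(random() * backoffDelayMs(attempt, options));
}

console.log([0, 1, 2, 3, 4].map((i) => backoffDelayMs(i)));
// [ 1000, 2000, 4000, 8000, 16000 ]
console.log(backoffDelayMs(10)); // 30000 (capped)
```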

Using Retry-After

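// Call `fn`; on HTTP 429, wait for the server-provided Retry-After, then retry once.
async function requestWithRetry(fn) {
  try {
    return await fn();
  } catch (error) {
    if (error.status !== 429) throw error;
    // Retry-After arrives as a string of seconds; fall back to 1s if missing or invalid.
    const seconds = Number(error.headers?.['retry-after']);
    const delayMs = Number.isFinite(seconds) && seconds >= 0 ? seconds * 1000 : 1000;
    await new Promise((resolve) => setTimeout(resolve, delayMs));
    return await fn();
  }
}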
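
The two approaches can be combined: honor `Retry-After` when the server sends it, and fall back to exponential backoff when it does not. The following is one possible sketch, not an official client. `requestWithRateLimitRetry` and its `sleep` option are hypothetical names, and the error is assumed to carry `status` and a plain `headers` object with lowercase keys, as in the examples above.

```javascript
// Retries `fn` on 429, preferring the server's Retry-After hint and falling
// back to exponential backoff. `sleep` is injectable so the loop is testable.
async function requestWithRateLimitRetry(fn, options = {}) {
  const {
    maxRetries = 5,
    sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms)),
  } = options;

  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (error) {
      // Give up on non-429 errors or once the retry budget is spent.
      if (error.status !== 429 || attempt >= maxRetries) throw error;
      const hinted = Number(error.headers && error.headers['retry-after']);
      const delayMs = Number.isFinite(hinted) && hinted >= 0
        ? hinted * 1000          // server-provided wait, in seconds
        : 2 ** attempt * 1000;   // 1s, 2s, 4s, ...
      await sleep(delayMs);
    }
  }
}
```

Unlike the single-retry version, this loop rethrows the last 429 error once `maxRetries` is reached, so callers can still inspect it.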

Best Practices

Optimize Requests

  1. Batch operations when possible
  2. Cache responses to reduce calls
  3. Use pagination efficiently
  4. Filter server-side instead of client-side
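
Caching is the simplest of these to sketch. The example below is a minimal in-memory cache with a time-to-live, assuming that slightly stale results are acceptable for your use case. `TtlCache` is a hypothetical class, not part of any ZenSearch SDK.

```javascript
// Minimal in-memory cache with a time-to-live.
// `now` is injectable so expiry can be tested without waiting.
class TtlCache {
  constructor(ttlMs, now = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
    this.entries = new Map();
  }

  // Returns the cached value, or undefined if absent or expired.
  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key);
      return undefined;
    }
    return entry.value;
  }

  set(key, value) {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }
}

// Usage: check the cache before calling the API, store the result after.
const cache = new TtlCache(60000);
cache.set('search:zen', { hits: 3 });
console.log(cache.get('search:zen')); // { hits: 3 }
```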

Monitor Usage

  1. Track X-RateLimit-Remaining
  2. Set alerts for low remaining
  3. Log rate limit events
  4. Review usage patterns
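
One way to act on the first three points is to check the remaining quota after each response and log when it runs low. This sketch assumes headers arrive as a plain object with lowercase keys and uses a hypothetical 10% threshold; `isQuotaLow` and `recordRateLimit` are illustrative names.

```javascript
// True when fewer than `threshold` (as a fraction) of requests remain.
function isQuotaLow(limit, remaining, threshold = 0.1) {
  if (!(limit > 0)) return false; // no usable limit information
  return remaining / limit < threshold;
}

// Logs a warning when quota is low; returns whether it was low.
// `log` defaults to console.warn and can be replaced by an alerting hook.
function recordRateLimit(headers, log = console.warn) {
  const limit = Number(headers['x-ratelimit-limit']);
  const remaining = Number(headers['x-ratelimit-remaining']);
  const low = isQuotaLow(limit, remaining);
  if (low) log(`Rate limit low: ${remaining}/${limit} requests remaining`);
  return low;
}

recordRateLimit({ 'x-ratelimit-limit': '1000', 'x-ratelimit-remaining': '50' });
// warns: Rate limit low: 50/1000 requests remaining
```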

Architecture

  1. Implement request queuing
  2. Use connection pooling
  3. Distribute load over time
  4. Consider dedicated API keys per service
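
Request queuing and spreading load over time can be combined in a small serial queue that enforces a minimum interval between request starts. The following is a sketch under stated assumptions: `RequestQueue` is a hypothetical class, requests run one at a time, and the interval is something you would tune to your negotiated limit.

```javascript
// Runs tasks one at a time, starting them at least `minIntervalMs` apart.
// `now` and `sleep` are injectable so timing can be tested deterministically.
class RequestQueue {
  constructor(minIntervalMs, options = {}) {
    this.minIntervalMs = minIntervalMs;
    this.now = options.now || Date.now;
    this.sleep = options.sleep ||
      ((ms) => new Promise((resolve) => setTimeout(resolve, ms)));
    this.nextSlot = 0;             // earliest time the next task may start
    this.tail = Promise.resolve(); // end of the chain of queued tasks
  }

  // Queues `task` (a function returning a promise) and resolves with its result.
  enqueue(task) {
    const run = this.tail.then(async () => {
      const waitMs = this.nextSlot - this.now();
      if (waitMs > 0) await this.sleep(waitMs);
      this.nextSlot = this.now() + this.minIntervalMs;
      return task();
    });
    // Keep the chain alive even if this task fails.
    this.tail = run.catch(() => {});
    return run;
  }
}
```

For example, `new RequestQueue(100)` would start at most about ten requests per second; each caller still receives its own result or error from `enqueue`.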

Requesting Higher Limits

For higher limits:

  1. Contact your account team to revisit limits as part of your engagement
  2. Email [email protected] for new deployments
  3. Open a support ticket for temporary increases
