Rate Limits
Understand and work within ZenSearch API rate limits.
Current Limits
API rate limits on ZenSearch are sized to your enterprise deployment and configured as part of your engagement. Defaults err on the side of generous; production limits — including burst caps and per-endpoint allowances — are negotiated with your team.
For questions about your current limits or to request a temporary increase, contact your account team or [email protected].
By Endpoint
| Endpoint | Limit |
|---|---|
| /search | Standard |
| /chat | Standard |
| /chat/stream | Standard |
| /agents/*/execute/stream | 50% of standard |
| Bulk operations | 10% of standard |
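As a rough illustration of the table above, the sketch below derives a per-endpoint allowance from a standard limit. The function name `effectiveLimit`, the `isBulk` flag, and the default for unlisted paths are assumptions made for this example, and the 1000 in the usage note is only a sample value; your actual standard limit is set by your deployment.

```javascript
// Percentages taken from the table above; everything else is illustrative.
const ENDPOINT_PERCENTAGES = [
  { pattern: /^\/agents\/[^/]+\/execute\/stream$/, percent: 50 },
  { pattern: /^\/(search|chat|chat\/stream)$/, percent: 100 },
];

// Returns the request allowance for `path`, given the deployment's standard limit.
// Assumption: paths not listed in the table are treated as standard.
function effectiveLimit(path, standardLimit, isBulk = false) {
  if (isBulk) {
    return Math.floor((standardLimit * 10) / 100);
  }
  const rule = ENDPOINT_PERCENTAGES.find((r) => r.pattern.test(path));
  const percent = rule ? rule.percent : 100;
  return Math.floor((standardLimit * percent) / 100);
}
```

With a sample standard limit of 1000, `effectiveLimit('/agents/my-agent/execute/stream', 1000)` yields 500 and a bulk operation yields 100.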
Rate Limit Headers
Every response includes rate limit information:
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 999
X-RateLimit-Reset: 1705766400
| Header | Description |
|---|---|
| X-RateLimit-Limit | Max requests per window |
| X-RateLimit-Remaining | Remaining requests |
| X-RateLimit-Reset | Unix timestamp when limit resets |
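The headers above can be read on the client along these lines. This is a minimal sketch: `parseRateLimitHeaders` and `msUntilReset` are hypothetical helper names, and the code assumes `headers` exposes a `get(name)` method, as the Fetch API `Headers` object does.

```javascript
// Parse the three rate limit headers into numbers and a Date.
// Assumes `headers.get(name)` returns the header value as a string.
function parseRateLimitHeaders(headers) {
  const limit = Number.parseInt(headers.get('X-RateLimit-Limit'), 10);
  const remaining = Number.parseInt(headers.get('X-RateLimit-Remaining'), 10);
  const resetSeconds = Number.parseInt(headers.get('X-RateLimit-Reset'), 10);
  return {
    limit,
    remaining,
    // X-RateLimit-Reset is a Unix timestamp in seconds; Date expects milliseconds.
    resetAt: new Date(resetSeconds * 1000),
  };
}

// Milliseconds until the window resets, never negative.
function msUntilReset(info, now = Date.now()) {
  return Math.max(0, info.resetAt.getTime() - now);
}
```

With `fetch`, this could be called as `parseRateLimitHeaders(response.headers)` after each request.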
Rate Limit Response
When you exceed your limit, the API returns this response body:
{
  "error": {
    "code": "rate_limit_exceeded",
    "message": "Rate limit exceeded. Try again in 45 seconds.",
    "retryAfter": 45
  }
}
HTTP Status: 429 Too Many Requests
Headers:
Retry-After: 45
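Because the wait time appears both in the `Retry-After` header and in the `retryAfter` field of the body, a client can pick whichever is available. The sketch below shows one way to do that. The helper name `retryDelayMs` and the 1 second fallback are assumptions, and only the seconds form of `Retry-After` shown above is handled (the HTTP specification also allows a date value).

```javascript
// Work out how long to wait after a 429, in milliseconds.
// Assumes `headers.get(name)` returns a string, as Fetch API Headers does,
// and `body` is the parsed JSON error payload (or null if unavailable).
function retryDelayMs(headers, body, fallbackMs = 1000) {
  const fromHeader = Number.parseInt(headers.get('Retry-After'), 10);
  if (Number.isFinite(fromHeader) && fromHeader >= 0) {
    return fromHeader * 1000;
  }
  const fromBody = body?.error?.retryAfter;
  if (typeof fromBody === 'number' && Number.isFinite(fromBody) && fromBody >= 0) {
    return fromBody * 1000;
  }
  // Neither source was usable; fall back to a conservative default (assumption).
  return fallbackMs;
}
```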
Handling Rate Limits
Exponential Backoff
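// Resolve after `ms` milliseconds.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Retry `fn` on HTTP 429, doubling the wait each time: 1s, 2s, 4s, ...
async function requestWithBackoff(fn, maxRetries = 5) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await fn();
    } catch (error) {
      // Rethrow anything that is not a rate limit error.
      if (error.status !== 429) throw error;
      // Skip the wait after the final attempt.
      if (i < maxRetries - 1) {
        await sleep(Math.pow(2, i) * 1000);
      }
    }
  }
  throw new Error('Max retries exceeded');
}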
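When many clients hit a limit at once, a fixed doubling schedule can make them all retry at the same moment. A common refinement is to randomize each delay ("full jitter") and cap it. The sketch below is one possible variant, not part of the ZenSearch API; `backoffDelayMs`, `baseMs`, `capMs`, and the injectable `random` are names chosen for this example.

```javascript
// Compute a randomized backoff delay for a zero-based attempt number.
// The delay is drawn uniformly from [0, min(capMs, baseMs * 2^attempt)).
function backoffDelayMs(attempt, { baseMs = 1000, capMs = 30000, random = Math.random } = {}) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(random() * ceiling);
}
```

The result can replace a fixed doubling delay inside a retry loop, for example `await sleep(backoffDelayMs(i))`, where `sleep` is any helper that resolves after the given number of milliseconds.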
Using Retry-After
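// Retry once on HTTP 429, waiting for the number of seconds in Retry-After.
async function requestWithRetry(fn) {
  try {
    return await fn();
  } catch (error) {
    if (error.status !== 429) throw error;
    // Header values are strings; fall back to 1 second (an assumed default) if missing or invalid.
    const retryAfter = Number.parseInt(error.headers?.['retry-after'], 10);
    const delayMs = Number.isFinite(retryAfter) ? retryAfter * 1000 : 1000;
    await new Promise((resolve) => setTimeout(resolve, delayMs));
    return await fn();
  }
}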
Best Practices
Optimize Requests
- Batch operations when possible
- Cache responses to reduce calls
- Use pagination efficiently
- Filter server-side instead of client-side
Monitor Usage
- Track X-RateLimit-Remaining
- Set alerts for low remaining
- Log rate limit events
- Review usage patterns
Architecture
- Implement request queuing
- Use connection pooling
- Distribute load over time
- Consider dedicated API keys per service
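One way to distribute load over time is to throttle on the client before a request is ever sent. The token bucket below is a minimal sketch of that idea; the `TokenBucket` class, its parameters, and the injectable clock are assumptions for illustration, and the capacity and refill rate should be chosen to sit below your negotiated limits.

```javascript
// Client-side token bucket: each request spends one token, and tokens
// refill continuously at `refillPerSecond` up to `capacity`.
class TokenBucket {
  constructor(capacity, refillPerSecond, now = Date.now) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.now = now; // injectable clock, returns milliseconds
    this.tokens = capacity;
    this.lastRefill = now();
  }

  // Add the tokens earned since the last refill, without exceeding capacity.
  refill() {
    const current = this.now();
    const elapsedSeconds = (current - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefill = current;
  }

  // Returns true and spends a token if a request may be sent now.
  tryRemove() {
    this.refill();
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

A caller would check `bucket.tryRemove()` before each request and queue or delay the request when it returns `false`. Giving each service its own bucket pairs naturally with dedicated API keys per service.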
Requesting Higher Limits
For higher limits:
- Contact your account team to revisit limits as part of your engagement
- Email [email protected] for new deployments
- Open a support ticket for temporary increases
Next Steps
- Authentication - API key setup
- Errors - Error handling