Rate Limiting
The CLE Engine API uses rate limiting to ensure fair usage and service stability. This guide explains how rate limits work and how to handle them.
Overview
Rate limiting is applied at two levels:
- Monthly Quota: Total requests allowed per billing period
- Per-Minute Rate: Maximum requests per minute to prevent abuse
Pricing Tiers
| Tier | Monthly Quota | Rate Limit | Use Case |
|---|---|---|---|
| Free | 100 requests | 10/minute | Testing and evaluation |
| Starter | 1,000 requests | 60/minute | Small law firms, solo practitioners |
| Professional | 10,000 requests | 120/minute | Mid-size firms, CLE providers |
| Enterprise | Unlimited | 300/minute | Large organizations, custom integrations |
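The two limits in the table interact: at the maximum per-minute rate, a monthly quota can be used up very quickly. The sketch below encodes the table as data and works out that arithmetic. The names `Tier`, `TIERS`, `min_interval_seconds`, and `minutes_to_exhaust_quota` are illustrative and not part of the API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    monthly_quota: Optional[int]  # None means unlimited
    per_minute: int


# Values copied from the pricing table above.
TIERS = {
    "free": Tier(100, 10),
    "starter": Tier(1_000, 60),
    "professional": Tier(10_000, 120),
    "enterprise": Tier(None, 300),
}


def min_interval_seconds(tier_name: str) -> float:
    """Even spacing between requests that stays within the per-minute limit."""
    return 60.0 / TIERS[tier_name].per_minute


def minutes_to_exhaust_quota(tier_name: str) -> Optional[float]:
    """Minutes of sustained maximum-rate traffic needed to use the whole monthly quota."""
    tier = TIERS[tier_name]
    if tier.monthly_quota is None:
        return None  # Unlimited quota never runs out
    return tier.monthly_quota / tier.per_minute
```

For example, a Free tier key sending 10 requests per minute would use its 100-request quota in 10 minutes, so pacing against the monthly quota usually matters more than pacing against the per-minute limit.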
How It Works
Monthly Quota
- Quota resets on the first day of each month at 00:00 UTC
- Each successful API call counts as one request toward your quota
- Failed requests (4xx/5xx) also count toward quota
- Unused quota does not roll over to the next month
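Since the quota resets on the first day of each month at 00:00 UTC, the next reset time can be computed locally. This is a minimal sketch; `next_quota_reset` is a hypothetical helper name, and it expects a timezone-aware `datetime`.

```python
from datetime import datetime, timezone


def next_quota_reset(now: datetime) -> datetime:
    """Return the first day of the following month at 00:00 UTC.

    `now` should be timezone-aware; a naive value would be interpreted
    as local time by astimezone().
    """
    now_utc = now.astimezone(timezone.utc)
    if now_utc.month == 12:
        # December rolls over into January of the next year
        return datetime(now_utc.year + 1, 1, 1, tzinfo=timezone.utc)
    return datetime(now_utc.year, now_utc.month + 1, 1, tzinfo=timezone.utc)
```

Calling `next_quota_reset(datetime.now(timezone.utc))` gives the upcoming reset, which can be compared against the reset timestamp the API reports.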
Per-Minute Rate Limiting
- Measured using a sliding window algorithm
- Requests beyond the limit return 429 Too Many Requests
- Wait at least 1 minute before retrying after hitting the rate limit
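To avoid hitting the per-minute limit in the first place, a client can keep its own sliding window of recent request times. The sketch below uses a sliding log of timestamps; the server's exact algorithm is not documented here, so treat this as an approximation. `SlidingWindowLimiter` and its methods are illustrative names.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Client-side limiter that allows at most max_requests per window."""

    def __init__(self, max_requests, window_seconds=60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock  # Injectable so the limiter can be tested without sleeping
        self._timestamps = deque()

    def _evict(self, now):
        # Drop timestamps that have aged out of the window
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()

    def try_acquire(self):
        """Record a request and return True if the window has room, else return False."""
        now = self._clock()
        self._evict(now)
        if len(self._timestamps) < self.max_requests:
            self._timestamps.append(now)
            return True
        return False

    def seconds_until_available(self):
        """Return how long to wait before the next request would be allowed."""
        now = self._clock()
        self._evict(now)
        if len(self._timestamps) < self.max_requests:
            return 0.0
        return self.window_seconds - (now - self._timestamps[0])
```

A caller would check `try_acquire()` before each request and sleep for `seconds_until_available()` when it returns `False`.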
Response Headers
All API responses include rate limit information:
HTTP/1.1 200 OK
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 766
X-RateLimit-Reset: 1738368000
| Header | Description |
|---|---|
| X-RateLimit-Limit | Your tier's monthly quota |
| X-RateLimit-Remaining | Requests remaining this month |
| X-RateLimit-Reset | Unix timestamp when quota resets |
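The three headers can be parsed into a small structure so the rest of your code works with typed values. This is a sketch: `QuotaStatus` and `parse_rate_limit_headers` are hypothetical names, and it assumes all three headers are present with integer values (how Enterprise's unlimited quota is reported is not covered here).

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Mapping


@dataclass(frozen=True)
class QuotaStatus:
    limit: int
    remaining: int
    resets_at: datetime

    @property
    def used(self) -> int:
        return self.limit - self.remaining


def parse_rate_limit_headers(headers: Mapping[str, str]) -> QuotaStatus:
    """Convert the X-RateLimit-* headers into a QuotaStatus."""
    return QuotaStatus(
        limit=int(headers["X-RateLimit-Limit"]),
        remaining=int(headers["X-RateLimit-Remaining"]),
        # The reset header is a Unix timestamp, so interpret it as UTC
        resets_at=datetime.fromtimestamp(int(headers["X-RateLimit-Reset"]), tz=timezone.utc),
    )
```

With the `requests` library, `response.headers` can be passed in directly because it is a case-insensitive mapping; a plain `dict` needs the exact header casing shown above.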
Quota Exceeded Response
When you exceed your monthly quota:
HTTP/1.1 429 Too Many Requests
Content-Type: application/json

{
  "detail": "quota exceeded"
}
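Because both limits answer with 429, a client may want to tell them apart: retrying helps with a per-minute limit but not with an exhausted monthly quota. The sketch below keys off the `detail` value shown above. Treating every other 429 as a per-minute limit is an assumption, since the body of that response is not documented here; `classify_429` is a hypothetical helper name.

```python
import json


def classify_429(status_code: int, body_text: str) -> str:
    """Return 'quota_exceeded', 'rate_limited', or 'not_limited'."""
    if status_code != 429:
        return "not_limited"
    try:
        detail = json.loads(body_text).get("detail", "")
    except (ValueError, AttributeError):
        # Body was empty, not valid JSON, or not a JSON object
        detail = ""
    if detail == "quota exceeded":
        return "quota_exceeded"
    # Assumption: any other 429 comes from the per-minute rate limit
    return "rate_limited"
```

A retry loop could then back off only for `rate_limited` and stop immediately on `quota_exceeded`.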
Best Practices
1. Monitor Your Usage
Check the rate limit headers on each response to track remaining quota:
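import requests
# url, headers, and data are your endpoint, auth headers, and JSON payload
response = requests.post(url, headers=headers, json=data)
remaining = int(response.headers.get("X-RateLimit-Remaining", 0))
# Pick a warning threshold that suits your tier's monthly quota
if remaining < 100:
    print(f"Warning: Only {remaining} requests remaining this month")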
2. Implement Exponential Backoff
When you receive a 429 response, wait before retrying:
import time
import requests

def make_request_with_backoff(url, headers, data, max_retries=5):
    for attempt in range(max_retries):
        response = requests.post(url, headers=headers, json=data)
        if response.status_code == 429:
            # Double the wait on each attempt (1s, 2s, 4s, ...), capped at 60 seconds
            wait_time = min(2 ** attempt, 60)
            print(f"Rate limited. Waiting {wait_time}s before retry...")
            time.sleep(wait_time)
            continue
        response.raise_for_status()  # Raise on any other 4xx/5xx response
        return response.json()
    # Retrying does not help once the monthly quota is exhausted
    raise Exception("Max retries exceeded")
3. Batch Your Requests
If you need to check multiple jurisdictions, space out your requests:
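import time
RATE_LIMIT_PER_MINUTE = 60  # Your tier's per-minute limit (Starter shown)
DELAY_SECONDS = 60 / RATE_LIMIT_PER_MINUTE  # Even spacing that stays within the limit
jurisdictions = ["CA", "NY", "TX", "FL", "IL"]
for jurisdiction in jurisdictions:
    result = api.compute(jurisdiction=jurisdiction)  # api is your CLE Engine client
    process_result(result)
    time.sleep(DELAY_SECONDS)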
4. Cache Results
CLE deadlines don't change frequently. Cache results to reduce API calls:
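import json
import os
from datetime import datetime, timedelta

CACHE_FILE = "cle_cache.json"
CACHE_TTL = timedelta(days=1)

def load_cache():
    # Return the cached entries, or an empty dict if no cache file exists yet
    if not os.path.exists(CACHE_FILE):
        return {}
    with open(CACHE_FILE) as f:
        return json.load(f)

def save_cache(cache):
    # Results must be JSON-serializable to be stored
    with open(CACHE_FILE, "w") as f:
        json.dump(cache, f)

def get_cle_deadline(jurisdiction, **kwargs):
    cache = load_cache()
    cache_key = f"{jurisdiction}:{json.dumps(kwargs, sort_keys=True)}"
    # Check cache and return the entry if it has not expired
    if cache_key in cache:
        cached = cache[cache_key]
        if datetime.fromisoformat(cached["expires"]) > datetime.now():
            return cached["data"]
    # Make API request (api is your CLE Engine client)
    result = api.compute(jurisdiction=jurisdiction, **kwargs)
    # Cache result with its expiry time
    cache[cache_key] = {
        "data": result,
        "expires": (datetime.now() + CACHE_TTL).isoformat()
    }
    save_cache(cache)
    return result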
5. Plan Bulk Operations
For large-scale operations, consider:
- Spreading requests across multiple days
- Using off-peak hours (nights/weekends)
- Contacting support for bulk data exports
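Spreading a bulk job across days comes down to simple budgeting against the remaining quota. A minimal sketch, with hypothetical helper names `daily_budget` and `days_needed`:

```python
import math


def daily_budget(remaining_quota: int, days_until_reset: int) -> int:
    """Requests per day that use the remaining quota evenly until the reset."""
    if days_until_reset < 1:
        raise ValueError("days_until_reset must be at least 1")
    return remaining_quota // days_until_reset


def days_needed(total_requests: int, per_day: int) -> int:
    """Days required to send total_requests at per_day requests each day."""
    if per_day < 1:
        raise ValueError("per_day must be at least 1")
    return math.ceil(total_requests / per_day)
```

For instance, with 766 requests remaining and 10 days until the reset, the budget is 76 requests per day, so a 500-request job would take 7 days at that pace.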
Monitoring Usage
Dashboard
View your current usage in the API Dashboard:
- Current month's request count
- Historical usage trends
- Per-endpoint breakdown
- Remaining quota
Usage API (Coming Soon)
GET /v1/usage
X-API-Key: cle_your_api_key_here
Response:
{
  "tier": "starter",
  "monthly_limit": 1000,
  "current_usage": 234,
  "remaining": 766,
  "reset_date": "2026-02-01T00:00:00Z",
  "usage_by_endpoint": {
    "/v1/compute": 200,
    "/v1/stats": 34
  }
}
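Once the endpoint is available, a payload of this shape could be summarized and sanity-checked as sketched below. The endpoint is not released yet, so the field names are taken from the sample above and may change; `summarize_usage` is a hypothetical helper name.

```python
def summarize_usage(payload: dict) -> dict:
    """Compute the percentage of quota used and check the per-endpoint totals."""
    limit = payload["monthly_limit"]
    used = payload["current_usage"]
    endpoint_total = sum(payload.get("usage_by_endpoint", {}).values())
    return {
        "percent_used": round(100 * used / limit, 1),
        # True when the per-endpoint counts add up to the reported total
        "endpoints_match_total": endpoint_total == used,
        # True when usage plus remaining equals the monthly limit
        "remaining_matches": used + payload["remaining"] == limit,
    }
```

The two boolean checks are optional; they only flag a payload whose totals do not add up.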
Upgrading Your Plan
If you consistently hit rate limits, consider upgrading:
- Log in to cle-engine.com
- Navigate to Settings > Billing
- Select a higher tier
- Your new quota takes effect immediately
Enterprise Options
For high-volume users, we offer:
- Custom quotas: Tailored to your usage patterns
- Dedicated rate limits: Higher per-minute limits
- Priority support: Direct engineering access
- SLA guarantees: Uptime and response time commitments
- Bulk data exports: Download full jurisdiction datasets
Contact enterprise@cle-engine.com for details.
FAQs
Q: What happens when I exceed my quota?
Your API calls will return 429 Too Many Requests until the quota resets on the first of the next month. Upgrade your plan for immediate access.
Q: Do failed requests count toward my quota?
Yes, all requests that reach our servers count toward your quota, regardless of response status.
Q: Can I purchase additional quota for a single month?
Contact support for one-time quota increases. We recommend upgrading tiers for sustained higher usage.
Q: How do I know when my quota resets?
Check the X-RateLimit-Reset header for the Unix timestamp, or view the reset date in your dashboard.
Q: Is there a way to test without using my quota?
The /health endpoint doesn't require authentication and doesn't count toward quota. For /v1/compute testing, the Free tier provides 100 requests per month.
Q: Can multiple team members share one API key?
Yes, but all requests from a single key share the same quota. Enterprise plans offer multiple keys with independent tracking.