How to Fix HTTP 429 Too Many Requests Error
Quick Fix Summary
TL;DR: Immediately implement exponential backoff with jitter in your client code, and honor the server's rate-limit headers.
Diagnosis & Causes
HTTP 429 indicates that your client has exceeded the rate limit set by the server or API. It is a protective mechanism to prevent abuse and ensure service stability.
Recovery Steps
Step 1: Implement Exponential Backoff with Jitter
Immediately modify your client's retry logic to respect rate limits. Exponential backoff increases wait time between retries, and jitter adds randomness to prevent thundering herds.
```python
import random
import time

import requests

def make_request_with_backoff(url, max_retries=5):
    base_delay = 1  # seconds
    for attempt in range(max_retries):
        try:
            response = requests.get(url)
        except requests.RequestException:
            # Transient network error: exponential backoff with jitter.
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 1))
            continue
        if response.status_code == 429:
            # Prefer the server's Retry-After (assumed here to be in seconds),
            # falling back to exponential backoff, plus up to 10% jitter.
            retry_after = int(response.headers.get('Retry-After', base_delay * (2 ** attempt)))
            jitter = random.uniform(0, 0.1 * retry_after)
            time.sleep(retry_after + jitter)
            continue
        return response
    raise RuntimeError('Max retries exceeded')
```

Step 2: Inspect and Respect Rate Limit Headers
Parse the server's response headers to understand your current quota and adjust request pacing dynamically.
```python
import requests

response = requests.get('https://api.example.com/data')
print(f"Status: {response.status_code}")
print(f"Rate Limit: {response.headers.get('X-RateLimit-Limit')}")
print(f"Remaining: {response.headers.get('X-RateLimit-Remaining')}")
print(f"Reset Time: {response.headers.get('X-RateLimit-Reset')}")
print(f"Retry After: {response.headers.get('Retry-After')}")
```

Step 3: Implement Client-Side Request Throttling
Use a token bucket or leaky bucket algorithm to pace outgoing requests and stay within limits before the server rejects them.
```python
from threading import Lock, Semaphore, Timer

class RateLimiter:
    """Allow at most `rate` acquisitions per `per`-second window."""

    def __init__(self, rate, per):
        self.rate = rate  # max requests per window
        self.per = per    # window length in seconds
        self.semaphore = Semaphore(rate)
        self.lock = Lock()
        self.timer = None
        self.acquired = 0

    def acquire(self):
        self.semaphore.acquire()  # blocks once the window's quota is spent
        with self.lock:
            self.acquired += 1
            if self.timer is None:
                # Start the window on first use; permits are restored when it ends.
                self.timer = Timer(self.per, self._reset)
                self.timer.daemon = True
                self.timer.start()

    def _reset(self):
        with self.lock:
            # Release only the permits actually consumed this window,
            # so the semaphore never exceeds its original count.
            for _ in range(self.acquired):
                self.semaphore.release()
            self.acquired = 0
            self.timer = None

limiter = RateLimiter(100, 60)  # 100 requests per minute
limiter.acquire()
# Make your request here
```

Step 4: Cache Frequent Responses
Reduce the number of live API calls by implementing a caching layer for idempotent GET requests.
```python
from functools import lru_cache

import requests

@lru_cache(maxsize=128)
def get_cached_data(endpoint, params=None):
    """Cache GET responses to avoid repeated calls against the rate limit.

    Note: lru_cache requires hashable arguments, so pass params as a
    tuple of (key, value) pairs rather than a dict.
    """
    return requests.get(endpoint, params=params).json()
```

Step 5: Batch Requests Where Possible
If the API supports it, combine multiple logical operations into a single HTTP request to reduce call volume.
```python
import requests

# Instead of N requests for N items...
# response_1 = api.get_item(1)
# response_2 = api.get_item(2)
# ...
# ...use a batch endpoint in a single request.
batch_ids = [1, 2, 3, 4, 5]
response = requests.post('https://api.example.com/batch', json={'ids': batch_ids})
all_items = response.json()
```

Step 6: Scale or Distribute Request Sources
If limits are per IP or API key, distribute traffic across multiple IPs (using a proxy pool) or rotate API keys.
```python
import itertools

import requests

API_KEYS = ['key1', 'key2', 'key3']
key_cycle = itertools.cycle(API_KEYS)

def make_request_with_rotation(url):
    # Round-robin across keys so no single key exhausts its quota.
    current_key = next(key_cycle)
    headers = {'Authorization': f'Bearer {current_key}'}
    return requests.get(url, headers=headers)
```

Architect's Pro Tip
"Always design clients to treat 429 as a normal, expected response—not an error. Log it as INFO, not ERROR, to avoid alert noise and trigger your backoff logic silently."
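The tip above can be sketched with Python's standard logging module; the logger name and the `handle_response` helper are illustrative assumptions, not part of any particular client library:

```python
import logging

logger = logging.getLogger("api_client")  # illustrative logger name
logging.basicConfig(level=logging.INFO)

def handle_response(response):
    """Treat 429 as an expected, retryable outcome rather than a fault."""
    if response.status_code == 429:
        # INFO, not ERROR: rate limiting is normal operation, so it
        # should not trip error-rate alerts.
        logger.info("Rate limited; backing off (Retry-After=%s)",
                    response.headers.get("Retry-After"))
        return None  # signal the caller to retry after backoff
    if response.status_code >= 500:
        logger.error("Server error %s", response.status_code)
        return None
    return response
```

The caller interprets `None` as "retry later", keeping 429 handling out of the error path entirely.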
Frequently Asked Questions
What's the difference between HTTP 429 and 503?
429 means *your client* is sending too many requests. 503 means *the server* is overloaded or down, often due to aggregate traffic from all clients.
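That distinction matters for retry strategy; a hypothetical classification helper (the function name and category strings are assumptions for illustration):

```python
def classify_status(status_code):
    """Map a status code to a coarse retry strategy (illustrative categories)."""
    if status_code == 429:
        return "client_throttled"   # you exceeded your quota: slow down
    if status_code == 503:
        return "server_overloaded"  # server-side trouble: back off, maybe fail over
    if 200 <= status_code < 300:
        return "ok"
    return "other"
```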
Should I use 'Retry-After' header or implement my own backoff?
Always prefer the server-provided 'Retry-After' header if present. If absent, fall back to your exponential backoff strategy with jitter.
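A sketch of that fallback order (the helper name is an assumption). Note that per RFC 9110, Retry-After may be either delta-seconds or an HTTP-date, so both forms are handled:

```python
import random
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def retry_delay(headers, attempt, base_delay=1.0):
    """Seconds to wait: prefer Retry-After, else exponential backoff with jitter."""
    value = headers.get("Retry-After")
    if value is not None:
        try:
            return float(value)  # delta-seconds form, e.g. "120"
        except ValueError:
            pass
        try:
            dt = parsedate_to_datetime(value)  # HTTP-date form
            return max(0.0, (dt - datetime.now(timezone.utc)).total_seconds())
        except (TypeError, ValueError):
            pass  # unparseable header: fall through to backoff
    return base_delay * (2 ** attempt) + random.uniform(0, 1)
```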
How do I know if the rate limit is per user, IP, or API key?
Check the API documentation. If unspecified, test by making requests from different IPs or with different credentials while monitoring the rate limit headers.