
Rate Limits

XRNotify enforces rate limits on API requests per key using a token bucket algorithm. Limits vary by plan; paid plans are sized for high-throughput integrations.

Overview

Rate limits apply per API key, not per account. If you have multiple API keys, each operates its own independent token bucket. All API endpoints count toward the same rate limit bucket for a given key.

Webhook deliveries are not counted. Inbound event deliveries from XRNotify to your endpoint do not consume API rate limit tokens. Only your outbound requests to the XRNotify API are counted.

Limits by plan

Plan         Requests / minute   Requests / day   Burst capacity
Free         60                  1,000            20
Starter      300                 10,000           50
Pro          1,000               100,000          200
Enterprise   Custom              Unlimited        Custom

Token bucket algorithm

XRNotify uses a token bucket algorithm for rate limiting. Each API key starts with a bucket of tokens equal to its plan's burst capacity. Every API request consumes one token from the bucket. The bucket refills continuously at a rate of (requests_per_minute / 60) tokens per second, up to the burst capacity ceiling.

For example, a Pro plan key refills at roughly 16.7 tokens/second and can absorb a burst of 200 requests before the rate limit is enforced. This means short bursts of activity are handled gracefully without triggering 429 errors, as long as the average request rate over time stays within the per-minute limit.

# Token bucket refill rate formula:
refill_rate = requests_per_minute / 60   # tokens per second

# Example for Pro plan:
refill_rate = 1000 / 60      # ≈ 16.7 tokens/second
burst_capacity = 200         # tokens

# A burst of 200 requests is absorbed instantly.
# After the burst, the bucket refills at 16.7/s.
# Sustained rate above 1000 req/min will trigger 429.
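The refill behavior above can be simulated with a minimal token bucket. This is an illustrative sketch of the algorithm as described, not XRNotify's actual implementation; the plan numbers come from the table above.

```javascript
// Minimal token bucket simulation (illustrative sketch only).
class TokenBucket {
  constructor(ratePerMinute, burstCapacity, now = Date.now()) {
    this.capacity = burstCapacity;
    this.tokens = burstCapacity;              // bucket starts full
    this.refillPerSec = ratePerMinute / 60;   // e.g. Pro: 1000 / 60 ≈ 16.7
    this.lastRefill = now;                    // ms timestamp of last refill
  }

  refill(now) {
    const elapsedSec = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSec * this.refillPerSec
    );
    this.lastRefill = now;
  }

  tryConsume(now = Date.now()) {
    this.refill(now);
    if (this.tokens >= 1) {
      this.tokens -= 1;   // request allowed: one token spent
      return true;
    }
    return false;         // bucket empty: this request would get a 429
  }
}

// Pro plan: 1,000 req/min, burst capacity 200.
// Firing 250 requests at the same instant: only the 200-token burst succeeds.
const bucket = new TokenBucket(1000, 200, 0);
let allowed = 0;
for (let i = 0; i < 250; i++) {
  if (bucket.tryConsume(0)) allowed++;
}
console.log(allowed); // 200
```

One second after the burst, the bucket has refilled roughly 16.7 tokens, so the next second of traffic can only sustain the per-minute rate, exactly as the formula predicts.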

Rate limit headers

Every API response includes the following headers so you can monitor your usage and slow down proactively before hitting the limit:

Header                  Description
X-RateLimit-Limit       Your plan's requests-per-minute limit.
X-RateLimit-Remaining   Number of tokens remaining in the current bucket.
X-RateLimit-Reset       Unix timestamp when the bucket will be full again.
Retry-After             Seconds to wait before retrying. Only present on 429 responses.
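A small helper can parse these headers and flag when usage is approaching the limit. This is a sketch; the 20% threshold is an arbitrary illustration, not an XRNotify recommendation.

```javascript
// Parse rate limit headers from a Response (or any object with a
// Headers-like .get()) and decide whether to proactively slow down.
function rateLimitStatus(headers, threshold = 0.2) {
  const limit = parseInt(headers.get('X-RateLimit-Limit') || '0', 10);
  const remaining = parseInt(headers.get('X-RateLimit-Remaining') || '0', 10);
  const reset = parseInt(headers.get('X-RateLimit-Reset') || '0', 10);
  return {
    limit,
    remaining,
    resetAt: new Date(reset * 1000),              // when the bucket is full again
    shouldSlowDown: limit > 0 && remaining / limit < threshold,
  };
}
```

After each call you would check `rateLimitStatus(response.headers).shouldSlowDown` and insert a delay before the next request when it is true.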

Handling 429 responses

When you receive a 429 response, read the Retry-After header and wait that many seconds before retrying. The following example implements a simple retry helper that respects the server-provided delay:

async function apiCall(url, options, retries = 3) {
  const response = await fetch(url, options);

  if (response.status === 429 && retries > 0) {
    const retryAfter = parseInt(response.headers.get('Retry-After') || '5', 10);
    console.warn(`Rate limited. Waiting ${retryAfter}s before retry...`);
    await new Promise(resolve => setTimeout(resolve, retryAfter * 1000));
    return apiCall(url, options, retries - 1);
  }

  if (!response.ok) {
    const error = await response.json();
    throw new Error(`API error: ${error.error.code} — ${error.error.message}`);
  }

  return response.json();
}

// Usage:
const webhooks = await apiCall('https://api.xrnotify.io/v1/webhooks', {
  headers: { 'X-XRNotify-Key': process.env.XRNOTIFY_API_KEY },
});

Exponential backoff for persistent 429s

If you're consistently hitting the rate limit, use exponential backoff with jitter to spread retries over time and avoid a retry thundering herd:

async function apiCallWithBackoff(url, options, attempt = 0) {
  const MAX_ATTEMPTS = 5;
  const response = await fetch(url, options);

  if (response.status === 429 && attempt < MAX_ATTEMPTS) {
    // Use server's Retry-After if available, otherwise exponential backoff
    const serverDelay = parseInt(response.headers.get('Retry-After') || '0', 10);
    const exponential = Math.pow(2, attempt) * 1000; // 1s, 2s, 4s, 8s, 16s
    const jitter = Math.random() * 500;              // up to 500ms jitter
    const delay = Math.max(serverDelay * 1000, exponential + jitter);

    await new Promise(resolve => setTimeout(resolve, delay));
    return apiCallWithBackoff(url, options, attempt + 1);
  }

  return response;
}

Best practices

  • Cache GET responses. Webhook lists, delivery logs, and stats do not change on every request. Cache them for 30–60 seconds to dramatically reduce API call volume.
  • Batch operations. When creating multiple webhooks or retrying multiple deliveries, stagger your requests rather than firing them all simultaneously. A short sleep between requests protects your burst budget.
  • Monitor X-RateLimit-Remaining. When the remaining token count drops below 20% of your limit, slow your request rate proactively — before you hit 0 and start receiving 429 errors.
  • Use the Retry-After header. Always respect the server-provided delay. Using a fixed backoff shorter than the server's window will result in wasted requests that still fail.
  • Consider upgrading your plan. If you consistently need more than 60 requests/minute, the Starter or Pro plan may be more appropriate. Contact us for Enterprise limits.
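The caching advice above can be sketched with a tiny in-memory TTL cache. This is illustrative only; `fetcher` stands in for whatever fetch helper you already use (such as the `apiCall` example earlier), and the `now` parameter exists to make the sketch testable.

```javascript
// Tiny in-memory TTL cache for GET responses (sketch, not production code).
const cache = new Map();

async function cachedGet(url, fetcher, ttlMs = 30_000, now = Date.now()) {
  const hit = cache.get(url);
  if (hit && now - hit.storedAt < ttlMs) {
    return hit.value;                 // fresh: no API call, no token spent
  }
  const value = await fetcher(url);   // stale or missing: one real request
  cache.set(url, { value, storedAt: now });
  return value;
}
```

With a 30-second TTL, a dashboard polling the webhook list every few seconds spends one token per 30 seconds instead of one per poll.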

Next steps