Handling rate limiting in JavaScript

Reading Time: 8 minutes

Most, if not all, REST APIs enforce rate limits to ensure reliability and performance for consumers. However, handling rate limiting responses is often overlooked or implemented in an ad-hoc manner, which leads to poor user experiences and reliability issues. Developers need a solution that is robust, flexible and easy to adopt.

Whilst the Jira Cloud rate limiting and Confluence Cloud rate limiting guides both have pseudo code explaining how to handle rate limiting responses, this article provides more concrete guidance for a standard Node.js service. The article provides code that you can drop into a Jira Cloud or Confluence Cloud Connect app, or indeed any app interacting with other REST APIs. The code will need adaptation for use in Forge apps, where the runtime limits execution duration and therefore changes the way in which retries must be attempted.

Sample Code

Throughout this article, you'll find code snippets for a JavaScript utility that can be used for making REST API invocations with support for handling rate limiting responses by applying retry policies.

The code style is meant to be as simple and self-documenting as possible, with a focus on reusability. I've used TypeScript since it generally results in more robust and error-free code than plain JavaScript, and the types convey useful information to readers. The code is available in an open source GitHub repository.

Execution contexts

Most applications that call REST APIs will make calls against a range of endpoints and from multiple execution contexts such as:

  • UI context: A user interaction where a response to a user is dependent on the results of the REST API call.
  • Webhook context: A notification from the product or service owning the REST API.
  • External event context: A notification from a third party event or service.
  • Cron context: A notification from a cron-like trigger service.

Configuring rate limit handling

The appropriate method of handling a rate-limited REST API response depends on how and why the API is being called. In UI contexts, the user will most likely expect the information within a reasonable period. This limits the amount of delay and the number of retries that should be attempted, although mechanisms such as spinners can be employed to soften the wait. On the other hand, in webhook and cron contexts, more generous retry delays and retry attempts may be appropriate. Appropriate delays and attempt counts are difficult to generalise for external event contexts, since these may be invoked in the context of a third-party UI or as background processing.

To handle the range of possible execution contexts, our rate limit handling utility needs to be configurable. The following interface defines the configuration of the rate limiting handling:

export interface RateLimitingHandlingOptions {
  /**
   * The maximum number of retries after the initial fetch attempt.
   */
  maxRetries: number;
  /**
   * The maximum delay in milliseconds before any fetch retry.
   */
  maxRetryDelayMillis: number;
  /**
   * A multiplier value to apply against the previous retry delay.
   */
  backoffMultiplier: number;
  /**
   * The initial retry delay in milliseconds.
   */
  initialRetryDelayMillis: number;
  /**
   * The maximum multiplier to jitter retry delays with. This must be greater than 1 since
   * the minimum jitter multiplier is 1 due to not wanting to jitter below the delay
   * instructed in any response headers.
   */
  maxJitterMultiplier: number;
}

Based on this, we can define a constant with appropriate values for handling rate limiting when the execution context does not need to return information to a user in a timely fashion:

export const nonUiContextRateLimitingHandlingOptionsDefaults: RateLimitingHandlingOptions = {
  maxRetries: 2,
  maxRetryDelayMillis: 60000,
  backoffMultiplier: 2,
  initialRetryDelayMillis: 5000,
  maxJitterMultiplier: 1.3,
};
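For UI contexts, where a user is actively waiting on the result, a tighter configuration is appropriate. The following values are illustrative assumptions rather than constants from the library:

export const uiContextRateLimitingHandlingOptionsDefaults: RateLimitingHandlingOptions = {
  // Illustrative values only: keep the worst-case wait short for a waiting user.
  maxRetries: 1,
  maxRetryDelayMillis: 2000,
  backoffMultiplier: 2,
  initialRetryDelayMillis: 500,
  maxJitterMultiplier: 1.3,
};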

Creating a fetch utility

Our implementation starts by creating a class that takes the configuration as a constructor parameter:

export default class RateLimitingFetch {

  private options: RateLimitingHandlingOptions;

  constructor(options: RateLimitingHandlingOptions) {
    this.options = options;
  }
}

We then add a function to the class with the same signature as the node-fetch fetch function:

fetch = async (url: RequestInfo, init?: RequestInit): Promise<Response> => {
}

Managing retry state

Since we know our fetch operation will need to perform retries with delays, this fetch operation will delegate to another function that is re-entrant and will carry over the retry state:

public fetch = async (url: RequestInfo, init?: RequestInit): Promise<Response> => {
  return await this._fetch(this.options.maxRetries, 0, url, init);
}

private _fetch = async (remainingRetries: number, lastRetryDelayMillis: number, url: RequestInfo, init?: RequestInit): Promise<Response> => {
  // ...attempt the fetch and, if a retryable response is returned, compute
  // thisDelayMillis (covered below), wait, then re-enter with one less retry:
  return await this._fetch(remainingRetries - 1, thisDelayMillis, url, init);
}
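The retry path also needs a way to wait before re-entering. The snippets that follow assume a minimal promise-based sleep helper added as a private method of the class:

private _sleep = async (delayMillis: number): Promise<void> => {
  return new Promise<void>(resolve => setTimeout(resolve, delayMillis));
}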

Detecting which responses to retry

The pseudo code in the Jira rate limiting guide provides a good starting point to detect when it is appropriate to retry.

Although we have so far only discussed handling rate limiting responses, we should be mindful that it is also appropriate to retry some kinds of server errors. Our utility will retry 429 Too Many Requests, 500 Internal Server Error and 503 Service Unavailable:

const tooManyRequestsStatusCode = 429;
const internalServerErrorStatusCode = 500;
const serviceUnavailableStatusCode = 503;

const response = await fetch(url, init);
const statusCode = response.status;
const responseNeedsRetry =
  statusCode === tooManyRequestsStatusCode ||
  statusCode === internalServerErrorStatusCode ||
  statusCode === serviceUnavailableStatusCode;
if (responseNeedsRetry) {
  // TODO: retry (covered in the following sections)
} else {
  return response;
}

Rate limiting response headers

Different REST APIs employ different types of headers in rate limiting responses; however, the most common is the Retry-After header. Its value indicates the number of seconds that the requester should delay before re-attempting the REST API request.

Note: To keep our utility simple, it will only use the retry information from Retry-After headers when available. If the API you are calling provides different rate limiting response headers, you may want to modify the code.

Computing retry delays

The retry delay can be retrieved from the Retry-After header if available, otherwise an exponential backoff algorithm is used where previous delays are multiplied by the backoffMultiplier value in the configuration options.

const retryAfterHeader = response.headers.get('Retry-After');
let unjitteredRetryDelayMillis = -1;
if (retryAfterHeader) {
  // Note: parseInt does not throw on bad input; it returns NaN, so check for
  // that rather than relying on a try/catch.
  const retryAfterSeconds = parseInt(retryAfterHeader.trim(), 10);
  if (isNaN(retryAfterSeconds)) {
    console.warn(`Unable to parse Retry-After header: ${retryAfterHeader}`);
  } else {
    unjitteredRetryDelayMillis = 1000 * retryAfterSeconds;
  }
} else if (lastRetryDelayMillis > 0) {
  unjitteredRetryDelayMillis = Math.min(
    this.options.backoffMultiplier * lastRetryDelayMillis,
    this.options.maxRetryDelayMillis);
} else {
  unjitteredRetryDelayMillis = this.options.initialRetryDelayMillis;
}
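For example, with the non-UI defaults defined earlier and no Retry-After header present, the unjittered delays would be 5000 milliseconds for the first retry and 10000 milliseconds for the second.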

Jittering retry delays

It is good practice to jitter retries to avoid the thundering herd problem:

// The minimum jitter multiplier is 1 so that we never delay for less than
// the delay instructed by any response headers.
const jitterMultiplier = this._randomInRange(1, this.options.maxJitterMultiplier);
const retryDelayMillis = Math.round(unjitteredRetryDelayMillis * jitterMultiplier);
...
private _randomInRange = (min: number, max: number): number => {
  return min + Math.random() * (max - min);
}
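To show how these snippets fit together, here is a sketch of the full _fetch body. It is simplified relative to the repository code, and it assumes the hypothetical helper _computeUnjitteredRetryDelayMillis (wrapping the delay computation above) as well as the _sleep helper introduced earlier:

private _fetch = async (remainingRetries: number, lastRetryDelayMillis: number, url: RequestInfo, init?: RequestInit): Promise<Response> => {
  const response = await fetch(url, init);
  const statusCode = response.status;
  const responseNeedsRetry =
    statusCode === tooManyRequestsStatusCode ||
    statusCode === internalServerErrorStatusCode ||
    statusCode === serviceUnavailableStatusCode;
  if (!responseNeedsRetry || remainingRetries <= 0) {
    // Either the response is fine or we have exhausted our retries.
    return response;
  }
  // Hypothetical helper wrapping the Retry-After / backoff logic shown above...
  const unjitteredRetryDelayMillis = this._computeUnjitteredRetryDelayMillis(response, lastRetryDelayMillis);
  if (unjitteredRetryDelayMillis < 0) {
    // The delay could not be determined (e.g. an unparseable Retry-After header).
    return response;
  }
  // Jitter the delay, wait, then re-enter with one less retry remaining...
  const jitterMultiplier = this._randomInRange(1, this.options.maxJitterMultiplier);
  const retryDelayMillis = Math.round(unjitteredRetryDelayMillis * jitterMultiplier);
  await this._sleep(retryDelayMillis);
  return await this._fetch(remainingRetries - 1, unjitteredRetryDelayMillis, url, init);
}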

Using the utility

Using the same function signature makes it easy for existing code using node-fetch to be migrated to our utility:

import RateLimitingFetch, { nonUiContextRateLimitingHandlingOptionsDefaults } from '../rate-limit-util/RateLimitingFetch';

const rateLimitingFetch = new RateLimitingFetch(nonUiContextRateLimitingHandlingOptionsDefaults);
rateLimitingFetch.fetch(someUrl, fetchOptions);
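Because the signature matches node-fetch, downstream response handling is unchanged. For example:

const response = await rateLimitingFetch.fetch(someUrl, fetchOptions);
if (response.ok) {
  const data = await response.json();
  // ... use the data
} else {
  // Any retries have been exhausted, so handle the failure.
  console.warn(`Request failed with status ${response.status}`);
}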

Pluggable fetch implementations

There are a variety of reasons why it is a good idea to support different fetch implementations:

  • Different application architectures may support different fetch mechanisms.
  • Testing may require different behaviours of fetch to occur.

To support these scenarios, we can abstract away from calling node-fetch directly and allow an alternate implementation to be injected. We need to provide a method allowing the client to set the fetch implementation they wish to use.

export interface FetchInterface {
  fetch: (url: RequestInfo, init?: RequestInit) => Promise<Response>;
}

export class NodeFetch implements FetchInterface {

  fetch = async (url: RequestInfo, init?: RequestInit): Promise<Response> => {
    return await fetch(url, init);
  }

}

export default class RateLimitingFetch {

  fetchImplementation: FetchInterface = new NodeFetch();
  
  ...

  public setFetchImplementation = (fetchImplementation: FetchInterface): void => {
    this.fetchImplementation = fetchImplementation;
  }
  
  ...

}

Testing

The next challenge is to ensure this works, especially if there is no way to force the REST service we are calling to return rate limiting responses. This is the case for Jira Cloud and Confluence Cloud. Since the fetch implementation is pluggable, we can provide implementations suitable for different types of testing.

Mocking the fetch implementation for unit testing

We can provide an implementation of FetchInterface that is designed for use by unit tests where the next response from the fetch operation can be set by the test:

export interface MockFetchInfo {
  responseStatusCode: number;
  responseStatusText: string;
  addResponseHeaders: (headers: Map<string, string>) => void;
}

export interface MockFetchController {
  getMockFetchInfo: (timeOfLastRateLimit: number) => undefined | MockFetchInfo;
}

export class SimpleMockFetchController implements MockFetchController {
  nextMockFetchInfo: undefined | MockFetchInfo = undefined;
  setNextMockFetchInfo = (nextMockFetchInfo: undefined | MockFetchInfo): void => {
    this.nextMockFetchInfo = nextMockFetchInfo;
  }
  getMockFetchInfo = (timeOfLastRateLimit: number): undefined | MockFetchInfo => {
    return this.nextMockFetchInfo;
  }
}

export class MockingFetch implements FetchInterface {

  timeOfLastRateLimit = 0;
  mockFetchController: MockFetchController = new SimpleMockFetchController();

  public setMockFetchController = (mockFetchController: MockFetchController) => {
    this.mockFetchController = mockFetchController;
  }

  public fetch = async (url: RequestInfo, init?: RequestInit): Promise<Response> => {
    const mockFetchInfo = this.mockFetchController.getMockFetchInfo(this.timeOfLastRateLimit);
    if (mockFetchInfo) {
      // Only record the time when a mocked (rate limiting or error) response is returned...
      this.timeOfLastRateLimit = new Date().getTime();
      const status: number = mockFetchInfo.responseStatusCode;
      const statusText = mockFetchInfo.responseStatusText;
      const _headers = new Map<string, string>();
      mockFetchInfo.addResponseHeaders(_headers);
      const headers: Headers = {
        get: function (name: string): string | null {
          return _headers.get(name) ?? null;
        },
        ...
      }
      const rateLimitResponse: Response = {
        headers: headers,
        status: status,
        statusText: statusText,
        url: url.toString(),
        ...
      }
      return rateLimitResponse;
    } else {
      return await fetch(url, init);
    }
  }
}

The following code shows an example of how this MockingFetch class can be used:

// Create the instance of RateLimitingFetch...
const rateLimitingFetch = new RateLimitingFetch(nonUiContextRateLimitingHandlingOptionsDefaults);

// Set the fetch implementation to our own MockingFetch instance...
const mockingFetch = new MockingFetch();
rateLimitingFetch.setFetchImplementation(mockingFetch);

// Set the MockingFetch controller...
const simpleMockFetchController = new SimpleMockFetchController();
mockingFetch.setMockFetchController(simpleMockFetchController);

// Cause the next fetch to return a rate limiting response...
const addRetryAfterResponseHeader = (headers: Map<string, string>): void => {
  headers.set('Retry-After', '5');
}
let nextMockFetchInfo: undefined | MockFetchInfo = {
  responseStatusCode: 429,
  responseStatusText: 'Too Many Requests',
  addResponseHeaders: addRetryAfterResponseHeader
}
simpleMockFetchController.setNextMockFetchInfo(nextMockFetchInfo);

// Now do something that will cause fetch to be invoked and check the result...
const response = await rateLimitingFetch.fetch(url, options);
assertTrue(response.status === 429);
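Pulling this together into an actual test, the following sketch shows how it might look in Jest (an assumption; any test framework will do, and the import paths are likewise assumptions). The tiny delay values keep the test fast, and because the simple controller keeps returning the same mock response, the utility exhausts its retries and returns the 429:

import RateLimitingFetch from '../rate-limit-util/RateLimitingFetch';
import { MockingFetch, SimpleMockFetchController, MockFetchInfo } from '../rate-limit-util/MockingFetch';

test('returns the 429 response once retries are exhausted', async () => {
  // Tiny delays so the test completes quickly...
  const rateLimitingFetch = new RateLimitingFetch({
    maxRetries: 2,
    maxRetryDelayMillis: 50,
    backoffMultiplier: 2,
    initialRetryDelayMillis: 10,
    maxJitterMultiplier: 1.3,
  });
  const mockingFetch = new MockingFetch();
  rateLimitingFetch.setFetchImplementation(mockingFetch);
  const controller = new SimpleMockFetchController();
  mockingFetch.setMockFetchController(controller);
  const mockFetchInfo: MockFetchInfo = {
    responseStatusCode: 429,
    responseStatusText: 'Too Many Requests',
    // No Retry-After header, so the backoff options above apply...
    addResponseHeaders: () => {}
  };
  controller.setNextMockFetchInfo(mockFetchInfo);
  const response = await rateLimitingFetch.fetch('https://example.com/api');
  expect(response.status).toBe(429);
});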

Random fetch implementation for integration testing

For integration testing, you may want to replace the fetch implementation such that it randomly returns a mixture of successful and rate limiting and error responses. This can be done with a separate implementation of MockFetchController and injecting it into MockingFetch:

export class RandomMockFetchController implements MockFetchController {

  getMockFetchInfo = (timeOfLastRateLimit: number): undefined | MockFetchInfo => {
    const now = new Date().getTime();
    const millisSinceLastRateLimit = now - timeOfLastRateLimit;
    let threshold = 1;
    if (millisSinceLastRateLimit < 100) {
      threshold = 0.7;
    } else if (millisSinceLastRateLimit < 1000) {
      threshold = 0.75;
    } else if (millisSinceLastRateLimit < 5000) {
      threshold = 0.8;
    } else if (millisSinceLastRateLimit < 10000) {
      threshold = 0.85;
    } else {
      threshold = 0.9;
    }
    console.log(`Rate limit randomness threshold = ${threshold} (duration since last rate limit = ${millisSinceLastRateLimit}).`);
    if (Math.random() <= threshold) {
      return undefined;      
    } else {
      const randomValue = Math.random();
      if (randomValue < 0.5) {
        const mockFetchInfo: MockFetchInfo = {
          responseStatusCode: 429,
          responseStatusText: 'Too Many Requests',
          addResponseHeaders: Math.random() < 0.5 ? this.addNoResponseHeaders : this.addRetryAfterResponseHeader
        }
        return mockFetchInfo;
      } else if (randomValue < 0.75) {
        const mockFetchInfo: MockFetchInfo = {
          responseStatusCode: 500,
          responseStatusText: 'Internal Server Error',
          addResponseHeaders: this.addNoResponseHeaders
        }
        return mockFetchInfo;
      } else {
        const mockFetchInfo: MockFetchInfo = {
          responseStatusCode: 503,
          responseStatusText: 'Service Unavailable',
          addResponseHeaders: Math.random() < 0.5 ? this.addNoResponseHeaders : this.addRetryAfterResponseHeader
        }
        return mockFetchInfo;
      }
    }
  }

  private addNoResponseHeaders = (headers: Map<string, string>): void => {
    // don't add any headers
  }

  private addRetryAfterResponseHeader = (headers: Map<string, string>): void => {
    headers.set('Retry-After', '5');
  }

}
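Injecting it follows the same pattern as before:

const mockingFetch = new MockingFetch();
mockingFetch.setMockFetchController(new RandomMockFetchController());
rateLimitingFetch.setFetchImplementation(mockingFetch);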

Sharing the code via GitHub and NPM

To maximise re-use, the code has been packaged as a TypeScript library and published to NPM. It can be added to your app or project as follows:

npm install handle-rate-limiting-js

With the library added to your project, you can import the code as in the following example:

import {
  nonUiContextRateLimitingHandlingOptionsDefaults,
  MockingFetch,
} from 'handle-rate-limiting-js';

The code has also been open sourced in the following GitHub repository:

dugaldmorrow/handle-rate-limiting-js

Limitations

Some application architectures or execution contexts may not support the re-entrant retry mechanism since they may not allow threads to run for long periods. This is the case for functions-as-a-service (FaaS) execution contexts such as Forge functions, AWS Lambda and Google Cloud Functions. To support FaaS execution contexts, the library can be enhanced to enable a different retry mechanism to be plugged in. For Forge, the retry mechanism would most likely employ the async events API.
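As an illustration only (this is not part of the published library), such an extension point might look like the following, where the default implementation retains the in-process sleep-and-retry approach shown earlier:

export interface RetryStrategy {
  // Schedules a retry after the given delay and returns the eventual response.
  scheduleRetry: (retryDelayMillis: number, retry: () => Promise<Response>) => Promise<Response>;
}

export class InProcessRetryStrategy implements RetryStrategy {
  scheduleRetry = async (retryDelayMillis: number, retry: () => Promise<Response>): Promise<Response> => {
    await new Promise<void>(resolve => setTimeout(resolve, retryDelayMillis));
    return await retry();
  }
}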

Whilst the utility handles the Retry-After header, it could be enhanced with a pluggable approach to handling other kinds of rate limiting headers.

Wrapping up

With the help of a utility class to do most of the hard work, handling REST API rate limiting responses becomes far simpler.

Whether you're building an app for the Atlassian Marketplace or another use case, I hope this article helps you make your app more robust. If you have any questions about this or other topics, you can join the discussion in the Atlassian Developer Community.