Understand Rate Limit Burst Capability
Effective Date: 10 November 2021
Auth0 maintains rate limits and burst limits for its APIs. While the rate limit is the maximum sustainable amount of traffic the system will handle, the burst limit is the maximum short-term traffic volume the system will handle within one interval. Auth0 rate limits and burst limits work together to provide better limiting functionality for dynamic traffic volume.
Auth0 rate limits include calls made via Rules and are set by tenant rather than endpoint.
Burst limits
As already stated, the burst limit is the maximum short-term traffic volume the system will handle within one interval. Each Auth0 endpoint is configured with a "bucket" that defines a request limit and a rate limit window (for example, per second, per minute, per day). Let's look at a sample endpoint bucket configuration:
bucket:
  size: x
  per_minute: y
In this sample, the given bucket has a maximum request limit of x per minute. For each minute that elapses, Auth0 adds permissions for y requests. In other words, every 60/y seconds, Auth0 adds one additional request to the bucket. This occurs automatically until the bucket contains the maximum permitted number of requests (x).
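To make the refill behavior concrete, here is a minimal token-bucket sketch. It is an illustration only, not Auth0's implementation: the class name `TokenBucket`, the `allow` method, and the choices to start with a full bucket and to refill continuously (rather than in whole-request steps every 60/y seconds) are all assumptions made for the example. Time is passed in explicitly so the behavior is easy to follow.

```python
class TokenBucket:
    """Hypothetical bucket holding up to `size` requests, refilled at `per_minute` per minute."""

    def __init__(self, size: float, per_minute: float, now: float = 0.0) -> None:
        self.size = size
        self.per_minute = per_minute
        self.tokens = float(size)  # assumption: the bucket starts full
        self.last = now            # time of the last refill, in seconds

    def _refill(self, now: float) -> None:
        # Add per_minute / 60 requests for every elapsed second, capped at `size`.
        elapsed = now - self.last
        self.tokens = min(self.size, self.tokens + elapsed * self.per_minute / 60.0)
        self.last = now

    def allow(self, now: float) -> bool:
        """Return True and consume one request if the bucket is not empty."""
        self._refill(now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# A tiny bucket: size 2, refilled at 60 per minute (one request per second).
bucket = TokenBucket(size=2, per_minute=60)
first_three = [bucket.allow(0.0) for _ in range(3)]  # third call finds the bucket empty
after_one_second = bucket.allow(1.0)                 # one request has been added back
```

With x = 2 and y = 60, two immediate requests succeed, a third is rejected, and one more request becomes available a second later.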
Real world use case
Keeping in mind how a sample endpoint "bucket" is configured, let's look at Auth0 Management API burst limits by subscription type:
Tenant Type | Sustained Requests per Second | Maximum Requests per Second | Bursts per Minute (Peak) |
---|---|---|---|
Free or Trial | 2 | 10 | 120 |
Self Service (Paid) | 16 | 50 | 1000 |
Enterprise (Production) | 16 | 50 | 1000 |
Enterprise (Non-production) | <1 | 2 | 10 |
In this table, we see:
Sustained Requests per Second: The rate limit in requests per second over a period of several minutes or longer. This is the most important limit to consider over time. If your application never exceeds this, the traffic will never be limited. When exceeded, the degree of excess determines how quickly the burst limit is reached.
Maximum Requests per Second: An absolute limit on the number of requests per second your tenant can process. You will never be allowed to exceed this limit.
Bursts per Minute (Peak): The size of the request limit "bucket".
If we examine the Enterprise limits, we see:
Sustained Requests per Second: 16
Maximum Requests per Second: 50
Bursts per Minute (Peak): 1000
In this example, the Bursts per Minute (Peak) is 1000, which means the "bucket" size (x) is 1000 and the per_minute rate (y) is also 1000. Because Auth0 adds one request to the bucket every 60/y seconds, we calculate 60/1000 = 0.06, so Auth0 adds one request to the bucket every 0.06 seconds. This means that about 16.67 requests are added to the bucket each second (1/0.06 ≈ 16.67).
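The same arithmetic can be written as a short sketch. The function names are hypothetical, and applying the calculation to the other tenant types assumes that their per_minute rate (y) equals the Bursts per Minute (Peak) column, as it does in the Enterprise example.

```python
def refill_interval_seconds(per_minute: float) -> float:
    """Seconds between requests being added to the bucket (60 / y)."""
    return 60.0 / per_minute


def refill_rate_per_second(per_minute: float) -> float:
    """Requests added to the bucket each second (y / 60)."""
    return per_minute / 60.0


# Bursts per Minute (Peak) values from the table above; treating them as the
# per_minute rate for every tier is an assumption, not a documented fact.
tiers = {
    "Free or Trial": 120,
    "Enterprise (Production)": 1000,
    "Enterprise (Non-production)": 10,
}

for name, per_minute in tiers.items():
    interval = refill_interval_seconds(per_minute)
    rate = refill_rate_per_second(per_minute)
    print(f"{name}: one request every {interval:.2f} s, {rate:.2f} requests per second")
```

Under that assumption, the results line up with the Sustained Requests per Second column: 2 for Free or Trial, about 16.67 for Enterprise (Production), and less than 1 for Enterprise (Non-production).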
So far, what we have learned is:
An Enterprise application cannot ever make more than 50 requests per second (rps).
An Enterprise application has a “bucket” of 1000 requests it can consume at up to 50 rps.
Over time, an Enterprise application must average no more than 16.67 rps.
As the rate of requests increases above 16.67 requests per second, the burst limit will be reached and the "bucket" will be depleted more quickly. But how quickly will the bucket be depleted? That depends on how many requests per second your application makes. Let's say your Enterprise application makes 30 requests per second, and you want to know how long this can continue before traffic is limited. You already know that:
the Enterprise application "bucket" contains 1000 requests.
Auth0 adds 16.67 requests to the "bucket" every second (+16.67 rps).
And you know that your application is causing 30 requests to be removed from the bucket every second (-30 rps). So the rate at which the bucket is being depleted is: 16.67 + (-30) = -13.33. Because this number is negative, this means the bucket is being depleted at a rate of 13.33 requests per second.
To determine how soon the bucket will be depleted, divide the requests in the "bucket" (x, or 1000) by the requests per second at which the bucket is depleted (13.33): 1000 requests/13.33 requests per second = 75 seconds, so the bucket will be depleted after 75 seconds of requests at a rate of 30 requests per second.
If we perform this calculation for various rates of request, we learn that:
at 16 rps, the "bucket" will never be depleted.
at 30 rps, the "bucket" will be depleted in ~75 seconds.
at 50 rps, the "bucket" will be depleted in ~30 seconds.
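The depletion times listed above can be reproduced with a small helper. This is a sketch of the calculation described in this section, not an Auth0 API: `seconds_until_depleted` is a hypothetical name, and it assumes the bucket starts full and that traffic arrives at a constant rate.

```python
import math


def seconds_until_depleted(bucket_size: float, per_minute: float, request_rate: float) -> float:
    """Seconds a full bucket lasts at a constant request rate (requests per second)."""
    refill_rate = per_minute / 60.0          # requests added per second
    net_drain = request_rate - refill_rate   # requests removed per second, net of refill
    if net_drain <= 0:
        return math.inf                      # refill keeps up, so the bucket never empties
    return bucket_size / net_drain


# Enterprise values from this section: bucket size 1000, per_minute 1000.
for rps in (16, 30, 50):
    print(rps, seconds_until_depleted(1000, 1000, rps))
```

At 16 rps the helper returns infinity, because the refill rate of about 16.67 rps is higher than the request rate.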
Once the burst limit is reached and the "bucket" is depleted, traffic is effectively limited to 16.67 requests per second until the request rate drops below 16.67 rps for some period of time.
So for this example, the Enterprise application could make 50 requests per second, but in doing so, it would consume the number of requests in its "bucket" in about 30 seconds and would then be limited to approximately 16 requests per second. If, however, the application spaced its traffic out to exactly 16 requests per second, then the rate limit would never be reached.
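To see this behavior second by second, here is a rough simulation. It is a simplified model and not how Auth0 actually enforces limits: the `simulate` function, its one-second time step, and the rule that only whole requests are served are assumptions made for illustration. The default values are the Enterprise numbers used in this example.

```python
import math


def simulate(demand_rps: int, seconds: int, size: int = 1000,
             per_minute: int = 1000, max_rps: int = 50) -> list[int]:
    """Return the number of requests served in each one-second step."""
    tokens = float(size)          # assumption: the bucket starts full
    refill = per_minute / 60.0    # requests added back each second
    served = []
    for _ in range(seconds):
        # Serve as much demand as the per-second cap and the bucket allow.
        allowed = min(demand_rps, max_rps, math.floor(tokens))
        tokens -= allowed
        # Refill at the end of the step, never beyond the bucket size.
        tokens = min(size, tokens + refill)
        served.append(allowed)
    return served


burst = simulate(demand_rps=50, seconds=90)
steady = simulate(demand_rps=16, seconds=120)
print(burst[:3], burst[-3:])
```

In this model, an application requesting 50 rps is served in full for roughly the first half minute and then only 16 or 17 requests each second, while an application requesting 16 rps is never limited.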