When you’re building APIs, especially those exposed to the public internet, you’re not just building features; you’re also building defenses. A high-traffic API without protection is vulnerable to everything from accidental infinite loops in a client application to deliberate Denial-of-Service (DoS) attacks. This is where rate limiting becomes an essential part of your application’s architecture.
Starting with .NET 7, ASP.NET Core ships powerful, flexible rate-limiting middleware built right into the framework. Gone are the days of relying solely on third-party packages or complex manual implementations. Let’s dive into how you can use this middleware to make your APIs more resilient and reliable.
What Exactly is Rate Limiting?
At its core, rate limiting is a defensive mechanism that controls the amount of incoming traffic to your API from a specific source in a given period. It’s like a bouncer at a club who only lets a certain number of people in per minute to prevent overcrowding.
By implementing rate limiting, you can:
- Prevent Resource Exhaustion: Stop a single user or service from overwhelming your servers, database, or other downstream dependencies.
- Ensure Fair Usage: Guarantee that all clients get a fair share of the available resources.
- Improve Security: Mitigate brute-force attacks on login endpoints and reduce the effectiveness of DoS attacks.
- Manage Costs: If you rely on paid third-party services, rate limiting can prevent unexpected bills caused by excessive API calls.
Setting Up the Rate Limiting Middleware
Integrating the rate limiting middleware into your ASP.NET Core application is straightforward. It involves two simple steps in your Program.cs file.
- Register the services: Call AddRateLimiter on the IServiceCollection.
- Enable the middleware: Call UseRateLimiter on the IApplicationBuilder.
Here’s the most basic setup, which defines a named “fixed” policy and applies it to every controller endpoint:
// Program.cs
using System.Threading.RateLimiting;
var builder = WebApplication.CreateBuilder(args);
// Add services to the container.
builder.Services.AddControllers();
// 1. Register the Rate Limiter services
builder.Services.AddRateLimiter(options =>
{
    options.AddFixedWindowLimiter(policyName: "fixed", fixedWindow =>
    {
        fixedWindow.PermitLimit = 10;
        fixedWindow.Window = TimeSpan.FromSeconds(10);
    });
    // The status code to return when a request is rejected (the default is 503).
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;
});
var app = builder.Build();
// 2. Enable the Rate Limiter middleware
app.UseRateLimiter();
// Apply the named "fixed" policy to all controller endpoints
app.MapControllers().RequireRateLimiting("fixed");
app.Run();
In this example, we’ve configured a simple Fixed Window limiter as a named policy and attached it to the controller endpoints with RequireRateLimiting. We’ll explore what this means next, but essentially, this policy allows a maximum of 10 requests every 10 seconds. If a client exceeds this, they’ll receive an HTTP 429 Too Many Requests response.
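If you want to see the limiter in action, a quick burst of requests from a console app makes the cutoff visible. Here’s a minimal sketch, assuming the API listens on http://localhost:5000 and exposes a /products endpoint (adjust both to match your setup):
// A throwaway console app that hammers the rate-limited API.
using var client = new HttpClient();
for (var i = 1; i <= 12; i++)
{
    var response = await client.GetAsync("http://localhost:5000/products");
    Console.WriteLine($"Request {i}: {(int)response.StatusCode} {response.StatusCode}");
}
// With the 10-requests-per-10-seconds policy above, requests 11 and 12 should return 429.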
Callout: A Note on Middleware Order
Remember that middleware order matters in ASP.NET Core. You should place UseRateLimiter() early in the pipeline, typically after routing (UseRouting()) but before authentication (UseAuthentication()) and authorization (UseAuthorization()). This ensures that you reject excessive requests before they consume more valuable server resources. The one exception: if a policy partitions by user identity, the limiter must run after authentication so the claims principal is populated.
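In code, that recommended ordering looks like this:
// Program.cs — typical middleware ordering
app.UseRouting();
app.UseRateLimiter();     // reject excess traffic before the more expensive stages run
app.UseAuthentication();  // move UseRateLimiter below these two if your policies partition by user claims
app.UseAuthorization();
app.MapControllers();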
Exploring the Built-in Algorithms
The real power of the ASP.NET Core rate limiting middleware lies in its different limiting strategies. The framework provides four out-of-the-box algorithms to suit various scenarios. Let’s break them down.
1. Fixed Window Limiter
This is the simplest algorithm. It defines a static time window and a request limit within that window. When the window expires, the count resets.
- How it works: Imagine a counter that allows 100 requests from 1:00 PM to 1:01 PM. At 1:01 PM, the counter resets to zero for the next minute.
- Use Case: Great for simple, general-purpose rate limiting where you want to enforce a hard limit over a consistent period.
- Caveat: It allows bursts at window boundaries. A client could make all their allowed requests in the last second of a window and then another full set in the first second of the new window, briefly doubling the intended rate.
// In Program.cs -> AddRateLimiter
options.AddFixedWindowLimiter(policyName: "fixed-by-ip", options =>
{
options.PermitLimit = 10;
options.Window = TimeSpan.FromSeconds(10);
options.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
options.QueueLimit = 2; // Allow 2 requests to be queued
});
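The queue settings deserve a mention: with QueueLimit set to 2, up to two over-limit requests wait for the next window instead of failing immediately, and OldestFirst serves them in arrival order. Only when the queue is also full does the middleware reject with 429.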
2. Sliding Window Limiter
The Sliding Window algorithm solves the burst issue of the Fixed Window. It divides the time window into segments and tracks requests in a rolling fashion.
- How it works: If you have a 1-minute window with 6 segments, the limiter checks the request count across the last 6 ten-second segments. This smooths out the request rate and prevents edge-of-window bursts.
- Use Case: Ideal for APIs where a smooth and consistent flow of traffic is critical. It provides a more accurate representation of the recent request rate.
// In Program.cs -> AddRateLimiter
options.AddSlidingWindowLimiter(policyName: "sliding", opt =>
{
    opt.PermitLimit = 15;
    opt.Window = TimeSpan.FromSeconds(30);
    opt.SegmentsPerWindow = 3; // The window is divided into 3 segments of 10s each
    opt.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
    opt.QueueLimit = 0; // Reject excess requests immediately instead of queueing them
});
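With this configuration, a request is allowed only if fewer than 15 requests were counted across the three most recent 10-second segments, so the limit slides forward every 10 seconds rather than resetting all at once.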
3. Token Bucket Limiter
This algorithm is excellent for handling bursts of traffic gracefully. It works with a “bucket” of tokens that are replenished at a steady rate.
- How it works: Each request consumes one token from the bucket. If the bucket is empty, the request is rejected. The bucket is refilled with new tokens over time, up to a maximum bucket size. This allows a client to save up tokens and spend them in a burst.
- Use Case: Perfect for APIs where you want to allow occasional bursts of activity (like a user uploading multiple files at once) while maintaining a sustainable average rate.
// In Program.cs -> AddRateLimiter
options.AddTokenBucketLimiter(policyName: "token", opt =>
{
    opt.TokenLimit = 100; // Max tokens in the bucket
    opt.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
    opt.QueueLimit = 5;
    opt.ReplenishmentPeriod = TimeSpan.FromMinutes(1);
    opt.TokensPerPeriod = 20; // Add 20 tokens every minute
});
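To make the numbers concrete: a client that has been idle can burst up to 100 requests at once, but once the bucket is drained it is held to a sustained rate of 20 requests per minute, because that is all the replenishment provides.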
4. Concurrency Limiter
Unlike the others, this limiter doesn’t care about the number of requests over time. Instead, it limits how many requests can be processed simultaneously.
- How it works: It’s like having a limited number of service desks. If all desks are busy, new customers have to wait or are turned away. Once a request is complete, it frees up a slot for the next one.
- Use Case: Excellent for protecting resource-intensive endpoints that cannot handle high levels of parallelism, such as endpoints that perform heavy computations or access a limited resource pool.
// In Program.cs -> AddRateLimiter
options.AddConcurrencyLimiter(policyName: "concurrency", opt =>
{
    opt.PermitLimit = 5; // Only 5 concurrent requests allowed
    opt.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
    opt.QueueLimit = 2;
});
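Keep in mind that each permit is held for the full duration of a request and released only when it completes, so a handful of slow requests can occupy all 5 slots; the QueueLimit of 2 gives short spikes somewhere to wait in the meantime.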
Applying Policies and Advanced Configuration
Defining a global rate limit is a good start, but real-world applications often require more granularity.
Applying Policies to Specific Endpoints
You can define multiple named policies in Program.cs and apply them selectively: controllers use attributes, while minimal API endpoints use fluent extension methods (shown after the controller example below).
First, define your named policies:
// Program.cs
builder.Services.AddRateLimiter(options =>
{
    options.AddFixedWindowLimiter("api", opt => { /* ... */ });
    options.AddTokenBucketLimiter("uploads", opt => { /* ... */ });
});
Then, apply them where needed:
// In a Controller
using Microsoft.AspNetCore.RateLimiting;

[ApiController]
[Route("[controller]")]
// Apply the 'api' policy to the whole controller
[EnableRateLimiting("api")]
public class ProductsController : ControllerBase
{
    [HttpGet]
    public IActionResult Get() => Ok("All products");

    [HttpPost("upload")]
    // Override with a more specific policy for this endpoint
    [EnableRateLimiting("uploads")]
    public IActionResult UploadFile() => Ok("File uploaded");

    [HttpGet("health")]
    // This endpoint should not be rate-limited
    [DisableRateLimiting]
    public IActionResult HealthCheck() => Ok("Healthy");
}
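Minimal API endpoints don’t need attributes; the same named policies attach fluently:
// Program.cs — the minimal API equivalent
app.MapGet("/products", () => "All products")
    .RequireRateLimiting("api");

app.MapGet("/health", () => "Healthy")
    .DisableRateLimiting();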
Partitioning: Per-User Rate Limits
A global limit isn’t very useful in a multi-user system. What you really want is to limit each user or IP address individually. This is called partitioning. The middleware makes this easy: options.AddPolicy accepts a delegate that derives a partition key from the incoming HttpContext, and each distinct key gets its own independent limiter.
Here’s how you can create a rate limit based on the client’s IP address:
// Program.cs
builder.Services.AddRateLimiter(options =>
{
    // Partition based on the remote IP address: each client IP gets its own counter
    options.AddPolicy("fixed-by-ip", httpContext =>
        RateLimitPartition.GetFixedWindowLimiter(
            partitionKey: httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown",
            factory: _ => new FixedWindowRateLimiterOptions
            {
                PermitLimit = 10,
                Window = TimeSpan.FromSeconds(10)
            }));
});
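Named policies only take effect where you attach them. For a baseline limit that covers every request without any attributes, assign the GlobalLimiter property instead; here’s a minimal sketch, again keyed by client IP:
// In Program.cs -> AddRateLimiter
options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
    RateLimitPartition.GetFixedWindowLimiter(
        partitionKey: httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown",
        factory: _ => new FixedWindowRateLimiterOptions
        {
            PermitLimit = 100,
            Window = TimeSpan.FromMinutes(1)
        }));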
You can also create more complex partitions, for example based on an API key header, or on a user’s ID claim after they’ve authenticated:
// In Program.cs -> AddRateLimiter
// Partition by API key: each distinct X-Api-Key value gets its own counter
options.AddPolicy("by-api-key", httpContext =>
    RateLimitPartition.GetFixedWindowLimiter(
        partitionKey: httpContext.Request.Headers["X-Api-Key"].ToString(),
        factory: _ => new FixedWindowRateLimiterOptions
        {
            PermitLimit = 100,
            Window = TimeSpan.FromMinutes(1)
        }));

// Or partition by a custom rule, such as a claim on the authenticated user
options.AddPolicy("by-user", httpContext =>
{
    // Use a claim as the partition key
    var userId = httpContext.User.FindFirst("user_id")?.Value ?? "anonymous";
    return RateLimitPartition.GetFixedWindowLimiter(userId, _ =>
        new FixedWindowRateLimiterOptions
        {
            PermitLimit = 20,
            Window = TimeSpan.FromMinutes(1)
        });
});
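One subtlety in the snippet above: every unauthenticated caller lands in the shared “anonymous” partition, so in practice you’d usually fall back to the client IP for anonymous traffic rather than letting all of it drain a single bucket.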
Callout: Experience-Driven Insight
Partitioning is the key to effective rate limiting in multi-tenant applications. While IP-based partitioning is a good start, authenticated users should almost always be partitioned by their user or client ID. This prevents scenarios where multiple users behind the same corporate NAT hit a shared IP-based limit, impacting their collective user experience.
Customizing the Rejection Response
When a request is rejected, you might want to return more than just a 429 status code. The OnRejected callback lets you customize the response, for example, by adding Retry-After headers.
// Program.cs
builder.Services.AddRateLimiter(options =>
{
    options.OnRejected = async (context, cancellationToken) =>
    {
        // Tell well-behaved clients how long to back off.
        if (context.Lease.TryGetMetadata(MetadataName.RetryAfter, out var retryAfter))
        {
            context.HttpContext.Response.Headers.RetryAfter =
                ((int)retryAfter.TotalSeconds).ToString();
        }
        context.HttpContext.Response.StatusCode = StatusCodes.Status429TooManyRequests;
        await context.HttpContext.Response.WriteAsync(
            "Too many requests. Please try again later.", cancellationToken);
    };
    // ... add limiters
});
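Two details worth noting: the middleware applies RejectionStatusCode before invoking OnRejected, so the status-code assignment inside the callback is strictly an override and can be dropped if RejectionStatusCode is already 429; and the callback is an async lambda that awaits WriteAsync, ensuring the response body is fully written before the request completes.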
Conclusion
Rate limiting is no longer an afterthought but a fundamental requirement for building secure, scalable, and reliable APIs. With the Microsoft.AspNetCore.RateLimiting middleware, ASP.NET Core provides a first-class, feature-rich solution that is both easy to configure for simple cases and powerful enough for complex, partitioned scenarios. By choosing the right algorithm and applying policies intelligently, you can effectively protect your application and ensure a great experience for all your users.