When building APIs that handle large datasets, memory consumption can quickly become a critical bottleneck. The common pattern in ASP.NET Core, serializing a result object into a single in-memory payload before writing it out, buffers the entire response before the first byte reaches the client; hosts such as IIS and middleware such as response compression can add their own buffering on top. For a small JSON object, this is fine. For a 500 MB CSV export, it’s a recipe for disaster, potentially leading to runaway memory usage and even an OutOfMemoryException.

The solution is response streaming. Instead of building the entire payload at once, we write it to the client in chunks as it’s generated. This drastically reduces the server’s memory footprint and improves the time to first byte (TTFB), making your application feel more responsive.

In this post, we’ll explore how to build a clean, reusable pattern for enabling response streaming using custom ASP.NET Core middleware.


The Problem with Response Buffering

Let’s visualize the default behavior. Imagine an endpoint that generates a large report:

  1. A request comes in for /api/reports/annual-sales.
  2. Your application logic queries a database and processes a massive amount of data.
  3. It serializes this data into a single, large string or byte array in memory. Let’s say it’s 200 MB.
  4. Only after the entire 200 MB payload is ready does ASP.NET Core begin sending the response to the client.

If 10 users request this report simultaneously, your server’s memory usage spikes by 2 GB just for these responses. This doesn’t scale well. Streaming sends the data piece by piece, keeping memory usage low and constant.
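To make the contrast concrete, here is a sketch of the buffered anti-pattern described above. The route and row count are illustrative, and this endpoint is the "before" picture, not part of the solution built later in this post.

```csharp
// Program.cs (illustrative anti-pattern, not part of the final solution)
using System.Text;

app.MapGet("/api/reports/annual-sales-buffered", () =>
{
    // Builds the entire multi-hundred-MB payload in memory
    // before a single byte is sent to the client.
    var sb = new StringBuilder();
    sb.AppendLine("Id,Name,Timestamp");
    for (int i = 1; i <= 1_000_000; i++)
    {
        sb.AppendLine($"{i},User-{i},{DateTime.UtcNow:o}");
    }

    // Results.Text returns the whole string at once; the client sees
    // nothing until the loop above has finished.
    return Results.Text(sb.ToString(), "text/csv");
});
```

Every concurrent request holds its own full copy of the payload, which is exactly the 10 × 200 MB = 2 GB spike described above.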

A Declarative Approach with Middleware

While you can manage streaming manually within an endpoint, it can lead to repetitive code. A much cleaner solution is to use middleware to establish a policy. We can create a system where developers can simply “opt-in” to streaming for specific endpoints.

Our goal is to create a custom attribute that, when applied to an endpoint, signals our middleware to disable response buffering for that request.

Step 1: Create a Marker Attribute

First, let’s define a simple attribute. This class doesn’t need any logic; its only purpose is to act as a metadata marker that our middleware can look for.

// StreamResponseAttribute.cs
namespace StreamingApi.Attributes;

/// <summary>
/// When applied to an endpoint, this attribute signals that the response
/// should be streamed rather than buffered.
/// </summary>
[AttributeUsage(AttributeTargets.Method)]
public sealed class StreamResponseAttribute : Attribute
{
}

Step 2: Build the Streaming Middleware

Next, we’ll create the middleware that checks for this attribute. If the attribute is present on the requested endpoint, the middleware will find the IHttpResponseBodyFeature and call its DisableBuffering method (a synchronous call, despite the asynchronous nature of the pipeline).

// StreamingMiddleware.cs
using Microsoft.AspNetCore.Http.Features;
using StreamingApi.Attributes;

namespace StreamingApi.Middleware;

public class StreamingMiddleware(RequestDelegate next)
{
    private readonly RequestDelegate _next = next;

    public async Task InvokeAsync(HttpContext context)
    {
        var endpoint = context.GetEndpoint();
        if (endpoint?.Metadata.GetMetadata<StreamResponseAttribute>() is not null)
        {
            var responseBodyFeature = context.Features.Get<IHttpResponseBodyFeature>();
            if (responseBodyFeature is not null)
            {
                // Disabling buffering is a one-way operation. Once disabled,
                // it cannot be re-enabled for the current request.
                responseBodyFeature.DisableBuffering();
            }
        }

        await _next(context);
    }
}

This middleware is lightweight and targeted. It does one thing: it checks the endpoint’s metadata and modifies the response pipeline accordingly.

Expert Insight: The IHttpResponseBodyFeature provides low-level control over the response body stream. By calling DisableBuffering, we are telling the server or host (Kestrel, IIS) and any buffering middleware to start sending data to the client as soon as it’s written to the response stream, rather than waiting for the response to complete.
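Because the middleware only touches HttpContext, its behavior can be verified without spinning up a server. The sketch below (assuming xunit, with a hand-rolled fake of IHttpResponseBodyFeature) checks that DisableBuffering is called exactly when the attribute is present; RecordingBodyFeature and the endpoint display name are invented for this test.

```csharp
// StreamingMiddlewareTests.cs (a minimal sketch, assuming xunit)
using System.IO.Pipelines;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.Http.Features;
using StreamingApi.Attributes;
using StreamingApi.Middleware;
using Xunit;

public class StreamingMiddlewareTests
{
    // Fake body feature that records whether DisableBuffering was invoked.
    private sealed class RecordingBodyFeature : IHttpResponseBodyFeature
    {
        public bool BufferingDisabled { get; private set; }
        public Stream Stream { get; } = System.IO.Stream.Null;
        public PipeWriter Writer { get; } = PipeWriter.Create(System.IO.Stream.Null);
        public void DisableBuffering() => BufferingDisabled = true;
        public Task StartAsync(CancellationToken token = default) => Task.CompletedTask;
        public Task SendFileAsync(string path, long offset, long? count,
            CancellationToken token = default) => Task.CompletedTask;
        public Task CompleteAsync() => Task.CompletedTask;
    }

    private static HttpContext BuildContext(bool withAttribute)
    {
        var context = new DefaultHttpContext();
        context.Features.Set<IHttpResponseBodyFeature>(new RecordingBodyFeature());

        var metadata = withAttribute
            ? new object[] { new StreamResponseAttribute() }
            : Array.Empty<object>();
        context.SetEndpoint(new Endpoint(
            _ => Task.CompletedTask,
            new EndpointMetadataCollection(metadata),
            "test-endpoint"));
        return context;
    }

    [Theory]
    [InlineData(true)]
    [InlineData(false)]
    public async Task DisablesBufferingOnlyWhenAttributeIsPresent(bool withAttribute)
    {
        var context = BuildContext(withAttribute);
        var middleware = new StreamingMiddleware(_ => Task.CompletedTask);

        await middleware.InvokeAsync(context);

        var feature = (RecordingBodyFeature)context.Features
            .Get<IHttpResponseBodyFeature>()!;
        Assert.Equal(withAttribute, feature.BufferingDisabled);
    }
}
```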

Step 3: Register the Middleware

Now, let’s register our new middleware in Program.cs. It must run after routing has selected an endpoint (otherwise context.GetEndpoint() returns null) but before the endpoint itself executes. With the minimal hosting model, WebApplication inserts routing at the start of the pipeline automatically, so registering the middleware before app.MapControllers() is sufficient; if you call app.UseRouting() explicitly, place the middleware after it.

// Program.cs
using StreamingApi.Middleware;

var builder = WebApplication.CreateBuilder(args);

// Add services to the container.
builder.Services.AddControllers();

var app = builder.Build();

// ... other middleware

// Register our custom middleware
app.UseMiddleware<StreamingMiddleware>();

app.MapControllers();
app.MapGet("/", () => "Hello World!");

// Add our streaming endpoint example
ConfigureStreamingEndpoint(app);

app.Run();

// We'll define this method next
static void ConfigureStreamingEndpoint(IEndpointRouteBuilder app)
{
    // Implementation in the next section
}

Step 4: Create a Streaming Endpoint

With the infrastructure in place, creating a streaming endpoint is incredibly simple. We’ll use a Minimal API endpoint for this example, but the [StreamResponse] attribute works identically on an MVC controller action.

Here, we’ll create an endpoint that generates a large CSV file row by row and streams it directly to the client.

// Program.cs (continued)
using StreamingApi.Attributes;

static void ConfigureStreamingEndpoint(IEndpointRouteBuilder app)
{
    // Apply the attribute directly to the lambda; calling
    // .WithMetadata(new StreamResponseAttribute()) on the endpoint is equivalent
    app.MapGet("/stream-csv", [StreamResponse] async (HttpContext context) =>
    {
        context.Response.ContentType = "text/csv";
        context.Response.Headers.Append("Content-Disposition", "attachment; filename=large-export.csv");

        // Write the CSV header before starting the loop
        var header = "Id,Name,Timestamp\n";
        await context.Response.WriteAsync(header, context.RequestAborted);

        // Stream 100,000 rows
        for (int i = 1; i <= 100_000; i++)
        {
            var row = $"{i},User-{i},{DateTime.UtcNow:o}\n";
            await context.Response.WriteAsync(row, context.RequestAborted);
            
            // WriteAsync pushes data into the response pipe as it is written,
            // so an explicit flush is rarely needed; uncomment to force an immediate send:
            // await context.Response.Body.FlushAsync(context.RequestAborted);
        }
    })
    .WithTags("Streaming")
    .WithName("StreamLargeCsv");
}

When you run this application and navigate to /stream-csv, the browser will immediately start downloading the large-export.csv file. If you inspect the network traffic, you’ll see the response data flowing in chunks, not all at once. Crucially, the server’s memory usage will remain flat and low throughout the entire process.
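As noted earlier, the attribute works the same way on MVC controller actions, because action attributes flow into the endpoint metadata that the middleware inspects. Here is a hypothetical controller-based equivalent; ReportsController and its route are invented for illustration.

```csharp
// ReportsController.cs (a sketch of the controller-based equivalent)
using Microsoft.AspNetCore.Mvc;
using StreamingApi.Attributes;

namespace StreamingApi.Controllers;

[ApiController]
[Route("api/[controller]")]
public class ReportsController : ControllerBase
{
    [HttpGet("annual-sales")]
    [StreamResponse] // picked up from endpoint metadata by the middleware
    public async Task StreamAnnualSales(CancellationToken cancellationToken)
    {
        Response.ContentType = "text/csv";
        Response.Headers.Append("Content-Disposition",
            "attachment; filename=annual-sales.csv");

        await Response.WriteAsync("Id,Name,Timestamp\n", cancellationToken);
        for (int i = 1; i <= 100_000; i++)
        {
            await Response.WriteAsync(
                $"{i},User-{i},{DateTime.UtcNow:o}\n", cancellationToken);
        }
    }
}
```

The action returns Task rather than IActionResult because it writes to the response body directly; returning a buffered result type would defeat the purpose.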

Important Considerations for Streaming

Streaming is powerful, but it comes with trade-offs you must understand.

  1. Headers and Status Codes are Final: Once you write the first byte to the response body, the HTTP headers and status code are sent and cannot be changed. You must set ContentType, status code, and other headers before you start your streaming loop.
  2. Error Handling is Complex: If an exception occurs halfway through generating the stream, you’re in a tough spot. The client has already received a 200 OK status. You can’t change it to 500 Internal Server Error. Your options are to abruptly close the connection (leaving the client with a partial, potentially corrupt file) or to write an error indicator into the stream itself, which requires the client to be designed to parse for such errors.
  3. Resource Management: Ensure you are properly handling resources, especially within your streaming loop. Use CancellationToken (like context.RequestAborted) to gracefully stop generation if the client disconnects.
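Points 2 and 3 can be combined in practice: wrap the streaming loop so that client disconnects end generation quietly, while mid-stream failures are at least signalled in-band. The error sentinel line below is an invented convention that a consuming client would need to know about; it is one possible approach, not a standard.

```csharp
// Inside a streaming endpoint body (a sketch of defensive streaming)
try
{
    await context.Response.WriteAsync("Id,Name,Timestamp\n", context.RequestAborted);
    for (int i = 1; i <= 100_000; i++)
    {
        var row = $"{i},User-{i},{DateTime.UtcNow:o}\n";
        await context.Response.WriteAsync(row, context.RequestAborted);
    }
}
catch (OperationCanceledException) when (context.RequestAborted.IsCancellationRequested)
{
    // The client disconnected; stop generating and exit quietly.
}
catch (Exception)
{
    // Headers are already sent, so the status code cannot change.
    // Write an in-band sentinel the client can detect (invented convention),
    // then let the connection close with a truncated body.
    if (!context.RequestAborted.IsCancellationRequested)
    {
        await context.Response.WriteAsync("#ERROR,stream-aborted\n");
    }
}
```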

Conclusion

By combining a simple marker attribute with targeted middleware, we’ve created a clean and declarative pattern for enabling response streaming in ASP.NET Core. This approach keeps the streaming logic separate from the business logic of your endpoints, making your code more maintainable and scalable. For any API that needs to serve large files, data exports, or real-time feeds, mastering response streaming is an essential skill for building robust and high-performance applications.
