Stream Responses in ASP.NET Core with Middleware
When building APIs that handle large datasets, memory consumption can quickly become a critical bottleneck. By default, ASP.NET Core builds the entire response in memory before sending it to the client. For a small JSON object, this is fine. For a 500 MB CSV export, it’s a recipe for disaster: memory balloons with every concurrent request, and you risk an OutOfMemoryException.
The solution is response streaming. Instead of building the entire payload at once, we write it to the client in chunks as it’s generated. This drastically reduces the server’s memory footprint and improves the time to first byte (TTFB), making your application feel more responsive.
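To make the idea concrete, here is a minimal sketch of chunked writing in a minimal-API endpoint. The route and the `GetSalesRowsAsync` async source are hypothetical placeholders for your own data pipeline; the point is that each row is written to the response body as it is produced, rather than after the whole payload exists.

```csharp
// Hypothetical streaming endpoint: rows go out as they are generated,
// so only one row at a time lives in memory.
app.MapGet("/api/reports/annual-sales", async (HttpContext context) =>
{
    context.Response.ContentType = "text/csv";

    await foreach (var row in GetSalesRowsAsync()) // hypothetical IAsyncEnumerable<string> source
    {
        // WriteAsync hands a small chunk to the response pipe; the server
        // flushes it toward the client instead of accumulating the full payload.
        await context.Response.WriteAsync(row + "\n");
    }
});
```

Because the client starts receiving data after the first row, TTFB drops even when the full export takes seconds to generate.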
In this post, we’ll explore how to build a clean, reusable pattern for enabling response streaming using custom ASP.NET Core middleware.
The Problem with Response Buffering
Let’s visualize the default behavior. Imagine an endpoint that generates a large report:
- A request comes in for `/api/reports/annual-sales`.
- Your application logic queries a database and processes a massive amount of data.
- It serializes this data into a single, large string or byte array in memory. Let’s say it’s 200 MB.
- Only after the entire 200 MB payload is ready does ASP.NET Core begin sending the response to the client.
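The steps above look like this in code. This is a hypothetical sketch of the buffered anti-pattern: `LoadAllSalesAsync` and `BuildCsvString` stand in for whatever data-access and serialization logic your report uses.

```csharp
// Hypothetical buffered endpoint: the entire report is materialized
// in memory before a single byte reaches the client.
app.MapGet("/api/reports/annual-sales", async () =>
{
    var rows = await LoadAllSalesAsync();   // hypothetical data access: loads everything
    var csv  = BuildCsvString(rows);        // hypothetical: one huge ~200 MB string
    return Results.Text(csv, "text/csv");   // nothing is sent until csv is complete
});
```

Every concurrent request pays the full 200 MB cost for the lifetime of the response.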
If 10 users request this report simultaneously, your server’s memory usage spikes by 2 GB just for these responses. This doesn’t scale well. Streaming sends the data piece by …
...