Supports streamed and file responses #184

Merged
Baldinof merged 7 commits into Baldinof:3.x from Kenny1911:streamed-respond
May 12, 2026

@Kenny1911
Contributor

@Kenny1911 Kenny1911 commented Mar 18, 2026

Closes #179 #130 #101

1. Updated spiral/roadrunner-http dependency to ^4.0

In this version, the HttpWorkerInterface::respond() method explicitly defines the $endOfStream parameter, allowing proper stream termination control.

2. Fixed streaming and file responses

  • StreamedResponse (generators, echo callback)
  • StreamedJsonResponse
  • EventStreamResponse
  • BinaryFileResponse (large files)

3. Created HttpFoundationResponder service

Responsible for sending responses based on their type. Implements flexible logic and serves as an extension point for future response types.

Implementations:

  • BufferedResponder - returns the entire response from memory. Used for simple responses and as a fallback.
  • ChunkedResponder - splits the response into chunks and sends them to RoadRunner. Chunk size depends on response type:
    • BinaryFileResponse: 1 byte. Symfony itself streams the file in chunks, so we don't interfere.
    • EventStreamResponse: 1 byte, for instant real-time event delivery.
    • StreamedResponse, StreamedJsonResponse: 16 KB, the same as the default file chunk size. Can be overridden with the baldinof_road_runner.http_foundation_streamed_responder.chunk_size parameter.
  • ChainResponder - sequentially applies multiple responders (for complex cases)

Both main responders (BufferedResponder and ChunkedResponder) work through output buffering interception using ob_start(). This allows:

  • Properly capturing data sent via echo, print, and other output functions
  • Working with any response type without modifying the application code
  • Maintaining compatibility with standard Symfony behavior
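
As a rough illustration of the interception idea above, the sketch below captures whatever a callback prints through ob_start() instead of letting it reach the client. The function name captureOutput is hypothetical and is not part of the bundle; it only shows the general mechanism, assuming PHP 8.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not the bundle's code): runs $send and returns
 * everything it printed with echo, print, and similar output functions.
 */
function captureOutput(callable $send): string
{
    ob_start();
    try {
        $send();
    } catch (\Throwable $e) {
        // Drop the partial output and restore the previous buffer level.
        ob_end_clean();
        throw $e;
    }

    return (string) ob_get_clean();
}

$body = captureOutput(function (): void {
    echo 'Hello, ';
    print 'world';
});
```

The captured string could then be handed to the worker in one piece, which is roughly what a buffered strategy amounts to.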

Added unit tests.

4. Results

  • Eliminated connection resets and worker crashes during streaming
  • Maintained performance at the level of the original bundle
  • Established architecture for easy addition of new response types

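Benchmarks

Streamed (1M records)

| Runtime | Status | Time (s) | Speed (MB/s) | Size (MB) |
|---------|--------|----------|--------------|-----------|
| rr      | ❌ 500 | 0.464    | 0            | 0         |
| rr-fork | ✅ 200 | 0.410    | 153.4        | 62.8      |
| fpm     | ✅ 200 | 1.996    | 31.4         | 62.8      |

Streamed echo (1M records)

| Runtime | Status | Time (s) | Speed (MB/s) | Size (MB) |
|---------|--------|----------|--------------|-----------|
| rr      | ❌ 500 | 0.378    | 0            | 0         |
| rr-fork | ✅ 200 | 0.385    | 163.1        | 62.8      |
| fpm     | ✅ 200 | 1.997    | 31.4         | 62.8      |

Streamed json (1M records)

| Runtime | Status | Time (s) | Speed (MB/s) | Size (MB) |
|---------|--------|----------|--------------|-----------|
| rr      | ❌ 500 | 0.445    | 0            | 0         |
| rr-fork | ✅ 200 | 0.274    | 253.3        | 69.5      |
| fpm     | ✅ 200 | 0.288    | 241.5        | 69.5      |

File (60 MB)

| Runtime | Status | Time (s) | Speed (MB/s) | Size (MB) |
|---------|--------|----------|--------------|-----------|
| rr      | ❌ 500 | 0.046    | 0            | 0         |
| rr-fork | ✅ 200 | 0.057    | 1051.5       | 60.0      |
| fpm     | ✅ 200 | 0.082    | 730.5        | 60.0      |

Common

| Runtime | Status | Time (s) | Speed (MB/s) | Size (MB) |
|---------|--------|----------|--------------|-----------|
| rr      | ✅ 200 | 0.010    | 0            | 0         |
| rr-fork | ✅ 200 | 0.001    | 0            | 0         |
| fpm     | ✅ 200 | 0.002    | 0            | 0         |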

Benchmarks repository: https://github.com/Kenny1911/baldinof-roadrunner-bundle-streamed-response-benchmark

@Kenny1911
Contributor Author

@Baldinof Hello! Just a heads up — this PR has been waiting for review for 14 days now. If there are any comments or anything needs to be changed, let me know. If you don't have time at the moment, could you please suggest someone else who can review it instead?

@Baldinof
Owner

Baldinof commented Apr 2, 2026

Hello! Thank you for opening this :)

It overall looks great, streaming has been long-awaited!

I'll try to review and test it during the next weekend.

@Kenny1911
Contributor Author

@Baldinof Hello! I fixed the code to ensure backward compatibility with older Symfony versions.

@Baldinof
Owner

I tested it and it worked well.

  1. One weird thing is even if I reduced baldinof_road_runner.http_foundation_streamed_responder.chunk_size to 1, it was not streaming small chunks, they were all sent at once. With bigger chunks it worked as expected.
  2. Have you seen a performance impact of chain responder and some added loops for each request?

@Kenny1911
Contributor Author

Hello!

ChunkedResponder uses output buffering interception (ob_start()). The parameter baldinof_road_runner.http_foundation_streamed_responder.chunk_size is used as the $chunk_size argument of the ob_start function. It does not split the buffer into chunks of $chunk_size length. This means that the buffer will be flushed after any block of code resulting in output that causes the buffer's length to equal or exceed $chunk_size.

Example:

$response = new StreamedResponse(function(): void {
    echo 'some text ...'; // Echo string of 16KB
});

$responder = new ChunkedResponder(
    responseClass: StreamedResponse::class,
    chunkSize: 1,
);
$responder->respond($httpWorker, $response);

The ChunkedResponder will receive the whole 16 KB string as a single buffer. The ob_start function does not split it into 1-byte chunks.
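
This can be checked outside the bundle with plain PHP. The snippet below is a minimal sketch, assuming PHP 8 in CLI mode; the variable names are illustrative. The callback only records what ob_start() hands it.

```php
<?php
declare(strict_types=1);

$buffers = [];

// chunk_size is 1, but it acts as a flush threshold, not as a split size.
ob_start(function (string $buffer) use (&$buffers): string {
    $buffers[] = $buffer;

    return ''; // swallow the output; we only want to inspect it
}, 1);

echo str_repeat('x', 16 * 1024); // a single 16 KB write

ob_end_clean();

// Ignore the empty buffer passed on cleanup and look at the sizes received.
$nonEmpty = array_values(array_filter($buffers, fn (string $b): bool => $b !== ''));
$lengths = array_map('strlen', $nonEmpty);
```

The 16 KB write arrives in the callback as one buffer, because the threshold is only evaluated after each write.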

Symfony was originally designed for runtimes where the PHP process dies after each request: Apache with mod_php and PHP-FPM. The Response::sendContent() method is implemented differently for each response type. To avoid manually reimplementing a RoadRunner-specific version of this method for every response type, we make a compromise: we wrap the Response::sendContent() call in ob_start(). It's not ideal, but it works.

I tried, inside the ChunkedResponder, to additionally split the content into chunks of $chunk_size size within each iteration of ob_start and send them separately. However, in practice, this complicated the code, and I did not get any noticeable gain in performance or streaming behavior. Therefore, I decided to abandon this idea and keep the simple use of ob_start without additional splitting.


The performance impact of ChainResponder is negligible. It iterates over a small, fixed set of responders (5 by default) and performs simple supports() checks, so the overhead per request is minimal. In practice, I have not observed any performance degradation caused by this approach, even under load.

It would be possible to rewrite ChainResponder to use a tagged locator instead of a tagged iterator. This would slightly improve performance, because it would no longer need to loop through the responders on every request to find a suitable one.

However, in the supports() method (when using custom responders), the check can be more complex than a simple comparison of the Response object type. For example, a responder might support not a single specific class but an entire family of responses, or it might check for the presence of certain headers. In such cases, a tagged locator with access by a pre-known key would no longer work.

Therefore, using a tagged iterator is the more flexible approach. Yes, it is slightly slower, but it can handle any logic inside supports(). In practice, the performance difference is so small that it can be ignored.
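
To make the trade-off concrete, here is a simplified sketch of the iterate-and-ask pattern. All names in it (SketchResponder, SketchChainResponder, HeaderBasedResponder, FallbackResponder) are hypothetical and do not mirror the bundle's actual interfaces; responses are plain objects with a headers array so the example runs without Symfony.

```php
<?php
declare(strict_types=1);

/** Hypothetical contract, loosely modeled on the responders described above. */
interface SketchResponder
{
    public function supports(object $response): bool;

    public function respond(object $response): string;
}

final class SketchChainResponder
{
    /** @param iterable<SketchResponder> $responders */
    public function __construct(private iterable $responders)
    {
    }

    public function respond(object $response): string
    {
        // The first responder that claims the response handles it.
        foreach ($this->responders as $responder) {
            if ($responder->supports($response)) {
                return $responder->respond($response);
            }
        }

        throw new \LogicException('No responder supports ' . $response::class);
    }
}

/** Decides by header, not by class: a key-based lookup could not express this. */
final class HeaderBasedResponder implements SketchResponder
{
    public function supports(object $response): bool
    {
        return ($response->headers['X-Accel-Buffering'] ?? null) === 'no';
    }

    public function respond(object $response): string
    {
        return 'chunked';
    }
}

final class FallbackResponder implements SketchResponder
{
    public function supports(object $response): bool
    {
        return true;
    }

    public function respond(object $response): string
    {
        return 'buffered';
    }
}

$chain = new SketchChainResponder([new HeaderBasedResponder(), new FallbackResponder()]);

$first = $chain->respond((object) ['headers' => ['X-Accel-Buffering' => 'no']]);
$second = $chain->respond((object) ['headers' => []]);
```

Order matters in this design: the catch-all responder has to come last, otherwise it would shadow the more specific ones.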

@Kenny1911
Contributor Author

@Baldinof Hello! Just writing so this PR doesn't get lost.

I'd really like to get this change merged properly. Could you at least briefly let me know:

  • Are you planning to review it anytime soon?

  • Or is there something I should fix right now?

I'm totally open to making changes. Thanks for understanding!

Comment on lines +36 to +39
// Skip, if content size less, then chunk size
if (\strlen($content) < $this->chunkSize && !$isLast) {
    return '';
}
Owner

Is this really required? As chunksize is already passed to ob_start?

Contributor Author

@Kenny1911 Kenny1911 May 4, 2026

Yes

Because flush() breaks chunking.

When using only echo (without flush), the output buffer callback is only triggered when the buffer reaches the chunk size or the script ends. The condition strlen($content) < $chunkSize && !$isLast will never be true because the buffer is never flushed early:

$chunks = [];
ob_start(function($buffer) use (&$chunks) {
    $chunks[] = $buffer;

    return '';
}, chunk_size: 10);

echo "foo";
echo "bar";
echo "baz";
echo "qux";
echo "quux";
echo "quuux";

@ob_end_clean();

var_dump($chunks);

Output:

array(2) {
  [0] =>
  string(12) "foobarbazqux"
  [1] =>
  string(9) "quuxquuux"
}

However, when flush() is called between echo statements, the callback is invoked on every flush, regardless of the chunk size:

$chunks = [];
ob_start(function($buffer) use (&$chunks) {
    $chunks[] = $buffer; // <-- respond chunk

    return '';
}, chunk_size: 10);

echo "foo"; @ob_flush(); flush();
echo "bar"; @ob_flush(); flush();
echo "baz"; @ob_flush(); flush();
echo "qux"; @ob_flush(); flush();
echo "quux"; @ob_flush(); flush();
echo "quuux"; @ob_flush(); flush();

@ob_end_clean();

var_dump($chunks);

Output:

array(7) {
  [0] =>
  string(3) "foo"
  [1] =>
  string(3) "bar"
  [2] =>
  string(3) "baz"
  [3] =>
  string(3) "qux"
  [4] =>
  string(4) "quux"
  [5] =>
  string(5) "quuux"
  [6] =>
  string(0) ""
}

The solution

To solve this, we accumulate all incoming buffers until we either reach the chunkSize or detect the end of the stream ($isLast). This ensures that chunks are never smaller than the configured chunkSize.

Additionally, this approach reduces the number of HttpWorkerInterface::respond() invocations by batching multiple small flushes into single, properly-sized chunks.
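
The accumulation described above can be sketched in isolation as follows. This is a simplified stand-in, not the bundle's ChunkedResponder: the helper name collectChunks is hypothetical, the collected array takes the place of HttpWorkerInterface::respond() calls, and the end of the stream is detected through the PHP_OUTPUT_HANDLER_FINAL flag that PHP passes to the output callback.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: runs $producer under output buffering and returns
 * the chunks that would be sent, never smaller than $chunkSize except the last.
 *
 * @return list<string>
 */
function collectChunks(callable $producer, int $chunkSize): array
{
    $sent = [];
    $pending = '';

    ob_start(function (string $buffer, int $phase) use (&$sent, &$pending, $chunkSize): string {
        $pending .= $buffer;
        $isLast = ($phase & PHP_OUTPUT_HANDLER_FINAL) !== 0;

        // Skip while the accumulated content is smaller than the chunk size.
        if (strlen($pending) < $chunkSize && !$isLast) {
            return '';
        }

        if ($pending !== '') {
            $sent[] = $pending; // stands in for one respond() call
        }
        $pending = '';

        return '';
    }, $chunkSize);

    try {
        $producer();
    } finally {
        ob_end_clean(); // triggers the final callback invocation
    }

    return $sent;
}

// Same input as the flush() example above: six small writes, each flushed.
$sent = collectChunks(function (): void {
    foreach (['foo', 'bar', 'baz', 'qux', 'quux', 'quuux'] as $part) {
        echo $part;
        @ob_flush();
        flush();
    }
}, 10);
```

With this guard the six flushed writes collapse back into the same two chunks as in the flush-free example ("foobarbazqux" and "quuxquuux").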

Context

In Symfony ≥ 7.3, StreamedResponse accepts $callbackOrChunks either as a callback that echoes its output or as an iterable of chunks. For an iterable, Symfony echoes each chunk and flushes the buffer right after it, so the output callback fires once per chunk.

This case is covered in unit tests: https://github.com/Kenny1911/roadrunner-bundle/blob/419915790892b49419447cc7651e23e5b835df31/tests/RoadRunnerBridge/HttpFoundationWorkerTest.php#L363

I also tested it in a benchmark project: https://github.com/Kenny1911/baldinof-roadrunner-bundle-streamed-response-benchmark/blob/master/src/Controller/Controller.php#L19

Contributor Author

Benchmark: With Condition vs Without Condition

The benchmark generates a response of approximately 62.8 MB, consisting of 1,000,000 iterations, each yielding a single line. The default chunk size is 16 KB.

Example controller:

public function action(): StreamedResponse
{
    return new StreamedResponse(function() {
        for ($i = 0; $i < 1_000_000; ++$i) {
            yield "id: {$i}; title: Title; description: Description; enabled: true" . PHP_EOL;
        }
    });
}
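
| Runtime           | Status | Time (s) | Speed (MB/s) | Size (MB) |
|-------------------|--------|----------|--------------|-----------|
| with condition    | ✅ 200 | 0.414    | 151.8        | 62.8      |
| without condition | ✅ 200 | 6.944    | 9.0          | 62.8      |
| fpm               | ✅ 200 | 2.008    | 31.2         | 62.8      |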

Results

With condition – Buffers are accumulated until the 16 KB chunk size is reached (or the stream ends). Despite 1,000,000 individual yield calls, the number of HttpWorkerInterface::respond() invocations is reduced to a minimum — approximately the number of full chunks (62.8 MB / 16 KB ≈ 4,000 calls).

Without condition – Each flush() triggers the callback immediately. As a result, every micro-chunk (here a single line of a few dozen bytes) is passed to the worker in its own respond() call. This creates a massive number of calls: around 1,000,000 (or even more, including empty chunks at the end).

FPM – Included as a baseline reference (standard Symfony with PHP-FPM, without RoadRunner optimizations). On 62.8 MB, it takes 2 seconds — slower than the with-condition approach (0.414 s), but significantly faster than the without-condition approach (6.944 s).


Conclusion

The with condition approach is approximately 16–17 times faster than the naive implementation (0.414 s vs 6.944 s) on the same data volume.

The difference is straightforward:

  • Without condition – 1,000,000+ worker calls (each yield triggers a separate micro-request)
  • With condition – ~4,000 worker calls (only when a full chunk is accumulated)

The condition if (strlen($content) < $this->chunkSize && !$isLast) ensures that partial chunks are not sent on intermediate flush() calls. Instead, data accumulates until either:

  • The buffer reaches chunkSize → send a full chunk
  • The stream ends ($isLast === true) → send the remaining data
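
The difference in call counts can be reproduced at a smaller scale without RoadRunner. The sketch below is an approximation under stated assumptions: countSentChunks is a hypothetical helper, a counter stands in for HttpWorkerInterface::respond(), and only 10,000 lines are generated instead of 1,000,000.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: prints $lines lines with a flush after each one and
 * returns how many non-empty chunks the output callback would have sent.
 */
function countSentChunks(int $lines, int $chunkSize, bool $withCondition): int
{
    $calls = 0;
    $pending = '';

    ob_start(function (string $buffer, int $phase) use (&$calls, &$pending, $chunkSize, $withCondition): string {
        $pending .= $buffer;
        $isLast = ($phase & PHP_OUTPUT_HANDLER_FINAL) !== 0;

        if ($withCondition && strlen($pending) < $chunkSize && !$isLast) {
            return ''; // keep accumulating
        }

        if ($pending !== '') {
            ++$calls; // stands in for one respond() call
        }
        $pending = '';

        return '';
    }, $chunkSize);

    for ($i = 0; $i < $lines; ++$i) {
        echo "id: {$i}; title: Title; description: Description; enabled: true" . PHP_EOL;
        @ob_flush();
        flush();
    }

    ob_end_clean();

    return $calls;
}

$without = countSentChunks(10_000, 16 * 1024, false);
$with = countSentChunks(10_000, 16 * 1024, true);
```

Without the condition the counter grows by one per line; with it, the count is on the order of the total output size divided by the chunk size.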

@Kenny1911
Contributor Author

@Baldinof Hello!

I don't want to be a bother, but a question on the substance: are there still any open questions about the implementation of this PR?

If not — I'd really appreciate a review whenever you get to it. If yes — I'm ready to make changes.

Thanks!

Owner

@Baldinof Baldinof left a comment

Sorry again for the delay, and thank you for the responses!

It's a great PR, thank you for contributing! StreamedResponse has been requested for a while now and it's the best implementation :)

@Baldinof Baldinof merged commit c4d990c into Baldinof:3.x May 12, 2026
21 checks passed
@Baldinof
Owner

I'll do some final tests on the default branch and release later today or tomorrow

@Kenny1911
Contributor Author

@Baldinof Hello!

Apologies for the follow-up, but could you give an approximate timeline for the release? I need this change for my project - otherwise I wouldn't be asking.

Thanks!

@Baldinof
Owner

I just published 3.4.0 with it, let me know if it's all good :)


Development

Successfully merging this pull request may close these issues.

Incorrect usage of streamed response

2 participants