Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
65 changes: 65 additions & 0 deletions proposals/093-correlated-request-response-data-api.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,65 @@
# 93 - Request/Response Correlated Data API

Provide a mechanism within the Kroxylicious Filter API to allow filters to stash state during the Request phase and retrieve it during the corresponding Response phase without manual state management.

## Current situation

Currently, filters that need to associate data from a request with its future response (e.g., `AuthorizationFilter` or `EntityIsolationFilter`) must manually manage an internal `Map`. Typically, the filter stashes data using the Kafka `correlationId` as a key during the request path and removes ("pops") it during the response path.
Comment thread
robobario marked this conversation as resolved.

Each Filter has so far had to hand-roll this code.

## Motivation

Manual state management for correlated data is error-prone and introduces several risks:
* **Memory Leaks:** If a response never returns or a filter fails to "pop" the data, the internal map grows indefinitely.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or even the io.kroxylicious.proxy.filter.ResponseFilter#shouldHandleResponse could even say no

* **Edge Case Complexity:** Authors must be aware of and manually account for "zero-ack" Produce requests (which have no response) and scenarios where requests are dropped or short-circuited via the Filter API.
* **Boilerplate:** Every filter author must reimplement the same "store-and-retrieve" logic, leading to inconsistent implementations.

## Proposal

The proposal introduces a framework-managed "Correlated Data" store scoped to a single filter instance and a single Request/Response lifecycle.

### API Changes
To ensure type safety and prevent invalid states, the API will be updated to allow attaching data only when a request is being forwarded. This is best achieved by introducing a more fluent builder pattern.

**Request Path:**
Currently `FilterContext#forwardRequest` returns a terminal stage.

Instead of a simple terminal stage, we introduce a `forwardRequestBuilder` to handle the transition:
```java
return context.requestFilterResultBuilder()
.forwardRequestBuilder(header, request)
.withCorrelatedData(myStateObject)
.completed();
```

By including it in the command object from the Filter to the Framework we make it sympathetic
to asynchronously populated correlated data.

**Response Path:**
The context made available to the response filter will provide access to the stashed object:
```java
Optional<MyStateObject> data = context.correlatedData();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what happens if the request filter calls this method? is it an error?

Copy link
Copy Markdown
Member Author

@robobario robobario Apr 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mm annoying, throwing is probably a good idea to signal it's being misused. I wonder if we should look into splitting a request and response context. We already have the awkwardness that FilterContext carries methods for both request and response (forward(..) etc).

```

### Implementation Considerations
* **Scoping:** The data is strictly scoped to the filter instance that created it. It is not shared with other filters in the chain to avoid implicit coupling.
* **Lifecycle Management:** The framework becomes responsible for "popping" the data. If a request is a zero-ack Produce request, the framework will proactively discard the stashed data to prevent leaks. If the Response Filter does not access the data, the framework will handle discarding the stashed data.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with this approach, but did you consider making it an error to try to store state in a zero-ack produce-request. should it be a rejected alternative?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that's a bad idea because when a filter developer forgets to gets the acks=0 case (and we've been guilty of that ourselves!) they release something which will break with an exception.

But there is a deeper question there: We're assuming that this state that we're keeping around for the response is only really using memory. But what if there are other resources in there, that are invisible to the runtime. Like a socket/connection/file descriptor? So I wonder if we should require that the data they associate should actually be Closable. That way we can guarantee that the runtime will fully clean it up even if the filter developer has forgotten that acks=0 is a thing.

Wdyt @robobario ?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

then in the simple case they'd have to implement a no-op close in their record, and be forced to wrap simple data types in a Closable. Maybe it would be enough to optionally support it, so we close at pop-time if it's Closable.

* **Data Structure:** Initially, the API will support a single state object. Since filter authors can define a custom `Record` or `POJO` to hold multiple values, a complex Key-Value map is deemed unnecessary at this stage.

## Affected/not affected projects

* **Affected:** `kroxylicious-api` (interface changes), `kroxylicious-runtime` (implementation of state management).
* **Not Affected:** `kroxylicious-filters` (until they are refactored to use the new API), external integration tests.

## Compatibility

* **Backward Compatibility:** The `forward` methods are retained as a naive forward is a common action when a Filter does not want to take any action. The convenience is worth carrying the two mechanisms for forwarding.
* **Forward Compatibility:** By using a builder pattern (`forwardRequestBuilder`), we create a path for future extensions.

## Rejected alternatives

* **Downstream Shared State:** Attaching data that is visible to other filters. This was rejected as it introduces "spooky action at a distance" and makes filter ordering too brittle. We prefer explicit message manipulation or tags for inter-filter communication.
* **API decoupled from the Filter Command:** You could alternatively offer methods on the FilterContext like `context.pushCorrelatedData(x)`, but then you have to consider more edge-cases like Filters invoking it
at unpredictable times from uncontrolled threads.
* **Simple Method Overloading:** Adding `forward(header, request, data)` was rejected as it does not scale well if we need to add more parameters in the future and makes the `FilterContext` API cluttered.
Loading