Ai gateway by predic8 · Pull Request #2937 · membrane/api-gateway

predic8 · 2026-05-07T13:24:48Z

Summary by CodeRabbit

New Features
- Added LLM Gateway interceptor supporting Claude, Google Gemini, and OpenAI providers.
- Implemented token-based rate limiting with per-user quotas and configurable reset periods.
- Enabled API key sharing with per-user token budgets and model allowlisting.
- Added input/output token limit enforcement and usage tracking.

…limits

coderabbitai · 2026-05-07T13:25:17Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

▶️ Resume reviews
🔍 Trigger review

📝 Walkthrough

Walkthrough

Adds an LLM gateway interceptor with provider abstractions (OpenAI, Claude, Google), SSE parsing, JSON helpers, token usage tracking and stores, error creators, tutorials, and integration/unit tests.

Changes

LLM Gateway Integration

Layer / File(s)	Summary
Core contracts and utilities `core/.../provider/(LLM)`, `core/.../json/JsonUtil.java`, `core/.../util/http/SSEParser.java`	Introduces LLMRequest/Response/Provider, error creator interface/base, JSON helpers, and SSE streaming parser with tests.
Provider adapters `core/.../provider/openai/`, `core/.../provider/claude/`, `core/.../provider/google/*`	Adds provider-specific request parsing, token estimates, SSE response handling, and error response builders.
Gateway interceptor orchestration `core/.../ai/LLMGatewayInterceptor.java`	Validates/authenticates, enforces token/model limits, rewrites API keys, forwards JSON, processes usage, and persists it.
Usage tracking and stores `core/.../ai/store/*`, `core/.../util/jdbc/AbstractJdbcSupport.java`	Defines Usage/AiApiUser, in-memory and JDBC stores, and JDBC connection helper.
Events and parsing `core/.../ai/*Event.java`	Parses provider SSE events into structured event types.
Tutorials and tests `distribution/tutorials/ai/llm-gateway/**`, `distribution/...TutorialTest.java`	Adds tutorials (OpenAI/Claude/Google) with scripts and comprehensive integration tests.
Minor infrastructure `annot/.../ObjectBinder.java`, `core/.../security/AbstractSecurityScheme.java`, MCP/jsonrpc headers	Instantiation helper and import/header cleanups; no behavior change in security/MCP/jsonrpc parts.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant Gateway as LLMGatewayInterceptor
  participant Provider as LLMProvider
  participant Upstream as Upstream LLM API
  participant Store as AiApiStore

  Client->>Gateway: HTTP request (JSON, API key)
  Gateway->>Provider: getLLMRequest(exchange)
  Gateway->>Store: getUser(token)? / checkLimit(...)
  alt Auth/limits fail
    Gateway-->>Client: Error response (auth/limits/model/input)
  else OK
    Gateway->>Provider: getLLMResponse(exchange, postProcessor)
    Provider->>Upstream: Forward adapted request
    Upstream-->>Provider: Streamed/JSON response
    Provider-->>Gateway: LLMResponse (usage, events)
    Gateway->>Store: store(user, usage)
    Gateway-->>Client: Upstream response
  end

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Suggested labels

7.x

Suggested reviewers

christiangoerdes
rrayst

Poem

A rabbit taps keys with a gentle thrum,
Streams turn to sense as the tokens come.
Claude, OpenAI, and Gemini sing—
The gateway conducts like a velvet spring.
Limits are counted, errors well-spun,
Burrows of tests say, “It’s done!” 🐇✨

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch ai-gateway

coderabbitai

Actionable comments posted: 13

🧹 Nitpick comments (6)

annot/src/main/java/com/predic8/membrane/annot/yaml/parsing/binding/ObjectBinder.java (1)

102-108: 💤 Low value

instantiate declares checked exceptions that are already swallowed by bind's catch-all

The helper correctly surfaces NoSuchMethodException as a descriptive error. Minor note: InvocationTargetException, InstantiationException, and IllegalAccessException are declared in the throws clause, but since instantiate is private and its only caller (bind) already catches Throwable, these declarations are noise. Consider removing them, which also removes the need for bind to implicitly rely on the catch-all for those paths.
🔧 Optional cleanup
-    private static <T> `@NotNull` T instantiate(Class<T> clazz) throws InvocationTargetException, InstantiationException, IllegalAccessException {
+    private static <T> `@NotNull` T instantiate(Class<T> clazz) {
         try {
             return clazz.getConstructor().newInstance();
         } catch (NoSuchMethodException e) {
             throw new ConfigurationParsingException("Class %s does not have a public no-arg constructor.".formatted(clazz.getName()));
+        } catch (InvocationTargetException | InstantiationException | IllegalAccessException e) {
+            throw new ConfigurationParsingException(e);
         }
     }
This keeps all exception handling inside the helper and makes the contract self-contained, while bind()'s catch (ConfigurationParsingException e) still applies the source-file context as before.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@annot/src/main/java/com/predic8/membrane/annot/yaml/parsing/binding/ObjectBinder.java`
around lines 102 - 108, The instantiate helper should not declare
InvocationTargetException, InstantiationException, or IllegalAccessException
since its caller bind already swallows Throwable; instead, remove those
exceptions from the method signature and catch InvocationTargetException,
InstantiationException, and IllegalAccessException inside instantiate,
converting them to a ConfigurationParsingException with a clear message
(including clazz.getName() and the original exception as the cause); keep the
existing NoSuchMethodException handling as-is; update nothing in bind beyond
removing the throws dependency so bind continues to catch
ConfigurationParsingException for source-file context.

core/src/main/java/com/predic8/membrane/core/util/jdbc/AbstractJdbcSupport.java (1)

60-71: Open @TODO: migrate subclass connections to getConnection()

The comment on line 60 documents that subclasses should be migrated to this method but haven't been yet. Without that migration the deduplication benefit is unrealised.

Would you like me to open a tracking issue for migrating existing subclasses to use getConnection()?

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/util/jdbc/AbstractJdbcSupport.java`
around lines 60 - 71, Subclasses are still opening connections directly instead
of using AbstractJdbcSupport.getConnection(), preventing centralized error
handling and deduplication; update all subclasses that currently call
datasource.getConnection() or manage SQLException to instead call
getConnection() so they benefit from the
ClassNotFoundException-to-ConfigurationException mapping provided by
getRootCause() and ConfigurationException, remove duplicate try/catch logic from
those subclasses, and ensure imports and tests reflect the consolidated behavior
(look for usages of datasource.getConnection(), direct new SQLException
handling, and any class-specific connection helpers to replace with
getConnection()).

core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java (1)

16-16: 💤 Low value

Consider documenting the checkLimit return value contract

It's unclear whether the return value means "remaining tokens," "0 means deny," or something else. A brief Javadoc would help implementors (e.g., JDBCAiApiUsageStore currently returns 0 always, which may be interpreted as "limit reached").

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java`
at line 16, Add a clear Javadoc to the AiApiStore.checkLimit(AiApiUser user)
method that defines the return-value contract (e.g., that it returns the number
of remaining allowed tokens/requests, that 0 means no remaining allowance and
should be treated as "deny", and any special sentinel values like -1 mean
"unlimited"), and update the JDBCAiApiUsageStore implementation to conform to
that contract (it currently returns 0 always which can be misinterpreted);
reference the method name checkLimit and the JDBCAiApiUsageStore implementation
when making the change so callers and implementors know the exact semantics.

core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java (2)

91-108: 💤 Low value

Optional: replace bare RuntimeException with a typed exception.

new RuntimeException(e) in setJsonResponse/getJson swallows the failure category; a domain-specific exception (or at least an IllegalStateException) makes upstream handling/logging clearer and avoids mixing programmer errors with checked-IO failures.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java`
around lines 91 - 108, Replace the bare RuntimeException throws in
setJsonResponse and getJson with a more specific unchecked exception (e.g.,
IllegalStateException or a domain-specific exception) so failure semantics are
clearer; update setJsonResponse to throw IllegalStateException with a contextual
message when JsonProcessingException occurs, and update getJson to throw
IllegalStateException (or a new OpenAIApiException) for both the "Expected JSON
Object" case and the caught IOException, including the original exception as the
cause and useful context (referencing methods setJsonResponse, getJson, om,
Exchange and Flow to locate the code).

71-89: ⚡ Quick win

Recommended: avoid parse/re-serialize when no transformation is needed.

getJson (line 71) and setJsonResponse (line 87) run on every request even when both maxOutputTokens and maxInputTokens are 0 (i.e., nothing to mutate). This adds latency and forces every request through this interceptor to be a JSON object — a non-JSON request will throw RuntimeException from getJson and surface as an internal server error. Skip the body work when there is nothing to do.

♻️ Proposed shape

-        var json = getJson(exc, REQUEST);
-
-        if (maxOutputTokens != 0) {
-            json.put(MAX_OUTPUT_TOKENS, maxOutputTokens);
-        }
-
-        if (maxInputTokens != 0) {
-            var input = json.get("input");
-            if (input != null) {
-                var estimated = estimateTokens(input.asText());
-                if (estimated > maxInputTokens) {
-                    exc.setResponse(contextLengthExceeded(maxInputTokens, estimated));
-                    return RETURN;
-                }
-            }
-        }
-        setJsonResponse(exc, json);
-        return CONTINUE;
+        if (maxOutputTokens == 0 && maxInputTokens == 0) {
+            return CONTINUE;
+        }
+        var json = getJson(exc, REQUEST);
+        if (maxInputTokens != 0) {
+            var input = json.get("input");
+            if (input != null) {
+                var estimated = estimateTokens(input.asText());
+                if (estimated > maxInputTokens) {
+                    exc.setResponse(contextLengthExceeded(maxInputTokens, estimated));
+                    return RETURN;
+                }
+            }
+        }
+        if (maxOutputTokens != 0) {
+            json.put(MAX_OUTPUT_TOKENS, maxOutputTokens);
+        }
+        setJsonResponse(exc, json);
+        return CONTINUE;

Also note: input-token check happens after max_output_tokens is written into json; it's safer to validate first so a rejected request never has its body mutated.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java`
around lines 71 - 89, Detect when both maxOutputTokens and maxInputTokens are
zero and short-circuit the interceptor before calling getJson or setJsonResponse
so you skip parsing/serializing entirely; locate the logic in
OpenAIAPIInterceptor.java around the getJson(...) and setJsonResponse(...) calls
and return CONTINUE immediately if both limits are 0. Additionally, when
maxInputTokens is > 0 or maxOutputTokens is > 0, validate the input token count
first (use estimateTokens on the request input via json.get("input") and call
exc.setResponse(contextLengthExceeded(...))/return RETURN if exceeded) before
mutating the json with json.put(MAX_OUTPUT_TOKENS, ...), ensuring getJson is
only invoked when you will actually inspect or modify the body.

core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java (1)

11-11: 💤 Low value

Minor: formatting nit on @MCElement attribute list.

Missing space after the comma in component = false, id="simple-ai-api-store" (name="simpleStore",component). Cosmetic only.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java`
at line 11, The `@MCElement` annotation on the SimpleAiApiStore class has a
missing space after the comma in its attribute list; update the annotation on
SimpleAiApiStore (the `@MCElement`(...) line) to read name="simpleStore",
component = false, id="simple-ai-api-store" (ensure there is a space after the
comma between attributes and keep attribute names/values unchanged) so the
formatting is consistent.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@core/src/main/java/com/predic8/membrane/core/interceptor/ai/AiUtil.java`:
- Around line 45-48: The current check in AiUtil that uses
ah.indexOf(BEARER_PREFIX) allows "Bearer" to appear anywhere in the header
(e.g., "XBearer ..."); change the logic to ensure the Authorization header value
starts with the Bearer scheme by using ah.trim().startsWith(BEARER_PREFIX) (or
require index == 0 after trimming) before extracting the token in the method
that currently performs the indexOf check; after confirming the prefix at the
start, strip the exact prefix and any leading whitespace to return the token.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java`:
- Line 65: The stored property MEMBRANE_AI_USERTOKEN currently holds an
Optional<AiApiUser> (set via exc.setProperty(..., user)) but is later read with
exc.getProperty(MEMBRANE_AI_USERTOKEN, String.class) for per-user tracking;
change the stored representation to the actual bearer/user token string so the
typed accessor matches: when setting MEMBRANE_AI_USERTOKEN store
user.get().getToken() (or whichever String field represents the user/bearer)
instead of the Optional, and ensure any code that reads it
(exc.getProperty(MEMBRANE_AI_USERTOKEN, String.class) and
SimpleAiApiStore.store(String user, ...)) expects that String token
consistently.
- Around line 147-149: The getAiStore() accessor currently returns null; change
it to return the configured AiApiStore instance used by the interceptor (e.g.,
return the aiStore/aiApiStore field or delegate to the existing getter used by
other accessors) so reflective access and tests see the actual store; update the
OpenAIAPIInterceptor#getAiStore method to return the interceptor's stored
AiApiStore instance (matching the field name used in the class) instead of null.
- Around line 111-126: handleResponse currently calls store.store(...)
unconditionally which causes an NPE when no aiStore is configured; update the
OpenAIAPIInterceptor.handleResponse method to guard the usage-storage logic with
a null-check (if (store != null)) before calling store.store and getUsage,
mirroring the guard used in handleRequest, and ensure you still continue
processing (return CONTINUE) when store is null; reference symbols:
handleResponse, store, store.store(...), getUsage(...), MEMBRANE_AI_USERTOKEN.
- Around line 128-134: In OpenAIAPIInterceptor.getUsage, the current calls
usage.get("input_tokens").asInt() etc. will NPE if any field is missing; change
to use the safe JSON path accessor (e.g., usage.path("input_tokens").asInt(0))
for each token field so missing keys fall back to 0, and keep the existing guard
for a completely missing usage node (return new Usage(0,0,0)). Ensure you
construct the Usage with the three safe asInt(0) results.
- Around line 120-123: The current check json.get("error").isNull() causes an
NPE when the "error" field is absent; change to a null-safe path lookup using
JsonNode.path(): obtain JsonNode error = json.path("error") and then check both
!error.isMissingNode() && !error.isNull() before returning CONTINUE (update the
code in OpenAIAPIInterceptor where json.get("error") is used).

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAiApiUtil.java`:
- Around line 25-35: contextLengthExceeded currently serializes ErrorBody
directly producing a flat JSON; change it to wrap the ErrorBody inside an
ErrorEnvelope (same pattern used by authenticationFailed() and
tokenLimitExceeded()) before calling createJson so the response shape becomes
{"error":{...}}; update the return in contextLengthExceeded to build the
badRequest().json(createJson(new ErrorEnvelope(new ErrorBody(...)))) response
using the existing ErrorEnvelope and ErrorBody classes.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiLimit.java`:
- Around line 17-38: AiApiLimit has concurrent mutations on tokens and
nextReset; make mutations atomic by synchronizing access: mark both checkLimit()
and addTokens(long) as synchronized so reads/writes to tokens and nextReset
happen under the same intrinsic lock for AiApiLimit, ensuring the reset branch
in checkLimit() and the this.tokens += tokens update are thread-safe;
alternatively, replace long tokens with AtomicLong and guard nextReset/reset
logic with a synchronized block, but keep checkLimit() and addTokens() logically
synchronized on the instance so tokens and nextReset stay consistent.
- Around line 22-24: The constructor sets nextReset using period which defaults
to 0, causing immediate resets; change initialization so nextReset is computed
lazily in checkLimit() (or validate period>0 in init()). Specifically, in class
AiApiLimit defer setting nextReset in the constructor, and in method
checkLimit() detect an uninitialized state (e.g., period <= 0 or nextReset ==
null/equals now sentinel) and then set nextReset = now().plusSeconds(period)
(after ensuring period>0) and initialize tokens appropriately; alternatively add
an init() validation in AiApiLimit to assert period > 0 and throw/configure
before any checkLimit() runs. Ensure references: AiApiLimit constructor, field
nextReset, field period, method checkLimit, and optional init().

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/JDBCAiApiUsageStore.java`:
- Around line 53-61: The methods in JDBCAiApiUsageStore — getUser(String) and
checkLimit(AiApiUser) — are stubbed and silently break
authentication/rate-limiting; replace the current bodies so they throw
UnsupportedOperationException with a clear message like "getUser not implemented
in JDBCAiApiUsageStore" / "checkLimit not implemented in JDBCAiApiUsageStore"
(or alternatively add explicit TODO comments) so callers fail fast instead of
being silently denied; ensure messages reference the method names and suggest
implementing them or using a different store, and keep the existing store()
functionality intact.
- Around line 13-22: The CREATE_TABLE_SQL in JDBCAiApiUsageStore uses
PostgreSQL's "BIGINT GENERATED ALWAYS AS IDENTITY" which breaks MySQL/MariaDB;
update createTablesIfNotExist()/JDBCAiApiUsageStore to use database-specific
DDL: detect the JDBC URL/DatabaseMetaData and emit "BIGINT AUTO_INCREMENT
PRIMARY KEY" for MySQL/MariaDB and keep the IDENTITY form for PostgreSQL, or
alternatively switch to a portable approach (e.g., remove DB identity and
generate IDs in application code) so the CREATE_TABLE_SQL constant is
chosen/constructed per database engine.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java`:
- Around line 20-33: store(...) and checkLimit(...) currently use a single
shared AiApiLimit (limit) so per-user quotas are not enforced and concurrent
updates are unsafe; change to maintain per-user limits (e.g., a
ConcurrentHashMap<String, AiApiLimit> or Map<AiApiUser, AiApiLimit> keyed by
token/username) and update store(user, usage) to add tokens to the caller's
AiApiLimit, and change checkLimit(AiApiUser) to consult that user's AiApiLimit;
also make AiApiLimit thread-safe (use AtomicLong/LongAdder for counters or
synchronize access) so addTokens/checkLimit are safe under concurrent requests,
or if you intend a global budget instead rename the field/methods to reflect a
shared limit and document it.
- Around line 16-28: The users field in SimpleAiApiStore is not initialized,
causing getUser(String) to NPE on users.stream(); fix by defensively
initializing the users field (private List<AiApiUser> users = new ArrayList<>())
or by guarding getUser(...) with a null/empty check (e.g., return
Optional.empty() if users is null) and ensure imports (java.util.ArrayList) are
added; update any related logic (store(...) and getUser(...)) to work with the
initialized/guarded users list.

---

Nitpick comments:
In
`@annot/src/main/java/com/predic8/membrane/annot/yaml/parsing/binding/ObjectBinder.java`:
- Around line 102-108: The instantiate helper should not declare
InvocationTargetException, InstantiationException, or IllegalAccessException
since its caller bind already swallows Throwable; instead, remove those
exceptions from the method signature and catch InvocationTargetException,
InstantiationException, and IllegalAccessException inside instantiate,
converting them to a ConfigurationParsingException with a clear message
(including clazz.getName() and the original exception as the cause); keep the
existing NoSuchMethodException handling as-is; update nothing in bind beyond
removing the throws dependency so bind continues to catch
ConfigurationParsingException for source-file context.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java`:
- Around line 91-108: Replace the bare RuntimeException throws in
setJsonResponse and getJson with a more specific unchecked exception (e.g.,
IllegalStateException or a domain-specific exception) so failure semantics are
clearer; update setJsonResponse to throw IllegalStateException with a contextual
message when JsonProcessingException occurs, and update getJson to throw
IllegalStateException (or a new OpenAIApiException) for both the "Expected JSON
Object" case and the caught IOException, including the original exception as the
cause and useful context (referencing methods setJsonResponse, getJson, om,
Exchange and Flow to locate the code).
- Around line 71-89: Detect when both maxOutputTokens and maxInputTokens are
zero and short-circuit the interceptor before calling getJson or setJsonResponse
so you skip parsing/serializing entirely; locate the logic in
OpenAIAPIInterceptor.java around the getJson(...) and setJsonResponse(...) calls
and return CONTINUE immediately if both limits are 0. Additionally, when
maxInputTokens is > 0 or maxOutputTokens is > 0, validate the input token count
first (use estimateTokens on the request input via json.get("input") and call
exc.setResponse(contextLengthExceeded(...))/return RETURN if exceeded) before
mutating the json with json.put(MAX_OUTPUT_TOKENS, ...), ensuring getJson is
only invoked when you will actually inspect or modify the body.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java`:
- Line 16: Add a clear Javadoc to the AiApiStore.checkLimit(AiApiUser user)
method that defines the return-value contract (e.g., that it returns the number
of remaining allowed tokens/requests, that 0 means no remaining allowance and
should be treated as "deny", and any special sentinel values like -1 mean
"unlimited"), and update the JDBCAiApiUsageStore implementation to conform to
that contract (it currently returns 0 always which can be misinterpreted);
reference the method name checkLimit and the JDBCAiApiUsageStore implementation
when making the change so callers and implementors know the exact semantics.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java`:
- Line 11: The `@MCElement` annotation on the SimpleAiApiStore class has a missing
space after the comma in its attribute list; update the annotation on
SimpleAiApiStore (the `@MCElement`(...) line) to read name="simpleStore",
component = false, id="simple-ai-api-store" (ensure there is a space after the
comma between attributes and keep attribute names/values unchanged) so the
formatting is consistent.

In
`@core/src/main/java/com/predic8/membrane/core/util/jdbc/AbstractJdbcSupport.java`:
- Around line 60-71: Subclasses are still opening connections directly instead
of using AbstractJdbcSupport.getConnection(), preventing centralized error
handling and deduplication; update all subclasses that currently call
datasource.getConnection() or manage SQLException to instead call
getConnection() so they benefit from the
ClassNotFoundException-to-ConfigurationException mapping provided by
getRootCause() and ConfigurationException, remove duplicate try/catch logic from
those subclasses, and ensure imports and tests reflect the consolidated behavior
(look for usages of datasource.getConnection(), direct new SQLException
handling, and any class-specific connection helpers to replace with
getConnection()).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 5199bbff-c303-4ad8-a638-e79ae07a87e6

📥 Commits

Reviewing files that changed from the base of the PR and between 1af0001 and d792f5f.

📒 Files selected for processing (12)

annot/src/main/java/com/predic8/membrane/annot/yaml/parsing/binding/ObjectBinder.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/AiUtil.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAiApiUtil.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiLimit.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiUser.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/JDBCAiApiUsageStore.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/Usage.java
core/src/main/java/com/predic8/membrane/core/security/AbstractSecurityScheme.java
core/src/main/java/com/predic8/membrane/core/util/jdbc/AbstractJdbcSupport.java

coderabbitai · 2026-05-07T13:31:34Z

+    private static final String CREATE_TABLE_SQL = """
+            CREATE TABLE IF NOT EXISTS ai_api_usage (
+                id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
+                username VARCHAR(255) NOT NULL,
+                input_tokens INT NOT NULL,
+                output_tokens INT NOT NULL,
+                total_tokens INT NOT NULL,
+                created_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
+            )
+            """;


⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

🧩 Analysis chain

🌐 Web query:

Does MySQL support BIGINT GENERATED ALWAYS AS IDENTITY syntax in CREATE TABLE?

💡 Result:

No, MySQL does not support the BIGINT GENERATED ALWAYS AS IDENTITY syntax in CREATE TABLE. The official MySQL documentation for versions 8.0 [1][2][3] and 8.4 [4] shows no such syntax. Instead, MySQL uses AUTO_INCREMENT for auto-generating unique values in integer columns, including BIGINT. Examples include BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY [5][6][7]. GENERATED ALWAYS AS requires an expression like (expr) [VIRTUAL|STORED], not IDENTITY [2]. This syntax is from PostgreSQL [8], and MySQL users convert it to AUTO_INCREMENT [9]. As of MySQL 8.4 (latest per release notes [4]), no support exists.

Citations:

1: https://dev.mysql.com/doc/refman/8.0/en/create-table.html

2: https://dev.mysql.com/doc/refman/8.0/en/create-table-generated-columns.html

3: https://dev.mysql.com/doc/en/create-table-generated-columns.html

4: https://dev.mysql.com/doc/relnotes/mysql/8.4/en/news-8-4-0.html

5: https://dev.mysql.com/doc/refman/8.1/en/example-auto-increment.html

6: https://www.bennadel.com/blog/4125-testing-bigint-auto-increment-usage-in-mysql-5-7-32-and-lucee-cfml-5-3-8-201.htm

7: https://dev.mysql.com/doc/refman/en/example-auto-increment.html

8: https://www.postgresql.org/docs/current/ddl-identity-columns.html

9: https://stackoverflow.com/questions/47637651/how-to-add-generated-by-default-as-identity-in-mysql-table

BIGINT GENERATED ALWAYS AS IDENTITY is not supported by MySQL/MariaDB

The SQL:2003 syntax GENERATED ALWAYS AS IDENTITY is specific to PostgreSQL (10+). MySQL/MariaDB do not support this; they use AUTO_INCREMENT instead. Running this DDL against MySQL will fail at startup during createTablesIfNotExist().

If multi-database support is required, use database-specific DDL generation or a more portable approach (e.g., application-layer ID generation or separate schema scripts per database).

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/JDBCAiApiUsageStore.java` around lines 13 - 22, The CREATE_TABLE_SQL in JDBCAiApiUsageStore uses PostgreSQL's "BIGINT GENERATED ALWAYS AS IDENTITY" which breaks MySQL/MariaDB; update createTablesIfNotExist()/JDBCAiApiUsageStore to use database-specific DDL: detect the JDBC URL/DatabaseMetaData and emit "BIGINT AUTO_INCREMENT PRIMARY KEY" for MySQL/MariaDB and keep the IDENTITY form for PostgreSQL, or alternatively switch to a portable approach (e.g., remove DB identity and generate IDs in application code) so the CREATE_TABLE_SQL constant is chosen/constructed per database engine.

- Introduced thread-safe `AtomicLong` for token management. - Synchronized reset logic in `AiApiLimit`. - Improved error handling and null checks in OpenAI API interactions. - Default-initialized user list in `SimpleAiApiStore`. - Fixed getter for `AiApiStore` in interceptor.

- Removed `AiUtil` and replaced with modular `AiProvider` interface. - Added provider implementations: `Claude`, `OpenAI`. - Updated `OpenAIAPIInterceptor` to use configurable providers and enforce model restrictions. - Introduced `NoAiApiLimit` for simplified limit management. - Enhanced error handling with model validation in `OpenAiApiUtil`.

coderabbitai

Actionable comments posted: 4

♻️ Duplicate comments (2)

core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java (1)
69-69: ⚠️ Potential issue | 🔴 Critical | ⚡ Quick win

Property type mismatch breaks usage tracking (AiApiUser stored, String retrieved).

You store user object in MEMBRANE_AI_USERTOKEN but later read it as String.class; this will yield null/cast issues and wrong accounting keys.

Also applies to: 137-137
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java`
at line 69, You’re storing an AiApiUser object into the MEMBRANE_AI_USERTOKEN
property via exc.setProperty but other code expects a String; change the set to
store a String key instead (e.g., exc.setProperty(MEMBRANE_AI_USERTOKEN,
user.getId() or user.getUsername()/getToken()) so the later retrieval as
String.class works), and make the same change at the other occurrence around the
second site (the other set at line ~137) so both storage and retrieval use the
same String identifier rather than an AiApiUser instance.
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java (1)
21-34: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Quota accounting is global, not per user.

user is ignored for limit tracking, so one heavy caller can exhaust quota for everyone. This defeats per-user enforcement.

Based on learnings: verify shared-state protections are preserved when logic is split or delegated; here the shared limit state is central and should be explicitly scoped per user.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java`
around lines 21 - 34, The quota accounting currently uses a single shared
"limit" object so store(String user, Usage usage) and checkLimit(AiApiUser user)
ignore the user and apply/globalize tokens; change this to a per-user limit map
(e.g., Map<String, Limit> or Map<AiApiUser, Limit>) and update store to look up
the caller's Limit by token/user and call addTokens on that per-user Limit, and
update checkLimit(AiApiUser user) to query that same per-user Limit; ensure the
map uses a concurrent implementation (ConcurrentHashMap) or other
synchronization to protect shared state and initialize a Limit for a user on
first use (using get-or-create semantics) so existing methods getUser, store,
and checkLimit operate per-user rather than against the single "limit" field.

🧹 Nitpick comments (1)

core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/NoAiApiLimit.java (1)
3-8: ⚡ Quick win

NoAiApiLimit enforces a fixed limit of 1000 instead of "no limit".

The class name implies unlimited access, but checkLimit() returns a hardcoded 1000. Since checkLimit() semantically represents remaining quota and is used to reject requests when the value drops to zero, this effectively caps all users at 1000 tokens. Either rename the class to reflect the actual limit (e.g., DefaultAiApiLimit) or return Long.MAX_VALUE (or similar unbounded sentinel) to match the "no limit" semantics.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/NoAiApiLimit.java`
around lines 3 - 8, The class NoAiApiLimit currently returns a hardcoded 1000 in
checkLimit(), which contradicts its "no limit" name and caps usage; update the
implementation of NoAiApiLimit.checkLimit() to return an unbounded sentinel such
as Long.MAX_VALUE (or another agreed "no limit" value) so it truly represents
unlimited quota, or alternatively rename the class (e.g., DefaultAiApiLimit) and
keep the 1000 if the intention is a fixed default limit; reference NoAiApiLimit
and its checkLimit() method (and AiApiLimit base type) when making the change.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java`:
- Around line 80-89: The current logic only calls
provider.estimateInputTokens(json) if json.get("input") is present, which skips
providers that estimate from other fields; change the block in
OpenAIAPIInterceptor so that whenever maxInputTokens != 0 you call
provider.estimateInputTokens(json) unconditionally (remove the input null
check), then compare the returned estimated value to maxInputTokens and, if
exceeded, call exc.setResponse(contextLengthExceeded(maxInputTokens, estimated))
and return RETURN; ensure you keep the same symbols: maxInputTokens,
provider.estimateInputTokens(json), exc.setResponse(...),
contextLengthExceeded(...), and RETURN.
- Around line 56-57: provider is used without validation and can NPE when
calling provider.getApiKey(...) or provider.setApiKey(...); add a non-null
validation either at startup in the OpenAIAPIInterceptor.init() method (throw a
clear configuration exception if provider is null) or add a request-time guard
in the request handling path that checks if provider == null and returns a 5xx
response with a clear error message; update references to
provider.getApiKey(...) and provider.setApiKey(...) to only be executed after
this check so no NPE can occur.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/OpenAI.java`:
- Around line 35-42: In OpenAI.java, the Bearer token extraction currently uses
ah.indexOf(BEARER_PREFIX) which matches the prefix anywhere; change the logic in
the method that reads ah to require a strict prefix (e.g., after trimming
leading whitespace the header must startWith BEARER_PREFIX or match a
case‑insensitive regex like ^Bearer\s+(\S+)$) and then extract the token from
that exact location (using BEARER_PREFIX.length()), returning null for any
header that does not strictly start with the Bearer prefix or that yields an
empty token; reference BEARER_PREFIX and the variable ah when making this
change.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java`:
- Around line 27-29: getUser in SimpleAiApiStore uses u.getToken().equals(token)
which can NPE if a configured AiApiUser has a null token; change the equality
check to Objects.equals(u.getToken(), token) inside the getUser method to safely
handle nulls and add an import for java.util.Objects if not already present so
the stream filter becomes safe.

---

Duplicate comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java`:
- Line 69: You’re storing an AiApiUser object into the MEMBRANE_AI_USERTOKEN
property via exc.setProperty but other code expects a String; change the set to
store a String key instead (e.g., exc.setProperty(MEMBRANE_AI_USERTOKEN,
user.getId() or user.getUsername()/getToken()) so the later retrieval as
String.class works), and make the same change at the other occurrence around the
second site (the other set at line ~137) so both storage and retrieval use the
same String identifier rather than an AiApiUser instance.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java`:
- Around line 21-34: The quota accounting currently uses a single shared "limit"
object so store(String user, Usage usage) and checkLimit(AiApiUser user) ignore
the user and apply/globalize tokens; change this to a per-user limit map (e.g.,
Map<String, Limit> or Map<AiApiUser, Limit>) and update store to look up the
caller's Limit by token/user and call addTokens on that per-user Limit, and
update checkLimit(AiApiUser user) to query that same per-user Limit; ensure the
map uses a concurrent implementation (ConcurrentHashMap) or other
synchronization to protect shared state and initialize a Limit for a user on
first use (using get-or-create semantics) so existing methods getUser, store,
and checkLimit operate per-user rather than against the single "limit" field.

---

Nitpick comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/NoAiApiLimit.java`:
- Around line 3-8: The class NoAiApiLimit currently returns a hardcoded 1000 in
checkLimit(), which contradicts its "no limit" name and caps usage; update the
implementation of NoAiApiLimit.checkLimit() to return an unbounded sentinel such
as Long.MAX_VALUE (or another agreed "no limit" value) so it truly represents
unlimited quota, or alternatively rename the class (e.g., DefaultAiApiLimit) and
keep the 1000 if the intention is a fixed default limit; reference NoAiApiLimit
and its checkLimit() method (and AiApiLimit base type) when making the change.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 2ac0a859-8e9a-4f2a-bead-13d81b96b81b

📥 Commits

Reviewing files that changed from the base of the PR and between 8fa0d95 and 356a4c5.

📒 Files selected for processing (9)

core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAiApiUtil.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/AiProvider.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/Claude.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/OpenAI.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiLimit.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/NoAiApiLimit.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java

✅ Files skipped from review due to trivial changes (2)

core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/AiProvider.java

🚧 Files skipped from review as they are similar to previous changes (1)

core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiLimit.java

membrane-ci-server · 2026-05-07T16:13:51Z

This pull request needs "/ok-to-test" from an authorized committer.

…bstraction - Added `AiApiRequest` and `AiApiResponse` abstractions for request/response handling. - Introduced `AbstractAiApiRequest` and `AbstractAiApiResponse` as base classes. - Implemented providers: `Google`, `OpenAI`, and `Claude` with concrete request/response handling. - Updated `AiProvider` to handle request/response creation. - Refactored `OpenAIAPIInterceptor` to leverage request/response abstraction and enforce contract restrictions. - Enhanced `JsonUtil` with helper methods for JSON body parsing and updates. - Updated `AiApiStore` and related classes for improved usage tracking and user abstraction.

coderabbitai

Actionable comments posted: 5

♻️ Duplicate comments (1)

core/src/main/java/com/predic8/membrane/core/interceptor/ai/AbstractAiApiRequest.java (1)
40-47: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Bearer token extraction is too permissive — matches anywhere in the header value.

indexOf(BEARER_PREFIX) accepts headers like "Basic abc Bearer xyz" and extracts "xyz" as the token. The standard requires the scheme to be at the start of the value.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/AbstractAiApiRequest.java`
around lines 40 - 47, The current extraction uses ah.indexOf(BEARER_PREFIX)
which wrongly matches the scheme anywhere in the header; change the logic to
require the scheme at the start (e.g., use ah.startsWith(BEARER_PREFIX) or a
regex anchored with ^) before calling ah.substring(...) and trimming, and return
null if the header does not begin with BEARER_PREFIX; keep using BEARER_PREFIX,
ah.substring(...), and token.isEmpty() checks but only after validating the
prefix is at the start.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/ClaudeAiRequest.java`:
- Around line 15-35: The method estimateInputTokens uses an int accumulator
named tokens which can overflow before being returned as a long; change the
accumulator declaration in estimateInputTokens from int to long and ensure all
constants/updates use long arithmetic (e.g., use 1000L and ensure divisions
produce long results or cast operands) when adding system, message, block and
image token estimates so no intermediate int overflow occurs when computing the
total from json.path("system") and json.path("messages").

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/GoogleAiRequest.java`:
- Around line 107-118: getGenerationConfig() can NPE because json may be null
(see estimateInputTokens guard); update getGenerationConfig() (used by
setMaxOutputTokens) to first ensure json is non-null and is an ObjectNode
(create/replace it with a new ObjectNode when null or not an object) before
calling json.get("generationConfig"), then return or create the
"generationConfig" ObjectNode; reference methods: getGenerationConfig,
setMaxOutputTokens and the field json.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/OpenAiAiRequest.java`:
- Around line 15-17: The estimateInputTokens() method in OpenAiAiRequest
currently uses json.path("input").asText("") which returns an empty string for
Chat Completions (no "input" field) and for ArrayNode inputs, making token
checks no-op; update estimateInputTokens() to inspect both "input" and
"messages": if "messages" is an array (Chat Completions) iterate it and
concatenate or sum the lengths of text-like fields (e.g., each message's
"content" or all string values) to compute total chars, if "input" is an array
iterate elements and for each element use asText("") for strings or extract
textual fields from objects before summing; finally compute tokens as
ceil(totalChars / 4.0) and return that long so OpenAIAPIInterceptor's
maxInputTokens check works correctly.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/JDBCAiApiUsageStore.java`:
- Line 15: The SQL DDL string in JDBCAiApiUsageStore (created in the
createTablesIfNotExist block) contains a Java text-block literal with a `//
`@TODO` ...` comment which is being embedded into the SQL and breaks parsing;
remove the `// `@TODO` ...` from the SQL text block or replace it with a valid SQL
comment (e.g., `/* ... */` or `-- ...`), ensuring the `id BIGINT GENERATED
ALWAYS AS IDENTITY PRIMARY KEY` line ends cleanly (no stray `//` token) so the
DDL executes successfully.

In `@core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java`:
- Around line 89-98: getJsonObject currently throws a bare
RuntimeException/IOException on non-JSON or non-object bodies which bubbles out
and crashes the pipeline; change getJsonObject to return Optional<ObjectNode>
(signature: getJsonObject(Message msg) -> Optional<ObjectNode>) and implement it
to return Optional.empty() for non-object JSON or when parsing fails (wrap IO
errors in UncheckedIOException only if you must rethrow); then update callers
(notably AbstractAiApiResponse constructor and any usages in
OpenAIAPIInterceptor) to check the Optional and convert an empty result into a
structured error response or a controlled exception (e.g.,
IllegalArgumentException) so the interceptor can map it to an HTTP error instead
of letting an unchecked RuntimeException propagate.

---

Duplicate comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/AbstractAiApiRequest.java`:
- Around line 40-47: The current extraction uses ah.indexOf(BEARER_PREFIX) which
wrongly matches the scheme anywhere in the header; change the logic to require
the scheme at the start (e.g., use ah.startsWith(BEARER_PREFIX) or a regex
anchored with ^) before calling ah.substring(...) and trimming, and return null
if the header does not begin with BEARER_PREFIX; keep using BEARER_PREFIX,
ah.substring(...), and token.isEmpty() checks but only after validating the
prefix is at the start.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: b118fe28-0fcb-4a38-a1db-897ea2b30c18

📥 Commits

Reviewing files that changed from the base of the PR and between 356a4c5 and 863e5cc.

📒 Files selected for processing (20)

core/src/main/java/com/predic8/membrane/core/interceptor/ai/AbstractAiApiRequest.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/AbstractAiApiResponse.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/AiApiRequest.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/AiApiResponse.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/AiProvider.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/Claude.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/ClaudeAiRequest.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/ClaudeAiResponse.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/Google.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/GoogleAiRequest.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/GoogleAiResponse.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/OpenAI.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/OpenAiAiRequest.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/OpenAiAiResponse.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiUser.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/JDBCAiApiUsageStore.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java
core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java

✅ Files skipped from review due to trivial changes (4)

core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/OpenAiAiResponse.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/AiApiRequest.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/ClaudeAiResponse.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java

🚧 Files skipped from review as they are similar to previous changes (1)

core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiUser.java

…ponents - Updated `checkLimit` method to consider input and output tokens. - Improved token calculation logic in AI request providers. - Enhanced JSON parsing in `JsonUtil` with Optional for safer operations. - Added detailed error handling in `OpenAIAPIInterceptor` for invalid requests. - Refined token estimation logic with safety margins and JSON structure considerations.

coderabbitai

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiLimit.java`:
- Around line 69-72: The setter setPeriod(int period) must validate that period
> 0 and must update the `@GuardedBy`("lock") field nextReset while holding the
lock; modify setPeriod to throw an IllegalArgumentException for non-positive
period values and move the assignment of nextReset = now().plusSeconds(period)
inside the same critical section that protects nextReset (use the class's lock
or synchronized(lock)) so both period and nextReset are updated atomically under
the lock.
- Around line 39-53: The reset-check and token updates must be made atomic: move
the nextReset null/now.isAfter(nextReset) check and the subsequent
tokens.set(0)/nextReset update inside the same synchronized(lock) block and
ensure any token-modifying methods use that same lock; specifically wrap the
block that computes and returns maxTokens - tokens.get() - tokensForNextRequest
in synchronized(lock) (the method containing nextReset) and change
addTokens(long tokens) to synchronize on lock as well (or perform its add inside
synchronized(lock)) so tokens, nextReset and tokensForNextRequest are always
updated/read under the same lock.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java`:
- Around line 22-23: Replace the INFO-level log that prints per-user identifiers
in SimpleAiApiStore (the log.info call that uses user.getName() and usage) with
a lower-verbosity, non-identifying message: change log.info(...) to
log.debug(...) (or log.trace(...) if you prefer) and remove the user.getName()
from the message so only non-sensitive data like usage.totalTokens() or a
redacted indicator is logged; keep the limit.addTokens(usage.totalTokens())
behavior unchanged.
- Around line 40-42: The setter setUsers currently allows assigning null to the
users field which later causes NPEs in getUser when it calls users.stream();
update setUsers to guard against null by assigning an empty immutable list
(e.g., Collections.emptyList() or List.of()) when the incoming users argument is
null, so that getUser can safely call users.stream(); keep the field name users
and the method getUser unchanged so callers remain consistent.

In `@core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java`:
- Around line 98-103: The code in JsonUtil.java reads a JsonNode with "JsonNode
jsonNode = om.readTree(msg.getBodyAsStreamDecoded())" but dereferences jsonNode
via jsonNode.getNodeType(), which can NPE when readTree returns null for empty
input; update the logic in the method containing that line to first check "if
(jsonNode == null) return Optional.empty();" (and/or log that input was empty)
before any access, and only call getNodeType() when jsonNode is non-null; also
keep the existing branch that returns Optional.of(on) when the node is an
ObjectNode and return Optional.empty() for any other node types.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f29cb49e-f9a8-4c53-b7fc-24c1061b037c

📥 Commits

Reviewing files that changed from the base of the PR and between 863e5cc and 2933dad.

📒 Files selected for processing (11)

core/src/main/java/com/predic8/membrane/core/interceptor/ai/AbstractAiApiRequest.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/AbstractAiApiResponse.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/ClaudeAiRequest.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/OpenAiAiRequest.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiLimit.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/JDBCAiApiUsageStore.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/NoAiApiLimit.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java
core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java

🚧 Files skipped from review as they are similar to previous changes (4)

core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/NoAiApiLimit.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/ClaudeAiRequest.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/AbstractAiApiRequest.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/OpenAIAPIInterceptor.java

- Synchronized token management methods in `AiApiLimit` to ensure thread safety. - Adjusted log levels for `SimpleAiApiStore` to reduce verbosity. - Added PostgreSQL dependency to the distribution. - Updated logging configuration to set debug level for AI interceptors.

- Introduced `SSEUtil` for parsing Server-Sent Events (SSE) from chunks. - Enhanced `AbstractAiApiRequest` to handle JSON requests conditionally. - Deprecated and replaced `max_output_tokens` usage in specific providers. - Improved stream support in `OpenAiAiRequest` with response usage tracking. - Refactored token limit logic in `OpenAIAPIInterceptor` for better flow.

- Renamed `AiApiRequest` to `LLMRequest` and `AiApiResponse` to `LLMResponse`. - Updated providers (`Google`, `OpenAI`, `Claude`) to align with `LLMProvider` interface. - Refactored `OpenAIAPIInterceptor` to `LLMGatewayInterceptor` and related utilities. - Removed `SSEUtil` and replaced with `SSEParser`. - Improved streaming and token usage handling in `AbstractLLMResponse`.

- Modularized LLM responses for `Claude` and `OpenAI` providers. - Replaced `LLMResponse` interface and `AbstractLLMResponse` with updated abstractions. - Added `ChatCompletionsSSEParser` for advanced SSE chunk handling. - Introduced specific SSE event classes: `ChatCompletionEvent`, `ChatCompletionDoneEvent`, `ResponsesApiEvent`. - Renamed and restructured classes for consistency in AI namespace. - Improved token usage tracking and event-based streaming.

…acking logic - Deleted `ChatCompletionsSSEParser` and related classes/methods. - Simplified `ChatCompletionEvent` by removing token usage parsing. - Updated tool extraction logic in `AbstractLLMRequest` to handle function-specific tools.

predic8 · 2026-05-18T12:12:18Z

@coderabbitai plz review

coderabbitai · 2026-05-18T12:12:32Z

@predic8 Sure! I'll start a full review of the PR right away.

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 17

♻️ Duplicate comments (1)

core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleLLMRequest.java (1)

108-118: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

setMaxOutputTokens() can still NPE when json is null.

On Line 113, json.get("generationConfig") dereferences json unguarded. This is the same underlying issue previously flagged and appears to persist after the rename.

Proposed fix

     `@Override`
     public void setMaxOutputTokens(int maxOutputTokens) {
+        if (json == null || json.isNull()) {
+            return;
+        }
         getGenerationConfig().put("max_output_tokens", maxOutputTokens);
     }
 
     private ObjectNode getGenerationConfig() {
+        if (!(json instanceof ObjectNode objectJson)) {
+            json = com.fasterxml.jackson.databind.node.JsonNodeFactory.instance.objectNode();
+            objectJson = (ObjectNode) json;
+        }
-        var gc = json.get("generationConfig");
+        var gc = objectJson.get("generationConfig");
         if (gc instanceof ObjectNode objectNode) {
             return objectNode;
         }
-        return json.putObject("generationConfig");
+        return objectJson.putObject("generationConfig");
     }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleLLMRequest.java`
around lines 108 - 118, setMaxOutputTokens can still NPE because
getGenerationConfig calls json.get(...) without ensuring json is initialized;
update getGenerationConfig (used by setMaxOutputTokens) to first ensure the
field json is non-null (initialize it to an ObjectNode or call a helper that
creates it) before accessing json.get("generationConfig"), and then return or
create the "generationConfig" ObjectNode; reference the methods
getGenerationConfig and setMaxOutputTokens and the json field when making the
change.

🧹 Nitpick comments (1)

core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMApiUtil.java (1)

64-72: ⚡ Quick win

Catch JsonProcessingException specifically instead of broad Exception.

ObjectMapper.writeValueAsString() throws JsonProcessingException on serialization failures. Catching the specific exception type makes the error handling intent clearer and follows best practices for exception handling.

♻️ Proposed fix

+import com.fasterxml.jackson.core.JsonProcessingException;
+
 public static String createJson(Object o) {
     try {
         return om.writeValueAsString(o);
-    } catch (Exception e) {
+    } catch (JsonProcessingException e) {
         return """
                 { "error": "Could not create JSON" }
                 """;
     }
 }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMApiUtil.java`
around lines 64 - 72, The createJson method currently catches a broad Exception;
change the catch to handle com.fasterxml.jackson.core.JsonProcessingException
specifically for om.writeValueAsString(o) to make error handling precise
(replace the generic catch(Exception e) with catch(JsonProcessingException e))
and keep the same fallback JSON return and any logging if present.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/AbstractLLMEvent.java`:
- Around line 35-40: In AbstractLLMEvent, the Optional holding the parsed JSON
(the variable named opt used after sse.data()) is dereferenced unguarded (var
json = opt.get()), which can throw NoSuchElementException; update the code in
the method that handles SSE parsing to check opt.isEmpty() and bail out (e.g.,
return or skip processing) after logging the "Unknown event format" message so
you never call opt.get() when empty, ensuring safe handling of non-object or
failed parses.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/ChatCompletionEvent.java`:
- Around line 41-42: In ChatCompletionEvent (around the log.debug call), avoid
logging raw tool-call arguments from fn.path("arguments").asText(); instead
build a safe, redacted string (e.g., replace sensitive patterns or mask the
content and/or truncate to a max length like 100 chars with "..." appended) and
log that; update the log.debug invocation to use the redacted/truncated value so
the source (fn.path("arguments").asText()) is never written raw to logs.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java`:
- Around line 65-77: The input token limit is bypassed when no aiStore is
configured because inputTokens is only set inside the store branch; compute and
enforce input token counting regardless of store presence: in
LLMGatewayInterceptor, call aiReq.estimateInputTokens() when
exc.getRequest().isPOSTRequest() before or outside the store != null branch (or
duplicate the estimate in the else) so inputTokens holds the actual value for
the later maxInputTokens check; apply the same change for the second occurrence
referenced around lines 92-97 so the maxInputTokens check always uses the real
estimated inputTokens even when store is null.
- Around line 116-119: The callback passed to provider.getLLMResponse currently
calls store.store(...) whenever store != null, which can pass a null AiApiUser
if MEMBRANE_AI_USER isn't set on the exchange; change the callback to first read
the user from the exchange (exc.getProperty(MEMBRANE_AI_USER, AiApiUser.class)),
check that the user is not null (and only then call store.store(user,
res.getUsage())), leaving the existing store != null guard intact so persistence
only happens when both store and a valid AiApiUser are present.
- Around line 144-146: The getDisplayName method in LLMGatewayInterceptor
currently returns the stale provider-specific string "OpenAI API"; update
LLMGatewayInterceptor.getDisplayName() to return a provider-agnostic name (e.g.,
"LLM Gateway" or "AI Gateway") so the interceptor no longer advertises a single
provider, and adjust any related UI/tooltip text retrieval (methods referencing
getDisplayName) to remain consistent with the new neutral label.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/AbstractLLMRequest.java`:
- Around line 81-83: getModel() can throw an NPE because the backing json field
may be null for non-JSON requests; update AbstractLLMRequest.getModel to first
check whether json is null (or json.path("model") is missing) and return a safe
value (e.g., null or an empty string) instead of dereferencing json, or
otherwise handle non-JSON requests upstream. Locate the json field and the
method getModel() in class AbstractLLMRequest and add a null-check/guard that
avoids calling json.path(...) when json==null and documents the chosen fallback
behavior.
- Around line 65-72: The current extraction in AbstractLLMRequest uses
ah.indexOf(BEARER_PREFIX) which accepts "Bearer" anywhere; change it to parse
the Authorization header strictly as a Bearer scheme: trim ah, verify the header
starts with the bearer scheme (case-insensitive) followed by whitespace (e.g.
check ah.trim().toLowerCase().startsWith((BEARER_PREFIX + " ").toLowerCase()) or
use regionMatches(true,...)), and only then take the substring after
BEARER_PREFIX, trim it and return null on empty; update the token extraction
logic around BEARER_PREFIX and the variable ah accordingly.
- Around line 35-40: The mapping in AbstractLLMRequest that extracts tool names
from tools.valueStream() is unsafe because chained get(...).get(...).asText()
can NPE or throw for malformed entries; update the lambda in the stream to
defensively check each node before accessing "function" and "name" (e.g., verify
node.isObject() / has("function") and has("name") and that name is textual or
non-null), return the function.name when present and valid, otherwise the
top-level name when valid, and fallback to a safe default or skip the entry
instead of calling asText() blindly so malformed tool objects no longer throw in
get(...).get(...).asText().

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/AbstractLLMResponse.java`:
- Around line 61-71: The lambda passed to terminal.ifPresent in
AbstractLLMResponse currently returns early on a "[DONE]" event which skips
calling postProcessor and can leave json null causing isError() to NPE; change
the handling so that when event.data() equals "[DONE]" you do not return early
but instead assign json to a safe empty/error JsonNode (e.g.,
JsonNodeFactory.instance.objectNode().put("done", true) or a descriptive error
node) and then call postProcessor.accept(AbstractLLMResponse.this); apply the
same fix to the other terminal handling block referenced (the similar code at
lines ~75-77) so postProcessor is always invoked and json is always non-null
before isError() is called.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ClaudeLLMRequest.java`:
- Around line 21-67: The code assumes the field json is non-null in
setMaxOutputTokens and estimateInputTokens which leads to NPEs when the request
body is absent or parsing failed; update both setMaxOutputTokens and
estimateInputTokens to guard against a null json (either early-return with
sensible defaults or lazily initialize json to an empty ObjectNode) before
accessing json.path(...), and ensure any subsequent logic (e.g.,
thinking/budget_tokens adjustments and token estimation) handles the empty json
case safely; reference the json field and the methods setMaxOutputTokens and
estimateInputTokens when making the change.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ClaudeLLMResponse.java`:
- Around line 48-50: The current log.debug call in ClaudeLLMResponse that prints
inputJson (log.debug("Tool {} with {}", tool, inputJson.toString())) must not
log raw payloads; change it to log a redacted or summarized form instead (e.g.,
log tool name plus either the set of top-level keys, input size, or a sanitized
version) by replacing the direct inputJson.toString() usage with a safe helper
(e.g., redactToolInput(inputJson) or summarizeToolInput(inputJson)) that
strips/omits sensitive values; update the log.debug invocation to use the helper
and keep the tool identifier (tool) intact so callers can locate the change.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/MessageDelta.java`:
- Around line 25-31: The Usage object in MessageDelta is currently constructed
using only md.inputTokens and md.outputTokens, omitting billable cache tokens;
update the Usage creation in the MessageDelta parsing logic so that it includes
md.cacheCreationInputTokens and md.cacheReadInputTokens (i.e., total =
inputTokens + outputTokens + cacheCreationInputTokens + cacheReadInputTokens) so
billing/rate-limit accounting is correct for the MessageDelta instance (look for
the MessageDelta class and the md.usage = new Usage(...) line).

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAiLLMRequest.java`:
- Around line 25-33: The method estimateInputTokens uses the field json without
guarding for null; add a null-check at the start of estimateInputTokens (or a
guard in any helper it calls) so that if json is null the method returns 0 (or
an appropriate default) instead of calling json.path(...) and risking an NPE;
update estimateInputTokens to check json before calling
countText(json.path("input")), estimateChatCompletitions(), and
countText(json.path("system")) so those invocations only run when json is
non-null.
- Around line 17-20: The current code in OpenAiLLMRequest uses
json.putObject("stream_options") which unconditionally replaces any existing
stream_options; change this to merge instead: check for an existing JsonObject
via json.getJsonObject("stream_options") (or equivalent), if it exists and is an
object call put("include_usage", true) on it, otherwise create a new JsonObject
with include_usage=true and assign it (only when
json.path("stream").asBoolean(false) && isChatCompletionsRequest(exchange));
keep the check/behavior around isChatCompletionsRequest(exchange) intact.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAiLLMResponse.java`:
- Around line 34-42: The getUsage() logic in OpenAiLLMResponse is ignoring
returned values from getInputTokens/getOutputTokens and is reading total_tokens
from the wrong JSON node; assign the results of getInputTokens(usage) and
getOutputTokens(usage) to inputTokens and outputTokens respectively, and read
totalTokens from usage.path("total_tokens") (not json.path("total_tokens")) for
both the new and older chat completions branches so usage.total_tokens is used
consistently.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/ResponsesApiEvent.java`:
- Around line 29-31: The current log statement in ResponsesApiEvent.java logs
raw function-call arguments via on.path("arguments").asText() at INFO; change
this to a non-production level (use log.debug) and avoid printing full
payloads—log only metadata such as the function name (on.path("name").asText()),
argument size or keys, or a redacted/truncated version of arguments (e.g., first
N characters or replace values with "<REDACTED>") instead of the raw text;
update the log invocation that references on.path("name") and
on.path("arguments") accordingly so sensitive data is not emitted at INFO.

In `@core/src/main/java/com/predic8/membrane/core/util/http/SSEParser.java`:
- Line 38: The current log call in SSEParser that does log.debug("Parsing SSE
chunk: {}", chunk) exposes raw streamed payloads; change it to avoid printing
chunk contents and instead log safe metadata: compute the chunk size (e.g.,
chunk.length() or bytes) and a simple event count (e.g., count occurrences of
"data:" or empty-line delimiters) and use those values in the debug message.
Update the logging in the SSEParser code path (referencing the chunk variable
and the log instance) to emit only size and event-count metadata and remove any
direct inclusion of the chunk string.

---

Duplicate comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleLLMRequest.java`:
- Around line 108-118: setMaxOutputTokens can still NPE because
getGenerationConfig calls json.get(...) without ensuring json is initialized;
update getGenerationConfig (used by setMaxOutputTokens) to first ensure the
field json is non-null (initialize it to an ObjectNode or call a helper that
creates it) before accessing json.get("generationConfig"), and then return or
create the "generationConfig" ObjectNode; reference the methods
getGenerationConfig and setMaxOutputTokens and the json field when making the
change.

---

Nitpick comments:
In `@core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMApiUtil.java`:
- Around line 64-72: The createJson method currently catches a broad Exception;
change the catch to handle com.fasterxml.jackson.core.JsonProcessingException
specifically for om.writeValueAsString(o) to make error handling precise
(replace the generic catch(Exception e) with catch(JsonProcessingException e))
and keep the same fallback JSON return and any logging if present.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 11071819-f5de-4c68-9b96-669afb387962

📥 Commits

Reviewing files that changed from the base of the PR and between 863e5cc and 062ac5c.

📒 Files selected for processing (33)

core/src/main/java/com/predic8/membrane/core/interceptor/ai/AbstractLLMEvent.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/ChatCompletionDoneEvent.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/ChatCompletionEvent.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMApiUtil.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/ResponsesApiEvent.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/AbstractLLMMessage.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/AbstractLLMRequest.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/AbstractLLMResponse.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/LLMProvider.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/LLMRequest.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/LLMResponse.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ClaudeLLMRequest.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ClaudeLLMResponse.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ClaudeProvider.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ContentBlockDelta.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ContentBlockStart.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/MessageDelta.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ToolUse.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleLLMRequest.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleLLMResponse.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleProvider.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAIProvider.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAiLLMRequest.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAiLLMResponse.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiLimit.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/JDBCAiApiUsageStore.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/NoAiApiLimit.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java
core/src/main/java/com/predic8/membrane/core/util/http/SSEParser.java
core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java
core/src/test/java/com/predic8/membrane/core/util/http/SSEParserTest.java

✅ Files skipped from review due to trivial changes (3)

core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ContentBlockStart.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ToolUse.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/ChatCompletionDoneEvent.java

🚧 Files skipped from review as they are similar to previous changes (5)

core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/NoAiApiLimit.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/JDBCAiApiUsageStore.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiLimit.java

…acking - Removed redundant `isResponsesAPI` variable in `OpenAIProvider`. - Optimized tool extraction in `OpenAiLLMResponsesRequest` and `OpenAiLLMChatCompletionsRequest`. - Updated `AiApiLimit` to support unlimited tokens with `MAX_VALUE`. - Replaced `token` with `apiKey` in `AiApiUser` along with added `tokens` field. - Improved JSON parsing logic in `JsonUtil` with better exception handling and logging. - Adjusted output token parameter naming in multiple request classes for consistency.

…r handling - Deleted `AiApiLimit` and `NoAiApiLimit` classes, consolidating token management into `SimpleAiApiStore`. - Introduced `LLMErrorCreator` and its implementations (`OpenAiErrorCreator`, etc.) for reusable error generation. - Refactored `LLMGatewayInterceptor` to utilize provider-specific error creators, simplifying token and model validation. - Enhanced `SimpleAiApiStore` with token reset functionality and user-specific token tracking. - Updated tutorials and examples to align with this refactored approach.

- Added `resetTokensUsedInPeriod` for user-specific token reset in `SimpleAiApiStore`. - Improved inline documentation for methods across `AiApiUser`, `AiApiStore`, and `LLMGatewayInterceptor`. - Updated parameter descriptions for clarity and consistency.

predic8 · 2026-05-20T13:41:52Z

@coderabbitai review

coderabbitai · 2026-05-20T13:41:59Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 6

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java (2)

154-156: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

store.store(user, ...) may receive null user for non-POST requests.

For non-POST requests, handleRequest returns at line 77 without setting MEMBRANE_AI_USER on the exchange. When handleResponse executes, user will be null (line 148). If store != null, store.store(null, ...) is called.

SimpleAiApiStore.store() calls user.getName() (line 40) and user.addTokensUsedInPeriod() (line 41), which will NPE on null user.

Consider guarding:

         if (store != null) {
-            store.store(user, res.getUsage());
+            if (user != null) {
+                store.store(user, res.getUsage());
+            }
         }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java`
around lines 154 - 156, In LLMGatewayInterceptor, handleResponse calls
store.store(user, ...) but user can be null for non-POST requests because
MEMBRANE_AI_USER is only set in handleRequest for POST; update handleResponse to
check for null before calling store.store (e.g., only call store.store(...) when
user != null) or provide a safe default user; reference the symbols
LLMGatewayInterceptor.handleResponse, MEMBRANE_AI_USER, store.store(...) and
SimpleAiApiStore.store (which calls user.getName() and
user.addTokensUsedInPeriod()) to locate the code to guard against a null user
and avoid the NPE.

149-153: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Inverted log-level logic appears to suppress detailed user logs.

When log.isInfoEnabled() && user != null, the code logs at DEBUG level (line 150). When user is null or info logging is disabled, it logs at INFO level (line 152). This seems inverted — typically you'd want to log user-specific details at DEBUG and generic messages at INFO.

Was the intent to log user details only when DEBUG is enabled?

-            if (log.isInfoEnabled() && user != null) {
-                log.debug("Token usage of user {}: {}", user, res.getUsage());
+            if (log.isDebugEnabled() && user != null) {
+                log.debug("Token usage for user {}: {}", user, res.getUsage());
             } else {
                 log.info("Token usage: {}", res.getUsage());
             }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java`
around lines 149 - 153, The log-level branch in LLMGatewayInterceptor (around
the token-usage logging) is inverted: it checks log.isInfoEnabled() && user !=
null but then calls log.debug(...) for user-specific details and log.info(...)
for the generic case; change the condition to check log.isDebugEnabled() && user
!= null so detailed per-user information is only logged when DEBUG is enabled,
and keep the generic usage message at INFO when DEBUG is not enabled or user is
null (update the condition near the token usage logging where res.getUsage() is
printed).

♻️ Duplicate comments (1)

core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java (1)

105-111: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Handle empty-input readTree() results explicitly instead of relying on caught NPEs.

At Line 109 and Line 133, node.getNodeType() is called without a null check. For empty input, Jackson readTree(...) can return null, so this currently falls into catch (Exception) via NPE. It still returns Optional.empty(), but through exception-driven flow and weaker diagnostics.

💡 Suggested minimal fix

@@
-        try {
-            var node = om.readTree(jsonString);
+        try {
+            var node = om.readTree(jsonString);
+            if (node == null) {
+                log.debug("Expected JSON Object but got: empty body");
+                return empty();
+            }
             if (node instanceof ObjectNode on) {
                 return Optional.of(on);
             }
-            log.debug("Expected JSON Object but got: {}",node.getNodeType());
+            log.debug("Expected JSON Object but got: {}", node.getNodeType());
         } catch (Exception e) {
-            log.debug("Error reading JSON: {}", e.getMessage());
+            log.debug("Error reading JSON", e);
         }
@@
-        try {
-            var node = om.readTree(obj);
+        try {
+            var node = om.readTree(obj);
+            if (node == null) {
+                log.debug("Expected JSON Object but got: empty body");
+                return empty();
+            }
             if (node instanceof ObjectNode on) {
                 return Optional.of(on);
             }
-            log.debug("Expected JSON Object but got: {}",node.getNodeType());
+            log.debug("Expected JSON Object but got: {}", node.getNodeType());
         } catch (Exception e) {
-            log.debug("Error reading JSON: {}", e.getMessage());
+            log.debug("Error reading JSON", e);
         }

Also applies to: 129-135

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java` around
lines 105 - 111, In JsonUtil.java, avoid NPEs from om.readTree(...) returning
null by adding null checks before calling node.getNodeType(): after var node =
om.readTree(jsonString) check if (node == null) and log a clear debug message
like "Empty JSON input" (or include jsonString info) and return
Optional.empty(); do the same for the second occurrence around lines 129-135
where node.getNodeType() is invoked; leave the Optional.empty() return behavior
unchanged but replace relying on caught NPEs with explicit null handling and
improved logs.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ClaudeProvider.java`:
- Around line 29-32: getErrorCreator() currently returns null which can break
error-path handling; change ClaudeProvider.getErrorCreator() to return a
non-null LLMErrorCreator implementation (e.g., a concrete ClaudeLLMErrorCreator
class or an anonymous/inner LLMErrorCreator instance) that maps
provider-specific errors to the gateway LLM error model and implements required
methods used by callers (so callers no longer receive null). Ensure the returned
creator is instantiated in ClaudeProvider and handles common Claude error shapes
to be null-safe.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleProvider.java`:
- Around line 29-32: The getErrorCreator() method in GoogleProvider currently
returns null; replace it with a concrete LLMErrorCreator instance (e.g., return
new DefaultLLMErrorCreator() or implement and return a
GoogleProviderErrorCreator) so provider error generation never receives null;
implement the concrete class if it doesn't exist, ensure it implements
LLMErrorCreator, and return that instance from GoogleProvider.getErrorCreator().

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiUser.java`:
- Around line 35-37: In AiApiUser.checkLimit(long tokensNeededForRequest) the
subtraction treats tokens==0 as a finite allowance and can yield negative
remaining; change the logic to treat tokens==0 as "no limit" — if this.tokens ==
0 return a large positive allowance (or explicitly return Long.MAX_VALUE or a
sentinel positive value) so tokensUsedInPeriod.get() and tokensNeededForRequest
do not cause denial; update checkLimit to first check this.tokens == 0 and only
perform the subtraction when tokens > 0, keeping the method name and behavior
otherwise.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java`:
- Around line 54-61: The code iterates the users list inside synchronized(lock)
but setUsers and getUser mutate/read users without synchronization; synchronize
setUsers (and any user-accessing methods like getUser) on the same lock or
switch users to a thread-safe collection; specifically, wrap the body of
setUsers(...) and getUser(...) in synchronized(lock) (or alternatively change
the users field to a CopyOnWriteArrayList and update usages) so that the
iteration in the block that calls
users.forEach(AiApiUser::resetTokensUsedInPeriod) cannot race with concurrent
replacements of the users reference.

In `@distribution/tutorials/ai/llm-gateway/20-Sharing-API-Keys.yaml`:
- Around line 17-19: The tutorial references a missing helper file
"wrong-model.json" in the "3. Wrong Model" step which will cause a
file-not-found error; either add a new file named wrong-model.json to the
tutorial directory containing an invalid model payload (e.g., model:
"invalid-model-name", input: "Hello, AI!"), or update the step text (# 3. Wrong
Model) to instruct users to create the request inline (show the JSON to paste)
or remove the step entirely—modify whichever you choose and ensure the curl
sample still points to "wrong-model.json" or to an inline -d payload
accordingly.
- Around line 38-41: Update the model identifiers listed under the models array:
replace "gpt-5-nano" with "gpt-5.4-nano" and "gpt-5-mini" with "gpt-5.4-mini" so
the models array contains "gpt-5.4", "gpt-5.4-nano", and "gpt-5.4-mini" (modify
the models block shown in the diff).

---

Outside diff comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java`:
- Around line 154-156: In LLMGatewayInterceptor, handleResponse calls
store.store(user, ...) but user can be null for non-POST requests because
MEMBRANE_AI_USER is only set in handleRequest for POST; update handleResponse to
check for null before calling store.store (e.g., only call store.store(...) when
user != null) or provide a safe default user; reference the symbols
LLMGatewayInterceptor.handleResponse, MEMBRANE_AI_USER, store.store(...) and
SimpleAiApiStore.store (which calls user.getName() and
user.addTokensUsedInPeriod()) to locate the code to guard against a null user
and avoid the NPE.
- Around line 149-153: The log-level branch in LLMGatewayInterceptor (around the
token-usage logging) is inverted: it checks log.isInfoEnabled() && user != null
but then calls log.debug(...) for user-specific details and log.info(...) for
the generic case; change the condition to check log.isDebugEnabled() && user !=
null so detailed per-user information is only logged when DEBUG is enabled, and
keep the generic usage message at INFO when DEBUG is not enabled or user is null
(update the condition near the token usage logging where res.getUsage() is
printed).

---

Duplicate comments:
In `@core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java`:
- Around line 105-111: In JsonUtil.java, avoid NPEs from om.readTree(...)
returning null by adding null checks before calling node.getNodeType(): after
var node = om.readTree(jsonString) check if (node == null) and log a clear debug
message like "Empty JSON input" (or include jsonString info) and return
Optional.empty(); do the same for the second occurrence around lines 129-135
where node.getNodeType() is invoked; leave the Optional.empty() return behavior
unchanged but replace relying on caught NPEs with explicit null handling and
improved logs.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 0f476e7b-3ad5-4b76-b421-3c8fabe19123

📥 Commits

Reviewing files that changed from the base of the PR and between 97e597c and 0af0f35.

📒 Files selected for processing (18)

core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/AbstractLLMErrorCreator.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/LLMErrorCreator.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/LLMProvider.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/LLMRequest.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ClaudeProvider.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleLLMRequest.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleProvider.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAIProvider.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAiErrorCreator.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAiLLMChatCompletionsRequest.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAiLLMResponsesRequest.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiStore.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiUser.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/JDBCAiApiUsageStore.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java
core/src/main/java/com/predic8/membrane/core/util/json/JsonUtil.java
distribution/tutorials/ai/llm-gateway/20-Sharing-API-Keys.yaml

- Added synchronized blocks to `SimpleAiApiStore` for thread-safe access to user data. - Introduced `invalidRequestError` to `LLMErrorCreator` and implemented it in `OpenAiErrorCreator`. - Allowed unlimited tokens for users with `MAX_VALUE` in `AiApiUser`. - Simplified logic in `LLMGatewayInterceptor` for token and model validation. - Updated tutorial JSON with test input for validation.

predic8 · 2026-05-20T14:52:00Z

@coderabbitai review

coderabbitai · 2026-05-20T14:52:08Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiUser.java (1)

75-78: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Documentation mismatch: Javadoc says default is 0 but code defaults to MAX_VALUE.

The @default 0 (no limit) comment no longer matches the actual default of Long.MAX_VALUE set at line 16.

📝 Proposed fix

     /**
      * `@description` Number of tokens that the user has available within the current period.
-     * `@default` 0 (no limit)
+     * `@default` Long.MAX_VALUE (no limit)
      * `@param` tokens available tokens
      */

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiUser.java`
around lines 75 - 78, The Javadoc for AiApiUser.tokens incorrectly states the
default is 0 while the code initializes the tokens field to Long.MAX_VALUE;
update the Javadoc to reflect the actual default (e.g., "`@default` Long.MAX_VALUE
(no limit)") so docs match the implementation of the tokens field in class
AiApiUser, or alternatively change the tokens field default to 0 if the intended
behavior was different—adjust either the comment or the field initialization for
consistency.

🧹 Nitpick comments (1)

core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java (1)

88-90: 💤 Low value

getUsers() reads @GuardedBy("lock") field without synchronization.

For consistency with the other accessors/mutators, this getter should also synchronize on lock or return a defensive copy.

♻️ Proposed fix

     public List<AiApiUser> getUsers() {
-        return users;
+        synchronized (lock) {
+            return users;
+        }
     }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java`
around lines 88 - 90, getUsers() currently returns the `@GuardedBy`("lock") field
users without synchronization; update getUsers() in SimpleAiApiStore to either
synchronize on lock (e.g., synchronized(lock) { return users; }) or, preferably,
return a defensive copy while holding the lock (e.g., synchronized(lock) {
return new ArrayList<>(users); }) so access is consistent with other
synchronized accessors/mutators that use lock.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiUser.java`:
- Around line 75-78: The Javadoc for AiApiUser.tokens incorrectly states the
default is 0 while the code initializes the tokens field to Long.MAX_VALUE;
update the Javadoc to reflect the actual default (e.g., "`@default` Long.MAX_VALUE
(no limit)") so docs match the implementation of the tokens field in class
AiApiUser, or alternatively change the tokens field default to 0 if the intended
behavior was different—adjust either the comment or the field initialization for
consistency.

---

Nitpick comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java`:
- Around line 88-90: getUsers() currently returns the `@GuardedBy`("lock") field
users without synchronization; update getUsers() in SimpleAiApiStore to either
synchronize on lock (e.g., synchronized(lock) { return users; }) or, preferably,
return a defensive copy while holding the lock (e.g., synchronized(lock) {
return new ArrayList<>(users); }) so access is consistent with other
synchronized accessors/mutators that use lock.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: f9675200-8f29-4e10-b89c-3241ee03df4b

📥 Commits

Reviewing files that changed from the base of the PR and between 0af0f35 and 329bb80.

📒 Files selected for processing (6)

core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/LLMErrorCreator.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAiErrorCreator.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiUser.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java
distribution/tutorials/ai/llm-gateway/wrong-model.json

- Added Claude-specific error handling with `ClaudeErrorCreator` and `ClaudeErrorResponse`. - Introduced `10-Basic-LLM-Gateway.yaml` tutorial for Claude integration. - Enhanced token usage tracking in `ClaudeLLMResponse`. - Updated examples and tutorials to support both OpenAI and Claude.

…ring and token limit examples - Added `10-Basic-LLM-Gateway.yaml` and `20-Sharing-API-Keys.yaml` tutorials for Google Gemini. - Enhanced Claude tutorials with improved key handling and token limit examples. - Introduced `GoogleErrorCreator` for detailed error handling in Google LLM Gateway. - Updated `LLMGatewayInterceptor` and token tracking logic to reflect effective max token handling. - Modified existing OpenAI and Claude examples for consistency and clarity.

predic8 · 2026-05-21T09:48:16Z

@coderabitai review

…tutorials - Introduced `AbstractAiTutorialTest` base class and provider-specific extensions for easier test creation. - Added integration tests for basic gateway setups and API key sharing for Claude, OpenAI, and Google Gemini. - Simulated upstream mock APIs to enable testing token limits, key handling, and input/output transformations.

predic8 · 2026-05-21T13:20:32Z

@coderabbitai review

coderabbitai · 2026-05-21T13:20:39Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

Actionable comments posted: 5

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

distribution/tutorials/ai/llm-gateway/google/membrane.cmd (1)
1-25: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Convert to CRLF line endings.

Windows cmd.exe parses .cmd/.bat files in 512-byte chunks and LF-only line endings can cause label resolution (:search_up, :found, :notfound) to misbehave depending on file size and chunk boundaries. Ensure this file is committed with CRLF (e.g., add a .gitattributes entry like *.cmd text eol=crlf or run unix2dos).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@distribution/tutorials/ai/llm-gateway/google/membrane.cmd` around lines 1 -
25, This file uses LF-only endings which can break label parsing in cmd scripts;
convert distribution/tutorials/ai/llm-gateway/google/membrane.cmd to CRLF line
endings and ensure the repo enforces CRLF for .cmd files (e.g., add or update
.gitattributes with a rule like "*.cmd text eol=crlf" and re-commit the fixed
file) so labels such as :search_up, :found and :notfound resolve correctly when
executed; alternatively run a tool like unix2dos on the file and commit the
CRLF-converted version.

♻️ Duplicate comments (3)

core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java (1)
149-150: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Guard usage persistence when no authenticated user is present.

On Line 150, store.store(user, ...) is called even when user is null (e.g., request paths that never set MEMBRANE_AI_USER), which can break persistence or store invalid records.
Proposed fix
-            if (store != null) {
+            if (store != null && user != null) {
                 store.store(user, res.getUsage());
             }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java`
around lines 149 - 150, The persistence call may be invoked with a null
authenticated user: guard the store.store(user, res.getUsage()) call so it only
runs when both store != null and the user object (from MEMBRANE_AI_USER) is
non-null; update the condition around the call in LLMGatewayInterceptor (the
block containing store.store, user and res.getUsage()) to check user != null
before calling store.store to avoid persisting invalid records.
distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/google/SharingApiKeysGoogleTutorialTest.java (1)
180-218: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Same test-isolation concern as the Claude variant.

This alicesTokenBudgetIsExhaustedWhileBobIsUnaffected shares Alice's gateway-side token counter with the other Alice-tests in the same class. If JUnit 5 happens to execute another Alice-test first, Alice's running total will not start at 0 and the budget arithmetic in the Javadoc no longer holds. Please ensure the base class resets per-user usage between @Test methods (see the same comment on the Claude variant).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/google/SharingApiKeysGoogleTutorialTest.java`
around lines 180 - 218, The test
alicesTokenBudgetIsExhaustedWhileBobIsUnaffected relies on a fresh per-user
token counter but current test runs share Alice's gateway-side state across
methods; add a test teardown that resets per-user usage between `@Test` methods
(e.g., an `@AfterEach` in the common base test class used by
SharingApiKeysGoogleTutorialTest) so the gateway's user token counters are
reinitialized before each test, or call the existing reset method on the shared
token-store (e.g., TokenUsageStore.reset(), UsageTracker.clear(), or similar)
from that teardown to guarantee Alice's counter starts at 0 for every test.
distribution/tutorials/ai/llm-gateway/openai/membrane.cmd (1)
1-25: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Convert to CRLF line endings (duplicate of google/membrane.cmd).

Same issue as the Google variant: this .cmd file is committed with LF-only endings, which can cause cmd.exe's 512-byte chunked parser to mis-handle labels and goto/call targets. Switch to CRLF (e.g., via a .gitattributes rule *.cmd text eol=crlf).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@distribution/tutorials/ai/llm-gateway/openai/membrane.cmd` around lines 1 -
25, This .cmd script (labels :search_up, :found, :notfound and uses call/goto)
is committed with LF-only endings which can break cmd.exe parsing; convert the
file to CRLF line endings and add a repository rule to enforce this (e.g., add a
.gitattributes entry like "*.cmd text eol=crlf") or re-save the file with CRLF
so the labels and call "%MEMBRANE_HOME%\scripts\run-membrane.cmd" %* work
correctly.

🧹 Nitpick comments (3)

distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/AbstractAiTutorialTest.java (1)

46-47: ⚡ Quick win

Avoid fixed mock port to reduce CI/test flakiness.

Line 46 hard-codes 3100; concurrent test runs or occupied local ports can make this fail nondeterministically.

💡 Proposed fix

+import java.net.ServerSocket;
@@
-    protected static final int MOCK_LLM_PORT = 3100;
+    protected int mockLlmPort;
@@
-    protected String mockApiUrl() {
-        return "http://localhost:" + MOCK_LLM_PORT;
+    protected String mockApiUrl() {
+        return "http://localhost:" + mockLlmPort;
     }
@@
     private void startMockLlmApi() throws Exception {
+        mockLlmPort = findFreePort();
         var si = new StaticInterceptor();
@@
-        var sp = new ServiceProxy(new ServiceProxyKey(MOCK_LLM_PORT), null, 0);
+        var sp = new ServiceProxy(new ServiceProxyKey(mockLlmPort), null, 0);
@@
         mockRouter.start();
     }
+
+    private static int findFreePort() throws Exception {
+        try (ServerSocket socket = new ServerSocket(0)) {
+            return socket.getLocalPort();
+        }
+    }

Also applies to: 93-95, 111-112

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/AbstractAiTutorialTest.java`
around lines 46 - 47, The test currently hardcodes MOCK_LLM_PORT = 3100 in
AbstractAiTutorialTest which causes CI flakiness; change this to use a
dynamically allocated port: remove the fixed final constant, add an instance
field (e.g., mockLlmPort) and initialize it at test setup by binding a
ServerSocket to port 0 (or using an OS-assigned ephemeral port) to retrieve the
actual port, then update all references that used MOCK_LLM_PORT (including
usages around the earlier noted blocks) to use the new mockLlmPort field (or a
getter like getMockLlmPort()) so each test run uses a free, non-conflicting
port.

distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/openai/SharingApiKeysOpenAiTutorialTest.java (1)

112-113: 💤 Low value

Tutorial YAML uses a literal placeholder as the upstream API key.

This assertion confirms the tutorial's 20-Sharing-API-Keys.yaml ships with <<Replace with your API_KEY>> as the actual upstream apiKey value. If a user copies the tutorial to a real deployment without substituting the placeholder, the gateway will forward Bearer <<Replace with your API_KEY>> to OpenAI and get an opaque 401. Consider asserting that the tutorial YAML includes a prominent "REPLACE BEFORE USE" comment, or fail-fast in the gateway when the configured apiKey still contains <<…>> markers. Not a blocker for the tests themselves.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/openai/SharingApiKeysOpenAiTutorialTest.java`
around lines 112 - 113, The tutorial YAML ships with a literal placeholder API
key (`<<Replace with your API_KEY>>`) which can be accidentally forwarded; add a
fail-fast check where the gateway loads the upstream apiKey (the code path that
the test SharingApiKeysOpenAiTutorialTest exercises / whatever class constructs
the OpenAI upstream config) to detect placeholder markers like "<<" or ">>" or
the substring "Replace with your API_KEY" and throw a clear
IllegalStateException (or log an error and exit) explaining the developer must
replace the placeholder in 20-Sharing-API-Keys.yaml before use; update the
configuration loader/constructor that sets the apiKey so it validates and
rejects placeholder values at startup.

distribution/tutorials/ai/llm-gateway/google/20-Sharing-API-Keys.yaml (1)

13-19: 💤 Low value

Consider illustrating bob's key in the example curls.

Scenario #2 mentions that "Bob should still be able to send requests", but every example curl uses alice's key abc123. Adding one example with bob's key (qwertz) would make the multi-user demo concretely runnable and avoid confusion.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@distribution/tutorials/ai/llm-gateway/google/20-Sharing-API-Keys.yaml` around
lines 13 - 19, The examples all use Alice's API key (abc123) but the scenario
text references Bob remaining able to send requests; add or replace one curl
example to show Bob's key (qwertz) being used — e.g., a curl to
/v1beta/models/gpt-5:generateContent with header "x-goog-api-key: qwertz" near
the Token Limit Exceeded / Wrong Model examples so readers can run the
multi-user demo (references: alice, bob, abc123, qwertz, endpoint
/v1beta/models/gpt-5:generateContent).

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java`:
- Around line 93-99: The current effectiveMaxTokens calculation using
Math.min(aiReq.getRequestedMaxOutputTokens(), maxOutputTokens) can produce 0 or
-1 when sentinel/unlimited values are present and thus corrupt limit accounting;
change the logic in LLMGatewayInterceptor to treat -1 as "unlimited" and compute
effectiveMaxTokens like: if aiReq.getRequestedMaxOutputTokens() == -1 use
maxOutputTokens; else if maxOutputTokens == -1 use
aiReq.getRequestedMaxOutputTokens(); otherwise use Math.min(requested,
maxOutputTokens); then ensure effectiveMaxTokens is non-negative (e.g., clamp to
0) before calling store.checkLimit(user, inputTokens, effectiveMaxTokens) and
when passing the required token count into
errorCreator.tokenLimitExceeded(inputTokens + effectiveMaxTokens, remaining,
store.getRemainingResetTime()) so the required-token value is never negative or
sentinel.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleErrorCreator.java`:
- Around line 24-33: The rate-limit error message in GoogleErrorCreator builds a
string using tokenRequired, tokenRemaining, and tokenResetInSeconds and
currently formats tokenRemaining directly; change the formatting to clamp
tokenRemaining to a non-negative value (e.g., use Math.max(0, tokenRemaining) or
an equivalent local variable like visibleTokenRemaining) before calling
envelope/json so the message never shows negative "only -5 remain"; update the
formatted argument used in the envelope call where tokenRemaining is passed.

In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java`:
- Around line 89-91: getUsers() currently returns the mutable `@GuardedBy`("lock")
field users directly, breaking the synchronization contract; change
SimpleAiApiStore.getUsers() to avoid leaking the guarded list by making and
returning a defensive copy (or an unmodifiable view) while holding lock: inside
synchronized(lock) create a new List (e.g., new ArrayList<>(users)) or
Collections.unmodifiableList(new ArrayList<>(users)) and return that, ensuring
callers cannot mutate the original users field outside the lock.

In `@distribution/tutorials/ai/llm-gateway/google/10-Basic-LLM-Gateway.yaml`:
- Line 5: Update the comment that currently reads "# Replace <<API_KEY>> with
your OpenAI API key." to correctly reference Google/Gemini credentials for this
Google Gemini tutorial; locate the comment string "# Replace <<API_KEY>> with
your OpenAI API key." in the 10-Basic-LLM-Gateway.yaml file and replace it with
a clear phrase such as "# Replace <<API_KEY>> with your Google API key
(Gemini)". Ensure the new wording explicitly mentions Google or Gemini so it
matches the tutorial context.

In `@distribution/tutorials/ai/llm-gateway/google/20-Sharing-API-Keys.yaml`:
- Around line 25-27: The comment in the tutorial snippet contains a grammar
typo: replace the phrase "even it was requested" with "even though it was
requested" in the comment block (the curl example comment in
20-Sharing-API-Keys.yaml that mentions "totalTokens should not exceed 200 even
it was requested in max-output.json") so the sentence reads "...totalTokens
should not exceed 200 even though it was requested in max-output.json".

---

Outside diff comments:
In `@distribution/tutorials/ai/llm-gateway/google/membrane.cmd`:
- Around line 1-25: This file uses LF-only endings which can break label parsing
in cmd scripts; convert
distribution/tutorials/ai/llm-gateway/google/membrane.cmd to CRLF line endings
and ensure the repo enforces CRLF for .cmd files (e.g., add or update
.gitattributes with a rule like "*.cmd text eol=crlf" and re-commit the fixed
file) so labels such as :search_up, :found and :notfound resolve correctly when
executed; alternatively run a tool like unix2dos on the file and commit the
CRLF-converted version.

---

Duplicate comments:
In
`@core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java`:
- Around line 149-150: The persistence call may be invoked with a null
authenticated user: guard the store.store(user, res.getUsage()) call so it only
runs when both store != null and the user object (from MEMBRANE_AI_USER) is
non-null; update the condition around the call in LLMGatewayInterceptor (the
block containing store.store, user and res.getUsage()) to check user != null
before calling store.store to avoid persisting invalid records.

In
`@distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/google/SharingApiKeysGoogleTutorialTest.java`:
- Around line 180-218: The test alicesTokenBudgetIsExhaustedWhileBobIsUnaffected
relies on a fresh per-user token counter but current test runs share Alice's
gateway-side state across methods; add a test teardown that resets per-user
usage between `@Test` methods (e.g., an `@AfterEach` in the common base test class
used by SharingApiKeysGoogleTutorialTest) so the gateway's user token counters
are reinitialized before each test, or call the existing reset method on the
shared token-store (e.g., TokenUsageStore.reset(), UsageTracker.clear(), or
similar) from that teardown to guarantee Alice's counter starts at 0 for every
test.

In `@distribution/tutorials/ai/llm-gateway/openai/membrane.cmd`:
- Around line 1-25: This .cmd script (labels :search_up, :found, :notfound and
uses call/goto) is committed with LF-only endings which can break cmd.exe
parsing; convert the file to CRLF line endings and add a repository rule to
enforce this (e.g., add a .gitattributes entry like "*.cmd text eol=crlf") or
re-save the file with CRLF so the labels and call
"%MEMBRANE_HOME%\scripts\run-membrane.cmd" %* work correctly.

---

Nitpick comments:
In
`@distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/AbstractAiTutorialTest.java`:
- Around line 46-47: The test currently hardcodes MOCK_LLM_PORT = 3100 in
AbstractAiTutorialTest which causes CI flakiness; change this to use a
dynamically allocated port: remove the fixed final constant, add an instance
field (e.g., mockLlmPort) and initialize it at test setup by binding a
ServerSocket to port 0 (or using an OS-assigned ephemeral port) to retrieve the
actual port, then update all references that used MOCK_LLM_PORT (including
usages around the earlier noted blocks) to use the new mockLlmPort field (or a
getter like getMockLlmPort()) so each test run uses a free, non-conflicting
port.

In
`@distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/openai/SharingApiKeysOpenAiTutorialTest.java`:
- Around line 112-113: The tutorial YAML ships with a literal placeholder API
key (`<<Replace with your API_KEY>>`) which can be accidentally forwarded; add a
fail-fast check where the gateway loads the upstream apiKey (the code path that
the test SharingApiKeysOpenAiTutorialTest exercises / whatever class constructs
the OpenAI upstream config) to detect placeholder markers like "<<" or ">>" or
the substring "Replace with your API_KEY" and throw a clear
IllegalStateException (or log an error and exit) explaining the developer must
replace the placeholder in 20-Sharing-API-Keys.yaml before use; update the
configuration loader/constructor that sets the apiKey so it validates and
rejects placeholder values at startup.

In `@distribution/tutorials/ai/llm-gateway/google/20-Sharing-API-Keys.yaml`:
- Around line 13-19: The examples all use Alice's API key (abc123) but the
scenario text references Bob remaining able to send requests; add or replace one
curl example to show Bob's key (qwertz) being used — e.g., a curl to
/v1beta/models/gpt-5:generateContent with header "x-goog-api-key: qwertz" near
the Token Limit Exceeded / Wrong Model examples so readers can run the
multi-user demo (references: alice, bob, abc123, qwertz, endpoint
/v1beta/models/gpt-5:generateContent).

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 80d72558-1bcf-4c21-a2dd-5a2266de02f2

📥 Commits

Reviewing files that changed from the base of the PR and between 329bb80 and 38acaa3.

📒 Files selected for processing (74)

core/src/main/java/com/predic8/membrane/core/interceptor/ai/LLMGatewayInterceptor.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/LLMErrorCreator.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ClaudeErrorCreator.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ClaudeErrorResponse.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ClaudeLLMResponse.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/claude/ClaudeProvider.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleErrorCreator.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleLLMRequest.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleLLMResponse.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/google/GoogleProvider.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/provider/openai/OpenAiErrorCreator.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/AiApiUser.java
core/src/main/java/com/predic8/membrane/core/interceptor/ai/store/SimpleAiApiStore.java
core/src/main/java/com/predic8/membrane/core/interceptor/mcp/ExchangeToolSupport.java
core/src/main/java/com/predic8/membrane/core/interceptor/mcp/ExchangeUtils.java
core/src/main/java/com/predic8/membrane/core/interceptor/mcp/MCPUtil.java
core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpPayloadSanitizer.java
core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpSchemaBuilder.java
core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpSessionContext.java
core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpSessionManager.java
core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpToolDefinition.java
core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpToolHandler.java
core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpToolRegistry.java
core/src/main/java/com/predic8/membrane/core/interceptor/mcp/MembraneMCPServer.java
core/src/main/java/com/predic8/membrane/core/jsonrpc/JSONRPCRequest.java
core/src/main/java/com/predic8/membrane/core/jsonrpc/JSONRPCResponse.java
core/src/main/java/com/predic8/membrane/core/jsonrpc/JSONRPCUtil.java
core/src/main/java/com/predic8/membrane/core/mcp/MCPInitialize.java
core/src/main/java/com/predic8/membrane/core/mcp/MCPInitializeResponse.java
core/src/main/java/com/predic8/membrane/core/mcp/MCPInitialized.java
core/src/main/java/com/predic8/membrane/core/mcp/MCPNotification.java
core/src/main/java/com/predic8/membrane/core/mcp/MCPPing.java
core/src/main/java/com/predic8/membrane/core/mcp/MCPRequest.java
core/src/main/java/com/predic8/membrane/core/mcp/MCPResponse.java
core/src/main/java/com/predic8/membrane/core/mcp/MCPToolsCall.java
core/src/main/java/com/predic8/membrane/core/mcp/MCPToolsCallResponse.java
core/src/main/java/com/predic8/membrane/core/mcp/MCPToolsList.java
core/src/main/java/com/predic8/membrane/core/mcp/MCPToolsListResponse.java
core/src/test/java/com/predic8/membrane/core/interceptor/mcp/MembraneMCPServerTest.java
core/src/test/java/com/predic8/membrane/core/jsonrpc/JSONRPCRequestTest.java
core/src/test/java/com/predic8/membrane/core/jsonrpc/JSONRPCResponseTest.java
core/src/test/java/com/predic8/membrane/core/mcp/MCPInitializeTest.java
distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/AbstractAiTutorialTest.java
distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/claude/BasicClaudeLLMGatewayTutorialTest.java
distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/claude/SharingApiKeysTutorialTest.java
distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/google/AbstractGoogleTutorialTest.java
distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/google/BasicGoogleLLMGatewayTutorialTest.java
distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/google/SharingApiKeysGoogleTutorialTest.java
distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/openai/AbstractOpenAiTutorialTest.java
distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/openai/BasicOpenAiLLMGatewayTutorialTest.java
distribution/src/test/java/com/predic8/membrane/tutorials/ai/llmgateway/openai/SharingApiKeysOpenAiTutorialTest.java
distribution/tutorials/ai/llm-gateway/claude/10-Basic-LLM-Gateway.yaml
distribution/tutorials/ai/llm-gateway/claude/20-Sharing-API-Keys.yaml
distribution/tutorials/ai/llm-gateway/claude/max-input.json
distribution/tutorials/ai/llm-gateway/claude/max-output.json
distribution/tutorials/ai/llm-gateway/claude/membrane.cmd
distribution/tutorials/ai/llm-gateway/claude/membrane.sh
distribution/tutorials/ai/llm-gateway/claude/simple.json
distribution/tutorials/ai/llm-gateway/claude/wrong-model.json
distribution/tutorials/ai/llm-gateway/google/10-Basic-LLM-Gateway.yaml
distribution/tutorials/ai/llm-gateway/google/20-Sharing-API-Keys.yaml
distribution/tutorials/ai/llm-gateway/google/max-input.json
distribution/tutorials/ai/llm-gateway/google/max-output.json
distribution/tutorials/ai/llm-gateway/google/membrane.cmd
distribution/tutorials/ai/llm-gateway/google/membrane.sh
distribution/tutorials/ai/llm-gateway/google/simple.json
distribution/tutorials/ai/llm-gateway/openai/10-Basic-LLM-Gateway.yaml
distribution/tutorials/ai/llm-gateway/openai/20-Sharing-API-Keys.yaml
distribution/tutorials/ai/llm-gateway/openai/max-input.json
distribution/tutorials/ai/llm-gateway/openai/max-output.json
distribution/tutorials/ai/llm-gateway/openai/membrane.cmd
distribution/tutorials/ai/llm-gateway/openai/membrane.sh
distribution/tutorials/ai/llm-gateway/openai/simple.json
distribution/tutorials/ai/llm-gateway/openai/wrong-model.json

💤 Files with no reviewable changes (5)

distribution/tutorials/ai/llm-gateway/openai/simple.json
distribution/tutorials/ai/llm-gateway/claude/membrane.cmd
distribution/tutorials/ai/llm-gateway/openai/wrong-model.json
distribution/tutorials/ai/llm-gateway/openai/max-input.json
distribution/tutorials/ai/llm-gateway/claude/membrane.sh

✅ Files skipped from review due to trivial changes (39)

distribution/tutorials/ai/llm-gateway/claude/max-output.json
distribution/tutorials/ai/llm-gateway/claude/wrong-model.json
distribution/tutorials/ai/llm-gateway/google/simple.json
distribution/tutorials/ai/llm-gateway/openai/max-output.json
distribution/tutorials/ai/llm-gateway/claude/max-input.json
distribution/tutorials/ai/llm-gateway/claude/20-Sharing-API-Keys.yaml
distribution/tutorials/ai/llm-gateway/google/max-output.json
distribution/tutorials/ai/llm-gateway/claude/simple.json
distribution/tutorials/ai/llm-gateway/openai/10-Basic-LLM-Gateway.yaml
core/src/main/java/com/predic8/membrane/core/mcp/MCPToolsList.java
distribution/tutorials/ai/llm-gateway/claude/10-Basic-LLM-Gateway.yaml
core/src/test/java/com/predic8/membrane/core/jsonrpc/JSONRPCRequestTest.java
core/src/main/java/com/predic8/membrane/core/mcp/MCPRequest.java
core/src/main/java/com/predic8/membrane/core/jsonrpc/JSONRPCResponse.java
core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpToolHandler.java
core/src/main/java/com/predic8/membrane/core/jsonrpc/JSONRPCUtil.java
core/src/main/java/com/predic8/membrane/core/mcp/MCPInitializeResponse.java
core/src/test/java/com/predic8/membrane/core/mcp/MCPInitializeTest.java
core/src/main/java/com/predic8/membrane/core/mcp/MCPToolsListResponse.java
core/src/main/java/com/predic8/membrane/core/mcp/MCPToolsCall.java
core/src/test/java/com/predic8/membrane/core/interceptor/mcp/MembraneMCPServerTest.java
core/src/test/java/com/predic8/membrane/core/jsonrpc/JSONRPCResponseTest.java
core/src/main/java/com/predic8/membrane/core/mcp/MCPResponse.java
core/src/main/java/com/predic8/membrane/core/mcp/MCPToolsCallResponse.java
core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpToolRegistry.java
core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpToolDefinition.java
core/src/main/java/com/predic8/membrane/core/jsonrpc/JSONRPCRequest.java
distribution/tutorials/ai/llm-gateway/openai/20-Sharing-API-Keys.yaml
core/src/main/java/com/predic8/membrane/core/interceptor/mcp/ExchangeToolSupport.java
core/src/main/java/com/predic8/membrane/core/interceptor/mcp/MembraneMCPServer.java
core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpSchemaBuilder.java
core/src/main/java/com/predic8/membrane/core/interceptor/mcp/ExchangeUtils.java
core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpSessionManager.java
core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpPayloadSanitizer.java
core/src/main/java/com/predic8/membrane/core/interceptor/mcp/McpSessionContext.java
core/src/main/java/com/predic8/membrane/core/interceptor/mcp/MCPUtil.java
core/src/main/java/com/predic8/membrane/core/mcp/MCPInitialize.java
core/src/main/java/com/predic8/membrane/core/mcp/MCPInitialized.java
core/src/main/java/com/predic8/membrane/core/mcp/MCPNotification.java

…for LLM Gateway - Ensure thread-safe access to users in `SimpleAiApiStore` with `List.copyOf`. - Introduce `visibleRemaining` to handle non-negative token values in `GoogleErrorCreator`. - Add configuration validation in `LLMGatewayInterceptor` to enforce API key substitution. - Enhance token limit handling to adjust output tokens dynamically in `LLMGatewayInterceptor`. - Update Google and Claude tutorials with clearer instructions for API key usage and token limits.

predic8 · 2026-05-21T16:58:46Z

/ok-to-test

- Introduced `StreamingOpenAiLLMGatewayTutorialTest` with SSE mocking and validation. - Added JSON fixtures (`stream.json`, `max-output-stream.json`) for testing streaming requests. - Enhanced base test framework to support `text/event-stream` responses. - Updated `LLMGatewayInterceptor` to handle streaming scenarios with capped tokens.

- Replaced raw API key placeholders with `TEST_API_KEY` constant in tutorial tests to ensure consistency. - Added `TEST_API_KEY` to `AbstractAiTutorialTest` for upstream key substitution verification. - Updated `log4j2.xml` to limit logging to `com.predic8.membrane.core.interceptor.ai`. - Introduced PostgreSQL dependency in `pom.xml` for future enhancements.

…eptor

…dd policies support - Unified OpenAI and Chat Completions error handling under `ChatCompletionsErrorCreator`. - Deprecated older OpenAI-specific classes in favor of `ChatCompletions` equivalents. - Introduced detailed usage policies handling in `LLMGatewayInterceptor`. - Updated YAML tutorials to reflect the new `policies` configuration model.

predic8 added 2 commits May 7, 2026 15:21

feat: add AI API usage tracking and management components

cf84c3b

feat: implement OpenAI API interceptor with usage tracking and token …

d792f5f

…limits

coderabbitai Bot reviewed May 7, 2026

View reviewed changes

predic8 added 5 commits May 7, 2026 15:58

refactor: make tokens in AiApiLimit final to improve immutability

8fa0d95

Merge branch 'master' into ai-gateway

1e162fc

Merge remote-tracking branch 'origin/ai-gateway' into ai-gateway

356a4c5

coderabbitai Bot reviewed May 7, 2026

View reviewed changes

coderabbitai Bot reviewed May 8, 2026

View reviewed changes

predic8 added 10 commits May 8, 2026 13:28

docs: clarify parameter description in checkLimit method

ece8d1d

refactor: remove unused terminalEvent method from LLMApiUtil

f0a3b21

Merge branch 'master' into ai-gateway

1f0d509

Merge remote-tracking branch 'origin/ai-gateway' into ai-gateway

20b0919

Merge branch 'master' into ai-gateway

062ac5c

predic8 closed this May 18, 2026

predic8 reopened this May 18, 2026

coderabbitai Bot reviewed May 18, 2026

View reviewed changes

predic8 added 3 commits May 20, 2026 13:21

coderabbitai Bot reviewed May 20, 2026

View reviewed changes

predic8 added 4 commits May 20, 2026 17:56

refactor: improve parameter documentation in LLMErrorCreator

e0157c8

chore: add Apache 2.0 license headers to core files

11491c6

coderabbitai Bot reviewed May 21, 2026

View reviewed changes

predic8 and others added 6 commits May 22, 2026 08:01

Merge branch 'master' into ai-gateway

5e3ced7

Update tutorial to use Anthropic-specific API key and headers

ab2fac9

Fix handling of invalid max output token requests in LLMGatewayInterc…

37f7515

…eptor

Merge branch 'master' into ai-gateway

d903931

predic8 requested a review from rrayst May 22, 2026 14:34

predic8 added 2 commits May 26, 2026 11:59

Merge remote-tracking branch 'origin/ai-gateway' into ai-gateway

12991ac

Conversation

predic8 commented May 7, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary by CodeRabbit

Uh oh!

coderabbitai Bot commented May 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Reviews paused

Walkthrough

Changes

Sequence Diagram(s)

Estimated code review effort

Suggested labels

Suggested reviewers

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot May 7, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

membrane-ci-server Bot commented May 7, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

predic8 commented May 18, 2026

Uh oh!

coderabbitai Bot commented May 18, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

predic8 commented May 20, 2026

Uh oh!

coderabbitai Bot commented May 20, 2026

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

predic8 commented May 7, 2026 •

edited by coderabbitai Bot

Loading

coderabbitai Bot commented May 7, 2026 •

edited

Loading