
Added code for kaapi integration #575

Open
rkritika1508 wants to merge 6 commits into ProjectTech4DevAI:main from tattle-made:feat/guardrails-integration

Conversation

rkritika1508 (Collaborator) commented Feb 3, 2026

Summary

Target issue is #16
Explain the motivation for making this change. What existing problem does the pull request solve?
We have integrated the Guardrails APIs with kaapi-backend. The LLM call endpoint now accepts two additional parameters, input_guardrails and output_guardrails, which list the validators to run on the user input and the LLM output respectively. A call is then made to the Guardrails endpoint in Kaapi guardrails to validate the user input and the LLM output.
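At a high level, the service call looks like the sketch below. This is a minimal illustration only: the function signature and docstring match the review threads further down, but the payload keys sent to the guardrails service and the settings import are assumptions, not the committed code.

import httpx
from typing import Any
from uuid import UUID

from app.core.config import settings  # hypothetical settings module


def call_guardrails(
    input_text: str, guardrail_config: list[dict[str, Any]], job_id: UUID
) -> dict[str, Any]:
    """Call the Kaapi guardrails service to validate and process input text."""
    # Payload keys below are illustrative; KAAPI_GUARDRAILS_URL and
    # KAAPI_GUARDRAILS_AUTH come from .env (see step 1 below).
    response = httpx.post(
        settings.KAAPI_GUARDRAILS_URL,
        headers={"Authorization": settings.KAAPI_GUARDRAILS_AUTH},
        json={
            "input": input_text,
            "validators": guardrail_config,
            "job_id": str(job_id),
        },
    )
    return response.json()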
Here is how to test:

  1. Add KAAPI_GUARDRAILS_AUTH="<add-auth-token>" in .env.
  2. In the LLM call endpoint, update the request body. For example:
{
  "query": {
    "input": "Mai pregnant hu aur janna chahti hu ki ladka hoga ya ladka. Aaspaas kahi sonography karwa sakti hu kya?",
    "conversation": {
      "id": null,
      "auto_create": true
    }
  },
  "input_guardrails": [
    {
        "type": "uli_slur_match",
        "severity": "all"
    },
    {
        "type": "ban_list",
        "banned_words": [
            "sonography"
        ],
        "on_fail": "fix"
    }
  ],
  "output_guardrails": [  ],
  "config": {
    "id": "55f178d1-3ff7-4c2a-be23-9d9eae85cf80",
    "version": "1"
  },
  "callback_url": "https://play.svix.com/in/e_5I971H1PISMQn4UKWLBJJUpYmg2",
  "include_provider_raw_response": true,
  "request_metadata": { }
}
  3. Check the validator responses and their statuses in the validator_log table. (A minimal request driver for step 2 is sketched below.)
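For completeness, the request in step 2 can be driven from a short script. A hedged sketch: the endpoint path and port are assumptions, so substitute the actual LLM call route and whatever auth headers your deployment requires.

import httpx

# Hypothetical driver for step 2; the URL below is an assumption, not the real route.
body = {
    "query": {"input": "<test input>", "conversation": {"id": None, "auto_create": True}},
    "input_guardrails": [{"type": "uli_slur_match", "severity": "all"}],
    "output_guardrails": [],
    "config": {"id": "<config-id>", "version": "1"},
}
resp = httpx.post("http://localhost:8000/api/v1/llm/call", json=body, timeout=30)
print(resp.status_code, resp.json())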

Checklist

Before submitting a pull request, please ensure that you complete these tasks.

  • Ran fastapi run --reload app/main.py or docker compose up in the repository root and tested.
  • If you've fixed a bug or added code, ensure it is covered by test cases.

Notes

Add any other information the reviewer may need here.

Summary by CodeRabbit

  • New Features
    • Optional input/output guardrails for LLM requests to validate or sanitize prompts and responses when configured.
  • Documentation
    • Added example and test environment variable placeholders for guardrails authentication and service URL.
  • Chores
    • Exposed configuration to enable guardrails and wired guardrail checks into request processing.
  • Tests
    • Added unit and API tests covering sanitization, bypass, rephrase-needed, and failure scenarios.

@coderabbitai (an earlier comment was marked as resolved)

coderabbitai bot left a comment
Actionable comments posted: 5

🤖 Fix all issues with AI agents
In @.env.example:
- Around line 83-84: Add a final newline to the end of the file so the
end-of-file-fixer CI hook passes; edit the file containing the OPENAI_API_KEY
and KAAPI_GUARDRAILS_AUTH lines and ensure there is a blank line (trailing
newline) after the last line.

In `@backend/app/services/llm/jobs.py`:
- Around line 140-155: The logger currently prints raw sensitive values in
execute_job (logger.info) for input_query, input_guardrails and safe_input;
change these logs to avoid raw payloads by using a sanitizer (e.g., mask_string)
and/or logging only metadata (lengths/counts) and follow the guideline to prefix
messages with the function name in square brackets: replace direct
interpolations of input_query/input_guardrails/safe_input in the logger.info
calls inside execute_job and after call_guardrails with masked or metadata forms
(e.g., mask_string(input_query) or len(input_query)) and keep the
call_guardrails/job_id references intact so you still log context without
exposing the raw content.
- Around line 312-333: call_guardrails currently has no return type and uses an
untyped dict; update the signature to add precise typing (e.g., def
call_guardrails(input_text: str, guardrail_config: list[dict[str, Any]], job_id:
UUID) -> dict[str, Any]) and add the necessary typing import (from typing import
Any) at the top of the module so guardrail_config and the return value are
explicitly typed; keep the function body unchanged apart from the signature and
import.
- Around line 156-176: The guardrails branch currently treats success with
rephrase_needed=True as a failure and returns an error; update the logic around
safe_input in jobs.py so that if safe_input["success"] is True you always set
request.query.input = safe_input["data"]["safe_text"] and only return
handle_job_error when safe_input["success"] is False (use boolean checks like if
safe_input["success"] and not safe_input["data"]["rephrase_needed"] / elif
safe_input["success"] and safe_input["data"]["rephrase_needed"] should be
collapsed so both success paths continue), remove the trailing whitespace, and
ensure failures call APIResponse.failure_response and handle_job_error with
safe_input["error"] as before; also add a return type hint to the
call_guardrails function signature (change def call_guardrails(...) -> dict:) so
typing requirements are satisfied.
- Around line 262-272: The output-guardrails branch currently treats failures as
success, logs raw sensitive output_text, and uses a non-idiomatic comparison;
fix by checking truthiness of safe_output (if safe_output["success"] is truthy)
and on failure call handle_job_error(job_id, reason=<appropriate message>) to
mark the job failed (mirror input_guardrails behavior), set
response.response.output.text to a masked or error message, and remove/raw
output_text from logs—use mask_string(output_text) when logging and avoid
printing safe_output contents; update the logger call in execute_job to log only
masked output and a minimal status.
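Taken together, the logging comments above amount to a pattern like the following sketch (mask_string is the sanitizer named in these comments; its exact signature is an assumption):

# Sketch of the requested logging style in execute_job: prefix messages with
# the function name and log masked values or metadata instead of raw payloads.
logger.info(
    f"[execute_job] input_query={mask_string(input_query)} "
    f"guardrails_count={len(input_guardrails)} job_id={job_id}"
)
safe_input = call_guardrails(input_query, input_guardrails, job_id)
logger.info(
    f"[execute_job] guardrails success={safe_input.get('success')} "
    f"safe_text_len={len(safe_input.get('data', {}).get('safe_text') or '')} "
    f"job_id={job_id}"
)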
🧹 Nitpick comments (1)
backend/app/models/llm/request.py (1)

211-224: Consider a typed guardrails schema for early validation.

list[dict[str, Any]] is flexible but unvalidated; a small Pydantic/SQLModel model for validator type/on_fail/config would surface malformed requests earlier once the schema is stable.
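A sketch of what such a model could look like, with field names lifted from the example payload in the PR description (the values accepted for severity and on_fail are assumptions that would need to match the guardrails service):

from pydantic import BaseModel

class GuardrailConfig(BaseModel):
    # Validator name, e.g. "uli_slur_match" or "ban_list" in the example payload.
    type: str
    severity: str | None = None            # e.g. "all"
    banned_words: list[str] | None = None  # used by "ban_list"
    on_fail: str | None = None             # e.g. "fix"

The request fields would then tighten from list[dict[str, Any]] to list[GuardrailConfig], and malformed validator configs would be rejected at request parsing time.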

AkhileshNegi added the enhancement (New feature or request) label on Feb 4, 2026
coderabbitai bot left a comment

Actionable comments posted: 5

🤖 Fix all issues with AI agents
In @.env.example:
- Around line 85-86: The two environment variable lines use inconsistent spacing
around the equals sign; update the KAAPI_GUARDRAILS_AUTH and
KAAPI_GUARDRAILS_URL entries to remove spaces around '=' so they match the
file's style (e.g., change 'KAAPI_GUARDRAILS_URL = ""' to
'KAAPI_GUARDRAILS_URL=""') and ensure no trailing whitespace.

In @.env.test.example:
- Around line 38-40: Remove the trailing-whitespace and ensure the file ends
with a newline, and make the KAAPI_GUARDRAILS_URL line consistent with
KAAPI_GUARDRAILS_AUTH by removing spaces around the "=" so both use the format
KEY="". Specifically edit the lines defining KAAPI_GUARDRAILS_AUTH and
KAAPI_GUARDRAILS_URL to use no spaces around the equals sign and add a final
newline at EOF.

In `@backend/app/services/llm/guardrail.py`:
- Around line 49-52: Remove the trailing whitespace after "except Exception as
e:" and update the logger.warning call to prefix the message with the enclosing
function's name in square brackets (replace the literal "[guardrails]" with the
actual function name), e.g. logger.warning(f"[<function_name>] Service
unavailable. Bypassing guardrails. job_id={job_id}. error={e}"), keeping the
rest of the message intact and still referencing job_id and the exception
variable e.

In `@backend/app/services/llm/jobs.py`:
- Around line 264-294: In execute_job, guardrails failures are being overwritten
by the final success response; modify the block that handles output_guardrails
(in jobs.py inside execute_job) so that when safe_output indicates bypassed,
rephrase_needed, or success==False you construct the
APIResponse.failure_response (use response.response.output.text for the
error/rephrase message instead of request.query.input) and immediately return it
instead of falling through; replace all `== True` checks with truthy checks
(e.g., if safe_output.get("bypassed"):) and use safe_output.get("data", {}) /
safe_output.get("error") to avoid KeyError; only reach the final
APIResponse.success_response when no failure/guardrail condition triggered.
- Around line 155-178: The code is using "== True" comparisons and indexing into
safe_input for "bypassed" which can raise KeyError; update the logic in the
execute_job block to use truthiness (e.g., if safe_input.get("bypassed"): ...)
and safe_input.get("success")/safe_input.get("data", {}).get("rephrase_needed")
to avoid KeyError, check bypassed first (fallback) then success, set
request.query.input from safe_input.get("data", {}).get("safe_text") or
safe_input.get("error") as appropriate, and preserve the existing failure path
that calls APIResponse.failure_response and handle_job_error (referencing
safe_input, request.query.input, APIResponse.failure_response, and
handle_job_error).
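The two comments above converge on the same defensive pattern: truthy .get() checks instead of == True comparisons and bare indexing. A before/after sketch for the input branch (the "before" lines are reconstructed from the review description, not quoted from the diff):

# Before: identity comparison plus bare indexing; raises KeyError when
# "bypassed" is absent from the response.
if safe_input["bypassed"] == True:
    ...

# After: truthy checks with safe defaults; bypassed falls through,
# success rewrites the input, anything else fails the job.
if safe_input.get("bypassed"):
    pass  # guardrails service unreachable: continue with the original input
elif safe_input.get("success"):
    request.query.input = safe_input.get("data", {}).get("safe_text")
else:
    return handle_job_error(job_id, reason=safe_input.get("error"))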
🧹 Nitpick comments (1)
backend/app/services/llm/guardrail.py (1)

12-31: Tighten type hint and fix misleading docstring.

  1. The guardrail_config parameter type could be more precise: list[dict[str, Any]].
  2. The docstring states "Raises: httpx.HTTPError" but the function catches all exceptions and returns a bypass response instead of raising.
♻️ Proposed fix
-def call_guardrails(input_text: str, guardrail_config: list[dict], job_id: UUID) -> dict[str, Any]:
+def call_guardrails(input_text: str, guardrail_config: list[dict[str, Any]], job_id: UUID) -> dict[str, Any]:
     """
     Call the Kaapi guardrails service to validate and process input text.
 
     Args:
         input_text: Text to validate and process.
         guardrail_config: List of validator configurations to apply.
         job_id: Unique identifier for the request.
 
     Returns:
         JSON response from the guardrails service with validation results.
-
-    Raises:
-        httpx.HTTPError: If the request fails.
+        On service failure, returns a bypass response with success=False and bypassed=True.
     """

codecov bot commented Feb 5, 2026

Codecov Report

❌ Patch coverage is 86.99187% with 16 lines in your changes missing coverage. Please review.

Files with missing lines                 Patch %   Missing
backend/app/services/llm/guardrail.py   41.17%    10 ⚠️
backend/app/services/llm/jobs.py         80.64%     6 ⚠️


coderabbitai bot left a comment

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In @.env.test.example:
- Around line 35-39: The empty environment placeholders OPENAI_API_KEY,
KAAPI_GUARDRAILS_AUTH, and KAAPI_GUARDRAILS_URL are quoted which triggers
dotenv-linter's QuoteCharacter rule; remove the surrounding double quotes so
each line reads as an unquoted empty value (e.g., OPENAI_API_KEY=) to silence
the linter and keep the template consistent.

In `@backend/app/tests/api/routes/test_llm.py`:
- Around line 169-275: Both new tests test_llm_call_success_with_guardrails and
test_llm_call_guardrails_bypassed_still_succeeds are missing return type
annotations; update their signatures to include "-> None" (e.g., def
test_llm_call_success_with_guardrails(... ) -> None:) so they follow the project
typing guideline for function return types.

In `@backend/app/tests/services/llm/test_jobs.py`:
- Around line 722-907: All new guardrails test functions
(test_guardrails_sanitize_input_before_provider,
test_guardrails_sanitize_output_after_provider,
test_guardrails_bypass_does_not_modify_input,
test_guardrails_validation_failure_blocks_job,
test_guardrails_rephrase_needed_blocks_job) need explicit type hints on their
parameters and return type; update each signature to annotate db, job_env, and
job_for_execution with the same types used in other tests in this file (e.g.,
db: Session/Any, job_env: dict, job_for_execution: Callable or specific fixture
type) and add -> None as the return annotation so each test function has full
parameter and return type hints consistent with the project's typing pattern.
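In practice the requested change is mechanical. A sketch of one annotated signature (the Session import assumes the SQLModel stack mentioned elsewhere in this review; the fixture types are otherwise as suggested above):

from typing import Any, Callable

from sqlmodel import Session


def test_guardrails_sanitize_input_before_provider(
    db: Session,
    job_env: dict[str, Any],
    job_for_execution: Callable[..., Any],
) -> None:
    ...  # test body unchanged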

On the call_guardrails docstring in backend/app/services/llm/guardrail.py:

    Returns:
        JSON response from the guardrails service with validation results.

    Raises:

A collaborator commented:

you do not have to add "raises" to the function doc

On a new file (diff hunk @@ -0,0 +1,63 @@):

from typing import Any

A collaborator commented:

please add test cases for this file as well


Labels

enhancement New feature or request
