Skip to content

experiement implementation#10

Merged
MarquisG merged 5 commits into
masterfrom
feature/experiment
May 19, 2025
Merged

experiement implementation#10
MarquisG merged 5 commits into
masterfrom
feature/experiment

Conversation

@MarquisG
Copy link
Copy Markdown
Member

@MarquisG MarquisG commented May 19, 2025

Summary by CodeRabbit

  • New Features

    • Added support for experiment creation and management with new experiment data types and API endpoints.
    • Enhanced trace, log, and generation objects to include richer metadata, evaluation configurations, and experiment associations.
  • Improvements

    • Renamed chain_slug to feature_slug for clarity across traces and related payloads.
    • Introduced new logger levels (info, error) and refined logging behavior.
    • Added filtering by feature slug to prompt listing functionality.
    • Updated request payloads to include experiment and evaluation details.
    • Improved trace lifecycle management with flush control and enhanced logging.
  • Bug Fixes

    • Fixed variable replacement and logging in prompt system text.
    • Adjusted logging levels and error reporting for better clarity.
  • Documentation

    • Added comprehensive docstrings and usage examples for new and updated monitoring classes.
  • Tests

    • Updated tests to align with naming convention changes and modified request structures.

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 19, 2025

Caution

Review failed

The pull request is closed.

Walkthrough

This update introduces significant enhancements and refactoring across the codebase. It adds comprehensive type definitions and dataclasses for monitoring, tracing, logging, generations, evaluators, and experiments. Logging and network layers are refactored for improved type safety and clarity, with new log levels and error handling. Experiment management is now supported in the SDK, and endpoint and DTO handling is modernized. Tests and interfaces are updated to align with new naming conventions and structures.

Changes

File(s) Change Summary
basalt/_version.py Updated version string from "0.1.1" to "0.2.0".
basalt/basalt_facade.py, basalt/basaltsdk.py Refactored imports for IMonitorSDK, updated constructor parameter types, and improved logger/networker integration.
basalt/endpoints/list_prompts.py Modified prepare_request to accept PromptListDTO and include query parameters.
basalt/endpoints/monitor/create_experiment.py Added new module for experiment creation endpoint with DTOs and response decoding logic.
basalt/endpoints/monitor/send_trace.py Refactored trace log processing, replaced chainSlug with featureSlug, and added new fields to request body.
basalt/objects/base_log.py Updated constructor to use BaseLogParams, added evaluator management methods and properties.
basalt/objects/experiment.py Introduced Experiment wrapper class exposing key properties.
basalt/objects/generation.py Changed constructor and properties to use GenerationParams and snake_case naming for tokens.
basalt/objects/log.py Updated constructor to use LogParams, improved append method for generation management.
basalt/objects/trace.py Major refactor: renamed chain_slug to feature_slug, added experiment/evaluator support, enhanced lifecycle methods, and improved logging/flush logic.
basalt/ressources/monitor/base_log_types.py New module with BaseLogParams and BaseLog dataclasses, providing foundational log structure and methods.
basalt/ressources/monitor/evaluator_types.py New module defining Evaluator and EvaluationConfig dataclasses.
basalt/ressources/monitor/experiment_types.py New module with ExperimentParams and Experiment dataclasses.
basalt/ressources/monitor/generation_types.py New module introducing PromptReference, GenerationParams, and Generation dataclasses.
basalt/ressources/monitor/log_type.py New enum-like LogType class for log category constants.
basalt/ressources/monitor/log_types.py New module defining LogParams and Log dataclasses with methods for log management.
basalt/ressources/monitor/monitorsdk_types.py New protocol IMonitorSDK defining the monitoring SDK interface.
basalt/ressources/monitor/trace_types.py New module with User, Organization, TraceParams, and Trace dataclasses and methods for trace management.
basalt/sdk/monitorsdk.py Added experiment creation support, refactored trace creation for new types, and improved logging.
basalt/sdk/promptsdk.py Updated prompt listing to accept feature_slug, enhanced variable replacement in prompts, and improved logging.
basalt/utils/api.py Removed trailing whitespace in docstring.
basalt/utils/dtos.py Updated type aliases, added PromptListDTO, removed legacy dataclasses, and improved import structure.
basalt/utils/flusher.py Changed "chain_slug" to "feature_slug", added experiment/evaluator fields, and updated logging levels.
basalt/utils/logger.py Replaced debug with info and error, updated log level logic, and improved permission checks.
basalt/utils/networker.py Removed logger dependency and debug logs from network layer.
basalt/utils/protocols.py Removed local IMonitorSDK, updated IPromptSDK, added info/error to ILogger, and introduced LogLevel type alias.
tests/test_monitor_sdk.py, tests/test_send_trace_endpoint.py Updated tests to use feature_slug instead of chain_slug and adjusted assertions accordingly.
.gitignore Added .DS_Store to ignored files.

Sequence Diagram(s)

sequenceDiagram
    participant User
    participant MonitorSDK
    participant CreateExperimentEndpoint
    participant API

    User->>MonitorSDK: create_experiment(feature_slug, params)
    MonitorSDK->>CreateExperimentEndpoint: prepare_request(dto)
    CreateExperimentEndpoint->>API: POST /monitor/experiments
    API-->>CreateExperimentEndpoint: Response (experiment data)
    CreateExperimentEndpoint-->>MonitorSDK: decode_response(body)
    MonitorSDK-->>User: Experiment instance
Loading
sequenceDiagram
    participant User
    participant PromptSDK
    participant ListPromptsEndpoint
    participant API

    User->>PromptSDK: list(feature_slug)
    PromptSDK->>ListPromptsEndpoint: prepare_request(dto)
    ListPromptsEndpoint->>API: GET /prompts?featureSlug=...
    API-->>ListPromptsEndpoint: Response (prompt list)
    ListPromptsEndpoint-->>PromptSDK: decode_response(body)
    PromptSDK-->>User: ListResult
Loading

Poem

🐇
With logs and traces, tokens in tow,
Experiments bloom and evaluators grow.
From slug to feature, the naming's now right,
Our SDK hops forward, much to delight!
New types and classes, the garden expands—
Basalt now tracks with the best in the lands!
Hoppity hooray for this code update night!

Note

⚡️ AI Code Reviews for VS Code, Cursor, Windsurf

CodeRabbit now has a plugin for VS Code, Cursor and Windsurf. This brings AI code reviews directly in the code editor. Each commit is reviewed immediately, finding bugs before the PR is raised. Seamless context handoff to your AI code agent ensures that you can easily incorporate review feedback.
Learn more here.


Note

⚡️ Faster reviews with caching

CodeRabbit now supports caching for code and dependencies, helping speed up reviews. This means quicker feedback, reduced wait times, and a smoother review experience overall. Cached data is encrypted and stored securely. This feature will be automatically enabled for all accounts on May 16th. To opt out, configure Review - Disable Cache at either the organization or repository level. If you prefer to disable all data retention across your organization, simply turn off the Data Retention setting under your Organization Settings.
Enjoy the performance boost—your workflow just got faster.


📜 Recent review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 79e700c and 06fc5f4.

📒 Files selected for processing (2)
  • .gitignore (1 hunks)
  • basalt/sdk/monitorsdk.py (5 hunks)
✨ Finishing Touches
  • 📝 Generate Docstrings

🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Explain this complex logic.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai explain this code block.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and explain its main purpose.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Support

Need help? Create a ticket on our support page for assistance with any issues or questions.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR.
  • @coderabbitai generate sequence diagram to generate a sequence diagram of the changes in this PR.
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 16

🔭 Outside diff range comments (7)
tests/test_monitor_sdk.py (1)

50-70: 🛠️ Refactor suggestion

Update tests to verify experiment-related functionality

The PR description mentions adding experiment management support, but the test modifications don't include verification of this new functionality.

Consider adding new test cases or extending existing ones to verify the experiment-related functionality. For example:

def test_create_trace_with_experiment(self):
    """Test creating a trace with experiment metadata."""
    experiment_id = "exp-123"
    trace = self.monitor.create_trace(
        "test-slug",
        {
            "input": self.content,
            "experiment": {"id": experiment_id, "name": "Test Experiment"},
            "metadata": {"property1": "value1"}
        }
    )
    
    # Assert trace was created correctly with experiment data
    self.assertIsNotNone(trace)
    self.assertEqual(trace.feature_slug, "test-slug")
    self.assertEqual(trace.experiment.get("id"), experiment_id)
basalt/endpoints/list_prompts.py (1)

32-38: 🛠️ Refactor suggestion

Update docstring to reflect the new parameter

The prepare_request method signature has been updated to accept a PromptListDTO parameter, but the docstring doesn't document this parameter. The docstring should be updated to include the dto parameter and its purpose.

@staticmethod
def prepare_request(dto: PromptListDTO) -> Dict[str, Any]:
    """
    Prepare the request dictionary for the ListPrompts endpoint.
+
+    Args:
+        dto (PromptListDTO): The data transfer object containing the request parameters.

    Returns:
    	The path, method, and query parameters for getting a prompt on the API.
    """
basalt/basalt_facade.py (1)

20-27: 🛠️ Refactor suggestion

Update docstring to reflect LogLevel type

The method signature has been updated to use the LogLevel type instead of a string, but the docstring still refers to log_level as a string and lists the possible values as strings. The docstring should be updated to reflect the new type.

def __init__(self, api_key: str, log_level: LogLevel = 'all'):
    """
    Initializes the Basalt client with the given API key and log level.

    Args:
        api_key (str): The API key for authenticating with the Basalt SDK.
-        log_level (str, optional): The log level for the logger. Defaults to 'all'. (all, warn, error, debug, none)
+        log_level (LogLevel, optional): The log level for the logger. Defaults to 'all'.
    """
basalt/sdk/promptsdk.py (1)

39-41: 🛠️ Refactor suggestion

Avoid mutable default for variables

Using {} as a default argument is unsafe because the same dictionary instance is reused across calls.

-        variables: Dict[str, str] = {},
+        variables: Optional[Dict[str, str]] = None,

And inside the method:

variables = variables or {}

Apply the same pattern to _replace_vars.

🧰 Tools
🪛 Ruff (0.11.9)

39-39: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)

basalt/objects/base_log.py (1)

119-129: ⚠️ Potential issue

Missing evaluators in to_dict method

The to_dict method doesn't include the new _evaluators field in the returned dictionary, which could lead to data loss during serialization.

Consider updating the to_dict method to include evaluators:

def to_dict(self) -> Dict[str, Any]:
    """Convert the log to a dictionary for API serialization."""
    return {
        "id": self._id,
        "type": self._type,
        "name": self._name,
        "start_time": self._start_time,
        "end_time": self._end_time,
        "metadata": self._metadata,
        "parent": {"id": self._parent.id} if self._parent else None,
+       "evaluators": self._evaluators,
    }
basalt/sdk/monitorsdk.py (1)

145-157: 🛠️ Refactor suggestion

⚠️ Potential issue

None experiment causes AttributeError inside Trace – hot-fix pipeline failure

Trace.__init__ expects params["experiment"] to be either absent or a valid object.
Passing the key with a None value means the constructor tries experiment.feature_slug, raising the failure seen in CI.

-            "metadata": params.metadata,
-            "experiment": params.experiment,
+            "metadata": params.metadata,

If you still need to forward the experiment, guard it:

if params.experiment is not None:
    params_dict["experiment"] = params.experiment

Apply the same pattern to evaluators and evaluation_config if they can also be None.

basalt/objects/trace.py (1)

16-28: ⚠️ Potential issue

params is typed as TraceParams but treated like a dict

TraceParams is a dataclass; calling .get() on it triggers an AttributeError.
Either pass a plain dict or access attributes directly:

-        self._input = params.get("input")
+        self._input = params.input

Repeat for every subsequent .get() usage (input, output, name, …).
Keeping the current mismatch will break at instantiation time, masking other logic errors.

🧹 Nitpick comments (15)
basalt/ressources/monitor/log_type.py (2)

1-17: Consider using Python's Enum class for better type safety

While the current implementation works as a simple container for log type constants, consider using Python's built-in Enum class for better type safety, autocomplete support, and to prevent invalid assignments.

-class LogType:
+from enum import Enum, auto
+
+class LogType(Enum):
     """Enum-like class for log types.
     
     Attributes:
         SPAN: Represents a span log type
         GENERATION: Represents a generation log type
         FUNCTION: Represents a function log type
         TOOL: Represents a tool log type
         RETRIEVAL: Represents a retrieval log type
         EVENT: Represents an event log type
     """
-    SPAN = 'span'
-    GENERATION = 'generation'
-    FUNCTION = 'function'
-    TOOL = 'tool'
-    RETRIEVAL = 'retrieval'
-    EVENT = 'event' 
+    SPAN = 'span'
+    GENERATION = 'generation'
+    FUNCTION = 'function'
+    TOOL = 'tool'
+    RETRIEVAL = 'retrieval'
+    EVENT = 'event'

17-17: Remove trailing whitespace

There's a trailing whitespace at the end of line 17.

-    EVENT = 'event' 
+    EVENT = 'event'
basalt/endpoints/list_prompts.py (1)

39-45: Add handling for None featureSlug value

The current implementation always includes the featureSlug parameter in the query, even if dto.featureSlug is None. This could potentially cause issues if the API doesn't handle null values for this parameter properly.

return {
    "path": "/prompts",
    "method": "GET",
    "query": {
-        "featureSlug": dto.featureSlug
+        **({"featureSlug": dto.featureSlug} if dto.featureSlug is not None else {})
    }
}
basalt/ressources/monitor/experiment_types.py (1)

4-8: Consider adding feature_slug to ExperimentParams

The ExperimentParams class only has a name field, but the Experiment class requires a feature_slug. This suggests that the feature slug is provided through some other means when creating an experiment. Consider adding a feature_slug field to ExperimentParams for clarity and completeness.

@dataclass
class ExperimentParams:
    """Parameters for creating an experiment."""
    name: str
+    feature_slug: str
basalt/sdk/promptsdk.py (1)

3-3: Unused import GetResult – please remove to keep the module clean

GetResult is imported but never used, which will trigger Ruff F401 and adds noise to the diff.

-from ..utils.dtos import GetPromptDTO, PromptResponse, DescribePromptResponse, DescribePromptDTO, GetResult, DescribeResult, ListResult, PromptListResponse, PromptListDTO
+from ..utils.dtos import (
+    GetPromptDTO,
+    PromptResponse,
+    DescribePromptResponse,
+    DescribePromptDTO,
+    DescribeResult,
+    ListResult,
+    PromptListResponse,
+    PromptListDTO,
+)
🧰 Tools
🪛 Ruff (0.11.9)

3-3: ..utils.dtos.GetResult imported but unused

Remove unused import: ..utils.dtos.GetResult

(F401)

basalt/utils/flusher.py (2)

32-46: Potential serialization issues for experiment / evaluators / evaluationConfig

_trace_to_dict forwards complex python objects (Experiment instance, possibly complex evaluator objects) directly.
While SendTraceEndpoint dereferences experiment.id, it leaves evaluators and evaluationConfig untouched – if they contain non-serialisable objects, json.dumps in the network layer will fail.

Consider converting them to plain serialisable structures here:

-            "experiment": trace.experiment,
+            "experiment": {"id": trace.experiment.id} if trace.experiment else None,
-            "evaluators": trace.evaluators,
-            "evaluationConfig": trace.evaluation_config
+            "evaluators": [dict(e) for e in trace.evaluators] if trace.evaluators else None,
+            "evaluationConfig": dict(trace.evaluation_config) if trace.evaluation_config else None,

93-112: Missing success feedback after flush

On failure the method logs an error and returns, but on success it does nothing.
A lightweight info/debug log helps confirm monitoring is functioning.

@@
     error, result = self._api.invoke(endpoint, dto)
 
     if error:
         self._logger.error(f"Failed to flush trace {trace.feature_slug}: {error}")
         return
+
+    self._logger.info(f"Trace {trace.feature_slug} flushed successfully")
basalt/objects/log.py (1)

74-80: options hard-coded – preserve existing options

Overwriting generation.options outright may discard legitimate fields (e.g., temperature). Consider merging:

-        generation.options = {"type": "multi"}
+        generation.options = {**(generation.options or {}), "type": "multi"}
basalt/utils/logger.py (1)

15-17: Consider more robust error logging implementation

The error method uses the same print function as other log levels. For production code, consider using a more robust approach for error logging, such as writing to stderr or using a dedicated logging library.

def error(self, *args):
    if self._can_error():
-        print(*args)
+        import sys
+        print(*args, file=sys.stderr)
basalt/utils/dtos.py (1)

127-130: Missing from_dict classmethod in PromptListDTO

Unlike other DTOs in this file, PromptListDTO doesn't have a from_dict classmethod for consistent deserialization.

Consider adding a from_dict classmethod for consistency:

@dataclass(frozen=True)
class PromptListDTO:
    featureSlug: Optional[str] = None
+
+    @classmethod
+    def from_dict(cls, data: Dict[str, Any]):
+        return cls(
+            featureSlug=data.get("featureSlug")
+        )
basalt/endpoints/monitor/send_trace.py (1)

45-58: Simplify conditional dictionary lookups with .get() method.

The code uses conditional checks for dictionary keys, but the .get() method would be more concise.

 # Convert dates and handle parent ID
 log_data = {
     "startTime": dict_log["start_time"].isoformat() if isinstance(dict_log["start_time"], datetime) else dict_log["start_time"],
     "endTime": dict_log["end_time"].isoformat() if isinstance(dict_log["end_time"], datetime) and dict_log["end_time"] else None,
     "parentId": dict_log["parent"]["id"] if dict_log["parent"] else None,
-    "inputTokens": dict_log["input_tokens"] if "input_tokens" in dict_log else None,
-    "outputTokens": dict_log["output_tokens"] if "output_tokens" in dict_log else None,
-    "cost": dict_log["cost"] if "cost" in dict_log else None,
-    "variables": [{"label": k, "value": v} for k, v in dict_log["variables"].items()] if "variables" in dict_log else None,
-    "input": dict_log["input"] if "input" in dict_log else None,
-    "output": dict_log["output"] if "output" in dict_log else None,
-    "prompt": dict_log["prompt"] if "prompt" in dict_log else None,
-    "evaluators": dict_log["evaluators"] if "evaluators" in dict_log else None
+    "inputTokens": dict_log.get("input_tokens"),
+    "outputTokens": dict_log.get("output_tokens"),
+    "cost": dict_log.get("cost"),
+    "variables": [{"label": k, "value": v} for k, v in dict_log["variables"].items()] if "variables" in dict_log else None,
+    "input": dict_log.get("input"),
+    "output": dict_log.get("output"),
+    "prompt": dict_log.get("prompt"),
+    "evaluators": dict_log.get("evaluators")
 }
🧰 Tools
🪛 Ruff (0.11.9)

50-50: Use dict_log.get("input_tokens", None) instead of an if block

Replace with dict_log.get("input_tokens", None)

(SIM401)


51-51: Use dict_log.get("output_tokens", None) instead of an if block

Replace with dict_log.get("output_tokens", None)

(SIM401)


52-52: Use dict_log.get("cost", None) instead of an if block

Replace with dict_log.get("cost", None)

(SIM401)


54-54: Use dict_log.get("input", None) instead of an if block

Replace with dict_log.get("input", None)

(SIM401)


55-55: Use dict_log.get("output", None) instead of an if block

Replace with dict_log.get("output", None)

(SIM401)


56-56: Use dict_log.get("prompt", None) instead of an if block

Replace with dict_log.get("prompt", None)

(SIM401)


57-57: Use dict_log.get("evaluators", None) instead of an if block

Replace with dict_log.get("evaluators", None)

(SIM401)

basalt/ressources/monitor/log_types.py (1)

71-163: Methods are defined but not implemented.

The methods start, end, append, and create_generation have comprehensive docstrings with examples, but their implementations are placeholders (...). This suggests this file defines an interface that will be implemented elsewhere, but it should be clearly documented.

Would you like me to verify how these methods are implemented in other parts of the codebase, or help complete the implementations here?

basalt/ressources/monitor/base_log_types.py (1)

61-69: Default id stringification is redundant

str(f'log-{uuid4().hex[:8]}') already yields a string, so wrapping it in an extra str() call is unnecessary.

-    id: str = field(default_factory=lambda: str(f'log-{uuid4().hex[:8]}'))
+    id: str = field(default_factory=lambda: f'log-{uuid4().hex[:8]}')
basalt/ressources/monitor/trace_types.py (1)

4-6: LogType import is unused

Static analysis (ruff F401) is correct – the symbol is never referenced.
Remove the import to keep the module clean.

-from .log_type import LogType
🧰 Tools
🪛 Ruff (0.11.9)

6-6: .log_type.LogType imported but unused

Remove unused import: .log_type.LogType

(F401)

basalt/objects/trace.py (1)

323-327: to_dict serialises raw objects – risk of non-JSON serialisable payload

self._logs, self._experiment, self._evaluators, and self._evaluation_config are complex objects.
Pushing them directly into a JSON encoder will fail or leak internal representation.

Consider:

"logs": [log.to_dict() for log in self._logs],
"experiment": self._experiment.to_dict() if self._experiment else None,
"evaluators": [e.to_dict() for e in self._evaluators] if self._evaluators else None,

Serialise only primitives and nested dicts.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 899f00e and 80e2e38.

⛔ Files ignored due to path filters (2)
  • .DS_Store is excluded by !**/.DS_Store
  • basalt/.DS_Store is excluded by !**/.DS_Store
📒 Files selected for processing (29)
  • basalt/_version.py (1 hunks)
  • basalt/basalt_facade.py (3 hunks)
  • basalt/basaltsdk.py (2 hunks)
  • basalt/endpoints/list_prompts.py (3 hunks)
  • basalt/endpoints/monitor/create_experiment.py (1 hunks)
  • basalt/endpoints/monitor/send_trace.py (3 hunks)
  • basalt/objects/base_log.py (4 hunks)
  • basalt/objects/experiment.py (1 hunks)
  • basalt/objects/generation.py (4 hunks)
  • basalt/objects/log.py (2 hunks)
  • basalt/objects/trace.py (7 hunks)
  • basalt/ressources/monitor/base_log_types.py (1 hunks)
  • basalt/ressources/monitor/evaluator_types.py (1 hunks)
  • basalt/ressources/monitor/experiment_types.py (1 hunks)
  • basalt/ressources/monitor/generation_types.py (1 hunks)
  • basalt/ressources/monitor/log_type.py (1 hunks)
  • basalt/ressources/monitor/log_types.py (1 hunks)
  • basalt/ressources/monitor/monitorsdk_types.py (1 hunks)
  • basalt/ressources/monitor/trace_types.py (1 hunks)
  • basalt/sdk/monitorsdk.py (5 hunks)
  • basalt/sdk/promptsdk.py (4 hunks)
  • basalt/utils/api.py (1 hunks)
  • basalt/utils/dtos.py (3 hunks)
  • basalt/utils/flusher.py (3 hunks)
  • basalt/utils/logger.py (1 hunks)
  • basalt/utils/networker.py (2 hunks)
  • basalt/utils/protocols.py (3 hunks)
  • tests/test_monitor_sdk.py (1 hunks)
  • tests/test_send_trace_endpoint.py (2 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (7)
basalt/basaltsdk.py (3)
basalt/utils/protocols.py (3)
  • IPromptSDK (29-32)
  • IBasaltSDK (34-38)
  • monitor (38-38)
basalt/basalt_facade.py (1)
  • monitor (54-58)
basalt/ressources/monitor/monitorsdk_types.py (1)
  • IMonitorSDK (9-176)
basalt/endpoints/list_prompts.py (5)
basalt/utils/dtos.py (2)
  • PromptListResponse (108-125)
  • PromptListDTO (128-129)
basalt/endpoints/monitor/create_experiment.py (1)
  • prepare_request (37-47)
basalt/utils/protocols.py (1)
  • prepare_request (14-14)
basalt/endpoints/get_prompt.py (1)
  • prepare_request (32-49)
basalt/endpoints/describe_prompt.py (1)
  • prepare_request (32-49)
basalt/utils/flusher.py (5)
basalt/objects/trace.py (6)
  • feature_slug (90-92)
  • logs (80-82)
  • logs (85-87)
  • experiment (100-102)
  • evaluators (110-112)
  • evaluation_config (105-107)
basalt/utils/logger.py (1)
  • error (15-17)
basalt/utils/protocols.py (2)
  • error (43-43)
  • invoke (18-18)
basalt/endpoints/monitor/send_trace.py (1)
  • SendTraceEndpoint (11-134)
basalt/utils/api.py (1)
  • invoke (41-74)
basalt/ressources/monitor/experiment_types.py (2)
basalt/objects/experiment.py (5)
  • name (13-14)
  • Experiment (4-22)
  • id (9-10)
  • feature_slug (17-18)
  • created_at (21-22)
basalt/endpoints/monitor/create_experiment.py (1)
  • Experiment (7-21)
basalt/utils/logger.py (1)
basalt/utils/protocols.py (4)
  • ILogger (40-43)
  • warn (41-41)
  • info (42-42)
  • error (43-43)
basalt/endpoints/monitor/send_trace.py (4)
basalt/objects/base_log.py (4)
  • trace (70-72)
  • trace (80-82)
  • to_dict (119-129)
  • id (30-32)
basalt/objects/trace.py (4)
  • logs (80-82)
  • logs (85-87)
  • to_dict (311-327)
  • append (227-245)
basalt/objects/log.py (1)
  • append (58-81)
basalt/objects/experiment.py (1)
  • id (9-10)
basalt/ressources/monitor/monitorsdk_types.py (7)
basalt/ressources/monitor/trace_types.py (4)
  • TraceParams (26-38)
  • Trace (41-301)
  • create_generation (226-254)
  • create_log (256-281)
basalt/ressources/monitor/experiment_types.py (2)
  • ExperimentParams (5-7)
  • Experiment (10-14)
basalt/objects/trace.py (4)
  • Trace (12-339)
  • create_generation (247-270)
  • create_log (272-289)
  • feature_slug (90-92)
basalt/ressources/monitor/generation_types.py (2)
  • GenerationParams (26-59)
  • Generation (62-156)
basalt/ressources/monitor/log_types.py (4)
  • LogParams (11-25)
  • Log (28-188)
  • create_generation (134-162)
  • create_log (164-188)
basalt/objects/log.py (3)
  • Log (6-145)
  • create_generation (103-127)
  • create_log (129-145)
basalt/sdk/monitorsdk.py (4)
  • create_trace (45-65)
  • create_generation (67-81)
  • create_log (83-97)
  • create_experiment (27-42)
🪛 GitHub Actions: Python SDK Tests
tests/test_monitor_sdk.py

[error] 52-239: Multiple test failures in TestMonitorSDK and TestMonitorSDKIntegration due to AttributeError: 'NoneType' object has no attribute 'feature_slug' when calling monitor.create_trace

basalt/sdk/monitorsdk.py

[error] 158-158: Failure in _create_trace method when instantiating Trace object due to experiment being None and causing AttributeError

basalt/objects/trace.py

[error] 39-39: AttributeError: 'NoneType' object has no attribute 'feature_slug' in Trace.init when accessing experiment.feature_slug

🪛 Ruff (0.11.9)
basalt/endpoints/monitor/send_trace.py

50-50: Use dict_log.get("input_tokens", None) instead of an if block

Replace with dict_log.get("input_tokens", None)

(SIM401)


51-51: Use dict_log.get("output_tokens", None) instead of an if block

Replace with dict_log.get("output_tokens", None)

(SIM401)


52-52: Use dict_log.get("cost", None) instead of an if block

Replace with dict_log.get("cost", None)

(SIM401)


54-54: Use dict_log.get("input", None) instead of an if block

Replace with dict_log.get("input", None)

(SIM401)


55-55: Use dict_log.get("output", None) instead of an if block

Replace with dict_log.get("output", None)

(SIM401)


56-56: Use dict_log.get("prompt", None) instead of an if block

Replace with dict_log.get("prompt", None)

(SIM401)


57-57: Use dict_log.get("evaluators", None) instead of an if block

Replace with dict_log.get("evaluators", None)

(SIM401)

basalt/sdk/promptsdk.py

3-3: ..utils.dtos.GetResult imported but unused

Remove unused import: ..utils.dtos.GetResult

(F401)

basalt/ressources/monitor/trace_types.py

6-6: .log_type.LogType imported but unused

Remove unused import: .log_type.LogType

(F401)

basalt/sdk/monitorsdk.py

9-9: Redefinition of unused Generation from line 6

(F811)


10-10: Redefinition of unused Log from line 7

(F811)

🔇 Additional comments (39)
basalt/utils/api.py (1)

23-23: Whitespace cleanup looks good

This minor whitespace cleanup in the docstring improves code consistency.

basalt/_version.py (1)

1-1: Version bump is appropriate for new experiment implementation

The version bump to 0.2.0 correctly follows semantic versioning for the addition of new features (experiment implementation) without breaking changes.

basalt/basaltsdk.py (1)

1-2: Refactored import structure looks good

The import statement for IMonitorSDK has been correctly updated to import from the dedicated interface module instead of the generic protocols module. This aligns with the broader refactoring to improve type safety and organization.

tests/test_send_trace_endpoint.py (2)

17-17: Field name update is consistent with API changes

The renaming from "chain_slug" to "feature_slug" in the test input data aligns with the broader refactoring across the codebase.


50-50: Assertion updated properly for new field name

The assertion correctly checks for "featureSlug" in the response body, maintaining consistency with the input change from "chain_slug" to "feature_slug".

basalt/utils/networker.py (1)

12-13: Simplified constructor by removing logging dependency

The Networker class no longer requires a logger, which aligns with the decoupling of logging responsibilities from the network layer.

basalt/ressources/monitor/evaluator_types.py (2)

1-10: Well-defined Evaluator dataclass

The Evaluator dataclass is implemented with clear type annotations and appropriate documentation. It provides a clean structure for representing evaluator configurations.


11-16: Clean implementation of EvaluationConfig

The EvaluationConfig dataclass is well-structured with proper typing and default values. Using Optional for sample_rate with None as default is a good practice.

basalt/objects/experiment.py (1)

1-22: LGTM! Clean wrapper implementation

The Experiment class provides a clean wrapper around the IExperiment interface with properly typed read-only properties. This implementation follows good object-oriented design principles and provides type safety.

basalt/basalt_facade.py (1)

29-31: Removal of logger from Networker constructor

The logger parameter has been removed from the Networker constructor. This suggests that logging has been refactored out of the networker, which is a good separation of concerns. Make sure that any necessary logging for network operations is handled elsewhere in the codebase.

basalt/sdk/promptsdk.py (1)

197-208:

Details

✅ Verification successful

self._logger.warn may not exist – verify logger interface

The only logger method shown in basalt/utils/logger.py is error().
If warn() (or warning()) is not implemented, these calls will raise AttributeError the first time a missing variable is encountered.

Request verification:

If absent, either:

-            self._logger.warn(...)
+            self._logger.error(...)  # or .info/.warning once implemented

or implement a warn/warning method on the logger.


🏁 Script executed:

#!/bin/bash
# Check for a warn|warning method on ILogger implementation
rg -n "def (warn|warning)" | head

Length of output: 154


No action needed: warn() is implemented on the logger

The ILogger protocol (protocols.py:41) and the Logger implementation (logger.py:7) both define a warn() method, so these calls will not raise an exception. No changes are required here.

🧰 Tools
🪛 Ruff (0.11.9)

197-197: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)

basalt/utils/protocols.py (5)

1-1: Good addition of the Literal type import

The addition of Literal to the import statement is appropriate for defining the new LogLevel type alias.


4-4: Clean refactoring by importing IMonitorSDK from a dedicated type module

Moving the IMonitorSDK interface to a dedicated type module improves code organization and separation of concerns.


32-32: API enhancement for filtering prompts by feature

Adding the optional feature_slug parameter enables filtering prompts by feature, which aligns with the new PromptListDTO in dtos.py.


41-43: Improved logging interface with additional severity levels

The introduction of info and error methods enhances the logging interface with more granular control over log levels.


46-46: Type safety improvement with LogLevel type alias

The LogLevel type alias using Literal improves type safety by restricting possible values to "all", "warning", or "none".

basalt/objects/base_log.py (5)

2-8: Good restructuring with explicit type imports

The import updates properly reference the new type modules, enhancing code organization and type safety.


14-14: Enhanced type safety with BaseLogParams

Replacing a generic dictionary with the strongly typed BaseLogParams improves type safety and code clarity.


23-23: Adding evaluators support

Adding the _evaluators field enables tracking evaluator data within logs, supporting experiment tracking functionality.


74-77: Good encapsulation with evaluators property

The property getter for evaluators follows proper encapsulation principles by providing read-only access to the internal _evaluators field.


94-99: Well-implemented evaluator addition method

The add_evaluator method includes proper null checking and initialization, and follows the method chaining pattern consistently used in the class.

basalt/utils/logger.py (3)

1-4: Improved type safety in logger initialization

Using the imported LogLevel type for the log_level parameter enhances type checking and prevents invalid log level values.


11-17: Enhanced logging granularity with new methods

The addition of info and error methods provides more granular control over logging levels, improving debugging capabilities.


19-26: Well-structured logging permission checks

The permission check methods implement a clear hierarchy: errors are always logged, warnings are logged with 'warning' or 'all' levels, and info is logged only with 'all' level.

basalt/utils/dtos.py (3)

4-4: Good integration with monitoring system

Importing the Generation type from the monitor module shows good integration between the prompt and monitoring systems.


127-130: Consistent addition of PromptListDTO

The new PromptListDTO dataclass with an optional featureSlug field aligns with the updated list method signature in IPromptSDK.


73-73:

Details

❓ Verification inconclusive

Enhanced return type with Generation support

Expanding the GetResult type to include an optional Generation enables returning monitoring data alongside prompt responses.

This change may impact existing code that expects a 2-tuple. Please ensure all consumers of GetResult have been updated to handle the 3-tuple result:


🏁 Script executed:

#!/bin/bash
# Find all places where GetResult is unpacked to ensure they handle the new 3-tuple format
rg -A 2 "GetResult" | grep -A 5 "(err, result)" 

Length of output: 47


Please verify all GetResult consumers handle the new 3-tuple format

I couldn’t find any direct unpacking of GetResult in the codebase. Please manually ensure that every call site:

  • Invokes functions annotated with -> GetResult
  • Unpacks three variables (e.g. err, response, generation = …) rather than only two

You can use these commands to locate call sites and unpack patterns:

# Find all functions returning GetResult
rg -n "->\s*GetResult" -C2

# Find any two-variable unpacking (err, result = …) that may need a third variable
rg -n "^[[:space:]]*[[:alpha:]_][[:alnum:]_]*,[[:space:]]*[[:alpha:]_][[:alnum:]_]*[[:space:]]*=" -C2
basalt/objects/generation.py (6)

4-4: Import style is consistent with the project's refactoring toward typed interfaces.

The import of GenerationParams from the new types module aligns with the broader effort to improve type safety throughout the codebase.


10-10: Constructor signature changed to use typed parameter.

The constructor now explicitly requires a GenerationParams type instead of a generic dictionary, improving type safety while maintaining backward compatibility through dictionary-like access patterns.


20-21: Attribute naming convention changed from camelCase to snake_case.

This change from _inputTokens/_outputTokens to _input_tokens/_output_tokens aligns with Python naming conventions and is consistent with the broader codebase refactoring effort.


54-56: Property naming updated to match attribute naming.

The input_tokens property declaration and implementation now uses snake_case, consistent with the attribute rename.


59-61: Property naming updated to match attribute naming.

The output_tokens property declaration and implementation now uses snake_case, consistent with the attribute rename.


136-137: Update method keys renamed to match the new attribute naming.

The update method now uses the snake_case keys consistent with the property and attribute renames.

basalt/endpoints/monitor/send_trace.py (1)

94-106: Updated API payload structure with new fields.

The request body now includes experiment, evaluators, and evaluationConfig fields, supporting the new experimental features. The naming convention change from chainSlug to featureSlug is consistently applied.

basalt/endpoints/monitor/create_experiment.py (1)

32-56: Well-structured endpoint implementation.

The CreateExperimentEndpoint follows the established pattern for endpoints in the codebase, with separate methods for request preparation and response decoding. The error handling is consistent with other endpoints.

basalt/ressources/monitor/log_types.py (1)

36-66: Excellent docstring with detailed usage examples.

The comprehensive docstring with real-world usage examples greatly enhances developer understanding of how the Log class should be used within the system.

basalt/objects/trace.py (2)

33-36: Key name mismatch for evaluation fields

You read evaluationConfig from the params, but the dataclass defines evaluation_config.
Even after the dict/dataclass issue is fixed, the camelCase/ snake_case discrepancy will silently drop data.

-        self._evaluation_config = params.get("evaluationConfig")
+        self._evaluation_config = params.evaluation_config

330-339: _can_flush logs via warn but Logger exposes warning

Double-check the Logger API. If the method is warning() (Python std-lib style), the current call will raise AttributeError, preventing the guard from working.

tests/test_monitor_sdk.py (1)

70-70:

Details

❌ Incorrect review comment

Fix the implementation error causing NoneType error in feature_slug check.

The test has been correctly updated to check feature_slug instead of chain_slug, reflecting the change in the attribute name. However, the pipeline is failing with AttributeError: 'NoneType' object has no attribute 'feature_slug' when calling monitor.create_trace.

This suggests that the implementation of MonitorSDK.create_trace is either:

  1. Returning None instead of a proper Trace object, or
  2. Not correctly assigning the feature_slug attribute to the value passed as the first argument.

🏁 Script executed:

#!/bin/bash
# Check the implementation of MonitorSDK.create_trace
echo "Checking MonitorSDK.create_trace implementation..."
cat basalt/sdk/monitorsdk.py | grep -A 20 "def create_trace"

# Check the Trace class implementation
echo "Checking Trace class implementation..."
cat basalt/objects/trace.py | grep -A 20 "__init__" | head -n 30

Length of output: 1642


🏁 Script executed:

#!/bin/bash
echo "Inspecting _create_trace implementation in MonitorSDK..."
grep -R -n -A 20 "def _create_trace" basalt/sdk/monitorsdk.py

echo
echo "Checking for a return of the Trace instance..."
grep -R -n "return " -n -A 5 basalt/sdk/monitorsdk.py | grep "_create_trace"

echo
echo "Verifying feature_slug property in Trace class..."
grep -R -n "feature_slug" basalt/objects/trace.py

Length of output: 1674


🏁 Script executed:

#!/bin/bash
echo "Displaying full _create_trace implementation..."
sed -n '128,260p' basalt/sdk/monitorsdk.py

Length of output: 2843


MonitorSDK.create_trace correctly returns a Trace instance

The implementation of MonitorSDK.create_trace (and its internal _create_trace) properly constructs and returns a Trace with feature_slug set to the provided slug. The Trace class’s constructor and its feature_slug property are implemented correctly, so the NoneType error isn’t coming from here.

Please check the test setup in tests/test_monitor_sdk.py:

  • How is monitor instantiated? Ensure it’s a valid MonitorSDK instance, not None.
  • Review any fixtures or mocks for MonitorSDK to confirm they return a real object.

Likely an incorrect or invalid review comment.

Comment thread tests/test_monitor_sdk.py
Comment thread basalt/utils/networker.py
Comment thread basalt/ressources/monitor/experiment_types.py
Comment thread basalt/objects/log.py
Comment thread basalt/endpoints/monitor/create_experiment.py
Comment thread basalt/sdk/monitorsdk.py
Comment thread basalt/ressources/monitor/base_log_types.py
Comment thread basalt/ressources/monitor/trace_types.py
Comment thread basalt/objects/trace.py
Comment thread basalt/objects/trace.py
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (2)
basalt/objects/trace.py (2)

223-226: update() suffers from same dataclass/dict & camelCase issues

The method replicates the earlier mistakes (params.get(...), evaluationConfig).
Refactor using attribute access and correct field names to ensure updates propagate.

 self._name = params.get("name", self._name)
-self._evaluators = params.get("evaluators", self._evaluators)
-self._evaluation_config = params.get("evaluationConfig", self._evaluation_config)
+self._evaluators = params.get("evaluators", self._evaluators)
+self._evaluation_config = params.get("evaluation_config", self._evaluation_config)

Or better, use getattr for attribute access:

 self._name = params.get("name", self._name)
-self._evaluators = params.get("evaluators", self._evaluators)
-self._evaluation_config = params.get("evaluationConfig", self._evaluation_config)
+self._evaluators = getattr(params, "evaluators", self._evaluators)
+self._evaluation_config = getattr(params, "evaluation_config", self._evaluation_config)

37-45: ⚠️ Potential issue

Initialize self._experiment before conditional code path

The _experiment attribute is only initialized in the else branch when the experiment is valid. If there's no "experiment" key in params or if the experiment is invalid, self._experiment remains uninitialized which could cause AttributeError when accessing the experiment property later.

 def __init__(self, feature_slug: str, params: TraceParams, flusher: 'Flusher', logger: 'Logger'):
     self._feature_slug = feature_slug

     self._input = params.get("input")
     self._output = params.get("output")
     self._name = params.get("name")
     self._start_time = params.get("start_time", datetime.now())
     self._end_time = params.get("end_time")
     self._user = params.get("user")
     self._organization = params.get("organization")
     self._metadata = params.get("metadata")

     self._logs: List['BaseLog'] = []

     self._flusher = flusher
     self._is_ended = False

     self._evaluators = params.get("evaluators")
     self._evaluation_config = params.get("evaluationConfig")
     self._logger = logger

+    self._experiment = None
     if "experiment" in params:
         experiment = params["experiment"]
         if experiment is None:
             self._logger.warn("Warning: Experiment is None. This experiment will be ignored.")
         elif experiment.feature_slug != self._feature_slug:
             self._logger.warn("Warning: Experiment feature slug does not match trace feature slug. This experiment will be ignored.")
         else:
             self._experiment = experiment
🧹 Nitpick comments (1)
basalt/objects/trace.py (1)

331-341: Simplify _can_flush method logic

The current implementation is confusing - it checks if the trace is ended, logs a warning if it is, then returns the negation of _is_ended. The method should either return early after logging the warning, or separate the warning logic from the boolean check.

def _can_flush(self) -> bool:
    """
    Check if the trace can be flushed.

    Returns:
        bool: True if the trace can be flushed, False otherwise.
    """
    if self._is_ended:
        self._logger.warn('Trace already ended. This operation will be ignored.')
+       return False

-    return not self._is_ended
+    return True
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between 80e2e38 and aa38c40.

📒 Files selected for processing (1)
  • basalt/objects/trace.py (7 hunks)
🔇 Additional comments (1)
basalt/objects/trace.py (1)

306-309: 🛠️ Refactor suggestion

Improve flush control logic in end()

The method calls _can_flush() which logs a warning if the trace is already ended, but then proceeds to check not self._is_ended in the return. This creates a situation where you log a warning but then still potentially try to flush.

 # Send to the API using the flusher
-if self._can_flush():
+can_flush = self._can_flush()
+if can_flush:
     self._end_time = datetime.now()
     self._is_ended = True
     self._flusher.flush_trace(self)

Likely an incorrect or invalid review comment.

Comment thread basalt/objects/trace.py
Comment on lines +172 to +184
def set_experiment(self, experiment: Dict[str, Any]) -> 'Trace':
"""
Set the experiment for the trace.

Args:
experiment (Dict[str, Any]): The experiment to set.

Returns:
Trace: The trace instance.
"""
self._experiment = experiment
return self

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Type mismatch in set_experiment method

The method signature accepts a Dict[str, Any] but assigns it directly to self._experiment, which is likely supposed to be an Experiment object based on the property return type and import. This could cause type errors later.

-def set_experiment(self, experiment: Dict[str, Any]) -> 'Trace':
+def set_experiment(self, experiment: 'Experiment') -> 'Trace':
     """
     Set the experiment for the trace.

     Args:
-        experiment (Dict[str, Any]): The experiment to set.
+        experiment ('Experiment'): The experiment to set.

     Returns:
         Trace: The trace instance.
     """
     self._experiment = experiment
     return self
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def set_experiment(self, experiment: Dict[str, Any]) -> 'Trace':
"""
Set the experiment for the trace.
Args:
experiment (Dict[str, Any]): The experiment to set.
Returns:
Trace: The trace instance.
"""
self._experiment = experiment
return self
def set_experiment(self, experiment: 'Experiment') -> 'Trace':
"""
Set the experiment for the trace.
Args:
experiment ('Experiment'): The experiment to set.
Returns:
Trace: The trace instance.
"""
self._experiment = experiment
return self
🤖 Prompt for AI Agents
In basalt/objects/trace.py around lines 172 to 184, the set_experiment method
accepts a Dict[str, Any] but assigns it directly to self._experiment, which
should be an Experiment object. To fix this, change the method to accept an
Experiment instance instead of a dictionary, and update the type hint
accordingly. Ensure that the assigned value to self._experiment is of type
Experiment to maintain type consistency and avoid errors.

Comment thread basalt/objects/trace.py Outdated
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (2)
basalt/sdk/monitorsdk.py (2)

6-10: Remove shadowed imports for Generation and Log

This duplicates the previous review issue which is still present. The imports on lines 6-7 shadow the ones on lines 9-10, causing Ruff F811 warnings.

-from ..ressources.monitor.generation_types import GenerationParams, Generation
-from ..ressources.monitor.log_types import LogParams, Log
+from ..ressources.monitor.generation_types import GenerationParams
+from ..ressources.monitor.log_types import LogParams
🧰 Tools
🪛 Ruff (0.11.9)

9-9: Redefinition of unused Generation from line 6

(F811)


10-10: Redefinition of unused Log from line 7

(F811)


27-42: ⚠️ Potential issue

Return type in docstring doesn't match annotation

The method annotation correctly returns Tuple[Optional[Exception], Optional[Experiment]], but the docstring incorrectly states it returns just Experiment. Also, the params type in docstring is incorrect.

    def create_experiment(
        self,
        feature_slug: str,
        params: ExperimentParams
    ) -> Tuple[Optional[Exception], Optional[Experiment]]:
        """
        Creates a new experiment for monitoring.

        Args:
            feature_slug (str): The feature slug for the experiment.
-           params (Dict[str, Any]): Parameters for the experiment.
+           params (ExperimentParams): Parameters for the experiment.

        Returns:
-           Experiment: A new Experiment instance.
+           Tuple[Optional[Exception], Optional[Experiment]]: A tuple containing an optional error and an optional Experiment instance.
        """
🧹 Nitpick comments (1)
basalt/sdk/monitorsdk.py (1)

48-48: TraceParams should have a default value of None

To be consistent with the code that handles None params, the type annotation should allow for None.

-       params: Optional[TraceParams] = None
+       params: Optional[Dict[str, Any]] = None

Alternatively, modify the implementation to properly handle None for TraceParams.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Cache: Disabled due to data retention organization setting
Knowledge Base: Disabled due to data retention organization setting

📥 Commits

Reviewing files that changed from the base of the PR and between aa38c40 and 06204e5.

📒 Files selected for processing (1)
  • basalt/sdk/monitorsdk.py (5 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
basalt/sdk/monitorsdk.py (7)
basalt/utils/protocols.py (6)
  • IApi (17-18)
  • ILogger (40-43)
  • monitor (38-38)
  • get (10-10)
  • get (30-30)
  • invoke (18-18)
basalt/ressources/monitor/trace_types.py (2)
  • TraceParams (26-38)
  • Trace (41-301)
basalt/ressources/monitor/experiment_types.py (2)
  • ExperimentParams (5-7)
  • Experiment (10-14)
basalt/objects/trace.py (7)
  • Trace (12-341)
  • experiment (102-104)
  • feature_slug (92-94)
  • name (47-49)
  • metadata (77-79)
  • evaluators (112-114)
  • evaluation_config (107-109)
basalt/endpoints/monitor/create_experiment.py (3)
  • Experiment (7-21)
  • CreateExperimentEndpoint (32-55)
  • CreateExperimentDTO (24-26)
basalt/objects/experiment.py (3)
  • Experiment (4-22)
  • feature_slug (17-18)
  • name (13-14)
basalt/utils/flusher.py (1)
  • Flusher (11-114)
🪛 Ruff (0.11.9)
basalt/sdk/monitorsdk.py

9-9: Redefinition of unused Generation from line 6

(F811)


10-10: Redefinition of unused Log from line 7

(F811)

🔇 Additional comments (2)
basalt/sdk/monitorsdk.py (2)

153-156: Support for experiment and evaluations data

The changes correctly implement new experiment-related functionality by adding support for experiments, evaluators, and evaluation configuration in trace creation. The implementation aligns with the changes to the Trace class as shown in the relevant code snippets.


158-158: Added logger to Trace construction

Good improvement to pass the logger to the Trace constructor, which allows for better error reporting within the Trace class.

Comment thread basalt/sdk/monitorsdk.py
Comment on lines 60 to +64
if params is None:
params = {}

trace_params = TraceParams(**params)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Inconsistent handling of TraceParams

The method first initializes params to an empty dict if None, but then immediately converts it to TraceParams. This is inconsistent with the type annotation.

    if params is None:
-       params = {}
-
-   trace_params = TraceParams(**params)
+       params = TraceParams()

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In basalt/sdk/monitorsdk.py around lines 60 to 64, the code initializes params
as an empty dictionary if None, but then converts it to a TraceParams object,
which conflicts with the expected type annotation. To fix this, ensure that
params is consistently treated as a TraceParams instance by either initializing
params as TraceParams() when None or adjusting the type annotation and usage to
match the dictionary input before conversion.

Comment thread basalt/sdk/monitorsdk.py
Comment on lines +114 to +122
dto = CreateExperimentDTO(
feature_slug=feature_slug,
name=params.get("name"),
)

# Call the API endpoint
err, result = self._api.invoke(CreateExperimentEndpoint, dto)

if err is None:
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

Verify error handling in _create_experiment

The method returns (None, Experiment) when successful and (err, None) when there's an error. This is a good pattern that follows the API protocol, but make sure all consumers of this method handle both cases correctly.


🏁 Script executed:

#!/bin/bash
# Find all usages of create_experiment to verify error handling
rg -A 5 "create_experiment\(" --type py

Length of output: 2223


Fix error handling in the create_experiment usage example

The only consumer of _create_experiment we found is the sample in the docstring at basalt/ressources/monitor/monitorsdk_types.py, which currently does:

experiment = basalt.monitor.create_experiment('user-query', {'name': 'my-experiment'})
# Create a trace and add it to the experiment
trace = basalt.monitor.create_trace('user-query', {'experiment': experiment})

This ignores the (err, experiment) tuple and will break if an error occurs. Please update the snippet to unpack and handle errors:

• File: basalt/ressources/monitor/monitorsdk_types.py

  • Replace the single assignment with:
- experiment = basalt.monitor.create_experiment('user-query', {'name': 'my-experiment'})
+ err, experiment = basalt.monitor.create_experiment('user-query', {'name': 'my-experiment'})
+ if err is not None:
+     # TODO: handle error (e.g., log or raise)
+ else:
+     # proceed with valid experiment
+     trace = basalt.monitor.create_trace('user-query', {'experiment': experiment})

This ensures every consumer properly checks for errors before using the returned Experiment.

🤖 Prompt for AI Agents
In basalt/ressources/monitor/monitorsdk_types.py around the usage example of
create_experiment, the code currently assigns the result directly to a variable
without unpacking the (err, experiment) tuple returned by create_experiment.
Update the example to unpack the tuple into err and experiment variables, then
add a conditional check to handle the error case before proceeding. This ensures
proper error handling consistent with the API protocol.

Comment thread basalt/sdk/monitorsdk.py
Comment on lines +99 to +125
def _create_experiment(
self,
feature_slug: str,
params: ExperimentParams
) -> Tuple[Optional[Exception], Optional[Experiment]]:
"""
Internal implementation for creating an experiment.

Args:
feature_slug (str): The feature slug for the experiment.
params (ExperimentParams): Parameters for the experiment.

Returns:
Experiment: A new Experiment instance.
"""
dto = CreateExperimentDTO(
feature_slug=feature_slug,
name=params.get("name"),
)

# Call the API endpoint
err, result = self._api.invoke(CreateExperimentEndpoint, dto)

if err is None:
return None, Experiment(result.experiment)

return err, None
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Incorrect usage of ExperimentParams and docstring inconsistency

There are several issues with the implementation:

  1. The docstring return type doesn't match the actual return type
  2. params.get("name") assumes ExperimentParams is a dictionary-like object, but it appears to be a dataclass based on the imports
    def _create_experiment(
        self,
        feature_slug: str,
        params: ExperimentParams
    ) -> Tuple[Optional[Exception], Optional[Experiment]]:
        """
        Internal implementation for creating an experiment.

        Args:
            feature_slug (str): The feature slug for the experiment.
            params (ExperimentParams): Parameters for the experiment.

        Returns:
-           Experiment: A new Experiment instance.
+           Tuple[Optional[Exception], Optional[Experiment]]: A tuple containing an optional error and an optional Experiment instance.
        """
        dto = CreateExperimentDTO(
            feature_slug=feature_slug,
-           name=params.get("name"),
+           name=params.name,
        )
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def _create_experiment(
self,
feature_slug: str,
params: ExperimentParams
) -> Tuple[Optional[Exception], Optional[Experiment]]:
"""
Internal implementation for creating an experiment.
Args:
feature_slug (str): The feature slug for the experiment.
params (ExperimentParams): Parameters for the experiment.
Returns:
Experiment: A new Experiment instance.
"""
dto = CreateExperimentDTO(
feature_slug=feature_slug,
name=params.get("name"),
)
# Call the API endpoint
err, result = self._api.invoke(CreateExperimentEndpoint, dto)
if err is None:
return None, Experiment(result.experiment)
return err, None
def _create_experiment(
self,
feature_slug: str,
params: ExperimentParams
) -> Tuple[Optional[Exception], Optional[Experiment]]:
"""
Internal implementation for creating an experiment.
Args:
feature_slug (str): The feature slug for the experiment.
params (ExperimentParams): Parameters for the experiment.
Returns:
Tuple[Optional[Exception], Optional[Experiment]]: A tuple containing an optional error and an optional Experiment instance.
"""
dto = CreateExperimentDTO(
feature_slug=feature_slug,
name=params.name,
)
# Call the API endpoint
err, result = self._api.invoke(CreateExperimentEndpoint, dto)
if err is None:
return None, Experiment(result.experiment)
return err, None
🤖 Prompt for AI Agents
In basalt/sdk/monitorsdk.py around lines 99 to 125, the method
_create_experiment has a docstring that incorrectly states the return type as
Experiment instead of a tuple of Optional[Exception] and Optional[Experiment].
Also, the code incorrectly uses params.get("name") assuming ExperimentParams is
a dictionary, but it is a dataclass. Fix this by updating the docstring to
reflect the correct return type and access the name attribute directly from
params using params.name instead of params.get("name").

@MarquisG MarquisG merged commit 60864e8 into master May 19, 2025
1 of 7 checks passed
@MarquisG MarquisG deleted the feature/experiment branch May 19, 2025 06:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant