Releases: cuga-project/cuga-agent

v0.2.11

30 Mar 13:33

What's Changed

New Contributors

Full Changelog: v0.2.9...v0.2.11

v0.2.10

26 Feb 20:42

Supervisor & Multi-Agent Orchestration

  • Supervisor SDK — Run multiple CUGA agents. A supervisor coordinates sub-agents over the A2A protocol so you can build multi-agent workflows without custom orchestration.
  • YAML configuration — Define supervisor workflows and sub-agent configs in YAML. No code changes needed to add or adjust agents.
  • SDK support — Use CugaAgent.invoke_supervisor() to run multi-agent flows from Python with a single call.

Manage Experience & Deployment

  • Manage mode — Default experience with a single UI for tools, policies, and configuration.
  • Carbon Chat integration — Chat interface for interacting with the agent.
  • Manage dashboard — Configure MCP tools, policies, and apps from the UI.
  • Helm charts — Deploy CUGA on Kubernetes with Helm (Docker Desktop, minikube, kind, GKE, EKS, AKS).
  • PostgreSQL + pgvector — Optional production storage for policies and embeddings.
  • Manual release workflow — GitHub Action for version bumps and releases with auto-generated notes.

Memory & Storage

  • Memory HTTP backend refactor — Improved memory service architecture.
  • Policy storage — Local and production backends with PostgreSQL support.
  • Filesystem sync for policies — Policies synced to the filesystem for version control and portability.

SDK & API

  • Tool call tracking — InvokeResult return type for tool call metadata.
  • Multi-turn by thread ID — Correct conversation threading in the SDK.
  • Output formatter SDK — Policy output formatter integration in the SDK.
  • Forced apps — Pin specific apps for the agent via configuration.

Improvements & Fixes

  • LangChain 1.0 — Upgrade to LangChain 1.0.
  • Security — Dependency updates (e.g. axios CVE-2026-25639).
  • Windows — UTF-8 encoding fixes in read_yaml_file().
  • Crypto fallback — More robust crypto handling.
  • Demo UX — Opens with ?mode=advanced, improved first-time CRM demo.
  • Policy DB reset — Fixed policy database reset behavior.

DevOps & CI

  • Deploy image workflow — Build and push UBI-based Docker images to IBM Cloud Container Registry.
  • Test workflow — Removed parallel tests from GitHub Actions.

Full Changelog: v0.2.9...release/v0.2.10

v0.2.7

12 Jan 13:04

🎉 Major Features

🔐 Enterprise Policy System

CUGA now includes a comprehensive policy framework that enables enterprise-grade governance, safety, and compliance controls for AI agents. The policy system provides declarative, configurable policies that can guide, block, modify, or format agent behavior based on various triggers and conditions.

Policy Types

| Policy Type      | Purpose                          | Enterprise Value                                                      |
|------------------|----------------------------------|-----------------------------------------------------------------------|
| Intent Guard     | Block unauthorized actions       | Data deletion prevention, access restrictions, compliance enforcement |
| Playbook         | Standardize workflows            | Onboarding, audit workflows, regulatory compliance                    |
| Tool Approval    | Human oversight                  | Financial transactions, data modifications                            |
| Tool Guide       | Domain knowledge                 | Compliance notes, domain context                                      |
| Output Formatter | Format, redirect, govern outputs | Report generation, response routing, output masking                   |

Key Capabilities

  • Multiple Trigger Types: Support for keyword, natural language (semantic), app, state, tool, and always triggers
  • Intelligent Matching: Uses embeddings and LLM-based conflict resolution for semantic policy matching
  • Priority System: Handles conflicts when multiple policies match, with Intent Guards having highest priority
  • Vector Storage: Uses Milvus for efficient policy retrieval and semantic search
  • Human-in-the-Loop: Tool Approval policies create interrupts for human approval before sensitive operations
  • Context-Aware: Uses PolicyContext to extract relevant information from agent state for policy evaluation
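
The priority rule in the list above can be illustrated with a small, self-contained sketch. This is not CUGA's implementation: the Policy dataclass, the select_policy function, and the rule that an Intent Guard outranks every other policy kind regardless of its numeric priority are assumptions made for illustration, and trigger matching is reduced to case-insensitive keyword containment.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Policy:
    """Hypothetical stand-in for a policy record (not CUGA's data model)."""

    name: str
    kind: str  # e.g. "intent_guard" or "playbook"
    priority: int
    keywords: list[str] = field(default_factory=list)

    def matches(self, text: str) -> bool:
        # Simplified keyword trigger: true if any keyword appears in the text.
        lowered = text.lower()
        return any(k.lower() in lowered for k in self.keywords)


def select_policy(policies: list[Policy], text: str) -> Policy | None:
    """Return the winning policy among those that match, or None."""
    matched = [p for p in policies if p.matches(text)]
    if not matched:
        return None
    # Assumption: intent guards outrank other kinds; the numeric
    # priority only breaks ties between policies of the same rank.
    return max(matched, key=lambda p: (p.kind == "intent_guard", p.priority))


policies = [
    Policy("Budget Analysis Workflow", "playbook", 50, ["budget"]),
    Policy("Block Delete Operations", "intent_guard", 100, ["delete", "remove", "erase"]),
]

chosen = select_policy(policies, "Please delete last year's budget")
print(chosen.name)  # Block Delete Operations
```

Both policies match the request, and the guard wins, so a blocking policy is never overridden by a guidance policy in this sketch.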

Integration Points

Policies are checked at three points in the CugaLite graph execution flow:

  1. prepare_tools_and_apps Node: Checks for Intent Guards, Playbooks, and Tool Guides before tool preparation
  2. call_model Node: Checks for Tool Approval requirements after code generation
  3. CugaLiteCallback Node: Applies Output Formatters to final responses
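
The ordering of these three checkpoints can be sketched in plain Python. The run_turn function, its parameters, and the hard-coded tool names are hypothetical and do not correspond to CUGA's nodes; in particular, the approve callback stands in for the human-approval interrupt, and the tool choice is a placeholder for real code generation.

```python
from typing import Callable


def run_turn(
    user_text: str,
    blocked_keywords: list[str],
    tools_needing_approval: set[str],
    approve: Callable[[str], bool],
    format_output: Callable[[str], str],
) -> str:
    """Walk one request through guard, approval, and formatting checkpoints."""
    lowered = user_text.lower()

    # Checkpoint 1 (before tool preparation): intent guard.
    if any(k in lowered for k in blocked_keywords):
        return "Request blocked by policy."

    # Placeholder for the model step; a real agent would generate code here.
    planned_tool = "transfer_funds" if "transfer" in lowered else "lookup"

    # Checkpoint 2 (after code generation): tool approval.
    if planned_tool in tools_needing_approval and not approve(planned_tool):
        return "Action denied by reviewer."

    # Checkpoint 3 (final response): output formatter.
    return format_output(f"Ran {planned_tool}")


result = run_turn(
    "transfer 100 to Bob",
    blocked_keywords=["delete"],
    tools_needing_approval={"transfer_funds"},
    approve=lambda tool: False,  # the reviewer rejects the call
    format_output=str.upper,
)
print(result)  # Action denied by reviewer.
```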

SDK Support

Policies can be configured programmatically via the Python SDK:

import asyncio

from cuga import CugaAgent


async def main() -> None:
    # Replace the placeholder with your own tool objects.
    agent = CugaAgent(tools=[...])

    # Add an Intent Guard (keyword-triggered)
    await agent.policies.add_intent_guard(
        name="Block Delete Operations",
        description="Prevents deletion of critical data",
        keywords=["delete", "remove", "erase"],
        response="Deletion operations are not permitted for security reasons.",
        priority=100,
    )

    # Add a Playbook (natural-language-triggered)
    await agent.policies.add_playbook(
        name="Budget Analysis Workflow",
        description="Multi-step process for analyzing financial budgets",
        natural_language_trigger=["When user asks to analyze their budget"],
        content="""# Budget Analysis Workflow
        ## Step 1: Calculate Total Expenses
        ...
        """,
        priority=50,
    )


# The policy methods are coroutines, so they must run inside an event loop.
asyncio.run(main())

Documentation

  • Policy System Guide: Comprehensive HTML documentation with interactive diagrams explaining policy matching, enactment, and integration
  • SDK Documentation: SDK Guide | Policies Guide

☁️ E2B Cloud Sandbox Integration

CUGA now supports E2B for cloud-based code execution in secure, ephemeral sandboxes. This provides better isolation than local execution while being faster than Docker/Podman containers.

Key Features

  • Cloud-Native Execution: Execute code in secure, isolated cloud sandboxes without requiring Docker/Podman
  • Flexible Sandbox Modes: Three modes for different use cases:
    • per-session (default): One sandbox per conversation thread, cached for reuse
    • single: Single shared sandbox across all threads (most cost-effective)
    • per-call: New sandbox for each execution (most isolated, highest cost)
  • Automatic Caching: Per-session sandboxes are cached and reused, optimizing costs
  • TTL Management: Configurable idle timeout and max age for sandbox lifecycle management
  • Tool Integration: E2B sandboxes can call back to local API registry via ngrok tunneling

Benefits

  • ✅ No Docker/Podman required
  • ✅ Faster than container-based sandboxing
  • ✅ Cloud-native with automatic scaling
  • ✅ Better isolation than local execution
  • ✅ Supports per-session caching for cost optimization

Configuration

Enable E2B in ./src/cuga/settings.toml:

[advanced_features]
e2b_sandbox = true
e2b_sandbox_mode = "per-session"  # Options: "per-session" | "single" | "per-call"
e2b_sandbox_idle_ttl = 600  # Idle timeout in seconds (default: 10 min)
e2b_sandbox_max_age = 86400  # Max age for "single" mode in seconds (24h)
e2b_sandbox_ttl_buffer = 60  # Safety buffer in seconds before E2B timeout
e2b_cleanup_on_create = true  # Run periodic cleanup when creating new sandboxes
e2b_cleanup_frequency = 0  # Check all sandboxes every N get_or_create calls

Setup Requirements

  1. E2B API Key: Sign up at e2b.dev and create an API key
  2. E2B Template: Create the cuga-langchain template using E2B CLI
  3. Registry Exposure: Expose local API registry via ngrok for tool execution
  4. Dependencies: Install with uv sync --group e2b

Usage

Once configured, E2B automatically executes code in cloud sandboxes. You'll see logs indicating "CODE SENT TO E2B SANDBOX" when E2B is active.

Note: E2B is a paid service with a free tier. Check e2b.dev/pricing for details.


🔧 Configuration Updates

Policy System Configuration

Enable/disable the policy system in ./src/cuga/settings.toml:

[policy]
enabled = true  # Enable/disable policy system
playbook_refine = false  # Enable playbook refinement based on user progress

E2B Configuration

New settings in ./src/cuga/settings.toml:

[advanced_features]
e2b_sandbox = false  # Enable E2B cloud sandbox
e2b_sandbox_mode = "per-session"  # Sandbox lifecycle mode
e2b_sandbox_idle_ttl = 600  # Idle timeout in seconds
e2b_sandbox_max_age = 86400  # Max age for single mode
e2b_sandbox_ttl_buffer = 60  # Safety buffer before timeout
e2b_cleanup_on_create = true  # Enable cleanup on sandbox creation
e2b_cleanup_frequency = 0  # Cleanup frequency (0 = only on create)

[server_ports]
function_call_host = ""  # ngrok URL for exposing registry to E2B

📚 Documentation

Updated Documentation

  • README.md: Updated with Policy System and E2B sections
  • SDK Documentation: Enhanced with policy management examples
  • Configuration Guides: Added E2B setup instructions

🧪 Testing

Policy System Tests

Comprehensive test suite in src/cuga/backend/cuga_graph/policy/tests/:

  • Intent Guard: Blocking behavior, priority resolution, multiple guard scenarios
  • Playbook: Guidance injection, plan refinement, workflow execution
  • Tool Approval: Human-in-the-loop approval flows (approve/deny)
  • Tool Guide: Context enhancement and metadata injection
  • Output Formatter: Response formatting and routing
  • NL Trigger Conflict Resolution: Embedding-based similarity search with LLM conflict resolution
  • Embedding Similarity: Vector search, policy matching, threshold validation
  • Keyword Operators: AND/OR logic, case sensitivity, multi-keyword matching
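
As an illustration of what the keyword operator tests exercise, here is a minimal sketch of AND/OR matching with optional case sensitivity. The keyword_match function and its parameters are hypothetical; the actual matcher in CUGA may tokenize or normalize text differently.

```python
def keyword_match(
    text: str,
    keywords: list[str],
    operator: str = "OR",
    case_sensitive: bool = False,
) -> bool:
    """Return True if the text satisfies the keyword trigger."""
    if not keywords:
        return False  # an empty trigger never fires
    if not case_sensitive:
        text = text.lower()
        keywords = [k.lower() for k in keywords]
    hits = [k in text for k in keywords]  # plain substring containment
    if operator == "AND":
        return all(hits)
    if operator == "OR":
        return any(hits)
    raise ValueError(f"unknown operator: {operator}")


print(keyword_match("Delete the old invoices", ["delete", "invoice"], "AND"))  # True
print(keyword_match("Delete the old invoices", ["delete"], case_sensitive=True))  # False
```

Substring containment is the simplest choice, but it also fires on words such as "undelete"; a word-boundary match would avoid that at the cost of extra tokenization.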

E2B Integration Tests

  • Direct E2B Tests: src/cuga/backend/cuga_graph/nodes/cuga_lite/executors/tests/test_e2b_direct.py
  • E2B Lite Tests: src/cuga/backend/cuga_graph/nodes/cuga_lite/executors/tests/test_e2b_lite.py
  • Code Executor Tests: Local sandbox and E2B execution scenarios

🚀 Migration Guide

Enabling Policies

  1. Ensure the policy system is enabled in settings.toml:

    [policy]
    enabled = true

  2. Policies are checked automatically during agent execution; no code changes are required.

  3. To add policies programmatically, use the SDK:

    await agent.policies.add_intent_guard(...)

Enabling E2B

  1. Install E2B dependencies:

    uv sync --group e2b

  2. Set up the E2B API key and template (see README.md for detailed instructions).

  3. Configure an ngrok tunnel for registry exposure.

  4. Enable E2B in settings.toml:

    [advanced_features]
    e2b_sandbox = true

📊 Performance & Benchmarks

Policy System

  • Matching Performance: Keyword triggers provide fast exact matching, while NL triggers use vector search for semantic understanding
  • Storage: Milvus vector database enables efficient policy retrieval and similarity search
  • Conflict Resolution: LLM-based conflict resolution ensures accurate policy selection when multiple policies match
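
The natural-language matching step can be approximated by a similarity search with a cutoff. The sketch below uses a linear scan over toy two-dimensional vectors in place of Milvus and a real embedding model, and it omits the LLM conflict-resolution step entirely. The function names and the 0.8 threshold are assumptions for illustration.

```python
from __future__ import annotations

import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def best_match(
    query: list[float],
    policy_vectors: dict[str, list[float]],
    threshold: float = 0.8,  # hypothetical cutoff
) -> str | None:
    """Return the most similar policy name, or None if none reaches the threshold."""
    best_name, best_score = None, threshold
    for name, vector in policy_vectors.items():
        score = cosine_similarity(query, vector)
        if score >= best_score:
            best_name, best_score = name, score
    return best_name


# Toy embeddings; real ones would come from an embedding model.
vectors = {"budget_playbook": [1.0, 0.1], "onboarding_playbook": [0.0, 1.0]}
print(best_match([1.0, 0.0], vectors))  # budget_playbook
print(best_match([0.5, 0.5], vectors))  # None
```

The second query falls between the two policies and clears the threshold for neither, which is the case where a stricter cutoff avoids triggering an unrelated playbook.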

E2B Integration

  • Execution Speed: Faster than Docker/Podman containers while providing better isolation
  • Cost Optimization: Per-session caching reduces sandbox creation costs
  • Scalability: Cloud-native architecture with automatic scaling

🙏 Acknowledgments

Special thanks to the CUGA community for feedback, testing, and contributions that made these features possible.


📝 Notes

  • Policy system requires Milvus for vector storage (supports both Milvus server and Milvus Lite)
  • E2B requires an active API key and ngrok for registry exposure
  • Both features are optional and can be enabled/disabled via configuration

For detailed setup instructions, see the README.md and SDK Documentation.

v0.2.0

11 Dec 13:46

🚀 CUGA v0.2.0 is Here!

We're excited to announce CUGA v0.2.0, packed with powerful new features and improvements!

🎯 Release Summary

This release brings significant improvements to CUGA's UI/UX, sandbox capabilities, evaluation framework, and LLM provider support. Key highlights include E2B sandbox integration for enhanced code execution isolation, OpenRouter API support, improved concurrent request handling, and comprehensive UI enhancements.


Major Features

🔧 Sandbox & Execution

  • E2B sandbox integration (#157) - Added secure cloud-based code execution environment with E2B Code Interpreter support
  • Dockerized CUGA with CRM demo (#147) - Complete containerization support for easier deployment and testing

🤖 LLM Support

  • OpenRouter API support - New LLM provider integration expanding model accessibility

🎨 UI/UX Improvements

  • UI landing page & proxy call for registry (#158) - New landing page interface and improved registry proxy handling
  • Improved UI & stabilized agent (#160) - Enhanced user interface with better stability
  • Debug panel & thread handling fixes (#154) - Added debugging capabilities and fixed thread management issues
  • Filesystem autocomplete - Improved file path input with autocomplete functionality

📊 Evaluation & Monitoring

  • Cost metric tracking - Added cost monitoring to evaluation tracker
  • Langfuse metrics integration (#151) - Enhanced evaluation with Langfuse observability metrics

🧠 Memory System

  • Memory demo use case - Added demonstration of memory-enabled CUGA capabilities

🐛 Bug Fixes

  • Fixed UI and proxy bugs
  • Resolved concurrent request handling issues in server
  • Fixed stop handling with concurrent users (#150)
  • Fixed variable manager migration to agent state (#145)
  • Resolved evaluation import statement issues (#152)
  • Fixed browser prompts to remove domain-specific references
  • Fixed auth manager to use correct port from settings (#141)

📚 Documentation

  • Updated README with latest features and examples (#156)
  • Improved documentation for setup and configuration

🔄 Improvements

  • Concurrent request handling (#148) - Enhanced support for multiple simultaneous users
  • Package version updates (#142) - Updated dependencies to latest stable versions
  • Better thread handling and debugging capabilities
  • Improved proxy call mechanisms for tool registry

🛠️ Technical Details

New Dependencies:

  • e2b-code-interpreter>=2.4.1 - For E2B sandbox integration

Contributors:
Special thanks to @samimarreed, @haroldship, @idoLevy, @offerakrabi, @Gaodanfang, @aviyaeli and @segevshlomov for their contributions to this release.


📦 Installation & Upgrade

# Upgrade to latest version
uv pip install --upgrade cuga

# Or install with specific extras
uv sync --group sandbox  # For sandbox support
uv sync --group memory   # For memory features

🔗 Resources


v0.1.11

11 Dec 13:12

Full Changelog: v0.1.10...v0.1.11

v0.1.10

22 Nov 18:37

Full Changelog: v0.1.9...v0.1.10

v0.1.7

14 Nov 13:43

Full Changelog: v0.1.6...v0.1.7

v0.1.4

11 Nov 15:16

Full Changelog: v0.1.3...v0.1.4

v0.1.3

04 Nov 17:26

Full Changelog: v0.1.1...v0.1.3

v0.1.1

04 Nov 17:25
