Releases · cuga-project/cuga-agent

30 Mar 13:33

sami-marreed

v0.2.11

9272848

v0.2.11 Latest


What's Changed

fix: crypto fallback by @sami-marreed in #35
feat: manage experience as default, manual release, license fix, and deployment updates by @sami-marreed in #37
feat: add manual release workflow and LiteLLM connection by @sami-marreed in #40
feat: add deploy image workflow for IBM Cloud by @sami-marreed in #41
chore: release v0.2.10 by @github-actions[bot] in #42
feat: add optional OIDC/BFF authentication with user profile UI by @sami-marreed in #44
feat: add Vault and multi-backend secrets integration by @sami-marreed in #48
feat: add ui config for logo and improve login [skip ci] by @sami-marreed in #54
fix: Pydantic model support in code execution and variable handling by @sami-marreed in #56
Feat/openlit integration option by @offerakrabi in #52
fix: fix dependency vulnerabilities by @offerakrabi in #55
feat: enable authorization by @sami-marreed in #60
feat: add agent name customization and demo mode by @sami-marreed in #63
fix: address Dependabot high and moderate security issues by @sami-marreed in #64
fix: optimize image size by @sami-marreed in #68
feat: add deployment options for vault [skip ci] by @sami-marreed in #79
feat: support k8s auth method for vaults by @sami-marreed in #80
feat: improve authorization workflow by @sami-marreed in #92
feat(dev-process): add AI agent commands and streamline GitHub templates by @sami-marreed in #94
Fix that enables correct evaluation for appworld with policies by @Sergey-Zeltyn in #98
feat: add versioned image tagging to deploy-image-ghcr workflow by @sami-marreed in #106

New Contributors

@github-actions[bot] made their first contribution in #42
@offerakrabi made their first contribution in #52
@Sergey-Zeltyn made their first contribution in #98

Full Changelog: v0.2.9...v0.2.11

Contributors

Sergey-Zeltyn, offerakrabi, and sami-marreed


26 Feb 20:42

sami-marreed

v0.2.10

16fd08a

v0.2.10

Supervisor & Multi-Agent Orchestration

Supervisor SDK — Run multiple CUGA agents. A supervisor coordinates sub-agents over the A2A protocol so you can build multi-agent workflows without custom orchestration.
YAML configuration — Define supervisor workflows and sub-agent configs in YAML. No code changes needed to add or adjust agents.
SDK support — Use CugaAgent.invoke_supervisor() to run multi-agent flows from Python with a single call.

Manage Experience & Deployment

Manage mode — Default experience with a single UI for tools, policies, and configuration.
Carbon Chat integration — Chat interface for interacting with the agent.
Manage dashboard — Configure MCP tools, policies, and apps from the UI.
Helm charts — Deploy CUGA on Kubernetes with Helm (Docker Desktop, minikube, kind, GKE, EKS, AKS).
PostgreSQL + pgvector — Optional production storage for policies and embeddings.
Manual release workflow — GitHub Action for version bumps and releases with auto-generated notes.

Memory & Storage

Memory HTTP backend refactor — Improved memory service architecture.
Policy storage — Local and production backends with PostgreSQL support.
Filesystem sync for policies — Policies synced to the filesystem for version control and portability.

SDK & API

Tool call tracking — InvokeResult return type for tool call metadata.
Multi-turn by thread ID — Correct conversation threading in the SDK.
Output formatter SDK — Policy output formatter integration in the SDK.
Forced apps — Pin specific apps for the agent via configuration.

Improvements & Fixes

LangChain 1.0 — Upgrade to LangChain 1.0.
Security — Dependency updates (e.g. axios CVE-2026-25639).
Windows — UTF-8 encoding fixes in read_yaml_file().
Crypto fallback — More robust crypto handling.
Demo UX — Opens with ?mode=advanced, improved first-time CRM demo.
Policy DB reset — Fixed policy database reset behavior.

DevOps & CI

Deploy image workflow — Build and push UBI-based Docker images to IBM Cloud Container Registry.
Test workflow — Removed parallel tests from GitHub Actions.

Full Changelog: v0.2.9...release/v0.2.10


12 Jan 13:04

sami-marreed

v0.2.7

28bc971

v0.2.7

🎉 Major Features

🔐 Enterprise Policy System

CUGA now includes a comprehensive policy framework that enables enterprise-grade governance, safety, and compliance controls for AI agents. The policy system provides declarative, configurable policies that can guide, block, modify, or format agent behavior based on various triggers and conditions.

Policy Types

| Policy Type | Purpose | Enterprise Value |
| --- | --- | --- |
| Intent Guard | Block unauthorized actions | Data deletion prevention, access restrictions, compliance enforcement |
| Playbook | Standardize workflows | Onboarding, audit workflows, regulatory compliance |
| Tool Approval | Human oversight | Financial transactions, data modifications |
| Tool Guide | Domain knowledge | Compliance notes, domain context |
| Output Formatter | Format, redirect, govern outputs | Report generation, response routing, output masking |

Key Capabilities

Multiple Trigger Types: Support for keyword, natural language (semantic), app, state, tool, and always triggers
Intelligent Matching: Uses embeddings and LLM-based conflict resolution for semantic policy matching
Priority System: Handles conflicts when multiple policies match, with Intent Guards having the highest priority
Vector Storage: Uses Milvus for efficient policy retrieval and semantic search
Human-in-the-Loop: Tool Approval policies create interrupts for human approval before sensitive operations
Context-Aware: Uses PolicyContext to extract relevant information from agent state for policy evaluation

Integration Points

Policies are seamlessly integrated into the CugaLite graph execution flow:

prepare_tools_and_apps Node: Checks for Intent Guards, Playbooks, and Tool Guides before tool preparation
call_model Node: Checks for Tool Approval requirements after code generation
CugaLiteCallback Node: Applies Output Formatters to final responses

SDK Support

Policies can be configured programmatically via the Python SDK:

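from cuga import CugaAgent

# Note: the calls below use `await`, so run them inside an async function
# (for example via asyncio.run) or in an environment with top-level await.
agent = CugaAgent(tools=[...])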

# Add an Intent Guard
await agent.policies.add_intent_guard(
    name="Block Delete Operations",
    description="Prevents deletion of critical data",
    keywords=["delete", "remove", "erase"],
    response="Deletion operations are not permitted for security reasons.",
    priority=100
)

# Add a Playbook
await agent.policies.add_playbook(
    name="Budget Analysis Workflow",
    description="Multi-step process for analyzing financial budgets",
    natural_language_trigger=["When user asks to analyze their budget"],
    content="""# Budget Analysis Workflow
    ## Step 1: Calculate Total Expenses
    ...
    """,
    priority=50
)

Documentation

Policy System Guide: Comprehensive HTML documentation with interactive diagrams explaining policy matching, enactment, and integration
SDK Documentation: SDK Guide | Policies Guide

☁️ E2B Cloud Sandbox Integration

CUGA now supports E2B for cloud-based code execution in secure, ephemeral sandboxes. This provides better isolation than local execution while being faster than Docker/Podman containers.

Key Features

Cloud-Native Execution: Execute code in secure, isolated cloud sandboxes without requiring Docker/Podman
Flexible Sandbox Modes: Three modes for different use cases:
- per-session (default): One sandbox per conversation thread, cached for reuse
- single: Single shared sandbox across all threads (most cost-effective)
- per-call: New sandbox for each execution (most isolated, highest cost)
Automatic Caching: Per-session sandboxes are cached and reused, optimizing costs
TTL Management: Configurable idle timeout and max age for sandbox lifecycle management
Tool Integration: E2B sandboxes can call back to local API registry via ngrok tunneling

Benefits

✅ No Docker/Podman required
✅ Faster than container-based sandboxing
✅ Cloud-native with automatic scaling
✅ Better isolation than local execution
✅ Supports per-session caching for cost optimization

Configuration

Enable E2B in ./src/cuga/settings.toml:

[advanced_features]
e2b_sandbox = true
e2b_sandbox_mode = "per-session"  # Options: "per-session" | "single" | "per-call"
e2b_sandbox_idle_ttl = 600  # Idle timeout in seconds (default: 10 min)
e2b_sandbox_max_age = 86400  # Max age for "single" mode in seconds (24h)
e2b_sandbox_ttl_buffer = 60  # Safety buffer in seconds before E2B timeout
e2b_cleanup_on_create = true  # Run periodic cleanup when creating new sandboxes
e2b_cleanup_frequency = 0  # Check all sandboxes every N get_or_create calls (0 = only on create)

Setup Requirements

E2B API Key: Sign up at e2b.dev and create an API key
E2B Template: Create the cuga-langchain template using E2B CLI
Registry Exposure: Expose local API registry via ngrok for tool execution
Dependencies: Install with uv sync --group e2b

Usage

Once configured, E2B automatically executes code in cloud sandboxes. You'll see logs indicating "CODE SENT TO E2B SANDBOX" when E2B is active.

Note: E2B is a paid service with a free tier. Check e2b.dev/pricing for details.

🔧 Configuration Updates

Policy System Configuration

Enable/disable the policy system in ./src/cuga/settings.toml:

[policy]
enabled = true  # Enable/disable policy system
playbook_refine = false  # Enable playbook refinement based on user progress

E2B Configuration

New settings in ./src/cuga/settings.toml:

[advanced_features]
e2b_sandbox = false  # Enable E2B cloud sandbox
e2b_sandbox_mode = "per-session"  # Sandbox lifecycle mode
e2b_sandbox_idle_ttl = 600  # Idle timeout in seconds
e2b_sandbox_max_age = 86400  # Max age for single mode
e2b_sandbox_ttl_buffer = 60  # Safety buffer before timeout
e2b_cleanup_on_create = true  # Enable cleanup on sandbox creation
e2b_cleanup_frequency = 0  # Cleanup frequency (0 = only on create)

[server_ports]
function_call_host = ""  # ngrok URL for exposing registry to E2B

📚 Documentation

Updated Documentation

README.md: Updated with Policy System and E2B sections
SDK Documentation: Enhanced with policy management examples
Configuration Guides: Added E2B setup instructions

🧪 Testing

Policy System Tests

Comprehensive test suite in src/cuga/backend/cuga_graph/policy/tests/:

Intent Guard: Blocking behavior, priority resolution, multiple guard scenarios
Playbook: Guidance injection, plan refinement, workflow execution
Tool Approval: Human-in-the-loop approval flows (approve/deny)
Tool Guide: Context enhancement and metadata injection
Output Formatter: Response formatting and routing
NL Trigger Conflict Resolution: Embedding-based similarity search with LLM conflict resolution
Embedding Similarity: Vector search, policy matching, threshold validation
Keyword Operators: AND/OR logic, case sensitivity, multi-keyword matching
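
To make the keyword operator cases concrete, here is a minimal sketch of the kind of check such tests might exercise. `keyword_match` is a hypothetical function written for this example, and the semantics (substring matching, an empty keyword list never matching) are assumptions rather than CUGA's documented behavior.

```python
def keyword_match(text: str, keywords: list[str], operator: str = "OR",
                  case_sensitive: bool = False) -> bool:
    if operator not in ("AND", "OR"):
        raise ValueError(f"unsupported operator: {operator}")
    if not keywords:
        return False  # assumption: a trigger with no keywords never fires
    if not case_sensitive:
        text = text.lower()
        keywords = [kw.lower() for kw in keywords]
    hits = (kw in text for kw in keywords)
    return all(hits) if operator == "AND" else any(hits)


def test_keyword_operators() -> None:
    msg = "Please DELETE the archived invoices"
    assert keyword_match(msg, ["delete", "erase"])                      # OR: one hit suffices
    assert not keyword_match(msg, ["delete", "erase"], operator="AND")  # AND: every keyword required
    assert keyword_match(msg, ["delete", "invoices"], operator="AND")
    assert not keyword_match(msg, ["delete"], case_sensitive=True)      # "DELETE" differs from "delete"
    assert not keyword_match(msg, [], operator="AND")                   # empty list never matches


test_keyword_operators()
print("keyword operator checks passed")
```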

E2B Integration Tests

Direct E2B Tests: src/cuga/backend/cuga_graph/nodes/cuga_lite/executors/tests/test_e2b_direct.py
E2B Lite Tests: src/cuga/backend/cuga_graph/nodes/cuga_lite/executors/tests/test_e2b_lite.py
Code Executor Tests: Local sandbox and E2B execution scenarios

🚀 Migration Guide

Enabling Policies

Ensure the policy system is enabled in settings.toml:
```
[policy]
enabled = true
```
Policies are automatically checked during agent execution - no code changes required
To add policies programmatically, use the SDK:
```
await agent.policies.add_intent_guard(...)
```

Enabling E2B

Install E2B dependencies:
```
uv sync --group e2b
```
Set up E2B API key and template (see README.md for detailed instructions)
Configure ngrok tunnel for registry exposure
Enable in settings.toml:
```
[advanced_features]
e2b_sandbox = true
```

📊 Performance & Benchmarks

Policy System

Matching Performance: Keyword triggers provide fast exact matching, while NL triggers use vector search for semantic understanding
Storage: Milvus vector database enables efficient policy retrieval and similarity search
Conflict Resolution: LLM-based conflict resolution ensures accurate policy selection when multiple policies match
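
The semantic side of matching can be sketched as a cosine-similarity search with a cutoff. The vectors below are made-up three-dimensional toys, and `cosine`, `best_match`, and the 0.8 threshold are assumptions for illustration; a real deployment would obtain embeddings from a model and query Milvus rather than a Python dict, and the LLM-based conflict resolution step is omitted here.

```python
import math
from typing import Optional


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def best_match(query_vec: list[float], policy_vecs: dict[str, list[float]],
               threshold: float = 0.8) -> Optional[tuple[str, float]]:
    # Score every stored policy vector and keep the best one, if it clears the threshold.
    scored = [(name, cosine(query_vec, vec)) for name, vec in policy_vecs.items()]
    if not scored:
        return None
    name, score = max(scored, key=lambda item: item[1])
    return (name, score) if score >= threshold else None


# Toy "embeddings"; the numbers carry no real meaning.
policy_vecs = {
    "Budget Analysis Workflow": [0.9, 0.1, 0.0],
    "Block Delete Operations": [0.0, 0.2, 0.9],
}

match = best_match([0.8, 0.2, 0.1], policy_vecs)
print(match[0] if match else "no policy above threshold")
```

Queries that are not close to any stored policy fall below the threshold and return `None`, which is the behavior a threshold test would want to pin down.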

E2B Integration

Execution Speed: Faster than Docker/Podman containers while providing better isolation than local execution
Cost Optimization: Per-session caching reduces sandbox creation costs
Scalability: Cloud-native architecture with automatic scaling

🙏 Acknowledgments

Special thanks to the CUGA community for feedback, testing, and contributions that made these features possible.

📝 Notes

Policy system requires Milvus for vector storage (supports both Milvus server and Milvus Lite)
E2B requires an active API key and ngrok for registry exposure
Both features are optional and can be enabled/disabled via configuration

For detailed setup instructions, see the README.md and SDK Documentation.



11 Dec 13:46

sami-marreed

v0.2.0

bf226a2

v0.2.0

🚀 CUGA v0.2.0 is Here!
We're excited to announce CUGA v0.2.0, packed with powerful new features and improvements!

🎯 Release Summary

This release brings significant improvements to CUGA's UI/UX, sandbox capabilities, evaluation framework, and LLM provider support. Key highlights include E2B sandbox integration for enhanced code execution isolation, OpenRouter API support, improved concurrent request handling, and comprehensive UI enhancements.

✨ Major Features

🔧 Sandbox & Execution

E2B sandbox integration (#157) - Added secure cloud-based code execution environment with E2B Code Interpreter support
Dockerized CUGA with CRM demo (#147) - Complete containerization support for easier deployment and testing

🤖 LLM Support

OpenRouter API support - New LLM provider integration expanding model accessibility

🎨 UI/UX Improvements

UI landing page & proxy call for registry (#158) - New landing page interface and improved registry proxy handling
Improved UI & stabilized agent (#160) - Enhanced user interface with better stability
Debug panel & thread handling fixes (#154) - Added debugging capabilities and fixed thread management issues
Filesystem autocomplete - Improved file path input with autocomplete functionality

📊 Evaluation & Monitoring

Cost metric tracking - Added cost monitoring to evaluation tracker
Langfuse metrics integration (#151) - Enhanced evaluation with Langfuse observability metrics

🧠 Memory System

Memory demo use case - Added demonstration of memory-enabled CUGA capabilities

🐛 Bug Fixes

Fixed UI and proxy bugs
Resolved concurrent request handling issues in server
Fixed stop handling with concurrent users (#150)
Fixed variable manager migration to agent state (#145)
Resolved evaluation import statement issues (#152)
Fixed browser prompts to remove domain-specific references
Fixed auth manager to use correct port from settings (#141)

📚 Documentation

Updated README with latest features and examples (#156)
Improved documentation for setup and configuration

🔄 Improvements

Concurrent request handling (#148) - Enhanced support for multiple simultaneous users
Package version updates (#142) - Updated dependencies to latest stable versions
Better thread handling and debugging capabilities
Improved proxy call mechanisms for tool registry

🛠️ Technical Details

New Dependencies:

e2b-code-interpreter>=2.4.1 - For E2B sandbox integration

Contributors:
Special thanks to @sami-marreed, @haroldship, @idoLevy, @offerakrabi, @Gaodanfang, @aviyaeli and @segevshlomov for their contributions to this release.

📦 Installation & Upgrade

# Upgrade to latest version
uv pip install --upgrade cuga

# Or install with specific extras
uv sync --group sandbox  # For sandbox support
uv sync --group memory   # For memory features