undertheseanlp/underthesea
Open-source Agentic AI Toolkit

Underthesea is:

🌊 An Agentic AI Toolkit. Since v9.3.0, Underthesea is an open-source Agentic AI Toolkit with built-in Vietnamese NLP capabilities. It provides multi-provider AI Agent support and a suite of Python modules for Vietnamese Natural Language Processing.

🎁 Support Us! Every bit of support helps us achieve our goals. Thank you so much. 💝💝💝

Installation

$ pip install underthesea

Agent

Multi-provider AI Agent with zero external dependencies. Communicates with LLM APIs using only Python stdlib (urllib + json) — no openai, anthropic, or google-genai packages required.

Providers: OpenAI | Azure OpenAI | Anthropic Claude | Google Gemini
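The zero-dependency claim can be seen in miniature: a chat completion call is just an HTTP POST built with urllib and json. The sketch below is illustrative, not underthesea's internal code; the endpoint and payload follow OpenAI's public Chat Completions format.

```python
import json
import urllib.request

# Illustrative sketch: calling an LLM API with the Python stdlib only.
# Payload shape follows OpenAI's Chat Completions format; this is not
# underthesea's actual provider implementation.
def build_chat_request(api_key: str, prompt: str) -> urllib.request.Request:
    payload = {
        "model": "gpt-4.1-mini",
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        "https://api.openai.com/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

req = build_chat_request("sk-...", "Hello!")
# urllib.request.urlopen(req) would send it; the JSON response
# is then decoded with json.loads.
```

Because the request is plain stdlib objects, each provider class only has to vary the URL, headers, and payload shape.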

Quick Start

# Pick one provider:
$ export OPENAI_API_KEY=sk-...
# or Azure:
$ export AZURE_OPENAI_API_KEY=... && export AZURE_OPENAI_ENDPOINT=https://...
# or Anthropic:
$ export ANTHROPIC_API_KEY=sk-ant-...
# or Gemini:
$ export GOOGLE_API_KEY=...

from underthesea.agent import Agent, LLM

agent = Agent(name="assistant", provider=LLM())
agent("Hello!")

Providers

Each provider is its own class, following the Anthropic SDK pattern.

from underthesea.agent import Agent, OpenAI, AzureOpenAI, Anthropic, Gemini, LLM

# OpenAI
agent = Agent(name="bot", provider=OpenAI(api_key="sk-..."))

# Azure OpenAI
agent = Agent(name="bot", provider=AzureOpenAI(
    api_key="...",
    endpoint="https://my.openai.azure.com",
    deployment="gpt-4",
))

# Anthropic Claude
agent = Agent(name="bot", provider=Anthropic(api_key="sk-ant-..."))

# Google Gemini
agent = Agent(name="bot", provider=Gemini(api_key="..."))

# Auto-detect from environment variables
agent = Agent(name="bot", provider=LLM())

Streaming

for chunk in agent.stream("Explain what an AI agent is"):
    print(chunk, end="", flush=True)

Tool Calling

from underthesea.agent import Agent, Tool, OpenAI

def get_weather(location: str) -> dict:
    """Get current weather for a location."""
    return {"location": location, "temp": 25, "condition": "sunny"}

agent = Agent(
    name="assistant",
    provider=OpenAI(),
    tools=[Tool(get_weather)],
    instruction="You are a helpful assistant.",
)

agent("What's the weather in Hanoi?")
# 'The weather in Hanoi is 25°C and sunny.'

Default Tools

12 built-in tools: calculator, datetime, web search, wikipedia, file I/O, shell, python exec.

from underthesea.agent import Agent, default_tools, LLM

agent = Agent(name="assistant", provider=LLM(), tools=default_tools)
agent("Calculate sqrt(144) + 10")
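A calculator tool like the built-in one must evaluate expressions such as sqrt(144) + 10 without the risks of raw eval. One safe approach walks the expression's AST with the stdlib ast module (an illustrative sketch, not underthesea's actual implementation):

```python
import ast
import math
import operator

# Illustrative calculator tool: safely evaluates arithmetic by walking
# the parsed AST instead of calling eval(). Hypothetical helper, not
# underthesea's built-in implementation.
_OPS = {
    ast.Add: operator.add, ast.Sub: operator.sub,
    ast.Mult: operator.mul, ast.Div: operator.truediv,
    ast.Pow: operator.pow, ast.USub: operator.neg,
}
_FUNCS = {"sqrt": math.sqrt, "log": math.log, "sin": math.sin}

def calculate(expression: str) -> float:
    """Evaluate a basic arithmetic expression, e.g. 'sqrt(144) + 10'."""
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            return _FUNCS[node.func.id](*[_eval(a) for a in node.args])
        raise ValueError(f"unsupported expression: {expression!r}")
    return _eval(ast.parse(expression, mode="eval"))

print(calculate("sqrt(144) + 10"))  # 22.0
```

Only whitelisted operators and functions are reachable, so a prompt-injected expression cannot escape into arbitrary code.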

Multi-Session

Long-running agents with context reset and structured handoff between sessions, following Anthropic harness patterns.

from underthesea.agent import Agent, Session, AzureOpenAI

agent = Agent(name="researcher", provider=AzureOpenAI(...))
session = Session(agent, progress_file="progress.json")
session.create_task("Analyze documents", [
    "Read and classify documents",
    "Summarize each group",
    "Write final report",
])
session.run_until_complete(max_sessions=5)
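Sessions hand off state through the progress file. Its exact schema is internal to underthesea; the idea, a JSON snapshot of task steps that the next session reads back to resume work, can be sketched as follows (field names are hypothetical):

```python
import json
from pathlib import Path

# Hypothetical progress-file shape: field names are illustrative,
# not underthesea's actual schema.
progress = {
    "task": "Analyze documents",
    "steps": [
        {"name": "Read and classify documents", "done": True},
        {"name": "Summarize each group", "done": False},
        {"name": "Write final report", "done": False},
    ],
}
path = Path("progress.json")
path.write_text(json.dumps(progress, indent=2), encoding="utf-8")

# A later session resumes from the first unfinished step.
state = json.loads(path.read_text(encoding="utf-8"))
next_step = next(s["name"] for s in state["steps"] if not s["done"])
print(next_step)  # Summarize each group
```

Because the handoff is a plain file, a fresh session with an empty context can pick up exactly where the previous one stopped.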

Tracing

Every agent call is automatically traced to ~/.underthesea/traces/. Disable with UNDERTHESEA_TRACE_DISABLED=1.

from underthesea.agent import Agent, LangfuseTracer, calculator_tool

# Auto local trace (default) — zero config
agent = Agent(name="bot", tools=[calculator_tool])
agent("What is 2+2?")
# >> Trace [a1b2c3] bot
#    |-- Generation: llm.chat #1 (gpt-4.1-mini) ... 1200ms | 100->18 tokens
#    |-- Tool: tool.calculator ... 0ms
#    |-- Generation: llm.chat #2 (gpt-4.1-mini) ... 800ms | 150->12 tokens
# << Trace [a1b2c3] [ok] 2000ms -> ~/.underthesea/traces/20260411_trace_a1b2c3.json

# Langfuse (pip install langfuse)
agent = Agent(name="bot", tools=[calculator_tool], tracer=LangfuseTracer())

# @trace decorator — nested functions become child spans
from underthesea.agent.trace import trace, LocalTracer

@trace(LocalTracer())
def pipeline(text):
    return Agent(name="bot")(text)  # auto-inherits trace context

Architecture

underthesea.agent
├── providers/
│   ├── OpenAI          # api.openai.com
│   ├── AzureOpenAI     # *.openai.azure.com
│   ├── Anthropic       # api.anthropic.com
│   └── Gemini          # generativelanguage.googleapis.com
├── trace/
│   ├── LocalTracer     # JSON files to ~/.underthesea/traces/
│   ├── LangfuseTracer  # Langfuse v4 observability
│   └── @trace          # Decorator with auto-nesting
├── Agent               # Tool calling loop + streaming
├── LLM                 # Auto-detect provider from env vars
├── Session             # Multi-session orchestration
├── Tool                # Function → tool wrapper
└── default_tools       # 12 built-in tools
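The Tool wrapper's core idea, deriving a model-readable tool description from a typed, documented Python function, can be sketched with stdlib inspect (an illustrative sketch; underthesea's actual schema generation may differ):

```python
import inspect

# Map Python annotations to JSON-schema types (illustrative subset).
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean", dict: "object"}

def function_to_tool_schema(fn) -> dict:
    """Derive a JSON-schema-style tool description from a function.

    Hypothetical helper: shows the idea behind a function -> tool
    wrapper, not underthesea's actual Tool implementation.
    """
    sig = inspect.signature(fn)
    properties = {
        name: {"type": _JSON_TYPES.get(param.annotation, "string")}
        for name, param in sig.parameters.items()
    }
    return {
        "name": fn.__name__,
        "description": inspect.getdoc(fn) or "",
        "parameters": {
            "type": "object",
            "properties": properties,
            "required": [n for n, p in sig.parameters.items()
                         if p.default is inspect.Parameter.empty],
        },
    }

def get_weather(location: str) -> dict:
    """Get current weather for a location."""
    return {"location": location, "temp": 25, "condition": "sunny"}

schema = function_to_tool_schema(get_weather)
print(schema["name"], schema["parameters"]["properties"])
```

The signature supplies parameter names and types, and the docstring supplies the description, which is why the tool-calling example above needs nothing beyond a plain annotated function.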

Vietnamese NLP

See full documentation at NLP.md.

Pipeline Usage

Sentence Segmentation: sent_tokenize(text)
Text Normalization: text_normalize(text)
Word Segmentation: word_tokenize(text)
POS Tagging: pos_tag(text)
Chunking: chunk(text)
Named Entity Recognition: ner(text)
Text Classification: classify(text)
Sentiment Analysis: sentiment(text)
Language Detection: lang_detect(text)
Dependency Parsing: dependency_parse(text)
Translation: translate(text)
Text-to-Speech: tts(text)

from underthesea import word_tokenize, ner, sentiment

word_tokenize("Chàng trai 9X Quảng Trị khởi nghiệp từ nấm sò")
# ["Chàng trai", "9X", "Quảng Trị", "khởi nghiệp", "từ", "nấm", "sò"]

ner("Chưa tiết lộ lịch trình tới Việt Nam của Tổng thống Mỹ Donald Trump")
# [... ('Việt Nam', 'Np', 'B-NP', 'B-LOC'), ... ('Donald', 'Np', 'B-NP', 'B-PER'), ('Trump', 'Np', 'B-NP', 'I-PER')]

sentiment("Sản phẩm hơi nhỏ nhưng chất lượng tốt, đóng gói cẩn thận.")
# 'positive'

Contributing

Want to contribute to underthesea's development? Great! Please read the Contributing Guide for details.

💝 Support Us

If you find this project helpful and would like to support our work, you can buy us a coffee ☕.

Your support is our biggest encouragement 🎁!
