Underthesea is:
🌊 An Agentic AI Toolkit. Since v9.3.0, Underthesea is an open-source Agentic AI Toolkit with built-in Vietnamese NLP capabilities. It provides multi-provider AI Agent support and a suite of Python modules for Vietnamese Natural Language Processing.
🎁 Support Us! Every bit of support helps us achieve our goals. Thank you so much. 💝💝💝
```bash
$ pip install underthesea
```

Multi-provider AI Agent with zero external dependencies. Communicates with LLM APIs using only the Python stdlib (`urllib` + `json`); no `openai`, `anthropic`, or `google-genai` packages required.

Providers: OpenAI | Azure OpenAI | Anthropic Claude | Google Gemini
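The zero-dependency claim comes down to building the HTTP request by hand. Below is a minimal sketch of what a stdlib-only chat call can look like; the endpoint and payload shape follow the public OpenAI chat-completions format, and the helper names (`build_chat_request`, `chat_stdlib`) are illustrative, not underthesea's actual internals:

```python
import json
import os
import urllib.request

API_URL = "https://api.openai.com/v1/chat/completions"  # public OpenAI endpoint

def build_chat_request(messages, api_key, model="gpt-4.1-mini"):
    """Build a urllib Request for an OpenAI-style chat call, stdlib only."""
    payload = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )

def chat_stdlib(messages, model="gpt-4.1-mini"):
    """Send the request and pull the assistant text out of the JSON reply."""
    req = build_chat_request(messages, os.environ["OPENAI_API_KEY"], model)
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

No third-party client is involved: the request body, auth header, and response parsing are all plain `json` and `urllib`.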
```bash
# Pick one provider:
$ export OPENAI_API_KEY=sk-...
# or Azure:
$ export AZURE_OPENAI_API_KEY=... && export AZURE_OPENAI_ENDPOINT=https://...
# or Anthropic:
$ export ANTHROPIC_API_KEY=sk-ant-...
# or Gemini:
$ export GOOGLE_API_KEY=...
```

```python
from underthesea.agent import Agent, LLM

agent = Agent(name="assistant", provider=LLM())
agent("Hello!")
```

Each provider is its own class, following the Anthropic SDK pattern.
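The one-class-per-provider pattern can be pictured with a toy base class (class and method names here are illustrative, not underthesea's real internals): each subclass owns its endpoint and auth-header shape while the shared request/response flow lives in one place.

```python
from dataclasses import dataclass

@dataclass
class Provider:
    """Toy base class: subclasses only say where to send requests and how to auth."""
    api_key: str

    def endpoint(self) -> str:
        raise NotImplementedError

    def headers(self) -> dict:
        raise NotImplementedError

@dataclass
class ToyOpenAI(Provider):
    def endpoint(self) -> str:
        return "https://api.openai.com/v1/chat/completions"

    def headers(self) -> dict:
        return {"Authorization": f"Bearer {self.api_key}"}

@dataclass
class ToyAnthropic(Provider):
    def endpoint(self) -> str:
        return "https://api.anthropic.com/v1/messages"

    def headers(self) -> dict:
        # Anthropic's API uses an x-api-key header rather than a Bearer token.
        return {"x-api-key": self.api_key, "anthropic-version": "2023-06-01"}
```

Swapping providers then means swapping one object, which is why the `Agent(...)` calls below differ only in their `provider=` argument.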
```python
from underthesea.agent import Agent, OpenAI, AzureOpenAI, Anthropic, Gemini, LLM

# OpenAI
agent = Agent(name="bot", provider=OpenAI(api_key="sk-..."))

# Azure OpenAI
agent = Agent(name="bot", provider=AzureOpenAI(
    api_key="...",
    endpoint="https://my.openai.azure.com",
    deployment="gpt-4",
))

# Anthropic Claude
agent = Agent(name="bot", provider=Anthropic(api_key="sk-ant-..."))

# Google Gemini
agent = Agent(name="bot", provider=Gemini(api_key="..."))

# Auto-detect from environment variables
agent = Agent(name="bot", provider=LLM())
```

```python
for chunk in agent.stream("Explain what an AI agent is"):
    print(chunk, end="", flush=True)
```

```python
from underthesea.agent import Agent, Tool, OpenAI

def get_weather(location: str) -> dict:
    """Get current weather for a location."""
    return {"location": location, "temp": 25, "condition": "sunny"}

agent = Agent(
    name="assistant",
    provider=OpenAI(),
    tools=[Tool(get_weather)],
    instruction="You are a helpful assistant.",
)
agent("What's the weather in Hanoi?")
# 'The weather in Hanoi is 25°C and sunny.'
```

12 built-in tools: calculator, datetime, web search, wikipedia, file I/O, shell, python exec.
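Wrapping a plain function with `Tool(get_weather)` implies deriving a tool schema from the function's signature and docstring. A rough sketch of that idea using `inspect` (a hypothetical helper, not underthesea's actual implementation):

```python
import inspect

def function_to_schema(fn):
    """Derive an OpenAI-style tool schema from a Python function.
    Rough sketch: real implementations also handle defaults, enums, etc."""
    type_map = {str: "string", int: "integer", float: "number", bool: "boolean"}
    props = {}
    for name, param in inspect.signature(fn).parameters.items():
        # Fall back to "string" for unannotated or unmapped parameter types.
        annotation = param.annotation
        props[name] = {"type": type_map.get(annotation, "string")}
    return {
        "type": "function",
        "function": {
            "name": fn.__name__,
            "description": inspect.getdoc(fn) or "",
            "parameters": {
                "type": "object",
                "properties": props,
                "required": list(props),
            },
        },
    }
```

The resulting dict is what the agent would hand to the LLM so it knows the tool's name, purpose, and argument types.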
```python
from underthesea.agent import Agent, default_tools, LLM

agent = Agent(name="assistant", provider=LLM(), tools=default_tools)
agent("Calculate sqrt(144) + 10")
```

Long-running agents with context reset and structured handoff between sessions, following Anthropic harness patterns.
```python
from underthesea.agent import Agent, Session, AzureOpenAI

agent = Agent(name="researcher", provider=AzureOpenAI(...))
session = Session(agent, progress_file="progress.json")
session.create_task("Analyze documents", [
    "Read and classify documents",
    "Summarize each group",
    "Write final report",
])
session.run_until_complete(max_sessions=5)
```

Every agent call is automatically traced to `~/.underthesea/traces/`. Disable with `UNDERTHESEA_TRACE_DISABLED=1`.
```python
from underthesea.agent import Agent, LangfuseTracer, calculator_tool

# Auto local trace (default), zero config
agent = Agent(name="bot", tools=[calculator_tool])
agent("What is 2+2?")
# >> Trace [a1b2c3] bot
# |-- Generation: llm.chat #1 (gpt-4.1-mini) ... 1200ms | 100->18 tokens
# |-- Tool: tool.calculator ... 0ms
# |-- Generation: llm.chat #2 (gpt-4.1-mini) ... 800ms | 150->12 tokens
# << Trace [a1b2c3] [ok] 2000ms -> ~/.underthesea/traces/20260411_trace_a1b2c3.json

# Langfuse (pip install langfuse)
agent = Agent(name="bot", tools=[calculator_tool], tracer=LangfuseTracer())

# @trace decorator: nested functions become child spans
from underthesea.agent.trace import trace, LocalTracer

@trace(LocalTracer())
def pipeline(text):
    return Agent(name="bot")(text)  # auto-inherits trace context
```

```
underthesea.agent
├── providers/
│   ├── OpenAI          # api.openai.com
│   ├── AzureOpenAI     # *.openai.azure.com
│   ├── Anthropic       # api.anthropic.com
│   └── Gemini          # generativelanguage.googleapis.com
├── trace/
│   ├── LocalTracer     # JSON files to ~/.underthesea/traces/
│   ├── LangfuseTracer  # Langfuse v4 observability
│   └── @trace          # Decorator with auto-nesting
├── Agent               # Tool-calling loop + streaming
├── LLM                 # Auto-detect provider from env vars
├── Session             # Multi-session orchestration
├── Tool                # Function → tool wrapper
└── default_tools       # 12 built-in tools
```
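The `LLM` auto-detect entry can be pictured as a simple scan of the environment variables shown earlier. The logic below is illustrative (the real resolution order may differ):

```python
import os

def detect_provider(env=None):
    """Pick a provider name from whichever API key is set.
    Illustrative sketch, not underthesea's actual resolution order."""
    env = os.environ if env is None else env
    if "OPENAI_API_KEY" in env:
        return "openai"
    if "AZURE_OPENAI_API_KEY" in env and "AZURE_OPENAI_ENDPOINT" in env:
        return "azure"
    if "ANTHROPIC_API_KEY" in env:
        return "anthropic"
    if "GOOGLE_API_KEY" in env:
        return "gemini"
    raise RuntimeError("No provider API key found in the environment")
```

This is why `Agent(name="bot", provider=LLM())` works without any arguments once one of the `export` lines above has run.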
See full documentation at NLP.md.
| Pipeline | Usage |
|---|---|
| Sentence Segmentation | `sent_tokenize(text)` |
| Text Normalization | `text_normalize(text)` |
| Word Segmentation | `word_tokenize(text)` |
| POS Tagging | `pos_tag(text)` |
| Chunking | `chunk(text)` |
| Named Entity Recognition | `ner(text)` |
| Text Classification | `classify(text)` |
| Sentiment Analysis | `sentiment(text)` |
| Language Detection | `lang_detect(text)` |
| Dependency Parsing | `dependency_parse(text)` |
| Translation | `translate(text)` |
| Text-to-Speech | `tts(text)` |
```python
from underthesea import word_tokenize, ner, sentiment

word_tokenize("Chàng trai 9X Quảng Trị khởi nghiệp từ nấm sò")
# ["Chàng trai", "9X", "Quảng Trị", "khởi nghiệp", "từ", "nấm", "sò"]

ner("Chưa tiết lộ lịch trình tới Việt Nam của Tổng thống Mỹ Donald Trump")
# [... ('Việt Nam', 'Np', 'B-NP', 'B-LOC'), ... ('Donald', 'Np', 'B-NP', 'B-PER'), ('Trump', 'Np', 'B-NP', 'I-PER')]

sentiment("Sản phẩm hơi nhỏ nhưng chất lượng tốt, đóng gói cẩn thận.")
# 'positive'
```

Do you want to contribute to underthesea development? Great! Please read more details in the Contributing Guide.
If you found this project helpful and would like to support our work, you can just buy us a coffee ☕.
Your support is our biggest encouragement 🎁!

