HinksBot Gateway

██╗  ██╗███████╗███████╗████████╗    ████████╗███████╗██████╗ ███╗   ███╗
██║  ██║██╔════╝██╔════╝╚══██╔══╝    ╚══██╔══╝██╔════╝██╔══██╗████╗ ████║
███████║█████╗  █████╗     ██║          ██║   █████╗  ██████╔╝██╔████╔██║
██╔══██║██╔══╝  ██╔══╝     ██║          ██║   ██╔══╝  ██╔══██╗██║╚██╔╝██║
██║  ██║███████╗███████╗   ██║          ██║   ███████╗██║  ██║██║ ╚═╝ ██║
╚═╝  ╚═╝╚══════╝╚══════╝   ╚═╝          ╚═╝   ╚══════╝╚═╝  ╚═╝╚═╝     ╚═╝

Self-hosted AI gateway platform. Chat with your 122B model, call tools, manage sessions, and build persistent memory — entirely on your own hardware.

Recent Changes

2026-04-19 — Session Manager Fixes:

Session manager now always gets a fresh SQLite session per operation, eliminating "prepared state" errors from transaction nesting
WebSocket chat now auto-creates sessions if they don't exist instead of returning an error

What It Does

HinksBot Gateway is a self-hosted AI platform that turns a raw llama-server endpoint into a full-featured conversational workspace — streaming responses, tool calling, conversation memory, session management, and a polished web UI. Everything runs locally on your hardware.

Architecture

┌──────────────────────────────────────────────────────────────┐
│  Frontend (React + Vite + Tailwind, port 5173)               │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────┐  │
│  │  ChatView    │  │  Sidebar     │  │  SettingsPanel   │  │
│  │  (streaming) │  │  (sessions)  │  │  (model/config)  │  │
│  └──────┬───────┘  └──────┬───────┘  └────────┬─────────┘  │
└─────────┼─────────────────┼────────────────────┼────────────┘
          │  WebSocket + REST API                 │
          ▼                                        ▼
┌──────────────────────────────────────────────────────────────┐
│  Backend (FastAPI + Uvicorn, port 8000)                      │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────┐  │
│  │ /ws/chat/*   │  │ /sessions/*  │  │ /tools/*         │  │
│  │ Agent loop   │  │ CRUD ops     │  │ Registry         │  │
│  └──────┬───────┘  └──────┬───────┘  └────────┬─────────┘  │
│         │                 │                     │             │
│  ┌──────▼─────────────────▼─────────────────────▼─────────┐  │
│  │  SessionManager  │  ToolRegistry  │  LlamaServerClient │  │
│  │  (SQLite)        │  (hot-reload)  │  (HTTP to :8080)   │  │
│  └────────────────────────┬───────────────────────────────┘  │
└───────────────────────────┼──────────────────────────────────┘
                            │ HTTP (port 8080)
                            ▼
                 ┌──────────────────────┐
                 │   llama-server       │
                 │   (Qwen 122B Q4)     │
                 │   Context: 262K      │
                 └──────────────────────┘

Features

Streaming Chat — Real-time token-by-token responses via WebSocket
Tool Calling — Model-driven tool execution with server-side execution and result injection
Session Management — Create, resume, search, rename, and delete conversation sessions
Persistent Memory — SQLite-backed message history; full conversation context on reconnect
Dynamic Tool Registry — Hot-reloadable tools registered via @register_tool decorator
Session Search — Full-text search across all sessions and messages
Session Export — Export conversations as JSON or Markdown
Usage Stats — Track messages, tool calls, and session activity over time
Dark Theme UI — Terminal-aesthetic web interface (Linear meets WezTerm)
Configurable — All settings via config.yaml, no hardcoded values
Context Management — Automatic truncation to fit within context window

Quick Start

Prerequisites

Python 3.11+
Node.js 18+ and npm
A running llama-server instance at http://localhost:8080
~122B Q4_K_XL model or compatible GGUF model file

1. Backend Setup

cd /home/alexanderh/projects/hinksbot-gateway

# Create and activate virtual environment
python3 -m venv .venv
source .venv/bin/activate

# Install dependencies
pip install -r requirements.txt

2. Frontend Setup

cd /home/alexanderh/projects/hinksbot-gateway/frontend
npm install

3. Configuration

Edit config.yaml in the project root:

server:
  host: "0.0.0.0"
  port: 8000
  debug: false

llama_server:
  base_url: "http://localhost:8080"
  api_key: ""          # empty for local llama-server

database:
  path: "~/.hermes/hinksbot-gateway.db"

defaults:
  model: "Qwen3.5-122B-A10B-Opus-Reasoning-Q4_K_M.gguf"
  system_prompt: "You are HinksBot, a helpful AI assistant..."
  max_turns: 20
  context_window: 262144
  tool_groups:
    - file_tools
    - wiki_tools
    - system_tools

frontend:
  port: 5173
  title: "HinksBot Gateway"

4. Start Backend

source .venv/bin/activate
cd /home/alexanderh/projects/hinksbot-gateway
uvicorn backend.main:app --host 0.0.0.0 --port 8000 --reload

Backend starts at http://localhost:8000. API docs at http://localhost:8000/docs.

5. Start Frontend

cd /home/alexanderh/projects/hinksbot-gateway/frontend
npm run dev

Frontend available at http://localhost:5173.

Configuration Reference

All configuration lives in config.yaml at the project root.

Key	Type	Default	Description
`server.host`	string	`"0.0.0.0"`	Backend bind address
`server.port`	int	`8000`	Backend port
`server.debug`	bool	`false`	Enable debug mode (auto-reload)
`llama_server.base_url`	string	`"http://localhost:8080"`	llama-server HTTP endpoint
`llama_server.api_key`	string	`""`	Auth token (empty for local)
`database.path`	string	`"~/.hermes/hinksbot-gateway.db"`	SQLite database path
`defaults.model`	string	(GGUF filename)	Model to use
`defaults.system_prompt`	string	"...HinksBot..."	System prompt injected into every session
`defaults.max_turns`	int	`20`	Max agent loop iterations per message
`defaults.context_window`	int	`262144`	Context window size (tokens)
`defaults.tool_groups`	list[str]	`[]`	Default enabled tool groups
`frontend.port`	int	`5173`	Vite dev server port
`frontend.title`	string	`"HinksBot Gateway"`	Browser tab title

Project Structure

hinksbot-gateway/
├── config.yaml                  # All configuration (no hardcoding)
├── requirements.txt              # Python dependencies
├── backend/
│   ├── main.py                  # FastAPI app + lifespan + CORS
│   ├── config.py                # YAML config loader (Pydantic models)
│   └── api/
│       ├── sessions.py          # Session CRUD + export
│       ├── chat.py              # WebSocket handler + ConnectionManager
│       ├── tools.py             # Tool registry REST API
│       ├── models.py            # Model switching endpoints
│       ├── uploads.py           # File upload handling
│       └── stats.py             # Usage statistics
│   └── core/
│       ├── session_manager.py   # Agent loop (tool calls, history)
│       └── model_client.py     # llama-server HTTP client
│   └── db/
│       ├── database.py          # SQLAlchemy async engine + session
│       ├── models.py            # ORM: Session, Message, ToolExecution
│       └── init_db.py           # Schema creation
│   └── tools/
│       ├── registry.py          # ToolRegistry + @register_tool decorator
│       ├── file_tools.py        # file_read, file_write, search_files
│       ├── terminal_tools.py    # run_shell
│       ├── web_tools.py        # web_search, web_extract
│       ├── wiki_tools.py       # wiki_search, session_history_search
│       └── system_tools.py     # sys_metrics, list_processes
└── frontend/
    ├── package.json
    ├── vite.config.ts
    ├── tailwind.config.js
    └── src/
        ├── App.tsx
        ├── types/index.ts
        ├── hooks/
        │   ├── useChat.ts
        │   ├── useSessions.ts
        │   └── useTools.ts
        └── components/
            ├── ChatView.tsx
            ├── ChatInput.tsx
            ├── MessageBubble.tsx
            ├── ToolCallCard.tsx
            ├── SessionList.tsx
            ├── Sidebar.tsx
            ├── SettingsPanel.tsx
            └── StatsDashboard.tsx

API Reference

Base URL: http://localhost:8000/api/v1

Health Check

GET /api/v1/health
Response: { "status": "ok", "model": "Qwen3.5-122B..." }

REST Endpoints

Sessions

Method	Path	Description
`GET`	`/sessions`	List all sessions (paginated: `?page=1&limit=50`)
`POST`	`/sessions`	Create new session (`{title?, model?}`)
`GET`	`/sessions/{id}`	Get session metadata
`PATCH`	`/sessions/{id}`	Update session (`{title?, model?, tool_options?}`)
`DELETE`	`/sessions/{id}`	Delete session
`GET`	`/sessions/{id}/messages`	Get messages for session (`?page=1&limit=50`)
`GET`	`/sessions/{id}/export`	Export session (`?format=json\|md`)
`GET`	`/sessions/search?q=query`	Full-text session/message search

Tools

Method	Path	Description
`GET`	`/tools`	List all registered tools (`?group=file_tools`)
`GET`	`/tools/groups`	List all tool groups
`GET`	`/tools/{name}`	Get specific tool details
`GET`	`/tools/{name}/schema`	Get OpenAI-compatible tool schema
`POST`	`/tools/{name}/execute`	Execute tool directly (`{arguments}`)

Stats

Method	Path	Description
`GET`	`/stats`	Usage statistics (sessions, messages, tool calls)

WebSocket Protocol

Endpoint: /ws/chat/{session_id}

Connect, then send JSON messages. All messages are JSON objects.

Client -> Server

Send a message:

{
  "type": "user_message",
  "content": "What files changed in the last commit?",
  "tool_options": ["file_tools", "git_tools"]
}

Send tool result (after receiving a tool_call):

{
  "type": "tool_result",
  "tool_call_id": "call_abc123",
  "result": "23 files found...",
  "error": null
}

Server -> Client

Stream start:

{
  "type": "stream_start",
  "message_id": "msg_xyz789"
}

Token delta:

{
  "type": "content_block_delta",
  "delta": "Looking",
  "block_type": "text"
}

Tool call:

{
  "type": "tool_call",
  "tool": "search_files",
  "tool_call_id": "call_abc123",
  "args": {"pattern": "*.py", "path": "/home/alexanderh"}
}

Tool result:

{
  "type": "tool_result",
  "tool_call_id": "call_abc123",
  "result": "23 files found...",
  "success": true
}

Stream end:

{
  "type": "stream_end",
  "message_id": "msg_xyz789",
  "usage": {"turns": 3}
}

Error:

{
  "type": "error",
  "error": "model_unavailable",
  "message": "LLM endpoint not responding"
}

Tool Registry

All tools are registered via the @register_tool decorator. Each tool has a name, description, JSON Schema for arguments, handler function, and group memberships.

Available Tools

Tool	Groups	Description
`file_read`	`file`, `read`	Read file contents with line numbers and pagination
`file_write`	`file`, `write`	Write content to a file (overwrites)
`search_files`	`search`, `file`	Regex search inside files, or glob by filename
`run_shell`	`terminal`, `shell`	Execute a shell command in a specified working directory
`web_search`	`web`, `search`	Search the web, returns titles/URLs/descriptions
`web_extract`	`web`, `extract`	Extract full content from web pages as markdown
`wiki_search`	`wiki`, `memory`, `search`	Search the mempalace knowledge graph
`session_history_search`	`search`, `history`, `memory`	Full-text search over conversation history
`sys_metrics`	`system`, `metrics`, `monitoring`	CPU, RAM, and disk usage
`list_processes`	`system`, `processes`, `monitoring`	List running processes (like `ps aux`)

Tool Groups

Tools are organized into groups for session-level toggling:

file / read / write / search — File operations
terminal / shell — Shell execution
web / search / extract — Web access
wiki / memory — Knowledge graph / memory
system / metrics / processes / monitoring — System info

Adding a New Tool

from backend.tools.registry import register_tool

@register_tool(
    name="my_tool",
    description="Does something useful",
    parameters={
        "type": "object",
        "properties": {
            "arg1": {"type": "string", "description": "An argument"},
        },
        "required": ["arg1"],
    },
    groups=["my_group"],
)
async def my_tool(arg1: str) -> str:
    return f"did {arg1}"

The tool is automatically registered on server startup when backend/tools/__init__.py imports all tool modules.

Development

Running in Development

# Backend with auto-reload
source .venv/bin/activate
uvicorn backend.main:app --host 0.0.0.0 --port 8000 --reload

# Frontend with HMR
cd frontend && npm run dev

Database

The SQLite database is at ~/.hermes/hinksbot-gateway.db. It is created automatically on first startup.

Schema tables:

sessions — Session metadata (id, title, model, created_at, updated_at, tool_options)
messages — All messages (id, session_id, role, content, tool_call_id, tool_name, token_count)
tool_executions — Tool call records (id, message_id, tool_name, args, result, duration_ms, success, error)

Tool Development

Tools are hot-reloadable in dev mode (restart backend to pick up new tools). Add a new tool by creating a function decorated with @register_tool in the appropriate backend/tools/*.py file.

Adding Dependencies

Backend: Add to requirements.txt then pip install -r requirements.txt

Frontend: Add to frontend/package.json then cd frontend && npm install

Environment Variables

Variable	Description
`HINKSBOT_GATEWAY_CONFIG`	Path to alternate config.yaml

License

MIT License

Name		Name	Last commit message	Last commit date
Latest commit History 14 Commits
backend		backend
frontend		frontend
.gitignore		.gitignore
README.md		README.md
SPEC.md		SPEC.md
requirements.txt		requirements.txt

Folders and files

Latest commit

History

Repository files navigation

HinksBot Gateway

Recent Changes

What It Does

Architecture

Features

Quick Start

Prerequisites

1. Backend Setup

2. Frontend Setup

3. Configuration

4. Start Backend

5. Start Frontend

Configuration Reference

Project Structure

API Reference

Health Check

REST Endpoints

Sessions

Tools

Stats

WebSocket Protocol

Client -> Server

Server -> Client

Tool Registry

Available Tools

Tool Groups

Adding a New Tool

Development

Running in Development

Database

Tool Development

Adding Dependencies

Environment Variables

License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages