
Octopus AI Banner

πŸ™ Because Two Hands Just Aren't Enough




πŸ™ Meet Octopus AI: Because Two Hands Just Aren't Enough

Let's be honest: being a human is exhausting. You only have two arms, one brain, and a desperate, daily need for caffeine. How are you supposed to handle a never-ending to-do list with hardware like that?

Enter Octopus AI.

The Philosophy (Why an Octopus?)

We looked at the animal kingdom for the ultimate productivity guru and found the undisputed multitasking ninja of the sea. Why? Because octopuses are freakishly smart and boast eight highly capable arms.

They can open child-proof jars from the inside, solve puzzles, and juggle multiple tasks without breaking a sweat (mostly because they live underwater, but you get the point). We took that big-brained, multi-limbed brilliance and turned it into an AI tool designed to do your heavy lifting.

What Can the Tentacles Do for You?

🦾 Eight-Armed Multitasking: While your clumsy human hands are still typing a single sentence, Octopus AI is already crunching data, drafting emails, organizing your schedule, and virtually high-fiving itself.

🧠 Escape-Artist Intelligence: Got a problem that feels like you're stuck in a locked box? Octopus AI uses its massive, squishy digital brain to squeeze through complex problems and find elegant solutions.

🔄 Total Flexibility: It adapts to your workflow seamlessly. No rigid bones, no friction – just smooth, intelligent automation wrapping around your daily tasks.

🧹 100% Mess-Free: All the genius of a cephalopod, with absolutely zero ink squirted on your nice clean desk when it gets surprised.

Stop drowning in a sea of tabs and endless tasks. Let Octopus AI wrap its virtual tentacles around your workload, so you can go back to doing what humans do best: taking naps and drinking coffee. ☕


πŸ—οΈ Architecture

Octopus AI Architecture

```mermaid
graph TB
    subgraph Frontend["🎨 Frontend (HTML/CSS/JS)"]
        UI[Chat Interface]
        Settings[Settings Panel]
    end

    subgraph Backend["⚙️ FastAPI Backend"]
        Agent[🐙 Agent Engine]
        Config[Config Manager]
        Memory[Memory / Persistence]
    end

    subgraph LLM["🧠 LLM Providers"]
        OpenAI[OpenAI<br/>GPT-4o / GPT-4o-mini]
        Anthropic[Anthropic<br/>Claude 3.5 Sonnet]
        Gemini[Google Gemini<br/>Gemini 3 Flash]
        Ollama[Ollama<br/>Llama / Mistral]
    end

    subgraph Tools["🦑 Tentacle Tools"]
        Shell[🐚 Shell]
        FileOps[📁 File Ops]
        WebBrowse[🌐 Web Browse]
        CodeRun[💻 Code Runner]
        Search[🔍 Web Search]
    end

    UI -- WebSocket --> Agent
    Settings -- REST API --> Config
    Agent --> LLM
    Agent --> Tools
    Agent --> Memory
```

🦑 Features

🔧 Five Powerful Tentacles

| Tentacle | Capability | Description |
|---|---|---|
| 🐚 | Shell Commands | Execute system commands with real-time output streaming |
| 📁 | File Operations | Read, write, list, search, and manage files & directories |
| 🌐 | Web Browse | Fetch, parse, and summarize any web page |
| 💻 | Code Execution | Run Python code in a sandboxed environment |
| 🔍 | Web Search | Search the internet via DuckDuckGo |

🧠 Multi-Provider LLM Support

Switch between AI providers on the fly – no restart needed:

| Provider | Models | Authentication |
|---|---|---|
| OpenAI | GPT-4o, GPT-4o-mini, GPT-4-Turbo | API Key |
| Anthropic | Claude 3.5 Sonnet, Claude 3.5 Haiku, Claude 3 Opus | API Key |
| Google Gemini | Gemini 3 Flash, Gemini 2.5 Pro/Flash | API Key or Google Sign-In |
| Ollama | Llama 3.2, Mistral, Code Llama + any local model | Local (free!) |

🎨 Premium Dark-Ocean GUI

  • Glassmorphism design with deep-ocean dark theme
  • Animated octopus welcome screen with CSS tentacle animation
  • Real-time streaming chat with full Markdown rendering
  • Live tool execution visualization – see each tentacle in action
  • Settings panel with provider/model/temperature selection
  • Responsive design optimized for desktop & mobile
  • Google Sign-In for seamless Gemini integration

💾 Persistent Memory

  • Conversations automatically saved to disk as JSON
  • Auto-generated conversation titles
  • Full-text searchable conversation history
  • Configurable context window (up to 50 messages)
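The persistence features above can be sketched in a few lines. This is a minimal illustration, not the actual `memory.py`: the file layout (`data/memory/<id>.json`), the auto-title heuristic, and the function names are all assumptions:

```python
import json
import time
from pathlib import Path

MEMORY_DIR = Path("data/memory")   # assumed location of saved conversations
CONTEXT_WINDOW = 50                # most recent messages sent to the model

def save_conversation(conv_id, messages, directory=MEMORY_DIR):
    """Persist one conversation to disk as JSON."""
    directory.mkdir(parents=True, exist_ok=True)
    record = {
        "id": conv_id,
        # Auto-title heuristic: first few words of the first user message.
        "title": " ".join(messages[0]["content"].split()[:6]) if messages else "New chat",
        "updated": time.time(),
        "messages": messages,
    }
    (directory / f"{conv_id}.json").write_text(json.dumps(record, indent=2))
    return record

def context_for_llm(messages, window=CONTEXT_WINDOW):
    """Trim history to the configurable context window."""
    return messages[-window:]
```

Full-text search over history then reduces to scanning the JSON files in the memory directory.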

πŸ›‘οΈ Security & Sandboxing (v2.1+)

  • Path Isolation: All File and Shell interactions are securely jailed strictly to the data/workspace directory.
  • Network Containment: Python subprocess environments are spawned using Linux unshare -rn to sever network access completely and neutralize code-based data extraction.
  • Robust Prompt Armor: Extracted texts from DuckDuckGo queries and Web navigations are structurally isolated within XML wrappers (like <untrusted> or <external_content>), effectively stripping them of instruct-override privileges and preventing Prompt Injections.
  • Local IP Anchoring: To harden the backend against unauthenticated external hijacking, Octopus utilizes a strict LocalhostRestrictionMiddleware layer blocking all non-local connections.
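The path-isolation idea is simple enough to sketch. This is a hedged illustration of the general technique, not the project's actual code; the function name and workspace path are assumptions:

```python
from pathlib import Path

# Assumed jail root; the real value lives in the backend config.
WORKSPACE = Path("data/workspace").resolve()

def safe_path(user_path: str) -> Path:
    """Resolve a user-supplied path and refuse anything outside the jail."""
    candidate = (WORKSPACE / user_path).resolve()
    if not candidate.is_relative_to(WORKSPACE):  # Python 3.9+
        raise PermissionError(f"path escapes workspace: {user_path}")
    return candidate
```

Resolving before checking is the important part: it collapses `..` segments so traversal tricks like `../../etc/passwd` are caught.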

🚀 Installation

Prerequisites

| Requirement | Version |
|---|---|
| Python | 3.10 or higher |
| pip | Latest recommended |
| API Key | At least one (OpenAI / Anthropic / Gemini) – or Ollama for free local models |

Quick Start

```bash
# 1. Clone the repository
git clone https://github.com/Masriyan/Octopus-Ai.git
cd Octopus-Ai

# 2. Make the start script executable
chmod +x start.sh

# 3. Launch everything (auto-installs deps, starts backend + frontend)
./start.sh
```
Then open http://localhost:5500 in your browser. 🎉

Manual Setup

If you prefer to set things up manually:

```bash
# Create virtual environment
python3 -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r backend/requirements.txt

# Copy environment config
cp .env.example .env
# Edit .env and add your API key(s)

# Start the backend
cd backend
python3 -m uvicorn main:app --host 0.0.0.0 --port 8000 --reload &

# Start the frontend (in another terminal)
cd frontend
python3 -m http.server 5500
```

Configure API Keys

  1. Open http://localhost:5500
  2. Click the ⚙️ Settings button in the sidebar
  3. Select your preferred LLM provider
  4. Enter your API key and click Save
  5. Start chatting! 🐙

Using Ollama (Free / Local)

```bash
# Install Ollama
curl -fsSL https://ollama.ai/install.sh | sh

# Pull a model
ollama pull llama3.2

# In Octopus AI settings → select "Ollama" as provider
```

📖 Usage

Basic Chat

Simply type your message and Octopus AI will respond. It automatically detects when tools would be helpful and uses them proactively.
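The "detect and use tools" behavior boils down to a loop in the agent engine: ask the model, execute any tool call it requests, feed the result back, and repeat until the model answers in plain text. A schematic sketch follows; `run_agent` and the message shapes are illustrative assumptions, not the actual `agent.py` API:

```python
# Schematic agent tool loop. `llm` is any callable that takes the message
# history and returns either a tool request or a plain answer; `tools`
# maps tool names to callables. All names here are made up for clarity.
def run_agent(llm, tools, user_message, max_steps=5):
    messages = [{"role": "user", "content": user_message}]
    for _ in range(max_steps):
        reply = llm(messages)
        if reply.get("tool") is None:
            return reply["content"]                  # plain answer: done
        result = tools[reply["tool"]](reply["args"])  # run the tentacle
        messages.append({"role": "tool", "content": result})
    return "step limit reached"
```

The `max_steps` cap keeps a confused model from looping through tool calls forever.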

Capability Quick-Start Cards

The welcome screen features interactive cards that demonstrate each tentacle:

| Card | Example Prompt |
|---|---|
| 🐚 Shell | "List all files in my home directory" |
| 📁 Files | "Read and summarize the README.md in the current project" |
| 🔍 Search | "Search the web for the latest AI news" |
| 💻 Code | "Write a Python script to calculate Fibonacci numbers and run it" |
| 🌐 Web | "Fetch and summarize the contents of https://news.ycombinator.com" |
| 🦑 Multi | "Help me analyze my system information" |

Settings & Configuration

| Setting | Description |
|---|---|
| LLM Provider | Switch between OpenAI, Anthropic, Gemini, or Ollama |
| Model | Choose the specific model for the selected provider |
| Temperature | Control response creativity (0.0 = focused, 1.0 = creative) |
| Tentacle Permissions | Enable/disable individual tools |
| API Keys | Securely save provider API keys |
| Google Sign-In | Authenticate with Google for Gemini access |

WebSocket Streaming

Octopus AI uses WebSocket connections for real-time, token-by-token streaming – no polling, no delays.
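The streaming contract can be sketched provider-agnostically: the backend forwards each token the moment the LLM yields it, instead of waiting for the full reply. In this illustration `send` stands in for something like the WebSocket's send method, and the token stream is faked; none of these names come from the repo:

```python
import asyncio

async def fake_llm_stream(prompt):
    """Stand-in for a provider's streaming API: yields tokens one by one."""
    for token in ["Eight ", "arms ", "are ", "better."]:
        await asyncio.sleep(0)  # yield control, as real network I/O would
        yield token

async def stream_reply(prompt, send):
    """Forward each token as it arrives; nothing is buffered or polled."""
    async for token in fake_llm_stream(prompt):
        await send(token)

received = []

async def collect(token):
    received.append(token)

asyncio.run(stream_reply("hi", collect))
```

In the real app the frontend's WebSocket client appends each token to the chat bubble as it lands, which is what makes replies appear to type themselves.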


πŸ“ Project Structure

```
Octopus-Ai/
├── backend/
│   ├── main.py              # FastAPI server + WebSocket endpoints
│   ├── agent.py             # Core agent engine with tool loop
│   ├── llm_providers.py     # OpenAI / Anthropic / Gemini / Ollama
│   ├── config.py            # Configuration manager
│   ├── memory.py            # Conversation persistence (JSON)
│   ├── requirements.txt     # Python dependencies
│   └── tools/
│       ├── __init__.py      # Tool registry & schema builder
│       ├── shell_tool.py    # 🐚 Shell command execution
│       ├── file_tool.py     # 📁 File system operations
│       ├── web_tool.py      # 🌐 HTTP page fetching
│       ├── code_tool.py     # 💻 Python code execution
│       └── search_tool.py   # 🔍 DuckDuckGo web search
├── frontend/
│   ├── index.html           # Main application page
│   ├── css/main.css         # Deep-ocean dark theme
│   └── js/app.js            # Frontend logic & WebSocket client
├── data/                    # Created at runtime (git-ignored)
│   ├── config.json          # User preferences & API keys
│   └── memory/              # Saved conversations
├── docs/
│   └── images/              # Documentation assets
├── .env.example             # Environment variable template
├── .gitignore               # Git ignore rules
├── start.sh                 # One-command launcher
├── CHANGELOG.md             # Release history
├── CONTRIBUTING.md          # Contribution guidelines
├── LICENSE                  # MIT License
└── README.md                # ← You are here
```

πŸ› οΈ Development

Backend (FastAPI)

```bash
cd backend
python3 -m uvicorn main:app --reload --port 8000
```

Frontend (Static)

```bash
cd frontend
python3 -m http.server 5500
```

API Documentation

Visit http://localhost:8000/docs for the interactive Swagger UI.

REST API Endpoints

| Method | Endpoint | Description |
|---|---|---|
| GET | /api/health | Health check |
| GET | /api/config | Get configuration (keys masked) |
| POST | /api/config | Update configuration |
| POST | /api/config/apikey | Save an API key |
| GET | /api/conversations | List all conversations |
| POST | /api/conversations | Create new conversation |
| GET | /api/conversations/{id} | Get conversation with messages |
| DELETE | /api/conversations/{id} | Delete conversation |
| GET | /api/models/{provider} | List available models |
| POST | /api/auth/google | Google OAuth authentication |
| WS | /ws/chat/{conv_id} | WebSocket for real-time chat |
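GET /api/config returns keys masked. One common way to do that is to keep only the last few characters; this is a sketch of the technique, and the exact format the endpoint uses is an assumption:

```python
# Hypothetical masking helper; the real endpoint's format may differ.
def mask_key(key: str) -> str:
    """Replace all but the last 4 characters of an API key with '*'."""
    if not key:
        return ""
    if len(key) <= 4:
        return "*" * len(key)
    return "*" * (len(key) - 4) + key[-4:]
```

Keeping a visible suffix lets users confirm which key is configured without the full secret ever leaving the backend.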

πŸ—ΊοΈ Roadmap

```mermaid
gantt
    title Octopus AI Development Roadmap
    dateFormat  YYYY-MM
    section Core
    Multi-LLM Providers       :done, 2025-01, 2025-02
    Tool System (5 tentacles) :done, 2025-01, 2025-02
    WebSocket Streaming       :done, 2025-02, 2025-03
    section Enhancements
    Plugin System             :active, 2025-03, 2025-05
    RAG / Document Chat       :2025-04, 2025-06
    Voice Input/Output        :2025-05, 2025-07
    section Infrastructure
    Docker Support            :2025-03, 2025-04
    Auth & Multi-User         :2025-05, 2025-07
    Cloud Deployment          :2025-06, 2025-08
```

🤝 Contributing

Contributions are welcome! Please see CONTRIBUTING.md for guidelines.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-tentacle)
  3. Commit your changes (git commit -m 'Add amazing tentacle')
  4. Push to the branch (git push origin feature/amazing-tentacle)
  5. Open a Pull Request

📄 License

This project is licensed under the MIT License – see the LICENSE file for details.


πŸ™ Philosophy

An octopus has eight arms, each capable of independent action – tasting, gripping, exploring. Octopus AI embodies this: many tools, each specialized, working together to accomplish any task.


Made with 🐙 by Masriyan

⭐ Star this repo • 🐛 Report Bug • 💡 Request Feature
