WAVE is a comprehensive platform for web automation and content creation, combining browser automation with an AI-powered visual editor.
Current Configuration: SiliconFlow API with Pro/DeepSeek-V3.1-Terminus model
```powershell
# Launch IEEE Xplore Agent (Academic Papers)
cd browser_agent
.\venv\Scripts\Activate.ps1
python deepseek_xplore.py

# Launch Xiaohongshu Agent (Social Media)
python deepseek_xhs.py

# Launch Orchestrator (Multi-Agent System)
cd ..\experiments\orchestrator_demo
python main.py
```

WAVE consists of two main components:
- Browser Agent (`browser_agent/`) - Python-based automation for XHS and IEEE Xplore
- Visual Frontend (`frontend/`) - Next.js application for AI-assisted content creation
- Multi-LLM Support: Works with SiliconFlow, DeepSeek, and OpenAI-compatible APIs
- IEEE Xplore Integration: Automated academic paper search and PDF download with anti-bot evasion
- Xiaohongshu Automation: Automated XHS exploration with intelligent content analysis
- Canva-like Editor: Visual interface for designing XHS posts with AI suggestions
- MCP Integration: Model Context Protocol server for LLM tool integration
- Manager of Managers: Advanced orchestrator coordinating multiple specialized agents
- Cross-Platform: Works on Windows, macOS, and Linux
- Stealth Automation: Anti-detection browser automation with persistent sessions
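The stealth automation with persistent sessions listed above can be sketched with Playwright's persistent-context API. The profile directory name and browser flags below are illustrative assumptions, not the repo's actual configuration (see `browser_utils.py` for the real initialization):

```python
def launch_stealth_context(user_data_dir: str = "xhs_profile"):
    """Launch a persistent Chromium context so cookies and login state
    survive between runs. Flag choices are common anti-detection tweaks,
    assumed here rather than taken from the repo."""
    # Imported lazily so this module loads even where Playwright isn't installed.
    from playwright.sync_api import sync_playwright

    pw = sync_playwright().start()
    context = pw.chromium.launch_persistent_context(
        user_data_dir,
        headless=False,  # a visible browser is harder to fingerprint as a bot
        args=["--disable-blink-features=AutomationControlled"],
    )
    return pw, context
```

Because the context is persistent, a Xiaohongshu login performed once in the visible browser is still there on the next run.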
```
WAVE/
├── browser_agent/            # Python automation backend
│   ├── config.py             # Centralized configuration
│   ├── browser_utils.py      # Browser initialization
│   ├── xhs_actions.py        # XHS interaction logic
│   ├── xplore_actions.py     # IEEE Xplore interaction logic
│   ├── deepseek_xhs.py       # Autonomous AI agent for XHS
│   ├── deepseek_xplore.py    # Autonomous AI agent for IEEE Xplore
│   ├── xplore_mcp_server.py  # MCP server for IEEE Xplore tools
│   ├── tests/                # Test scripts
│   └── README.md             # Detailed setup guide
├── frontend/                 # Next.js visual editor
│   ├── app/                  # Next.js App Router
│   ├── docs/                 # API documentation
│   └── README.md             # Frontend guide
└── docs/                     # Project documentation
```
The browser agent is the core automation component. Start here to understand the XHS automation capabilities:
```bash
# Navigate to browser_agent
cd browser_agent

# Automated setup (choose based on your OS)
# Windows:
.\setup.ps1
# macOS/Linux:
chmod +x setup.sh
./setup.sh

# Run the autonomous agent
# IMPORTANT: You must activate the virtual environment first!
# Windows: .\venv\Scripts\Activate.ps1
# macOS/Linux: source venv/bin/activate
python deepseek_xhs.py
```

Before running any Python scripts, you must activate the virtual environment:
Windows (PowerShell):

```powershell
cd browser_agent
.\venv\Scripts\Activate.ps1
python deepseek_xhs.py
```

macOS/Linux:

```bash
cd browser_agent
source venv/bin/activate
python deepseek_xhs.py
```

Common Error: If you see `ModuleNotFoundError: No module named 'playwright'`, it means you forgot to activate the virtual environment.
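A quick way to confirm the environment is active before digging further is to check whether the dependency is importable at all. This is a generic sketch, not a script shipped in the repo:

```python
import importlib.util

def venv_has(module: str = "playwright") -> bool:
    """Return True if `module` is importable in the current interpreter,
    i.e. the right virtual environment is likely active."""
    return importlib.util.find_spec(module) is not None
```

Run it with `python -c "..."` inside and outside the venv; if it returns `False` for `playwright`, activate the environment before retrying the agent.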
The browser agent requires a DeepSeek API key. You have two options:
Option A: Using `ds_api.txt` (Recommended for personal use)
- Create a file named `ds_api.txt` in the `browser_agent` directory
- Paste your DeepSeek API key into the file (just the key, no extra text)
- Run the setup script - it will automatically load the key into `.env`

Option B: Using `.env` file (Recommended for team collaboration)
- Copy `.env.example` to `.env`
- Edit `.env` and set `DEEPSEEK_API_KEY=your_api_key_here`
- Run the setup script
Get your API key from: https://platform.deepseek.com/api_keys
Security Note: Both `.env` and `ds_api.txt` are excluded from git via `.gitignore`
The frontend provides a visual interface for content creation:
```bash
# Navigate to frontend
cd frontend

# Install dependencies
npm install

# Start development server
npm run dev
```

Open http://localhost:3000 in your browser.
- Frontend → Backend: Frontend is designed to work with a Python backend on port 8000
- Browser Agent: Standalone automation system with MCP server capabilities
- DeepSeek Agent: Command-line interactive agent for XHS exploration
- Start with Browser Agent: Test XHS automation using `deepseek_agent.py`
- Explore MCP Tools: Run `xhs_mcp_server.py` to expose tools to LLM clients
- Develop Frontend: Work on the visual editor while backend matures
- Automation: Playwright with stealth capabilities
- AI Integration: DeepSeek API with thinking mode
- Protocol: Model Context Protocol (MCP) server
- Configuration: Environment-based with cross-platform support
- Framework: Next.js 16 (App Router)
- UI: Tailwind CSS, Lucide React
- Drag & Drop: React DnD
- Language: TypeScript
If the AI Chat or Publish features are not working, check that:
- The Python server is running on port 8000.
- There are no CORS issues (though the Next.js API route proxies requests to avoid this).
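To exercise that wiring before the real backend is ready, a stub server exposing the same two routes can be stood up with the standard library. The routes come from the proxy description above, but the payload and response shapes are assumptions, not the actual API contract:

```python
import json
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

class WaveBackendStub(BaseHTTPRequestHandler):
    """Minimal stand-in for the Python backend on port 8000."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length) or b"{}")
        if self.path == "/chat":
            body = {"reply": f"echo: {payload.get('message', '')}"}
        elif self.path == "/publish":
            body = {"status": "queued"}
        else:
            self.send_error(404)
            return
        data = json.dumps(body).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, *args):
        pass  # keep the console quiet during UI testing

if __name__ == "__main__":
    ThreadingHTTPServer(("127.0.0.1", 8000), WaveBackendStub).serve_forever()
```

With the stub running, the frontend's AI Chat and Publish flows exercise the proxy path instead of silently falling back to local logic.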
The frontend currently contains several placeholders and fallback mechanisms to allow UI testing without a running backend.
- Status: Mocked.
- Behavior: The API route currently returns an empty list.
- Fallback: `AIDialog.tsx` uses hardcoded arrays (`singleElementSuggestions`, `multiElementSuggestions`, `globalSuggestions`) when no suggestions are returned from the API.
- Status: Hybrid (Proxy + Fallback).
- Behavior: Tries to forward requests to `http://127.0.0.1:8000/chat`.
- Fallback: If the backend is offline, `App.tsx` executes local keyword-based logic (e.g., "make it bigger", "change color to red") to simulate AI changes.
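The offline fallback amounts to simple keyword matching on the user's command. A Python rendition of the idea (the real logic lives in `App.tsx` in TypeScript; keyword set and field names like `fontSize` are illustrative):

```python
def apply_local_ai_fallback(command: str, element: dict) -> dict:
    """Offline stand-in for the /chat backend: keyword-match the command
    and return an updated copy of the element."""
    updated = dict(element)
    text = command.lower()
    if "bigger" in text:
        updated["fontSize"] = int(updated.get("fontSize", 16) * 1.25)
    elif "smaller" in text:
        updated["fontSize"] = max(8, int(updated.get("fontSize", 16) * 0.8))
    elif "change color to" in text:
        updated["color"] = text.split("change color to", 1)[1].strip()
    return updated
```

Commands that match no keyword return the element unchanged, which keeps the fallback safe when the user types something the stub doesn't understand.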
- Status: Proxy.
- Behavior: Forwards requests to `http://127.0.0.1:8000/publish`.
- Note: Hardcoded to `localhost:8000`.
- The API routes currently point to `http://127.0.0.1:8000`. This should be moved to environment variables (`.env`) for production.
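In Python terms, the suggested change is just an environment lookup with a localhost default. `BACKEND_URL` is an assumed variable name, and in the Next.js routes themselves it would be a `NEXT_PUBLIC_`-prefixed variable read from `.env`:

```python
import os

def backend_base_url() -> str:
    """Resolve the backend origin from the environment instead of
    hardcoding it, falling back to the current development default."""
    return os.environ.get("BACKEND_URL", "http://127.0.0.1:8000")
```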