An experimental project that lets you give AI agents instructions and have them call people via phone.
This is a voice AI agent that can place outbound phone calls and conduct conversations autonomously. You provide an objective (e.g., "schedule a job interview" or "follow up on a support ticket"), and the AI handles the conversation naturally.
┌──────────────┐     ┌─────────────┐     ┌──────────────┐
│    Web UI    │────▶│   Server    │────▶│    Twilio    │
│  (browser)   │     │  (Fastify)  │     │   (calls)    │
└──────────────┘     └──────┬──────┘     └──────────────┘
                            │
                    WebSocket (audio)
                            │
                      ┌─────▼─────┐
                      │ Deepgram  │
                      │ Voice AI  │
                      │ Agent API │
                      └───────────┘
- Twilio places and manages the phone call
- Deepgram Voice Agent API handles the AI conversation:
  - Speech-to-text (STT) using Flux model
  - LLM thinking via Groq
  - Text-to-speech (TTS) using Aura 2 voices
- Fastify server bridges Twilio media streams with Deepgram's WebSocket API
- Web UI lets you enter a phone number, set the call objective, and watch the live transcript
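The bridging role of the server can be sketched as two small conversions. Twilio Media Streams delivers audio as JSON messages whose `media.payload` field is base64-encoded, while a voice agent socket typically wants raw bytes. The helper names below are hypothetical and are not taken from this repository; this is a minimal sketch of the idea, not the project's actual handler.

```javascript
// Hypothetical helpers; the names are illustrative, not from this repo.

// Convert one inbound Twilio Media Streams message into raw audio bytes.
// Returns null for non-audio events such as "connected", "start" or "stop".
function twilioMessageToAudio(rawMessage) {
  const msg = JSON.parse(rawMessage);
  if (msg.event !== "media" || !msg.media || !msg.media.payload) {
    return null;
  }
  return Buffer.from(msg.media.payload, "base64");
}

// Wrap agent audio bytes in the JSON envelope Twilio expects on the way back.
function audioToTwilioMessage(streamSid, audioBuffer) {
  return JSON.stringify({
    event: "media",
    streamSid,
    media: { payload: audioBuffer.toString("base64") },
  });
}
```

In a real handler, the first function would run on every message from the Twilio WebSocket and forward non-null buffers to the agent socket, and the second would run on every binary audio frame coming back.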
- Place outbound calls to any phone number
- Real-time transcript streaming via Server-Sent Events
- Configurable agent identity (name, whether to reveal as AI)
- Intelligent call handling:
  - Barge-in support (interrupts agent when user speaks)
  - Prevents premature hangups (silence detection, re-engagement prompts)
  - Validates objective completion before ending scheduling calls
- Live call status updates
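To illustrate the transcript streaming feature, here is a minimal sketch of how a transcript line could be framed as a Server-Sent Event. The function name, the `transcript` event name, and the payload shape are assumptions made for this example; the project's actual event format may differ.

```javascript
// Hypothetical helper: frame a payload as a single Server-Sent Event.
// JSON.stringify never emits raw newlines, so the payload fits on one
// "data:" line; the blank line at the end terminates the event.
function formatSseEvent(eventName, payload) {
  return `event: ${eventName}\ndata: ${JSON.stringify(payload)}\n\n`;
}

// Example: one transcript line spoken by the agent.
const frame = formatSseEvent("transcript", { role: "agent", text: "Hi" });
```

On the browser side, an `EventSource` listener registered for the same event name would receive the JSON string in `event.data`.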
- Voice & AI: Deepgram Voice Agent API (STT, TTS, conversation)
- LLM: Groq for fast inference
- Telephony: Twilio Programmable Voice
- Server: Fastify with WebSocket support
- Tunnel: Cloudflare Tunnel for exposing local server
- Node.js 18+
- A Twilio account with a phone number
- A Deepgram API key
- A Groq API key
- Cloudflare Tunnel (or ngrok) for webhook URLs
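The tunnel is needed because Twilio must reach the server from the public internet. As a rough sketch, the TwiML that connects a call to a media stream could be built from the public server URL as shown below. The `/media-stream` path and the function name are assumptions for illustration, not the routes used by this project.

```javascript
// Escape characters that are unsafe inside an XML attribute value.
function escapeXmlAttribute(value) {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical helper: build TwiML that connects the call audio to a
// WebSocket endpoint on the publicly reachable server.
function buildStreamTwiml(serverUrl, path = "/media-stream") {
  // Resolve the path against the public URL, then switch http(s) to ws(s).
  const httpUrl = new URL(path, serverUrl).toString();
  const wsUrl = httpUrl.replace(/^http/, "ws");
  return (
    '<?xml version="1.0" encoding="UTF-8"?>' +
    "<Response><Connect>" +
    `<Stream url="${escapeXmlAttribute(wsUrl)}" />` +
    "</Connect></Response>"
  );
}
```

Calling `buildStreamTwiml("https://your-tunnel-domain.com")` yields a `wss://` stream URL on the same host, which is why `SERVER_URL` has to point at the tunnel rather than at `localhost`.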
Install dependencies:
    npm install
Copy the example environment file and fill in your credentials:
    cp .env.example .env
Edit .env with your values:
    SERVER_URL=https://your-tunnel-domain.com
    TWILIO_ACCOUNT_SID=ACxxxxxxxxxx
    TWILIO_AUTH_TOKEN=your_auth_token
    TWILIO_FROM_NUMBER=+1234567890
    DEEPGRAM_API_KEY=your_deepgram_key
    GROQ_API_KEY=your_groq_key
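A missing credential usually surfaces as a confusing error in the middle of a call, so it can help to validate the environment at startup. The following is a minimal sketch using the variable names listed above; the `loadConfig` function itself is hypothetical and is not necessarily how `src/core/config.js` works.

```javascript
// Variable names come from the .env example; the loader is illustrative.
const REQUIRED_VARS = [
  "SERVER_URL",
  "TWILIO_ACCOUNT_SID",
  "TWILIO_AUTH_TOKEN",
  "TWILIO_FROM_NUMBER",
  "DEEPGRAM_API_KEY",
  "GROQ_API_KEY",
];

// Return a plain config object, or throw one error naming every missing value.
function loadConfig(env = process.env) {
  const missing = REQUIRED_VARS.filter(
    (name) => typeof env[name] !== "string" || env[name].trim() === ""
  );
  if (missing.length > 0) {
    throw new Error(
      `Missing required environment variables: ${missing.join(", ")}`
    );
  }
  return Object.fromEntries(
    REQUIRED_VARS.map((name) => [name, env[name].trim()])
  );
}
```

Reporting all missing names in one error avoids the fix-one-rerun-fix-another loop during first-time setup.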
- Start the Cloudflare tunnel (in a separate terminal):
    npm run tunnel
- Start the server:
    npm start
    # or, for development with auto-reload:
    npm run dev
- Open http://localhost:3000 in your browser
- Enter a phone number in E.164 format (e.g., +14155551234)
- Write your call objective - be specific about what you want the agent to accomplish
- Optionally configure the agent name and whether to reveal it's an AI
- Click "Call" and watch the live transcript
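The E.164 requirement in the first step can be checked before a call is placed. This is a small sketch of such a check; the function name is hypothetical. E.164 numbers start with `+`, followed by a country code that does not begin with 0, with at most 15 digits in total.

```javascript
// Hypothetical validator: "+" followed by 2 to 15 digits, first digit 1-9.
function isE164(phoneNumber) {
  return typeof phoneNumber === "string" && /^\+[1-9]\d{1,14}$/.test(phoneNumber);
}
```

For example, `isE164("+14155551234")` is true, while `isE164("4155551234")` (no plus sign) and `isE164("+1 415 555 1234")` (spaces) are false.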
- "Call on behalf of Acme Corp to schedule a product demo. Propose Tuesday at 2pm or Wednesday at 10am."
- "Follow up on support ticket #1234. Ask if the issue is resolved and if they need any further assistance."
- "Confirm the appointment scheduled for tomorrow at 3pm. Ask if they need to reschedule."
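One plausible way an objective like those above could be combined with the agent settings into an instruction for the LLM is sketched here. The function name, the default agent name, and the prompt wording are all assumptions for illustration; the project's real prompt lives in its own agent configuration and may look quite different.

```javascript
// Hypothetical prompt builder; the wording and defaults are illustrative only.
function buildAgentPrompt({ objective, agentName = "Alex", revealAsAi = true }) {
  if (typeof objective !== "string" || objective.trim() === "") {
    throw new Error("objective is required");
  }
  const lines = [
    `You are ${agentName}, making an outbound phone call.`,
    `Objective: ${objective.trim()}`,
    revealAsAi
      ? "Introduce yourself as an AI assistant at the start of the call."
      : "Introduce yourself by name only.",
    "End the call only after the objective is met or the person declines.",
  ];
  return lines.join("\n");
}
```

Keeping the objective on its own line makes it easy to see in logs exactly what the agent was asked to do for a given call.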
├── server.js # Entry point
├── index.html # Web UI
├── src/
│ ├── app.js # Fastify app setup
│ ├── core/
│ │ ├── config.js # Environment configuration
│ │ ├── constants.js # Shared constants
│ │ └── callStore.js # In-memory call state
│ ├── routes/
│ │ ├── httpRoutes.js # REST API endpoints
│ │ └── mediaStreamRoute.js # Twilio WebSocket handler
│ ├── services/
│ │ └── twilioService.js # Twilio client wrapper
│ └── voice/
│ └── agent.js # Deepgram agent configuration
- Calls are recorded by Twilio (configurable)
- In-memory call state (resets on server restart)
- Single concurrent call recommended for this demo
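The in-memory state mentioned above can be pictured as a `Map` keyed by call ID. The sketch below is a hypothetical store, not the contents of `src/core/callStore.js`; the record fields are assumptions. Because everything lives in process memory, a restart empties the map, which is the limitation the note describes.

```javascript
// Hypothetical in-memory call store; field names are illustrative.
function createCallStore() {
  const calls = new Map();
  return {
    // Register a new call with a default status and an empty transcript.
    create(callSid, fields = {}) {
      const record = { callSid, status: "queued", transcript: [], ...fields };
      calls.set(callSid, record);
      return record;
    },
    // Look up a call; returns null when the ID is unknown.
    get(callSid) {
      return calls.get(callSid) ?? null;
    },
    // Append one transcript line; returns false when the call is unknown.
    appendTranscript(callSid, role, text) {
      const record = calls.get(callSid);
      if (!record) return false;
      record.transcript.push({ role, text });
      return true;
    },
    // Forget a finished call; returns true if it existed.
    remove(callSid) {
      return calls.delete(callSid);
    },
    size() {
      return calls.size;
    },
  };
}
```

Swapping this for a persistent store would mainly mean keeping the same method names and making them asynchronous.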
This is an experimental project for educational purposes. Be mindful of local laws regarding automated calling and call recording. Always obtain appropriate consent before recording calls.
License: MIT