flatoy/twilio-call

AI Voice Agent

An experimental project that lets you give an AI agent instructions and have it call people by phone.

Overview

This is a voice AI agent that can place outbound phone calls and conduct conversations autonomously. You provide an objective (e.g., "schedule a job interview" or "follow up on a support ticket"), and the AI handles the conversation naturally.

How It Works

┌──────────────┐     ┌─────────────┐     ┌──────────────┐
│  Web UI      │────▶│  Server     │────▶│  Twilio      │
│  (browser)   │     │  (Fastify)  │     │  (calls)     │
└──────────────┘     └──────┬──────┘     └──────────────┘
                           │
                    WebSocket (audio)
                           │
                     ┌─────▼─────┐
                     │ Deepgram  │
                     │ Voice AI  │
                     │ Agent API │
                     └───────────┘
  1. Twilio places and manages the phone call
  2. Deepgram Voice Agent API handles the AI conversation:
    • Speech-to-text (STT) using the Flux model
    • LLM reasoning via Groq
    • Text-to-speech (TTS) using Aura 2 voices
  3. Fastify server bridges Twilio media streams with Deepgram's WebSocket API
  4. Web UI lets you enter a phone number, set the call objective, and watch the live transcript
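
To make the bridging step more concrete, here is a minimal sketch of the message handling involved on the Twilio side. It is not taken from this repository: the function names and the `/media-stream` path are assumptions for illustration. It relies only on the documented Twilio Media Streams message shapes (`media` and `clear` events with a base64 audio payload) and on TwiML's `<Connect><Stream>` verb.

```javascript
// Hypothetical helpers; the "/media-stream" path and all function names are
// assumptions for illustration, not code from this repository.

// Escape characters that are unsafe inside an XML attribute value.
function escapeXmlAttr(value) {
  return String(value)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build TwiML that asks Twilio to open a bidirectional media stream
// to the server once the call connects.
function buildStreamTwiml(serverUrl, path = "/media-stream") {
  // https:// becomes wss://, http:// becomes ws://
  const wsUrl = serverUrl.replace(/^http/, "ws") + path;
  return `<Response><Connect><Stream url="${escapeXmlAttr(wsUrl)}" /></Connect></Response>`;
}

// Twilio delivers caller audio as JSON messages with a base64 payload.
// Returns the raw audio bytes, or null for non-media events.
function decodeInboundAudio(rawMessage) {
  const msg = JSON.parse(rawMessage);
  if (msg.event !== "media") return null;
  return Buffer.from(msg.media.payload, "base64");
}

// Wrap agent audio bytes in the JSON envelope Twilio expects in return.
function encodeOutboundAudio(streamSid, audioBytes) {
  return JSON.stringify({
    event: "media",
    streamSid,
    media: { payload: Buffer.from(audioBytes).toString("base64") },
  });
}

// Ask Twilio to drop agent audio it has buffered but not yet played.
function buildClearMessage(streamSid) {
  return JSON.stringify({ event: "clear", streamSid });
}
```

In a bridge like the one described above, the decoded bytes would be forwarded to the Deepgram WebSocket, and audio coming back from Deepgram would be wrapped with `encodeOutboundAudio` before being sent to Twilio. The `clear` message is one plausible way to cut off queued agent speech when the other person starts talking.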

Features

  • Place outbound calls to any phone number
  • Real-time transcript streaming via Server-Sent Events
  • Configurable agent identity (name, and whether the agent discloses that it is an AI)
  • Intelligent call handling:
    • Barge-in support (the agent stops speaking when the other person talks)
    • Prevents premature hangups (silence detection, re-engagement prompts)
    • Validates objective completion before ending scheduling calls
  • Live call status updates
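
The transcript streaming mentioned above relies on the Server-Sent Events wire format, which is plain text. The sketch below shows how frames could be formatted and fanned out to connected browsers. The event names, payload shape, and function names are hypothetical and not taken from this repository.

```javascript
// Hypothetical sketch; event names and payload shape are assumptions.

// Format one Server-Sent Events frame: an event name, a data line, and a
// blank line that terminates the frame.
function formatSseEvent(eventName, data) {
  // JSON.stringify escapes newlines, so the payload stays on one data line.
  return `event: ${eventName}\ndata: ${JSON.stringify(data)}\n\n`;
}

// Minimal fan-out: each subscriber is a function that writes a frame to
// one open HTTP response.
function createTranscriptBroadcaster() {
  const subscribers = new Set();
  return {
    // Register a writer and return a function that unsubscribes it.
    subscribe(write) {
      subscribers.add(write);
      return () => subscribers.delete(write);
    },
    // Send the same frame to every current subscriber.
    publish(eventName, data) {
      const frame = formatSseEvent(eventName, data);
      for (const write of subscribers) write(frame);
    },
  };
}
```

With Fastify, a route handler could set `Content-Type: text/event-stream` on `reply.raw`, subscribe with a function that calls `reply.raw.write(frame)`, and unsubscribe when the request closes. On the browser side, `EventSource` delivers each named event to a matching listener.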

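Tech Stack

  • Fastify: HTTP server and WebSocket bridge
  • Twilio: outbound calls and media streams
  • Deepgram Voice Agent API: speech-to-text (Flux) and text-to-speech (Aura 2)
  • Groq: LLM inference
  • Server-Sent Events: live transcript and status updates in the web UI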

Setup

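Prerequisites

  • Node.js and npm
  • A Twilio account with a voice-capable phone number
  • A Deepgram API key
  • A Groq API key
  • A public HTTPS URL for the server (the project uses a Cloudflare tunnel)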

Installation

npm install

Configuration

Copy the example environment file and fill in your credentials:

cp .env.example .env

Edit .env with your values:

SERVER_URL=https://your-tunnel-domain.com
TWILIO_ACCOUNT_SID=ACxxxxxxxxxx
TWILIO_AUTH_TOKEN=your_auth_token
TWILIO_FROM_NUMBER=+1234567890
DEEPGRAM_API_KEY=your_deepgram_key
GROQ_API_KEY=your_groq_key
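
A missing credential is easier to diagnose at startup than in the middle of a call. The sketch below shows one way to validate these variables up front. It is an illustration only and may differ from what `src/core/config.js` actually does; `loadConfig` and the shape of the returned object are assumptions.

```javascript
// Hypothetical sketch; not the repository's actual config module.
const REQUIRED_VARS = [
  "SERVER_URL",
  "TWILIO_ACCOUNT_SID",
  "TWILIO_AUTH_TOKEN",
  "TWILIO_FROM_NUMBER",
  "DEEPGRAM_API_KEY",
  "GROQ_API_KEY",
];

// Read settings from an env-like object and fail fast if any are missing.
function loadConfig(env = process.env) {
  const missing = REQUIRED_VARS.filter(
    (name) => typeof env[name] !== "string" || env[name].trim() === ""
  );
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  return {
    // Strip trailing slashes so paths can be appended safely.
    serverUrl: env.SERVER_URL.trim().replace(/\/+$/, ""),
    twilio: {
      accountSid: env.TWILIO_ACCOUNT_SID.trim(),
      authToken: env.TWILIO_AUTH_TOKEN.trim(),
      fromNumber: env.TWILIO_FROM_NUMBER.trim(),
    },
    deepgramApiKey: env.DEEPGRAM_API_KEY.trim(),
    groqApiKey: env.GROQ_API_KEY.trim(),
  };
}
```

Passing the environment in as a parameter keeps the function easy to test without touching `process.env`.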

Running

  1. Start the Cloudflare tunnel (in a separate terminal):

    npm run tunnel
  2. Start the server:

    npm start
    # or for development with auto-reload:
    npm run dev
  3. Open http://localhost:3000 in your browser

Usage

  1. Enter a phone number in E.164 format (e.g., +14155551234)
  2. Write your call objective. Be specific about what you want the agent to accomplish
  3. Optionally configure the agent name and whether to reveal it's an AI
  4. Click "Call" and watch the live transcript
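
Since the phone number must be in E.164 format, a quick check before placing the call can catch typos early. The sketch below is a minimal format check under the E.164 rules (a leading `+`, a non-zero first digit, at most 15 digits in total). The function name is hypothetical, and the check does not confirm that the number is actually assigned or reachable.

```javascript
// A "+" followed by 2 to 15 digits, the first of which is not zero.
const E164_PATTERN = /^\+[1-9]\d{1,14}$/;

// Returns true only for strings that match the E.164 shape exactly
// (no spaces, dashes, or parentheses).
function isE164(phoneNumber) {
  return typeof phoneNumber === "string" && E164_PATTERN.test(phoneNumber);
}
```

For example, `isE164("+14155551234")` passes, while `"4155551234"` (no plus sign) and `"+1 415 555 1234"` (spaces) are rejected.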

Example Objectives

  • "Call on behalf of Acme Corp to schedule a product demo. Propose Tuesday at 2pm or Wednesday at 10am."
  • "Follow up on support ticket #1234. Ask if the issue is resolved and if they need any further assistance."
  • "Confirm the appointment scheduled for tomorrow at 3pm. Ask if they need to reschedule."
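
An objective like the ones above has to be combined with the agent's identity settings before it reaches the LLM. The sketch below shows one plausible way to assemble such instructions. It is purely illustrative: the function name, parameter names, default agent name, and wording are assumptions, and the repository's actual prompt may look quite different.

```javascript
// Hypothetical sketch; not the prompt used by this project.
function buildAgentInstructions({ objective, agentName = "Alex", revealAi = true }) {
  if (typeof objective !== "string" || objective.trim() === "") {
    throw new Error("objective is required");
  }
  const lines = [
    `You are ${agentName}, making an outbound phone call.`,
    revealAi
      ? "Introduce yourself as an AI assistant."
      : "Introduce yourself by name only.",
    `Objective: ${objective.trim()}`,
    "Keep replies short and conversational.",
    "End the call only once the objective is met or the other person declines.",
  ];
  return lines.join("\n");
}
```

Keeping the objective on its own labeled line makes it easy to see in logs exactly what the agent was asked to do.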

Project Structure

├── server.js              # Entry point
├── index.html             # Web UI
├── src/
│   ├── app.js             # Fastify app setup
│   ├── core/
│   │   ├── config.js      # Environment configuration
│   │   ├── constants.js   # Shared constants
│   │   └── callStore.js   # In-memory call state
│   ├── routes/
│   │   ├── httpRoutes.js  # REST API endpoints
│   │   └── mediaStreamRoute.js  # Twilio WebSocket handler
│   ├── services/
│   │   └── twilioService.js     # Twilio client wrapper
│   └── voice/
│       └── agent.js       # Deepgram agent configuration

Limitations

  • Calls are recorded by Twilio (configurable)
  • In-memory call state (resets on server restart)
  • Single concurrent call recommended for this demo
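
To illustrate why call state does not survive a restart, here is a minimal sketch of an in-memory store keyed by call SID. It is an assumption about the general approach, not the contents of `src/core/callStore.js`; all names and the record shape are hypothetical.

```javascript
// Hypothetical sketch of an in-memory call store. Everything lives in a
// Map inside the Node.js process, so it is lost when the process exits.
function createCallStore() {
  const calls = new Map();
  return {
    // Register a new call and return its record.
    create(callSid, objective) {
      const call = { callSid, objective, status: "queued", transcript: [] };
      calls.set(callSid, call);
      return call;
    },
    // Look up a call, or return null if it is unknown.
    get(callSid) {
      return calls.get(callSid) ?? null;
    },
    // Update the status; returns false if the call is unknown.
    setStatus(callSid, status) {
      const call = calls.get(callSid);
      if (!call) return false;
      call.status = status;
      return true;
    },
    // Append one transcript line; returns false if the call is unknown.
    appendTranscript(callSid, role, text) {
      const call = calls.get(callSid);
      if (!call) return false;
      call.transcript.push({ role, text });
      return true;
    },
    // Forget a finished call so the Map does not grow without bound.
    remove(callSid) {
      return calls.delete(callSid);
    },
  };
}
```

Persisting calls across restarts would require swapping the `Map` for a database or another external store behind the same interface.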

Disclaimer

This is an experimental project for educational purposes. Be mindful of local laws regarding automated calling and call recording. Always obtain appropriate consent before recording calls.

License

MIT
