# AI-Based Video Description Extractor
VidExtract is a full-stack application that allows users to upload videos and extract meaningful descriptions of their visual and audio content using various AI models. The extracted information is stored in a searchable database, enabling users to chat with an AI assistant to retrieve relevant moments from the video.
## Prerequisites

- Docker Desktop (or Docker Engine with Docker Compose) installed on your system.
- A valid OpenAI API key, used for:
  - Video analysis and scene description (using GPT models)
  - Semantic search capabilities (using OpenAI embeddings)
## Configuration

The application requires a valid OpenAI API key to function. You can configure it in one of two ways:
1. **Environment Variable (Recommended for Docker):** Add the API key to your environment before running the containers:

   ```bash
   export OPENAI_API_KEY='your-api-key-here'
   docker compose up
   ```

2. **Configuration File:** Set the key in `config.py` in the project root:

   ```python
   OPENAI_API_KEY = 'your-api-key-here'
   ```
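The two approaches can also be combined in `config.py` itself. A minimal sketch — the fall-back-to-environment behaviour shown here is an assumption for illustration, not necessarily what the project does:

```python
# config.py -- hypothetical sketch; the project's real file may differ.
import os

# Prefer the environment variable so Docker deployments need no file edits;
# fall back to a hard-coded placeholder for local development.
OPENAI_API_KEY = os.environ.get("OPENAI_API_KEY") or "your-api-key-here"
```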
## Architecture

The project is organized into three main components following an MVC design:

### Video Analyzer

This is the core processing unit, responsible for analyzing uploaded videos. It combines several AI models to understand the video content:
- **Visual Analysis:**
  - **YOLOv8:** object detection within video frames, identifying and locating the objects present in each scene.
  - **BLIP:** scene captioning, generating descriptive text summaries of the visual content in different segments of the video.
- **Audio Analysis:**
  - **OpenAI Whisper:** transcribes spoken English from the audio track into text.
  - **YAMNet:** detects sound events and effects present in the audio.
The video analyzer processes the video, combines the insights from these models, and generates structured information about key moments using the OpenAI API. This information is then stored in a PostgreSQL database.
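To make the data flow concrete, here is a hypothetical sketch of the kind of per-segment record the analyzer might assemble before asking a GPT model for a description — the names and fields are illustrative, not the project's actual schema:

```python
# Hypothetical sketch of merging per-segment model outputs into one record.
from dataclasses import dataclass, field

@dataclass
class VideoMoment:
    start_s: float                                 # segment start, in seconds
    end_s: float                                   # segment end, in seconds
    objects: list = field(default_factory=list)    # YOLOv8 detections
    caption: str = ""                              # BLIP scene caption
    transcript: str = ""                           # Whisper transcription
    sounds: list = field(default_factory=list)     # YAMNet sound events

def moment_summary_prompt(m: VideoMoment) -> str:
    """Combine the model outputs into a prompt asking a GPT model
    for a concise description of the moment."""
    return (
        f"Video segment {m.start_s:.1f}s-{m.end_s:.1f}s.\n"
        f"Objects: {', '.join(m.objects) or 'none'}\n"
        f"Scene caption: {m.caption or 'n/a'}\n"
        f"Speech: {m.transcript or 'n/a'}\n"
        f"Sounds: {', '.join(m.sounds) or 'none'}\n"
        "Describe this moment in one sentence."
    )
```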
### API

The API serves as the backend of the application. It is built with FastAPI and handles:

- Receiving video uploads from the client.
- Invoking the `video_analyzer` to process videos.
- Storing the extracted event data in a PostgreSQL database with the `pgvector` extension for semantic search.
- Providing a chat endpoint that lets the client query the database in natural language.
- Generating OpenAI embeddings so that semantic search can find video moments relevant to user queries.
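The ranking that `pgvector` performs on stored embeddings can be illustrated in plain Python. This is a toy sketch with made-up two-dimensional vectors (real OpenAI embeddings are much larger, e.g. 1536 dimensions for `text-embedding-ada-002`):

```python
# Toy illustration of semantic search: rank stored moment embeddings
# by cosine similarity to the query embedding, as pgvector would.
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k_moments(query_emb, stored, k=3):
    """stored: list of (description, embedding) pairs, as they might
    come back from a SELECT on the events table."""
    ranked = sorted(stored,
                    key=lambda item: cosine_similarity(query_emb, item[1]),
                    reverse=True)
    return [desc for desc, _ in ranked[:k]]
```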
### Client

The client is a single-page application built with React and TypeScript, using Material UI. It provides:

- A drag-and-drop interface for uploading videos.
- A chat interface for interacting with the AI assistant, retrieving video highlights that match a query or displaying all extracted moments.
## Deployment

The application can be deployed with Docker Compose, which sets up the client, API, and database services in separate containers.

Instructions:
1. **Navigate to the project root:** Open your terminal and change to the root directory of the `vidextract` project (where the `compose.yaml` file is located).

2. **Build and run the containers:** Execute the following command:

   ```bash
   docker compose up --build
   ```

   `docker compose up` starts the services defined in `compose.yaml`; `--build` builds the Docker images before starting the containers (useful for the first run).
3. **Access the application:** Once the containers are running, open your web browser and go to `http://localhost:3000`. You should see the client application.
NOTE: The first analysis run may take a few minutes while models and weights are downloaded.
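For reference, a `compose.yaml` along these lines would match the setup described above. It is a sketch only — the project's actual service names, build paths, ports, and image tags may differ:

```yaml
# Hypothetical sketch -- the project's real compose.yaml may differ.
services:
  client:
    build: ./client
    ports:
      - "3000:3000"                        # React client, as referenced above
    depends_on:
      - api

  api:
    build: ./api
    environment:
      - OPENAI_API_KEY=${OPENAI_API_KEY}   # passed through from the host
    depends_on:
      - db

  db:
    image: pgvector/pgvector:pg16          # PostgreSQL with the pgvector extension
    environment:
      - POSTGRES_PASSWORD=postgres
```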