✨ Semantic Search Module ✨

Text/Image Embeddings | Vector Search | Relevance Ranking
Supercharge your retrieval system with a deep understanding of natural language and imagery.

📖 Description

This is a semantic search module designed for the News Video Retrieval task. It exposes a backend API, built on FastAPI, that lets users search a large video archive (~300 hours) using complex natural language queries or sample images.

The Challenge

This project was built to address two primary challenges in the field of video retrieval:

Handling Massive Data Volumes: With up to 300 hours of video content, the number of frames to be analyzed and indexed can run into the millions, demanding a scalable and computationally efficient solution.
Understanding Complex and Nuanced Queries: Users often describe events using abstract language, involving multiple objects and actions. Traditional keyword-based search systems fail to grasp the context and intent behind these queries.

Our Approach

To overcome these challenges, we implemented an intelligent and efficient processing pipeline:

Semantic Vector Encoding: We leverage state-of-the-art Transformer models (such as CLIP variants) to convert both queries (text or image) and database frames into numerical vectors within a high-dimensional space.
Intelligent Keyframe Filtering: To manage the data volume, we apply a pre-processing step that combines vector embeddings with a Sequential Filter algorithm. This reduced the number of keyframes to index from 1.2 million to 700,000 by eliminating redundant frames and retaining only those with the highest semantic value.
Similarity Search: In this vector space, semantically similar items (e.g., a photo of a dog and the text "a dog playing in the park") are positioned closely together. The system performs searches based on geometric proximity within this space.
High-Speed Retrieval: To keep queries fast, the system uses FAISS (Facebook AI Similarity Search). We integrated Faiss-cuVS, a build of Faiss whose GPU indexes are backed by NVIDIA cuVS, delivering search speeds up to 10x faster than the standard faiss-gpu. This keeps retrieval latency low even on a large index.
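
The keyframe filtering step is described only at a high level, so the following is a minimal sketch of one way such a sequential filter could work, not the project's actual implementation. It assumes frames arrive in temporal order, that each frame is compared only with the most recently kept frame, and that cosine similarity against a fixed threshold decides redundancy. The names `cosine_similarity`, `sequential_filter`, and `threshold` are illustrative, and the toy 2-D vectors stand in for real model embeddings.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def sequential_filter(
    embeddings: Sequence[Sequence[float]], threshold: float = 0.9
) -> List[int]:
    """Return the indices of frames to keep, scanning in temporal order.

    Assumption: a frame is redundant when its cosine similarity to the
    most recently kept frame is at or above `threshold`.
    """
    kept: List[int] = []
    for i, emb in enumerate(embeddings):
        if not kept or cosine_similarity(embeddings[kept[-1]], emb) < threshold:
            kept.append(i)
    return kept


# Toy embeddings (hypothetical): two shots, each followed by a near-duplicate.
frames = [
    [1.0, 0.0],    # shot A
    [0.99, 0.05],  # near-duplicate of shot A, dropped
    [0.0, 1.0],    # shot B
    [0.02, 1.0],   # near-duplicate of shot B, dropped
    [1.0, 0.0],    # shot A again, kept: only the last kept frame is compared
]
print(sequential_filter(frames, threshold=0.9))  # -> [0, 2, 4]
```

Comparing only against the last kept frame keeps the pass linear in the number of frames, at the cost of keeping a shot again when it recurs later in the video.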

📊 Benchmarks

The results below were benchmarked after applying the Intelligent Keyframe Filtering (Sequential Filter), which reduced the number of indexed frames to ~266,000. The evaluation was performed on a set of manually crafted queries.

Model	Response Time (s)	Recall@1	Recall@5	Recall@10	Recall@20	Recall@50
align	0.704	14.47%	37.95%	44.27%	49.59%	51.35%
coca-clip	0.943	4.39%	30.35%	43.25%	52.37%	53.42%
apple-clip-384	0.943	15.53%	47.46%	50.88%	54.74%	57.89%
beit3	0.997	5.26%	35.26%	47.11%	49.47%	53.33%

Observation: The apple-clip-384 model achieves the highest recall at every cutoff, with the largest margin at Recall@5 (47.46% versus 37.95% for the next best model, align), indicating a strong ability to rank relevant results near the top. The align model has the lowest response time (0.704 s).
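
For readers unfamiliar with the metric, here is a small sketch of how Recall@k can be computed. It assumes the common retrieval definition in which a query counts as a hit if at least one relevant frame appears in its top-k results; the README does not state which definition the evaluation script uses, and the function name and frame ids below are made up for illustration.

```python
from typing import List, Sequence, Set


def recall_at_k(
    ranked: Sequence[Sequence[str]], relevant: Sequence[Set[str]], k: int
) -> float:
    """Fraction of queries with at least one relevant item in the top-k results."""
    if not ranked or len(ranked) != len(relevant):
        raise ValueError("need one ranked list and one relevant set per query")
    hits = sum(
        1 for results, truth in zip(ranked, relevant)
        if truth.intersection(results[:k])
    )
    return hits / len(ranked)


# Hypothetical results for three queries (frame ids are placeholders).
ranked: List[List[str]] = [
    ["f3", "f1", "f7"],
    ["f2", "f9", "f4"],
    ["f5", "f6", "f8"],
]
relevant: List[Set[str]] = [{"f1"}, {"f4"}, {"f0"}]

for k in (1, 2, 3):
    print(f"Recall@{k} = {recall_at_k(ranked, relevant, k):.2%}")
```

Under this definition Recall@k can only stay the same or grow as k increases, which matches the pattern in the table.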

📦 Installation

Prerequisites: conda and pip.

Install in 2 steps:

# 1) Create conda environment from environment.yml
conda env create --file=environment.yml

# 2) Install project wheel
pip install ./lib/dist/uniml-0.1-py3-none-any.whl

🖼️ Demo

Below is a quick demo screenshot and a short example showing image-based retrieval.

📂 Project Structure

Below is a concise, copy-paste friendly project tree:

📦semantic
 ┣ 📂data
 ┣ 📂lib
 ┣ 📂notebook
 ┃ ┣ 📜algo.ipynb
 ┃ ┣ 📜frames_info.txt
 ┃ ┣ ...
 ┣ 📂src
 ┃ ┣ 📂app
 ┃ ┃ ┣ 📜__init__.py
 ┃ ┃ ┣ 📜api.py
 ┃ ┃ ┗ 📜schema.py
 ┃ ┣ 📂common
 ┃ ┃ ┣ 📜__init__.py
 ┃ ┃ ┣ 📜path_loader.py
 ┃ ┃ ┣ 📜registry.py
 ┃ ┃ ┗ 📜utils.py
 ┃ ┣ 📂indexer
 ┃ ┃ ┣ 📜__init__.py
 ┃ ┃ ┣ 📜base.py
 ┃ ┃ ┣ 📜faiss_gpu_index_flat_l2.py
 ┃ ┃ ┗ 📜rmm_manager.py
 ┃ ┣ 📂searcher
 ┃ ┃ ┣ 📜__init__.py
 ┃ ┃ ┣ 📜base.py
 ┃ ┃ ┣ 📜fusion_sematic_searcher.py
 ┃ ┃ ┗ 📜single_semantic_searcher.py
 ┃ ┣ 📂semantic_extractor
 ┃ ┃ ┣ 📜__init__.py
 ┃ ┃ ┣ 📜align_extractor.py
 ┃ ┃ ┣ 📜apple_clip_384_extractor.py
 ┃ ┃ ┣ ...
 ┃ ┣ 📜__init__.py
 ┃ ┣ 📜__main__.py
 ┃ ┣ 📜concat_npy.py
 ┃ ┣ 📜create_mapping_file.py
 ┃ ┣ 📜demo.py
 ┃ ┣ 📜evaluating.py
 ┃ ┣ 📜indexing.py
 ┃ ┗ 📜remove_duplicate_frames.py
 ┣ 📜.env
 ┣ 📜.gitignore
 ┣ 📜README.md
 ┣ 📜environment.yml
 ┣ 📜evaluating.sh
 ┣ 📜evaluation.txt
 ┣ 📜evaluation_after_rm_duplicate_frames.txt
 ┣ 📜evaluation_after_rm_outlier.txt
 ┣ 📜indexing.sh
 ┣ 📜log.txt
 ┣ 📜rmm_log.txt
 ┗ 📜run.sh

Short notes:

Place model wrappers in src/semantic_extractor for a common interface.
Keep FAISS index code under src/indexer and high-level logic in src/indexing.py.
Store mapping JSONs and indices under data/ for reproducibility.
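
To illustrate how an index and a mapping file fit together, the sketch below runs an exhaustive nearest-neighbour search with squared L2 distance, which is what a flat L2 index computes, and then translates row ids into frame identifiers. It is written in plain Python rather than FAISS so that it runs anywhere. The mapping layout (row id to frame path), the paths, and the function names are all assumptions for illustration and may differ from what `create_mapping_file.py` actually produces.

```python
import heapq
import json
from typing import Dict, List, Sequence, Tuple


def squared_l2(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two vectors of equal length."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def search_flat_l2(
    index_vectors: Sequence[Sequence[float]], query: Sequence[float], k: int
) -> List[Tuple[int, float]]:
    """Exhaustive search: (row_id, squared distance) pairs, nearest first."""
    scored = (
        (row_id, squared_l2(vec, query)) for row_id, vec in enumerate(index_vectors)
    )
    return heapq.nsmallest(k, scored, key=lambda pair: pair[1])


# Hypothetical mapping file content: index row id -> frame identifier.
mapping_json = (
    '{"0": "video_001/frame_0001.jpg",'
    ' "1": "video_001/frame_0042.jpg",'
    ' "2": "video_002/frame_0007.jpg"}'
)
mapping: Dict[int, str] = {
    int(row_id): path for row_id, path in json.loads(mapping_json).items()
}

# Toy index with one vector per mapped frame, in row order.
index_vectors = [[0.0, 0.0], [1.0, 1.0], [0.9, 1.2]]
query = [1.0, 1.0]

results = [
    (mapping[row_id], dist)
    for row_id, dist in search_flat_l2(index_vectors, query, k=2)
]
for path, dist in results:
    print(f"{path}  squared_l2={dist:.3f}")
```

Storing the mapping next to the index matters because the search returns only row positions; if the two files drift apart, results point at the wrong frames.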
