Vrundali-R/semantic-search-using-embeddings

Semantic Search Using Embeddings

A lightweight NLP project that demonstrates semantic search using Hugging Face sentence embeddings and cosine similarity.

The application converts both documents and user queries into vector embeddings and retrieves the most semantically relevant document based on contextual similarity.


Tech Stack

  • Python
  • LangChain Hugging Face integration
  • Sentence Transformers
  • Scikit-learn

Workflow

  • Generate embeddings for documents
  • Generate an embedding for the user query
  • Compute cosine similarity scores
  • Retrieve the closest matching document
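
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the contents of app.py: the names `cosine_similarity`, `retrieve`, `embed_documents`, and `embed_query` are assumptions introduced here, and the embedding model is passed in as two callables so the retrieval logic can be read (and run) without downloading a model.

```python
from math import sqrt
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(
    query: str,
    documents: List[str],
    embed_documents: Callable[[List[str]], List[Vector]],  # hypothetical hook
    embed_query: Callable[[str], Vector],                  # hypothetical hook
) -> Tuple[str, float]:
    """Return the document closest to the query, with its similarity score."""
    if not documents:
        raise ValueError("documents must not be empty")

    # Step 1: generate embeddings for the documents.
    doc_vectors = embed_documents(documents)
    # Step 2: generate an embedding for the user query.
    query_vector = embed_query(query)
    # Step 3: compute one cosine similarity score per document.
    scores = [cosine_similarity(query_vector, vec) for vec in doc_vectors]
    # Step 4: pick the index of the highest score.
    best = max(range(len(scores)), key=scores.__getitem__)
    return documents[best], scores[best]
```

With the stack listed above, the two callables could plausibly be the `embed_documents` and `embed_query` methods of `HuggingFaceEmbeddings` from the `langchain_huggingface` package, and step 3 could use `sklearn.metrics.pairwise.cosine_similarity` instead of the hand-written helper. Whether app.py is wired this way is an assumption.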

Project Structure

semantic-search-using-embeddings/
│
├── app.py
├── requirements.txt
├── README.md
└── .gitignore

Installation

git clone https://github.com/Vrundali-R/semantic-search-using-embeddings.git

cd semantic-search-using-embeddings

pip install -r requirements.txt

python app.py

Sample Query

query = "Who is the best bowler?"

Retrieved Result

Jasprit Bumrah is widely recognized for his unique bowling action and deadly yorkers in death overs.

Concepts Used

  • Sentence Embeddings
  • Semantic Similarity
  • Cosine Similarity
  • Vector Retrieval
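
For reference, cosine similarity and the retrieval rule built on it can be written as below. The symbols are notation introduced here for illustration: q is the query embedding, d_i is the embedding of the i-th document, and i* is the index of the retrieved document.

```latex
\[
\operatorname{sim}(\mathbf{q}, \mathbf{d}_i)
  = \frac{\mathbf{q} \cdot \mathbf{d}_i}
         {\lVert \mathbf{q} \rVert \, \lVert \mathbf{d}_i \rVert},
\qquad
i^{*} = \arg\max_{i} \, \operatorname{sim}(\mathbf{q}, \mathbf{d}_i)
\]
```

The score lies between -1 and 1 and depends only on the angle between the two vectors, not on their lengths, which is why it is a common choice for comparing sentence embeddings.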

Future Scope

  • Vector database integration
  • Streamlit interface
  • PDF/document retrieval
  • RAG-based search pipeline

Vrundali Rahangdale
