A lightweight NLP project that demonstrates semantic search using Hugging Face sentence embeddings and cosine similarity.
The application converts the documents and the user query into vector embeddings, then returns the document whose embedding has the highest cosine similarity to the query embedding.

Tech stack:
- Python
- LangChain HuggingFace
- Sentence Transformers
- Scikit-learn

How it works:
- Generate embeddings for the documents
- Generate an embedding for the user query
- Compute cosine similarity scores
- Retrieve the closest matching document
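
The four steps above can be sketched in a few lines. This is a minimal, dependency-free illustration, not the code in `app.py`: `cosine_similarity` is written by hand here instead of using scikit-learn, and `search`, `toy_embed`, and the three sample documents are hypothetical names and data introduced only for this example. `toy_embed` counts words from a tiny vocabulary, so it is a stand-in for a real sentence-embedding model and captures no semantics.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(
    query: str,
    documents: List[str],
    embed: Callable[[str], Vector],
) -> Tuple[str, float]:
    """Return the document most similar to the query, with its score."""
    if not documents:
        raise ValueError("documents must not be empty")
    doc_vectors = [embed(doc) for doc in documents]   # step 1: embed documents
    query_vector = embed(query)                       # step 2: embed the query
    scores = [cosine_similarity(query_vector, v) for v in doc_vectors]  # step 3
    best = max(range(len(documents)), key=scores.__getitem__)           # step 4
    return documents[best], scores[best]


def toy_embed(text: str) -> List[float]:
    """Hypothetical stand-in for an embedding model: word counts over a
    three-word vocabulary. A real model returns dense semantic vectors."""
    vocab = ("bowler", "batsman", "captain")
    words = text.lower().replace("?", " ").replace(".", " ").split()
    return [float(words.count(w)) for w in vocab]


if __name__ == "__main__":
    docs = [
        "The captain leads the team.",
        "A bowler delivers the ball.",
        "The batsman scores runs.",
    ]
    print(search("Who is the best bowler?", docs, toy_embed))
```

In the project itself, `embed` would presumably be backed by a Hugging Face sentence-embedding model, and scikit-learn's `cosine_similarity` could replace the hand-written function. Passing the embedding function as a parameter keeps the retrieval logic independent of the model.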

Project structure:
semantic-search-using-embeddings/
│
├── app.py
├── requirements.txt
├── README.md
└── .gitignore

Installation and usage:
git clone https://github.com/your-username/semantic-search-using-embeddings.git
cd semantic-search-using-embeddings
pip install -r requirements.txt
python app.py

Example query:
query = "Who is the best bowler?"

Example output:
Jasprit Bumrah is widely recognized for his unique bowling action and deadly yorkers in death overs.

Concepts covered:
- Sentence Embeddings
- Semantic Similarity
- Cosine Similarity
- Vector Retrieval
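
For reference, cosine similarity and the retrieval rule can be written as follows. The symbols are notation introduced here, not taken from the project: $\mathbf{q}$ is the query embedding, $\mathbf{d}_1, \dots, \mathbf{d}_n$ are the document embeddings, and $i^{*}$ is the index of the retrieved document.

```latex
\[
\operatorname{sim}(\mathbf{q}, \mathbf{d}_i)
  = \frac{\mathbf{q} \cdot \mathbf{d}_i}
         {\lVert \mathbf{q} \rVert \, \lVert \mathbf{d}_i \rVert},
\qquad
i^{*} = \arg\max_{1 \le i \le n} \operatorname{sim}(\mathbf{q}, \mathbf{d}_i)
\]
```

The score lies in $[-1, 1]$ and depends only on the angle between the two vectors, not on their lengths.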

Future improvements:
- Vector database integration
- Streamlit interface
- PDF/document retrieval
- RAG-based search pipeline

Author: Vrundali Rahangdale