GitHub - Yash49hdfmj/colgate_hackathon_2025

# Sentiment Analysis of Amazon Oral Care Product Reviews

Overview

This project analyzes customer sentiment in Amazon reviews of oral care products using several NLP and machine learning models. Each review is classified as positive, negative, or neutral, and the results are visualized in a Power BI dashboard.

Hackathon Problem Statement

E-commerce platforms receive vast amounts of customer feedback daily. Understanding customer sentiment helps brands:

- Improve products
- Address customer concerns
- Enhance user experience

Challenge

Develop an automated solution that processes thousands of product reviews efficiently and turns them into insights that support decision-making.

Dataset

Source: Amazon oral care product reviews dataset
Size: 1000+ reviews

Attributes

Id: Unique identifier for each review
ProductId: Unique product identifier
UserId: Unique user identifier
ProfileName: Reviewer's name
HelpfulnessNumerator & HelpfulnessDenominator: Vote counts used to measure review helpfulness (typically the number of helpful votes and the total number of votes)
Score: User-provided rating (1-5 stars)
Time: Timestamp of the review
Summary: Short description of the review
Text: Full review text
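
As a minimal sketch of how these attributes might be read, the snippet below parses one review row with Python's standard `csv` module and derives a helpfulness ratio and a calendar date. The sample row is invented for illustration, the helper names are hypothetical, and treating `Time` as Unix epoch seconds is an assumption that should be checked against the actual `amazon_reviews.csv`.

```python
import csv
import io
from datetime import datetime, timezone
from typing import Optional

# Invented sample row using the attribute names listed above (not real data).
SAMPLE_CSV = """Id,ProductId,UserId,ProfileName,HelpfulnessNumerator,HelpfulnessDenominator,Score,Time,Summary,Text
1,B000TEST01,A1EXAMPLE,Sample User,3,4,5,1303862400,Great toothpaste,Leaves my teeth feeling clean.
"""


def helpfulness_ratio(numerator: int, denominator: int) -> Optional[float]:
    """Return numerator / denominator, or None when nobody voted."""
    if denominator == 0:
        return None
    return numerator / denominator


def parse_review(row: dict) -> dict:
    """Convert the string fields of one CSV row into typed values."""
    return {
        "id": int(row["Id"]),
        "product_id": row["ProductId"],
        "score": int(row["Score"]),
        # Assumption: Time holds Unix epoch seconds.
        "date": datetime.fromtimestamp(int(row["Time"]), tz=timezone.utc).date(),
        "helpfulness": helpfulness_ratio(
            int(row["HelpfulnessNumerator"]), int(row["HelpfulnessDenominator"])
        ),
        "text": row["Text"],
    }


reviews = [parse_review(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
print(reviews[0]["date"], reviews[0]["helpfulness"])
```

Returning `None` for reviews with no votes keeps them distinguishable from reviews that were voted on but never marked helpful.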

Sentiment Analysis Models Used

VADER (Valence Aware Dictionary and sEntiment Reasoner) – Rule-based sentiment analysis model.
TextBlob – Lexicon-based approach for polarity detection.
BERT (Bidirectional Encoder Representations from Transformers) – Deep learning-based NLP model for contextual sentiment analysis.
Other ML models (if applicable) – Additional models like Logistic Regression, SVM, or LSTM-based classifiers.
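
The lexicon-based models return a numeric score rather than a label, so a thresholding step is needed to obtain the three classes. The sketch below shows one way to do that; it is not taken from the project notebook. The ±0.05 cutoffs follow the convention commonly used for VADER's compound score. Reusing the same cutoffs for TextBlob polarity and mapping star ratings to labels (1-2 negative, 3 neutral, 4-5 positive) are assumptions made for illustration, and both function names are hypothetical.

```python
def label_from_score(score: float, pos_threshold: float = 0.05,
                     neg_threshold: float = -0.05) -> str:
    """Map a polarity score in [-1, 1] to a sentiment label."""
    if score >= pos_threshold:
        return "positive"
    if score <= neg_threshold:
        return "negative"
    return "neutral"


def label_from_stars(stars: int) -> str:
    """Derive a reference label from the 1-5 star Score (assumed mapping)."""
    if stars <= 2:
        return "negative"
    if stars == 3:
        return "neutral"
    return "positive"


# Invented scores, standing in for real model output.
for score in (0.62, -0.31, 0.0):
    print(score, label_from_score(score))
```

In practice the score would come from the model itself, for example the `compound` value returned by VADER's `SentimentIntensityAnalyzer.polarity_scores` or TextBlob's `sentiment.polarity`.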

Expected Results and Insights

Sentiment Distribution: Percentage of positive, negative, and neutral reviews.
Model Performance Comparison: Accuracy, precision, recall, and F1-score for each model.
Product Sentiment Trends: Time-based sentiment trends for different oral care products.
Common Keywords & Topics: Frequently occurring words and phrases in positive and negative reviews.
Helpfulness Score Analysis: Relationship between sentiment and review helpfulness ratings.
Power BI Visualization: Interactive dashboard to explore sentiment trends, word clouds, and review insights.
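
The first two items above can be sketched in plain Python as follows. This is an illustrative example rather than the project's evaluation code: the labels are invented toy data, the function names are hypothetical, and the use of reference labels (for example, labels derived from the star `Score`) as ground truth is an assumption. Precision, recall, and F1 are computed one-vs-rest for a single class.

```python
from collections import Counter


def sentiment_distribution(labels):
    """Return the percentage of each label in a list of predictions."""
    total = len(labels)
    if total == 0:
        return {}
    return {label: 100.0 * n / total for label, n in Counter(labels).items()}


def classification_metrics(y_true, y_pred, target):
    """Overall accuracy plus precision, recall, and F1 for one target class."""
    pairs = list(zip(y_true, y_pred))
    if not pairs:
        raise ValueError("need at least one labeled example")
    tp = sum(t == target and p == target for t, p in pairs)
    fp = sum(t != target and p == target for t, p in pairs)
    fn = sum(t == target and p != target for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Invented toy labels: reference labels vs. one model's predictions.
y_true = ["positive", "positive", "negative", "neutral", "positive", "negative"]
y_pred = ["positive", "negative", "negative", "neutral", "positive", "positive"]

distribution = sentiment_distribution(y_pred)
metrics = classification_metrics(y_true, y_pred, target="positive")
print(distribution)
print(metrics)
```

Running the same function once per model and per class gives the table needed for a side-by-side model comparison.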

Implementation Steps

Data Preprocessing
- Cleaning, tokenization, and feature extraction from review text.
Sentiment Classification
- Applying multiple models to classify sentiment.
Result Aggregation
- Consolidating outputs from different models for comparison.
Exporting Results
- Saving analysis results into a CSV file for Power BI integration.
Visualization in Power BI
- Creating graphs, charts, and dashboards for data-driven insights.
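
The preprocessing, aggregation, and export steps above could look roughly like the sketch below, which uses only the Python standard library. It is an illustration under stated assumptions, not the notebook's code: the column names (`vader`, `textblob`, `bert`, `consensus`) are hypothetical, the model outputs are invented, and the majority-vote `consensus` column is an addition for illustration that the steps above do not describe. The real `all_model_results.csv` may be laid out differently.

```python
import csv
import html
import io
import re
from collections import Counter

TAG_RE = re.compile(r"<[^>]+>")
TOKEN_RE = re.compile(r"[a-z0-9']+")
MODELS = ("vader", "textblob", "bert")  # hypothetical column names


def clean_and_tokenize(text: str) -> list:
    """Strip HTML tags and entities, lowercase, and split into word tokens."""
    no_tags = TAG_RE.sub(" ", text)
    return TOKEN_RE.findall(html.unescape(no_tags).lower())


def majority_vote(labels: dict, fallback: str = "neutral") -> str:
    """Pick the label most models agree on; use the fallback on a tie."""
    counts = Counter(labels.values()).most_common()
    if not counts:
        return fallback
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return fallback
    return counts[0][0]


def export_results(rows: list, out_file) -> None:
    """Write one CSV row per review: each model's label plus the consensus."""
    writer = csv.DictWriter(out_file, fieldnames=["Id", *MODELS, "consensus"])
    writer.writeheader()
    for row in rows:
        model_labels = {m: row[m] for m in MODELS}
        writer.writerow(
            {"Id": row["Id"], **model_labels,
             "consensus": majority_vote(model_labels)}
        )


# Invented model outputs for two reviews.
rows = [
    {"Id": 1, "vader": "positive", "textblob": "positive", "bert": "neutral"},
    {"Id": 2, "vader": "negative", "textblob": "positive", "bert": "neutral"},
]
buffer = io.StringIO()
export_results(rows, buffer)
print(clean_and_tokenize("Loved it!<br />Teeth feel GREAT &amp; clean."))
print(buffer.getvalue())
```

To write a file for Power BI instead of an in-memory buffer, pass `open("all_model_results.csv", "w", newline="")` as `out_file`.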

Technologies Used

Python: Data processing and sentiment analysis
NLTK, TextBlob, Transformers (Hugging Face): NLP libraries
Pandas, NumPy: Data handling and manipulation
Matplotlib, Seaborn: Data visualization in Python
Power BI: Interactive dashboard creation

How to Use

Run the Jupyter Notebook or Colab script to perform sentiment analysis.
Generate the all_model_results.csv file.
Load the CSV file into Power BI.
Explore insights using the Power BI dashboard.

Conclusion

This project provides a scalable approach for analyzing customer sentiment with multiple NLP models. The insights help businesses:

- Understand user feedback
- Improve product quality
- Enhance customer satisfaction

Future Enhancements

- Advanced deep learning models
- Real-time sentiment tracking
- Multilingual sentiment analysis

Repository Files

Code Brigade_new_ppt (1).pptx
Colgate_sentiment (3).ipynb
README.md
Report.pdf
all_model_results.csv
amazon_reviews.csv
colgate_retail_store_data_numeric_ids.csv
plots - Google Docs.pdf
ps-2.pbix
ps3_new.pbix
sahil yadav.pbix
scores - Google Docs.pdf
