This is a Book Recommendation System that suggests books to users using two main techniques: Popularity-Based Filtering and Collaborative Filtering. It demonstrates the application of recommender system logic with real-world datasets and includes a simple web interface built with Flask.
- Source: Book Recommendation Dataset – Kaggle
- The dataset includes:
Books.csv: Book metadata (title, author, publisher)Users.csv: User demographic informationRatings.csv: User ratings for books
- The dataset was cleaned and filtered for meaningful recommendations (based on minimum number of ratings by users and for books).
pandas,numpy– data manipulation and analysissklearn– building collaborative filtering model (cosine similarity)flask– for creating a simple web interfacepickle– saving and loading precomputed models/data
- Filters books that received at least 200 ratings.
- Sorts them by average rating.
- Displays the top 50 most popular books universally.
- Filters users who rated 150+ books and books with 50+ ratings.
- Constructs a user-book matrix and computes cosine similarity between users.
- Provides personalized recommendations based on similar user behavior.
The following preprocessed models and data are stored as .pkl files and included in the repo via Git LFS:
top_50_books.pkl– for popularity-based recommendationspvt.pkl– user-book pivot tablesimilarity.pkl– cosine similarity matrixbooks.pkl– book metadata used in recommendation display
A simple Flask-based interface is included:
- The homepage shows the top 50 most popular books.
- The recommendation page allows users to input a book and receive similar book suggestions.
- The project uses Git LFS to manage large
.pklfiles — make sure Git LFS is installed to access them. - Further enhancements can include:
- Using matrix factorization (e.g., SVD)
- Adding login/user profiles
- Cloud deployment for public access
- Ali Shayan – GitHub Profile
Feel free to fork the repo, open issues, or suggest improvements!