This project uses a Random Forest model to classify music genres.
The app utilizes the GTZAN Dataset from Kaggle, which includes:
- 10 Genres with 100 audio files per genre (30 seconds each).
- Features extracted into two CSV files:
  - 30-second files: contain the mean and variance of audio features for each full clip.
  - 3-second files: split each song into smaller segments, providing 10x the data.
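To make the two CSV layouts concrete, here is a minimal sketch of how a 30-second clip can be cut into ten 3-second segments and how a per-frame feature can be reduced to a mean and a variance. It is not the dataset's actual extraction code: the sample rate, the frame size, the choice of RMS energy as the feature, and all function names are assumptions made for illustration, and a synthetic tone stands in for a real track.

```python
import math
import statistics

SAMPLE_RATE = 22050  # assumed sample rate in Hz; not stated in this README


def split_into_segments(samples, segment_seconds=3, sample_rate=SAMPLE_RATE):
    """Split a list of samples into consecutive, non-overlapping segments."""
    size = segment_seconds * sample_rate
    # Drop any trailing partial segment so every segment has the same length.
    return [samples[i:i + size] for i in range(0, len(samples) - size + 1, size)]


def rms_per_frame(samples, frame_size=2048):
    """Root-mean-square energy of each non-overlapping frame (assumed feature)."""
    starts = range(0, len(samples) - frame_size + 1, frame_size)
    return [
        math.sqrt(sum(x * x for x in samples[s:s + frame_size]) / frame_size)
        for s in starts
    ]


def summarize(samples):
    """Reduce the per-frame values to one mean and one variance, as in a CSV row."""
    rms = rms_per_frame(samples)
    return {"rms_mean": statistics.fmean(rms), "rms_var": statistics.pvariance(rms)}


# A synthetic 30-second 440 Hz tone stands in for a real track.
track = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(30 * SAMPLE_RATE)]

full_clip_row = summarize(track)               # one row, like the 30-second file
segments = split_into_segments(track)          # ten 3-second segments
segment_rows = [summarize(s) for s in segments]  # ten rows, like the 3-second file
```

One 30-second clip yields one row in the first layout and ten rows in the second, which is where the 10x figure comes from.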
The web app allows users to:
- Upload an audio clip.
- Automatically extract relevant audio features from the clip.
- Pass these features to the Random Forest model for genre prediction.
- Display the predicted genre to the user.
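The prediction step above can be sketched roughly as follows, assuming scikit-learn's `RandomForestClassifier`. This is not the app's actual code: the training data is synthetic, the three-genre subset, the feature count, and the `predict_genre` helper are hypothetical, and the real app would pass in features extracted from the uploaded clip instead.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
GENRES = ["blues", "classical", "rock"]  # hypothetical subset of the 10 genres
N_FEATURES = 6                           # hypothetical number of audio features

# Synthetic training data: each genre's features cluster around a different center.
X_train = np.vstack([
    rng.normal(loc=5.0 * i, scale=1.0, size=(50, N_FEATURES))
    for i in range(len(GENRES))
])
y_train = np.repeat(GENRES, 50)  # 50 labels per genre, in the same order as X_train

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X_train, y_train)


def predict_genre(features):
    """Return the predicted genre label for a single feature vector."""
    row = np.asarray(features, dtype=float).reshape(1, -1)  # shape (1, n_features)
    return model.predict(row)[0]


# Stand-in for the features extracted from an uploaded clip.
clip_features = np.full(N_FEATURES, 5.0)
predicted = predict_genre(clip_features)
```

The model expects a 2-D array, so a single clip's feature vector is reshaped to one row before calling `predict`.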
- Create a Python virtual environment: `python -m venv .venv`
- Activate the virtual environment: `.venv\Scripts\activate`
- Install dependencies: `pip install -r requirements.txt`
- Run the web app: `python app.py`
- Open http://127.0.0.1:5000 in your browser to see the app.
- Take a look at Notebook.ipynb to see how it all works!
- The model relies on pre-extracted features, which limits how well it can understand the full complexity of audio data. The features I used may not capture the subtle differences between genres very effectively.
- When I tried adding new features, such as rhythm-based and tone-based ones, they didn’t improve the model much, which suggests that pairing hand-picked features like these with Random Forest has its limits.
- If I had more time, I would try a CNN, which could improve accuracy because it can learn the detailed patterns in audio directly.
