This project implements an ETL (Extract, Transform, Load) pipeline using Python, MySQL, and Pandas to fetch and store track data from Spotify Web API. The extracted data is processed and stored in a CSV file and a MySQL database for further analysis.
✅ Extracts track details from Spotify Web API using track URLs.
✅ Transforms and cleans data using Pandas.
✅ Saves track details as a CSV file for easy access.
✅ Loads processed data into a MySQL database for structured storage.
- Python (ETL scripting)
- Spotify Web API (Data Extraction)
- Pandas (Data Transformation, CSV export)
- MySQL (Data Storage)
- Spotipy (Python client for Spotify API)
📦 Spotify_ETL
├── 📄 playlist.txt # List of Spotify track URLs
├── 📄 Spotify_CSV.py # Extracts data & saves to CSV
├── 📄 Spotify_MySQL.py # Loads data into MySQL
├── 📄 trackData.csv # Processed track data in CSV format
├── 📄 README.md # Project Documentation
git clone https://github.com/your-username/Spotify_ETL.git
cd Spotify_ETLpip install pandas mysql-connector-python spotipy- Create an account at Spotify Developer
- Generate Client ID and Client Secret
- Add them to
Spotify_CSV.pyandSpotify_MySQL.py
sp = spotipy.Spotify(auth_manager=SpotifyClientCredentials(
client_id='your_client_id',
client_secret='your_client_secret'
))CREATE DATABASE spotifyDb;
USE spotifyDb;
CREATE TABLE tracks (
track_id INT AUTO_INCREMENT PRIMARY KEY,
track_name VARCHAR(255),
album_name VARCHAR(255),
release_date DATE,
artists_name VARCHAR(255),
popularity INT,
duration_in_mins FLOAT
);- Extract & Save Data to CSV
python Spotify_CSV.py
- Load Data into MySQL
python Spotify_MySQL.py
| track_name | album_name | release_date | artists_name | popularity | duration (mins) |
|---|---|---|---|---|---|
| Perfect | ÷ (Deluxe) | 2017-03-03 | Ed Sheeran | 84 | 4.39 |
| Pouraadalaam | M.S.Dhoni - The Untold Story | 2016-09-13 | Armaan Malik | 38 | 4.27 |