This week, I implemented gradient descent from scratch to predict product sales from advertising spending across TV, Radio, and Newspaper channels.
This project demonstrates how calculus (gradients) and probability/statistics (error minimization) combine to form the backbone of machine learning optimization.
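For reference, the two standard ingredients used throughout this write-up are the Mean Squared Error cost over $n$ samples and the gradient-descent update with learning rate $\alpha$ (notation is mine, not from the original code):

$$
J(\mathbf{w}, b) = \frac{1}{n}\sum_{i=1}^{n}\bigl(\mathbf{w}^\top\mathbf{x}_i + b - y_i\bigr)^2
$$

$$
\mathbf{w} \leftarrow \mathbf{w} - \alpha\,\nabla_{\mathbf{w}} J,
\qquad
b \leftarrow b - \alpha\,\frac{\partial J}{\partial b}
$$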
Advertising Dataset (UCI Repository)
Each record includes:
- `TV`: advertising budget spent on TV
- `Radio`: advertising budget spent on Radio
- `Newspaper`: advertising budget spent on Newspaper
- `Sales`: resulting sales value
Source: UCI ML Repository
- Loaded the Advertising dataset using pandas.
- Normalized features to improve gradient descent convergence.
- Initialized random weights and bias.
- Implemented gradient descent manually (see the sketch after this list):
  - Predicted sales
  - Computed cost using Mean Squared Error (MSE)
  - Updated weights and bias using gradients
- Plotted cost reduction over iterations.
- Visualized regression line comparing predictions vs actual sales.
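Putting those steps together, here is a minimal sketch of the full loop. The file name `advertising.csv`, the learning rate, and the iteration count are assumptions for illustration, not values from the original run:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Load the Advertising dataset (assumed: a local advertising.csv
# with columns TV, Radio, Newspaper, Sales).
df = pd.read_csv("advertising.csv")
X = df[["TV", "Radio", "Newspaper"]].to_numpy(dtype=float)
y = df["Sales"].to_numpy(dtype=float)

# Normalize features (z-score) so all channels share a comparable
# scale, which keeps the gradient steps well conditioned.
X = (X - X.mean(axis=0)) / X.std(axis=0)

# Initialize small random weights and a zero bias.
rng = np.random.default_rng(seed=42)
w = rng.normal(scale=0.01, size=X.shape[1])
b = 0.0

alpha = 0.01      # learning rate (assumed; tune as needed)
n_iters = 1000    # iteration count (assumed)
n = len(y)
costs = []

for _ in range(n_iters):
    # Predict sales with the current parameters.
    y_hat = X @ w + b

    # Mean Squared Error cost.
    error = y_hat - y
    costs.append((error ** 2).mean())

    # Gradients of the MSE with respect to w and b.
    grad_w = (2 / n) * (X.T @ error)
    grad_b = (2 / n) * error.sum()

    # Step against the gradient.
    w -= alpha * grad_w
    b -= alpha * grad_b

# Plot cost reduction over iterations.
plt.plot(costs)
plt.xlabel("Iteration")
plt.ylabel("MSE cost")
plt.title("Cost vs. Iteration")
plt.show()
```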
The cost vs. iteration plot shows how the model error (cost) decreases with each iteration, confirming that the optimization worked.
This plot compares the model’s predicted sales to the actual sales values from the test dataset.
Each point represents one observation.
The closer the points are to the diagonal dashed line, the better the model’s predictions.
A perfectly accurate model would have all points exactly on that line.
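A minimal sketch of that plot, reusing `X`, `y`, `w`, and `b` from the training sketch above (for simplicity it plots the full dataset rather than a separate test split):

```python
import matplotlib.pyplot as plt

# Predictions from the fitted parameters above.
y_hat = X @ w + b

plt.scatter(y, y_hat, alpha=0.7)

# Diagonal dashed reference line: points on it are perfect predictions.
lims = [min(y.min(), y_hat.min()), max(y.max(), y_hat.max())]
plt.plot(lims, lims, "k--", label="Perfect prediction")

plt.xlabel("Actual sales")
plt.ylabel("Predicted sales")
plt.legend()
plt.show()
```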
- The cost vs iteration plot shows smooth convergence, indicating a well-tuned learning rate.
- TV and Radio had stronger predictive influence on sales than Newspaper.
- Gradient Descent effectively optimized weights to minimize prediction error.
- This exercise reinforces how derivatives (gradients) guide learning in models.
Install all dependencies using:
pip install -r requirements.txt
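The exact contents of `requirements.txt` are not shown here; based on the tools used above, a minimal version would contain:

```
numpy
pandas
matplotlib
```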
