This project investigates the striking difference in serious accident rates between Geelong and Melbourne using real crash data from the Government of Victoria open dataset (2012–2024).
Through data cleaning, EDA, and machine learning, we uncover the key factors contributing to Geelong's higher accident severity.
Dataset documentation and reference can be found at: https://opendata.transport.vic.gov.au/dataset/victoria-road-crash-data
Click here to open the full report, or here for the PDF file.
- Goal: Investigate why the rate of serious accidents differs significantly between cities, particularly Geelong and Melbourne, and uncover the underlying factors driving these disparities.
- Methodology:
  - Exploratory Data Analysis (EDA)
  - Accident-level data aggregation
  - Feature engineering
  - Random Forest classification to identify key risk factors
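The methodology steps above can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the project's actual notebook code: every column name (`accident_id`, `vehicle_age`, `vehicle_weight_kg`, `speed_zone`, `hour`, `serious`), the 80 km/h threshold, and the choice of commuting hours are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in for vehicle-level crash records (hypothetical columns).
rng = np.random.default_rng(0)
n_rows = 600
vehicles = pd.DataFrame({
    "accident_id": rng.integers(0, 300, size=n_rows),
    "vehicle_age": rng.integers(0, 30, size=n_rows),
    "vehicle_weight_kg": rng.normal(1500, 300, size=n_rows),
})

# Accident-level aggregation: collapse vehicle rows to one row per accident.
accidents = vehicles.groupby("accident_id").agg(
    mean_vehicle_age=("vehicle_age", "mean"),
    max_vehicle_weight_kg=("vehicle_weight_kg", "max"),
    n_vehicles=("vehicle_age", "size"),
).reset_index()

# Synthetic accident-level attributes and a random severity label.
accidents["speed_zone"] = rng.choice([40, 60, 80, 100], size=len(accidents))
accidents["hour"] = rng.integers(0, 24, size=len(accidents))
accidents["serious"] = rng.integers(0, 2, size=len(accidents))

# Feature engineering: binary flags (thresholds and hours are assumptions).
commuting_hours = [7, 8, 9, 16, 17, 18]
accidents["high_speed_zone"] = (accidents["speed_zone"] >= 80).astype(int)
accidents["non_commuting_hour"] = (
    ~accidents["hour"].isin(commuting_hours)
).astype(int)

# Random Forest classification, then rank features by importance.
features = [
    "mean_vehicle_age",
    "max_vehicle_weight_kg",
    "n_vehicles",
    "high_speed_zone",
    "non_commuting_hour",
]
model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(accidents[features], accidents["serious"])

importances = pd.Series(
    model.feature_importances_, index=features
).sort_values(ascending=False)
print(importances)
```

Because the labels here are random, the printed ranking carries no meaning; with the real data, the same ranking is what would point to candidate risk factors.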
- Findings:
  - Heavier and older vehicles in Geelong
  - More serious accidents in high-speed zones and poor lighting conditions
  - Geelong accidents peak during non-commuting hours when headlights are often off
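Comparisons like the ones above usually come down to a grouped rate. The sketch below shows one way to compute the share of serious accidents per city and lighting condition with pandas. The six rows are made-up toy values used only to demonstrate the calculation, not results from the dataset, and the column names (`city`, `light_condition`, `serious`) are hypothetical.

```python
import pandas as pd

# Toy records for illustration only; not taken from the real crash data.
crashes = pd.DataFrame({
    "city": ["Geelong", "Geelong", "Geelong",
             "Melbourne", "Melbourne", "Melbourne"],
    "light_condition": ["Dark", "Day", "Dark", "Day", "Day", "Dark"],
    "serious": [1, 0, 1, 0, 1, 0],  # 1 = serious accident, 0 = otherwise
})

# The mean of a 0/1 flag within each group is the serious-accident rate.
serious_rate = (
    crashes.groupby(["city", "light_condition"])["serious"]
    .mean()
    .unstack("light_condition")
)
print(serious_rate)
```

The same pattern would extend to other groupings, such as speed zone or hour of day, by changing the `groupby` keys.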
- Python, Pandas, Seaborn, Matplotlib, scikit-learn
- Jupyter Notebook for analysis and visualization
- Data from VicRoads and Victoria State Government
- Guided by a question-first, insight-driven approach