Skip to content

smohan-21/HR-Employee-Attrition-SQL

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

HR-Employee-Attrition-SQL

HR Employee Attrition Analysis Project using SQL

Project Overview

This project is a complete end-to-end HR Employee Attrition analysis using MySQL. The dataset contains detailed employee information, and this project demonstrates skills in:

  • Database design and schema creation
  • Data loading and validation
  • Data cleaning and preprocessing
  • Basic, intermediate, and advanced analytical queries

The goal of this project is to showcase SQL expertise and analytical thinking suitable for entry-level data analyst roles.


Dataset

  • Source file: hr_employee_attrition.csv
  • File format: CSV (UTF-8)
  • Contains columns such as Age, Department, JobRole, MonthlyIncome, Attrition, OverTime, YearsAtCompany, and many others.

Project Structure

HR-Employee-Attrition-SQL/
│
├── dataset/
│   └── hr_employee_attrition.csv
│
├── schema/
│   └── create_schema.sql
│
├── data_loading/
│   ├── load_data.sql          # Documentation of CSV import
│   └── validate_data.sql      # Validation queries post-import
│
├── cleaning/
│   └── data_cleaning.sql      # Data cleaning & preprocessing queries
│
├── analysis/
│   ├── basic_queries.sql      # Basic analysis queries
│   ├── intermediate_queries.sql # Intermediate analysis queries
│   └── advanced_queries.sql   # Advanced analysis queries
│
└── README.md                 # Project documentation

Step 1: Database Creation

  • Database: hr_analytics
  • Table: employee_attrition
  • Primary Key: Employee_id (AUTO_INCREMENT)
  • Columns include all HR features such as Age, Attrition, Department, JobRole, MonthlyIncome, OverTime, YearsAtCompany, etc.
  • Schema stored in: schema/create_schema.sql

Step 2: Data Loading

  • CSV file imported using MySQL Workbench Table Data Import Wizard

  • Header row included, Employee_id auto-generated

  • Post-import validation queries ensure:

    • Correct row count
    • Auto-increment integrity
    • No NULL or invalid ENUM values
  • Files:

    • data_loading/load_data.sql (documentation)
    • data_loading/validate_data.sql (validation queries)

Step 3: Data Cleaning & Preprocessing

  • Detect duplicates excluding Employee_id
  • Check NULL values across columns
  • Standardize categorical data (Attrition, Gender, OverTime, BusinessTravel, MaritalStatus, Department)
  • Validate numeric ranges (Age, YearsAtCompany, MonthlyIncome, PercentSalaryHike)
  • Logical consistency checks (e.g., YearsInCurrentRole ≤ YearsAtCompany)
  • Derived columns: age groups, income bands, attrition flags
  • File: cleaning/data_cleaning.sql

Step 4: Analysis Queries

Basic Queries

  • Total employees
  • Attrition counts & percentages
  • Department-wise counts
  • Average income by job role
  • Distinct job roles
  • File: analysis/basic_queries.sql

Intermediate Queries

  • Attrition rate by department
  • Top 5 job roles with highest attrition
  • Average income by Attrition status
  • Employees with >10 years who left
  • Average age by job level
  • File: analysis/intermediate_queries.sql

Advanced Queries

  • Department ranking by attrition rate
  • Overtime correlation with attrition
  • High-risk employee segments based on multiple factors
  • File: analysis/advanced_queries.sql

Tools & Technologies

  • MySQL 8.x
  • MySQL Workbench
  • GitHub (for version control and project publishing)
  • Dataset: HR Employee Attrition CSV

Key Skills Demonstrated

  • Database design & schema creation
  • Data loading, validation, and cleaning
  • Advanced SQL queries: aggregates, window functions, conditional logic
  • Analytical thinking for HR attrition insights
  • Professional GitHub project structure

How to Use This Project

  1. Clone the repository from GitHub.
  2. Use schema/create_schema.sql to create the database and table.
  3. Load the CSV using MySQL Workbench as documented in data_loading/load_data.sql.
  4. Run data_loading/validate_data.sql to verify successful import.
  5. Execute queries from cleaning/data_cleaning.sql for data quality checks.
  6. Perform analyses using analysis/*.sql files.

Author: S Mohan

About

"End-to-end HR employee attrition analysis using MySQL with advanced SQL queries"

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors