ankitsinghh007/Redrob_Hackathon

Resume Matching Engine

Redrob AI Campus Hackathon — Individual Competition


Problem Statement

This project implements a Resume Matching Engine that:

  • Normalizes noisy resume skill data from 10 Indian university students
  • Computes TF-IDF vectors for resumes
  • Builds binary vectors for 3 Job Descriptions from Korean tech companies
  • Calculates cosine similarity between resumes and JDs
  • Outputs the Top 3 matching candidates per JD

How It Works

Step 1 — Skill Normalization

  • Split raw skills on commas
  • Convert to lowercase
  • Match multi-word phrases before single tokens (sorted by length descending)
  • Apply SKILL_ALIASES mapping exactly as provided
  • Discard tokens not present in the alias map
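The normalization rules above can be sketched roughly as follows. This is a minimal illustration, not the contents of resume_matcher.py: the alias entries shown here and the function name normalize_skills are assumptions made for the example, and the real SKILL_ALIASES map is the one supplied by the hackathon.

```python
# Hypothetical alias map for illustration only; the real SKILL_ALIASES
# is provided by the hackathon and must be used unmodified.
SKILL_ALIASES = {
    "machine learning": "machine_learning",
    "ml": "machine_learning",
    "python": "python",
    "py": "python",
    "react js": "react",
    "react": "react",
}


def normalize_skills(raw, aliases=SKILL_ALIASES):
    """Map a raw comma-separated skill string to canonical skill names."""
    # Try longer phrases first so "react js" wins over "react".
    keys = sorted(aliases, key=lambda k: (-len(k.split()), -len(k)))
    found = []
    for chunk in raw.split(","):
        words = chunk.lower().split()
        i = 0
        while i < len(words):
            for key in keys:
                parts = key.split()
                if words[i:i + len(parts)] == parts:
                    found.append(aliases[key])
                    i += len(parts)
                    break
            else:
                # No alias starts at this word: discard it.
                i += 1
    return found


print(normalize_skills("Machine Learning, PY, React JS, Excel"))
```

Unknown tokens such as "Excel" are dropped because they have no entry in the alias map. Duplicates are left in place at this stage; they are removed in Step 2.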

Step 2 — Deduplication

  • Each canonical skill appears only once per resume

Step 3 — Vocabulary Construction

  • Built from normalized + deduplicated resume skills only
  • Sorted alphabetically (48 unique terms)
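Steps 2 and 3 could look something like the sketch below. The helper names dedupe and build_vocabulary and the sample data are hypothetical; the sketch assumes each resume has already been reduced to a list of canonical skill names.

```python
def dedupe(skills):
    """Keep the first occurrence of each canonical skill, preserving order."""
    return list(dict.fromkeys(skills))


def build_vocabulary(resumes):
    """Alphabetically sorted list of every skill seen in any resume."""
    return sorted({skill for skills in resumes.values() for skill in skills})


# Made-up sample data, not the hackathon resumes.
resumes = {
    "Candidate A": dedupe(["python", "sql", "python"]),
    "Candidate B": dedupe(["react", "python"]),
}
vocab = build_vocabulary(resumes)
print(vocab)
```

Because the vocabulary comes from resumes only, a skill that appears in a JD but in no resume has no position in the vocabulary.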

Step 4 — TF-IDF Vectors (Resumes only)

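TF  = 1 / N               (for each skill in the resume; N = number of unique skills in that resume; 0 for absent skills)
IDF = ln(10 / df)         (natural log, no smoothing; 10 = number of resumes, df = number of resumes containing the skill)
TF-IDF = TF × IDF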
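As a rough sketch of these formulas, the function below computes one TF-IDF vector per resume over a shared vocabulary. The name tfidf_vectors and the sample data are assumptions for illustration. The sketch uses len(resumes) as the document count where the formula above uses 10, so that it also works on the small sample.

```python
import math


def tfidf_vectors(resumes, vocab):
    """Return {name: TF-IDF vector} with TF = 1/N and IDF = ln(n_docs/df)."""
    n_docs = len(resumes)  # 10 in the hackathon data set
    # Document frequency: number of resumes that contain each term.
    df = {term: sum(1 for skills in resumes.values() if term in skills)
          for term in vocab}
    vectors = {}
    for name, skills in resumes.items():
        tf = 1 / len(skills) if skills else 0.0
        vectors[name] = [
            tf * math.log(n_docs / df[term]) if term in skills else 0.0
            for term in vocab
        ]
    return vectors


# Made-up sample data with already normalized, deduplicated skills.
resumes = {"Candidate A": ["python", "sql"], "Candidate B": ["python"]}
vocab = ["python", "sql"]
vecs = tfidf_vectors(resumes, vocab)
print(vecs)
```

Note that without smoothing, a skill present in every resume gets IDF = ln(1) = 0, so it contributes nothing to the similarity score.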

Step 5 — JD Binary Vectors

  • 1 if skill present in JD, 0 if not
  • Built over same shared vocabulary
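A JD vector of this kind can be built in a few lines. The function name jd_binary_vector and the sample skills below are hypothetical, and the sketch assumes JD skills are already in canonical form.

```python
def jd_binary_vector(jd_skills, vocab):
    """1 where the vocabulary term is required by the JD, 0 elsewhere."""
    required = set(jd_skills)
    return [1 if term in required else 0 for term in vocab]


# "docker" is not in this sample vocabulary, so it has no slot in the vector.
vocab = ["python", "react", "sql"]
jd_vec = jd_binary_vector(["python", "docker"], vocab)
print(jd_vec)
```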

Step 6 — Cosine Similarity & Ranking

Cosine(A, B) = (A · B) / (|A| × |B|)
  • A = Resume TF-IDF vector
  • B = JD binary vector
  • Top 3 ranked per JD, ties broken alphabetically
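The scoring and ranking step might be sketched as below. The names cosine and top_matches and the sample vectors are made up for illustration. Two details are assumptions not stated in this README: a zero-length vector scores 0.0 rather than raising an error, and ties are detected on the rounded score.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # assumption: treat an all-zero vector as no match
    return dot / (norm_a * norm_b)


def top_matches(resume_vecs, jd_vec, k=3):
    """Top-k (name, score) pairs, highest score first, ties by name."""
    scored = [(name, round(cosine(vec, jd_vec), 2))
              for name, vec in resume_vecs.items()]
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]


# Made-up vectors, not the hackathon data.
resume_vecs = {
    "Zoe": [1.0, 0.0],
    "Amy": [2.0, 0.0],
    "Bob": [0.0, 1.0],
    "Cal": [1.0, 1.0],
}
ranking = top_matches(resume_vecs, [1, 0])
print(ranking)
```

Sorting on the key (-score, name) gives descending scores and alphabetical order among equal scores in a single pass.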

Results

JD-1 — Kakao (ML Engineer)
Sneha Patel (0.57), Karan Mehta (0.53), Arjun Sharma (0.40)

JD-2 — Naver (Backend Engineer)
Rahul Gupta (0.81), Ananya Krishnan (0.28), Deepika Rao (0.19)

JD-3 — Line (Frontend Engineer)
Aditya Kumar (0.67), Priya Nair (0.58), Ananya Krishnan (0.35)

Tech Stack

Item                 Detail
Language             Python
Libraries            Standard library only (math)
External Libraries   None (not allowed)
AI Tool Used         Redrob AI

File Structure

├── resume_matcher.py   # Main solution file
└── README.md           # This file

How to Run

Google Colab:

  1. Open colab.research.google.com
  2. Create a new notebook
  3. Paste resume_matcher.py contents into a cell
  4. Press Shift + Enter to run

VS Code / Terminal:

python resume_matcher.py

No installations needed — uses Python standard library only.


Rules Followed

  • ✅ Only standard library used (math) — no numpy, pandas, sklearn
  • ✅ SKILL_ALIASES used exactly as provided, not modified
  • ✅ Multi-word phrases matched before single tokens
  • ✅ Vocabulary built from resume skills only
  • ✅ TF-IDF computed for resumes only, not JDs
  • ✅ IDF = ln(10/df), natural log, no smoothing
  • ✅ Cosine similarity uses the Euclidean norms of the resume and JD vectors
  • ✅ Scores rounded to 2 decimal places
  • ✅ Ties broken alphabetically by candidate name

Redrob AI Campus Hackathon · Powered by McKinley Rice
