
JobHunter


Automated job monitoring tool for work-study (alternance) opportunities in Systems & Network Administration. Aggregates multiple sources, filters offers by criteria, and provides an interactive tracking dashboard.


Overview

JobHunter is a self-hosted job search assistant that automates the tedious part of job hunting. It collects offers from official APIs, job boards, and company career pages, filters them against predefined criteria, and presents everything in a web dashboard with full application tracking.

Key principles:

  • Semi-automated — The tool finds and filters offers; the user decides when and where to apply
  • Privacy-first — Runs locally with SQLite, no data sent to external services (except Claude API for cover letter generation)
  • Multi-source — Aggregates France Travail API, Welcome to the Jungle, Indeed, and company career sites
  • Trackable — Built-in application tracker with status management, follow-up reminders, and statistics

What this tool does NOT do:

  • It does not send applications automatically
  • It does not log into user accounts on job platforms
  • It does not store sensitive data online

Features

Job Aggregation

  • Multi-source collection via official APIs and web scraping
  • Keyword-based filtering (title and description matching)
  • Location, contract type, and education level filters
  • Cross-source duplicate detection
  • Daily automated execution via scheduler
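
Cross-source duplicate detection can be done by hashing a normalized (title, company) pair. The helper below is a minimal sketch; the field names are illustrative and the real schema in `models.py` may differ:

```python
import hashlib
import re

def offer_fingerprint(title: str, company: str) -> str:
    """Build a stable fingerprint from a normalized title/company pair."""
    def normalize(s: str) -> str:
        return re.sub(r"\s+", " ", s.strip().lower())
    key = f"{normalize(title)}|{normalize(company)}"
    return hashlib.sha256(key.encode("utf-8")).hexdigest()

def drop_duplicates(offers: list[dict]) -> list[dict]:
    """Keep the first occurrence of each (title, company) across all sources."""
    seen, unique = set(), []
    for offer in offers:
        fp = offer_fingerprint(offer["title"], offer["company"])
        if fp not in seen:
            seen.add(fp)
            unique.append(offer)
    return unique
```

Normalizing case and whitespace catches the common case where two boards list the same offer with slightly different formatting.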

Application Tracker

  • Interactive tracking table with per-offer status management
  • Checkbox columns: CV sent, follow-up done
  • Date fields: date sent, follow-up date
  • Status workflow: New → Applied → Followed up → Interview → Accepted / Rejected / No response
  • Free-text notes per offer
  • Filters by status, source, company, and date range
  • Column sorting and full-text search
  • CSV export

Cover Letter Generation

  • AI-generated draft per offer via Anthropic Claude API
  • Personalized based on user CV + job description
  • Saved in database to avoid regeneration
  • One-click copy to clipboard
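
A minimal sketch of the Claude integration, assuming the official `anthropic` Python SDK; the model id and prompt wording are placeholders, not what `services/cover_letter.py` necessarily uses:

```python
def build_prompt(cv_text: str, offer: dict) -> str:
    """Combine the user's CV and the offer into a single generation prompt."""
    return (
        "Write a short, personalized cover letter in French for this job offer.\n\n"
        f"Job title: {offer['title']}\n"
        f"Company: {offer['company']}\n"
        f"Description: {offer['description']}\n\n"
        f"Candidate CV:\n{cv_text}"
    )

def generate_cover_letter(cv_text: str, offer: dict,
                          model: str = "claude-sonnet-4-20250514") -> str:
    # Lazy import; requires ANTHROPIC_API_KEY in the environment.
    import anthropic
    client = anthropic.Anthropic()
    msg = client.messages.create(
        model=model,
        max_tokens=800,
        messages=[{"role": "user", "content": build_prompt(cv_text, offer)}],
    )
    return msg.content[0].text
```

Persisting the returned draft in the `cover_letter_drafts` table is what avoids paying for regeneration on every page view.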

Statistics

  • Total offers found and new offers this week
  • CVs sent, follow-ups done, interviews obtained
  • Response rate tracking

Tech Stack

| Layer             | Technology                                |
|-------------------|-------------------------------------------|
| Backend           | Flask with Python 3.11+                   |
| Database          | SQLite (lightweight, no server required)  |
| Scraping          | Requests + BeautifulSoup                  |
| Advanced scraping | Selenium (JS-rendered sites)              |
| Scheduler         | APScheduler                               |
| Frontend          | HTML/CSS/JS with Jinja2 + DataTables.js   |
| AI                | Anthropic API (Claude)                    |

Job Sources

| Source                           | Method                     | Priority       |
|----------------------------------|----------------------------|----------------|
| France Travail                   | Official REST API (OAuth2) | High           |
| Welcome to the Jungle            | Web scraping               | High           |
| Company career sites (see below) | Custom scrapers            | High           |
| Indeed                           | Web scraping               | Medium         |
| LinkedIn                         | Public listings scraping   | Low / Optional |
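
The France Travail API uses an OAuth2 client-credentials flow. The sketch below shows the general shape; the endpoint URL, `realm` parameter, and scope names are assumptions to be checked against the francetravail.io developer documentation:

```python
# Illustrative endpoint; the real URL comes from the francetravail.io portal.
TOKEN_URL = "https://entreprise.francetravail.fr/connexion/oauth2/access_token"

def build_token_payload(client_id: str, client_secret: str) -> dict:
    """OAuth2 client-credentials payload for the offers API."""
    return {
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": "api_offresdemploiv2 o2dsoffre",  # assumed scope names
    }

def fetch_access_token(client_id: str, client_secret: str) -> str:
    # Lazy import so the payload helper runs without the dependency installed.
    import requests
    resp = requests.post(
        TOKEN_URL,
        params={"realm": "/partenaire"},  # assumed realm parameter
        data=build_token_payload(client_id, client_secret),
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["access_token"]
```

The returned token is short-lived, so a real scraper would cache it and refresh on expiry rather than requesting one per search call.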

Target Company Career Sites

| Company          | Career Page                                        | Sector              |
|------------------|----------------------------------------------------|---------------------|
| Thales           | https://careers.thalesgroup.com                    | Defense / Aerospace |
| Safran           | https://www.safran-group.com/fr/emplois            | Aerospace           |
| Capgemini        | https://www.capgemini.com/fr-fr/carrieres          | IT Services         |
| Sopra Steria     | https://www.soprasteria.com/rejoignez-nous         | IT Services         |
| Atos / Eviden    | https://jobs.atos.net                              | IT Services         |
| Orange           | https://orange.jobs                                | Telecom             |
| Airbus           | https://www.airbus.com/en/careers                  | Aerospace           |
| CGI              | https://www.cgi.com/france/fr-fr/carrieres         | IT Services         |
| Alten            | https://www.alten.com/rejoignez-nous               | IT Services         |
| Bouygues Telecom | https://www.bouyguestelecom.fr/groupe/recrutement  | Telecom             |

Search Criteria

Keywords

KEYWORDS = [
    "administrateur systèmes et réseaux",
    "administrateur systèmes",
    "administrateur réseaux",
    "admin sys",
    "admin réseau",
    "technicien systèmes et réseaux",
    "ingénieur systèmes",
    "ingénieur infrastructure",
    "technicien infrastructure",
    "technicien informatique",
    "administrateur infrastructure",
    "ingénieur réseaux",
    "sysadmin",
]

Filters

FILTERS = {
    "contract_type": "alternance",
    "location": "Île-de-France",
    "departments": ["75", "78", "91", "92", "93", "94", "95", "77"],
    "min_level": "bac+3",
    "max_level": "bac+5",
    "duration": "24 months",
}
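
A sketch of how the filter engine could apply these criteria, as pure functions over an offer dict. The offer field names (`title`, `description`, `contract_type`, `department`) are assumptions, and only an excerpt of the keyword list is shown:

```python
KEYWORDS = ["administrateur systèmes", "administrateur réseaux", "sysadmin"]  # excerpt

FILTERS = {
    "contract_type": "alternance",
    "departments": ["75", "78", "91", "92", "93", "94", "95", "77"],
}

def matches_keywords(offer: dict, keywords=KEYWORDS) -> bool:
    """True if any keyword appears in the title or description."""
    text = f"{offer.get('title', '')} {offer.get('description', '')}".lower()
    return any(kw in text for kw in keywords)

def passes_filters(offer: dict, filters=FILTERS) -> bool:
    """Check contract type and department against the configured filters."""
    if offer.get("contract_type", "").lower() != filters["contract_type"]:
        return False
    return offer.get("department") in filters["departments"]

def keep_offer(offer: dict) -> bool:
    return matches_keywords(offer) and passes_filters(offer)
```

Keeping the predicates pure (no database or network access) makes them trivial to unit-test in `tests/test_filters.py`.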

Target Companies (Bonus Scoring)

Offers from major companies receive a higher relevance score:

TARGET_COMPANIES = [
    "Thales", "Safran", "Capgemini", "Sopra Steria", "Atos", "Eviden",
    "Orange", "Airbus", "CGI", "Alten", "Bouygues Telecom", "SFR",
    "Société Générale", "BNP Paribas", "AXA", "Engie", "EDF",
    "Dassault", "Naval Group", "SNCF", "RATP", "Renault", "PSA",
]
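
The company bonus could be folded into a relevance score roughly like this; the weights are illustrative, and only an excerpt of the company list is shown:

```python
TARGET_COMPANIES = ["Thales", "Safran", "Capgemini", "Orange"]  # excerpt

def relevance_score(offer: dict, keyword_hits: int,
                    company_bonus: int = 20) -> int:
    """Base score from keyword hits, plus a flat bonus for target companies."""
    score = keyword_hits * 10
    company = offer.get("company", "").lower()
    # Substring match so "Thales Services" still earns the bonus.
    if any(target.lower() in company for target in TARGET_COMPANIES):
        score += company_bonus
    return score
```

Sorting the dashboard table by this score surfaces the most relevant offers first.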

Architecture

┌─────────────────────────────────────────────────────┐
│                  SCHEDULER (APScheduler)             │
│               Daily execution at 8:00 AM             │
└──────────────────────┬──────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────┐
│                   COLLECTORS                        │
│                                                     │
│  ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌────────┐ │
│  │ France   │ │ Welcome  │ │  Career  │ │ Indeed │ │
│  │ Travail  │ │ to the   │ │  Sites   │ │        │ │
│  │  (API)   │ │ Jungle   │ │ (custom) │ │        │ │
│  └────┬─────┘ └────┬─────┘ └────┬─────┘ └───┬────┘ │
└───────┼─────────────┼───────────┼────────────┼──────┘
        │             │           │            │
        ▼             ▼           ▼            ▼
┌─────────────────────────────────────────────────────┐
│                   FILTER ENGINE                     │
│                                                     │
│  Keywords · Location · Contract type · Dedup        │
└──────────────────────┬──────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────┐
│                 DATABASE (SQLite)                   │
│                                                     │
│  offers ──── tracking ──── cover_letter_drafts      │
└──────────────────────┬──────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────┐
│                WEB DASHBOARD (Flask)                │
│                                                     │
│  Offer table · Tracking · Filters · Stats · Export  │
│                                                     │
│  http://localhost:5000                              │
└─────────────────────────────────────────────────────┘
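
The daily 8:00 AM trigger in the diagram could be wired with APScheduler roughly as follows. The collector wiring is a sketch, and APScheduler is imported lazily so the pure pipeline function runs on its own:

```python
def run_daily_collection(collectors):
    """Run each collector callable and concatenate the offers it returns."""
    offers = []
    for collect in collectors:
        offers.extend(collect())
    # Real pipeline: filter, deduplicate, then insert new rows into SQLite.
    return offers

def start_scheduler(collectors):
    # Lazy import so this sketch has no hard dependency at import time.
    from apscheduler.schedulers.background import BackgroundScheduler
    scheduler = BackgroundScheduler()
    scheduler.add_job(lambda: run_daily_collection(collectors),
                      "cron", hour=8, minute=0, id="daily_scrape")
    scheduler.start()
    return scheduler
```

`BackgroundScheduler` runs in-process alongside Flask, which fits the self-hosted, no-external-services design.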

Project Structure

JobHunter/
├── app/
│   ├── __init__.py              # Flask initialization
│   ├── routes.py                # Dashboard routes
│   ├── models.py                # SQLite models (offers, tracking)
│   ├── database.py              # Database connection and init
│   ├── scrapers/
│   │   ├── __init__.py
│   │   ├── base_scraper.py      # Abstract base class
│   │   ├── france_travail.py    # France Travail API
│   │   ├── wttj.py              # Welcome to the Jungle
│   │   ├── indeed.py            # Indeed
│   │   ├── linkedin.py          # LinkedIn (optional)
│   │   └── career_sites/
│   │       ├── __init__.py
│   │       ├── thales.py
│   │       ├── safran.py
│   │       ├── capgemini.py
│   │       └── ...
│   ├── services/
│   │   ├── __init__.py
│   │   ├── filter_engine.py     # Offer filtering
│   │   ├── deduplication.py     # Duplicate detection
│   │   ├── cover_letter.py      # Claude API integration
│   │   └── scheduler.py         # Task scheduling
│   ├── templates/
│   │   ├── base.html
│   │   ├── dashboard.html
│   │   ├── offer_detail.html
│   │   └── stats.html
│   └── static/
│       ├── css/
│       │   └── style.css
│       └── js/
│           └── dashboard.js
├── data/
│   ├── jobhunter.db             # SQLite database (gitignored)
│   └── cv.txt                   # CV for cover letter generation
├── scripts/
│   ├── run_scrapers.py          # Manual scraper execution
│   └── init_db.py               # Database initialization
├── tests/
│   ├── test_scrapers.py
│   └── test_filters.py
├── .env.example
├── .gitignore
├── config.py
├── requirements.txt
├── ROADMAP.md
└── README.md

Installation

Prerequisites

  • Python 3.11+ and pip
  • Git
  • France Travail API credentials (client ID and secret)
  • Anthropic API key (for cover letter generation)

Setup

git clone https://github.com/Kiwi6212/JobHunter.git
cd JobHunter

python -m venv venv
# Windows: venv\Scripts\activate
# macOS/Linux: source venv/bin/activate

pip install -r requirements.txt
cp .env.example .env

Edit .env with your API credentials:

FRANCE_TRAVAIL_CLIENT_ID=your_client_id
FRANCE_TRAVAIL_CLIENT_SECRET=your_client_secret
ANTHROPIC_API_KEY=your_api_key
FLASK_SECRET_KEY=a_random_secret_key
FLASK_DEBUG=true

Running

# Initialize the database
python scripts/init_db.py

# Run scrapers manually
python scripts/run_scrapers.py

# Launch the dashboard
python -m flask run

The dashboard is available at http://localhost:5000.


Backup & Restore

Automatic daily backup (cron)

scripts/backup.py copies data/jobhunter.db to a timestamped file in /home/ubuntu/backups/ and keeps only the 7 most recent backups, deleting older ones.

Add this line to your crontab (crontab -e) to run the backup every night at 02:00:

0 2 * * * cd /home/ubuntu/JobHunter && /home/ubuntu/JobHunter/venv/bin/python scripts/backup.py >> /home/ubuntu/backups/backup.log 2>&1
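
For reference, the core of such a backup script can be sketched in a few lines. The paths, filename pattern, and retention count mirror the description above; the real scripts/backup.py may differ:

```python
import os
import shutil
from datetime import datetime
from pathlib import Path

def backup_database(db_path: str, backup_dir: str, keep: int = 7) -> Path:
    """Copy the SQLite file to a timestamped backup and prune old copies."""
    # BACKUP_DIR environment variable overrides the default directory.
    target = Path(os.environ.get("BACKUP_DIR", backup_dir))
    target.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    dest = target / f"jobhunter_{stamp}.db"
    shutil.copy2(db_path, dest)
    # Keep only the `keep` most recent backups (names sort chronologically).
    for old in sorted(target.glob("jobhunter_*.db"))[:-keep]:
        old.unlink()
    return dest
```

Because SQLite is a single file, a plain copy taken while the app is idle (02:00) is a sufficient backup strategy here.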

Weekly dead link check (cron)

scripts/check_dead_links.py verifies offer URLs and marks dead links (HTTP 404/410, connection refused) as inactive. Only offers from the last 30 days are checked. Add this to your crontab to run every Sunday at 3:00 UTC:

0 3 * * 0 cd /home/ubuntu/JobHunter && /home/ubuntu/JobHunter/venv/bin/python scripts/check_dead_links.py >> /home/ubuntu/logs/dead_links.log 2>&1

Monthly inactive user cleanup (cron)

scripts/cleanup_inactive_users.py deletes non-admin user accounts that have been inactive for more than 90 days (no login), along with all associated data (tracking, documents, password resets). Add this to your crontab to run on the 1st of each month at 04:00 UTC:

0 4 1 * * cd /home/ubuntu/JobHunter && /home/ubuntu/JobHunter/venv/bin/python scripts/cleanup_inactive_users.py >> /home/ubuntu/logs/cleanup_users.log 2>&1

Weekly email digest (cron)

scripts/weekly_email.py sends a weekly email digest to subscribed users with the top 10 job offers (Match IA > 50%) from the last 7 days. Users can unsubscribe via a one-click link in the email or from their profile. Add this to your crontab to run every Monday at 09:00 UTC:

0 9 * * 1 cd /home/ubuntu/JobHunter && /home/ubuntu/JobHunter/venv/bin/python scripts/weekly_email.py >> /home/ubuntu/logs/weekly_email.log 2>&1

You can override the backup directory via the BACKUP_DIR environment variable:

BACKUP_DIR=/mnt/nas/backups python scripts/backup.py

Manual restore

scripts/restore.py restores a chosen backup over the live database. It creates a pre-restore safety copy before overwriting.

python scripts/restore.py /home/ubuntu/backups/jobhunter_20260308_020000.db

The script prompts for explicit confirmation (YES) before any data is overwritten.


Development Roadmap

Phase 1 — Foundations

  1. Project setup (folder structure, dependencies, configuration)
  2. Database models (offers and tracking tables)
  3. Basic Flask dashboard with empty table

Phase 2 — First Source

  1. France Travail API integration (OAuth2, search, parsing)
  2. Filter engine (keywords, location, contract type)
  3. Offer display in dashboard table

Phase 3 — Tracking

  1. Interactive columns (checkboxes, date fields, status dropdown)
  2. AJAX persistence (save changes without page reload)
  3. Filters and column sorting

Phase 4 — Additional Sources

  1. Welcome to the Jungle scraper
  2. Career site scrapers (Thales, Safran, then others)
  3. Cross-source deduplication

Phase 5 — Intelligence

  1. Cover letter generation (Claude API)
  2. Relevance scoring based on profile match

Phase 6 — Automation & Polish

  1. APScheduler for daily execution
  2. Statistics dashboard header
  3. CSV export
  4. Indeed scraper
  5. LinkedIn scraper (optional)
  6. UI/UX improvements

See ROADMAP.md for detailed progress.


Maintenance & Error Pages

Maintenance mode (Nginx)

To enable maintenance mode without stopping the app, create a flag file:

touch /home/ubuntu/JobHunter/maintenance_on

To disable maintenance mode:

rm /home/ubuntu/JobHunter/maintenance_on

Nginx configuration

Add the following to your Nginx server block to serve custom error pages and enable maintenance mode:

# --- Maintenance mode ---
# If the flag file exists, return 503 for all requests
if (-f /home/ubuntu/JobHunter/maintenance_on) {
    return 503;
}

# --- Custom error pages ---
error_page 502 /static/error.html;
error_page 503 /static/maintenance.html;

location = /static/maintenance.html {
    root /home/ubuntu/JobHunter/app;
    internal;
}

location = /static/error.html {
    root /home/ubuntu/JobHunter/app;
    internal;
}
  • 502 (Bad Gateway) — served when Gunicorn/Flask is down or unresponsive
  • 503 (Service Unavailable) — served when maintenance mode is enabled
  • 404 and 500 — handled by Flask with custom templates (templates/404.html, templates/500.html)

The internal directive ensures these pages are only served by Nginx error handling, not directly accessible via URL.


Security

  • API keys — Stored in .env, never committed to version control
  • Database — Local SQLite file excluded from Git
  • Scraping — Respects robots.txt, includes delays between requests, realistic user-agent headers
  • Rate limiting — Built-in delays to avoid IP blocking
  • Personal data — CV stored locally only, transmitted exclusively to Claude API for cover letter generation

License

MIT License — see LICENSE for details.


Credits

Mathias Quillateau (GitHub · LinkedIn)

Code assisted by Claude Code (Anthropic).
