Automated job monitoring tool for work-study (alternance) opportunities in Systems & Network Administration. Aggregates multiple sources, filters offers by criteria, and provides an interactive tracking dashboard.
JobHunter is a self-hosted job search assistant that automates the tedious part of job hunting. It collects offers from official APIs, job boards, and company career pages, filters them against predefined criteria, and presents everything in a web dashboard with full application tracking.
Key principles:
- Semi-automated — The tool finds and filters offers; the user decides when and where to apply
- Privacy-first — Runs locally with SQLite, no data sent to external services (except Claude API for cover letter generation)
- Multi-source — Aggregates France Travail API, Welcome to the Jungle, Indeed, and company career sites
- Trackable — Built-in application tracker with status management, follow-up reminders, and statistics
What this tool does NOT do:
- It does not send applications automatically
- It does not log into user accounts on job platforms
- It does not store sensitive data online
- Multi-source collection via official APIs and web scraping
- Keyword-based filtering (title and description matching)
- Location, contract type, and education level filters
- Cross-source duplicate detection
- Daily automated execution via scheduler
- Interactive tracking table with per-offer status management
- Checkbox columns: CV sent, follow-up done
- Date fields: date sent, follow-up date
- Status workflow: New → Applied → Followed up → Interview → Accepted / Rejected / No response
- Free-text notes per offer
- Filters by status, source, company, and date range
- Column sorting and full-text search
- CSV export
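The cross-source duplicate detection listed above could work as a normalized fingerprint of company, title, and location. A sketch (function names are assumptions; the real logic lives in `services/deduplication.py`):

```python
import hashlib
import re
import unicodedata

def _normalize(text: str) -> str:
    """Lowercase, strip accents, and collapse whitespace/punctuation."""
    text = unicodedata.normalize("NFKD", text)
    text = "".join(c for c in text if not unicodedata.combining(c))
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()

def offer_fingerprint(company: str, title: str, location: str) -> str:
    """Stable fingerprint used to spot the same offer across sources."""
    key = "|".join(_normalize(p) for p in (company, title, location))
    return hashlib.sha256(key.encode("utf-8")).hexdigest()

# The same offer collected from two sources maps to one fingerprint:
a = offer_fingerprint("Thales", "Administrateur Systèmes et Réseaux", "Vélizy")
b = offer_fingerprint("THALES", "Administrateur systèmes et réseaux ", "Velizy")
assert a == b
```

An offer whose fingerprint already exists in the database is skipped instead of inserted twice.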
- AI-generated draft per offer via Anthropic Claude API
- Personalized based on user CV + job description
- Saved in database to avoid regeneration
- One-click copy to clipboard
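A minimal sketch of the Claude integration described above, using the official `anthropic` Python SDK. The model name and prompt wording are illustrative assumptions; see `services/cover_letter.py` for the real code:

```python
import os

def build_prompt(cv_text: str, offer_title: str, offer_description: str) -> str:
    """Combine the user's CV with the job description into one prompt."""
    return (
        "Write a short, personalized cover letter in French for this job.\n\n"
        f"Job title: {offer_title}\n"
        f"Job description:\n{offer_description}\n\n"
        f"Candidate CV:\n{cv_text}\n"
    )

def generate_cover_letter(cv_text: str, offer_title: str, offer_description: str) -> str:
    """Call the Claude API once; the caller stores the draft in SQLite."""
    import anthropic  # imported here so the helper above works without the SDK
    client = anthropic.Anthropic(api_key=os.environ["ANTHROPIC_API_KEY"])
    response = client.messages.create(
        model="claude-sonnet-4-20250514",  # assumed model name, check the current list
        max_tokens=1024,
        messages=[{"role": "user",
                   "content": build_prompt(cv_text, offer_title, offer_description)}],
    )
    return response.content[0].text
```

Saving the returned text in the database (as noted above) avoids paying for regeneration on every page view.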
- Total offers found and new offers this week
- CVs sent, follow-ups done, interviews obtained
- Response rate tracking
| Layer | Technology |
|---|---|
| Backend | Flask with Python 3.11+ |
| Database | SQLite (lightweight, no server required) |
| Scraping | Requests + BeautifulSoup |
| Advanced Scraping | Selenium (JS-rendered sites) |
| Scheduler | APScheduler |
| Frontend | HTML/CSS/JS with Jinja2 + DataTables.js |
| AI | Anthropic API (Claude) |
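As a sketch of how APScheduler might drive the daily collection run (function and job names are assumptions; see `services/scheduler.py`):

```python
def run_all_scrapers() -> int:
    """Run every collector, then filter, dedup, and store the offers.

    Returns the number of new offers (placeholder implementation here)."""
    # ... call the collectors and the filter engine ...
    return 0

def start_scheduler():
    """Schedule the daily 08:00 collection run."""
    from apscheduler.schedulers.background import BackgroundScheduler
    scheduler = BackgroundScheduler(timezone="Europe/Paris")
    scheduler.add_job(run_all_scrapers, "cron", hour=8, minute=0,
                      id="daily_collection")
    scheduler.start()
    return scheduler
```

A `BackgroundScheduler` runs inside the Flask process, which keeps the whole tool a single self-hosted service.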
| Source | Method | Priority |
|---|---|---|
| France Travail | Official REST API (OAuth2) | High |
| Welcome to the Jungle | Web scraping | High |
| Company career sites (see below) | Custom scrapers | High |
| Indeed | Web scraping | Medium |
| LinkedIn | Public listings scraping | Low / Optional |
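The France Travail integration uses the OAuth2 client-credentials flow. A hedged sketch: the endpoint URLs, scope, and contract-type code follow the francetravail.io documentation as I understand it and should be verified before use:

```python
TOKEN_URL = ("https://entreprise.francetravail.fr/connexion/oauth2/"
             "access_token?realm=%2Fpartenaire")
SEARCH_URL = "https://api.francetravail.io/partenaire/offresdemploi/v2/offres/search"

def get_token(client_id: str, client_secret: str) -> str:
    """OAuth2 client-credentials flow for the Offres d'emploi API."""
    import requests
    resp = requests.post(TOKEN_URL, data={
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": "api_offresdemploiv2 o2dsoffre",
    })
    resp.raise_for_status()
    return resp.json()["access_token"]

def search_params(keyword: str, departments: list[str]) -> dict:
    """Query parameters for an alternance search (illustrative filter values)."""
    return {
        "motsCles": keyword,
        "departement": ",".join(departments),
        "natureContrat": "E2",  # assumed code for apprenticeship, check the API referential
    }

def search_offers(token: str, keyword: str, departments: list[str]) -> list:
    """One search call; pagination is omitted for brevity."""
    import requests
    resp = requests.get(SEARCH_URL,
                        headers={"Authorization": f"Bearer {token}"},
                        params=search_params(keyword, departments))
    resp.raise_for_status()
    return resp.json().get("resultats", [])
```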
| Company | Career Page | Sector |
|---|---|---|
| Thales | https://careers.thalesgroup.com | Defense / Aerospace |
| Safran | https://www.safran-group.com/fr/emplois | Aerospace |
| Capgemini | https://www.capgemini.com/fr-fr/carrieres | IT Services |
| Sopra Steria | https://www.soprasteria.com/rejoignez-nous | IT Services |
| Atos / Eviden | https://jobs.atos.net | IT Services |
| Orange | https://orange.jobs | Telecom |
| Airbus | https://www.airbus.com/en/careers | Aerospace |
| CGI | https://www.cgi.com/france/fr-fr/carrieres | IT Services |
| Alten | https://www.alten.com/rejoignez-nous | IT Services |
| Bouygues Telecom | https://www.bouyguestelecom.fr/groupe/recrutement | Telecom |
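Each career-site scraper implements the abstract base class in `scrapers/base_scraper.py`. A sketch of what that interface might look like (class and field names are assumptions, and the `ThalesScraper` body is a stub; the real one would use Requests + BeautifulSoup):

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class Offer:
    """One collected job offer, before filtering."""
    source: str
    company: str
    title: str
    location: str
    url: str
    description: str = ""

class BaseScraper(ABC):
    """Each source (WTTJ, Indeed, career sites) implements fetch()."""
    source_name = "unknown"

    @abstractmethod
    def fetch(self) -> list[Offer]:
        """Return raw offers; filtering happens later in the filter engine."""
        ...

class ThalesScraper(BaseScraper):
    source_name = "thales"

    def fetch(self) -> list[Offer]:
        # A real implementation would parse https://careers.thalesgroup.com
        # with requests + BeautifulSoup; stubbed here for illustration.
        return [Offer(self.source_name, "Thales",
                      "Alternance Administrateur Systèmes", "Vélizy",
                      "https://careers.thalesgroup.com/example")]
```

Adding a new company then amounts to one small subclass in `scrapers/career_sites/`.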
```python
KEYWORDS = [
    "administrateur systèmes et réseaux",
    "administrateur systèmes",
    "administrateur réseaux",
    "admin sys",
    "admin réseau",
    "technicien systèmes et réseaux",
    "ingénieur systèmes",
    "ingénieur infrastructure",
    "technicien infrastructure",
    "technicien informatique",
    "administrateur infrastructure",
    "ingénieur réseaux",
    "sysadmin",
]
```

```python
FILTERS = {
    "contract_type": "alternance",
    "location": "Île-de-France",
    "departments": ["75", "78", "91", "92", "93", "94", "95", "77"],
    "min_level": "bac+3",
    "max_level": "bac+5",
    "duration": "24 months",
}
```

Offers from major companies receive a higher relevance score:
```python
TARGET_COMPANIES = [
    "Thales", "Safran", "Capgemini", "Sopra Steria", "Atos", "Eviden",
    "Orange", "Airbus", "CGI", "Alten", "Bouygues Telecom", "SFR",
    "Société Générale", "BNP Paribas", "AXA", "Engie", "EDF",
    "Dassault", "Naval Group", "SNCF", "RATP", "Renault", "PSA",
]
```

```
┌─────────────────────────────────────────────────────┐
│               SCHEDULER (APScheduler)               │
│             Daily execution at 8:00 AM              │
└──────────────────────┬──────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────┐
│                     COLLECTORS                      │
│                                                     │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌────────┐   │
│ │ France   │ │ Welcome  │ │ Career   │ │ Indeed │   │
│ │ Travail  │ │ to the   │ │ Sites    │ │        │   │
│ │ (API)    │ │ Jungle   │ │ (custom) │ │        │   │
│ └────┬─────┘ └────┬─────┘ └────┬─────┘ └───┬────┘   │
└──────┼────────────┼────────────┼───────────┼────────┘
       │            │            │           │
       ▼            ▼            ▼           ▼
┌─────────────────────────────────────────────────────┐
│                   FILTER ENGINE                     │
│                                                     │
│     Keywords · Location · Contract type · Dedup     │
└──────────────────────┬──────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────┐
│                  DATABASE (SQLite)                  │
│                                                     │
│    offers ──── tracking ──── cover_letter_drafts    │
└──────────────────────┬──────────────────────────────┘
                       │
                       ▼
┌─────────────────────────────────────────────────────┐
│                WEB DASHBOARD (Flask)                │
│                                                     │
│  Offer table · Tracking · Filters · Stats · Export  │
│                                                     │
│                http://localhost:5000                │
└─────────────────────────────────────────────────────┘
```
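The FILTER ENGINE stage above might apply the keyword and department criteria like this (a sketch with illustrative field names; the real logic lives in `services/filter_engine.py`):

```python
def matches(offer: dict, keywords: list[str], departments: list[str]) -> bool:
    """Keep an offer if a keyword hits the title or description and the
    postal code falls in a target department."""
    text = (offer.get("title", "") + " " + offer.get("description", "")).lower()
    if not any(kw in text for kw in keywords):
        return False
    postal_code = offer.get("postal_code", "")
    return postal_code[:2] in departments

offer = {"title": "Alternance Administrateur Systèmes et Réseaux",
         "description": "Infogérance, supervision, Active Directory",
         "postal_code": "92150"}
assert matches(offer, ["administrateur systèmes"], ["75", "92"])
```

Running the filter after collection (rather than in each scraper) keeps the per-source code small and the criteria in one place.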
```
JobHunter/
├── app/
│   ├── __init__.py            # Flask initialization
│   ├── routes.py              # Dashboard routes
│   ├── models.py              # SQLite models (offers, tracking)
│   ├── database.py            # Database connection and init
│   ├── scrapers/
│   │   ├── __init__.py
│   │   ├── base_scraper.py    # Abstract base class
│   │   ├── france_travail.py  # France Travail API
│   │   ├── wttj.py            # Welcome to the Jungle
│   │   ├── indeed.py          # Indeed
│   │   ├── linkedin.py        # LinkedIn (optional)
│   │   └── career_sites/
│   │       ├── __init__.py
│   │       ├── thales.py
│   │       ├── safran.py
│   │       ├── capgemini.py
│   │       └── ...
│   ├── services/
│   │   ├── __init__.py
│   │   ├── filter_engine.py   # Offer filtering
│   │   ├── deduplication.py   # Duplicate detection
│   │   ├── cover_letter.py    # Claude API integration
│   │   └── scheduler.py       # Task scheduling
│   ├── templates/
│   │   ├── base.html
│   │   ├── dashboard.html
│   │   ├── offer_detail.html
│   │   └── stats.html
│   └── static/
│       ├── css/
│       │   └── style.css
│       └── js/
│           └── dashboard.js
├── data/
│   ├── jobhunter.db           # SQLite database (gitignored)
│   └── cv.txt                 # CV for cover letter generation
├── scripts/
│   ├── run_scrapers.py        # Manual scraper execution
│   └── init_db.py             # Database initialization
├── tests/
│   ├── test_scrapers.py
│   └── test_filters.py
├── .env.example
├── .gitignore
├── config.py
├── requirements.txt
├── ROADMAP.md
└── README.md
```
- Python 3.11+
- France Travail developer account (francetravail.io)
- Anthropic API key (console.anthropic.com) — for cover letter generation
```bash
git clone https://github.com/Kiwi6212/JobHunter.git
cd JobHunter

python -m venv venv
# Windows: venv\Scripts\activate
# macOS/Linux: source venv/bin/activate

pip install -r requirements.txt

cp .env.example .env
```

Edit `.env` with your API credentials:
```
FRANCE_TRAVAIL_CLIENT_ID=your_client_id
FRANCE_TRAVAIL_CLIENT_SECRET=your_client_secret
ANTHROPIC_API_KEY=your_api_key
FLASK_SECRET_KEY=a_random_secret_key
FLASK_DEBUG=true
```

```bash
# Initialize the database
python scripts/init_db.py

# Run scrapers manually
python scripts/run_scrapers.py

# Launch the dashboard
python -m flask run
```

The dashboard is available at http://localhost:5000.
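For orientation, here is a sketch of the schema `scripts/init_db.py` might create. The column names are illustrative assumptions; the actual models live in `app/models.py`:

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS offers (
    id          INTEGER PRIMARY KEY,
    fingerprint TEXT UNIQUE,          -- cross-source dedup key
    source      TEXT NOT NULL,
    company     TEXT,
    title       TEXT NOT NULL,
    location    TEXT,
    url         TEXT,
    description TEXT,
    found_at    TEXT DEFAULT (datetime('now'))
);
CREATE TABLE IF NOT EXISTS tracking (
    offer_id       INTEGER PRIMARY KEY REFERENCES offers(id),
    status         TEXT DEFAULT 'New', -- New / Applied / Followed up / ...
    cv_sent        INTEGER DEFAULT 0,  -- checkbox columns
    followed_up    INTEGER DEFAULT 0,
    date_sent      TEXT,
    follow_up_date TEXT,
    notes          TEXT
);
"""

def init_db(path: str = "data/jobhunter.db") -> sqlite3.Connection:
    """Create the tables if they do not exist and return the connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```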
`scripts/backup.py` copies `data/jobhunter.db` to a timestamped file in `/home/ubuntu/backups/`, keeping only the 7 most recent backups.

Add this line to your crontab (`crontab -e`) to run the backup every night at 02:00:

```
0 2 * * * cd /home/ubuntu/JobHunter && /home/ubuntu/JobHunter/venv/bin/python scripts/backup.py >> /home/ubuntu/backups/backup.log 2>&1
```

You can override the backup directory via the `BACKUP_DIR` environment variable:

```bash
BACKUP_DIR=/mnt/nas/backups python scripts/backup.py
```

`scripts/check_dead_links.py` verifies offer URLs and marks dead links (HTTP 404/410, connection refused) as inactive. Only offers from the last 30 days are checked. Add this to your crontab to run every Sunday at 03:00 UTC:

```
0 3 * * 0 cd /home/ubuntu/JobHunter && /home/ubuntu/JobHunter/venv/bin/python scripts/check_dead_links.py >> /home/ubuntu/logs/dead_links.log 2>&1
```

`scripts/cleanup_inactive_users.py` deletes non-admin user accounts that have been inactive for more than 90 days (no login), along with all associated data (tracking, documents, password resets). Add this to your crontab to run on the 1st of each month at 04:00 UTC:

```
0 4 1 * * cd /home/ubuntu/JobHunter && /home/ubuntu/JobHunter/venv/bin/python scripts/cleanup_inactive_users.py >> /home/ubuntu/logs/cleanup_users.log 2>&1
```

`scripts/weekly_email.py` sends a weekly email digest to subscribed users with the top 10 job offers (Match IA > 50%) from the last 7 days. Users can unsubscribe via a one-click link in the email or from their profile. Add this to your crontab to run every Monday at 09:00 UTC:

```
0 9 * * 1 cd /home/ubuntu/JobHunter && /home/ubuntu/JobHunter/venv/bin/python scripts/weekly_email.py >> /home/ubuntu/logs/weekly_email.log 2>&1
```

`scripts/restore.py` restores a chosen backup over the live database. It creates a pre-restore safety copy before overwriting.

```bash
python scripts/restore.py /home/ubuntu/backups/jobhunter_20260308_020000.db
```

The script prompts for explicit confirmation (`YES`) before any data is overwritten.
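The copy-and-prune behavior of `scripts/backup.py` can be sketched as follows, assuming backup filenames sort chronologically (e.g. `jobhunter_20260308_020000.db`):

```python
import os
import shutil
from datetime import datetime

def backup(db_path: str, backup_dir: str, keep: int = 7) -> str:
    """Copy the database to a timestamped file, then prune old backups."""
    os.makedirs(backup_dir, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    dest = os.path.join(backup_dir, f"jobhunter_{stamp}.db")
    shutil.copy2(db_path, dest)
    # Delete everything except the `keep` most recent backups; timestamped
    # names sort lexicographically in chronological order.
    backups = sorted(f for f in os.listdir(backup_dir)
                     if f.startswith("jobhunter_") and f.endswith(".db"))
    for old in backups[:-keep]:
        os.remove(os.path.join(backup_dir, old))
    return dest
```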
- Project setup (folder structure, dependencies, configuration)
- Database models (`offers` and `tracking` tables)
- Basic Flask dashboard with empty table
- France Travail API integration (OAuth2, search, parsing)
- Filter engine (keywords, location, contract type)
- Offer display in dashboard table
- Interactive columns (checkboxes, date fields, status dropdown)
- AJAX persistence (save changes without page reload)
- Filters and column sorting
- Welcome to the Jungle scraper
- Career site scrapers (Thales, Safran, then others)
- Cross-source deduplication
- Cover letter generation (Claude API)
- Relevance scoring based on profile match
- APScheduler for daily execution
- Statistics dashboard header
- CSV export
- Indeed scraper
- LinkedIn scraper (optional)
- UI/UX improvements
See ROADMAP.md for detailed progress.
To enable maintenance mode without stopping the app, create a flag file:

```bash
touch /home/ubuntu/JobHunter/maintenance_on
```

To disable maintenance mode:

```bash
rm /home/ubuntu/JobHunter/maintenance_on
```

Add the following to your Nginx server block to serve custom error pages and enable maintenance mode:
```nginx
# --- Maintenance mode ---
# If the flag file exists, return 503 for all requests
if (-f /home/ubuntu/JobHunter/maintenance_on) {
    return 503;
}

# --- Custom error pages ---
error_page 502 /static/error.html;
error_page 503 /static/maintenance.html;

location = /static/maintenance.html {
    root /home/ubuntu/JobHunter/app;
    internal;
}

location = /static/error.html {
    root /home/ubuntu/JobHunter/app;
    internal;
}
```

- 502 (Bad Gateway) — served when Gunicorn/Flask is down or unresponsive
- 503 (Service Unavailable) — served when maintenance mode is enabled
- 404 and 500 — handled by Flask with custom templates (`templates/404.html`, `templates/500.html`)

The `internal` directive ensures these pages are only served by Nginx error handling, not directly accessible via URL.
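On the Flask side, the 404/500 handlers mentioned above might look like this. The inline `DictLoader` templates are stand-ins so the sketch runs on its own; the real app renders `templates/404.html` and `templates/500.html` from disk:

```python
from flask import Flask, render_template
from jinja2 import DictLoader

app = Flask(__name__)
# Stand-ins for templates/404.html and templates/500.html so this sketch
# is self-contained; drop this in the real app.
app.jinja_loader = DictLoader({
    "404.html": "<h1>Page not found</h1>",
    "500.html": "<h1>Something went wrong</h1>",
})

@app.errorhandler(404)
def not_found(error):
    return render_template("404.html"), 404

@app.errorhandler(500)
def server_error(error):
    return render_template("500.html"), 500
```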
- API keys — Stored in `.env`, never committed to version control
- Database — Local SQLite file excluded from Git
- Scraping — Respects `robots.txt`, includes delays between requests, realistic user-agent headers
- Rate limiting — Built-in delays to avoid IP blocking
- Personal data — CV stored locally only, transmitted exclusively to the Claude API for cover letter generation
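The request delays and user-agent practice above can be sketched as follows (the delay values and user-agent string are illustrative):

```python
import time
import urllib.request

USER_AGENT = ("Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
              "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36")

def throttle(last: float, now: float, min_delay: float = 2.0) -> float:
    """Seconds to sleep so consecutive requests are at least min_delay apart."""
    return max(0.0, min_delay - (now - last))

_last = 0.0

def polite_get(url: str) -> bytes:
    """GET with a realistic user agent and a built-in pause between requests."""
    global _last
    time.sleep(throttle(_last, time.monotonic()))
    _last = time.monotonic()
    req = urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
    with urllib.request.urlopen(req, timeout=15) as resp:
        return resp.read()
```

Checking `robots.txt` first (e.g. with `urllib.robotparser`) and backing off on repeated errors complete the picture.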
MIT License — see LICENSE for details.
Mathias Quillateau — GitHub · LinkedIn
Code assisted by Claude Code (Anthropic).