diff --git a/AGENTS.md b/AGENTS.md
index aa18171..0d755b6 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -317,7 +317,7 @@ Features:
 - API keys via `.streamlit/secrets.toml` or environment variables
 
 ### Streamlit Pages
-- `pages/verify.py` — Email verification endpoint (`/verify?token=...`)
+- `pages/verify.py` — Email verification endpoint (`/verify?token=...`); sends welcome email on confirmation
 - `pages/unsubscribe.py` — One-click unsubscribe endpoint (`/unsubscribe?token=...`)
 - `pages/impressum.py` — Legal notice (§ 5 DDG)
 - `pages/privacy.py` — Privacy policy
@@ -342,7 +342,8 @@ Supported formats:
 4. Jobs already displayed in the UI session are pre-seeded into `job_sent_logs` via `db.upsert_jobs()` + `db.log_sent_jobs()` so the first digest doesn't repeat them
 5. `emailer.send_verification_email()` sends a confirmation link via Resend
 6. User clicks the link → `pages/verify.py` calls `db.confirm_subscriber()` → sets `is_active=True`, then `db.set_subscriber_expiry()` sets `expires_at = now() + 30 days`
-7. If email already active, the form shows "already subscribed" (no re-send)
+7. `pages/verify.py` sends a best-effort welcome email via `emailer.send_welcome_email()` (fire-and-forget — failure doesn't affect confirmation)
+8. If email already active, the form shows "already subscribed" (no re-send)
 
 ### Auto-Expiry
 - The 30-day clock starts at **DOI confirmation**, not signup (prevents wasted days while email is unconfirmed)
@@ -381,9 +382,10 @@ Per-subscriber pipeline, designed to run in GitHub Actions (or any cron schedule
 
 Required env vars: `GOOGLE_API_KEY`, `SERPAPI_KEY`, `SUPABASE_URL`, `SUPABASE_SERVICE_KEY`, `RESEND_API_KEY`, `RESEND_FROM`, `APP_URL`.
 ### Email Templates (`emailer.py`)
-- `send_daily_digest()` — HTML table of job matches with score badges and apply links
-- `send_verification_email()` — CTA button linking to the verify page
-- Both include an impressum footer line built from `IMPRESSUM_NAME`, `IMPRESSUM_ADDRESS`, `IMPRESSUM_EMAIL` env vars
+- `send_daily_digest(user_email, jobs, unsubscribe_url, target_location)` — card-style job listings with score pill badges, location pins, "View Job" CTA buttons, match summary stats (excellent/good counts), and target location in header
+- `send_welcome_email(email, target_location, subscription_days, privacy_url)` — sent after DOI confirmation; explains what to expect, subscription duration, and links to privacy policy
+- `send_verification_email(email, verify_url)` — CTA button linking to the verify page
+- All three include an impressum footer line built from `IMPRESSUM_NAME`, `IMPRESSUM_ADDRESS`, `IMPRESSUM_EMAIL` env vars
 
 ---
 
@@ -482,7 +484,7 @@ Schema setup: run `python setup_db.py` to check tables and print migration SQL.
 | `test_cv_parser.py` (6 tests) | `cv_parser.py` | `_clean_text()` + `extract_text()` for .txt/.md, error cases |
 | `test_models.py` (23 tests) | `models.py` | All Pydantic models: validation, defaults, round-trip serialization |
 | `test_db.py` (35 tests) | `db.py` | Full GDPR lifecycle: add/confirm/expire/purge subscribers, deactivate by token, data deletion, subscription context, job upsert/dedup, sent-log tracking. All DB functions mocked at Supabase client level |
-| `test_emailer.py` (7 tests) | `emailer.py` | HTML generation: job row badges, job count, unsubscribe link, impressum line |
+| `test_emailer.py` (22 tests) | `emailer.py` | HTML generation: job row badges/cards/location, job count, match stats, unsubscribe link, target location in header, impressum line, welcome email (location, days, privacy, impressum) |
 | `test_app_consent.py` (5 tests) | `app.py` | GDPR consent checkbox: session state persistence, widget key separation, on_change sync |
 
 ### Testing conventions
diff --git a/ROADMAP.md b/ROADMAP.md
index 16b3c5b..0990790 100644
--- a/ROADMAP.md
+++ b/ROADMAP.md
@@ -37,22 +37,21 @@ Based on the current state (private repo, hosted on Streamlit Community Cloud) a
 ### 1.2 — Deploy Daily Digest
 - [x] **Set up GitHub Actions cron job** for daily_task.py (e.g., `cron: '0 7 * * *'` UTC)
-- [ ] **Add secrets to GitHub Actions** — all required env vars from §10
-- [ ] **Test the full digest cycle** — subscribe, verify, receive digest, unsubscribe
+- [x] **Add secrets to GitHub Actions** — all required env vars from §10
+- [x] **Test the full digest cycle** — subscribe, verify, receive digest, unsubscribe
+- [x] **Create a welcome email after successful subscription** — explain what to expect, how to contact support, link to privacy policy, show some example matches
+- [x] **Add unsubscribe link to digest emails** — include unique tokenized URL to securely identify subscriber without exposing email
+- [x] **Make digest email prettier** — use HTML formatting, add Stellenscout logo, style job listings for better readability
 
 ### 1.3 — UX Quick Wins
 - [ ] **Personalize the UI** — greet user by first name extracted from CV profile
 - [ ] **Add "Edit Profile" step** — let user tweak skills/roles/preferences before searching (this is already in Open Issues)
-- [ ] **Add a "Preferences" text input** — free-form like *"I want remote fintech jobs, no big corporations"* → append to Headhunter prompt
+- [ ] **Add a "Preferences" text input** — free-form like *"I want remote fintech jobs, no big corporations"* → append to profile prompt
 - [ ] **Show job age warning** — if `posted_at` is >30 days, badge it as "possibly expired"
 - [ ] **Improve job cards** — show apply links more prominently, add company logos via Clearbit/Logo.dev
 - [ ] **Add digest preferences UI** — allow users to change `min_score` and cadence (daily/weekly) after subscription
-
-### 1.4 — Monitoring & Observability
-- [ ] **Add structured logging** — replace `print()` with `logging` module, include run IDs
-- [ ] **Track pipeline metrics** — jobs found per query, avg scores, API latency, cache hit rates
-- [ ] **Set up error alerting** — GitHub Actions failure notifications (email or Slack webhook)
-- [ ] **Add cost dashboard** — track daily SerpAPI + Gemini usage and estimated monthly spend
+- [ ] **Remove random jobs from homepage before CV is entered** — show a friendly welcome message instead of empty job cards
+- [ ] **Add filter/sort options for publishing date and score** — both in the digest email and on the homepage after search
 
 ---
 
diff --git a/daily_task.py b/daily_task.py
index d8f14fa..ca54651 100644
--- a/daily_task.py
+++ b/daily_task.py
@@ -214,6 +214,7 @@ def main() -> int:
                 "company": ej.job.company_name,
                 "url": _job_url(ej),
                 "score": ej.evaluation.score,
+                "location": ej.job.location,
             }
             for ej in good_matches
         ]
@@ -233,7 +234,12 @@ def main() -> int:
         log.info(" sub=%s — sending %d matches (score >= %d)", sub_id, len(email_jobs), sub_min_score)
         try:
-            send_daily_digest(sub_email, email_jobs, unsubscribe_url=unsubscribe_url)
+            send_daily_digest(
+                sub_email,
+                email_jobs,
+                unsubscribe_url=unsubscribe_url,
+                target_location=sub.get("target_location", ""),
+            )
         except Exception:
             log.exception(" sub=%s — failed to send daily digest, continuing", sub_id)
diff --git a/stellenscout/emailer.py b/stellenscout/emailer.py
index 8e94c8e..70a7747 100644
---
a/stellenscout/emailer.py
+++ b/stellenscout/emailer.py
@@ -2,59 +2,108 @@
 import os
 from datetime import datetime, timezone
+from html import escape as _esc
 
 import resend
 
 
+def _safe_url(url: str) -> str:
+    """Sanitise a URL for use in an HTML href attribute.
+
+    Only ``http`` and ``https`` schemes are allowed. Anything else
+    (e.g. ``javascript:``, ``data:``) is replaced with ``#``.
+    """
+    stripped = url.strip()
+    if stripped and not stripped.lower().startswith(("http://", "https://")):
+        return "#"
+    return _esc(stripped, quote=True)
+
+
 def _build_job_row(job: dict) -> str:
-    """Return an HTML table row for a single job."""
+    """Return an HTML card block for a single job."""
     score = job.get("score")
     badge_color = "#22c55e" if (score or 0) >= 80 else "#eab308" if (score or 0) >= 70 else "#f97316"
     score_html = (
-        f'…{score}/100…'
+        f'…{score}/100…'
         if score
         else ""
     )
-    apply_url = job.get("url", "#")
+    apply_url = _safe_url(job.get("url", "#"))
+    location = _esc(job.get("location", ""))
+    location_html = (
+        f'…
[card markup lost in HTML extraction: a two-column row rendering {title}, {company} and {location_html} on the left and {score_html} on the right]
+    location_subtitle = (
+        f'…Jobs in {safe_location}…'
+        if safe_location
+        else ""
+    )
+
+    excellent = sum(1 for j in jobs if (j.get("score") or 0) >= 80)
+    good = sum(1 for j in jobs if 70 <= (j.get("score") or 0) < 80)
+    stats_parts: list[str] = []
+    if excellent:
+        stats_parts.append(
+            f'…'
+            f"{excellent} excellent…"
+        )
+    if good:
+        stats_parts.append(
+            f'…'
+            f"{good} good…"
+        )
+    stats_html = f'…{" ".join(stats_parts)}…' if stats_parts else ""
+    return f"""\
@@ -65,28 +114,20 @@ def _build_html(jobs: list[dict], unsubscribe_url: str = "") -> str:
[template markup lost in HTML extraction: header with {today} and {location_subtitle}; the line "We found {len(jobs)} new job match{"es" if len(jobs) != 1 else ""} for you today:" followed by {stats_html}; removal of the old "Position | Score | Link" table header row; and a footer line built from f'📍 ' + f"Daily AI-matched jobs in {safe_location}"]