DigestFlow

DigestFlow is an AI-powered, local-first Django pipeline. Given a topic, it produces:

a structured factual digest
a LinkedIn-ready content package: a publish-ready post with hooks, CTAs, and hashtags

What it does

DigestFlow currently supports a full MVP pipeline:

Topic / user input -> demo source items -> cleaner -> dedupe -> ranking -> AI digest -> LinkedIn package -> result page

DigestFlow separates:

synthesis (Digest)
packaging (ContentPackage)

This makes the system easier to validate, debug, and extend.

The system is designed to be debug-friendly and safe to iterate on locally:

demo source instead of real external ingestion
structured validation for AI output
mock fallback when AI is unavailable or returns invalid output
DigestRun metrics and console logging
minimal Django UI for creating a topic, running the pipeline, and viewing the result
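
The validation and mock-fallback behaviour listed above can be sketched as follows. This is a minimal illustration, not DigestFlow's actual code: the schema keys (title, summary, key_points) and the function names validate_digest, mock_digest, and build_digest are assumptions made for the example, and call_model stands in for whatever function performs the real AI call.

```python
import json

# Hypothetical schema: the real digest fields are not documented here.
REQUIRED_KEYS = {"title", "summary", "key_points"}


def validate_digest(raw: str) -> dict:
    """Parse raw model output and check it against the assumed schema."""
    data = json.loads(raw)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(data, dict):
        raise ValueError("digest must be a JSON object")
    missing = REQUIRED_KEYS - set(data)
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    if not isinstance(data["key_points"], list) or not data["key_points"]:
        raise ValueError("key_points must be a non-empty list")
    return data


def mock_digest(topic: str) -> dict:
    """Deterministic stand-in used when the AI path is unusable."""
    return {
        "title": f"Digest: {topic}",
        "summary": f"Mock summary for {topic}.",
        "key_points": ["mock point"],
    }


def build_digest(topic: str, call_model) -> tuple[dict, str]:
    """Return (digest, provider); fall back to mock on any failure."""
    try:
        return validate_digest(call_model(topic)), "openai"
    except Exception:
        # Covers network errors, invalid JSON, and schema violations alike.
        return mock_digest(topic), "mock"
```

Returning the provider next to the digest lets the caller record in the run metrics whether the output came from the real model or from the fallback.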

Research / Source Discovery

DigestFlow also includes a history-aware source discovery engine for long-lived topics. It can:

track query performance and source quality over time
diagnose weak discovery outcomes, such as duplicate-heavy or quality-heavy results
repair search strategy across bounded discovery rounds
run target-seeking discovery with a safety cap
avoid exhausted search surfaces on repeated Find clicks
expose Research History and Copy full history for debugging and review
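
As a rough illustration of target-seeking discovery with a safety cap, the loop below stops when enough sources are found or when the round limit is reached, and it skips queries already marked as exhausted. The function name discover, its parameters, and the rule "a query is exhausted when it yields nothing new" are all assumptions for this sketch. The project's actual logic is described in docs/research_discovery_flow.md.

```python
def discover(queries, search, target, max_rounds=3, exhausted=None):
    """Collect up to `target` unique URLs within at most `max_rounds` searches.

    `search` is any callable mapping a query string to a list of URLs.
    `exhausted` is a set of queries to skip; it is updated in place so the
    next call (for example, a repeated Find click) avoids the same surfaces.
    """
    exhausted = set() if exhausted is None else exhausted
    found = []
    seen = set()
    rounds = 0
    for query in queries:
        if len(found) >= target or rounds >= max_rounds:
            break  # target reached or safety cap hit
        if query in exhausted:
            continue  # this query produced nothing new before
        rounds += 1
        added = 0
        for url in search(query):
            if url not in seen:
                seen.add(url)
                found.append(url)
                added += 1
        if added == 0:
            exhausted.add(query)
    return found[:target]
```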

For the detailed flow and terminology, see:

docs/research_discovery_flow.md
docs/research_discovery_glossary.md

Current MVP status

Implemented:

Django backend and admin
Topic model and Topic-based runs
demo source stage
cleaner
dedupe by URL and normalized title
deterministic topic-aware ranking
Article storage
AI digest stage
LinkedIn packaging stage
structured validation for digest and package output
mock fallback for AI failures / invalid JSON
token and estimated cost tracking
DigestRun metrics
console logging
minimal web UI
integration tests covering:
- completed
- partial_failed
- failed
unit tests for:
- cleaner
- deduper
- ranker
- digest validators
- packaging validators
- token / cost helpers
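
To make "dedupe by URL and normalized title" concrete, here is one possible sketch. The normalization rules (ignore scheme, query, fragment, and trailing slash for URLs; lowercase and strip punctuation for titles) and the item shape (dicts with url and title keys) are assumptions; the project's deduper may normalize differently.

```python
import re
from urllib.parse import urlsplit


def normalize_title(title: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", " ", title.lower())
    return " ".join(cleaned.split())


def normalize_url(url: str) -> str:
    """Compare URLs by host and path only."""
    parts = urlsplit(url.strip())
    return f"{parts.netloc.lower()}{parts.path.rstrip('/')}"


def dedupe(items: list[dict]) -> list[dict]:
    """Keep the first item for each normalized URL and normalized title."""
    seen_urls, seen_titles = set(), set()
    unique = []
    for item in items:
        url_key = normalize_url(item["url"])
        title_key = normalize_title(item["title"])
        if url_key in seen_urls or title_key in seen_titles:
            continue  # duplicate by either key
        seen_urls.add(url_key)
        seen_titles.add(title_key)
        unique.append(item)
    return unique
```

Dropping the query string treats tracking-parameter variants of a link as one article, at the cost of merging pages that differ only by query.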

Current limitations:

source ingestion is demo-only
URLs are synthetic
real AI calls may fall back to mock
UI is minimal and not production-ready

Tech stack

Python 3.13
Django 5.2
SQLite
OpenAI API (optional)

Project flow

Topic / user input
  ->
Demo source items
  ->
Cleaner
  ->
Dedupe
  ->
Ranking
  ->
AI Digest
  ->
LinkedIn ContentPackage
  ->
Result page / admin / metrics
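
The Ranking step in the flow above is described as deterministic and topic-aware. One simple way to get both properties is to score items by word overlap with the topic and break ties on a stable key. The scoring rule, the rank function, and the limit parameter below are illustrative assumptions, not the project's ranker.

```python
import re


def rank(items: list[dict], topic: str, limit: int = 5) -> list[dict]:
    """Order items by topic-word overlap in the title, highest first."""
    topic_terms = set(re.findall(r"\w+", topic.lower()))

    def score(item: dict) -> int:
        words = set(re.findall(r"\w+", item["title"].lower()))
        return len(topic_terms & words)

    # Ties are broken by title so the same input always gives the same order.
    return sorted(items, key=lambda it: (-score(it), it["title"]))[:limit]
```

Because no randomness or model call is involved, the same items and topic always produce the same selection, which keeps the input to the AI stage reproducible.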

Local setup

cd C:\Projects\DigestFlow
.\.venv\Scripts\python.exe manage.py migrate
.\.venv\Scripts\python.exe manage.py runserver

How to use (UI)

Open http://127.0.0.1:8000/
Enter a topic, for example: "AI automation for operations"
Click Generate digest
You will be redirected to /runs/<id>/

On the result page you can:

read the structured digest
copy the LinkedIn post
review hooks, CTAs, and hashtags
inspect validation and metrics

Environment

Create .env in the project root:

OPENAI_API_KEY=your_api_key_here
OPENAI_MODEL=gpt-4o-mini
OPENAI_TIMEOUT_SECONDS=45
AI_DAILY_TOKEN_BUDGET=100000

If OPENAI_API_KEY is missing or still set to the placeholder value, or if the model returns unusable output, DigestFlow falls back to mock responses.
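
A settings loader along these lines could decide between the real provider and the mock. It is a sketch under assumptions: the function name load_ai_settings, the returned keys, and the use of the example values above as defaults are illustrative, and it reads variables already present in the process environment rather than parsing .env itself.

```python
import os

# Values treated as "no real key"; the second is the placeholder shown above.
PLACEHOLDER_KEYS = {"", "your_api_key_here"}


def load_ai_settings(env=None) -> dict:
    """Read AI settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY", "").strip()
    return {
        "use_openai": api_key not in PLACEHOLDER_KEYS,
        "model": env.get("OPENAI_MODEL", "gpt-4o-mini"),
        "timeout_seconds": int(env.get("OPENAI_TIMEOUT_SECONDS", "45")),
        "daily_token_budget": int(env.get("AI_DAILY_TOKEN_BUDGET", "100000")),
    }
```

Passing a plain dict instead of os.environ makes the fallback decision easy to unit test.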

Useful commands

Check project health:

.\.venv\Scripts\python.exe manage.py check

Run tests:

.\.venv\Scripts\python.exe manage.py test

Preview demo source items:

.\.venv\Scripts\python.exe manage.py preview_demo_sources --topic-id 2

Run digest smoke test:

.\.venv\Scripts\python.exe manage.py ai_digest_smoke_test --topic "AI automation"

Run digest stage only:

.\.venv\Scripts\python.exe manage.py run_digest_stage --topic-id 2

Run packaging stage only:

.\.venv\Scripts\python.exe manage.py run_packaging_stage --digest-id 1

Run full demo pipeline:

.\.venv\Scripts\python.exe manage.py run_digest_demo --topic-id 2

Observability

Where to look:

console logs while running commands or the web app
DigestRun.metrics in admin
result page /runs/<id>/

Metrics currently include:

raw items count
count after cleaning / dedupe / ranking
selected articles for prompt
used Article.id list
digest / packaging status
provider (openai / mock)
token usage when available
estimated cost when available
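
The estimated cost metric can be derived from token usage with a per-token price table. The sketch below uses placeholder prices that are not real provider rates; the name estimate_cost and the input/output split are assumptions about how such a helper might look.

```python
from decimal import Decimal

# Placeholder prices per one million tokens, for illustration only.
PRICE_PER_MILLION = {"input": Decimal("1.00"), "output": Decimal("2.00")}


def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  prices: dict = PRICE_PER_MILLION) -> Decimal:
    """Return the estimated cost in the same currency as the price table."""
    total = prompt_tokens * prices["input"] + completion_tokens * prices["output"]
    return total / Decimal(1_000_000)
```

Decimal avoids binary floating-point rounding, which matters when many small per-run costs are summed against a budget.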

Design principles

pipeline-first
deterministic preprocessing before AI
structured validation before accepting AI output
simple local observability
minimal UI before product polish
no agents
no async orchestration
no external monitoring stack

Why this project exists

DigestFlow is built to explore a production-style AI pipeline:

deterministic preprocessing before the LLM call
strict output validation
controlled failure handling
traceable execution via metrics

It is intentionally local-first and minimal to make iteration fast and debugging transparent.

