RAG platform for educational use at DSV, Stockholm University. Teachers upload course materials; students get an AI assistant grounded in those documents, with safeguards designed to support learning.
- Two RAG strategies: `simple` (one-shot retrieve-then-answer) and `FLARE` (multi-turn, logprob-triggered mid-stream retrieval). The legacy `parallel` strategy was retired in favour of the tool-use axis below.
- Optional agentic phase: orthogonal `tool_use_enabled` toggle on each course. When on, generation splits into a hidden-thinking research phase (model calls `keyword_search` / RAG seed / KG-expansion tools, with FLARE's logprob signal injected as a tool event) followed by a clean writeup phase. Per-tool expandable results, research thinking, and a research/writeup token-split are persisted alongside the message and rendered above the assistant bubble.
- Inline citations + sources panel: assistant replies get inline `[n]` badges (accepts naked-digit and filename-form variants); a right-rail sources panel shows what was actually cited, with a count button that resets the uncited-override on fresh-open.
- Course knowledge graph: documents are auto-classified (lecture, transcript, exercise, solution, ...) on `gpt-oss-120b` and cross-linked with `part_of_unit` / `solution_of` / `prerequisite_of` / `applied_in` edges. Retrieval expands top-k along the graph. Gated by the `course_kg` feature flag.
- Aegis prompt coaching: per-keystroke live analyzer (`gpt-oss-120b`, JSON-schema-strict, `reasoning_effort: low`) that returns 0..=2 tagged suggestions for the draft the student is typing, severity-coloured against an 8-kind CLEAR-grounded rubric, with a Beginner/Expert calibration toggle. Soft-blocks Send when suggestions are present; `Use ideas` rewrites the draft via `gpt-oss-120b`. Mobile drawer + tablet support; per-iteration history persisted for export. Gated by the `aegis` feature flag.
- Extraction guard: separate from Aegis. Per-turn intent classifier + per-chunk output check (`gpt-oss-120b`, low effort) + Socratic rewriter (`gpt-oss-120b`), with KG-driven multi-turn proximity tracking. Gated by the `extraction_guard` feature flag.
- Pluggable embeddings: admin-managed catalog (Snowflake arctic-embed-m-v2.0 default, BGE, BAAI, GTE, mxbai, EmbeddingGemma, multilingual-e5, Qwen3-Embedding, OpenAI) with memory-budgeted LRU cache and on-demand benchmarks. Per-course rotation via lazy re-embed against versioned Qdrant collections.
- Daily AI spending caps: per-student-per-course and per-owner aggregate, both daily. Chat returns 429 with the optimistic bubble preserved and the real error surfaced.
- Conversations UX: theme toggle (light/dark/system) in the header, fresh new-chat as default landing, LLM-grounded suggested questions on the empty state (drawn from the three latest sources), bidirectional unread + explicit acknowledgements, frozen pins for owners, sticky teacher unreviewed-tab list.
- LMS integration: Moodle local plugin (iframe + enrolment sync + MBZ import), site-level Moodle/Canvas LTI 1.3 with first-launch course binding, Canvas REST sync.
- DSV Play transcript pipeline: hourly VTT fetch + index for play.dsv.su.se URLs; teacher-configurable Play designation codes drive automatic discovery of new lecture recordings.
- Auth: Shibboleth (SAML) primary; HMAC-signed external-auth invites validated entirely inside Apache via `mod_lua`; attribute-based role auto-promotion rules.
- Privacy & i18n: pseudonymisation for `ext:` users, in-app data-handling ack, English + Swedish, WCAG 2.1 AA fixes.
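
The knowledge-graph bullet above says retrieval expands top-k along the graph. A minimal sketch of what a one-hop expansion could look like is shown here. The `EdgeKind` enum, the `Graph` alias, the `expand_top_k` function, and the `max_docs` cap are all hypothetical names chosen for illustration, and the one-hop, rank-preserving policy is an assumption rather than Minerva's documented behaviour.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical edge kinds, named after the edges listed in the feature list.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum EdgeKind {
    PartOfUnit,
    SolutionOf,
    PrerequisiteOf,
    AppliedIn,
}

/// Hypothetical adjacency list: document id -> outgoing (kind, target id) edges.
type Graph = HashMap<u32, Vec<(EdgeKind, u32)>>;

/// Expand the retrieved hits by one hop along the graph (assumed policy).
/// Original hits keep their rank order; neighbours are appended after them,
/// deduplicated, and the result is capped at `max_docs`.
fn expand_top_k(hits: &[u32], graph: &Graph, max_docs: usize) -> Vec<u32> {
    let mut seen: HashSet<u32> = HashSet::new();
    let mut out: Vec<u32> = Vec::new();

    // Keep the retrieved documents first, in their original order.
    for &doc in hits {
        if seen.insert(doc) {
            out.push(doc);
        }
    }
    // Then append the one-hop neighbours of each hit.
    for &doc in hits {
        if let Some(edges) = graph.get(&doc) {
            for &(_kind, target) in edges {
                if seen.insert(target) {
                    out.push(target);
                }
            }
        }
    }
    out.truncate(max_docs);
    out
}
```

With this shape, a lecture hit that has a `solution_of` neighbour would pull the solution document into the context after the directly retrieved documents, never ahead of them.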
Detailed figures for the document-ingest and chat/RAG pipelines (including the FLARE multi-turn loop) are in `docs/ARCHITECTURE.md`.
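
As a rough illustration of the logprob-triggered retrieval that FLARE refers to, the sketch below checks a tentatively generated sentence for low-confidence tokens and, if any are found, builds a query from the confident tokens only. The `Token` struct, both function names, and the `min_prob` threshold are hypothetical; the actual trigger and query construction used here are described in `docs/ARCHITECTURE.md`, not in this sketch.

```rust
/// Hypothetical token record: generated text plus its log-probability,
/// as an OpenAI-compatible endpoint can return when logprobs are requested.
struct Token {
    text: String,
    logprob: f64,
}

/// Returns true when any token in the tentative sentence has a probability
/// below `min_prob`, meaning retrieval should run before generation continues.
fn should_retrieve(sentence: &[Token], min_prob: f64) -> bool {
    sentence.iter().any(|t| t.logprob.exp() < min_prob)
}

/// Builds a retrieval query from the confident tokens only, masking out the
/// low-confidence ones (one possible query strategy, assumed here).
fn masked_query(sentence: &[Token], min_prob: f64) -> String {
    sentence
        .iter()
        .filter(|t| t.logprob.exp() >= min_prob)
        .map(|t| t.text.trim())
        .filter(|s| !s.is_empty())
        .collect::<Vec<_>>()
        .join(" ")
}
```

The loop around these helpers would generate a sentence, call `should_retrieve`, and when it fires, retrieve with `masked_query` and regenerate that sentence with the new context.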
Screenshots are regenerated with `docs/screenshots/regenerate.mjs` (see `docs/screenshots/README.md`).
| Layer | Technology |
|---|---|
| Backend | Rust (Axum, SQLx, Tokio) |
| Frontend | React 19, TypeScript 6, Vite, TanStack Router/Query, Tailwind 4, react-force-graph-2d, i18next 26 |
| Frontend runtime | Node 26 (Alpine) in Docker |
| Database | PostgreSQL 16 |
| Vector DB | Qdrant (per-course versioned collections) |
| LLM | Cerebras (default; `gpt-oss-120b` across the stack: classifiers, Aegis, rewrites, writeup) or any OpenAI-compatible endpoint |
| Embeddings | OpenAI or local fastembed (memory-budgeted LRU cache, HuggingFace cache persisted on /data0 in prod) |
| Edge | Apache 2 with mod_shib + mod_lua |

    cp .env.example .env   # add CEREBRAS_API_KEY, OPENAI_API_KEY
    docker compose up

Backend on `:3000`, frontend dev on `:5173`. With `MINERVA_DEV_MODE=true` (compose default) Shibboleth is bypassed; the backend reads `X-Dev-User` and falls back to the first admin in `MINERVA_ADMINS`.
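
The dev-mode fallback described above can be sketched as a small pure function. The name `resolve_dev_user` is hypothetical, and treating a blank header or blank list entries as absent is an assumption; the real backend may differ.

```rust
/// Resolve the acting user in dev mode (illustrative sketch only).
/// `x_dev_user` is the value of the `X-Dev-User` header, if present;
/// `admins` is the raw, comma-separated `MINERVA_ADMINS` value.
fn resolve_dev_user(x_dev_user: Option<&str>, admins: &str) -> Option<String> {
    // Prefer an explicit header value; a blank one is treated as absent (assumption).
    if let Some(user) = x_dev_user.map(str::trim).filter(|u| !u.is_empty()) {
        return Some(user.to_string());
    }
    // Otherwise fall back to the first non-empty entry in MINERVA_ADMINS.
    admins
        .split(',')
        .map(str::trim)
        .find(|a| !a.is_empty())
        .map(String::from)
}
```

For example, with `MINERVA_ADMINS=root,bob` and no header the function yields `root`, while a request carrying `X-Dev-User: alice` yields `alice`.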
Production:

    docker compose -f docker-compose.prod.yml up -d
    # or
    docker pull ghcr.io/edwinexd/minerva:master

For the k3s production layout used at DSV, see `k8s/`.
| Variable | Description |
|---|---|
| `DATABASE_URL` | PostgreSQL connection string |
| `QDRANT_URL` | Qdrant gRPC endpoint |
| `MINERVA_HMAC_SECRET` | Signs embed/invite/LTI tokens; mirrored to Apache for `mod_lua` |
| `MINERVA_ADMINS` | Comma-separated admin eppn prefixes |
| `MINERVA_DOCS_PATH` | Document storage path |
| `CEREBRAS_API_KEY` | Inference key |
| `OPENAI_API_KEY` | Embedding key (optional with fastembed) |
| `MINERVA_BASE_URL` | Public base URL for LTI tool URLs |
| `MINERVA_LTI_KEY_SEED` | RSA seed for LTI 1.3 (falls back to HMAC secret) |
| `MINERVA_SERVICE_API_KEY` | Bearer for `/api/service/*` pipelines |
| `MINERVA_DEV_MODE` | `true` bypasses Shibboleth |
| `MINERVA_DEFAULT_COURSE_DAILY_TOKEN_LIMIT` | Per-student-per-course default (0 = unlimited) |
| `MINERVA_DEFAULT_OWNER_DAILY_TOKEN_LIMIT` | Per-owner aggregate default (0 = unlimited) |
| `MINERVA_CANVAS_AUTO_SYNC_INTERVAL_HOURS` | Canvas re-sync interval |
See .env.example for the rest.
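
To make the `0 = unlimited` convention of the two daily token limits concrete, here is a minimal sketch of a cap check. All names are hypothetical, and the comparison rule (reject once usage has reached the limit) is an assumption; only the two-cap structure, the zero-means-unlimited rule, and the 429 response come from this README.

```rust
/// Outcome of a cap check; `TooManyRequests` corresponds to an HTTP 429 reply.
#[derive(Debug, PartialEq, Eq)]
enum CapDecision {
    Allow,
    TooManyRequests,
}

/// A limit of 0 means unlimited; otherwise usage must stay below the limit
/// (the strict comparison is an assumption of this sketch).
fn under_limit(used_today: u64, limit: u64) -> bool {
    limit == 0 || used_today < limit
}

/// Both the per-student-per-course cap and the per-owner aggregate cap
/// must pass for the chat request to proceed.
fn check_daily_caps(
    student_course_used: u64,
    course_limit: u64,
    owner_used: u64,
    owner_limit: u64,
) -> CapDecision {
    if under_limit(student_course_used, course_limit) && under_limit(owner_used, owner_limit) {
        CapDecision::Allow
    } else {
        CapDecision::TooManyRequests
    }
}
```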
| Path prefix | Auth | Why |
|---|---|---|
| `/api/integration/*` | Per-course API key | Moodle server-to-server |
| `/api/service/*` | Global service API key | Automated pipelines |
| `/api/embed/*`, `/embed/*` | HMAC-signed embed token | Iframe chat |
| `/lti/*` | LTI 1.3 (OIDC + JWT) | LMS-driven login |
| `/api/external-auth/*` | HMAC-signed invite token | External-auth callback |
| `/embedding-catalog` | Public read-only | Teacher feed of enabled models |
| everything else | Shibboleth | Default |
See apache/README.md for the vhost.
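
The routing table above can be restated as a prefix match, which may help when reasoning about which scheme guards a given URL. This is only a sketch: the `AuthScheme` enum and `auth_for_path` function are hypothetical, the real split is configured in the Apache vhost, and matching `/embedding-catalog` exactly (rather than as a prefix) is an assumption.

```rust
/// Hypothetical labels for the auth schemes in the table above.
#[derive(Debug, PartialEq, Eq)]
enum AuthScheme {
    CourseApiKey,
    ServiceApiKey,
    EmbedToken,
    Lti,
    InviteToken,
    Public,
    Shibboleth,
}

/// Map a request path to the scheme that guards it, following the table rows
/// in order; anything unmatched falls through to Shibboleth.
fn auth_for_path(path: &str) -> AuthScheme {
    if path.starts_with("/api/integration/") {
        AuthScheme::CourseApiKey
    } else if path.starts_with("/api/service/") {
        AuthScheme::ServiceApiKey
    } else if path.starts_with("/api/embed/") || path.starts_with("/embed/") {
        AuthScheme::EmbedToken
    } else if path.starts_with("/lti/") {
        AuthScheme::Lti
    } else if path.starts_with("/api/external-auth/") {
        AuthScheme::InviteToken
    } else if path == "/embedding-catalog" {
        // Exact match is an assumption of this sketch.
        AuthScheme::Public
    } else {
        AuthScheme::Shibboleth
    }
}
```

Note that `/embedding-catalog` does not collide with the `/embed/` prefix, because the prefix check includes the trailing slash.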
Per DSV-IT policy, new sites must meet the accessibility law; Minerva targets WCAG 2.2 level AA. Compliance is enforced automatically (pre-commit + CI), not just reviewed by hand, across three layers:
| Layer | Tool | Catches |
|---|---|---|
| Static lint | `eslint-plugin-jsx-a11y` (strict) | Markup-level issues: alt text, label/control association, ARIA misuse, missing keyboard handlers |
| Rendered components | Vitest + Testing Library + axe-core (WCAG 2.2 AA tags) | Violations only visible once a component is rendered to the DOM |
| End-to-end | `pa11y-ci` (htmlcs WCAG2AA + axe) in a real browser | Color contrast and other render-time criteria jsdom can't compute |

Modal dialogs use the native `<dialog>` element with `showModal()`, so focus
trapping, Escape-to-close, top-layer rendering and the `::backdrop` come from the
platform rather than hand-rolled code. Details and local commands
(`npm run test:run`, `npm run pa11y`) live in `frontend/README.md`.
CLA in `CLA.md`. CI runs:
- Backend: `cargo fmt`, `cargo clippy --all-targets` (warnings treated as errors), `cargo build` (all with `SQLX_OFFLINE=true`).
- Frontend: `eslint --max-warnings 0` (incl. `jsx-a11y` strict), `tsc -b`, `tsc -p tsconfig.test.json`, `vitest run` (axe-core component checks), `vite build`, plus a `pa11y-ci` job that audits the built app in a real browser. See Accessibility.
- Moodle plugin: `php -l` + `phpcs` against `moodlehq/moodle-cs`.
- Apache: lua syntax + unit tests + `apache2ctl configtest` for `apache/minerva-app.conf`.
- Style gates: ban emdashes + ban space-dash-dash-space anywhere a non-whitespace char precedes them on the line.
- Migrations: `migrations-immutable` blocks edits to already-committed `backend/migrations/*.sql` files (sqlx content-hashes them at startup).

Pre-commit mirrors the same set; install with `pre-commit install` (the hook is wired via `pipx install pre-commit`).
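
The style gate in the list above is terse, so here is one possible reading of it as a per-line check. This is an interpretation, not the project's actual hook: the function name is hypothetical, and the sketch assumes em-dashes are banned outright while the space-dash-dash-space sequence is flagged only when a non-whitespace character precedes it on the line.

```rust
/// Returns true if `line` violates the style gate as interpreted in this sketch.
fn violates_style_gate(line: &str) -> bool {
    // U+2014 is the em-dash, written as an escape so this source passes its own check.
    if line.contains('\u{2014}') {
        return true;
    }
    // Flag " -- " only when some non-whitespace character precedes it on the line.
    line.match_indices(" -- ")
        .any(|(idx, _)| line[..idx].chars().any(|c| !c.is_whitespace()))
}
```

Under this reading, prose such as `foo -- bar` is rejected, while an indented line that begins with `-- ` and ordinary CLI flags like `--release` pass.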

After editing any `sqlx::query!` / `query_as!` macro:

    docker compose up -d postgres
    cd backend && DATABASE_URL=postgres://minerva:minerva@localhost:5432/minerva \
      cargo sqlx prepare --workspace
    git add .sqlx/

The committed `backend/.sqlx/` cache is what CI and the prod Dockerfile build against; if you forget this step, the pre-commit `cargo check`/`clippy` gate fails locally.

AGPL-3.0. Logo by Tilly Makrof-Johansson.