31 commits
a6d55b9
Add Mistral as an alternative AI backend
ddulic Mar 10, 2026
098813a
Use Mistral dedicated OCR API and address code review suggestions
ddulic Mar 10, 2026
50d84bf
Address Copilot PR review comments
ddulic Mar 10, 2026
9b6aca5
Fix unused logger and test kwarg assertion
ddulic Mar 10, 2026
8fb46cd
Address Copilot review: candidate zero-norm guard, Mistral tests, con…
ddulic Mar 10, 2026
cf75c77
Fix incorrect ocr_model value in concurrency test
ddulic Mar 10, 2026
5a7e047
Update documentation to reflect Mistral AI backend support
ddulic Mar 10, 2026
8977156
Guard against zero/negative max_concurrency to prevent semaphore dead…
ddulic Mar 10, 2026
515e773
Remove unused config parameter from SummaryModule
ddulic Mar 10, 2026
76c4b22
Address Copilot review: OCR robustness, compact JSON schema, config w…
ddulic Mar 10, 2026
7d5899c
Fix mistralai v2 import, version constraint, and broken integration test
ddulic Mar 10, 2026
6c2c9bc
Update uv.lock for mistralai>=2.0.0
ddulic Mar 10, 2026
0aa2ab0
Use json.dumps for non-string generate_json content to ensure valid J…
ddulic Mar 10, 2026
df6936d
Add missing test coverage for Gemini, Mistral edge cases, embedding v…
ddulic Mar 10, 2026
ee6b4e8
Update docs to reflect provider-agnostic AI backend
ddulic Mar 10, 2026
ec2ec1f
Fix mypy lint errors in Gemini and Mistral services and tests
ddulic Mar 10, 2026
cdb725e
Change default ports from 8080/8081 to 8000/8001
ddulic Mar 11, 2026
c8df416
Address PR review comments
ddulic Mar 11, 2026
1672069
Merge pull request #1 from ddulic/feat/mistral-ai-backend
ddulic Mar 11, 2026
3fec311
Fix Dockerfile ports to match updated default (8080 -> 8000)
ddulic Mar 11, 2026
b67a6a3
Add Docker Compose example, fix README ports and broken server docs link
ddulic Mar 11, 2026
7a94ee5
Fix GHCR image URL to allenporter/supernote, remove invalid JWT secre…
ddulic Mar 11, 2026
0093665
Remove broken server README link
ddulic Mar 11, 2026
8ec1b27
Merge pull request #2 from ddulic/fix/dockerfile-ports
ddulic Mar 11, 2026
e130588
Fix default port to match Dockerfile (8080 -> 8000)
ddulic Mar 11, 2026
1161902
Fix ephemeral mode default port (8080 -> 8000)
ddulic Mar 11, 2026
2d6dfec
Merge pull request #3 from ddulic/fix/dockerfile-ports
ddulic Mar 11, 2026
dac2004
fix(review): address PR #65 review comments
ddulic Apr 13, 2026
cfd26a7
fix(review): simplify concurrency clamping and guard zero-norm at ind…
ddulic Apr 14, 2026
de86fc1
fix(config): set gemini_chat_model default to gemini-3-flash-preview
ddulic Apr 14, 2026
31b3a17
fix(review): restore gemini_ocr_module and mock_gemini_service names …
ddulic Apr 14, 2026
30 changes: 24 additions & 6 deletions README.md
@@ -2,7 +2,7 @@

**The AI-powered intelligence layer for your Ratta Supernote.**

This toolkit is a self-hosted implementation of the **Supernote Private Cloud** protocol. While Ratta's official private cloud provides a solid and reliable sync foundation, this project extends that experience with an **AI-driven synthesis engine**—transforming your handwritten notes into structured, searchable knowledge using Google Gemini.
This toolkit is a self-hosted implementation of the **Supernote Private Cloud** protocol. While Ratta's official private cloud provides a solid and reliable sync foundation, this project extends that experience with an **AI-driven synthesis engine**—transforming your handwritten notes into structured, searchable knowledge using Google Gemini or Mistral AI.

<p align="center">
<img src="docs/static-assets/hero-overview.jpg" alt="Supernote Overview" width="800">
@@ -26,7 +26,7 @@ This project is designed to be **fully compatible** with the official Supernote
Beyond simple storage, Supernote provides an active processing pipeline to increase the utility of your notes:

1. **Sync**: Your device uploads `.note` files using the official Private Cloud protocol.
2. **Transcribe**: The server extract pages and use Gemini Vision to OCR your handwriting.
2. **Transcribe**: The server extracts pages and uses an AI provider (Gemini or Mistral) to OCR your handwriting.
3. **Synthesize**: AI Analyzers review your journals to find tasks, themes, and summaries.
4. **Index**: Every word is vectorized, enabling semantic search across your entire library.

@@ -43,10 +43,15 @@ The integrated frontend allows you to review your notes and AI insights side-by-

### 1. Launch the Cloud

The easiest way to start is with the `all` bundle and a Gemini API key:
The easiest way to start is with the `all` bundle and an AI API key. Choose either Google Gemini or Mistral AI:

```bash
export SUPERNOTE_GEMINI_API_KEY="your-api-key"
# Option A: Google Gemini (default)
export SUPERNOTE_GEMINI_API_KEY="your-gemini-api-key"

# Option B: Mistral AI
export SUPERNOTE_MISTRAL_API_KEY="your-mistral-api-key"

pip install "supernote[all]"
supernote serve
```
@@ -130,13 +135,26 @@ The notebook parser is a fork and slightly lighter dependency version of [supern

### Run with Docker

The pre-built image is published to the GitHub Container Registry:

```bash
# Pull and run the latest image
docker run -d \
-p 8080:8080 \
-p 8001:8001 \
-v supernote-data:/data \
-e SUPERNOTE_GEMINI_API_KEY="your-api-key" \
ghcr.io/allenporter/supernote:latest
```

Or build from source:

```bash
# Build & Run server
docker build -t supernote .
docker run -d -p 8080:8080 -v $(pwd)/storage:/storage supernote serve
```

See [Server Documentation](https://github.com/allenporter/supernote/blob/main/supernote/server/README.md) for details.
For a full setup with Docker Compose, see [docker-compose.yml](docker-compose.yml).

### Developer API

33 changes: 33 additions & 0 deletions docker-compose.yml
@@ -0,0 +1,33 @@
---
services:
supernote:
image: ghcr.io/allenporter/supernote:latest
# Alternatively, build from source:
# build: .
restart: unless-stopped
ports:
- "8080:8080" # Main server
- "8001:8001" # MCP server
volumes:
- supernote-data:/data
environment:
# AI Provider — set one of the following:
SUPERNOTE_GEMINI_API_KEY: "" # Google Gemini API key
# SUPERNOTE_MISTRAL_API_KEY: "" # Mistral AI API key (alternative)

# Storage & server
SUPERNOTE_STORAGE_DIR: /data
SUPERNOTE_CONFIG_DIR: /data/config
SUPERNOTE_HOST: 0.0.0.0
SUPERNOTE_PORT: "8080"
SUPERNOTE_MCP_PORT: "8001"

# Optional: set the public-facing base URL (e.g. behind a reverse proxy)
# SUPERNOTE_BASE_URL: "https://supernote.example.com"
# SUPERNOTE_MCP_BASE_URL: "https://mcp.example.com"

# Optional: enable user self-registration
# SUPERNOTE_ENABLE_REGISTRATION: "true"

volumes:
supernote-data:
5 changes: 3 additions & 2 deletions docs/CONTRIBUTING.md
@@ -61,8 +61,9 @@ This script will initialize a virtual environment using `uv`, install dependenci
For rapid iteration, run an ephemeral server. It starts with a clean state and a pre-configured debug user.

```bash
# Enable AI features for development
export SUPERNOTE_GEMINI_API_KEY="your_api_key"
# Enable AI features for development (choose one)
export SUPERNOTE_GEMINI_API_KEY="your-gemini-api-key" # Google Gemini (default)
# export SUPERNOTE_MISTRAL_API_KEY="your-mistral-api-key" # Mistral AI (alternative)

# Start the ephemeral server
supernote serve --ephemeral
8 changes: 4 additions & 4 deletions docs/note_processing_design.md
@@ -41,11 +41,11 @@ If we don't want to parse the large "Transcript Summary" every time we need a si
1. **Diff Phase**: Parser extracts page streams. Each stream is hashed and compared to the database.
2. **Visual Phase**: Generate PNGs for new/changed pages. Assemble full PDF using cached PNGs for unchanged pages.
3. **Intelligence Phase**:
- Send PNG to Gemini (with retry/backoff) for OCR.
- Send PNG to the configured AI provider (with retry/backoff) for OCR.
- **Chunk Embeddings (Page-indexed)**: Generated per-page from raw OCR text. Ideal for "finding the needle in the haystack."
4. **Document Phase**:
- **Transcript Generation**: Aggregate all page text into a single "OCR Transcript" `SummaryDO`.
- **Insight Generation**: Prompt Gemini with the transcript to create an "AI Insights" `SummaryDO`.
- **Insight Generation**: Prompt the AI provider with the transcript to create an "AI Insights" `SummaryDO`.
- **Vector Indexing**:
- **Chunks**: Generate vectors for each page window. Store in-memory index `(file_id, page_index)`.
- **Document**: Generate vector for the Insight Summary. Store in-memory index `(file_id)`.
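The Diff Phase above boils down to a content-hash comparison. A minimal sketch under assumed names (the real logic lives in `PageHashingModule` and the database layer):

```python
import hashlib

def diff_pages(page_streams: list[bytes], known_hashes: set[str]) -> list[int]:
    """Return indices of pages whose content hash is not yet recorded,
    i.e. the new/changed pages that need PNG generation and OCR."""
    changed = []
    for index, stream in enumerate(page_streams):
        digest = hashlib.sha256(stream).hexdigest()
        if digest not in known_hashes:
            changed.append(index)
    return changed

# Page 1 was edited; pages 0 and 2 are already hashed in the database:
known = {hashlib.sha256(b"page0").hexdigest(), hashlib.sha256(b"page2").hexdigest()}
print(diff_pages([b"page0", b"page1-edited", b"page2"], known))  # [1]
```

Only the returned indices enter the Visual and Intelligence phases; unchanged pages reuse their cached PNGs.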
@@ -102,7 +102,7 @@ To maintain a resilient pipeline, modules must follow specific error handling pa

### 1. Expectations for `process()`
- **No Internal Try/Except (Mostly)**: Modules should let exceptions bubble up. The base class's `run()` method is the centralized error handler.
- **Descriptive Exceptions**: Raise specific exceptions (e.g., `FileNotFoundError`, `GeminiAPIError`) so the automated logs are useful.
- **Descriptive Exceptions**: Raise specific exceptions (e.g., `FileNotFoundError`, `ValueError`) so the automated logs are useful.
- **Idempotency is Mandatory**: If `process()` fails halfway (e.g., after writing a file but before updating a DB record), the next attempt must be able to resume or overwrite without creating duplicates or corruption.
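The pattern these expectations describe can be sketched as follows; class and method names here are illustrative, not the project's actual base class:

```python
import asyncio
import logging

logger = logging.getLogger(__name__)

class ProcessorModule:
    """process() lets exceptions bubble up; run() is the single
    centralized place that catches, logs, and reports failure."""

    async def process(self, file_id: str) -> None:
        raise NotImplementedError

    async def run(self, file_id: str) -> bool:
        try:
            await self.process(file_id)
            return True
        except Exception:
            # Descriptive exceptions raised by process() make this log useful.
            logger.exception("%s failed for file %s", type(self).__name__, file_id)
            return False

class FailingOcr(ProcessorModule):
    async def process(self, file_id: str) -> None:
        raise FileNotFoundError(f"missing page PNG for {file_id}")

print(asyncio.run(FailingOcr().run("note-1")))  # False
```

Because `run()` reports failure instead of crashing the pipeline, the orchestrator can retry later, which is why `process()` must be idempotent.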

### 2. Orchestrator Reaction
@@ -118,5 +118,5 @@ To maintain a resilient pipeline, modules must follow specific error handling pa
### Failure Modes & Corner Cases

1. **Dependency Staleness**: If `PageHashingModule` detects a change, it deletes the `SystemTaskDO` entries for `OCR` and `Embedding`. This causes their `run_if_needed` to return `True` on the next run, forcing a re-poll.
2. **Concurrency Limits**: `ProcessorService` limits the number of files processed in parallel. Modules should use internal semaphores (like `GeminiService`) if they have external API rate limits.
2. **Concurrency Limits**: `ProcessorService` limits the number of files processed in parallel. AI service implementations (like `GeminiService` and `MistralService`) use internal semaphores to respect external API rate limits.
3. **Idempotency Requirement**: If a task fails *after* writing data but *before* updating its status to `COMPLETED`, it will be re-run. `process()` must be safe to call again (e.g., using `UPSERT` or overwriting files).
1 change: 1 addition & 0 deletions pyproject.toml
@@ -43,6 +43,7 @@ server = [
"aiofiles>=25.1.0",
"aiohttp-remotes>=1.3.0",
"google-genai>=1.57.0",
"mistralai>=2.0.0",
"mcp>=1.25.0",
"aiohttp-asgi>=0.6.1",
]
34 changes: 30 additions & 4 deletions supernote/server/README.md
@@ -5,7 +5,7 @@ This package provides a self-hosted implementation of the Supernote Cloud server
## Core Features

- **Seamless Sync**: Implements the native Supernote sync protocol.
- **AI Synthesis**: Automatically transcribes handwriting and identifies key insights using Google Gemini.
- **AI Synthesis**: Automatically transcribes handwriting and identifies key insights using Google Gemini or Mistral AI.
- **Knowledge Exploration**: Cross-notebook semantic search and web-based file browsing.
- **Private & Local**: Store your notes and metadata on your own infrastructure.

@@ -17,7 +17,7 @@ See the main [README.md](../../README.md) for a quick start guide.

- A Supernote device (Nomad, A5 X, A6 X, etc.)
- Python 3.13+ or Docker.
- (Recommended) **Gemini API Key** for OCR and Summarization.
- (Recommended) A **Gemini** or **Mistral AI** API key for OCR and Summarization.

### Configuration

@@ -26,11 +26,37 @@ The server is configured via `config/config.yaml` or environment variables.
For a comprehensive reference, see the [ServerConfig documentation](https://allenporter.github.io/supernote/supernote/server.html#ServerConfig).

#### AI Configuration
To enable AI features, set the Gemini API key:

AI features require an API key from either Google Gemini (default) or Mistral AI. Set one of the following:

```bash
export SUPERNOTE_GEMINI_API_KEY="your-api-key"
# Option A: Google Gemini (default)
export SUPERNOTE_GEMINI_API_KEY="your-gemini-api-key"

# Option B: Mistral AI (takes priority when set)
export SUPERNOTE_MISTRAL_API_KEY="your-mistral-api-key"
```

> **Note on provider switching**: Gemini embeddings are 3072-dimensional while Mistral embeddings are 1024-dimensional. Switching providers after notes have been indexed requires re-processing all files to regenerate embeddings.
**Owner commented:**
I was imagining that it would use different task names/columns for something like changing models to avoid the need to do this, which is why the tasks have specific names that include the models in them.

(You could, for example, have multiple modules if you want.)

**Author replied:**

Currently:

- One embedding column, one `EMBEDDING_GENERATION` task type
- Gemini embeddings (3072-dim) and Mistral embeddings (1024-dim) share the same slot
- Switching providers corrupts the index; all notes must be re-processed

With model-specific task/column names:

- `EMBEDDING_GENERATION_GEMINI` task + `embedding_gemini` column
- `EMBEDDING_GENERATION_MISTRAL` task + `embedding_mistral` column
- Both providers can be indexed simultaneously
- Search uses whichever column matches the active provider
- Switching providers just means changing the key; existing embeddings are untouched
- Both can run in parallel if desired

Downsides:

- DB migration required (new columns)
- More complex search query (must pick the right column)
- Doubles embedding storage if both providers are used
- Significant implementation effort: touches DB models, alembic migrations, the embedding module, the search service, and the processor service

I see the benefit. When I did this, I wanted to avoid the need for a DB migration, but I can proceed with this if you think it is the right approach. My reasoning was that people wouldn't swap providers often, and even if they do, a single re-indexing would be worth it to keep everything consistent.
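Until model-specific columns exist, the dimension mismatch discussed in this thread can at least fail fast at query time instead of silently returning garbage. A hypothetical guard (names are illustrative, not the project's actual search-service API):

```python
def check_embedding_compat(stored_dim: int, query_vector: list[float]) -> None:
    """Reject a search when the active provider's embedding dimension
    does not match the vectors already stored in the index."""
    if len(query_vector) != stored_dim:
        raise ValueError(
            f"Embedding dimension mismatch: index holds {stored_dim}-dim vectors "
            f"but the active provider produced {len(query_vector)}-dim vectors; "
            "re-process all files after switching providers."
        )

# A Gemini-built index (3072-dim) queried with a Mistral vector (1024-dim):
try:
    check_embedding_compat(3072, [0.0] * 1024)
except ValueError as err:
    print(err)
```

This is a stopgap: it turns a corrupted-search failure mode into an actionable error, while the column-per-model design above removes the problem entirely.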


Additional Gemini model settings:

| Env var | Default | Description |
|---|---|---|
| `SUPERNOTE_GEMINI_OCR_MODEL` | `gemini-3-flash-preview` | Vision model for OCR |
| `SUPERNOTE_GEMINI_EMBEDDING_MODEL` | `gemini-embedding-001` | Embedding model |
| `SUPERNOTE_GEMINI_CHAT_MODEL` | `gemini-2.0-flash` | Chat model for summaries |
| `SUPERNOTE_GEMINI_MAX_CONCURRENCY` | `5` | Max concurrent API calls (minimum 1) |

Additional Mistral model settings:

| Env var | Default | Description |
|---|---|---|
| `SUPERNOTE_MISTRAL_OCR_MODEL` | `mistral-ocr-latest` | Dedicated OCR model |
| `SUPERNOTE_MISTRAL_EMBEDDING_MODEL` | `mistral-embed` | Embedding model |
| `SUPERNOTE_MISTRAL_CHAT_MODEL` | `mistral-large-latest` | Chat model for summaries |
| `SUPERNOTE_MISTRAL_MAX_CONCURRENCY` | `5` | Max concurrent API calls (minimum 1) |

### Running the Server

Start the server using the unified `supernote` CLI:
45 changes: 32 additions & 13 deletions supernote/server/app.py
@@ -35,13 +35,15 @@
system,
)
from .routes.decorators import public_route
from .services.ai_service import AIService
from .services.blob import LocalBlobStorage
from .services.coordination import SqliteCoordinationService
from .services.file import FileService
from .services.gemini import GeminiService
from .services.mistral import MistralService
from .services.processor import ProcessorService
from .services.processor_modules.gemini_embedding import GeminiEmbeddingModule
from .services.processor_modules.gemini_ocr import GeminiOcrModule
from .services.processor_modules.embedding import EmbeddingModule
from .services.processor_modules.ocr import OcrModule
from .services.processor_modules.page_hashing import PageHashingModule
from .services.processor_modules.png_conversion import PngConversionModule
from .services.processor_modules.summary import SummaryModule
@@ -290,15 +292,31 @@ def create_app(config: ServerConfig) -> web.Application:
app["file_service"] = file_service
app["url_signer"] = UrlSigner(config.auth.secret_key, coordination_service)
app["schedule_service"] = ScheduleService(session_manager)
gemini_service = GeminiService(
config.gemini_api_key, max_concurrency=config.gemini_max_concurrency
)
app["gemini_service"] = gemini_service
ai_service: AIService
if config.mistral_api_key:
logger.info("Using Mistral as AI backend")
ai_service = MistralService(
api_key=config.mistral_api_key,
ocr_model=config.mistral_ocr_model,
embedding_model=config.mistral_embedding_model,
chat_model=config.mistral_chat_model,
max_concurrency=config.mistral_max_concurrency,
)
else:
logger.info("Using Gemini as AI backend")
ai_service = GeminiService(
api_key=config.gemini_api_key,
ocr_model=config.gemini_ocr_model,
embedding_model=config.gemini_embedding_model,
chat_model=config.gemini_chat_model,
max_concurrency=config.gemini_max_concurrency,
)
app["ai_service"] = ai_service

summary_service = SummaryService(user_service, session_manager)
app["summary_service"] = summary_service

search_service = SearchService(session_manager, gemini_service, config)
search_service = SearchService(session_manager, ai_service)
app["search_service"] = search_service

app["sync_locks"] = {} # user -> (equipment_no, expiry_time)
@@ -313,16 +331,17 @@
processor_service.register_modules(
hashing=PageHashingModule(file_service=file_service),
png=PngConversionModule(file_service=file_service),
ocr=GeminiOcrModule(
file_service=file_service, config=config, gemini_service=gemini_service
ocr=OcrModule(
file_service=file_service,
ai_service=ai_service,
),
embedding=GeminiEmbeddingModule(
file_service=file_service, config=config, gemini_service=gemini_service
embedding=EmbeddingModule(
file_service=file_service,
ai_service=ai_service,
),
summary=SummaryModule(
file_service=file_service,
config=config,
gemini_service=gemini_service,
ai_service=ai_service,
summary_service=summary_service,
),
)