SynapsesOS/synapses-intelligence

Synapses Intelligence — AI Brain Sidecar


Synapses Intelligence adds semantic understanding to your codebase via local LLMs. No cloud required. No data leaves your machine. Runs on CPU or GPU.

Synapses (graph) ↔ Intelligence Sidecar
                    ↓
              llama-server | Ollama | Local CGo
                    ↓
            LLM Inference (Qwen, Mistral, etc.)

The brain sidecar enriches code graphs with:

  • Semantic summaries — prose descriptions of what code does
  • Architectural insight — how an entity fits into the system
  • Rule explanations — why architectural violations matter
  • Context packets — curated 800-token summaries for AI agents
  • Episodic learning — co-occurrence patterns from past decisions

What is Synapses Intelligence?

Synapses Intelligence is the reasoning layer for the Synapses code intelligence engine. It adds LLM-powered semantic enrichment to raw code graphs:

  • Input — graph of code entities (nodes, edges, summaries)
  • Process — 4-tier LLM system (Tiers 0–3), with different models for different tasks
  • Output — context packets (~800 tokens), rule explanations, architectural insights

The brain is optional but powerful. Without it, Synapses still works via pure graph queries. With it, agents understand not just structure but intent.


4-Tier LLM Architecture

The brain routes tasks to different LLM tiers for efficiency:

| Tier | Name | Purpose | GPU Model | CPU Model | Latency |
|------|------|---------|-----------|-----------|---------|
| 0 | Reflex | Fast prose summaries, boilerplate removal | qwen3.5:0.8b | qwen2.5-coder:1.5b | 3s |
| 1 | Sensory | Explain rule violations (cached) | qwen3.5:2b | qwen2.5-coder:1.5b | 5s |
| 2 | Specialist | Architectural insight, context packets | qwen3.5:4b | qwen2.5-coder:7b | 12s |
| 3 | Architect | Multi-agent conflict resolution | qwen3.5:9b | qwen2.5-coder:7b | 25s |

Key insight: Different tasks need different model sizes. Summaries (Tier 0) run fast on 0.8B. Context packets (Tier 2) need 4B reasoning. Most operations use Tier 0 or Tier 1, keeping inference fast.
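To illustrate the routing idea, here is a minimal sketch of a task-to-tier lookup. The task names and the `route` function are hypothetical (the actual router lives inside the sidecar); the tier/model pairs mirror the GPU column of the table above.

```python
# Hypothetical sketch of tier routing: map a task type to the tier and
# model from the table above. These pairs use the GPU column; a CPU-only
# host would swap in the CPU models instead.
TIERS = {
    "ingest": (0, "qwen3.5:0.8b"),           # Tier 0 "Reflex": fast summaries
    "explain_violation": (1, "qwen3.5:2b"),  # Tier 1 "Sensory": cached explanations
    "enrich": (2, "qwen3.5:4b"),             # Tier 2 "Specialist": context packets
    "coordinate": (3, "qwen3.5:9b"),         # Tier 3 "Architect": conflict resolution
}

def route(task: str) -> tuple[int, str]:
    """Return (tier, model) for a task, defaulting to the cheap Tier 0."""
    return TIERS.get(task, TIERS["ingest"])
```

Defaulting unknown tasks to Tier 0 keeps the common path cheap, matching the note above that most operations run on Tier 0 or Tier 1.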


Install

Note: Synapses Intelligence is an optional sidecar — Synapses works without it. Install it to add LLM-powered semantic enrichment.

macOS / Linux (Homebrew):

brew tap SynapsesOS/tap
brew install brain

Direct binary download (GitHub Releases):

| Platform | File |
|----------|------|
| macOS (Apple Silicon) | brain_darwin_arm64.tar.gz |
| macOS (Intel) | brain_darwin_x86_64.tar.gz |
| Linux (x86_64) | brain_linux_x86_64.tar.gz |
| Linux (ARM64) | brain_linux_arm64.tar.gz |
| Windows | brain_windows_x86_64.zip |

Extract and place the brain binary on your PATH.


Quick Start

1. Recommended: llama-server (CPU/GPU Auto-Detect)

No Ollama needed. Subprocess-managed llama-server with OpenAI-compatible API. Auto-detects Metal (macOS), CUDA, ROCm, CPU.

# Setup: downloads llama-server binary + GGUF model
brain setup --llama-server

# Configure the model (optional)
brain config hf-repo Qwen/Qwen3.5-4B-Instruct-GGUF
brain config hf-filename qwen3.5-4b-instruct-q4_k_m.gguf
brain config download

# Start the sidecar
brain serve

Then configure Synapses:

{
  "brain": {
    "url": "http://localhost:11435",
    "timeout_sec": 60,
    "enable_llm": true
  }
}

2. Ollama Backend (Requires Ollama Sidecar)

# Start Ollama (separate terminal)
ollama serve

# Setup brain for Ollama
brain setup

# Start the sidecar
brain serve

3. Local CGo (In-Process, No Subprocess)

# Requires C++ toolchain for compilation
CGO_ENABLED=1 go build -tags llamacpp ./cmd/brain/

# Start (uses local binary)
brain setup --local
brain serve

HTTP API Reference

All endpoints run on localhost:11435 by default (port configurable in brain.json).

Health & Status

| Endpoint | Method | Description |
|----------|--------|-------------|
| /v1/health | GET | LLM liveness + model name + available status |
| /v1/summary/{nodeId} | GET | Fetch cached semantic summary for one node |
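A client typically polls /v1/health before enabling LLM features. The parser below is a hedged sketch: the field names `available` and `model` are assumptions inferred from the table's description, so check them against your sidecar's actual payload.

```python
import json

def brain_ready(raw: str) -> bool:
    """Parse a /v1/health body and decide whether the brain is usable.

    Field names ("available", "model") are assumptions based on the
    description above, not a confirmed schema.
    """
    payload = json.loads(raw)
    return bool(payload.get("available")) and bool(payload.get("model"))
```

When the sidecar is up, the raw body would come from a GET to `http://localhost:11435/v1/health`.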

Core LLM Tasks

| Endpoint | Tier | Params | Response | Description |
|----------|------|--------|----------|-------------|
| /v1/ingest | 0 | node_id, code | summary, tags | Generate prose briefing for a code entity |
| /v1/enrich | 2 | root_id, neighbors, task_context | insight, concerns, llm_used | Architectural insight for entity cluster |
| /v1/explain-violation | 1 | rule_id, source_file, target_name | explanation, fix | Plain-English rule explanation (cached) |
| /v1/coordinate | 3 | new_agent_id, conflicting_claims | suggestion, alternative_scope | Multi-agent conflict resolution |
| /v1/context-packet | optional | snapshot, phase, quality_mode | ContextPacket JSON | Phase-aware context assembly |
| /v1/prune | 0 | content | pruned, length_before, length_after | Strip web boilerplate, return technical content |
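As an example of calling one of these endpoints, the helper below builds (but does not send) a POST to /v1/explain-violation using only the parameter names from the table. The flat JSON body shape is an assumption; the sidecar may expect a different envelope.

```python
import json
import urllib.request

def explain_violation_request(rule_id, source_file, target_name,
                              base_url="http://localhost:11435"):
    """Build a POST request for /v1/explain-violation.

    Parameter names come from the table above; the flat JSON body is an
    assumed shape, not a confirmed schema.
    """
    body = json.dumps({
        "rule_id": rule_id,
        "source_file": source_file,
        "target_name": target_name,
    }).encode()
    return urllib.request.Request(
        f"{base_url}/v1/explain-violation",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the sidecar running, send it via `urllib.request.urlopen(req)` and read the `explanation` and `fix` fields from the response.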

SDLC & Decision Log

| Endpoint | Method | Params | Description |
|----------|--------|--------|-------------|
| /v1/sdlc | GET | — | Get current SDLC phase + quality mode |
| /v1/sdlc/phase | PUT | phase, agent_id | Set phase (planning/development/testing/review/release) |
| /v1/sdlc/mode | PUT | mode, agent_id | Set mode (standard/enterprise) |
| /v1/decision | POST | agent_id, phase, entity_name, action, outcome, notes | Log agent action to learning loop |
| /v1/patterns | GET | trigger (optional), limit | Learned co-occurrence patterns |

Architectural Decision Records (ADRs)

| Endpoint | Method | Params | Description |
|----------|--------|--------|-------------|
| /v1/adr | POST | id, title, status, decision, context, consequences, linked_files | Create/update ADR |
| /v1/adr | GET | file (optional) | List ADRs; optionally filter by file |
| /v1/adr/{id} | GET | — | Get single ADR by ID |

Embeddings

| Endpoint | Method | Input | Output | Description |
|----------|--------|-------|--------|-------------|
| /v1/embed | POST | `{"input": "text"}` or `{"input": ["text1", "text2"]}` | `{"embedding": [...]}` or `{"embeddings": [...]}` | Single or batch embeddings (nomic-embed-text, 768-dim) |
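Embedding vectors like the 768-dim ones returned here are usually compared with cosine similarity. A minimal, dependency-free sketch (the server call itself is omitted; only the comparison is shown):

```python
import math

def cosine(a, b):
    """Cosine similarity between two embedding vectors, e.g. two vectors
    pulled from a /v1/embed batch response."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)
```

Scores near 1.0 mean the two texts are semantically close; near 0.0 means unrelated.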

brain.json Configuration

Default path: ~/.synapses/brain.json. Override with $BRAIN_CONFIG env var.

{
  "enabled": true,
  "backend": "llama-server",
  "port": 11435,
  "timeout_ms": 120000,
  "db_path": "~/.synapses/brain.sqlite",

  "model_ingest": "qwen2.5-coder:1.5b",
  "model_guardian": "qwen2.5-coder:1.5b",
  "model_enrich": "qwen2.5-coder:7b",
  "model_orchestrate": "qwen2.5-coder:7b",

  "ingest": true,
  "enrich": true,
  "guardian": true,
  "orchestrate": true,
  "context_builder": true,
  "learning_enabled": true,

  "default_phase": "development",
  "default_mode": "standard",

  "llama_server_port": 11438,
  "hf_repo": "Qwen/Qwen3.5-4B-Instruct-GGUF",
  "hf_filename": "qwen3.5-4b-instruct-q4_k_m.gguf",

  "embedding_enabled": true,
  "embed_hf_repo": "nomic-ai/nomic-embed-text-v1.5-GGUF",
  "embed_hf_filename": "nomic-embed-text-v1.5.Q4_K_M.gguf",
  "embed_port": 11437
}

Key fields:

  • backend — llama-server, ollama, or local
  • timeout_ms — LLM inference timeout (120000 = 120s for CPU)
  • model_* — Per-tier model configuration
  • hf_repo / hf_filename — HuggingFace model download (llama-server backend)
  • embedding_enabled — Enable vector embeddings (requires embed_port)
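A client that wants to respect this file can load it with the same lookup order. The sketch below is a hypothetical helper, not part of the brain CLI; only the `$BRAIN_CONFIG` override and `~/.synapses/brain.json` default come from the docs above.

```python
import json
import os

def load_brain_config(path=None):
    """Load brain.json, honoring $BRAIN_CONFIG and the ~/.synapses default.

    Hypothetical helper: the lookup order mirrors the docs above, the rest
    is a plain JSON read.
    """
    path = (path
            or os.environ.get("BRAIN_CONFIG")
            or os.path.expanduser("~/.synapses/brain.json"))
    with open(path) as f:
        cfg = json.load(f)
    # Paths like "~/.synapses/brain.sqlite" need tilde expansion before use.
    if "db_path" in cfg:
        cfg["db_path"] = os.path.expanduser(cfg["db_path"])
    return cfg
```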

CLI Commands

| Command | Description |
|---------|-------------|
| brain serve | Start HTTP sidecar server on configured port |
| brain status | Show LLM status, model, SQLite stats, feature flags, SDLC config |
| brain config &lt;key&gt; &lt;value&gt; | Set config field and persist to brain.json |
| brain setup | Interactive setup (Ollama backend): probe models, detect GPU, write config |
| brain setup --llama-server | Setup llama-server backend: download binary + GGUF |
| brain setup --local | Setup local CGo backend: configure for in-process inference |
| brain ingest &lt;json&gt; | Manually trigger ingest task (for testing) |
| brain summaries | List all cached semantic summaries |
| brain sdlc | Show/set SDLC phase and mode |
| brain decisions [entity] | List decision log entries (optionally filtered by entity) |
| brain patterns | List learned co-occurrence patterns sorted by confidence |
| brain reset | Clear all brain.sqlite data (prompts for confirmation) |
| brain benchmark | Measure latency of all configured LLM models |
| brain version | Print version |

SDLC Phase Awareness

The brain is aware of your project's SDLC phase. Different phases get different context:

| Phase | Mode | Checklist | Use Case |
|-------|------|-----------|----------|
| planning | standard/enterprise | Requirements, design, dependencies | Architecting new features |
| development | standard/enterprise | Code review, testing, integration | Daily coding |
| testing | standard/enterprise | Test coverage, edge cases, performance | QA and validation |
| review | standard/enterprise | Release notes, changelog, deprecations | Pre-release |
| release | standard/enterprise | Rollout plan, rollback, monitoring | Deployment |

Set the phase via brain sdlc phase <phase> or the /v1/sdlc/phase API.
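A client can guard against typos before calling the API. The validator below is a hypothetical client-side sketch; the legal phase and mode values are exactly those listed in the table above.

```python
# Hypothetical client-side guard mirroring the phase/mode values above.
PHASES = ("planning", "development", "testing", "review", "release")
MODES = ("standard", "enterprise")

def validate_sdlc(phase, mode):
    """Raise ValueError before sending a bad value to /v1/sdlc/phase."""
    if phase not in PHASES:
        raise ValueError(f"unknown phase {phase!r}, expected one of {PHASES}")
    if mode not in MODES:
        raise ValueError(f"unknown mode {mode!r}, expected one of {MODES}")
```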


Privacy Guarantee

All inference is local. No code or context leaves your machine.

  • ✅ No cloud APIs — models run on localhost (llama-server, Ollama, or in-process).
  • ✅ No telemetry — no metrics, tracking, or logging to external services.
  • ✅ SQLite-only storage — all summaries, insights, and decisions stored locally in brain.sqlite.


Contributing

We welcome contributions! See CONTRIBUTING.md for:

  • How to add a new LLM backend
  • How to add a new HTTP route
  • Testing with MockLLMClient (no Ollama needed)

License

MIT License — See LICENSE for details.


About

The Thinking Brain for Synapses — a local LLM sidecar that adds semantic reasoning, context packets, and co-occurrence learning to the code-graph MCP server.
