FLUX × llama.cpp — Novel integration of FLUX bytecode agents with LLM inference.
What if token sampling in language models was driven by bytecode programs running on a virtual machine? What if multiple agents, each running their own sampling strategy as FLUX bytecode, voted on each token via A2A-style consensus?
This is that experiment.
                    LLM Output Logits
                           │
                           ▼
┌─────────────┐     ┌─────────────┐     ┌─────────────┐
│   Agent 0   │     │   Agent 1   │     │   Agent 2   │
│ (Conserv.)  │     │ (Creative)  │     │ (Penalty)   │
│             │     │             │     │             │
│  FLUX Byte  │     │  FLUX Byte  │     │  FLUX Byte  │
│  code:      │     │  code:      │     │  code:      │
│  logit * 2  │     │  pos-dep    │     │  freq-div   │
│             │     │  temperature│     │             │
└──────┬──────┘     └──────┬──────┘     └──────┬──────┘
       │                   │                   │
       └───────────────────┼───────────────────┘
                           │
                    ┌──────▼──────┐
                    │  Weighted   │
                    │    Vote     │
                    │   (A2A)     │
                    └──────┬──────┘
                           │
                    ┌──────▼──────┐
                    │  Selected   │
                    │    Token    │
                    └─────────────┘
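The weighted-vote stage in the diagram reduces to a few lines of C. Below is a minimal sketch: the `flux_agent_vote` struct and `flux_weighted_vote` name are illustrative assumptions, not the project's actual API, and the scores and weights in the usage example are made up.

```c
#include <stddef.h>

/* Hypothetical agent result: a per-token score array plus an
 * A2A-style trust weight. */
typedef struct {
    const float *scores;  /* one score per candidate token */
    float weight;         /* trust weight from A2A-style scoring */
} flux_agent_vote;

/* Weighted consensus: each agent's score is scaled by its trust
 * weight, and the token with the highest combined score wins. */
static int flux_weighted_vote(const flux_agent_vote *agents,
                              size_t n_agents, size_t n_tokens)
{
    int best = 0;
    float best_score = -1e30f;
    for (size_t t = 0; t < n_tokens; t++) {
        float combined = 0.0f;
        for (size_t a = 0; a < n_agents; a++)
            combined += agents[a].weight * agents[a].scores[t];
        if (combined > best_score) {
            best_score = combined;
            best = (int)t;
        }
    }
    return best;
}
```

With the demo's 0.5/0.3/0.2 weights, a token that only the low-weight Penalty agent loves can still lose to one the Conservative agent backs, which is exactly the point of trust-weighted consensus.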
- Each agent is a FLUX bytecode program that scores candidate tokens
- Agents vote via weighted consensus (A2A-style trust scoring)
- Strategies can be swapped at runtime by loading different bytecode
- Each agent's bytecode is converted to a 128-dim embedding vector
- Opcode frequency → embedding dimension
- Enables similarity comparison between agent strategies
- Standalone (this demo): Simulated logits, pure FLUX VM sampling
- llama.cpp hook: Wire into llama_sample_token() callback
- ggml tensors: Map FLUX registers to tensor operations
- Custom models: Bytecode as a "programming layer" over any LLM
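A llama.cpp hook would need roughly the following shape: take the raw logits for one position, run every agent over every candidate, return the consensus token id. Everything here is a hypothetical shim, not llama.cpp API; `flux_score_fn` stands in for the FLUX VM, and the exact wiring depends on which llama.cpp sampling interface you target.

```c
#include <stddef.h>

/* Hypothetical agent callback: rescore one candidate token. In the real
 * system this would execute the agent's FLUX bytecode; here a plain
 * function pointer stands in for the VM. */
typedef float (*flux_score_fn)(float logit, int token_id, int pos);

typedef struct {
    flux_score_fn score;
    float weight;
} flux_swarm_agent;

/* Shim with the shape a sampling hook would need. The name and
 * signature are assumptions for illustration. */
static int flux_swarm_sample(const float *logits, int n_vocab, int pos,
                             const flux_swarm_agent *agents, size_t n_agents)
{
    int best = 0;
    float best_score = -1e30f;
    for (int t = 0; t < n_vocab; t++) {
        float s = 0.0f;
        for (size_t a = 0; a < n_agents; a++)
            s += agents[a].weight * agents[a].score(logits[t], t, pos);
        if (s > best_score) { best_score = s; best = t; }
    }
    return best;
}

/* Example agent: the "Conservative" strategy (logit * 2) from the demo. */
static float flux_agent_conservative(float logit, int token_id, int pos)
{
    (void)token_id; (void)pos;
    return logit * 2.0f;
}
```

The key design point is that the hook only needs read access to one position's logits, so swapping the function-pointer table for a bytecode interpreter changes nothing about the integration surface.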
# Standalone (no llama.cpp needed)
gcc -std=c11 -Wall -O2 -DFLUX_STANDALONE -o flux-llama src/flux_llama.c -lm
./flux-llama
# With llama.cpp (requires llama.cpp installed)
gcc -std=c11 -Wall -O2 -I/path/to/llama.cpp/include \
  -o flux-llama src/flux_llama.c -lm -lllama

📊 Setting up 3-agent inference swarm...
Agent 0 (Conservative): weight=0.5 — boosts high-logit tokens
Agent 1 (Creative): weight=0.3 — position-dependent temperature
Agent 2 (Penalty): weight=0.2 — penalizes high-frequency tokens
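The three demo strategies can be written out as plain scoring functions. Only "logit * 2" is given explicitly above; the temperature schedule, the penalty formula, and the `freq[]` interface are guesses at what the bytecode computes, so treat them as illustrative, not as the shipped behavior.

```c
/* Conservative (weight 0.5): boost high-logit tokens by doubling them. */
static float score_conservative(float logit, int token_id, int pos)
{
    (void)token_id; (void)pos;
    return logit * 2.0f;
}

/* Creative (weight 0.3): position-dependent temperature -- later
 * positions get a hotter (flatter) distribution. The linear schedule
 * here is an assumption. */
static float score_creative(float logit, int token_id, int pos)
{
    (void)token_id;
    float temp = 1.0f + 0.05f * (float)pos;  /* assumed schedule */
    return logit / temp;
}

/* Penalty (weight 0.2): divide by how often the token has already been
 * emitted. freq[] would be maintained by the generation loop; this
 * interface is an assumption. */
static float score_penalty(float logit, int token_id, int pos,
                           const int *freq)
{
    (void)pos;
    return logit / (1.0f + (float)freq[token_id]);
}
```

Note how the Penalty strategy directly targets the degenerate "the the the..." loop shown in the sample output: each repeat shrinks that token's score until another candidate wins the vote.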
📝 Generating text (20 positions) via swarm consensus:
the the the the the the the the the the the sea sea sea sea sea ...
- Agent-driven creativity: Different sampling strategies create different "voices"
- Evolutionary optimization: Bytecode can be mutated and selected for quality
- Transparent decisions: You can disassemble exactly why a token was chosen
- Composable: Mix and match agent strategies like LEGO blocks
- Fast: FLUX VM runs at 48K+ ops/sec on ARM — negligible overhead vs LLM inference
- Real llama.cpp integration (sampling callback hook)
- GPU-accelerated FLUX VM (CUDA) for batch scoring
- Evolutionary agent optimization (mutate bytecode, select by output quality)
- Bytecode embeddings as features for model fine-tuning
- Multi-model swarms (different base models, FLUX coordination layer)
MIT — SuperInstance (DiGennaro et al.)