Skip to content

[T-004] Add comprehensive test suite for flux-py VM#1

Open
SuperInstance wants to merge 1 commit intomainfrom
greenhorn/T-004-flux-py
Open

[T-004] Add comprehensive test suite for flux-py VM#1
SuperInstance wants to merge 1 commit intomainfrom
greenhorn/T-004-flux-py

Conversation

@SuperInstance
Copy link
Copy Markdown
Owner

@SuperInstance SuperInstance commented Apr 12, 2026

Summary

  • Added pytest test framework (previously 0 tests)
  • Created test suite covering VM execution, assembler, disassembler, vocabulary, and interpreter
  • 84 tests across 3 test files, all passing in 0.22s

Test Coverage

VM Core (24 tests)

  • Basic execution: MOVI + HALT, multiple registers, empty bytecode, max cycles
  • Arithmetic: IADD (positive, negative), IMUL (normal, zero), IDIV (normal, truncation, div-by-zero), DEC (normal, negative)
  • Control flow: JNZ taken/not-taken, loop with DEC+JNZ, factorial loop, sum 1..10
  • Error handling: unknown opcode, truncated bytecode (MOVI, IADD), error returns None
  • Register access: all 16 default zero, out-of-bounds returns 0, specific register result
  • Opcodes constant verification

Assembler & Disassembler (32 tests)

  • All 7 opcodes encoded correctly (HALT, MOVI, IADD, IMUL, IDIV, DEC, JNZ)
  • Label resolution: forward jumps, backward jumps, labels on own line, unresolved label error
  • Comments: # and ; styles, comments between instructions
  • Error cases: unknown mnemonic
  • Multi-instruction programs: (3+4)*2=14, factorial(5)=120
  • Round-trips: assemble→execute, assemble→disassemble
  • Disassembler: all opcodes, unknown opcode, address prefixes, empty bytecode

Vocabulary & Interpreter (28 tests)

  • Pattern matching: add/match, return fields (assembly, result_reg, description), no-match, empty vocab
  • Case-insensitive matching, first-match-wins
  • Built-in vocabulary: 5 patterns, group extraction for all patterns
  • Interpreter: all 5 built-in commands (add, mul, factorial, double, square)
  • Edge cases: factorial(1), no-match, custom vocabulary, cycle count in message

Testing

python -m pytest tests/ -v → 84 passed in 0.22s

Staging: Open in Devin

- 84 tests across 3 test files covering VM core, assembler, disassembler, vocabulary, and interpreter
- VM core: basic execution, arithmetic (IADD, IMUL, IDIV, DEC), control flow (JNZ, loops), error handling, register access
- Assembler: all 7 opcodes, label resolution (forward/backward), comments, error cases, multi-instruction programs
- Disassembler: all opcodes, unknown opcode handling, empty bytecode
- Vocabulary: pattern matching, groups extraction, case insensitivity, built-in patterns
- Interpreter: all 5 built-in commands (add, mul, factorial, double, square), custom vocabulary, no-match handling
- Round-trip tests: assemble→execute, assemble→disassemble
Copy link
Copy Markdown

@beta-devin-ai-integration beta-devin-ai-integration bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Devin Review found 1 potential issue.

View 4 additional findings in Devin Review.

Staging: Open in Devin

Comment thread tests/test_vm.py
Comment on lines +79 to +85
def test_max_cycles_limit(self):
"""VM should stop after max_cycles even if not halted."""
# MOVI R0, 1 — no HALT, so it would loop forever
bc = movi(0, 1)
vm = FluxVM(bc, max_cycles=5).execute()
assert not vm.halted
assert vm.cycles == 1 # only 1 instruction in bytecode
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🟡 test_max_cycles_limit doesn't actually test the max_cycles limit

The test claims to verify "VM should stop after max_cycles even if not halted" but the bytecode movi(0, 1) is a single 4-byte instruction. After executing it, PC=4 equals len(bc)=4, so the VM loop exits due to the self.pc < len(self.bc) condition at flux_vm.py:64 — NOT because max_cycles=5 was reached. The self.cycles < self.max_cycles condition is never the limiting factor. If someone removed the max_cycles check from the VM's while loop, this test would still pass, providing a false sense of regression protection for that feature. The test needs bytecode containing a loop (e.g., using JNZ to jump backward) so that the max_cycles limit is actually what terminates execution.

Suggested change
def test_max_cycles_limit(self):
"""VM should stop after max_cycles even if not halted."""
# MOVI R0, 1 — no HALT, so it would loop forever
bc = movi(0, 1)
vm = FluxVM(bc, max_cycles=5).execute()
assert not vm.halted
assert vm.cycles == 1 # only 1 instruction in bytecode
def test_max_cycles_limit(self):
"""VM should stop after max_cycles even if not halted."""
# Infinite loop: MOVI R0, 1; JNZ R0, -4 (jumps back to JNZ itself)
bc = movi(0, 1) + jnz(0, -4)
vm = FluxVM(bc, max_cycles=5).execute()
assert not vm.halted
assert vm.cycles == 5
Staging: Open in Devin

Was this helpful? React with 👍 or 👎 to provide feedback.

Debug

Playground

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant