Summary
The symbolic checker rejects Set[Seq[T]] types. This blocks symbolic checking of specs that model messages as sets of tuples — the natural encoding for message-passing protocols (Raft, Paxos, PBFT, etc.).
check error: unsupported construct for symbolic checking: Set[Seq[T]]
requires exponential encoding (tracks membership of every possible sequence).
hint: use --bfs to check with explicit-state BFS instead
Why this is fundamentally hard
Current set encoding: bitvector membership
Sets in the symbolic checker use an exploded bitvector encoding — one Z3 Bool variable per possible element in the domain:
Set[0..5] → 6 Z3 Bools: [member_0, member_1, ..., member_5]
Set operations map cleanly:
x in S → member_x
S union T → element-wise OR
S intersect T → element-wise AND
S \ T → element-wise AND-NOT
{} (empty) → all false
This is efficient because the domain is small and discrete. For Set[0..N] with N=100, you get 100 Z3 Bools — perfectly manageable.
Why sequences break this
A Seq[0..M] with max length L has Σᵢ₌₀ᴸ (M+1)ⁱ possible values (including the empty sequence). Each distinct sequence needs its own membership Bool.
| M (element range) |
L (max length) |
Possible sequences |
Z3 Bools needed |
| 2 |
3 |
40 |
40 |
| 3 |
3 |
85 |
85 |
| 5 |
5 |
9,331 |
9,331 |
| 10 |
5 |
161,051 |
161,051 |
| 10 |
9 |
~10B |
infeasible |
For Raft messages ([term, src, dest, lastLogTerm, lastLogIndex] — 5 fields, each bounded by N/MaxVal/MaxLogLen), even small constants (N=2, MaxVal=2, MaxLogLen=3) produce thousands of possible tuples. At realistic sizes the bitvector is infeasible.
Current sequence encoding
Sequences use an exploded positional encoding: 1 Z3 Int for length + L Z3 variables for elements:
Seq[Int] with max_len=3 → 4 Z3 vars: [len, elem_0, elem_1, elem_2]
This is efficient for individual sequence variables but doesn't compose with the bitvector set encoding — you'd need one such group per possible sequence value in the set, which is the exponential blowup.
Possible approaches
1. Bounded enumeration (feasible for small domains)
When element type is bounded (0..M) and max_len is known, enumerate all possible sequences and assign each a unique integer ID. Then Set[Seq[0..M]] becomes Set[0..K] using the existing bitvector encoding.
Encoding: Sequence → integer via positional encoding. For [a₀, a₁] with elements in 0..M: id = len_offset + a₀ * (M+1) + a₁.
Pros: Reuses existing bitvector infrastructure. All set operations work unchanged.
Cons: Only works when total sequence count < threshold (e.g. 10,000). Needs encode/decode functions in the encoder. Doesn't help for realistic protocol message spaces.
2. Z3 native Array(Seq, Bool) encoding
Z3 has native Array sort: Array(KeySort, ValueSort). A set can be encoded as Array(SeqSort, Bool) where Select(arr, seq) == true means membership.
Set[Seq[Int]] → Z3 Array(Seq(Int), Bool)
x in S → Select(S, x)
S union T → MapOr(S, T) (element-wise OR via Z3 map)
S = {} → ConstArray(Seq(Int), false)
S ∪ {x} → Store(S, x, true)
Pros: Handles unbounded domains. No exponential blowup. Elegant encoding.
Cons: Z3's reasoning about Array(Seq, Bool) is incomplete — the solver may return Unknown or diverge. Inductive checking (the primary symbolic use case) may hit undecidability. Performance unpredictable.
3. Quantifier-based encoding
Encode set membership as a Z3 predicate: member(S, x) ↔ (x = a ∨ x = b ∨ ...) where a, b, ... are the elements added to S. Union becomes disjunction of membership predicates.
Pros: Works for any element type Z3 can compare.
Cons: Formula size grows with number of set operations in the spec. Quantifier reasoning in Z3 is a known bottleneck. May not terminate for complex properties.
4. Workaround: Dict[Int, Seq[T]] with counter (current recommendation)
Replace Set[Seq[T]] with Dict[Int, Seq[T]] keyed by a monotonic counter. Membership test becomes any i in 0..counter: msgs[i] == target.
Pros: Works today with existing encoder. No symbolic changes needed.
Cons: Changes the model semantics (ordered/indexed vs set-based). Counter variable adds state dimensions. Membership test is O(counter) in the formula.
5. Keep --bfs fallback
The --bfs explicit-state backend handles Set[Seq[T]] natively with no limitations. For bounded model checking (which is the primary use case for protocol verification), BFS is already the right tool.
Pros: Already works. No changes needed.
Cons: Loses symbolic mode benefits (unbounded reasoning, inductive proofs).
Recommendation
Approach 1 (bounded enumeration) is the most practical incremental improvement — it extends the existing architecture and handles the cases where symbolic checking is most useful (small parameter spaces). For large protocol specs, BFS remains the right tool regardless.
Approach 2 (Z3 native arrays) is the theoretically correct solution but carries significant risk of solver incompleteness. Worth investigating as a research direction but not as a reliability-critical feature.
Summary
The symbolic checker rejects
Set[Seq[T]]types. This blocks symbolic checking of specs that model messages as sets of tuples — the natural encoding for message-passing protocols (Raft, Paxos, PBFT, etc.).Why this is fundamentally hard
Current set encoding: bitvector membership
Sets in the symbolic checker use an exploded bitvector encoding — one Z3
Boolvariable per possible element in the domain:Set operations map cleanly:
x in S→member_xS union T→ element-wise ORS intersect T→ element-wise ANDS \ T→ element-wise AND-NOT{} (empty)→ all falseThis is efficient because the domain is small and discrete. For
Set[0..N]with N=100, you get 100 Z3 Bools — perfectly manageable.Why sequences break this
A
Seq[0..M]with max length L has Σᵢ₌₀ᴸ (M+1)ⁱ possible values (including the empty sequence). Each distinct sequence needs its own membership Bool.For Raft messages (
[term, src, dest, lastLogTerm, lastLogIndex]— 5 fields, each bounded by N/MaxVal/MaxLogLen), even small constants (N=2, MaxVal=2, MaxLogLen=3) produce thousands of possible tuples. At realistic sizes the bitvector is infeasible.Current sequence encoding
Sequences use an exploded positional encoding: 1 Z3 Int for length + L Z3 variables for elements:
This is efficient for individual sequence variables but doesn't compose with the bitvector set encoding — you'd need one such group per possible sequence value in the set, which is the exponential blowup.
Possible approaches
1. Bounded enumeration (feasible for small domains)
When element type is bounded (0..M) and max_len is known, enumerate all possible sequences and assign each a unique integer ID. Then
Set[Seq[0..M]]becomesSet[0..K]using the existing bitvector encoding.Encoding: Sequence → integer via positional encoding. For
[a₀, a₁]with elements in 0..M:id = len_offset + a₀ * (M+1) + a₁.Pros: Reuses existing bitvector infrastructure. All set operations work unchanged.
Cons: Only works when total sequence count < threshold (e.g. 10,000). Needs encode/decode functions in the encoder. Doesn't help for realistic protocol message spaces.
2. Z3 native Array(Seq, Bool) encoding
Z3 has native
Arraysort:Array(KeySort, ValueSort). A set can be encoded asArray(SeqSort, Bool)whereSelect(arr, seq) == truemeans membership.Pros: Handles unbounded domains. No exponential blowup. Elegant encoding.
Cons: Z3's reasoning about
Array(Seq, Bool)is incomplete — the solver may returnUnknownor diverge. Inductive checking (the primary symbolic use case) may hit undecidability. Performance unpredictable.3. Quantifier-based encoding
Encode set membership as a Z3 predicate:
member(S, x) ↔ (x = a ∨ x = b ∨ ...)wherea, b, ...are the elements added to S. Union becomes disjunction of membership predicates.Pros: Works for any element type Z3 can compare.
Cons: Formula size grows with number of set operations in the spec. Quantifier reasoning in Z3 is a known bottleneck. May not terminate for complex properties.
4. Workaround: Dict[Int, Seq[T]] with counter (current recommendation)
Replace
Set[Seq[T]]withDict[Int, Seq[T]]keyed by a monotonic counter. Membership test becomesany i in 0..counter: msgs[i] == target.Pros: Works today with existing encoder. No symbolic changes needed.
Cons: Changes the model semantics (ordered/indexed vs set-based). Counter variable adds state dimensions. Membership test is O(counter) in the formula.
5. Keep --bfs fallback
The
--bfsexplicit-state backend handlesSet[Seq[T]]natively with no limitations. For bounded model checking (which is the primary use case for protocol verification), BFS is already the right tool.Pros: Already works. No changes needed.
Cons: Loses symbolic mode benefits (unbounded reasoning, inductive proofs).
Recommendation
Approach 1 (bounded enumeration) is the most practical incremental improvement — it extends the existing architecture and handles the cases where symbolic checking is most useful (small parameter spaces). For large protocol specs, BFS remains the right tool regardless.
Approach 2 (Z3 native arrays) is the theoretically correct solution but carries significant risk of solver incompleteness. Worth investigating as a research direction but not as a reliability-critical feature.