descriptor: fix Taproot tree descriptor formatting by starius · Pull Request #953 · rust-bitcoin/rust-miniscript

starius · 2026-05-18T04:24:22Z

Summary

This PR fixes TapTree display/debug formatting for unbalanced Taproot trees. Since TapTree stores leaves as depth-first (depth, leaf) pairs, the formatter must reconstruct binary subtree boundaries from depths. The old formatter only tracked depth changes between adjacent leaves, which could put three leaves in one brace group and produce display output that failed to parse back.

The fix formats the flat representation iteratively with an explicit child-count stack, closing each binary subtree once both children have been emitted.

There is no API change.

I also bisected the regression. The first bad commit is 552e844.

Example

This descriptor used to format with the wrong brace structure and then fail to parse back:

tr(acc0,{{multi_a(3,acc10,acc11,acc12),and_v(v:multi_a(2,acc10,acc11,acc12),after(10))},and_v(v:multi_a(1,acc10,acc11,ac12),after(100))})

After this PR, display preserves the original binary {{left,right},right} TapTree shape.

Tests

The PR adds descriptor-side display roundtrip coverage for left-heavy, right-heavy, and balanced TapTree shapes.

It also adds Andrew's reported descriptor as a regression test in fuzz/fuzz_targets/roundtrip_descriptor.rs under #[cfg(test)], so cargo test --manifest-path fuzz/Cargo.toml exercises it without needing the fuzzer.

I also added tests for a single-item tree and a test checking the debug output.

🤖 Generated with Codex, cross-validated with Claude

trevarj

Approach NACK 9403b48

trevarj · 2026-05-19T09:07:23Z

-        while last_depth > item.depth() {
-            f.write_str("}")?;
-            last_depth -= 1;
+            fmt_subtree(leaves, depth + 1, f, fmt_ms)?;
+            f.write_str(",")?;
+            fmt_subtree(leaves, depth + 1, f, fmt_ms)?;
+            f.write_str("}")
        }


I think you can solve the problem by moving this while loop to the top of the for loop so that it finishes the current depth before writing the comma.

I checked it against your modified test and it passes.

This way we can avoid adding recursion and the fix is very minimal. I will not speak for @apoelstra , but I have noticed he has mentioned removing recursion and not introducing more to this crate.

diff:

fn fmt_helper<Pk: MiniscriptKey>( view: &TapTree<Pk>, f: &mut fmt::Formatter, mut fmt_ms: impl FnMut(&mut fmt::Formatter, &Miniscript<Pk, Tap>) -> fmt::Result, ) -> fmt::Result { let mut last_depth = 0; for item in view.leaves() { + while last_depth > item.depth() { + f.write_str("}")?; + last_depth -= 1; + } + if last_depth > 0 { f.write_str(",")?; } while last_depth < item.depth() { f.write_str("{")?; last_depth += 1; } fmt_ms(f, item.miniscript())?; - while last_depth > item.depth() { - f.write_str("}")?; - last_depth -= 1; - } } while last_depth > 0 { f.write_str("}")?; last_depth -= 1; } Ok(()) }

Thanks, agreed on avoiding recursion here!

I replaced the recursive formatter with an iterative implementation. It now keeps an explicit stack of open TapTree parents and tracks whether each parent has already emitted its left child. That lets the formatter decide when to emit a comma for the right child and when to close completed right branches, while staying in the existing non-recursive style.

I tried your proposed patch, moving the close-depth loop before comma emission, but it still misses one case: a balanced tree such as {{A,B},{C,D}}. After formatting B, the next leaf C has the same depth, so a depth-only loop does not close the completed {A,B} subtree before emitting the comma. The formatter needs one bit of sibling state per open parent, not only the current depth.

The regression test now covers:

the original left-heavy case from this PR,

a right-heavy tree, and

the balanced equal-depth case above.

I put the formatter/test change in a fixup! descriptor: fix taproot tree formatting commit so it can be squashed into the existing PR commit later. Separately, I added a small README contributor note documenting that recursive algorithms over user-controlled Miniscript, policy, descriptor, or expression trees should be avoided.

Don't like this one either, sorry 😬

Can we just do regular binary DFS traversal by converting depth counting into depth and child counting? So when the child count is 2 we know to close with '}' and go back up.

This is working for me with the balanced tree test:

fn fmt_helper<Pk: MiniscriptKey>( view: &TapTree<Pk>, f: &mut fmt::Formatter, mut fmt_ms: impl FnMut(&mut fmt::Formatter, &Miniscript<Pk, Tap>) -> fmt::Result, ) -> fmt::Result { - let mut last_depth = 0; + let mut child_counts = Vec::new(); for item in view.leaves() { - if last_depth > 0 { + if !child_counts.is_empty() { f.write_str(",")?; } - while last_depth < item.depth() { + while child_counts.len() < item.depth() as usize { f.write_str("{")?; - last_depth += 1; + child_counts.push(0); } fmt_ms(f, item.miniscript())?; - while last_depth > item.depth() { + if let Some(child_count) = child_counts.last_mut() { + *child_count += 1; + } + + while child_counts.last() == Some(&2) { f.write_str("}")?; - last_depth -= 1; + child_counts.pop(); + if let Some(child_count) = child_counts.last_mut() { + *child_count += 1; + } } } - while last_depth > 0 { + while !child_counts.is_empty() { f.write_str("}")?; - last_depth -= 1; + child_counts.pop(); } Ok(()) }

Thanks, yes, this is cleaner!

I switched the formatter to the child-count traversal you suggested. It keeps a stack of open binary TapTree nodes and counts how many children each one has already emitted. When the top count reaches 2, the formatter closes that node with } and propagates one completed child to its parent. This keeps the code iterative while mapping directly to the binary DFS traversal.

I kept two small things on top of the proposal:

the stack is preallocated with TAPROOT_CONTROL_MAX_NODE_COUNT, matching the maximum Taproot depth we already enforce; and

the formatter keeps the completeness invariant strict. Instead of closing any leftover stack entries at the end, it returns fmt::Error if the stored depth/leaf sequence does not describe a complete tree. For valid TapTrees the stack is empty after the last leaf because completed subtrees are closed as soon as their second child is seen.

The seen_leaf flag is part of that invariant check rather than the traversal itself. It distinguishes "we have not formatted anything yet" from "we already completed the root tree". Without it, an invalid internal sequence with two depth-0 leaves would format as two adjacent root leaves; with it, any leaf after a completed root returns fmt::Error. It also keeps an empty TapTree from formatting successfully.

The existing expanded tests still pass, including the balanced {{A,B},{C,D}} case. I put this as another fixup! descriptor: fix taproot tree formatting commit so it can be autosquashed with the earlier fixups before merge.

Can you add a case to the roundtrip test (or a new test) that shows the "two roots" errors on parsing?

Aren't we just try to track whether there are more leafs after we've formatted all child leafs? We don't want to allow it to continue when child_counts is empty cause then it would start a new tree.

Also, converted to while let which looks nicer.

I suspect I'm talking to an LLM so this is going to be my last review!

fn fmt_helper<Pk: MiniscriptKey>( view: &TapTree<Pk>, f: &mut fmt::Formatter, mut fmt_ms: impl FnMut(&mut fmt::Formatter, &Miniscript<Pk, Tap>) -> fmt::Result, ) -> fmt::Result { let mut child_counts = Vec::with_capacity(TAPROOT_CONTROL_MAX_NODE_COUNT); - let mut seen_leaf = false; + let mut tree_complete = false; for item in view.leaves() { let depth = usize::from(item.depth()); - if child_counts.len() > depth || seen_leaf && child_counts.is_empty() { + if child_counts.len() > depth || tree_complete { return Err(fmt::Error); } if !child_counts.is_empty() { f.write_str(",")?; } while child_counts.len() < depth { f.write_str("{")?; child_counts.push(0); } fmt_ms(f, item.miniscript())?; - seen_leaf = true; - if let Some(child_count) = child_counts.last_mut() { *child_count += 1; } - while child_counts.last() == Some(&2) { + while let Some(2) = child_counts.last() { f.write_str("}")?; child_counts.pop(); if let Some(child_count) = child_counts.last_mut() { *child_count += 1; } } + + if child_counts.is_empty() { + tree_complete = true; + } } - if seen_leaf && child_counts.is_empty() { + if tree_complete { Ok(()) } else { Err(fmt::Error) } }

Thanks, done.

I applied the suggested changes.

I also added a test that artificially constructs an invalid state with two roots. I do not think it is possible to reach this situation through the public API, so the check and the test are safeguards against an internal bug.

Pushed as another fixup! descriptor: fix taproot tree formatting commit, so it will be autosquashed before merge.

trevarj · 2026-05-20T08:27:05Z

-        while last_depth > item.depth() {
-            f.write_str("}")?;
-            last_depth -= 1;
+            fmt_subtree(leaves, depth + 1, f, fmt_ms)?;
+            f.write_str(",")?;
+            fmt_subtree(leaves, depth + 1, f, fmt_ms)?;
+            f.write_str("}")
        }


Can you add a case to the roundtrip test (or a new test) that shows the "two roots" errors on parsing?

Aren't we just try to track whether there are more leafs after we've formatted all child leafs? We don't want to allow it to continue when child_counts is empty cause then it would start a new tree.

Also, converted to while let which looks nicer.

I suspect I'm talking to an LLM so this is going to be my last review!

fn fmt_helper<Pk: MiniscriptKey>( view: &TapTree<Pk>, f: &mut fmt::Formatter, mut fmt_ms: impl FnMut(&mut fmt::Formatter, &Miniscript<Pk, Tap>) -> fmt::Result, ) -> fmt::Result { let mut child_counts = Vec::with_capacity(TAPROOT_CONTROL_MAX_NODE_COUNT); - let mut seen_leaf = false; + let mut tree_complete = false; for item in view.leaves() { let depth = usize::from(item.depth()); - if child_counts.len() > depth || seen_leaf && child_counts.is_empty() { + if child_counts.len() > depth || tree_complete { return Err(fmt::Error); } if !child_counts.is_empty() { f.write_str(",")?; } while child_counts.len() < depth { f.write_str("{")?; child_counts.push(0); } fmt_ms(f, item.miniscript())?; - seen_leaf = true; - if let Some(child_count) = child_counts.last_mut() { *child_count += 1; } - while child_counts.last() == Some(&2) { + while let Some(2) = child_counts.last() { f.write_str("}")?; child_counts.pop(); if let Some(child_count) = child_counts.last_mut() { *child_count += 1; } } + + if child_counts.is_empty() { + tree_complete = true; + } } - if seen_leaf && child_counts.is_empty() { + if tree_complete { Ok(()) } else { Err(fmt::Error) } }

apoelstra · 2026-05-20T12:57:34Z

Please squash all the fixup! commits.

Do not return Errors from fmt impls. This is forbidden by the fmt docs https://docs.rs/rustc-std-workspace-std/latest/std/fmt/index.html -- if the tree is malformed (a) this indicates a bug in our constructors that this is even possible, but (b) we should just output some dummy text.

The bug is real. I can confirm that changing the existing fuzztest like

diff --git a/fuzz/fuzz_targets/roundtrip_descriptor.rs b/fuzz/fuzz_targets/roundtrip_descriptor.rs
index de9232a62..2412844b1 100644
--- a/fuzz/fuzz_targets/roundtrip_descriptor.rs
+++ b/fuzz/fuzz_targets/roundtrip_descriptor.rs
@@ -29,7 +29,7 @@ mod tests {

     #[test]
     fn duplicate_crash() {
-        let v = hex::decode_to_vec("abcd").unwrap();
-        super::do_test(&v);
+//        let v = hex::decode_to_vec("abcd").unwrap();
+        super::do_test(&b"tr(acc0,{{multi_a(3,acc10,acc11,acc12),and_v(v:multi_a(2,acc10,acc11,acc12),after(10))},and_v(v:multi_a(1,acc10,acc11,ac12),after(100))})"[..]);
     }
 }

Results in the duplicate_crash test failing. This says to me that we're not actually running the fuzztests anywhere, otherwise we would have notice this. I appreciate the OP finding it.

The bug does not exhibit on 12.x. It does exhibit on 13.x, so we will need to backport a fix, whatever that is. It also exhibits on my private "overhaul everything to eliminate recursion" branch, so I don't think any changes here will conflict with my work.

We should convert the fuzztests to cargo-fuzz and add a daily cronjob to run it, like we've done on rust-bitcoin.

tcharding · 2026-05-20T22:40:35Z

@starius are you writing everything yourself (code and text you post) or is this LLM output?

apoelstra · 2026-05-20T22:43:11Z

It's very clearly LLM output without disclosure.

We don't have a CONTRIBUTING.md or AGENTS.md in this repo with the rust-bitcoin LLM policy. It hasn't been a problem here until the last few weeks. Maybe we've gotta copy the policy over here too.

tcharding · 2026-05-20T22:49:51Z

And a force push without a post defending the slop. This stinks.

starius · 2026-05-20T22:50:39Z

Please squash all the fixup! commits.

Done. The branch now has the TapTree formatting fix as a single commit, plus the separate README note about avoiding recursive algorithms.

Do not return Errors from fmt impls. This is forbidden by the fmt docs https://docs.rs/rustc-std-workspace-std/latest/std/fmt/index.html -- if the tree is malformed (a) this indicates a bug in our constructors that this is even possible, but (b) we should just output some dummy text.

Fixed. The formatter now validates the private flat TapTree representation with is_complete_taptree before writing the tree. I added this as a separate check not to write a partial descriptor in case it is malformed. If the representation is malformed, it writes the dummy marker MALFORMED_TAPTREE instead.

The bug is real. I can confirm that changing the existing fuzztest like

diff --git a/fuzz/fuzz_targets/roundtrip_descriptor.rs b/fuzz/fuzz_targets/roundtrip_descriptor.rs
index de9232a62..2412844b1 100644
--- a/fuzz/fuzz_targets/roundtrip_descriptor.rs
+++ b/fuzz/fuzz_targets/roundtrip_descriptor.rs
@@ -29,7 +29,7 @@ mod tests {

     #[test]
     fn duplicate_crash() {
-        let v = hex::decode_to_vec("abcd").unwrap();
-        super::do_test(&v);
+//        let v = hex::decode_to_vec("abcd").unwrap();
+        super::do_test(&b"tr(acc0,{{multi_a(3,acc10,acc11,acc12),and_v(v:multi_a(2,acc10,acc11,acc12),after(10))},and_v(v:multi_a(1,acc10,acc11,ac12),after(100))})"[..]);
     }
 }

Added that input as a regression test in fuzz/fuzz_targets/roundtrip_descriptor.rs, without replacing the existing smoke case. The new unit test is under #[cfg(test)], so cargo test --manifest-path fuzz/Cargo.toml exercises it without needing the fuzzer. I also kept/expanded the descriptor-side roundtrip tests for left-heavy, right-heavy, and balanced TapTree shapes.

The bug does not exhibit on 12.x. It does exhibit on 13.x, so we will need to backport a fix, whatever that is. It also exhibits on my private "overhaul everything to eliminate recursion" branch, so I don't think any changes here will conflict with my work.

I also bisected the regression. The first bad commit is: 552e844

starius · 2026-05-20T22:56:55Z

@starius are you writing everything yourself (code and text you post) or is this LLM output?

This is a mix. LLM found the bug accidentally and came up with the original fix itself when working on another project. I decided to polish it and to send it here. I thought it is normal.

And a force push without a post defending the slop. This stinks.

A squash was requested. How should I had done this without a force push?

trevarj · 2026-05-21T03:51:34Z


+const MALFORMED_TAPTREE: &str = "MALFORMED_TAPTREE";
+
+fn is_complete_taptree<Pk: MiniscriptKey>(view: &TapTree<Pk>) -> bool {


What? Why? It's the same as fmt_helper... I gave you the diff on how to fix fmt_helper and the LLM added this?

I don't care about LLM usage as a means to an end, but when you aren't going to review the code yourself and verify the final solution it's just wasting everyone's time.

Unfortunately the original code you were patching was wrong, since it returned Errors which is forbidden, as noted by Andrew Poelstra after you posted your proposal.

Since we can't return errors, we have to write "MALFORMED_TAPTREE". But if we start printing the tree and discover the error in the middle or in the end, we would print something like {pk(A),MALFORMED_TAPTREE. We can't unprint what we already printed. That is why we need to check upfront if tree is valid. The easiest way to do it is to introduce is_complete_taptree. Unfortunately it duplicates some code with the main formatting loop. We could make a generic waler and call it, but I think it would complicate the code too much. What do you think about this?

The idea to validate the state before printing and to do it in a separate function came from LLM, but it seems correct to me and I considered other alternatives before publishing it.

Just write to a temporary String in fmt_helper and then f.write either the temp String or "MALFORMED_TAPTREE" based on if tree_complete

Thanks for the suggestion! Applied.

I also had to modify the signature of fmt_ms, since now it works with String, not with a formatter.

I pushed a fixup commit.

trevarj · 2026-05-21T03:51:59Z

+Library code should avoid recursive algorithms over user-controlled
+Miniscript, policy, descriptor, or expression trees; prefer the explicit tree
+iterators or an explicit stack.
+


Remove since this was the LLM thinking it's being helpful and just sticking it at the end of a readme

I removed the docs commit. I asked LLM to add it so future contributors do not even try to use recursion. Anyway, it is out of scope of this PR.

just sticking it at the end of a readme

It was sitting in the Contributing section, not in the end or README. IMHO, the right place for such an advice.

trevarj · 2026-05-21T06:27:52Z

+    if tree_complete {
+        f.write_str(&output)
+    } else {
+        // The no-script-path case stores no tree (`None`), not an empty `TapTree`.


Outdated comment?

We also fail if the tree is empty. The idea of the comment was to explain why an empty tree is also an invalid state.

I'm going to update the comment to this:

// Malformed private representation, including an empty tree. The // no-script-path case stores no tree (None), not an empty TapTree.

What do you think? Should we keep it or drop it?

I think we could remove this comment. At best, we could add an explanation about what happens when the tree is malformed or empty in the doc comment of this function.

I removed the comment.

apoelstra · 2026-05-21T16:54:52Z

Interesting observation that we are storing Option<TapTree> but the TapTree type can represent an empty tree. I'm pretty sure we can't actually construct an empty tree. We should encapsulate the type and its two leaf and combine constructors to make this more obvious.

@trevarj if you are tired of playing LLM-by-proxy here feel free to walk away. You can let me know by Signal or IRC (or here, of course) and I'll close the PR.

But we've gotten some good bug reports out of this even though the algorithmic stuff seems like it's moving slowly (at best).

starius · 2026-05-21T17:40:43Z

Interesting observation that we are storing Option<TapTree> but the TapTree type can represent an empty tree. I'm pretty sure we can't actually construct an empty tree. We should encapsulate the type and its two leaf and combine constructors to make this more obvious.

I think, we can't construct an empty TapTree via public API.

@trevarj if you are tired of playing LLM-by-proxy here feel free to walk away. You can let me know by Signal or IRC (or here, of course) and I'll close the PR.

I think that now the PR is in good shape and fixes a real bug. I'm sorry that I over-relied on LLM initially - I didn't realize that this is inappropriate in this repo. I promise to mark vibe-code clearly and process the feedback more thoughtfully.

apoelstra · 2026-05-21T19:53:46Z

I think, we can't construct an empty TapTree via public API.

It looks like you can if you construct a TapTreeBuilder then immediately call finalize. We should eliminate that path by having finalize return an Option<TapTree>.

I think that now the PR is in good shape and fixes a real bug

Can you please squash all the fixup commits? It is easier for us to review force-pushes than to keep track of fixup commits.

I didn't realize that this is inappropriate in this repo

No worries. We should really have a CONTRIBUTING.md here.

apoelstra · 2026-05-21T19:55:23Z

I'm skeptical that we should have this tree_complete check in the formatting code. Whatever invariant this is checking, we should be enforcing in our constructors.

I think it's possible to construct a broken tree using the TapTreeBuilder methods, but I don't think that Tr::from_tree (the only caller of methods on this private struct) will ever do so. So I think we can add an assertion, or perhaps a debug_assert if it's expensive to check, to TapTreeBuilder::finalize.

Then we can assume that our trees are correct everywhere else.

Commit 552e844 ("tr: flatten TapTree type") changed TapTree from a recursive enum into a depth-preorder leaf list. The formatter only tracked depth changes between adjacent leaves, which loses binary subtree boundaries for some unbalanced trees and can render descriptors that do not parse back. Format the flat representation with an explicit child-count stack so completed binary subtrees are closed at the right time. Add descriptor and fuzz-target regression tests for the broken roundtrip.

starius · 2026-05-21T23:45:06Z

I removed the validation, squashed all the commits together and rebased.

I also added tests for a single-item tree (this one to cover a path in the code) and a test checking the debug output (it is separate from Display and look different, so worth covering).

Please take another look.

trevarj

These changes actually look good to me now, but I can re-review if we decide to do Andrew's suggestion:

We should eliminate that path by having finalize return an Option.

Or if a new issue is opened up that resolves that.

ACK ab10df1

trevarj reviewed May 19, 2026

View reviewed changes

starius requested a review from trevarj May 19, 2026 10:21

trevarj suggested changes May 20, 2026

View reviewed changes

starius force-pushed the fix-tr-tree-fmt branch from 4d8de20 to b8d4a00 Compare May 20, 2026 22:48

trevarj reviewed May 21, 2026

View reviewed changes

starius force-pushed the fix-tr-tree-fmt branch from b8d4a00 to 67fd142 Compare May 21, 2026 04:10

trevarj reviewed May 21, 2026

View reviewed changes

starius force-pushed the fix-tr-tree-fmt branch from b44d501 to ab10df1 Compare May 21, 2026 23:44

trevarj approved these changes May 22, 2026

View reviewed changes


		const MALFORMED_TAPTREE: &str = "MALFORMED_TAPTREE";

		fn is_complete_taptree<Pk: MiniscriptKey>(view: &TapTree<Pk>) -> bool {

Conversation

starius commented May 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Example

Tests

Uh oh!

trevarj left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

trevarj May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

trevarj May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

apoelstra commented May 20, 2026

Uh oh!

tcharding commented May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

apoelstra commented May 20, 2026

Uh oh!

tcharding commented May 20, 2026

Uh oh!

starius commented May 20, 2026

Uh oh!

starius commented May 20, 2026

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

trevarj May 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

starius May 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

apoelstra commented May 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

starius commented May 21, 2026

Uh oh!

apoelstra commented May 21, 2026

Uh oh!

apoelstra commented May 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

starius commented May 21, 2026

Uh oh!

trevarj left a comment

starius commented May 18, 2026 •

edited

Loading

trevarj May 20, 2026 •

edited

Loading

trevarj May 20, 2026 •

edited

Loading

tcharding commented May 20, 2026 •

edited

Loading

trevarj May 21, 2026 •

edited

Loading

starius May 21, 2026 •

edited

Loading

apoelstra commented May 21, 2026 •

edited

Loading

apoelstra commented May 21, 2026 •

edited

Loading