sv2-tp: reconnect to bitcoin core IPC after disconnect #101
79e0653 to 8f0ba30
// capnp interface does not define a destroy method, this will just call
// an empty stub defined in the ProxyClientBase class and do nothing.
Sub::destroy(*this);
if (m_context.connection && !m_context.connection->m_disconnected.load()) {
In commit "ipc: skip proxy destroy calls after disconnect starts" (fa010b9)
FWIW, there's already a PR on upstream that fixes this issue: bitcoin-core/libmultiprocess#273
Thanks for pointing that out @xyzconstant. I was not aware that there was already an upstream libmultiprocess PR addressing this.
I have reworked the branch to remove the local libmultiprocess change and replace it with a proper subtree update based on bitcoin-core/libmultiprocess#273. The sv2-tp reconnect commits are now rebased on top of that.
I reran the regtest stack on the updated branch and the reconnect path is still working as expected.
6de92e1 proxy-client: tolerate exceptions from remote destroy during cleanup
90be835 test: regression for ~ProxyClient destroy after peer disconnect
3c69d12 Merge bitcoin-core/libmultiprocess#260: event loop: tolerate unexpected exceptions in `post()` callbacks
b8a48c6 event loop: tolerate unexpected exceptions in `post()` callbacks
f787863 Merge bitcoin-core/libmultiprocess#270: doc: Bump version 10 > 11
a22f602 doc: Bump version 10 > 11

git-subtree-dir: src/ipc/libmultiprocess
git-subtree-split: 6de92e1c7324c4748d05687372256a5051c97bb4
…t-subtree-upstream273
Add a reconnect loop around the initial Bitcoin Core IPC setup. When the IPC connection cannot be established, retry with exponential backoff instead of exiting immediately. This provides the basis for recovering sv2-tp after backend loss.
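A minimal sketch of the retry-with-exponential-backoff idea described in this commit message. The names `NextBackoff` and `ConnectWithBackoff`, the callback parameters, and the delay values are hypothetical and are not taken from the sv2-tp source; the actual loop in the PR may be structured differently.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <functional>
#include <optional>
#include <thread>

// Hypothetical helper: double the delay after a failed attempt, capped at max_delay.
inline std::chrono::milliseconds NextBackoff(std::chrono::milliseconds current,
                                             std::chrono::milliseconds max_delay)
{
    assert(current.count() > 0 && max_delay.count() > 0);
    return std::min(current * 2, max_delay);
}

// Hypothetical reconnect loop: call try_connect until it succeeds or a stop is
// requested. Returns the number of attempts on success, std::nullopt if stopped.
inline std::optional<int> ConnectWithBackoff(const std::function<bool()>& try_connect,
                                             const std::function<bool()>& stop_requested,
                                             std::chrono::milliseconds initial_delay,
                                             std::chrono::milliseconds max_delay)
{
    auto delay = initial_delay;
    int attempts = 0;
    while (!stop_requested()) {
        ++attempts;
        if (try_connect()) return attempts;
        // Connection failed: wait, then lengthen the delay for the next round.
        std::this_thread::sleep_for(delay);
        delay = NextBackoff(delay, max_delay);
    }
    return std::nullopt;
}
```

Capping the delay keeps the worst-case recovery time bounded once the backend comes back, and checking `stop_requested` on every iteration lets an operator shutdown interrupt the loop between attempts.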
Decouple the template provider lifetime from the Bitcoin Core IPC backend. Keep the Stratum v2 listener and connected clients alive when the backend disconnects, wait for a replacement backend, and resume serving templates once a new IPC connection is installed.
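One way to picture the decoupling described here is a slot that outlives any single backend: the long-lived provider owns the slot, and the reconnect path installs or clears the backend inside it. This is only a sketch under assumptions. `Backend`, `BackendSlot`, and `GetTemplate` are hypothetical names, and the sketch uses `shared_ptr` for simplicity, whereas the PR description says the real code uses `unique_ptr` ownership.

```cpp
#include <chrono>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical stand-in for the IPC-backed mining interface.
struct Backend {
    virtual ~Backend() = default;
    virtual std::string GetTemplate() = 0;
};

// Hypothetical holder owned by the long-lived template provider.
class BackendSlot
{
public:
    // Install a replacement backend and wake any thread waiting for one.
    void Install(std::shared_ptr<Backend> backend)
    {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_backend = std::move(backend);
        }
        m_cv.notify_all();
    }

    // Drop the current backend, e.g. after the IPC connection is lost.
    void Clear()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        m_backend.reset();
    }

    // Wait up to `timeout` for a backend; returns nullptr if none was installed.
    std::shared_ptr<Backend> WaitFor(std::chrono::milliseconds timeout)
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_cv.wait_for(lock, timeout, [this] { return m_backend != nullptr; });
        return m_backend;
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_cv;
    std::shared_ptr<Backend> m_backend;
};
```

With this shape, the listener and client connections never need to be torn down when the backend goes away: template-serving threads block in `WaitFor` and resume once `Install` is called with a fresh backend.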
Adapt the sv2 template provider tests to the reconnect lifecycle. Construct the provider without a fixed Mining reference and install the backend through the new reconnect path so the test harness matches the runtime behavior.
Simplify the reconnect implementation now that disconnected proxy teardown is handled in the IPC layer. Remove the local teardown workarounds, restore ordinary backend ownership, and harden the remaining shutdown path so reconnect and operator shutdown both complete cleanly.
8f0ba30 to 451b144
I'm marking this draft until bitcoin-core/libmultiprocess#273 lands. I also plan to review this.
This PR makes sv2-tp recover from a lost IPC connection without exiting.
The first commits update the libmultiprocess subtree to include the upstream fix from bitcoin-core/libmultiprocess#273. That fix makes proxy client destruction tolerate remote `destroy()` failures during teardown. Without it, stale IPC-backed objects can abort the process after Bitcoin Core disconnects.

On top of that, sv2-tp keeps the template provider and pool-facing connection manager alive across backend loss, reconnects to Bitcoin Core with backoff, installs a fresh IPC backend, and resumes serving templates on the existing pool connection.
The last commit drops the reconnect-time cleanup workarounds that were only needed before the IPC fix and returns the backend to normal `unique_ptr` ownership. It also catches disconnect exceptions from template-provider IPC calls and uses them to trigger backend reconnect instead of crashing.
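The pattern of turning a disconnect exception into a reconnect request, rather than letting it terminate the process, might look roughly like the sketch below. `DisconnectedError` and `CallOrRequestReconnect` are hypothetical names used for illustration; the exception type actually thrown by the IPC layer and the way sv2-tp signals a reconnect are not shown in this conversation.

```cpp
#include <atomic>
#include <optional>
#include <stdexcept>

// Hypothetical exception type standing in for whatever the IPC layer
// throws when the peer has disconnected.
struct DisconnectedError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Hypothetical wrapper: run an IPC call and return its result. If the call
// fails because the backend disconnected, set the reconnect flag and return
// std::nullopt instead of propagating the exception.
template <typename Fn>
auto CallOrRequestReconnect(Fn&& call, std::atomic<bool>& reconnect_needed)
    -> std::optional<decltype(call())>
{
    try {
        return call();
    } catch (const DisconnectedError&) {
        reconnect_needed.store(true);
        return std::nullopt;
    }
}
```

Only the disconnect case is caught here; any other exception still propagates, so unrelated bugs are not silently converted into reconnect attempts.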