Conversation
74c01ca to
16cc8e1
Compare
add `tracing` feature of tokio, and the console-subscriber
add necessary logs; add timeout in `read_message_from_stellar` because it gets stuck
db588a5 to
87ad4d5
Compare
| [dependencies] | ||
| clap = { version = "4.0.17", features = ["derive"]} | ||
| hex = "0.4.3" | ||
| tokio = { version = "1.8", features = ["rt-multi-thread", "macros", "time"] } |
| serde_json = "1.0.71" | ||
| thiserror = "1.0" | ||
| tokio = { version = "1.0", features = ["full"] } | ||
| tokio = { version = "1.37", features = ["full"] } |
There was a problem hiding this comment.
The reason for this is the "lost its waker" issue in tokio, fixed in the newer versions.
| [dependencies] | ||
| hex = "0.4.3" | ||
| log = {version = "0.4.14"} | ||
| tracing = { version = "0.1", features = ["log"] } |
There was a problem hiding this comment.
helps with tokio-console and instrumentation:
the console-subscriber crate in this repository contains an implementation of the instrumentation-side API as a tracing-subscriber Layer, for projects using Tokio and tracing.
.cargo/config.toml
Outdated
| @@ -0,0 +1,2 @@ | |||
| [build] | |||
| rustflags = ["--cfg", "tokio_unstable"] No newline at end of file | |||
There was a problem hiding this comment.
to make tokio-console work::
in order to collect task data from Tokio, the tokio_unstable cfg must be enabled.
There was a problem hiding this comment.
With the Rust stable version, there hasn't been any problems. The tokio-console command displays just fine, although I was not able to see the histogram.
The worst it could do would be returning a blank view.
There was a problem hiding this comment.
This doesn't work for me, unfortunately I have to explicitly pass the RUSTFLAGS="--cfg tokio_unstable" when trying to run or build the client.
I assume it's because I do have a ~/.cargo/config.toml file where I already specify some rust flags
[target.aarch64-apple-darwin]
rustflags = ["-C", "link-arg=-fuse-ld=/opt/homebrew/opt/llvm/bin/ld64.lld"]
and this might override the workspace configuration.
I confirmed this by removing my 'global' config.toml and then recompiling. This worked.
Generally I would prefer if we could enable this tokio-console thing with an extra feature flag and have it disabled by default. Do you think this would be possible @b-yap?
There was a problem hiding this comment.
See e77d7ac
I tried to set it through [env] .cargo/config.toml, and it didn't work. All RUSTFLAGS, CARGO_ENCODED_RUSTFLAGS and CARGO_BUILD_FLAGS won't work.
I also tried to do it through a build script and didn't work:
fn main(){
cfg_if::cfg_if! {
if #[cfg(feature = "allow-debugger")] {
println!("cargo:rustc-env=RUSTFLAGS=\"--cfg tokio_unstable\"");
}
}
}and
println!("cargo:rustc-flags=\"--cfg tokio_unstable\"");
as explained in rust-lang/cargo#10141.
In the end I updated the doc to tell the user to set the rustflags themselves.
| auth_msg.sequence | ||
| ); | ||
|
|
||
| let auth_msg_xdr = auth_msg.to_base64_xdr(); |
There was a problem hiding this comment.
I decided to remove this because the xdr was huge and may not be helpful at all.
|
|
||
| // we do not handle other messages. Return to caller | ||
| other => { | ||
| log::trace!( |
There was a problem hiding this comment.
we are only interested on messages that will be processed by the vault.
Logging should be done over there than here.
|
|
||
| #[tokio::main] | ||
| async fn main() { | ||
| console_subscriber::init(); |
There was a problem hiding this comment.
add documentation of the `tokio-console`.
|
|
||
| tasks.append(&mut replace_tasks); | ||
|
|
||
| tasks.push(( |
There was a problem hiding this comment.
😱 removing this duplicate, as it is also found in
spacewalk/clients/vault/src/system.rs
Lines 623 to 630 in e7a672b
inside
fn create_initial_tasks(...
Originally it runs after Replace Cancellation Scheduler and before Redeem Request Listener:
spacewalk/clients/vault/src/system.rs
Lines 619 to 639 in 31df183
clients/stellar-relay-lib/src/connection/connector/message_sender.rs
Outdated
Show resolved
Hide resolved
clients/stellar-relay-lib/src/connection/connector/message_sender.rs
Outdated
Show resolved
Hide resolved
.cargo/config.toml
Outdated
| @@ -0,0 +1,2 @@ | |||
| [build] | |||
| rustflags = ["--cfg", "tokio_unstable"] No newline at end of file | |||
There was a problem hiding this comment.
This doesn't work for me, unfortunately I have to explicitly pass the RUSTFLAGS="--cfg tokio_unstable" when trying to run or build the client.
I assume it's because I do have a ~/.cargo/config.toml file where I already specify some rust flags
[target.aarch64-apple-darwin]
rustflags = ["-C", "link-arg=-fuse-ld=/opt/homebrew/opt/llvm/bin/ld64.lld"]
and this might override the workspace configuration.
I confirmed this by removing my 'global' config.toml and then recompiling. This worked.
Generally I would prefer if we could enable this tokio-console thing with an extra feature flag and have it disabled by default. Do you think this would be possible @b-yap?
…ob/24994504193?pr=517
…9314/job/24998905803?pr=517
…ob/25003758078?pr=517
…ob/25007476694?pr=517
…ob/25040418069?pr=517
|
@ebma Would be great if we have the vaults running with this feature. I bet it'll look very different (and interesting). |
|
True, but I'm not sure if we can easily connect to the clients with @zoveress in this PR we added support for a profiling tool that lets us investigate resource consumption of tasks within our process. As you can see here, it's fairly simple to run. As far as I understand, the changes we made to the client will make it expose a new endpoint (by default) on port 6669 that is used by the This is not super important but would be nice to have. |
|
Is the endpoint password secured? |
|
No it's not and I think it's only supposed to be used locally. Can we expose it to the GitLab runner only and access it from there somehow? |
This does not completely close #512, but it helps with tracking.
But if we can see which threads have been running the longest; which ones are idle for a long time, it could help determine where/when a vault gets stuck.
The
tokio-consoleis described as:^ From using tokio-console, a zombie task was found.
When the vault restarts, it shuts down all running tasks except for 1:
spacewalk/clients/wallet/src/resubmissions.rs
Lines 40 to 47 in e7a672b
Aside from this, I found out that the stream also gets stuck:
spacewalk/clients/stellar-relay-lib/src/connection/connector/message_reader.rs
Line 82 in e7a672b
Requires a timeout to trigger reconnection.
How to begin the review:
I have added comments to help with the review.
tracingfeature oftokiois necessary fortokio-consoleconsole-subscriberdependency is necessary fortokio-consoleStellarWallet:fn stop_periodic_resubmission_of_transactions(&mut self)fn start_periodic_resubmission_of_transactions_from_cache(...):StellarWallet'sresubmission_end_signalfield.fn stop_periodic_resubmission_of_transactions(...)from thewalletfield ofVaultService:fn shutdown_wallet(...)when the service stops.READ_TIMEOUT_IN_SECS; how long to wait on reading from the stream.tokio-console