-
Notifications
You must be signed in to change notification settings - Fork 37
Resolve debug symbols in containers #1399
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #1399 +/- ##
=======================================
Coverage 95.84% 95.85%
=======================================
Files 61 61
Lines 11076 11087 +11
=======================================
+ Hits 10616 10627 +11
Misses 460 460 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
For the ELF file itself blazesym does the right thing by reading files like: ``` /proc/2367/map_files/79c8000-20aec000 ``` This works great when the ELF files are in a container in a separate mount namespace. When it comes to looking for split debug info via `debuglink`, it tries looking in by the path in the root mount namespace, which does not work, as the files are just not there. With this change we try checking the process mount namespace first via `/proc/<xxx>/root`. Signed-off-by: Ivan Babrou <github@ivan.computer>
28da791 to
4220b95
Compare
|
Arguably, looking up elf by name like |
|
Or perhaps I'm going about it all wrong and the users should just call into |
|
Hi @bobrik, this certainly looks like something we should fix, but I haven't looked at your proposal closely yet (will do so later today or early next week the latest). That being said, I'd think we'd want a unit/regression test for reproducing this issue. Do you think it would be possible to add one? |
d-e-s-o
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Change seems fine from my perspective, thanks! Left two comments. Please take a look and then see if we can add a test for this use as mentioned earlier.
|
|
||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| self.symbolizer | ||
| .elf_cache | ||
| .elf_resolver(entry_path, debug_dirs.as_deref()) | ||
| } else { | ||
| let path = &entry_path.symbolic_path; | ||
| self.symbolizer | ||
| .elf_cache | ||
| .elf_resolver(path, self.symbolizer.maybe_debug_dirs(self.debug_syms)) | ||
| .elf_resolver(path, debug_dirs.as_deref()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately I think the logic is mildly and subtly wrong now: the cache expects the list of debug directories to always be the same, because the "resolver" being looked up will be the same for the same path. However, what can happen now is that the first process has root directory / and a subsequent one uses something else, then, because of caching, we wouldn't be checking the new root for debug link targets.
Realistically, it's probably rare and the fix won't be trivial, but let's at least add a TODO about this issue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having different libc on the inside and the outside of the container is definitely not rare.
Let me see how caching works to understand this better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. I think it will probably require quite a bit of refactoring to make this work properly, unless you have some magical idea. If you come up short, I can take care of it in early-to-mid January time frame.
| let debug_dirs = self | ||
| .symbolizer | ||
| .maybe_debug_dirs(self.debug_syms) | ||
| .map(|dirs| { | ||
| dirs.iter() | ||
| .filter_map(|dir| Some(entry_path.root_path.join(dir.strip_prefix("/").ok()?))) | ||
| .chain(dirs.iter().cloned()) | ||
| .collect::<Vec<_>>() | ||
| }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Logic seems fine, but can we rewrite to not allocate a new Vec in case root_path is /, please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, I see we don't deref the /proc/{pid}/root symbolic link, so this won't fly; but now I am wondering whether we perhaps should...?
For the ELF file itself blazesym does the right thing by reading files like:
This works great when the ELF files are in a container in a separate mount namespace. When it comes to looking for split debug info via
debuglink, it tries looking in by the path in the root mount namespace, which does not work, as the files are just not there. With this change we try checking the process mount namespace first via/proc/<xxx>/root.Here's a practical example with ClickHouse running in a container with debug symbols installed inside: