Skip to content

Add Linux-specific pidfd process extensions (take 2)#81825

Merged
bors merged 7 commits intorust-lang:masterfrom
voidc:pidfd
Aug 1, 2021
Merged

Add Linux-specific pidfd process extensions (take 2)#81825
bors merged 7 commits intorust-lang:masterfrom
voidc:pidfd

Conversation

@voidc
Copy link
Contributor

@voidc voidc commented Feb 6, 2021

Continuation of #77168.
I addressed the following concerns from the original PR:

  • make CommandExt and ChildExt sealed traits
  • wrap file descriptors in PidFd struct representing ownership over the fd
  • add take_pidfd to take the fd out of Child
  • close fd when dropped

Tracking Issue: #82971

@rust-highfive
Copy link
Contributor

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @Mark-Simulacrum (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Feb 6, 2021
@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just checking for ENOSYS is sadly not enough here.
The syscall can also be blocked by seccomp (EPERM).

We have also found cases where new syscalls where randomly broken on some configurations (see the EOPNOTSUPP comment here. Though, I think we can ignore this case for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EPERM and EOPNOTSUPP might depend on the arguments of the clone3 call. It would be wrong to set HAS_CLONE3 to false if one of these errors occurred.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case the usual approach is to perform the probe separately with arguments that are known to fail with a specific error code, e.g. an invalid combination of flags leading to EINVAL. If that returns EPERM then we know the syscall is filtered. CLONE_SIGHAND | CLONE_CLEAR_SIGHAND could work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we could just fallback to fork on EPERM without setting HAS_CLONE3.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That could be considered slightly suboptimal since it'll try again every time in that case, but we already have a preferred path through posix_spawn so this shouldn't be hit too frequently.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern was whether it would be possible for clone3 to succeed after having returned EPERM once. In case of ENOSYS we can be certain that any future calls will fail as well. For EPERM, this conclusion does potentially not hold since a seccomp filter could be defined in terms of arbitrary rules.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to account for arbitrary seccomp filters here. seccomp filters should produce ENOSYS, not EPERM.

@rust-log-analyzer

This comment has been minimized.

@Mark-Simulacrum
Copy link
Member

r? @m-ou-se or perhaps @joshtriplett

@rust-log-analyzer

This comment has been minimized.

@bors
Copy link
Collaborator

bors commented Feb 13, 2021

☔ The latest upstream changes (presumably #82045) made this pull request unmergeable. Please resolve the merge conflicts.

@rust-log-analyzer

This comment has been minimized.

@JohnCSimon JohnCSimon added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 1, 2021
@voidc
Copy link
Contributor Author

voidc commented Mar 2, 2021

@joshtriplett I think you were interested in the original PR. Would you mind taking a look at this?

@bors
Copy link
Collaborator

bors commented Mar 9, 2021

☔ The latest upstream changes (presumably #82877) made this pull request unmergeable. Please resolve the merge conflicts.

@the8472

This comment has been minimized.

@bors
Copy link
Collaborator

bors commented Jul 28, 2021

📌 Commit f48caef6fc47df6acfb9d72f1a254e503e98b557 has been approved by joshtriplett

@bors
Copy link
Collaborator

bors commented Jul 28, 2021

⌛ Testing commit f48caef6fc47df6acfb9d72f1a254e503e98b557 with merge d1b9a4c984315ffa8365dbdfbdc5a331f5adcd8e...

@rust-log-analyzer

This comment has been minimized.

@bors
Copy link
Collaborator

bors commented Jul 28, 2021

💔 Test failed - checks-actions

@voidc
Copy link
Contributor Author

voidc commented Jul 28, 2021

@Aaron1011 Thanks for tending to this issue! Unfortunately, I think the workaround in your commit is not enough to circumvent the LLVM bug. To my knowledge, the combination of i686, fat LTO, and opt-level=0 is already sufficient to trigger the assertion. While, on the i686-gnu target, I was able to work around the bug by ensuring opt-level > 0, the i686-gnu-nopt target disables optimizations by definition, triggering the assertion in every test using LTO. One option would be to add ignore-i686-unknown-linux-gnu to all of the tests listed below. However, landing this PR with the workaround would mean potentially breaking user's builds satisfying the three criteria mentioned above. The better option, of course, would be fixing the underlying LLVM bug. Since it appears like the bug is already fixed in the latest LLVM trunk (as you commented in #86758), maybe we could cherry-pick the commit resolving the issue?

List of failing tests
ui/abi/stack-probes-lto.rs
ui/debuginfo-lto.rs
ui/extern/issue-64655-allow-unwind-when-calling-panic-directly.rs#fat
ui/fat-lto.rs
ui/lto-many-codegen-units.rs
ui/lto-rustc-loads-linker-plugin.rs
ui/lto-still-runs-thread-dtors.rs#mir
ui/lto-still-runs-thread-dtors.rs#thir
ui/panic-runtime/lto-abort.rs
ui/panic-runtime/lto-unwind.rs
ui/sepcomp/sepcomp-lib-lto.rs

@voidc
Copy link
Contributor Author

voidc commented Jul 28, 2021

I went ahead and added -C opt-level=1 to the failing tests. However, there are more tests that specify -C lto.

List of tests
codegen/lto-removes-invokes.rs
codegen/sanitizer-memory-track-orgins.rs
codegen/sanitizer-recover.rs
debuginfo/cross-crate-type-uniquing.rs (*)
incremental/lto.rs
incremental/rlib-lto.rs
run-make-fulldeps/cdylib/Makefile
run-make-fulldeps/codegen-options-parsing/Makefile
run-make-fulldeps/cross-lang-lto/Makefile
run-make-fulldeps/issue-14500/Makefile
run-make-fulldeps/issue-64153/Makefile
run-make-fulldeps/lto-dylib-dep/Makefile
run-make-fulldeps/lto-no-link-whole-rlib/Makefile
run-make-fulldeps/lto-readonly-lib/Makefile
run-make-fulldeps/lto-smoke-c/Makefile
run-make-fulldeps/lto-smoke/Makefile
run-make-fulldeps/no-builtins-lto/Makefile
run-make-fulldeps/pgo-gen-lto/Makefile
run-make-fulldeps/reproducible-build-2/Makefile
run-make/wasm-custom-section/Makefile
run-make/wasm-export-all-symbols/Makefile
run-make/wasm-import-module/Makefile
run-make/wasm-panic-small/Makefile
run-make/wasm-stringify-ints-small/Makefile
run-make/wasm-symbols-different-module/Makefile
run-make/wasm-symbols-not-imported/Makefile
ui/abi/stack-probes-lto.rs (*)
ui/debuginfo-lto.rs (*)
ui/extern/issue-64655-allow-unwind-when-calling-panic-directly.rs (*)
ui/extern/issue-64655-extern-rust-must-allow-unwind.rs (*)
ui/fat-lto.rs (*)
ui-fulldeps/lto-syntax-extension.rs
ui/issues/issue-11154.rs (build-fail)
ui/issues/issue-44056.rs (only-x86_64)
ui/lto-and-no-bitcode-in-rlib.rs (build-fail)
ui/lto-duplicate-symbols.rs (build-fail)
ui/lto-many-codegen-units.rs (*)
ui/lto-rustc-loads-linker-plugin.rs (*)
ui/lto-still-runs-thread-dtors.rs (*)
ui/panic-runtime/lto-abort.rs (*)
ui/panic-runtime/lto-unwind.rs (*)
ui/sepcomp/sepcomp-lib-lto.rs (*)

(*): added -C opt-level=1

@joshtriplett
Copy link
Member

#87610 may address this. Let's give it another try (with the workarounds removed) once that has gone in.

@voidc
Copy link
Contributor Author

voidc commented Jul 30, 2021

@joshtriplett I removed the workarounds.

@Aaron1011
Copy link
Contributor

#87610 has been merged, so let's try this again:

@bors r=joshtriplett

@bors
Copy link
Collaborator

bors commented Jul 31, 2021

📌 Commit 741176a38771b233bdd6f972929a3930682da169 has been approved by joshtriplett

@bors
Copy link
Collaborator

bors commented Jul 31, 2021

⌛ Testing commit 741176a38771b233bdd6f972929a3930682da169 with merge a2522c6c85a7388a80cb33edc2ce2631b0695f99...

@rust-log-analyzer

This comment has been minimized.

@bors
Copy link
Collaborator

bors commented Aug 1, 2021

💔 Test failed - checks-actions

@Aaron1011
Copy link
Contributor

Aaron1011 commented Aug 1, 2021

All of the non-Apple builders passes, so the LLVM issue is fixed 🎉

The failing tests need to be made Linux-only (or be adjusted to avoid libc::SYS_getpid on non-Linux Unix systems).

voidc added 3 commits August 1, 2021 09:45
The test calls libc::getpid() in the pre_exec hook and asserts that the returned value is different from the PID of the parent.
However, libc::getpid() returns the wrong value.
Before version 2.25, glibc caches the PID of the current process with the goal of avoiding additional syscalls.
The cached value is only updated when the wrapper functions for fork or clone are called.
In PR rust-lang#81825 we switch to directly using the clone3 syscall.
Thus, the cache is not updated and getpid returns the PID of the parent.
source: https://man7.org/linux/man-pages/man2/getpid.2.html#NOTES
@voidc
Copy link
Contributor Author

voidc commented Aug 1, 2021

@Aaron1011 Okay, I fixed it! And thanks for seeing to the LLVM fix getting merged so fast! 😃

@Aaron1011
Copy link
Contributor

@bors r=joshtriplett

@bors
Copy link
Collaborator

bors commented Aug 1, 2021

📌 Commit 4a832d3 has been approved by joshtriplett

@bors
Copy link
Collaborator

bors commented Aug 1, 2021

⌛ Testing commit 4a832d3 with merge 4e21ef2...

@bors
Copy link
Collaborator

bors commented Aug 1, 2021

☀️ Test successful - checks-actions
Approved by: joshtriplett
Pushing 4e21ef2 to master...

@Aaron1011
Copy link
Contributor

@voidc: Thanks for all of your work getting this over the finish line!

@voidc
Copy link
Contributor Author

voidc commented Aug 1, 2021

🎉 🎉 Finally! Thanks for your patience @Aaron1011 and @joshtriplett during all of these 200+ comments and 15+ bors runs 😂

@voidc voidc mentioned this pull request Aug 1, 2021
@brauner
Copy link

brauner commented Aug 2, 2021

Thank you all for your work. Very happy to see upstream work make it into this project. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

merged-by-bors This PR was explicitly merged by bors. O-linux Operating system: Linux relnotes Marks issues that should be documented in the release notes of the next release. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs-api Relevant to the library API team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.