prevent symlink loop recursion in directory traversal#289
Conversation
| if (lstat(std::string(path).c_str(), &path_stat) < 0) { | ||
| return ErrnoStatus(absl::StrCat("could not stat ", path), errno); | ||
| } | ||
| if (S_ISDIR(path_stat.st_mode)) { | ||
| // Do not recurse into symlinked directories to avoid traversal loops. | ||
| if (S_ISDIR(path_stat.st_mode) && !S_ISLNK(path_stat.st_mode)) { |
There was a problem hiding this comment.
Thanks for the contribution.
This fixes walking symlink cycles, but it also changes the metadata passed to the callback from stat() target metadata to lstat() symlink metadata. The replayer filters for regular files which would now skip symlinked files which is likely not intended.
I'm also not sure we want to drop support for symlinked directories entirely. A more compatible fix would likely keep track of visited directories and use that to break symlink cycles without changing normal symlink handling.
There was a problem hiding this comment.
Thanks for the feedback
I’ve updated the patch to keep using stat() so metadata semantics are preserved, and symlinked files/directories continue to behave as before. Instead of skipping symlinks, the fix now tracks visited directories using (st_dev, st_ino) to prevent cycles while still allowing traversal.
dbccd9b to
12b9372
Compare
12b9372 to
3b8b6af
Compare
|
Hi @simonresch any updates on this PR? Thanks! |
| if (stat(std::string(path).c_str(), &path_stat) < 0) { | ||
| return ErrnoStatus(absl::StrCat("could not stat ", path), errno); | ||
| } | ||
| callback(path, path_stat); |
There was a problem hiding this comment.
callback is here called unconditionally including on directories. This likely breaks existing behavior and fails the test for the file utility functions. You can run them on your end with the command bazel test //fuzzing/replay:file_util_test or bazel test //... to run all tests.
The direction to detect cycles on a pair of path_stat.st_dev, path_stat.st_ino makes sense to me.
| EXPECT_GT(collected_paths.size(), 0); | ||
| EXPECT_LE(collected_paths.size(), 6); |
There was a problem hiding this comment.
The tests would need one or two normal files after adjusting the callback handling. Also, is there a reason to assert a range of size values instead of the exact number of expected files?
Summary
Prevent unbounded recursion in directory traversal caused by symlink loops.
Problem
Traversal currently follows symlinked directories (via
stat), which can create cycles (e.g.,dirA → dirB → dirA) and lead to unbounded recursion and unstable behavior.Fix
lstatto inspect paths without following symlinksS_ISDIR && !S_ISLNK)Behavior
Test
dirA ↔ dirB)> 0and<= 6)Scope
file_util.cc