Fix memory leak in LocalExecutor caused by unreleased file descriptor locks#65121
Fix memory leak in LocalExecutor caused by unreleased file descriptor locks#65121wjddn279 wants to merge 3 commits intoapache:mainfrom
Conversation
6f0717f to
ad37cd0
Compare
This comment was marked as outdated.
This comment was marked as outdated.
ad37cd0 to
5a533f4
Compare
|
@wjddn279 — There are 1 unresolved review thread on this PR from @eladkal. Could you either push a fix or reply in each thread explaining why the feedback doesn't apply? Once you believe the feedback is addressed, mark the thread as resolved so the reviewer isn't re-pinged needlessly. Thanks! Note: This comment was drafted by an AI-assisted triage tool and may contain mistakes. Once you have addressed the points above, an Apache Airflow maintainer — a real person — will take the next look at your PR. We use this two-stage triage process so that our maintainers' limited time is spent where it matters most: the conversation with you. |
cb82a2e to
2215c99
Compare
This behaviour was wrong and it's been fixed in #65916 Sorry for the noise @wjddn279 ! |
|
Weird we’d need to do this. Can this be considered a bug in structlog? |
|
I'm currently checking with the structlog maintainers, but I'm not sure whether I'll get a response |
Can you share the issue opened reporting the bug? |
|
not issue but pr thread i've opended |
|
I think I've got a better fix hynek/structlog#807 |
|
The PR has been merged! I checked and the release schedule including that version doesn't seem to be confirmed yet. How about we use this PR for now and remove it later when we bump the version after the release? |
0348613 to
30aaf78
Compare
Generally speaking, we can backport/vendor the fix right? and in parallal have a draft PR that removes the backport with updating the libary version that we will merge when it's released. |
|
Hynek is often prompt about releasing new versions, but if we don't want to wait, then backporting this is a one line change -- as long as we apply it "safely" (so the it doesn't crash if something goes wrong, sure. |
30aaf78 to
b27db45
Compare
b27db45 to
0a08921
Compare
Problem?
We conducted an investigation to resolve the continuous memory growth in the Local Executor's forked processes, following the previous memory spike fix.
Using memray to profile the Local Executor's forked processes, we confirmed that memory was steadily increasing. Specifically, logging-related memory was growing, and this was traced to the process of writing to local files.
memray-flamegraph-output-2026-04-11 13:17:18.516888.html
Cause?
Looking at,
full_path.touch(new_file_permissions)conf.get("logging", "file_task_handler_new_file_permissions", fallback="0o664")in flamegraph, I noticed that basic string objects were not being properly garbage collected, which led us to suspect unreleased references.Upon examining
self._lock = _get_lock_for_file(self._file)in the memory-increasing area, I found that file descriptors were being cached as locks in a dictionary with no corresponding release mechanism. This prevented the file descriptors from being properly released, and consequently, the parent Process objects also retained references and were never garbage collected.Solution
After adding code to manually release these references, we re-ran the profiling and confirmed that the related objects were being properly freed.
memray-flamegraph-output-2026-04-12 13:14:12.556546.html
PS
The remained
parsed = [sys.intern(str(x)) for x in rel.split(sep) if x and x != '.']issue will be addressed in a separate PR.Was generative AI tooling used to co-author this PR?
{pr_number}.significant.rst, in airflow-core/newsfragments. You can add this file in a follow-up commit after the PR is created so you know the PR number.