Expose remove_file_from_lock helper for WRITE_LOCKS cleanup#806
Expose remove_file_from_lock helper for WRITE_LOCKS cleanup#806wjddn279 wants to merge 2 commits intohynek:mainfrom
Conversation
|
Not yet sure if we need this change on the structlog side. |
|
OK lmk one y'all sure. :) |
|
Hi! I have a question regarding this :) Is there a specific intention behind not having a separate release logic for WRITE_LOCKS? I would expect the reference to be released when the structlog object is properly closed or cleaned up, but with the current behavior, it seems like there is no way to remove the reference from WRITE_LOCKS. |
|
Could this be not solved with weak references? Sounds to me like they were made for such use-cases:
|
|
I tested replacing it with a weak reference — I tried passing log_file_descriptor as a weakref instead of passing it directly. It ended up throwing an error when accessing WRITE_LOCKS. Is this the intended behavior you had in mind? |
|
Hi, I wanted to check whether you see the missing WRITE_LOCK release as a bug that should be fixed. If that's the case, I'd be happy to submit a PR for it. I'm raising this because I have some concerns about how it's currently handled on the consumer side. |
File objects registered in WRITE_LOCKS were never released, causing a memory leak in long-running processes that open many log files (e.g., task executors creating a per-task BytesLogger or WriteLogger). WeakKeyDictionary stores keys as weak references, so entries expire automatically when the last strong reference to a file object is dropped — no manual cleanup needed. Closes hynek#806 (no need to expose it, we tidy it up correctly ourselves)
I hope #807 fixes your problem, because that seems to be the correct way to fix this? |
|
yes. LGTM thanks! @ashb |
|
great! when I suggested wekarefs to you, I missed that the keys were the referenced part and wanted to check back once I have time to breath, thankfully @ashb took that off my plate 🥳 |
) * Use WeakKeyDictionary for WRITE_LOCKS to prevent file object leaks File objects registered in WRITE_LOCKS were never released, causing a memory leak in long-running processes that open many log files (e.g., task executors creating a per-task BytesLogger or WriteLogger). WeakKeyDictionary stores keys as weak references, so entries expire automatically when the last strong reference to a file object is dropped — no manual cleanup needed. Closes #806 (no need to expose it, we tidy it up correctly ourselves) * docs: rewrite changelog from user's perspective * tests: adapt style --------- Co-authored-by: Hynek Schlawack <hs@ox.cx>
Summary
Hi, I'm an active user and contributor of Airflow. While investigating a memory leak in Airflow, I discovered that structlog's WRITE_LOCKS holds references to file descriptors created by the process, and these references prevent the associated loggers from being garbage collected even after they are no longer in use. I was able to confirm that the memory leak is resolved by importing WRITE_LOCKS and manually removing the references.
please reference the pr related: apache/airflow#65121
As far as I can tell, there is currently no public interface for removing file references from WRITE_LOCKS. I don't think it's ideal for users to directly import and manipulate a variable from a private module. I've added a public interface that allows users to clean up WRITE_LOCKS references — I'd appreciate it if you could review it.
Pull Request Check List
mainbranch.api.py.docs/api.rstby hand.versionadded,versionchanged, ordeprecateddirectives..rstand.mdfiles is written using semantic newlines.