fix(thread-safety): eliminate races in log mutex, watcher, and index threads by map588 · Pull Request #207 · DeusData/codebase-memory-mcp

map588 · 2026-04-05T01:02:16Z

http_server: Replace lazy log mutex init (TOCTOU race between threads) with explicit cbm_ui_log_init() called from main before thread creation
watcher: Add cbm_mutex_t to protect projects hash table from concurrent access by main thread (watch/unwatch) and watcher thread (poll_once)
compat_thread: Add cbm_thread_detach() for POSIX and Windows
http_server: Detach index job threads to prevent handle leaks

Copilot

Pull request overview

This PR addresses thread-safety and resource-leak issues across the watcher, UI logging, and background indexing threads.

Changes:

Add a mutex to protect the watcher’s projects hash table from concurrent access.
Replace lazy log-mutex initialization with an explicit cbm_ui_log_init() call from main().
Add cbm_thread_detach() and detach index job threads to avoid leaking thread handles.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

Show a summary per file

File	Description
src/watcher/watcher.c	Adds `projects_lock` and guards project table operations in watch/unwatch/poll/free.
src/ui/http_server.h	Exposes `cbm_ui_log_init()` API and documents initialization requirements.
src/ui/http_server.c	Implements `cbm_ui_log_init()`, removes lazy init from append, detaches index threads.
src/main.c	Calls `cbm_ui_log_init()` before starting background threads.
src/foundation/compat_thread.h	Adds `cbm_thread_detach()` declaration.
src/foundation/compat_thread.c	Implements `cbm_thread_detach()` for Windows and POSIX.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-05T01:03:55Z

src/ui/http_server.c

+/* Must be called once before any threads are created. */
+void cbm_ui_log_init(void) {
+    if (!atomic_exchange(&g_log_mutex_init, 1)) {
+        cbm_mutex_init(&g_log_mutex);
+    }
+}
+
 /* Called from a log hook — appends a line to the ring buffer (thread-safe) */
 void cbm_ui_log_append(const char *line) {
-    if (!line)
+    if (!line || !atomic_load(&g_log_mutex_init))
        return;
-    if (!atomic_load(&g_log_mutex_init)) {
-        cbm_mutex_init(&g_log_mutex);
-        atomic_store(&g_log_mutex_init, 1);
-    }
    cbm_mutex_lock(&g_log_mutex);
    snprintf(g_log_ring[g_log_head], LOG_LINE_MAX, "%s", line);
    g_log_head = (g_log_head + 1) % LOG_RING_SIZE;


With this change, cbm_ui_log_append() silently drops log lines until cbm_ui_log_init() has been called. That’s a behavior change from the previous lazy-init path and can lose early startup logs (even single-threaded) if any code logs before main() reaches cbm_ui_log_init(). Consider making cbm_ui_log_append() invoke cbm_ui_log_init() (using the existing atomic_exchange-based once init, which avoids the prior TOCTOU race), or ensure the init call happens before any log hook / logging can occur.

Copilot · 2026-04-05T01:03:56Z

src/watcher/watcher.c

        .now = now_ns(),
        .reindexed = 0,
    };
+    cbm_mutex_lock(&w->projects_lock);
    cbm_ht_foreach(w->projects, poll_project, &ctx);
+    cbm_mutex_unlock(&w->projects_lock);
    return ctx.reindexed;


The new coarse lock in cbm_watcher_poll_once() is held across the entire cbm_ht_foreach(...) call. If poll_project performs I/O, indexing work, or calls out to callbacks, this can block watch/unwatch for long periods and increase contention. A common pattern is to snapshot the current project states/keys under the lock (e.g., into a temporary array), release the lock, and then run poll_project over the snapshot to keep the critical section small.

Copilot · 2026-04-05T01:03:56Z

src/watcher/watcher.c

+    cbm_mutex_lock(&w->projects_lock);
    project_state_t *s = cbm_ht_get(w->projects, project_name);
    if (s) {
        cbm_ht_delete(w->projects, project_name);
        state_free(s);
+    }
+    cbm_mutex_unlock(&w->projects_lock);
+    if (s) {


s is freed inside the lock, but then reused after unlocking as a boolean flag. While this is currently safe (not dereferenced), it’s easy to misread as a use-after-free and can trip up future edits. Consider using a separate bool removed flag (or set s = NULL after state_free(s) and use a separate flag) to make the intent explicit.

Suggested change

cbm_mutex_lock(&w->projects_lock);

project_state_t *s = cbm_ht_get(w->projects, project_name);

if (s) {

cbm_ht_delete(w->projects, project_name);

state_free(s);

}

cbm_mutex_unlock(&w->projects_lock);

if (s) {

bool removed = false;

cbm_mutex_lock(&w->projects_lock);

project_state_t *s = cbm_ht_get(w->projects, project_name);

if (s) {

cbm_ht_delete(w->projects, project_name);

state_free(s);

removed = true;

}

cbm_mutex_unlock(&w->projects_lock);

if (removed) {

Copilot · 2026-04-05T01:03:56Z

src/foundation/compat_thread.c

+int cbm_thread_detach(cbm_thread_t *t) {
+    if (t->handle) {
+        CloseHandle(t->handle);
+        t->handle = NULL;
+    }
+    return 0;
+}


The POSIX implementation doesn’t clear/poison the stored pthread_t after a successful detach, which makes accidental later join()/double-detach harder to detect and can lead to inconsistent lifecycle handling across platforms (Windows clears the handle). Consider, on success, updating the thread object to a known 'no handle' state (whatever is idiomatic for cbm_thread_t on POSIX), and/or documenting that cbm_thread_join() must not be called after detach.

Copilot · 2026-04-05T01:03:57Z

src/foundation/compat_thread.c

    return pthread_join(t->handle, NULL);
 }

+int cbm_thread_detach(cbm_thread_t *t) {
+    return pthread_detach(t->handle);


The POSIX implementation doesn’t clear/poison the stored pthread_t after a successful detach, which makes accidental later join()/double-detach harder to detect and can lead to inconsistent lifecycle handling across platforms (Windows clears the handle). Consider, on success, updating the thread object to a known 'no handle' state (whatever is idiomatic for cbm_thread_t on POSIX), and/or documenting that cbm_thread_join() must not be called after detach.

Suggested change

return pthread_join(t->handle, NULL);

}

int cbm_thread_detach(cbm_thread_t *t) {

return pthread_detach(t->handle);

int rc = pthread_join(t->handle, NULL);

if (rc == 0) {

memset(&t->handle, 0, sizeof(t->handle));

}

return rc;

}

int cbm_thread_detach(cbm_thread_t *t) {

int rc = pthread_detach(t->handle);

if (rc == 0) {

memset(&t->handle, 0, sizeof(t->handle));

}

return rc;

Copilot

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-05T03:04:20Z

src/watcher/watcher.c

    cbm_index_fn index_fn;
    void *user_data;
    CBMHashTable *projects; /* name → project_state_t* */
+    cbm_mutex_t projects_lock;
    atomic_int stopped;
 };


projects_lock is added to protect the projects hash table, but not all accessors use it. For example, cbm_watcher_touch() and cbm_watcher_watch_count() call cbm_ht_get/cbm_ht_count without taking projects_lock, so concurrent watch/unwatch/poll can still race. Either guard every hash-table access with projects_lock (including touch/watch_count) or clearly document and enforce single-threaded callers for those APIs.

Copilot · 2026-04-05T03:04:20Z

src/watcher/watcher.c

@@ -272,17 +279,24 @@ void cbm_watcher_watch(cbm_watcher_t *w, const char *project_name, const char *r

    project_state_t *s = state_new(project_name, root_path);


state_new() can return NULL on OOM, but the result is used immediately (s->project_name) and inserted into the hash table while holding projects_lock. Please handle allocation failure by unlocking, cleaning up, and returning (optionally logging an OOM warning).

Suggested change

project_state_t *s = state_new(project_name, root_path);

project_state_t *s = state_new(project_name, root_path);

if (!s) {

cbm_mutex_unlock(&w->projects_lock);

cbm_log_warn("watcher.watch.oom", "project", project_name, "path", root_path);

return;

}

Copilot · 2026-04-05T03:04:21Z

src/ui/http_server.c

 static atomic_int g_log_mutex_init = 0;

+/* Must be called once before any threads are created. */
+void cbm_ui_log_init(void) {
+    if (!atomic_exchange(&g_log_mutex_init, 1)) {
+        cbm_mutex_init(&g_log_mutex);
+    }


cbm_ui_log_init uses atomic_exchange to set g_log_mutex_init=1 before cbm_mutex_init runs. If two threads call cbm_ui_log_init concurrently, one thread can observe init=1 and proceed to lock an uninitialized mutex. Use a real one-time init primitive (pthread_once/InitOnceExecuteOnce), or a 3-state init (UNINIT/INITING/INITED) that only publishes "initialized" after cbm_mutex_init completes.

Suggested change

static atomic_int g_log_mutex_init = 0;

/* Must be called once before any threads are created. */

void cbm_ui_log_init(void) {

if (!atomic_exchange(&g_log_mutex_init, 1)) {

cbm_mutex_init(&g_log_mutex);

}

enum {

CBM_LOG_MUTEX_UNINIT = 0,

CBM_LOG_MUTEX_INITING = 1,

CBM_LOG_MUTEX_INITED = 2

};

static atomic_int g_log_mutex_init = CBM_LOG_MUTEX_UNINIT;

/* Safe for concurrent callers: only publish INITED after cbm_mutex_init()

* has completed. */

void cbm_ui_log_init(void) {

int state = atomic_load(&g_log_mutex_init);

if (state == CBM_LOG_MUTEX_INITED)

return;

state = CBM_LOG_MUTEX_UNINIT;

if (atomic_compare_exchange_strong(&g_log_mutex_init, &state,

CBM_LOG_MUTEX_INITING)) {

cbm_mutex_init(&g_log_mutex);

atomic_store(&g_log_mutex_init, CBM_LOG_MUTEX_INITED);

return;

}

while (atomic_load(&g_log_mutex_init) != CBM_LOG_MUTEX_INITED) {

#ifdef _WIN32

Sleep(0);

#else

usleep(1000);

#endif

}

Copilot · 2026-04-05T03:04:21Z

src/foundation/compat_thread.c

    CloseHandle(t->handle);
    return 0;
 }

+int cbm_thread_detach(cbm_thread_t *t) {


On Windows, cbm_thread_join closes the thread handle but does not clear t->handle. With cbm_thread_detach now available (and used elsewhere), a caller could accidentally CloseHandle twice (join then detach), or treat the thread as still valid. Set t->handle = NULL after CloseHandle in cbm_thread_join for consistency with cbm_thread_detach.

…threads - http_server: Replace lazy log mutex init with 3-state init (UNINIT → INITING → INITED) using atomic CAS. Concurrent callers spin until init completes, preventing use of uninitialized mutex. cbm_ui_log_append calls cbm_ui_log_init on first use so early startup logs are not dropped. - watcher: Add cbm_mutex_t to protect projects hash table. All accessors (watch, unwatch, touch, watch_count, poll_once) are guarded. poll_once snapshots project pointers under lock then polls without holding it, keeping the critical section small during git I/O and indexing. state_new OOM is handled with early return. - compat_thread: Add cbm_thread_detach() for POSIX and Windows. Both join() and detach() clear the handle on success across both platforms for consistent lifecycle tracking. - http_server: Detach index job threads to prevent handle leaks.

Copilot

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-05T03:30:25Z

src/watcher/watcher.c

@@ -236,6 +238,7 @@ cbm_watcher_t *cbm_watcher_new(cbm_store_t *store, cbm_index_fn index_fn, void *
    w->index_fn = index_fn;
    w->user_data = user_data;
    w->projects = cbm_ht_create(CBM_SZ_32);


cbm_watcher_new() doesn’t check whether cbm_ht_create() succeeded. If it returns NULL (OOM), later cbm_watcher_watch/unwatch/touch will dereference a NULL hash table via cbm_ht_get/set/delete, causing a crash. Consider handling allocation failure here (free w, destroy mutex if needed) and return NULL so callers can detect it.

Suggested change

w->projects = cbm_ht_create(CBM_SZ_32);

w->projects = cbm_ht_create(CBM_SZ_32);

if (!w->projects) {

free(w);

return NULL;

}

Copilot · 2026-04-05T03:30:25Z

src/watcher/watcher.c

 int cbm_watcher_watch_count(const cbm_watcher_t *w) {
    if (!w) {
        return 0;
    }
-    return (int)cbm_ht_count(w->projects);
+    cbm_mutex_lock(&((cbm_watcher_t *)w)->projects_lock);
+    int count = (int)cbm_ht_count(w->projects);
+    cbm_mutex_unlock(&((cbm_watcher_t *)w)->projects_lock);
+    return count;


cbm_watcher_watch_count() takes a const cbm_watcher_t* but then casts away const to lock/unlock projects_lock. This undermines const-correctness and can surprise callers. Consider changing the API to take a non-const cbm_watcher_t* (since it necessarily touches synchronization primitives), or store projects_lock behind an internal pointer so the public signature remains const-safe.

Copilot AI review requested due to automatic review settings April 5, 2026 01:02

Copilot AI reviewed Apr 5, 2026

View reviewed changes

map588 force-pushed the fix/thread-safety branch from 9c53fc0 to deea49e Compare April 5, 2026 02:52

map588 requested a review from Copilot April 5, 2026 03:00

Copilot started reviewing on behalf of map588 April 5, 2026 03:00 View session

Copilot AI reviewed Apr 5, 2026

View reviewed changes

map588 force-pushed the fix/thread-safety branch from deea49e to d6966f9 Compare April 5, 2026 03:23

map588 requested a review from Copilot April 5, 2026 03:26

Copilot started reviewing on behalf of map588 April 5, 2026 03:27 View session

Copilot AI reviewed Apr 5, 2026

View reviewed changes

map588 added 2 commits April 6, 2026 10:37

Merge branch 'DeusData:main' into fix/thread-safety

ebda064

Merge branch 'DeusData:main' into fix/thread-safety

8f5b4c3

DeusData added bug Something isn't working stability/performance Server crashes, OOM, hangs, high CPU/memory labels Apr 6, 2026

-    return pthread_join(t->handle, NULL);
-}
-int cbm_thread_detach(cbm_thread_t *t) {
-    return pthread_detach(t->handle);
+    int rc = pthread_join(t->handle, NULL);
+    if (rc == 0) {
+        memset(&t->handle, 0, sizeof(t->handle));
+    }
+    return rc;
+}
+int cbm_thread_detach(cbm_thread_t *t) {
+    int rc = pthread_detach(t->handle);
+    if (rc == 0) {
+        memset(&t->handle, 0, sizeof(t->handle));
+    }
+    return rc;

		@@ -272,17 +279,24 @@ void cbm_watcher_watch(cbm_watcher_t w, const char project_name, const char *r

		project_state_t *s = state_new(project_name, root_path);

-static atomic_int g_log_mutex_init = 0;
-/* Must be called once before any threads are created. */
-void cbm_ui_log_init(void) {
-    if (!atomic_exchange(&g_log_mutex_init, 1)) {
-        cbm_mutex_init(&g_log_mutex);
-    }
+enum {
+    CBM_LOG_MUTEX_UNINIT = 0,
+    CBM_LOG_MUTEX_INITING = 1,
+    CBM_LOG_MUTEX_INITED = 2
+};
+static atomic_int g_log_mutex_init = CBM_LOG_MUTEX_UNINIT;
+/* Safe for concurrent callers: only publish INITED after cbm_mutex_init()
+ * has completed. */
+void cbm_ui_log_init(void) {
+    int state = atomic_load(&g_log_mutex_init);
+    if (state == CBM_LOG_MUTEX_INITED)
+        return;
+    state = CBM_LOG_MUTEX_UNINIT;
+    if (atomic_compare_exchange_strong(&g_log_mutex_init, &state,
+                                       CBM_LOG_MUTEX_INITING)) {
+        cbm_mutex_init(&g_log_mutex);
+        atomic_store(&g_log_mutex_init, CBM_LOG_MUTEX_INITED);
+        return;
+    }
+    while (atomic_load(&g_log_mutex_init) != CBM_LOG_MUTEX_INITED) {
+#ifdef _WIN32
+        Sleep(0);
+#else
+        usleep(1000);
+#endif
+    }

-    w->projects = cbm_ht_create(CBM_SZ_32);
+    w->projects = cbm_ht_create(CBM_SZ_32);
+    if (!w->projects) {
+        free(w);
+        return NULL;
+    }

Conversation

map588 commented Apr 5, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants