Skip to content

fix(thread-safety): eliminate races in log mutex, watcher, and index threads#207

Open
map588 wants to merge 3 commits intoDeusData:mainfrom
map588:fix/thread-safety
Open

fix(thread-safety): eliminate races in log mutex, watcher, and index threads#207
map588 wants to merge 3 commits intoDeusData:mainfrom
map588:fix/thread-safety

Conversation

@map588
Copy link
Copy Markdown
Contributor

@map588 map588 commented Apr 5, 2026

  • http_server: Replace lazy log mutex init (TOCTOU race between threads) with explicit cbm_ui_log_init() called from main before thread creation
  • watcher: Add cbm_mutex_t to protect projects hash table from concurrent access by main thread (watch/unwatch) and watcher thread (poll_once)
  • compat_thread: Add cbm_thread_detach() for POSIX and Windows
  • http_server: Detach index job threads to prevent handle leaks

Copilot AI review requested due to automatic review settings April 5, 2026 01:02
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses thread-safety and resource-leak issues across the watcher, UI logging, and background indexing threads.

Changes:

  • Add a mutex to protect the watcher’s projects hash table from concurrent access.
  • Replace lazy log-mutex initialization with an explicit cbm_ui_log_init() call from main().
  • Add cbm_thread_detach() and detach index job threads to avoid leaking thread handles.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
src/watcher/watcher.c Adds projects_lock and guards project table operations in watch/unwatch/poll/free.
src/ui/http_server.h Exposes cbm_ui_log_init() API and documents initialization requirements.
src/ui/http_server.c Implements cbm_ui_log_init(), removes lazy init from append, detaches index threads.
src/main.c Calls cbm_ui_log_init() before starting background threads.
src/foundation/compat_thread.h Adds cbm_thread_detach() declaration.
src/foundation/compat_thread.c Implements cbm_thread_detach() for Windows and POSIX.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 145 to 158
/* Must be called once before any threads are created. */
void cbm_ui_log_init(void) {
if (!atomic_exchange(&g_log_mutex_init, 1)) {
cbm_mutex_init(&g_log_mutex);
}
}

/* Called from a log hook — appends a line to the ring buffer (thread-safe) */
void cbm_ui_log_append(const char *line) {
if (!line)
if (!line || !atomic_load(&g_log_mutex_init))
return;
if (!atomic_load(&g_log_mutex_init)) {
cbm_mutex_init(&g_log_mutex);
atomic_store(&g_log_mutex_init, 1);
}
cbm_mutex_lock(&g_log_mutex);
snprintf(g_log_ring[g_log_head], LOG_LINE_MAX, "%s", line);
g_log_head = (g_log_head + 1) % LOG_RING_SIZE;
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this change, cbm_ui_log_append() silently drops log lines until cbm_ui_log_init() has been called. That’s a behavior change from the previous lazy-init path and can lose early startup logs (even single-threaded) if any code logs before main() reaches cbm_ui_log_init(). Consider making cbm_ui_log_append() invoke cbm_ui_log_init() (using the existing atomic_exchange-based once init, which avoids the prior TOCTOU race), or ensure the init call happens before any log hook / logging can occur.

Copilot uses AI. Check for mistakes.
Comment on lines 433 to 439
.now = now_ns(),
.reindexed = 0,
};
cbm_mutex_lock(&w->projects_lock);
cbm_ht_foreach(w->projects, poll_project, &ctx);
cbm_mutex_unlock(&w->projects_lock);
return ctx.reindexed;
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new coarse lock in cbm_watcher_poll_once() is held across the entire cbm_ht_foreach(...) call. If poll_project performs I/O, indexing work, or calls out to callbacks, this can block watch/unwatch for long periods and increase contention. A common pattern is to snapshot the current project states/keys under the lock (e.g., into a temporary array), release the lock, and then run poll_project over the snapshot to keep the critical section small.

Copilot uses AI. Check for mistakes.
Comment on lines +290 to +297
cbm_mutex_lock(&w->projects_lock);
project_state_t *s = cbm_ht_get(w->projects, project_name);
if (s) {
cbm_ht_delete(w->projects, project_name);
state_free(s);
}
cbm_mutex_unlock(&w->projects_lock);
if (s) {
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s is freed inside the lock, but then reused after unlocking as a boolean flag. While this is currently safe (not dereferenced), it’s easy to misread as a use-after-free and can trip up future edits. Consider using a separate bool removed flag (or set s = NULL after state_free(s) and use a separate flag) to make the intent explicit.

Suggested change
cbm_mutex_lock(&w->projects_lock);
project_state_t *s = cbm_ht_get(w->projects, project_name);
if (s) {
cbm_ht_delete(w->projects, project_name);
state_free(s);
}
cbm_mutex_unlock(&w->projects_lock);
if (s) {
bool removed = false;
cbm_mutex_lock(&w->projects_lock);
project_state_t *s = cbm_ht_get(w->projects, project_name);
if (s) {
cbm_ht_delete(w->projects, project_name);
state_free(s);
removed = true;
}
cbm_mutex_unlock(&w->projects_lock);
if (removed) {

Copilot uses AI. Check for mistakes.
Comment on lines +62 to +68
int cbm_thread_detach(cbm_thread_t *t) {
if (t->handle) {
CloseHandle(t->handle);
t->handle = NULL;
}
return 0;
}
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The POSIX implementation doesn’t clear/poison the stored pthread_t after a successful detach, which makes accidental later join()/double-detach harder to detect and can lead to inconsistent lifecycle handling across platforms (Windows clears the handle). Consider, on success, updating the thread object to a known 'no handle' state (whatever is idiomatic for cbm_thread_t on POSIX), and/or documenting that cbm_thread_join() must not be called after detach.

Copilot uses AI. Check for mistakes.
Comment on lines +85 to +89
return pthread_join(t->handle, NULL);
}

int cbm_thread_detach(cbm_thread_t *t) {
return pthread_detach(t->handle);
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The POSIX implementation doesn’t clear/poison the stored pthread_t after a successful detach, which makes accidental later join()/double-detach harder to detect and can lead to inconsistent lifecycle handling across platforms (Windows clears the handle). Consider, on success, updating the thread object to a known 'no handle' state (whatever is idiomatic for cbm_thread_t on POSIX), and/or documenting that cbm_thread_join() must not be called after detach.

Suggested change
return pthread_join(t->handle, NULL);
}
int cbm_thread_detach(cbm_thread_t *t) {
return pthread_detach(t->handle);
int rc = pthread_join(t->handle, NULL);
if (rc == 0) {
memset(&t->handle, 0, sizeof(t->handle));
}
return rc;
}
int cbm_thread_detach(cbm_thread_t *t) {
int rc = pthread_detach(t->handle);
if (rc == 0) {
memset(&t->handle, 0, sizeof(t->handle));
}
return rc;

Copilot uses AI. Check for mistakes.
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 51 to 56
cbm_index_fn index_fn;
void *user_data;
CBMHashTable *projects; /* name → project_state_t* */
cbm_mutex_t projects_lock;
atomic_int stopped;
};
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

projects_lock is added to protect the projects hash table, but not all accessors use it. For example, cbm_watcher_touch() and cbm_watcher_watch_count() call cbm_ht_get/cbm_ht_count without taking projects_lock, so concurrent watch/unwatch/poll can still race. Either guard every hash-table access with projects_lock (including touch/watch_count) or clearly document and enforce single-threaded callers for those APIs.

Copilot uses AI. Check for mistakes.
@@ -272,17 +279,24 @@ void cbm_watcher_watch(cbm_watcher_t *w, const char *project_name, const char *r

project_state_t *s = state_new(project_name, root_path);
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

state_new() can return NULL on OOM, but the result is used immediately (s->project_name) and inserted into the hash table while holding projects_lock. Please handle allocation failure by unlocking, cleaning up, and returning (optionally logging an OOM warning).

Suggested change
project_state_t *s = state_new(project_name, root_path);
project_state_t *s = state_new(project_name, root_path);
if (!s) {
cbm_mutex_unlock(&w->projects_lock);
cbm_log_warn("watcher.watch.oom", "project", project_name, "path", root_path);
return;
}

Copilot uses AI. Check for mistakes.
Comment on lines +143 to +149
static atomic_int g_log_mutex_init = 0;

/* Must be called once before any threads are created. */
void cbm_ui_log_init(void) {
if (!atomic_exchange(&g_log_mutex_init, 1)) {
cbm_mutex_init(&g_log_mutex);
}
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cbm_ui_log_init uses atomic_exchange to set g_log_mutex_init=1 before cbm_mutex_init runs. If two threads call cbm_ui_log_init concurrently, one thread can observe init=1 and proceed to lock an uninitialized mutex. Use a real one-time init primitive (pthread_once/InitOnceExecuteOnce), or a 3-state init (UNINIT/INITING/INITED) that only publishes "initialized" after cbm_mutex_init completes.

Suggested change
static atomic_int g_log_mutex_init = 0;
/* Must be called once before any threads are created. */
void cbm_ui_log_init(void) {
if (!atomic_exchange(&g_log_mutex_init, 1)) {
cbm_mutex_init(&g_log_mutex);
}
enum {
CBM_LOG_MUTEX_UNINIT = 0,
CBM_LOG_MUTEX_INITING = 1,
CBM_LOG_MUTEX_INITED = 2
};
static atomic_int g_log_mutex_init = CBM_LOG_MUTEX_UNINIT;
/* Safe for concurrent callers: only publish INITED after cbm_mutex_init()
* has completed. */
void cbm_ui_log_init(void) {
int state = atomic_load(&g_log_mutex_init);
if (state == CBM_LOG_MUTEX_INITED)
return;
state = CBM_LOG_MUTEX_UNINIT;
if (atomic_compare_exchange_strong(&g_log_mutex_init, &state,
CBM_LOG_MUTEX_INITING)) {
cbm_mutex_init(&g_log_mutex);
atomic_store(&g_log_mutex_init, CBM_LOG_MUTEX_INITED);
return;
}
while (atomic_load(&g_log_mutex_init) != CBM_LOG_MUTEX_INITED) {
#ifdef _WIN32
Sleep(0);
#else
usleep(1000);
#endif
}

Copilot uses AI. Check for mistakes.
Comment on lines 58 to +62
CloseHandle(t->handle);
return 0;
}

int cbm_thread_detach(cbm_thread_t *t) {
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On Windows, cbm_thread_join closes the thread handle but does not clear t->handle. With cbm_thread_detach now available (and used elsewhere), a caller could accidentally CloseHandle twice (join then detach), or treat the thread as still valid. Set t->handle = NULL after CloseHandle in cbm_thread_join for consistency with cbm_thread_detach.

Copilot uses AI. Check for mistakes.
…threads

- http_server: Replace lazy log mutex init with 3-state init
  (UNINIT → INITING → INITED) using atomic CAS. Concurrent callers
  spin until init completes, preventing use of uninitialized mutex.
  cbm_ui_log_append calls cbm_ui_log_init on first use so early
  startup logs are not dropped.
- watcher: Add cbm_mutex_t to protect projects hash table. All
  accessors (watch, unwatch, touch, watch_count, poll_once) are
  guarded. poll_once snapshots project pointers under lock then polls
  without holding it, keeping the critical section small during git
  I/O and indexing. state_new OOM is handled with early return.
- compat_thread: Add cbm_thread_detach() for POSIX and Windows.
  Both join() and detach() clear the handle on success across both
  platforms for consistent lifecycle tracking.
- http_server: Detach index job threads to prevent handle leaks.
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@@ -236,6 +238,7 @@ cbm_watcher_t *cbm_watcher_new(cbm_store_t *store, cbm_index_fn index_fn, void *
w->index_fn = index_fn;
w->user_data = user_data;
w->projects = cbm_ht_create(CBM_SZ_32);
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cbm_watcher_new() doesn’t check whether cbm_ht_create() succeeded. If it returns NULL (OOM), later cbm_watcher_watch/unwatch/touch will dereference a NULL hash table via cbm_ht_get/set/delete, causing a crash. Consider handling allocation failure here (free w, destroy mutex if needed) and return NULL so callers can detect it.

Suggested change
w->projects = cbm_ht_create(CBM_SZ_32);
w->projects = cbm_ht_create(CBM_SZ_32);
if (!w->projects) {
free(w);
return NULL;
}

Copilot uses AI. Check for mistakes.
Comment on lines 322 to +329
int cbm_watcher_watch_count(const cbm_watcher_t *w) {
if (!w) {
return 0;
}
return (int)cbm_ht_count(w->projects);
cbm_mutex_lock(&((cbm_watcher_t *)w)->projects_lock);
int count = (int)cbm_ht_count(w->projects);
cbm_mutex_unlock(&((cbm_watcher_t *)w)->projects_lock);
return count;
Copy link

Copilot AI Apr 5, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cbm_watcher_watch_count() takes a const cbm_watcher_t* but then casts away const to lock/unlock projects_lock. This undermines const-correctness and can surprise callers. Consider changing the API to take a non-const cbm_watcher_t* (since it necessarily touches synchronization primitives), or store projects_lock behind an internal pointer so the public signature remains const-safe.

Copilot uses AI. Check for mistakes.
@DeusData DeusData added bug Something isn't working stability/performance Server crashes, OOM, hangs, high CPU/memory labels Apr 6, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working stability/performance Server crashes, OOM, hangs, high CPU/memory

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants