Conversation
…crashes
Three locations in store.c called sqlite3_prepare_v2 without checking the return code. If the statement fails to prepare (DB corruption, malformed SQL), subsequent bind_text and sqlite3_step calls dereference NULL, crashing the server.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…t access
The watcher's projects hash table was written by the main thread (watch/unwatch) and iterated by the watcher thread (poll_once) with no synchronization. Added cbm_mutex_t to the watcher struct and wrapped all hash table operations.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Decorators previously only created DECORATES edges. A @login_required decorator was invisible to "find all references" queries because those look for CALLS and USAGE edges. Now resolve_decorator emits both DECORATES and CALLS edges.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…row memory helpers
Four new inline helpers in platform.h alongside existing safe_realloc:
- safe_free(ptr): frees and NULLs any pointer (prevents double-free)
- safe_str_free(&str): frees const char* and NULLs (replaces free((void*)str))
- safe_buf_free(buf, &count): frees array and zeros its count
- safe_grow(arr, n, cap, factor): one-line capacity-doubling realloc
Pull request overview
This PR introduces new “safe” memory helper utilities and includes several robustness/concurrency fixes across the watcher, UI logging/threading, Cypher lexing/parsing, and SQLite statement preparation.
Changes:
- Add `safe_free`/`safe_str_free`/`safe_buf_free`/`safe_grow` helpers in `platform.h`.
- Improve thread-safety and lifecycle management (watcher project table mutex, UI log mutex init API, thread detach helper + usage).
- Harden parsing/lexing and SQLite calls (truncate long string literals safely, add lexer overflow test, check `sqlite3_prepare_v2` return values, emit extra semantic edges).
Reviewed changes
Copilot reviewed 17 out of 18 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| tests/test_cypher.c | Adds regression test for long string literal truncation behavior. |
| src/cypher/cypher.c | Adds truncation guard in string lexing; adds several OOM checks in parser. |
| src/foundation/platform.h | Adds new memory helper macros/inlines. |
| src/foundation/compat_thread.h | Exposes cbm_thread_detach. |
| src/foundation/compat_thread.c | Implements cbm_thread_detach for Win32/POSIX. |
| src/watcher/watcher.c | Adds mutex protecting projects hash table and wraps some accesses. |
| src/ui/http_server.h | Exposes cbm_ui_log_init API. |
| src/ui/http_server.c | Requires explicit log mutex initialization; detaches background indexing thread handle. |
| src/main.c | Calls cbm_ui_log_init() during startup. |
| src/store/store.c | Checks sqlite3_prepare_v2 results before proceeding. |
| src/pipeline/pass_semantic.c | Adds a CALLS edge for decorators to improve reference queries. |
| .gitignore | Ignores additional local/Claude-related files. |
| .claude/skills/, .claude/settings.json, .claude/agents/ | Adds Claude workflow/skill configuration files. |
```c
const int max_blen = CBM_SZ_4K - 1;
while (*pos < len && input[*pos] != quote) {
    if (blen >= max_blen) { (*pos)++; continue; }
    if (input[*pos] == '\\' && *pos + SKIP_ONE < len) {
        (*pos)++;
```
In lex_string_literal, the overflow truncation check runs before the escape-handling branch. If the string is already at max length and the next bytes are an escape like \", the code will skip only the backslash, then see " on the next iteration and treat it as the closing quote (even though it was escaped), prematurely terminating the string. To preserve correct lexing, handle escape sequences even when truncating (e.g., move the truncation logic after the escape handling, or when truncating and seeing \\ skip both bytes when possible).
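A minimal sketch of the suggested fix, with escape handling ordered before the truncation check: once the output buffer is full, the loop still consumes a full escape sequence (two input bytes), so an escaped quote is never mistaken for the closing quote. The function name, signature, and tiny buffer limit are illustrative, not the project's real `lex_string_literal`, and it copies escape bytes verbatim rather than decoding them.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical bounded string lexer: consumes up to the closing quote,
 * copying at most max_blen bytes into buf (which must hold max_blen + 1). */
static size_t lex_quoted(const char *input, size_t len, size_t *pos,
                         char quote, char *buf, size_t max_blen) {
    size_t blen = 0;
    while (*pos < len && input[*pos] != quote) {
        size_t step = 1;
        if (input[*pos] == '\\' && *pos + 1 < len)
            step = 2;                      /* escape: backslash + escaped byte */
        if (blen + step <= max_blen) {     /* room left: copy (verbatim here) */
            memcpy(buf + blen, input + *pos, step);
            blen += step;
        }
        *pos += step;                      /* full or not: skip the whole escape */
    }
    buf[blen] = '\0';
    return blen;
}
```

With a 2-byte limit and input `ab\"cd"`, the truncated token is `ab` and the lexer stops at the real closing quote after `cd`, whereas a truncate-first loop would stop at the escaped quote.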
```c
/* Safe free: frees and NULLs a pointer to prevent double-free / use-after-free.
 * Accepts void** so it works with any pointer type via the macro. */
static inline void safe_free_impl(void **pp) {
    if (pp && *pp) {
        free(*pp);
        *pp = NULL;
    }
}
#define safe_free(ptr) safe_free_impl((void **)(void *)&(ptr))

/* Safe string free: frees a const char* and NULLs it.
 * Casts away const so callers don't need the (void*) dance. */
static inline void safe_str_free(const char **sp) {
    if (sp && *sp) {
        free((void *)*sp);
        *sp = NULL;
    }
}

/* Safe buffer free: frees a heap array and zeros its element count.
 * Use for dynamic arrays paired with a size_t count. */
static inline void safe_buf_free_impl(void **buf, size_t *count) {
    if (buf && *buf) {
        free(*buf);
        *buf = NULL;
    }
    if (count) {
        *count = 0;
    }
}
#define safe_buf_free(buf, countp) safe_buf_free_impl((void **)(void *)&(buf), (countp))
```
safe_free/safe_buf_free pass &(ptr) cast to void ** into safe_free_impl/safe_buf_free_impl. Converting a T ** to void ** and then writing through it is undefined behavior in C (the pointer-to-pointer types are not compatible). Prefer a macro-based implementation that frees and assigns NULL directly to the lvalue (or use _Generic to dispatch to type-correct helper functions) so you never write through a mis-typed void **.
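A minimal sketch of the macro-based alternative the review suggests: free and NULL the lvalue directly, so no write ever goes through a mis-typed `void **`. This is an illustration, not the project's code; note that `p` is evaluated twice, which is fine for plain lvalues but not for expressions with side effects.

```c
#include <assert.h>
#include <stdlib.h>

/* Free the pointee and NULL the lvalue itself; the (void *) cast also
 * accepts const-qualified pointers such as const char *. */
#define safe_free(p) do { \
        free((void *)(p)); \
        (p) = NULL;        \
    } while (0)
```

An `_Generic`-based dispatch to type-correct helper functions would avoid the double evaluation at the cost of enumerating the pointer types used in the codebase.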
```c
/* Safe grow: doubles capacity and reallocs when count reaches cap.
 * Usage: safe_grow(arr, count, cap, growth_factor)
 * Evaluates to the new arr (NULL on OOM — old memory freed by safe_realloc). */
#define safe_grow(arr, n, cap, factor) do { \
    if ((size_t)(n) >= (size_t)(cap)) { \
        (cap) *= (factor); \
        (arr) = safe_realloc((arr), (size_t)(cap) * sizeof(*(arr))); \
```
safe_grow updates cap before verifying that safe_realloc succeeded. On OOM, safe_realloc frees the old buffer and returns NULL, so after the macro runs you can end up with (arr)==NULL but an enlarged (cap), preventing future growth attempts and making later code prone to NULL deref. Consider computing a new_cap first, reallocating, and only committing cap=new_cap when the reallocation succeeded (also consider guarding cap==0 and checking for cap*sizeof(*arr) overflow).
Suggested change:

```diff
-/* Safe grow: doubles capacity and reallocs when count reaches cap.
- * Usage: safe_grow(arr, count, cap, growth_factor)
- * Evaluates to the new arr (NULL on OOM — old memory freed by safe_realloc). */
-#define safe_grow(arr, n, cap, factor) do { \
-    if ((size_t)(n) >= (size_t)(cap)) { \
-        (cap) *= (factor); \
-        (arr) = safe_realloc((arr), (size_t)(cap) * sizeof(*(arr))); \
+/* Safe grow: grows capacity and reallocs when count reaches cap.
+ * Usage: safe_grow(arr, count, cap, growth_factor)
+ * On success, updates arr and cap together. On OOM/overflow, arr becomes
+ * NULL only if safe_realloc freed the old buffer; cap is left unchanged. */
+#define safe_grow(arr, n, cap, factor) do { \
+    if ((size_t)(n) >= (size_t)(cap)) { \
+        size_t _safe_old_cap = (size_t)(cap); \
+        size_t _safe_new_cap = _safe_old_cap ? (_safe_old_cap * (size_t)(factor)) : 1u; \
+        if (_safe_new_cap <= _safe_old_cap) { \
+            _safe_new_cap = _safe_old_cap + 1u; \
+        } \
+        if (sizeof(*(arr)) != 0 && _safe_new_cap <= (SIZE_MAX / sizeof(*(arr)))) { \
+            void *_safe_tmp = safe_realloc((arr), _safe_new_cap * sizeof(*(arr))); \
+            if (_safe_tmp) { \
+                (arr) = _safe_tmp; \
+                (cap) = _safe_new_cap; \
+            } else { \
+                (arr) = NULL; \
+            } \
+        } else { \
+            (arr) = NULL; \
+        } \
```
```diff
 const char *sql = "SELECT type, COUNT(*) FROM edges WHERE project = ?1 GROUP BY type ORDER "
                   "BY COUNT(*) DESC;";
 sqlite3_stmt *stmt = NULL;
-sqlite3_prepare_v2(s->db, sql, CBM_NOT_FOUND, &stmt, NULL);
+if (sqlite3_prepare_v2(s->db, sql, CBM_NOT_FOUND, &stmt, NULL) != SQLITE_OK || !stmt) {
+    return CBM_NOT_FOUND;
+}
```
If the second sqlite3_prepare_v2 fails, this function now returns immediately, but at that point out->node_labels may already have been allocated/populated by the first query. That causes a leak unless the caller explicitly frees out on failure (current callers don't check the return value). Suggest adding a single cleanup path (e.g., goto fail) that finalizes statements and calls cbm_store_schema_free(out) (or equivalent) before returning an error.
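A sqlite-free sketch of the single-cleanup-path ("goto fail") shape the review suggests: every early error jumps to one label that releases whatever has been acquired so far, so a failure in the second step cannot leak what the first step allocated. The type and function names are hypothetical stand-ins for the store's schema structures, and `malloc` stands in for the two prepare/step queries.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int *node_labels; int *edge_counts; } schema_t;

static int load_schema(schema_t *out, int fail_second_step) {
    out->node_labels = NULL;
    out->edge_counts = NULL;

    out->node_labels = malloc(16 * sizeof *out->node_labels);  /* "first query" */
    if (!out->node_labels) goto fail;

    if (fail_second_step) goto fail;          /* "second prepare fails" */

    out->edge_counts = malloc(16 * sizeof *out->edge_counts);  /* "second query" */
    if (!out->edge_counts) goto fail;

    return 0;                                 /* success: caller owns *out */

fail:
    free(out->node_labels); out->node_labels = NULL;  /* release partial results */
    free(out->edge_counts); out->edge_counts = NULL;
    return -1;
}
```

In the real function the `fail:` block would also call `sqlite3_finalize` on any prepared statements before returning the error code.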
```c
@@ -272,17 +279,22 @@
    project_state_t *s = state_new(project_name, root_path);
    cbm_ht_set(w->projects, s->project_name, s);
    cbm_mutex_unlock(&w->projects_lock);
```
cbm_watcher_watch calls state_new(...) under the new lock but doesn’t handle OOM: if state_new returns NULL, cbm_ht_set(w->projects, s->project_name, s) will dereference NULL, and the mutex will remain locked on any early return you add later. Add a NULL check for s and ensure the mutex is always unlocked on all exit paths.
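A sketch of the control flow being asked for: check the `state_new` result and funnel every exit through a single unlock. To stay standalone, a counter-based stub replaces `cbm_mutex_t` and the hash-table insert is elided; only the lock/NULL-check shape is the point, not the project's real API.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int locked; } fake_mutex_t;           /* stand-in for cbm_mutex_t */
static void fake_lock(fake_mutex_t *m)   { m->locked++; }
static void fake_unlock(fake_mutex_t *m) { m->locked--; }

typedef struct { char *name; } project_state_t;

static int watcher_watch(fake_mutex_t *lock, int simulate_oom) {
    int rc = -1;
    fake_lock(lock);

    project_state_t *s = simulate_oom ? NULL
                                      : calloc(1, sizeof *s);  /* "state_new" */
    if (!s) goto out;        /* OOM: fall through to the single unlock */

    /* ... cbm_ht_set(w->projects, s->project_name, s) would go here ... */
    free(s);                 /* placeholder: real code transfers ownership */
    rc = 0;

out:
    fake_unlock(lock);       /* released on success and failure alike */
    return rc;
}
```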
```c
poll_ctx_t ctx = {
    .w = w,
    .now = now_ns(),
    .reindexed = 0,
};
cbm_mutex_lock(&w->projects_lock);
cbm_ht_foreach(w->projects, poll_project, &ctx);
cbm_mutex_unlock(&w->projects_lock);
```
cbm_watcher_poll_once holds projects_lock while invoking poll_project, which can call index_fn and run a full pipeline/index (potentially long-running). This blocks cbm_watcher_watch/unwatch (called from MCP session handling) for the duration of indexing and risks lock contention/dead time. A more robust approach is to snapshot the current project list under the lock (or increment refcounts), release the lock, and then perform the expensive index_fn work without holding projects_lock.
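A sketch of the snapshot approach: copy the project pointers out under the lock, release it, then run the slow per-project work (the review's `index_fn`) with no lock held. A stub mutex and a plain pointer array stand in for `cbm_mutex_t` and the hash table; in the real watcher, entry lifetime would still need guaranteeing (e.g. refcounts), since a project could be unwatched mid-scan.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct { int locked; } fake_mutex_t;           /* stand-in for cbm_mutex_t */
static void fake_lock(fake_mutex_t *m)   { m->locked++; }
static void fake_unlock(fake_mutex_t *m) { m->locked--; }

static void bump_project(int *p) { (*p)++; }           /* stand-in for index_fn */

static int poll_once(fake_mutex_t *lock, int **projects, size_t n,
                     void (*slow_index_fn)(int *)) {
    fake_lock(lock);
    int **snap = malloc(n * sizeof *snap);             /* snapshot under the lock */
    if (!snap) { fake_unlock(lock); return -1; }
    memcpy(snap, projects, n * sizeof *snap);
    fake_unlock(lock);                                 /* dropped before slow work */

    for (size_t i = 0; i < n; i++)
        slow_index_fn(snap[i]);                        /* long-running, lock-free */
    free(snap);
    return 0;
}
```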
```c
/* Create and start watcher in background thread */
/* Initialize log mutex before any threads are created */
cbm_ui_log_init();
```
cbm_ui_log_append now drops all lines until cbm_ui_log_init() runs, but main installs the log sink and immediately emits logs (server.start) before calling cbm_ui_log_init(). That means early startup logs will never reach the UI ring buffer. Call cbm_ui_log_init() before cbm_log_set_sink(cbm_ui_log_append) / before the first log emission (or restore a thread-safe init-on-first-use pattern).
```c
int cbm_thread_detach(cbm_thread_t *t) {
    return pthread_detach(t->handle);
}
```
On POSIX, cbm_thread_detach calls pthread_detach(t->handle) but leaves t->handle unchanged. That makes it easy to accidentally cbm_thread_join/cbm_thread_detach again on the same cbm_thread_t and get undefined/erroneous behavior. Consider clearing/poisoning the handle after a successful detach (to match the Win32 path) or tracking a detached state in cbm_thread_t.
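A sketch of the detached-state tracking option: record the detach in the struct and refuse a second detach instead of reusing a stale handle. To keep it standalone, an `int` stands in for the OS thread handle and the actual `pthread_detach` call is shown as a comment; only the guard/poison logic is demonstrated, not the project's real `cbm_thread_t`.

```c
#include <assert.h>
#include <errno.h>

typedef struct {
    int handle;      /* stand-in for pthread_t / Win32 HANDLE */
    int detached;    /* set once a detach has succeeded */
} cbm_thread_t;

static int cbm_thread_detach(cbm_thread_t *t) {
    if (!t || t->detached)
        return EINVAL;           /* already detached: refuse, don't reuse handle */
    /* real code: int rc = pthread_detach(t->handle); if (rc) return rc; */
    t->detached = 1;
    t->handle = -1;              /* poison the handle, matching the Win32 path */
    return 0;
}
```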
```c
/* The string should be truncated to CBM_SZ_4K - 1 (4095) characters. */
ASSERT_EQ((int)strlen(r.tokens[0].text), 4095);
```
This test hardcodes the expected truncated length as 4095, which assumes CBM_SZ_4K == 4096. To keep the test consistent if the buffer size constant ever changes, prefer asserting against (CBM_SZ_4K - 1) (or another exported constant) rather than a literal.
Suggested change:

```diff
-/* The string should be truncated to CBM_SZ_4K - 1 (4095) characters. */
-ASSERT_EQ((int)strlen(r.tokens[0].text), 4095);
+/* The string should be truncated to CBM_SZ_4K - 1 characters. */
+ASSERT_EQ((int)strlen(r.tokens[0].text), (CBM_SZ_4K - 1));
```
The PR description focuses on adding memory helper functions, but this PR also changes watcher concurrency behavior, Cypher lexing/parsing, SQLite error handling, pipeline edge emission, HTTP server threading, and adds multiple .claude/* workflow files. Consider splitting these into separate PRs (or updating the description) so reviewers can assess each behavioral change independently and reduce merge risk.
Adds helper functions which encapsulate repeated patterns throughout the codebase. I have a larger fix available with all the call sites modified, but this needs to be checked out first, because it's the basis of all that. It's four helper functions, similar to the existing `safe_realloc` function.