[Bug] timed_with_status decorator silently swallows exceptions and returns None #1523

@Ptah-CT

Summary

timed_with_status in src/memos/utils.py catches every exception, but when no fallback callable is configured the wrapper falls through to an implicit return None. Decorated functions therefore return None on any failure instead of raising, which masks the real error.

Where

src/memos/utils.py, lines 43-56 (current main @ cddc252).

try:
    result = fn(*args, **kwargs)
    success_flag = True
    return result
except Exception as e:
    exc_type = type(e)
    stack_info = "".join(traceback.format_stack()[:-1])
    exc_message = f"{stack_info}{traceback.format_exc()}"
    success_flag = False

    if fallback is not None and callable(fallback):
        result = fallback(e, *args, **kwargs)
        return result
    # ← no `raise` here; wrapper falls through to implicit return None
finally:
    ...
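The fall-through can be reproduced in isolation. The following is a minimal sketch, not the actual code in src/memos/utils.py: timed_with_status_sketch and the failing generate are hypothetical stand-ins, and timing and logging are omitted.

```python
import functools


def timed_with_status_sketch(fallback=None):
    """Simplified stand-in for the wrapper quoted above (timing and logging omitted)."""

    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            try:
                return fn(*args, **kwargs)
            except Exception as e:
                if fallback is not None and callable(fallback):
                    return fallback(e, *args, **kwargs)
                # No `raise` here: control leaves the except block normally.
            finally:
                pass  # the real wrapper logs timing and status here
            # The function body ends here, so Python returns None implicitly.

        return wrapper

    return decorator


@timed_with_status_sketch()
def generate(prompt):
    # Hypothetical stand-in for an LLM call that fails upstream.
    raise ValueError("400: chat content is empty")


result = generate("hello")
print(result)  # None
```

The caller receives None and never sees the ValueError, which is the behaviour described in the summary.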

Impact

OpenAILLM.generate is decorated with @timed_with_status(...). When the upstream LLM returns a 4xx/5xx response (e.g. MiniMax's 400 "chat content is empty (2013)" for a system-only message), the resulting exception (here BadRequestError) is caught, logged as status: FAILED, and then swallowed, so generate() returns None to its caller.

Downstream, that None flows into clean_json_response(response) (src/memos/mem_os/utils/format_utils.py:1403), which crashes with:

AttributeError: 'NoneType' object has no attribute 'replace'

Two consequences:

  1. The user sees a confusing AttributeError instead of the real 400 from the LLM. Diagnosis is hard because nothing in the traceback names the LLM call.
  2. It is a silent failure at the level of the API contract. Callers cannot tell whether generate() succeeded with empty output or failed with an exception, because the same None represents both.
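The first consequence can be illustrated with a small sketch. The helper bodies here are assumptions: clean_json_response_sketch only mimics the fact that the real helper calls .replace on its argument, and generate_swallowed stands in for the decorated generate() after its exception has been swallowed.

```python
def clean_json_response_sketch(response):
    # Hypothetical stand-in: the real helper's body is not shown in this issue.
    # All that matters here is that it calls a str method on its argument.
    return response.replace("\n", " ").strip()


def generate_swallowed(prompt):
    # Mimics the decorated generate() after the upstream error was swallowed.
    return None


def suggest(prompt):
    response = generate_swallowed(prompt)
    return clean_json_response_sketch(response)


try:
    suggest("system-only prompt")
except AttributeError as err:
    message = str(err)
    print(message)
```

The exception that surfaces points at the string helper rather than at the LLM call, so the original 400 has to be recovered from the log instead of the traceback.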

Reproduction

  1. Set MOS_CHAT_MODEL=MiniMax-M2.7, OPENAI_API_BASE=https://api.minimax.io/v1, valid OPENAI_API_KEY.
  2. Start memos server, ensure default cube exists.
  3. POST /product/suggestions with a mem_cube_id whose recent memories are empty (so the suggestion prompt has only a system message).
  4. Observe HTTP 500 'NoneType' object has no attribute 'replace' in the response, and [TIMER_WITH_STATUS] OpenAI LLM took 5051 ms, status: FAILED, error_type: BadRequestError, error_message: ... immediately above it in the log.

Proposed fix

Add an explicit raise after the fallback branch:

                if fallback is not None and callable(fallback):
                    result = fallback(e, *args, **kwargs)
                    return result
                raise
            finally:
                ...

This preserves existing fallback semantics and makes the no-fallback path fail-fast.
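A minimal sketch of the intended behaviour after the change, under the same simplifications as before: timed_with_status_fixed, UpstreamError, and the two generate_* functions are hypothetical names used only for illustration, and the real finally block is reduced to a placeholder.

```python
import functools


def timed_with_status_fixed(fallback=None):
    """Simplified stand-in for the wrapper with the proposed `raise` added."""

    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            try:
                return fn(*args, **kwargs)
            except Exception as e:
                if fallback is not None and callable(fallback):
                    return fallback(e, *args, **kwargs)
                raise  # re-raise the original exception with its traceback
            finally:
                pass  # timing/status logging would still run here

        return wrapper

    return decorator


class UpstreamError(Exception):
    """Hypothetical stand-in for an LLM client error such as BadRequestError."""


@timed_with_status_fixed()
def generate_no_fallback(prompt):
    raise UpstreamError("400: chat content is empty")


@timed_with_status_fixed(fallback=lambda e, *args, **kwargs: "fallback text")
def generate_with_fallback(prompt):
    raise UpstreamError("400: chat content is empty")


try:
    generate_no_fallback("hi")
    raised = False
except UpstreamError:
    raised = True

print(raised)  # True
print(generate_with_fallback("hi"))  # fallback text
```

A bare raise inside the except block re-raises the exception currently being handled, and the finally clause still runs before the exception propagates, so the status logging is unaffected.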

I will open a PR with this change against main.

Related

The same swallowing likely contributes to other reports where downstream code receives unexpected None/empty values from LLM helpers (e.g. #1324, "memory_search always returns no results with reasoning-enabled models", which has a different root cause but shows the same pattern of an LLM-call failure being invisible to the caller).

Labels

ai-done: AI task completed successfully
bug: Something isn't working (malfunction)
memos: Core MemOS logic (memory, MCP, scheduler, API, database); core module
