refactor: Add error code in worker.#116
Conversation
## Walkthrough
This change introduces a new error code system for the worker component by defining a `WorkerErrorCodeEnum` and integrating it with the `ystdlib::error_handling` framework. The error handling in `worker.cpp` is refactored to use `ystdlib::error_handling::Result` types with explicit error codes instead of `std::optional` or boolean returns. The build configuration is updated to include the new error code files. This refactor enables more granular and consistent error reporting within the worker logic.
## Changes
| File(s) | Change Summary |
|------------------------------------------------------------|------------------------------------------------------------------------------------------------------------------|
| src/spider/CMakeLists.txt | Updated build configuration to include `worker/WorkerErrorCode.cpp` and `worker/WorkerErrorCode.hpp` in sources. |
| src/spider/worker/WorkerErrorCode.cpp<br>src/spider/worker/WorkerErrorCode.hpp | Introduced new error code enum, error code type alias, and error category with message mapping for worker errors. |
| src/spider/worker/worker.cpp | Refactored error handling in core functions to use typed error codes and `Result` types instead of optionals/booleans. |
## Sequence Diagram(s)
```mermaid
sequenceDiagram
participant TaskLoop
participant Worker
participant Storage
participant Executor
TaskLoop->>Worker: setup_task()
Worker->>Storage: connect, fetch task details
Storage-->>Worker: success or WorkerErrorCode
Worker-->>TaskLoop: Result<task buffers, WorkerErrorCode>
TaskLoop->>Worker: parse_outputs()
Worker-->>TaskLoop: Result<outputs, WorkerErrorCode>
TaskLoop->>Worker: handle_executor_result()
Worker->>Executor: run task
Executor-->>Worker: result
Worker->>Storage: submit result
Storage-->>Worker: success or WorkerErrorCode
Worker-->>TaskLoop: Result<void, WorkerErrorCode>Assessment against linked issues
Possibly related PRs
Suggested reviewers
|
There was a problem hiding this comment.
Actionable comments posted: 0
🔭 Outside diff range comments (2)
src/spider/worker/worker.cpp (2)
243-270:⚠️ Potential issuePossible out-of-bounds access when
result_buffers.size() < task.get_num_outputs()The loop indexes
result_buffers[i]up totask.get_num_outputs().
If the executor unexpectedly returns fewer buffers, this will dereference past the end and trigger UB before the size mismatch can be reported.Guard early:
if (result_buffers.size() != task.get_num_outputs()) { spdlog::error("Result buffer count mismatch (expected {}, got {})", task.get_num_outputs(), result_buffers.size()); return WorkerErrorCodeEnum::TaskOutputInvalid; }This prevents a crash and yields a meaningful error.
326-340: 🛠️ Refactor suggestionPropagate detailed parsing error instead of generic failure
handle_executor_resultconverts any parsing error toTaskOutputInvalid, even whenparse_outputshas already produced a richer code (e.g.,TaskOutputUnavailable).
Return the exactoutput_result.error()unmodified to preserve diagnostics:- return spider::worker::WorkerErrorCodeEnum::TaskOutputInvalid; + return output_result.error();
🧹 Nitpick comments (6)
src/spider/CMakeLists.txt (1)
64-66: Avoid mixing interface headers into the SOURCES list
WorkerErrorCode.hppis a public‐interface header yet it is added toSPIDER_WORKER_SOURCES.
While CMake will tolerate headers in a source list, generators (Xcode/VS) will treat it as a “compilable” file, cluttering the project tree and occasionally triggering IDE-specific warnings.set(SPIDER_WORKER_SOURCES … - worker/WorkerErrorCode.cpp - worker/WorkerErrorCode.hpp # ← move this + worker/WorkerErrorCode.cpp … ) # Then add to a header list exposed with PUBLIC visibility, e.g. target_sources(spider_worker PUBLIC worker/WorkerErrorCode.hpp)Keeping implementation and interface lists separate improves IDE ergonomics and avoids accidental compilation of headers.
src/spider/worker/WorkerErrorCode.hpp (1)
8-18: Consider introducing anUnknown/Unspecifiedsentinel and revisiting the Storage placeholder
- A generic
Unknown(orUnspecified) value is helpful when converting foreign or unexpected errors into the worker domain without losing information.- The in-code comment states “Storage related errors will be removed …” yet the enum still exposes
StorageError. If removal is planned, mark it[[deprecated]]or drop it now to avoid API churn.enum class WorkerErrorCodeEnum : uint8_t { Success = 0, CmdLineArgumentInvalid = 1, TaskArgumentInvalid = 2, TaskFailed = 3, TaskOutputUnavailable = 4, TaskOutputInvalid = 5, - // Storage related errors will be removed and use storage error codes instead - StorageError = 6, + StorageError [[deprecated("Use storage::StorageErrorCode instead")]] = 6, + Unknown = 255 };This keeps the contract future-proof while signalling imminent deprecation.
src/spider/worker/WorkerErrorCode.cpp (2)
12-18: Define the category object inside the namespace to avoid ODR traps
cWorkerErrorCategoryNameis currently a globalconstexprstring. Moving it insidespider::workeravoids potential clashes if another TU defines the same symbol.-constexpr std::string_view cWorkerErrorCategoryName = "Worker Error Code"; +namespace spider::worker { +constexpr std::string_view cWorkerErrorCategoryName = "Worker Error Code"; +} // namespace spider::workerRemember to qualify references accordingly.
20-41: Exhaustiveswitch: returnUnknowninstead of a string literalThe default branch returns a raw string
"Unknown error".
Returning a concrete enum (WorkerErrorCodeEnum::Unknown) or asserting unreachable would be more type-safe. At the very least, factor the literal into a static constant to avoid duplication.- default: - return "Unknown error"; + default: + return "Unknown error";(If you adopt the
Unknownsentinel proposed earlier, this branch can become unreachable.)src/spider/worker/worker.cpp (2)
205-215: Prefer returningErrorCodeexplicitly to avoid overloaded conversions
return WorkerErrorCodeEnum::StorageError;relies on an implicit conversion from the enum to
WorkerErrorCode.
While the alias currently provides such a constructor, being explicit improves readability and guards against future API changes:- return spider::worker::WorkerErrorCodeEnum::StorageError; + return spider::worker::WorkerErrorCode{spider::worker::WorkerErrorCodeEnum::StorageError};
224-232: Unnecessarystd::moveon a local when returning
return std::move(optional_arg_buffers.value());Moving a local that is about to be returned inhibits NRVO and produces the same code as a copy elision in modern compilers, sometimes even worse. Simply return the value:
- return std::move(optional_arg_buffers.value()); + return optional_arg_buffers.value();[performance]
🧰 Tools
🪛 Cppcheck (2.10-2)
[performance] 231-231: Using std
(returnStdMoveLocal)
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
src/spider/CMakeLists.txt(1 hunks)src/spider/worker/WorkerErrorCode.cpp(1 hunks)src/spider/worker/WorkerErrorCode.hpp(1 hunks)src/spider/worker/worker.cpp(12 hunks)
🧰 Additional context used
🪛 Cppcheck (2.10-2)
src/spider/worker/worker.cpp
[performance] 231-231: Using std
(returnStdMoveLocal)
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: non-storage-unit-tests (ubuntu-22.04)
- GitHub Check: non-storage-unit-tests (ubuntu-24.04)
- GitHub Check: lint
🔇 Additional comments (3)
src/spider/worker/WorkerErrorCode.hpp (1)
20-23: ExposeWorkerErrorCodedirectly in the namespace documentation
using WorkerErrorCode = …is great, but add a brief comment explaining that the alias enables implicit construction fromWorkerErrorCodeEnum– this will help new contributors understand why plain enum values can be returned fromResult<T, WorkerErrorCode>.src/spider/worker/worker.cpp (2)
35-40: Good workaround for the Valgrind header clashTemporarily undefining
QUICKCPPLIB_ENABLE_VALGRINDbefore includingResult.hppsidesteps the
Abseil/QuickCPPLib conflict (issue #59) without polluting the public interface. Nice!
420-426: Minor logic tweak: preferresult.is_success()overhas_value()forResult<void, …>
Result<void, E>::has_value()works, but semanticallyis_success()(if available in ystdlib) conveys intent more clearly—there is no contained value to “have”.Does
ystdlib::error_handling::Resultexpose anis_success()helper? If so, consider switching.
Co-authored-by: davidlion <davidlion2@protonmail.com>
|
On hold until quickcpplib/outcome dependency is resolved to avoid needing |
There was a problem hiding this comment.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
src/spider/CMakeLists.txt(1 hunks)src/spider/worker/worker.cpp(12 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- src/spider/CMakeLists.txt
⏰ Context from checks skipped due to timeout of 90000ms (3)
- GitHub Check: lint
- GitHub Check: non-storage-unit-tests (ubuntu-24.04)
- GitHub Check: non-storage-unit-tests (ubuntu-22.04)
🔇 Additional comments (11)
src/spider/worker/worker.cpp (11)
57-57: LGTM! Proper include for the new error handling system.The inclusion of
WorkerErrorCode.hppis correctly added to support the new error handling refactoring.
198-199: Excellent documentation and type signature improvement.The function documentation clearly explains the new return behaviour, and the
Resulttype provides much better error semantics than the previousstd::optional.Also applies to: 205-206
214-214: Well-structured error code mapping.The refactoring correctly maps different failure scenarios to appropriate error codes:
- Storage connection failures →
StorageError- Task fetch failures →
StorageError- Invalid task arguments →
TaskArgumentInvalidThis provides much clearer error semantics than returning
std::nullopt.Also applies to: 222-222, 229-229
231-231: Proper success case handling.The function correctly returns the successful result by moving the value, maintaining efficiency while adhering to the new
Resultpattern.
234-245: Enhanced function documentation and signature.The documentation clearly explains the new return behaviour, and the
Resulttype provides better error handling than the previous implementation.
263-263: Appropriate error code for parsing failure.The function correctly returns
TaskOutputInvalidwhen output parsing fails, which is more descriptive than the previous approach.
282-284: Improved function documentation and return type.The documentation is clear about the new return behaviour, and
Result<void, WorkerErrorCode>is the appropriate type for operations that either succeed or fail with an error code.Also applies to: 291-291
299-299: Comprehensive error code coverage.The function correctly maps different failure scenarios to specific error codes:
- Storage connection failures →
StorageError- Task execution failures →
TaskFailed- Result buffer parsing failures →
TaskOutputUnavailable- Storage submission failures →
StorageErrorThis provides much clearer error semantics than the previous boolean return.
Also applies to: 310-310, 323-323, 358-358
326-330: Proper Result type integration.The code correctly:
- Uses
Resulttype for the output parsing operation- Checks
has_value()to determine success/failure- Propagates errors using
output_result.error()- Extracts successful values using
output_result.value()This demonstrates proper integration with the new error handling system.
Also applies to: 339-339, 342-342
360-360: Correct success indication.Using
ystdlib::error_handling::success()is the proper way to indicate successful completion for aResult<void, ErrorCode>type.
387-396: Consistent Result handling in the main loop.The
task_loopfunction correctly handles the newResulttypes:
- Properly declares and uses
Resulttypes for setup operations- Uses
has_value()to check success/failure- Extracts values using
value()method- Handles both setup and execution result checking consistently
The error handling flow maintains the existing retry logic while providing better error reporting.
Also applies to: 420-426
Description
Refactor worker to use error code and
Result.Resolves #106. Resolves #110.
Checklist
breaking change.
Validation performed
Summary by CodeRabbit