✨ feat: Implement typed error classification for pod lifecycle events #10

Merged
SebastienLaurent-CF merged 1 commit into main from feat/error-classification-pod-lifecycle
Nov 23, 2025
@SebastienLaurent-CF
Contributor

Add type-aware error handling to locators for intelligent retry strategies. Instead of treating all errors identically, distinguish between transient errors (retry with backoff) and permanent errors (fail fast).

Implementation

New Error Package (internal/locator/errors.go)

  • ErrorType enum with 10 classified error types:
    • Transient: NetworkTransient, APITransient, (pod initializing)
    • Permanent: ResourceNotFound, PodNotRunning, PodFailed, ConfigInvalid, PermissionDenied
    • Uncertain: NoPodAvailable (might retry longer based on context)
  • LocateError wrapper with type info + error unwrapping
  • Helper functions for creating typed errors: NewResourceNotFoundError(), NewPodFailedError(), etc.

Updated Locators (pod.go, service.go, selector_based_locator.go)

  • Detect error types using Kubernetes API predicates (IsNotFound, IsTimeout, IsForbidden, etc.)
  • Return typed LocateError instead of generic fmt.Errorf
  • Classify API errors (timeout, forbidden, not found) separately from resource state errors
  • Port mapping errors classified as ConfigInvalid (user configuration issues)

Test Updates (locator_test.go)

  • Updated assertions to match the new, more specific error messages
  • All 129 tests passing with zero regressions
  • Error messages now clearly indicate error type: "X not found" vs "X is in failed state"

Benefits

✅ Enables forwarder to implement intelligent retry strategies per error type
✅ Transient errors: exponential backoff retry
✅ Permanent errors: fail fast or give up after a few attempts
✅ Better error messages: developers see what went wrong (config vs API vs resource)
✅ Foundation for improved observability: errors can be classified and monitored

@SebastienLaurent-CF merged commit 6196e5e into main on Nov 23, 2025
1 of 2 checks passed
@SebastienLaurent-CF deleted the feat/error-classification-pod-lifecycle branch on November 23, 2025 17:41