| title | Troubleshooting | |
|---|---|---|
| sdk | java | |
| spec_sections | | |
| kind | reference | |
| since | 1.0.0 | |
All 15 canonical error codes, their likely causes, and fixes.
Exception: PermissionDeniedException (non-retryable)
Causes
- Agent called `ctx.authorize(namespace, pattern)` and the pattern is not covered by the job's lease.
- The `model.use` lease is missing for the model the agent tried to use.
Fixes
- Expand the lease in `job.submit` to include the required namespace and glob.
- Use the `**` glob for broad access during development; tighten in production.
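The difference between a narrow glob and `**` can be illustrated with the JDK's own glob matcher. This is only a sketch: it assumes lease patterns behave like `java.nio.file` globs, which the source does not confirm, and `GlobCoverage` with its `covers` method is a hypothetical helper, not SDK API.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;

/** Hypothetical helper for illustration; not part of the ARCP SDK. */
final class GlobCoverage {
    private GlobCoverage() {}

    /**
     * Returns true if the glob matches the resource path.
     * Assumption: lease globs follow java.nio.file glob rules, where "*"
     * stays inside one path segment and "**" crosses segment boundaries.
     */
    static boolean covers(String glob, String resource) {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        return matcher.matches(Path.of(resource));
    }
}
```

Under these assumptions, `reports/*` covers `reports/q1.csv` but not `reports/2024/q1.csv`, while `reports/**` covers both, which is why `**` is convenient in development and too broad for production.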
Exception: LeaseSubsetViolationException (non-retryable)
Causes
- A delegating agent tried to grant a sub-agent lease items not covered by its own lease (§10).
Fixes
- Ensure the parent job's lease covers every item you pass to `ctx.delegate(subAgent, subLease)`.
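A pre-flight check before delegating might look like the sketch below. It models a lease as a map from namespace to a set of patterns and treats coverage as exact string containment; both are assumptions, since the runtime's actual subset rule (§10) may also accept narrower patterns. `LeaseSubsetCheck` and the namespace names used in the usage note are hypothetical.

```java
import java.util.Map;
import java.util.Set;

/** Hypothetical helper for illustration; not part of the ARCP SDK. */
final class LeaseSubsetCheck {
    private LeaseSubsetCheck() {}

    /**
     * Returns true if every (namespace, pattern) pair in subLease also
     * appears in parentLease.
     * Assumption: coverage means exact string equality of patterns.
     */
    static boolean isSubset(Map<String, Set<String>> parentLease,
                            Map<String, Set<String>> subLease) {
        for (Map.Entry<String, Set<String>> entry : subLease.entrySet()) {
            Set<String> granted = parentLease.get(entry.getKey());
            if (granted == null || !granted.containsAll(entry.getValue())) {
                return false; // the sub-lease asks for something the parent lacks
            }
        }
        return true;
    }
}
```

For example, a parent holding `{"fs.read": ["data/**"]}` could pass that same entry to a sub-agent, but a sub-lease naming a namespace the parent does not hold would fail the check.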
Exception: JobNotFoundException (non-retryable)
Causes
- `job.subscribe` or `job.cancel` sent with an unknown or expired job ID.
- The runtime restarted and lost in-memory job state.
Fixes
- Check the job ID. If the runtime restarted, resubmit the job (use the same `idempotency_key` to avoid duplicate execution).
Exception: DuplicateKeyException (non-retryable)
Causes
- A `job.submit` arrived with an `idempotency_key` that is already associated with a different payload.
Fixes
- Reuse the same key only for exactly the same submit payload.
- To force a fresh run, change or omit the key.
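One way to guarantee that a key is only ever reused with an identical payload is to derive the key from the payload itself. The sketch below hashes a canonical payload string with SHA-256; `IdempotencyKeys` is a hypothetical helper, and it assumes the runtime accepts an arbitrary 64-character hex string as a key and that you serialize the payload deterministically.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

/** Hypothetical helper for illustration; not part of the ARCP SDK. */
final class IdempotencyKeys {
    private IdempotencyKeys() {}

    /**
     * Derives a key from the payload so that identical payloads always map
     * to the same key and different payloads map to different keys.
     * Assumption: canonicalPayload is a deterministic serialization.
     */
    static String forPayload(String canonicalPayload) {
        try {
            MessageDigest sha256 = MessageDigest.getInstance("SHA-256");
            byte[] digest = sha256.digest(canonicalPayload.getBytes(StandardCharsets.UTF_8));
            return HexFormat.of().formatHex(digest); // lowercase hex, 64 characters
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is mandatory on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }
}
```

To force a fresh run with this scheme, you would add a distinguishing field (such as a run counter) to the payload before hashing, or omit the key altogether.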
Exception: AgentNotAvailableException (non-retryable)
Causes
- The agent name in `job.submit` is not registered with the runtime.
Fixes
- Check for typos in the agent name.
- Verify `ArcpRuntime.builder().agent(name, version, handler)` was called.
Exception: AgentVersionNotAvailableException (non-retryable)
Causes
- `name@version` specified but that version is not registered.
Fixes
- List available versions via `session.welcome.agents`.
- Register the version: `runtime.agents().register(name, version, handler)`.
Exception: CancelledException (non-retryable)
Causes
- `handle.cancel()` was called and the runtime confirmed cancellation.
Fixes
- Expected behaviour. Do not retry a cancelled job unless the business logic requires it.
Exception: TimeoutException (retryable)
Causes
- The job ran past the `timeout_ms` set in `job.submit`.
- The lease's `expires_at` expired while the job was running.
Fixes
- Increase `timeout_ms` in `job.submit`.
- Check that `lease_expires_at` is set far enough in the future.
- See guides/jobs.md.
Exception: ResumeWindowExpiredException (non-retryable)
Causes
- A `session.hello` with `resume_token` arrived after the runtime had evicted the buffer (default window: 60 s; configurable via `ArcpRuntime.builder().resumeWindowSec(int)`).
Fixes
- Reconnect within the resume window.
- Increase the resume window if your network can partition for longer: `resumeWindowSec(300)`.
- If the window has passed, start a fresh session and re-subscribe to running jobs by ID.
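A client can avoid a doomed resume attempt by checking how long it has been offline first. The sketch below is an assumption-laden illustration: `ResumeDecision` is a hypothetical helper, the client clock only approximates the runtime's view of the window, and the runtime remains the authority on whether the buffer still exists.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical helper for illustration; not part of the ARCP SDK. */
final class ResumeDecision {
    enum Action { RESUME, FRESH_SESSION }

    private ResumeDecision() {}

    /**
     * Chooses RESUME while the offline time is strictly inside the window,
     * otherwise FRESH_SESSION.
     * Assumption: the window starts when the client noticed the disconnect.
     */
    static Action decide(Instant disconnectedAt, Instant now, Duration resumeWindow) {
        Duration offline = Duration.between(disconnectedAt, now);
        return offline.compareTo(resumeWindow) < 0 ? Action.RESUME : Action.FRESH_SESSION;
    }
}
```

With the default 60 s window, a client that has been offline for 45 s would try to resume, while one offline for 60 s or more would start a fresh session and re-subscribe to its jobs by ID.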
Exception: HeartbeatLostException (retryable)
Causes
- The client missed two consecutive heartbeat intervals without responding with `session.pong`.
Fixes
- Resume the session (`session.hello` with `resume_token` + `last_event_seq`).
- If heartbeats are too aggressive, increase the interval: `heartbeatIntervalSec(60)` in `ArcpRuntime.Builder`.
- Check for blocking calls on the transport receive thread.
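The "two consecutive missed intervals" rule can be approximated as "no pong for at least twice the heartbeat interval". The sketch below models that approximation to show why a blocked receive thread triggers the error; `HeartbeatMonitor` is a hypothetical class, and the runtime's real bookkeeping may differ.

```java
import java.time.Duration;
import java.time.Instant;

/** Hypothetical model for illustration; not part of the ARCP SDK. */
final class HeartbeatMonitor {
    private final Duration interval;
    private Instant lastPong;

    HeartbeatMonitor(Duration interval, Instant startedAt) {
        this.interval = interval;
        this.lastPong = startedAt;
    }

    /** Records that a pong was received at the given time. */
    void onPong(Instant at) {
        lastPong = at;
    }

    /**
     * Returns true once two full intervals have elapsed since the last pong.
     * Assumption: this is equivalent to missing two consecutive heartbeats.
     */
    boolean isLost(Instant now) {
        return Duration.between(lastPong, now).compareTo(interval.multipliedBy(2)) >= 0;
    }
}
```

If the receive thread blocks for longer than two intervals, `onPong` is never reached in time and the session is treated as lost, even though the network itself is healthy.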
Exception: LeaseExpiredException (non-retryable)
Causes
- The job's `lease_expires_at` passed while the job was still running.
Fixes
- Set `lease_expires_at` to a point far enough in the future.
- If the job is long-running, omit `lease_expires_at` entirely (the lease then expires only when the job terminates).
Exception: BudgetExhaustedException (non-retryable)
Causes
- The agent emitted metric events whose `cost.budget` total exceeded the limit set in the lease.
Fixes
- Raise the budget in the lease: `cost.budget` → `"USD:10.00"`.
- Optimize the agent to emit fewer or smaller metric events.
- See guides/leases.md.
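To reason about how close an agent is to its limit, it can help to model the budget locally. The sketch below parses a `CURRENCY:AMOUNT` string like the one shown above and accumulates costs with `BigDecimal`; `BudgetTracker` is a hypothetical class, and it assumes a total equal to the limit is still within budget, since the error is described as the total having exceeded the limit.

```java
import java.math.BigDecimal;

/** Hypothetical model for illustration; not part of the ARCP SDK. */
final class BudgetTracker {
    private final String currency;
    private final BigDecimal limit;
    private BigDecimal spent = BigDecimal.ZERO;

    private BudgetTracker(String currency, BigDecimal limit) {
        this.currency = currency;
        this.limit = limit;
    }

    /** Parses a budget such as "USD:10.00" (format taken from the fix above). */
    static BudgetTracker parse(String spec) {
        int colon = spec.indexOf(':');
        if (colon <= 0 || colon == spec.length() - 1) {
            throw new IllegalArgumentException("expected CURRENCY:AMOUNT, got " + spec);
        }
        return new BudgetTracker(spec.substring(0, colon),
                                 new BigDecimal(spec.substring(colon + 1)));
    }

    /** Adds a cost; returns true while the running total has not exceeded the limit. */
    boolean record(BigDecimal cost) {
        spent = spent.add(cost);
        return spent.compareTo(limit) <= 0;
    }

    String currency() {
        return currency;
    }

    /** Remaining budget; negative once the limit has been exceeded. */
    BigDecimal remaining() {
        return limit.subtract(spent);
    }
}
```

`BigDecimal` is used rather than `double` so that amounts such as `0.10` accumulate without floating-point drift.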
Exception: InvalidRequestException (non-retryable)
Causes
- Malformed envelope (missing required field, wrong type).
- `lease_expires_at` is in the past or uses a UTC offset (`+00:00`) instead of `Z`.
- Feature used that was not negotiated in `session.welcome`.
Fixes
- Use the SDK builders rather than hand-rolling JSON.
- Always format `lease_expires_at` with a `Z` suffix (e.g., `"2024-09-01T00:00:00Z"`).
- Check that both peers have negotiated the feature before using it.
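If you do build the timestamp yourself, `java.time.Instant` is a convenient way to get the `Z` form. The sketch below is a hypothetical helper (`LeaseTimestamps` is not SDK API); truncating to whole seconds is an assumption based on the example value above, not a documented requirement.

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.temporal.ChronoUnit;

/** Hypothetical helper for illustration; not part of the ARCP SDK. */
final class LeaseTimestamps {
    private LeaseTimestamps() {}

    /**
     * Formats an instant as ISO-8601 UTC with a "Z" suffix.
     * Instant.toString() always prints UTC with "Z" and always includes seconds.
     * Assumption: whole-second precision is what the runtime expects.
     */
    static String format(Instant expiresAt) {
        return expiresAt.truncatedTo(ChronoUnit.SECONDS).toString();
    }

    /** Converts a timestamp carrying any UTC offset (e.g. "+00:00") to the "Z" form. */
    static String normalize(String offsetTimestamp) {
        return format(OffsetDateTime.parse(offsetTimestamp).toInstant());
    }
}
```

Going through `Instant` matters here: `OffsetDateTime.toString()` would drop a zero seconds field, whereas `Instant.toString()` keeps it.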
Exception: UnauthenticatedException (non-retryable)
Causes
- `session.hello` did not carry a bearer token.
- The token was rejected by the runtime's `BearerVerifier`.
Fixes
- Set `ArcpClient.Builder.bearer(token)`.
- Check that the token matches the verifier: `BearerVerifier.staticToken(…)` or your custom SPI implementation.
- See guides/auth.md.
Exception: InternalErrorException (retryable)
Causes
- Unhandled exception inside the runtime or agent.
Fixes
- Check runtime logs (SLF4J; configure a binding such as Logback).
- Agent code that throws unchecked exceptions surfaces as `INTERNAL_ERROR`; return `JobOutcome.Failure` explicitly instead.
```java
// See guides/errors.md for the full pattern.
for (int attempt = 0; attempt <= MAX_RETRIES; attempt++) {
    try {
        return client.submit(submit).result().get();
    } catch (ExecutionException ex) {
        // Retry only retryable failures, and only while attempts remain.
        if (ex.getCause() instanceof RetryableArcpException && attempt < MAX_RETRIES) {
            Thread.sleep(backoff(attempt));
            continue;
        }
        throw ex;
    }
}
```

`RetryableArcpException` is the base class for `TimeoutException`, `HeartbeatLostException`, and `InternalErrorException`. All others extend `NonRetryableArcpException`. See guides/errors.md.
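The retry loop calls `backoff(attempt)` without defining it. A minimal sketch of one possible implementation follows, using capped exponential backoff; the `Backoff` class and the `BASE_MS` and `MAX_MS` values are assumptions chosen for illustration, not SDK defaults.

```java
/** Hypothetical helper for illustration; not part of the ARCP SDK. */
final class Backoff {
    static final long BASE_MS = 200;    // assumed delay before the first retry
    static final long MAX_MS = 10_000;  // assumed upper bound on any delay

    private Backoff() {}

    /**
     * Returns the delay in milliseconds before retry number `attempt`
     * (0-based): BASE_MS doubled per attempt, capped at MAX_MS.
     */
    static long backoff(int attempt) {
        if (attempt < 0) {
            throw new IllegalArgumentException("attempt must be >= 0");
        }
        // Bound the shift so the left shift can never overflow a long.
        int shift = Math.min(attempt, 20);
        return Math.min(MAX_MS, BASE_MS << shift);
    }
}
```

With these values the delays run 200, 400, 800, 1600 ms and so on until they reach the 10 s cap. When many clients may retry at the same moment, adding random jitter to each delay is a common refinement.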
Set `JAVA_HOME` to a JDK 21+ installation. The Gradle wrapper does not bundle a JDK.

Spotless runs only on JDK 21 (see `.github/workflows/ci.yml`). Run `./gradlew spotlessApply` locally to reformat before pushing.

Ensure `--release 21` is set in `build.gradle.kts`. The SDK targets JDK 21 bytecode even when built on JDK 25.