docs: add reconnect/restart steps after reboot to troubleshooting guide #474

Closed
RitikKadyan wants to merge 1 commit into NVIDIA:main from RitikKadyan:docs/restart-commands-469

@RitikKadyan
Contributor

@RitikKadyan RitikKadyan commented Mar 20, 2026

Summary

Adds a "Reconnecting after a reboot" section to the troubleshooting guide with step-by-step instructions for restoring a NemoClaw session after a machine restart. Also clarifies the existing "Sandbox shows as stopped" entry to distinguish reboot scenarios from other causes.

Related Issue

Fixes #469

Changes

  • Added "Reconnecting after a reboot" section to docs/reference/troubleshooting.md
  • Steps cover verifying Docker, checking sandbox state with openshell sandbox list and nemoclaw status, reconnecting with nemoclaw <name> connect, re-running onboard if needed, and restarting auxiliary services
  • Updated "Sandbox shows as stopped" to clarify non-reboot scenarios
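
The recovery flow listed above can be sketched as a small Python helper that returns the commands in order. This is a minimal illustration, not code from the PR: the function name `recovery_commands` is hypothetical, and `docker info` is an assumed way to verify Docker, since the PR description does not name the exact check. The `nemoclaw` and `openshell` commands are the ones quoted in this thread, and the author notes the flow was not tested end-to-end.

```python
from typing import List


def recovery_commands(name: str, sandbox_exists: bool) -> List[str]:
    """Return the post-reboot commands in the order the PR describes.

    `name` is the sandbox name; `sandbox_exists` is what the inspection
    steps reported. Both parameters are hypothetical helpers for this sketch.
    """
    commands = [
        "docker info",             # assumed check that the Docker daemon is up
        "openshell sandbox list",  # inspect sandbox state
        "nemoclaw status",         # check registered sandboxes
    ]
    if sandbox_exists:
        # The sandbox survived the reboot: reconnect to it.
        commands.append(f"nemoclaw {name} connect")
    else:
        # The sandbox is gone: re-run onboarding to recreate it.
        commands.append("nemoclaw onboard")
    # Restart auxiliary services last.
    commands.append("nemoclaw start")
    return commands


if __name__ == "__main__":
    for command in recovery_commands("my-assistant", sandbox_exists=True):
        print(command)
```

Returning a list instead of running the commands keeps the sketch safe to execute on a machine without NemoClaw installed.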

Type of Change

  • Code change for a new feature, bug fix, or refactor.
  • Code change with doc updates.
  • Doc only. Prose changes without code sample modifications.
  • Doc only. Includes code sample changes.

Testing

  • make check passes.
  • npm test passes.
  • make docs builds without warnings. (for doc-only changes)

Note: These steps are based on the existing CLI reference and README documentation. I have not tested the full reboot flow end-to-end on a live NemoClaw installation. Would appreciate a maintainer verifying the exact reconnect behavior before merging.

Checklist

General

Code Changes

  • make format applied (TypeScript and Python).
  • Tests added or updated for new or changed behavior.
  • No secrets, API keys, or credentials committed.
  • Doc pages updated for any user-facing behavior changes (new commands, changed defaults, new features, bug fixes that contradict existing docs).

Doc Changes

  • Follows the style guide. Try running the update-docs agent skill to draft changes while complying with the style guide. For example, prompt your agent with "/update-docs catch up the docs for the new changes I made in this PR."
  • New pages include SPDX license header and frontmatter, if creating a new page.
  • Cross-references and links verified.

Summary by CodeRabbit

  • Documentation
    • Added a troubleshooting guide with step-by-step recovery instructions for reconnecting after a reboot.

Adds a 'Reconnecting after a reboot' section to the troubleshooting
guide with step-by-step instructions for restoring a NemoClaw session
after a machine restart.

Fixes NVIDIA#469
@coderabbitai
Contributor

coderabbitai bot commented Mar 20, 2026

Walkthrough

Documentation enhancement adding a "Reconnecting after a reboot" troubleshooting section to guide users through recovery steps, including Docker verification, sandbox inspection, and reconnection procedures using nemoclaw commands.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Troubleshooting Documentation (docs/reference/troubleshooting.md) | Added new "Reconnecting after a reboot" section with step-by-step recovery flow: verify Docker status, inspect sandboxes via openshell sandbox list, check registered sandboxes with nemoclaw status, reconnect to stopped sandboxes or recreate with nemoclaw onboard, and restart services with nemoclaw start. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~8 minutes

Poem

🐰 A sandbox reboots, but fear not, friend!
With nemoclaw's commands, the tale won't end—
List, inspect, reconnect with care,
Your agents await you, still dwelling there! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'docs: add reconnect/restart steps after reboot to troubleshooting guide' clearly and concisely summarizes the main change: adding documentation for reconnecting after a reboot. |
| Linked Issues check | ✅ Passed | The PR fully addresses issue #469 by documenting the step-by-step process to reconnect to a NemoClaw sandbox after a machine reboot, including Docker verification, sandbox status checks, and reconnection/recreation commands. |
| Out of Scope Changes check | ✅ Passed | All changes are directly related to the linked issue #469 objective of documenting reconnection steps after a reboot; no unrelated modifications are present. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |

Contributor

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (1)
docs/reference/troubleshooting.md (1)

163-163: Use active voice in this sentence.

“has been removed entirely” is passive; please rewrite in active voice to match the docs style requirements.

As per coding guidelines, "Active voice required. Flag passive constructions."

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/reference/troubleshooting.md` at line 163, Rewrite the sentence "This
shows whether your sandbox still exists but is stopped, or has been removed
entirely." to use active voice; replace the passive clause "has been removed
entirely" with an active construction such as "you have removed it entirely" so
the full sentence reads: "This shows whether your sandbox still exists but is
stopped, or you have removed it entirely."
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@docs/reference/troubleshooting.md`:
- Line 163: Rewrite the sentence "This shows whether your sandbox still exists
but is stopped, or has been removed entirely." to use active voice; replace the
passive clause "has been removed entirely" with an active construction such as
"you have removed it entirely" so the full sentence reads: "This shows whether
your sandbox still exists but is stopped, or you have removed it entirely."

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 24318169-7da9-46c2-b756-d5da9ca5533e

📥 Commits

Reviewing files that changed from the base of the PR and between dbfd78c and dc648a2.

📒 Files selected for processing (1)
  • docs/reference/troubleshooting.md

@wscurran wscurran added the documentation label Mar 20, 2026
@wscurran
Contributor

Thanks for adding reconnect/restart steps to the troubleshooting guide. This should help users get back up and running after a reboot.

@wscurran wscurran added the Getting Started and priority: high labels Mar 20, 2026
@RitikKadyan
Contributor Author

> Thanks for adding reconnect/restart steps to the troubleshooting guide. This should help users get back up and running after a reboot.

Of course!

ericksoa added a commit that referenced this pull request Mar 25, 2026
…0.0.15

EXPERIMENTAL — POC branch to validate three-tier config resolution:
  1. Frozen openclaw.json (gateway.auth.token, CORS — always immutable)
  2. Policy defaults (config_overrides in openclaw-sandbox.yaml)
  3. User runtime overrides (nemoclaw config-set → overrides file → hot-reload)

OpenClaw shim patch (patches/openclaw-config-overrides.patch):
- Adds OPENCLAW_CONFIG_OVERRIDES_FILE env var support to config loader
- Deep-merges overrides onto frozen config, stripping gateway.* for security
- Adds overrides file to chokidar watcher for hot-reload
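
The three-tier resolution and the gateway-stripping merge described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the actual patch, which targets the OpenClaw config loader and is not shown in this thread. The names `deep_merge` and `resolve_config`, and the sample keys `agent.model` and `agent.timeout`, are hypothetical.

```python
import copy
from typing import Any, Dict

Config = Dict[str, Any]


def deep_merge(base: Config, overlay: Config) -> Config:
    """Return a new dict with `overlay` merged recursively onto `base`."""
    merged = copy.deepcopy(base)
    for key, value in overlay.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge them key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars, lists, and type mismatches: the overlay wins.
            merged[key] = copy.deepcopy(value)
    return merged


def resolve_config(frozen: Config, policy_defaults: Config, user_overrides: Config) -> Config:
    """Combine the three tiers; gateway.* always comes from the frozen config."""
    # Tiers 2 and 3: user runtime overrides win over policy defaults.
    overrides = deep_merge(policy_defaults, user_overrides)
    # Strip the whole gateway subtree so overrides cannot touch gateway.*.
    overrides.pop("gateway", None)
    # Tier 1: the frozen config supplies everything the overrides leave alone.
    return deep_merge(frozen, overrides)


if __name__ == "__main__":
    frozen = {"gateway": {"auth": {"token": "frozen-token"}}, "agent": {"model": "a", "timeout": 30}}
    policy = {"agent": {"timeout": 60}}
    user = {"agent": {"model": "b"}, "gateway": {"auth": {"token": "ignored"}}}
    print(resolve_config(frozen, policy, user))
```

Stripping the `gateway` subtree before the final merge, and not after it, means the frozen values are never overwritten even transiently.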

OpenShell minimum bumped to v0.0.15:
- Auto-TLS termination (PR #544) — removes need for tls: terminate
- Security hardening SEC-002–010 (PR #548)
- Runtime settings channel (PR #474)
- Version check now enforced in onboard preflight
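
A minimum-version check like the one mentioned above needs a numeric comparison, because comparing version strings directly would rank "0.0.9" above "0.0.15". The sketch below shows one way to do it in Python. It is an assumption about the general technique, not the preflight code itself, and the names `parse_version` and `meets_minimum` are hypothetical.

```python
from typing import Tuple

# Minimum OpenShell version stated in the commit message.
MINIMUM_OPENSHELL = (0, 0, 15)


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn 'v0.0.15' or '0.0.15' into (0, 0, 15)."""
    return tuple(int(part) for part in text.strip().lstrip("v").split("."))


def meets_minimum(installed: str, minimum: Tuple[int, ...] = MINIMUM_OPENSHELL) -> bool:
    """Compare versions component by component as integers."""
    return parse_version(installed) >= minimum


if __name__ == "__main__":
    for candidate in ("v0.0.9", "v0.0.15", "0.1.0"):
        print(candidate, meets_minimum(candidate))
```

The sketch handles plain dotted numeric versions only; pre-release suffixes would need extra parsing.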

Policy changes:
- Remove 35 deprecated tls: terminate annotations (base + all presets)
- Remove permissive wildcard L7 rules from claude_code/nvidia endpoints
- Add config_overrides section defining mutable fields + defaults

New commands:
- nemoclaw <sandbox> config-set --key <path> --value <value>
- nemoclaw <sandbox> config-get [--key <path>]
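
To illustrate how a `--key <path>` and `--value <value>` pair could map onto a nested overrides structure, here is a minimal Python sketch. It assumes `<path>` is a dot-separated key path, which the commit message does not confirm, and the helpers `set_by_path` and `get_by_path` are hypothetical names, not part of the NemoClaw CLI.

```python
from typing import Any, Dict, Optional


def set_by_path(overrides: Dict[str, Any], path: str, value: Any) -> None:
    """Set a value at a dot-separated path, creating nested dicts as needed."""
    keys = path.split(".")
    node = overrides
    for key in keys[:-1]:
        child = node.get(key)
        if not isinstance(child, dict):
            # Replace a missing or non-dict entry with a fresh mapping.
            child = node[key] = {}
        node = child
    node[keys[-1]] = value


def get_by_path(overrides: Dict[str, Any], path: Optional[str] = None) -> Any:
    """Return the value at `path`, the whole mapping if no path is given,
    or None if the path does not exist."""
    if path is None:
        return overrides
    node: Any = overrides
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


if __name__ == "__main__":
    store: Dict[str, Any] = {}
    set_by_path(store, "agent.timeout", 60)  # hypothetical key
    print(get_by_path(store, "agent.timeout"))
    print(get_by_path(store))
```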
@prekshivyas
Contributor

Hey @RitikKadyan — thanks for working on this! Just a heads-up that we have an overlapping PR at #911 that also addresses #469.

After digging into the onboard.js and OpenShell code, we found a couple of things that might be worth noting:

  1. Gateway restart step: After a reboot, the OpenShell gateway doesn't auto-start. Running openshell gateway start --name nemoclaw is needed before the sandbox can recover — without it, openshell sandbox list won't return results.
  2. Data loss warning: nemoclaw onboard destroys and recreates the sandbox, which means workspace files (SOUL.md, USER.md, etc.) are lost. We added a warning admonition about backing up first.
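
The two points above change the ordering of the recovery flow: the gateway has to start before any sandbox inspection, and `nemoclaw onboard` should not run without a backup. A minimal Python sketch of that ordering follows. The function name `recovery_with_gateway` and its parameters are hypothetical, and how a backup is made is outside what this thread describes.

```python
from typing import List


def recovery_with_gateway(name: str, needs_onboard: bool, backed_up: bool) -> List[str]:
    """Order the recovery commands with the gateway restart first."""
    # Sandbox listing returns nothing until the gateway is up.
    commands = ["openshell gateway start --name nemoclaw", "openshell sandbox list"]
    if not needs_onboard:
        commands.append(f"nemoclaw {name} connect")
        return commands
    if not backed_up:
        # onboard destroys and recreates the sandbox, so refuse without a backup.
        raise RuntimeError(
            "back up workspace files (SOUL.md, USER.md, etc.) before running nemoclaw onboard"
        )
    commands.append("nemoclaw onboard")
    return commands


if __name__ == "__main__":
    print(recovery_with_gateway("my-assistant", needs_onboard=False, backed_up=False))
```

Raising an error when no backup exists makes the data-loss risk explicit at the point where the destructive command would be chosen.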

Since both PRs target the same issue, it might make sense to consolidate. Happy to coordinate either way!

mafueee pushed a commit to mafueee/NemoClaw that referenced this pull request Mar 28, 2026
* feat(gateway/sandbox): add global and sandbox runtime settings flow
@prekshivyas prekshivyas assigned prekshivyas and unassigned kjw3 Mar 30, 2026

Labels

  • documentation (Improvements or additions to documentation)
  • Getting Started (Use this label to identify setup, installation, or onboarding issues.)
  • priority: high (Important issue that should be resolved in the next release)

Development

Successfully merging this pull request may close these issues.

Please add restart commands to documentation

5 participants