Two skills that turn Claude Code from an interactive tool into an autonomous worker. Nightshift runs planned project work overnight. 24x7 runs an endless task queue.
Landing page: godmodeai2025.github.io/NightShift
Claude Code is powerful but requires constant babysitting. Every file edit, every shell command needs your approval. On longer tasks, context gets compressed and Claude forgets the plan. And if you walk away, Claude stops.
These two skills solve that. They generate a complete setup — runbook, hooks, watchdog, sandbox — that lets Claude Code work autonomously while you sleep, work on something else, or just aren't at the terminal.
Nightshift is for planned project work: you describe a task, choose a genre template, get a validated runbook, and Claude executes it overnight. One task, one project, one clean git commit in the morning.
24x7 is for continuous work: you drop task folders into an inbox, Claude processes them one by one, and results appear in an outbox. An endless loop that runs until you stop it.
Both skills are Claude Code skills — they run inside Claude (either Claude.ai or Claude Code CLI) and generate all files as a setup package. You don't install the skills on your machine directly; you install them into Claude's skill directory, then ask Claude to generate the setup for your specific project.
Running Claude Code with --dangerously-skip-permissions alone fails after ~20 minutes on longer tasks:
- Context compression: Claude's context window fills up, `/compact` runs automatically, and Claude loses the plan, the conventions, and which steps are already done.
- No guardrails: Without approval prompts, a hallucinated `rm -rf ~/` has full permission to execute. This has happened in documented incidents.
- No monitoring: You don't know if Claude is still working, stuck in a loop, or has crashed.
Both skills build four layers of protection around --dangerously-skip-permissions:
Layer 1: The Runbook (External Memory)
A markdown file with checkboxes that Claude reads before each step. After context compression, a hook tells Claude to re-read the runbook and continue at the next unchecked item. The runbook is Claude's external memory: it survives any number of compressions.
Layer 2: Hooks (Guardrails)
Claude Code hooks are scripts that fire on specific events:
- `PreToolUse`: Runs before every tool call. Blocks destructive commands like `rm -rf /`, `sudo`, `chmod 777`, `curl | bash`, `eval`.
- `PostToolUse`: Runs after every tool call. Writes a heartbeat timestamp to a log file.
- `SessionStart` (compact matcher): Fires after every context compression. Injects "re-read the runbook" into Claude's context.
- `Stop` (Nightshift only): Fires every time Claude finishes a response. Every 5 completed steps, reminds Claude of autonomy zones and error budget.
Hooks fire even with --dangerously-skip-permissions. A PreToolUse hook returning exit code 2 blocks the tool call unconditionally.
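The guardrail idea can be sketched as a small shell function. This is a minimal illustration, not the hook the skill generates: the function names `is_blocked_command` and `guard_command` and the exact patterns are assumptions derived from the examples listed above, and the list is not exhaustive.

```shell
#!/usr/bin/env bash
# Sketch of a PreToolUse-style guard (hypothetical names and patterns).

# Returns 0 when the command string matches a blocked pattern, 1 otherwise.
is_blocked_command() {
  local cmd="$1" p
  local patterns=(
    '(^|[;&|[:space:]])rm[[:space:]]+-(rf|fr)[[:space:]]+(/|~)'  # rm -rf on / or ~
    '(^|[;&|[:space:]])sudo([[:space:]]|$)'                      # privilege escalation
    'chmod[[:space:]]+(-R[[:space:]]+)?777'                      # world-writable permissions
    'curl[^|]*[|][[:space:]]*(ba)?sh'                            # piping a download into a shell
    '(^|[;&|[:space:]])eval([[:space:]]|$)'                      # eval of arbitrary strings
  )
  for p in "${patterns[@]}"; do
    if [[ "$cmd" =~ $p ]]; then
      return 0
    fi
  done
  return 1
}

# Returns 2 (the blocking exit code described above) for blocked commands.
guard_command() {
  if is_blocked_command "$1"; then
    echo "BLOCKED by guard: $1" >&2
    return 2
  fi
  return 0
}
```

In a real hook the command would be read from the JSON payload that Claude Code passes on stdin (for example with something like `jq -r '.tool_input.command'`), and the script would exit with the return code of `guard_command`.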
Layer 3: macOS Sandbox (Kernel-Level Isolation)
A sandbox-exec profile that restricts Claude's filesystem access at the kernel level. Claude can only write to the project directory and /tmp. Even if Claude tries rm -rf ~/, the kernel blocks it — the hook doesn't even need to catch it. This is the real safety net.
Note: sandbox-exec is deprecated by Apple but still functional on current macOS versions. It uses Seatbelt, a kernel-level sandbox framework.
Layer 4: Watchdog (Liveness Monitoring)
A separate script that checks the heartbeat file. If Claude hasn't written a heartbeat in N minutes (default: 10), it raises an alarm: either a macOS notification or a message you can hook into your own alerting system.
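The heartbeat check can be sketched roughly as follows. This assumes the heartbeat file holds one Unix timestamp per line and that the timeout is given in seconds; the function names `write_heartbeat` and `heartbeat_is_stale` are hypothetical, and the generated watchdog may work differently.

```shell
#!/usr/bin/env bash
# Sketch of a heartbeat writer and staleness check (assumed file format:
# one Unix timestamp per line, newest last).

# What a PostToolUse hook might do: append the current time.
write_heartbeat() {
  date +%s >> "$1"
}

# heartbeat_is_stale FILE [TIMEOUT_SECONDS] [NOW]
# Returns 0 (stale) when the newest heartbeat is older than the timeout,
# or when the file is missing or unreadable. Returns 1 when Claude looks alive.
heartbeat_is_stale() {
  local file="$1" timeout="${2:-600}" now="${3:-$(date +%s)}"
  local last
  [[ -f "$file" ]] || return 0
  last="$(tail -n 1 "$file")"
  [[ "$last" =~ ^[0-9]+$ ]] || return 0
  (( now - last > timeout ))
}
```

A watchdog loop would call `heartbeat_is_stale` every few seconds and raise its alert on the first stale result. The optional third argument exists only so the check can be exercised with a fixed clock.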
The 24x7 skill does NOT run Claude as one long session. Long sessions degrade — a phenomenon called "context rot" where Claude becomes increasingly unreliable after hours of accumulated context. Instead:
- The bash loop (`runner.sh`) is the daemon: it runs forever
- For each task, it spawns a fresh `claude -p` call
- Claude processes the task, writes results, exits cleanly
- The loop checks for the next task
This means every task gets Claude at full quality. No accumulated errors, no context rot, no compact-related amnesia.
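One iteration of such a loop might look like the sketch below. It is not the generated `runner.sh`: the function name `process_next_task`, the `working/` directory, and the prompt text in the comment are assumptions. The worker command is passed in as arguments so the routing logic can be tried without calling Claude.

```shell
#!/usr/bin/env bash
# Sketch of one runner iteration (hypothetical layout: inbox/, working/,
# outbox/, failed/ under the workspace root).

# process_next_task WORKSPACE CMD [ARGS...]
# Returns 0 if a task was processed, 1 if the inbox was empty.
process_next_task() {
  local ws="$1"; shift
  local task name
  # Pick the alphabetically first task folder.
  task="$(find "$ws/inbox" -mindepth 1 -maxdepth 1 -type d 2>/dev/null | sort | head -n 1)"
  [[ -n "$task" ]] || return 1
  name="$(basename "$task")"
  mkdir -p "$ws/working" "$ws/outbox" "$ws/failed"
  mv "$task" "$ws/working/$name"
  # A fresh process per task: no context carries over from the previous one.
  if ( cd "$ws/working/$name" && "$@" ); then
    mv "$ws/working/$name" "$ws/outbox/$name"
  else
    mv "$ws/working/$name" "$ws/failed/$name"
  fi
  return 0
}

# The endless loop would then be roughly:
#   while true; do
#     process_next_task "$WORKSPACE" claude -p "Read task.md and complete it" \
#       --dangerously-skip-permissions || sleep 30
#   done
```

Because each task runs in its own subshell and process, a crash or a confused session affects only that task folder.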
When you tell Nightshift "migrate auth to JWT", it doesn't just write "migrate auth to JWT" into the runbook. It detects the task type (migration) and applies a genre template with predefined phases:
| Genre | Phases | Use When |
|---|---|---|
| refactoring | Analysis → Test coverage → Restructure → Verify → Cleanup | Changing code structure without changing behavior |
| feature | Prep → Structure → Core logic → Integration → Tests → Cleanup | Adding new functionality |
| migration | Compat check → Parallel run → Stepwise migration → Verify → Remove old | Switching frameworks, versions, schemas |
| bugfix | Reproduce → Root cause → Fix → Regression test → Cleanup | Systematic debugging |
| testing | Coverage analysis → Prioritize → Write tests → Verify | Improving test coverage |
| cleanup | Inventory → Prioritize → Clean → Verify | Tech debt, formatting, dependencies |
| devops | Current state → Configure → Test → Deploy check | CI/CD, infrastructure |
| documentation | Inventory → Structure → Content → Review | Docs, README, API docs |
Each genre also includes risk checks specific to the task type. A migration genre asks "Is there a rollback strategy? Could data be lost?" A refactoring genre asks "Are existing tests green before starting?"
Every runbook includes three autonomy zones (inspired by AlpiType's Approval Loop article):
- Green (free): Read files, create/edit in src/tests/docs, install dependencies, run tests, git add + commit
- Yellow (log required): Delete files, modify config files, change more than 3 files at once — Claude must document the reason in log.md
- Red (forbidden): Access files outside the project, touch secrets/credentials, force-push
This gives Claude a decision framework. Without it, Claude either hesitates on trivial operations or overreaches on critical ones.
The runbook defines how many test failures are acceptable:
- 1 failing test: Try to fix, max 2 attempts, then continue
- 2-3 failing tests: Warning in log.md, finish the phase
- Over 3 failing tests: STOP, `git stash`, write log.md
Without an error budget, Claude either stops at the first flaky test (wasting the entire overnight run) or ignores real regressions.
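The thresholds above amount to a small decision table. A minimal sketch, assuming the number of failing tests is already known; the function name `error_budget_action` and the action labels are hypothetical, and the zero-failure case is an assumption since the text does not list it.

```shell
#!/usr/bin/env bash
# Sketch of the error budget as a decision function (hypothetical labels).

# error_budget_action FAILING_TESTS -> prints continue | fix | warn | stop
error_budget_action() {
  local failing="$1"
  if (( failing == 0 )); then
    echo "continue"   # assumed: nothing failing, carry on
  elif (( failing == 1 )); then
    echo "fix"        # try to fix, max 2 attempts, then continue
  elif (( failing <= 3 )); then
    echo "warn"       # warning in log.md, finish the phase
  else
    echo "stop"       # STOP, git stash, write log.md
  fi
}
```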
Before generating the setup, the skill validates the runbook against 15 checks:
Structure: Has preconditions? Has verification phase? Has git commit in conclusion? Has rollback instructions? Between 5-20 steps?
Quality: Every step contains a file path, command, or concrete action? No vague steps like "implement auth"? Test command is concrete?
Safety: No rm -rf in steps? No hardcoded secrets? No actions outside the project directory?
Autonomy: Has all three zones (green/yellow/red)? Has error budget? Error budget has a stop condition?
The setup is only generated when all 15 checks pass — or when you explicitly override.
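A few of these checks are simple enough to express with `grep`. The sketch below covers only three of them (step count, no `rm -rf` in steps, all three zones present) and assumes steps are written as markdown checkboxes; `validate_runbook` is a hypothetical name, not the skill's validator.

```shell
#!/usr/bin/env bash
# Sketch of three runbook checks (assumes steps are "- [ ]" checkbox lines).

# validate_runbook FILE -> prints one FAIL line per problem, returns 0 if none.
validate_runbook() {
  local runbook="$1" failures=0 steps zone
  # Count checkbox steps, checked or unchecked.
  steps="$(grep -c -E '^[[:space:]]*- \[[ xX]\]' "$runbook" 2>/dev/null || true)"
  if (( steps < 5 || steps > 20 )); then
    echo "FAIL: expected 5-20 steps, found ${steps:-0}"
    failures=$(( failures + 1 ))
  fi
  # No rm -rf inside a step line.
  if grep -q -E '^[[:space:]]*- \[[ xX]\].*rm[[:space:]]+-rf' "$runbook" 2>/dev/null; then
    echo "FAIL: rm -rf appears in a step"
    failures=$(( failures + 1 ))
  fi
  # All three autonomy zones are mentioned as whole words.
  for zone in Green Yellow Red; do
    if ! grep -q -i -w "$zone" "$runbook" 2>/dev/null; then
      echo "FAIL: autonomy zone '$zone' missing"
      failures=$(( failures + 1 ))
    fi
  done
  (( failures == 0 ))
}
```

Checks such as "no vague steps" need judgment rather than pattern matching, which is presumably why the skill runs validation inside Claude.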
Every 5 completed steps (checked boxes in the runbook), a Stop hook fires and reminds Claude of the autonomy zones and error budget. This prevents drift during long runs — Claude re-reads its constraints regularly, not just after context compression.
The Stop hook doesn't just check for checkpoints — it tracks progress. If the number of completed runbook steps hasn't increased after 3 consecutive checks, a STALL WARNING is injected into Claude's context. This catches the most common failure mode in autonomous runs: Claude gets stuck on a failing test and retries it endlessly, burning API credits without making progress.
The warning tells Claude to check its error budget and skip the step if the budget allows it. If the budget doesn't allow skipping, Claude stops the run and writes a log — which is the correct behavior for a real regression.
The watchdog sees heartbeats during a stall (Claude is alive and working), so without stall detection, you'd only discover the loop in the morning when you check the runbook and see step 7 still unchecked after 6 hours.
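Stall detection of this kind can be sketched by counting checked boxes and remembering the count between checks. The names `count_done_steps` and `check_stall`, the state file format, and the use of return code 1 to signal a stall are assumptions for illustration; the generated Stop hook may track progress differently.

```shell
#!/usr/bin/env bash
# Sketch of progress tracking across Stop-hook invocations.
# State file format (assumed): "<done_steps> <checks_without_progress>".

# Prints the number of checked runbook steps ("- [x]").
count_done_steps() {
  local n
  n="$(grep -c -E '^[[:space:]]*- \[[xX]\]' "$1" 2>/dev/null || true)"
  echo "${n:-0}"
}

# check_stall RUNBOOK STATE_FILE [LIMIT]
# Returns 1 and prints a warning after LIMIT checks without new checked steps.
check_stall() {
  local runbook="$1" state="$2" limit="${3:-3}"
  local done_now last_done unchanged
  done_now="$(count_done_steps "$runbook")"
  if [[ ! -f "$state" ]]; then
    echo "$done_now 0" > "$state"   # first call only records a baseline
    return 0
  fi
  read -r last_done unchanged < "$state"
  if (( done_now > last_done )); then
    unchanged=0
  else
    unchanged=$(( unchanged + 1 ))
  fi
  echo "$done_now $unchanged" > "$state"
  if (( unchanged >= limit )); then
    echo "STALL WARNING: no runbook progress in $unchanged checks. Check the error budget."
    return 1
  fi
  return 0
}
```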
Nightshift runs are ephemeral — Claude starts fresh each time. But architecture decisions made in one run should inform the next. If Tuesday's run chose PostgreSQL over SQLite, Wednesday's run should know that.
The solution: a decisions.md file in the project root that persists across runs.
- The runbook's conclusion phase includes a step: "Document decisions in decisions.md"
- The CLAUDE-nightshift.md instructs Claude to read `decisions.md` at the start of every run
- Format: date, decision, reasoning. Append-only, never overwrite.
This gives cross-run persistence without requiring a long-lived session or external memory system. It's not learning — it's structured remembering.
The 24x7 skill uses the same pattern at workspace level: each task reads decisions.md from the workspace root and appends relevant decisions after completing work.
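An append-only entry in the date, decision, reasoning format could be written like this. The helper name `record_decision` and the pipe-separated line layout are assumptions; the text only fixes the three fields and the append-only rule.

```shell
#!/usr/bin/env bash
# Sketch of an append-only decision log (hypothetical line layout).

# record_decision FILE DECISION REASONING
record_decision() {
  local file="$1" decision="$2" reasoning="$3"
  # ">>" appends, so earlier entries are never overwritten.
  printf -- '- %s | %s | %s\n' "$(date +%Y-%m-%d)" "$decision" "$reasoning" >> "$file"
}
```

For example, `record_decision decisions.md "Use PostgreSQL" "Need concurrent writers"` adds one dated line that the next run can read back.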
- Claude Code CLI installed and authenticated (`claude` command available)
- bash, python3, jq in your PATH
- macOS recommended (for `sandbox-exec`). Works on Linux without the sandbox layer.
- A git repository for your project (Nightshift) or any directory (24x7)
# Download and install Nightshift
curl -L https://github.com/GodModeAI2025/NightShift/releases/latest/download/nightshift.skill -o nightshift.skill
unzip nightshift.skill -d ~/.claude/skills/
# Download and install 24x7
curl -L https://github.com/GodModeAI2025/NightShift/releases/latest/download/24x7.skill -o 24x7.skill
unzip 24x7.skill -d ~/.claude/skills/
Or manually: download the nightshift/ and 24x7/ directories from this repo and place them in ~/.claude/skills/.
Open Claude Code and ask:
Set up a nightshift run for /my/project — migrate auth to JWT
If the skill triggers, you'll see it generate a runbook, validate it, and produce the setup files. If Claude doesn't recognize the skill, check that ~/.claude/skills/nightshift/SKILL.md exists.
In Claude (claude.ai or Claude Code), say something like:
Set up a nightshift run for /Users/me/projects/my-api — refactor the auth module to use JWT tokens
Claude will:
- Detect the genre (refactoring)
- Ask you to confirm
- Generate a runbook with concrete steps
- Validate it (15 checks)
- Produce the setup files
cd /your/project
cp -r nightshift-setup/.claude .
cp nightshift-setup/runbook.md .
cp nightshift-setup/nightshift-*.sh .
cp nightshift-setup/nightshift-sandbox.sb .
cat nightshift-setup/CLAUDE-nightshift.md >> CLAUDE.md
chmod +x nightshift-*.sh

This is your safety net. If anything goes wrong, `git checkout .` brings you back here.
git add -A && git commit -m "Checkpoint before Nightshift"

Foreground (you see the output):
./nightshift-run.sh
Background (terminal can be closed):
./nightshift-run-bg.sh
With macOS sandbox (recommended):
sandbox-exec -f nightshift-sandbox.sb ./nightshift-run.sh

In a second terminal:
./nightshift-watchdog.sh # Alerts after 10 min without heartbeat
./nightshift-watchdog.sh 300 # Alerts after 5 min

git log --oneline -5 # See the commit
git diff HEAD~1 # See what changed
cat runbook.md # See which steps were completed [x]

git checkout . # Revert all changes
# or
git stash # Save changes for review

Set up a 24x7 runner at /Users/me/claude-workspace with idle behavior cleanup
Claude will generate the workspace structure with runner, watchdog, hooks, and sandbox profile.
cp -r 24x7-setup/* /your/workspace/
chmod +x /your/workspace/*.sh
cd /your/workspace
./runner-bg.sh

Create a folder in inbox/ with a task.md and optionally a materials/ directory:
mkdir -p inbox/my-task/materials
Write the assignment:
cat > inbox/my-task/task.md << 'EOF'
## Task: Create a REST API client
Priority: high
### Assignment
Create a TypeScript REST API client for the JSONPlaceholder API.
Include methods for all CRUD operations on /posts and /users.
Add error handling and TypeScript types.
### Input
- materials/api-spec.md — API specification
### Expected Output
- output/api-client.ts — The client
- output/types.ts — TypeScript interfaces
- output/api-client.test.ts — Unit tests
EOF
Copy input files:
cp my-api-spec.md inbox/my-task/materials/api-spec.md
The runner picks it up automatically. Results appear in outbox/my-task/output/.

./watchdog.sh
Shows live status: inbox: 3 | working: 1 | done: 12 | failed: 0
ls outbox/my-task/output/ # Your files
cat outbox/my-task/log.md # What Claude did

When the inbox is empty, Claude can:
- cleanup — Tidy the workspace, collect TODOs from outbox files
- docs — Update a workspace README with completed task summaries
- tests — Suggest tests for code in completed tasks
- sleep — Do nothing, save API costs
Configure by editing idle/idle-tasks.md or setting the idle behavior when generating the setup.
Both skills run Claude Code in headless mode. Every tool call, every file read, every response consumes API credits. An overnight Nightshift run might cost $5-50 depending on task complexity. A 24x7 runner generates continuous costs.
- Monitor your usage at console.anthropic.com
- For 24x7: set idle to `sleep` if cost is a concern. This prevents Claude from burning credits when no tasks are waiting.
- Both runners show a cost warning at startup
The generated sandbox.sb / nightshift-sandbox.sb file is just a profile. You must explicitly use it:
sandbox-exec -f nightshift-sandbox.sb ./nightshift-run.sh
Without sandbox-exec, Claude has full access to everything your user account can reach. The PreToolUse hook catches obvious destructive commands, but it's pattern matching, not a real security boundary. The sandbox is the real security boundary.
On Linux, there is no sandbox-exec. Use Docker or a dedicated user account instead:
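sudo useradd -m clauderunner
sudo cp -r /your/project /home/clauderunner/project
sudo chown -R clauderunner: /home/clauderunner/project # the copy is root-owned otherwise
sudo -u clauderunner bash -c 'cd /home/clauderunner/project && ./nightshift-run.sh' # run inside the copy

Always commit before starting a Nightshift run. Without a clean git state, you have no rollback. The runner checks for uncommitted changes and warns you (non-blocking).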
Both runners write a PID file. If you try to start a second instance, it refuses with a clear error. To force a restart after a crash:
rm /tmp/nightshift.pid # or /tmp/24x7.pid

Vague runbook steps produce vague results. "Implement auth" can mean anything: Claude will guess, and in headless mode, nobody corrects the guess. Good steps look like:
- [ ] Create src/services/token-service.ts with functions: generateAccessToken(userId), verifyToken(token)
The 15-point validation catches the worst offenders, but you should review the runbook before starting.
Nightshift handles context compression with two mechanisms:
- `SessionStart` hook re-injects "read the runbook" after every `/compact`
- `Stop` hook repeats autonomy zones and error budget every 5 steps
This works well for 10-20 step runbooks. For very long tasks (30+ steps), split into multiple Nightshift runs — the quality degrades even with these mechanisms.
24x7 avoids the problem entirely by using fresh sessions per task.
Both runners handle SIGTERM and SIGINT (Ctrl+C) gracefully. On 24x7, an in-progress task is moved to failed/ with a note. The PID file is cleaned up.
# Graceful stop
kill $(cat /tmp/nightshift.pid)
# Or for 24x7
kill $(cat /tmp/24x7.pid)

Nightshift generates these files:
| File | Purpose |
|---|---|
| `runbook.md` | Task plan with checkboxes, autonomy zones, error budget |
| `.claude/settings.json` | All hooks: PreToolUse, PostToolUse, SessionStart, Stop |
| `nightshift-run.sh` | Main script: `claude -p` + `--dangerously-skip-permissions` + PID lock + graceful shutdown |
| `nightshift-run-bg.sh` | Background wrapper using nohup |
| `nightshift-watchdog.sh` | Heartbeat monitor with configurable timeout |
| `nightshift-sandbox.sb` | macOS sandbox profile (Seatbelt) |
| `CLAUDE-nightshift.md` | Conventions + run memory (decisions.md) to append to CLAUDE.md |
| `README-nightshift.md` | Quick reference for the generated setup |

24x7 generates these files:
| File | Purpose |
|---|---|
| `runner.sh` | Endless loop: poll inbox → spawn Claude → route results |
| `runner-bg.sh` | Background wrapper using nohup |
| `watchdog.sh` | Heartbeat monitor with live inbox/outbox counters |
| `sandbox.sb` | macOS sandbox profile |
| `.claude/settings.json` | PreToolUse + PostToolUse hooks |
| `CLAUDE.md` | Workspace rules: autonomy zones, error tolerance, workspace memory (decisions.md) |
| `idle/idle-tasks.md` | Configurable idle behavior |
| `inbox/example-task/` | Example task with task.md template |

Autonomy zones and error budget inspired by AlpiType — Solving the AI Agent Approval Loop
This project was created privately and to the best of the author's knowledge. Use at your own risk. No warranty of completeness, correctness, or fitness for any particular purpose is given.
Apache-2.0 — see LICENSE