Commit 9099bc3

chore: rename Navigator to NemoClaw across user facing contracts (#73)
1 parent 46381e6 commit 9099bc3

99 files changed

Lines changed: 654 additions & 658 deletions

.agents/skills/debug-navigator-cluster/SKILL.md

Lines changed: 33 additions & 33 deletions
@@ -1,24 +1,24 @@
11
---
22
name: debug-navigator-cluster
3-
description: Debug why a navigator cluster failed to start or is unhealthy. Use when the user has a failed `nav cluster admin deploy`, cluster health check failure, or wants to diagnose cluster infrastructure issues. Trigger keywords - debug cluster, cluster failing, cluster not starting, deploy failed, cluster troubleshoot, cluster health, cluster diagnose, why won't my cluster start, health check failed.
3+
description: Debug why a nemoclaw cluster failed to start or is unhealthy. Use when the user has a failed `ncl cluster admin deploy`, cluster health check failure, or wants to diagnose cluster infrastructure issues. Trigger keywords - debug cluster, cluster failing, cluster not starting, deploy failed, cluster troubleshoot, cluster health, cluster diagnose, why won't my cluster start, health check failed.
44
---
55

6-
# Debug Navigator Cluster
6+
# Debug NemoClaw Cluster
77

8-
Diagnose why a navigator cluster failed to start after `nav cluster admin deploy`.
8+
Diagnose why a nemoclaw cluster failed to start after `ncl cluster admin deploy`.
99

1010
## Overview
1111

12-
`nav cluster admin deploy` creates a Docker container running k3s with the Navigator server and Envoy Gateway deployed via Helm. The deployment stages, in order, are:
12+
`ncl cluster admin deploy` creates a Docker container running k3s with the NemoClaw server and Envoy Gateway deployed via Helm. The deployment stages, in order, are:
1313

14-
1. **Pre-deploy check**: `nav cluster admin deploy` in interactive mode prompts to **reuse** (keep volume, clean stale nodes) or **recreate** (destroy everything, fresh start). `mise run cluster` always recreates before deploy.
14+
1. **Pre-deploy check**: `ncl cluster admin deploy` in interactive mode prompts to **reuse** (keep volume, clean stale nodes) or **recreate** (destroy everything, fresh start). `mise run cluster` always recreates before deploy.
1515
2. Ensure cluster image is available (local build or remote pull)
1616
3. Create Docker network (`navigator-cluster`) and volume (`navigator-cluster-{name}`)
1717
4. Create and start a privileged Docker container (`navigator-cluster-{name}`)
1818
5. Wait for k3s to generate kubeconfig (up to 60s)
1919
6. **Clean stale nodes**: Remove any `NotReady` k3s nodes left over from previous container instances that reused the same persistent volume
20-
7. **Prepare local images** (if `NAVIGATOR_PUSH_IMAGES` is set): In `internal` registry mode, bootstrap waits for the in-cluster registry and pushes tagged images there. In `external` mode, bootstrap uses legacy `ctr -n k8s.io images import` push-mode behavior.
21-
7. **Reconcile TLS PKI**: Load existing TLS secrets from the cluster; if missing, incomplete, or malformed, generate fresh PKI (CA + server + client certs). Apply secrets to cluster. If rotation happened and the navigator workload is already running, rollout restart and wait for completion (failed rollout aborts deploy).
20+
7. **Prepare local images** (if `NEMOCLAW_PUSH_IMAGES` is set): In `internal` registry mode, bootstrap waits for the in-cluster registry and pushes tagged images there. In `external` mode, bootstrap uses legacy `ctr -n k8s.io images import` push-mode behavior.
21+
7. **Reconcile TLS PKI**: Load existing TLS secrets from the cluster; if missing, incomplete, or malformed, generate fresh PKI (CA + server + client certs). Apply secrets to cluster. If rotation happened and the NemoClaw workload is already running, rollout restart and wait for completion (failed rollout aborts deploy).
2222
8. **Store CLI mTLS credentials**: Persist client cert/key/CA locally for CLI authentication.
2323
9. Wait for cluster health checks to pass (up to 6 min):
2424
- k3s API server readiness (`/readyz`)
@@ -31,16 +31,16 @@ For local deploys, metadata endpoint selection now depends on Docker connectivit
3131
- default local Docker socket (`unix:///var/run/docker.sock`): `https://127.0.0.1:{port}` (default port 8080)
3232
- TCP Docker daemon (`DOCKER_HOST=tcp://<host>:<port>`): `https://<host>:{port}` for non-loopback hosts
3333

34-
The host port is configurable via `--port` on `nav cluster admin deploy` (default 8080) and is stored in `ClusterMetadata.gateway_port`.
34+
The host port is configurable via `--port` on `ncl cluster admin deploy` (default 8080) and is stored in `ClusterMetadata.gateway_port`.
3535

3636
The TCP host is also added as an extra gateway TLS SAN so mTLS hostname validation succeeds.
3737

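As an illustration only (not part of this commit), the endpoint selection described above could be sketched in shell as follows. The function name `gateway_endpoint` is hypothetical, and falling back to `127.0.0.1` for a loopback TCP daemon is an assumption; the text only states the rule for non-loopback hosts. IPv6 hosts are not handled.

```bash
# Hypothetical helper: derive the metadata endpoint from a DOCKER_HOST-style
# value and the gateway port. Usage: gateway_endpoint [docker_host] [port]
gateway_endpoint() {
  local docker_host="${1:-unix:///var/run/docker.sock}"
  local port="${2:-8080}"          # default host port per --port
  local host="127.0.0.1"           # default for the local Docker socket
  local hostport tcp_host

  case "${docker_host}" in
    tcp://*)
      hostport="${docker_host#tcp://}"   # strip the scheme
      tcp_host="${hostport%%:*}"         # drop the daemon port, if any
      case "${tcp_host}" in
        localhost|127.*|"") ;;           # assumption: loopback keeps 127.0.0.1
        *) host="${tcp_host}" ;;         # non-loopback: use the daemon host
      esac
      ;;
  esac

  printf 'https://%s:%s\n' "${host}" "${port}"
}
```

Calling `gateway_endpoint "${DOCKER_HOST}" 8080` prints the endpoint this rule would select for the current shell's Docker configuration.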
38-
The default cluster name is `navigator`. The container is `navigator-cluster-{name}`.
38+
The default cluster name is `nemoclaw`. The container is `navigator-cluster-{name}`.
3939

4040
## Prerequisites
4141

4242
- Docker must be running (locally or on the remote host)
43-
- The `nav` CLI must be available
43+
- The `ncl` CLI must be available
4444
- For remote clusters: SSH access to the remote host
4545

4646
## Workflow
@@ -51,9 +51,9 @@ When the user asks to debug a cluster failure, **run diagnostics automatically**
5151

5252
Before running commands, establish:
5353

54-
1. **Cluster name**: Default is `navigator`, giving container name `navigator-cluster-navigator`
54+
1. **Cluster name**: Default is `nemoclaw`, giving container name `navigator-cluster-nemoclaw`
5555
2. **Remote or local**: If the user deployed with `--remote <host>`, all Docker commands must target that host
56-
3. **Config directory**: `~/.config/navigator/clusters/{name}/`
56+
3. **Config directory**: `~/.config/nemoclaw/clusters/{name}/`
5757

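A minimal sketch of how these values fit together after the rename (illustrative, not part of this commit). The function name `cluster_context` is hypothetical. Note that the container prefix stays `navigator-cluster-` while the default name and the config directory move to `nemoclaw`.

```bash
# Hypothetical helper: print the container name and config directory
# for a cluster. Usage: cluster_context [name]
cluster_context() {
  local name="${1:-nemoclaw}"   # default cluster name after this commit
  # The container prefix is unchanged by the rename.
  printf 'container=%s\n' "navigator-cluster-${name}"
  # The config directory now lives under ~/.config/nemoclaw.
  printf 'config_dir=%s\n' "${HOME}/.config/nemoclaw/clusters/${name}"
}
```

Running `cluster_context` with no arguments reports the default container `navigator-cluster-nemoclaw`, matching item 1 above.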
5858
For remote clusters, prefix Docker commands with SSH:
5959

@@ -87,7 +87,7 @@ If the container does not exist:
8787
docker images 'navigator/cluster*' --format 'table {{.Repository}}\t{{.Tag}}\t{{.Size}}'
8888
```
8989

90-
If the image is missing, re-deploy so bootstrap can pull the published cluster image (or set `NAVIGATOR_CLUSTER_IMAGE` explicitly).
90+
If the image is missing, re-deploy so bootstrap can pull the published cluster image (or set `NEMOCLAW_CLUSTER_IMAGE` explicitly).
9191

9292
If the container exists but is not running, inspect it:
9393

@@ -132,21 +132,21 @@ If `/readyz` fails, k3s is still starting or has crashed. Check container logs (
132132

133133
If pods are in `CrashLoopBackOff`, `ImagePullBackOff`, or `Pending`, investigate those pods specifically.
134134

135-
### Step 4: Check Navigator Server StatefulSet
135+
### Step 4: Check NemoClaw Server StatefulSet
136136

137-
The Navigator server is deployed via a HelmChart CR as a StatefulSet with persistent storage. Check its status:
137+
The NemoClaw server is deployed via a HelmChart CR as a StatefulSet with persistent storage. Check its status:
138138

139139
```bash
140140
# StatefulSet status
141141
docker exec navigator-cluster-<name> sh -lc 'KUBECONFIG=/etc/rancher/k3s/k3s.yaml kubectl -n navigator get statefulset/navigator -o wide'
142142

143-
# Navigator pod logs
143+
# NemoClaw pod logs
144144
docker exec navigator-cluster-<name> sh -lc 'KUBECONFIG=/etc/rancher/k3s/k3s.yaml kubectl -n navigator logs statefulset/navigator --tail=100'
145145

146146
# Describe statefulset for events
147147
docker exec navigator-cluster-<name> sh -lc 'KUBECONFIG=/etc/rancher/k3s/k3s.yaml kubectl -n navigator describe statefulset/navigator'
148148

149-
# Helm install job logs (the job that installs the Navigator chart)
149+
# Helm install job logs (the job that installs the NemoClaw chart)
150150
docker exec navigator-cluster-<name> sh -lc 'KUBECONFIG=/etc/rancher/k3s/k3s.yaml kubectl -n kube-system logs -l job-name=helm-install-navigator --tail=200'
151151
```
152152

@@ -188,7 +188,7 @@ Component images (server, sandbox, pki-job) can reach kubelet via two paths:
188188
**Local/external pull mode** (default local via `mise run cluster` / `mise run cluster:build`): Local images are tagged to the configured local registry base (default `127.0.0.1:5000/navigator/*`), pushed to that registry, and pulled by k3s via `registries.yaml` mirror endpoint (typically `host.docker.internal:5000`). `cluster:build` builds then pushes images; `cluster` pushes prebuilt local tags (`navigator/*:dev`, falling back to `localhost:5000/navigator/*:dev` or `127.0.0.1:5000/navigator/*:dev`).
189189

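For illustration (not part of this commit), the fallback order for prebuilt local tags described above can be written out per component. The function name `candidate_image_refs` is hypothetical, and expanding `navigator/*` to `navigator/<component>` is an assumption based on the component names listed earlier (server, sandbox, pki-job).

```bash
# Hypothetical helper: list the local image refs that would be tried for a
# component, in the fallback order described above.
# Usage: candidate_image_refs <component>
candidate_image_refs() {
  local component="$1"
  printf '%s\n' \
    "navigator/${component}:dev" \
    "localhost:5000/navigator/${component}:dev" \
    "127.0.0.1:5000/navigator/${component}:dev"
}
```

Each printed ref can then be checked with `docker image inspect <ref>` to see which tag actually exists on the host.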
190190
```bash
191-
# Verify image refs currently used by navigator deployment
191+
# Verify image refs currently used by nemoclaw deployment
192192
docker exec navigator-cluster-<name> sh -lc 'KUBECONFIG=/etc/rancher/k3s/k3s.yaml kubectl -n navigator get deploy navigator -o jsonpath="{.spec.template.spec.containers[*].image}"'
193193

194194
# Verify registry mirror/auth endpoint configuration
@@ -208,7 +208,7 @@ If images are missing, re-import with:
208208
docker save <image-ref> | docker exec -i navigator-cluster-<name> ctr -a /run/k3s/containerd/containerd.sock images import -
209209
```
210210

211-
**External pull mode** (remote deploy, or local with `NAVIGATOR_REGISTRY_HOST`/`IMAGE_REPO_BASE` pointing at a non-local registry): Images are pulled from an external registry at runtime. The entrypoint generates `/etc/rancher/k3s/registries.yaml`.
211+
**External pull mode** (remote deploy, or local with `NEMOCLAW_REGISTRY_HOST`/`IMAGE_REPO_BASE` pointing at a non-local registry): Images are pulled from an external registry at runtime. The entrypoint generates `/etc/rancher/k3s/registries.yaml`.
212212

213213
```bash
214214
# Verify registries.yaml exists and has credentials
@@ -218,7 +218,7 @@ docker exec navigator-cluster-<name> cat /etc/rancher/k3s/registries.yaml
218218
docker exec navigator-cluster-<name> sh -lc 'KUBECONFIG=/etc/rancher/k3s/k3s.yaml crictl pull d1i0nduu2f6qxk.cloudfront.net/navigator/pki-job:latest'
219219
```
220220

221-
If `registries.yaml` is missing or has wrong values, verify env wiring (`NAVIGATOR_REGISTRY_HOST`, `NAVIGATOR_REGISTRY_INSECURE`, username/password for authenticated registries).
221+
If `registries.yaml` is missing or has wrong values, verify env wiring (`NEMOCLAW_REGISTRY_HOST`, `NEMOCLAW_REGISTRY_INSECURE`, username/password for authenticated registries).
222222

223223
### Step 7: Check mTLS / PKI
224224

@@ -232,15 +232,15 @@ docker exec navigator-cluster-<name> sh -lc 'KUBECONFIG=/etc/rancher/k3s/k3s.yam
232232
docker exec navigator-cluster-<name> sh -lc 'KUBECONFIG=/etc/rancher/k3s/k3s.yaml kubectl -n navigator get secret navigator-server-tls -o jsonpath="{.data.tls\.crt}" | base64 -d | openssl x509 -noout -dates 2>/dev/null || echo "openssl not available"'
233233

234234
# Check if CLI-side mTLS files exist locally
235-
ls -la ~/.config/navigator/clusters/<name>/mtls/
235+
ls -la ~/.config/nemoclaw/clusters/<name>/mtls/
236236
```
237237

238-
On redeploy, bootstrap reuses existing secrets if they are valid PEM. If secrets are missing or malformed, fresh PKI is generated and the navigator workload is automatically restarted. If the rollout restart fails after rotation, the deploy aborts and CLI-side certs are not updated. Certificates use rcgen defaults (effectively never expire).
238+
On redeploy, bootstrap reuses existing secrets if they are valid PEM. If secrets are missing or malformed, fresh PKI is generated and the NemoClaw workload is automatically restarted. If the rollout restart fails after rotation, the deploy aborts and CLI-side certs are not updated. Certificates use rcgen defaults (effectively never expire).
239239

240240
Common mTLS issues:
241241
- **Secrets missing**: The `navigator` namespace may not have been created yet (Helm controller race). Bootstrap waits up to 2 minutes for the namespace.
242242
- **mTLS mismatch after manual secret deletion**: Delete all three secrets and redeploy — bootstrap will regenerate and restart the workload.
243-
- **CLI can't connect after redeploy**: Check that `~/.config/navigator/clusters/<name>/mtls/` contains `ca.crt`, `tls.crt`, `tls.key` and that they were updated at deploy time.
243+
- **CLI can't connect after redeploy**: Check that `~/.config/nemoclaw/clusters/<name>/mtls/` contains `ca.crt`, `tls.crt`, `tls.key` and that they were updated at deploy time.
244244

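The last check above can be scripted. This is a minimal sketch (not part of this commit) with a hypothetical function name, `check_mtls_files`. It only verifies that the three files exist and are non-empty; it does not validate the certificates or confirm they were updated at deploy time.

```bash
# Hypothetical helper: report missing or empty CLI-side mTLS files.
# Usage: check_mtls_files <mtls-dir>
# Returns 0 when ca.crt, tls.crt and tls.key are all present and non-empty.
check_mtls_files() {
  local dir="$1" missing=0 f
  for f in ca.crt tls.crt tls.key; do
    if [ ! -s "${dir}/${f}" ]; then
      echo "missing or empty: ${dir}/${f}"
      missing=1
    fi
  done
  return "${missing}"
}
```

For example, `check_mtls_files ~/.config/nemoclaw/clusters/<name>/mtls` exits non-zero and names each absent file when the directory is incomplete.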
245245
### Step 8: Check Kubernetes Events
246246

@@ -290,10 +290,10 @@ If DNS is broken, all image pulls from the distribution registry will fail, as w
290290
| Container exited, OOMKilled | Insufficient memory | Increase host memory or reduce workload |
291291
| Container exited, non-zero exit | k3s crash, port conflict, privilege issue | Check `docker logs` and `docker inspect` for details |
292292
| `/readyz` fails | k3s still starting or crashed | Wait longer or check container logs for k3s errors |
293-
| Navigator pods `Pending` | Insufficient CPU/memory for scheduling, or PVC not bound | Check `kubectl describe pod` for scheduling failures and `kubectl get pvc -n navigator` for volume status |
294-
| Navigator pods `CrashLoopBackOff` | Server application error | Check `kubectl logs` on the crashing pod |
295-
| Navigator pods `ImagePullBackOff` (push mode) | Images not imported or wrong containerd namespace | Check `k3s ctr -n k8s.io images ls` for component images (Step 6) |
296-
| Navigator pods `ImagePullBackOff` (pull mode) | Registry auth or DNS issue | Check `/etc/rancher/k3s/registries.yaml` credentials and DNS (Step 8) |
293+
| NemoClaw pods `Pending` | Insufficient CPU/memory for scheduling, or PVC not bound | Check `kubectl describe pod` for scheduling failures and `kubectl get pvc -n navigator` for volume status |
294+
| NemoClaw pods `CrashLoopBackOff` | Server application error | Check `kubectl logs` on the crashing pod |
295+
| NemoClaw pods `ImagePullBackOff` (push mode) | Images not imported or wrong containerd namespace | Check `k3s ctr -n k8s.io images ls` for component images (Step 6) |
296+
| NemoClaw pods `ImagePullBackOff` (pull mode) | Registry auth or DNS issue | Check `/etc/rancher/k3s/registries.yaml` credentials and DNS (Step 8) |
297297
| Image import fails (`k3s ctr` exit code != 0) | Corrupt tar stream or containerd not ready | Retry after k3s is fully started; check container logs |
298298
| Push mode images not found by kubelet | Imported into wrong containerd namespace | Must use `k3s ctr -n k8s.io images import`, not `k3s ctr images import` |
299299
| Gateway not `Programmed` | Envoy Gateway not ready | Check `envoy-gateway-system` pods and Helm install logs |
@@ -331,9 +331,9 @@ docker -H ssh://<host> logs navigator-cluster-<name>
331331
**Setting up kubectl access** (requires tunnel):
332332

333333
```bash
334-
nav cluster admin tunnel --name <name> --remote <host>
334+
ncl cluster admin tunnel --name <name> --remote <host>
335335
# Then in another terminal:
336-
export KUBECONFIG=~/.config/navigator/clusters/<name>/kubeconfig
336+
export KUBECONFIG=~/.config/nemoclaw/clusters/<name>/kubeconfig
337337
kubectl get pods -A
338338
```
339339

@@ -343,7 +343,7 @@ Run all diagnostics at once for a comprehensive report:
343343

344344
```bash
345345
HOST="<host>" # leave empty for local, or set to SSH destination
346-
NAME="navigator" # cluster name
346+
NAME="nemoclaw" # cluster name
347347
CONTAINER="navigator-cluster-${NAME}"
348348
KCFG="KUBECONFIG=/etc/rancher/k3s/k3s.yaml"
349349

@@ -369,10 +369,10 @@ run docker exec "${CONTAINER}" sh -lc "${KCFG} kubectl get pods -A -o wide" 2>&1
369369
echo "=== Failing Pods ==="
370370
run docker exec "${CONTAINER}" sh -lc "${KCFG} kubectl get pods -A --field-selector=status.phase!=Running,status.phase!=Succeeded" 2>&1
371371

372-
echo "=== Navigator StatefulSet ==="
372+
echo "=== NemoClaw StatefulSet ==="
373373
run docker exec "${CONTAINER}" sh -lc "${KCFG} kubectl -n navigator get statefulset/navigator -o wide" 2>&1
374374

375-
echo "=== Navigator Gateway ==="
375+
echo "=== NemoClaw Gateway ==="
376376
run docker exec "${CONTAINER}" sh -lc "${KCFG} kubectl -n navigator get gateway/navigator-gateway" 2>&1
377377

378378
echo "=== Recent Events ==="
@@ -381,7 +381,7 @@ run docker exec "${CONTAINER}" sh -lc "${KCFG} kubectl get events -A --sort-by=.
381381
echo "=== PKI Job Logs ==="
382382
run docker exec "${CONTAINER}" sh -lc "${KCFG} kubectl -n navigator logs -l job-name=navigator-gateway-pki --tail=100" 2>&1
383383

384-
echo "=== Helm Install Navigator Logs ==="
384+
echo "=== Helm Install NemoClaw Logs ==="
385385
run docker exec "${CONTAINER}" sh -lc "${KCFG} kubectl -n kube-system logs -l job-name=helm-install-navigator --tail=100" 2>&1
386386

387387
echo "=== Registry Configuration ==="

.agents/skills/tui-development/SKILL.md

Lines changed: 12 additions & 12 deletions
@@ -1,6 +1,6 @@
11
---
22
name: tui-development
3-
description: Guide for developing the "Gator" TUI — a ratatui-based terminal UI for the Navigator platform. Covers architecture, navigation, data fetching, theming, UX conventions, and development workflow. Trigger keywords - gator, TUI, terminal UI, ratatui, navigator-tui, tui development, gator feature, gator bug.
3+
description: Guide for developing the "Gator" TUI — a ratatui-based terminal UI for the NemoClaw platform. Covers architecture, navigation, data fetching, theming, UX conventions, and development workflow. Trigger keywords - gator, TUI, terminal UI, ratatui, navigator-tui, tui development, gator feature, gator bug.
44
---
55

66
# Gator TUI Development Guide
@@ -9,14 +9,14 @@ Comprehensive reference for any agent working on the Gator TUI.
99

1010
## 1. Overview
1111

12-
Gator is a ratatui-based terminal UI for the Navigator platform. It provides a keyboard-driven interface for managing clusters, sandboxes, and logs — the same operations available via the `nav` CLI, but with a live, interactive dashboard.
12+
Gator is a ratatui-based terminal UI for the NemoClaw platform. It provides a keyboard-driven interface for managing clusters, sandboxes, and logs — the same operations available via the `ncl` CLI, but with a live, interactive dashboard.
1313

14-
- **Launched via:** `nav gator` or `mise run gator`
14+
- **Launched via:** `ncl gator` or `mise run gator`
1515
- **Crate:** `crates/navigator-tui/`
1616
- **Key dependencies:**
1717
- `ratatui` (workspace version) — uses `frame.size()` (not `frame.area()`)
1818
- `crossterm` (workspace version) — terminal backend and event polling
19-
- `tonic` with TLS — gRPC client for the Navigator gateway
19+
- `tonic` with TLS — gRPC client for the NemoClaw gateway
2020
- `tokio` — async runtime for event loop, spawned tasks, and mpsc channels
2121
- `navigator-core` — proto-generated types (`NavigatorClient`, request/response structs)
2222
- `navigator-bootstrap` — cluster discovery (`list_clusters()`)
@@ -104,8 +104,8 @@ Every frame renders four vertical regions:
104104

105105
### Title bar examples
106106

107-
- Dashboard: ` Gator │ Current Cluster: navigator (Healthy) │ Dashboard`
108-
- Sandbox detail: ` Gator │ Current Cluster: navigator (Healthy) │ Sandbox: my-sandbox`
107+
- Dashboard: ` Gator │ Current Cluster: nemoclaw (Healthy) │ Dashboard`
108+
- Sandbox detail: ` Gator │ Current Cluster: nemoclaw (Healthy) │ Sandbox: my-sandbox`
109109

110110
### Adding a new screen
111111

@@ -225,14 +225,14 @@ The `confirm_delete` flag in `App` gates destructive key handling — while true
225225

226226
### CLI parity
227227

228-
Gator actions should parallel `nav` CLI commands so users have familiar mental models:
228+
Gator actions should parallel `ncl` CLI commands so users have familiar mental models:
229229

230230
| CLI Command | Gator Equivalent |
231231
| --- | --- |
232-
| `nav sandbox list` | Sandbox table on Dashboard |
233-
| `nav sandbox delete <name>` | `[d]` on sandbox detail, then `[y]` to confirm |
234-
| `nav sandbox logs <name>` | `[l]` on sandbox detail to open log viewer |
235-
| `nav cluster health` | Status in title bar + cluster list |
232+
| `ncl sandbox list` | Sandbox table on Dashboard |
233+
| `ncl sandbox delete <name>` | `[d]` on sandbox detail, then `[y]` to confirm |
234+
| `ncl sandbox logs <name>` | `[l]` on sandbox detail to open log viewer |
235+
| `ncl cluster health` | Status in title bar + cluster list |
236236

237237
When adding new TUI features, check what the CLI offers and maintain consistency.
238238

@@ -337,7 +337,7 @@ lib.rs (event loop, gRPC, async tasks)
337337
### Dependency constraints
338338

339339
- **`navigator-tui` cannot depend on `navigator-cli`** — this would create a circular dependency. TLS channel building for cluster switching is done directly in `lib.rs` using `tonic::transport` primitives (`Certificate`, `Identity`, `ClientTlsConfig`, `Endpoint`).
340-
- mTLS certs are read from `~/.config/navigator/clusters/<name>/mtls/` (ca.crt, tls.crt, tls.key).
340+
- mTLS certs are read from `~/.config/nemoclaw/clusters/<name>/mtls/` (ca.crt, tls.crt, tls.key).
341341

342342
### Proto generated code
343343
