
fix(onboard): auto-cleanup orphaned gateway container on re-onboard #1615

Closed
yanyunl1991 wants to merge 3 commits into main from fix/cleanup-orphaned-gateway-container-1582

Conversation

@yanyunl1991 yanyunl1991 commented Apr 8, 2026

Summary

Fixes #1582

Pressing Ctrl+C while nemoclaw onboard is starting the gateway leaves an
orphaned openshell-cluster-nemoclaw Docker container running. The next
nemoclaw onboard then fails with "Port 8080 is not available": OpenShell
no longer tracks the container, but the container still holds the port.

  • Add cleanupOrphanedGatewayContainer() that detects and removes
    orphaned openshell-cluster-nemoclaw Docker containers
  • In preflight port check: auto-cleanup before failing on port conflict
  • In destroyGateway(): also clean up containers, not just volumes
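
The changes listed above hinge on one helper. A minimal sketch of what such a helper could look like follows. It assumes `runCapture(cmd)` returns a command's stdout as a string and `run(cmd)` executes a command on a best-effort basis; both are injected as parameters here purely so the logic can be exercised without Docker, which is not necessarily how bin/lib/onboard.js wires them.

```javascript
// Sketch only: the helper name mirrors the PR, but the injected `runCapture`
// and `run` parameters are an assumption made to keep the example testable.
const GATEWAY_NAME = "nemoclaw";

function cleanupOrphanedGatewayContainer({ runCapture, run }) {
  const containerName = `openshell-cluster-${GATEWAY_NAME}`;
  // List matching containers, running or stopped (-a), printing only IDs (-q).
  const existing = runCapture(
    `docker ps -aq --filter "name=^${containerName}$"`,
  ).trim();
  if (!existing) return false; // nothing to clean up

  console.log(`  Cleaning up orphaned gateway container (${containerName})...`);
  // Best-effort: failures are swallowed so preflight can still report the
  // original port conflict if cleanup does not help.
  run(`docker stop ${containerName} 2>/dev/null || true`);
  run(`docker rm ${containerName} 2>/dev/null || true`);
  return true;
}

// Usage with fake runners standing in for the real shell helpers.
const issued = [];
const cleaned = cleanupOrphanedGatewayContainer({
  runCapture: () => "979649bab68b\n",
  run: (cmd) => issued.push(cmd),
});
console.log(cleaned, issued.length);
```

Returning a boolean lets the caller decide whether a second port check is worth attempting.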

Test plan

  • Reproduced on Ubuntu 24.04: onboard → Ctrl+C during [2/8] → re-onboard fails with port 8080 conflict
  • Verified fix: re-onboard auto-cleans orphaned container and continues successfully
  • Existing e2e tests pass (test-double-onboard.sh, test-port8080-conflict.sh)

Reproduction log

Before fix — re-onboard fails with port conflict:

  cuda@cuda-lyy:~$ nemoclaw onboard

  NemoClaw Onboarding
  ===================

  [1/7] Preflight checks
  ──────────────────────────────────────────────────
  ✓ Docker is running
  ✓ Container runtime: docker
  ✓ openshell CLI: openshell 0.0.20
  ✓ Port 8080 available (OpenShell gateway)
  ✓ Port 18789 available (NemoClaw dashboard)
  ⓘ No GPU detected — will use cloud inference
  ✓ Memory OK: 15617 MB RAM + 4095 MB swap

  [2/7] Starting OpenShell gateway
  ──────────────────────────────────────────────────
  Using pinned OpenShell gateway image: ghcr.io/nvidia/openshell/cluster:0.0.20
✓ Checking Docker
✓ Downloading gateway
✓ Initializing environment
⠚ Starting gateway
─────────────────────────────────────────────────────────────────────────── Gateway Logs ───────────────────────────────────────────────────────────────────────────
I0408 16:19:53.931574     111 operation_generator.go:779] UnmountVolume.TearDown succeeded for volume "kubernetes.io/empty-dir/1840ac14-5976-40b8-9ce4-eba1709e51ce…
I0408 16:19:53.931622     111 operation_generator.go:779] UnmountVolume.TearDown succeeded for volume "kubernetes.io/empty-dir/1840ac14-5976-40b8-9ce4-eba1709e51ce…
I0408 16:19:53.931905     111 operation_generator.go:779] UnmountVolume.TearDown succeeded for volume "kubernetes.io/projected/1840ac14-5976-40b8-9ce4-eba1709e51ce…
I0408 16:19:54.030873     111 reconciler_common.go:299] "Volume detached for volume \"klipper-config\" (UniqueName: \"kubernetes.io/empty-dir/1840ac14-5976-40b8-9c…
I0408 16:19:54.030909     111 reconciler_common.go:299] "Volume detached for volume \"kube-api-access-lnkdz\" (UniqueName: \"kubernetes.io/projected/1840ac14-5976-…
I0408 16:19:54.030924     111 reconciler_common.go:299] "Volume detached for volume \"values\" (UniqueName: \"kubernetes.io/projected/1840ac14-5976-40b8-9ce4-eba17…
I0408 16:19:54.030934     111 reconciler_common.go:299] "Volume detached for volume \"klipper-helm\" (UniqueName: \"kubernetes.io/empty-dir/1840ac14-5976-40b8-9ce4…
I0408 16:19:54.030975     111 reconciler_common.go:299] "Volume detached for volume \"content\" (UniqueName: \"kubernetes.io/configmap/1840ac14-5976-40b8-9ce4-eba1…
I0408 16:19:54.030986     111 reconciler_common.go:299] "Volume detached for volume \"klipper-cache\" (UniqueName: \"kubernetes.io/empty-dir/1840ac14-5976-40b8-9ce…
I0408 16:19:54.030997     111 reconciler_common.go:299] "Volume detached for volume \"tmp\" (UniqueName: \"kubernetes.io/empty-dir/1840ac14-5976-40b8-9ce4-eba1709e…
E0408 16:19:54.164926     111 resource_quota_controller.go:460] "Error during resource discovery" err="unable to retrieve the complete list of server APIs: metrics…
I0408 16:19:54.250353     111 garbagecollector.go:792] "failed to discover some groups" groups="map[\"metrics.k8s.io/v1beta1\":\"stale GroupVersion discovery: metr…
I0408 16:19:54.614096     111 pod_container_deletor.go:80] "Container not found in pod's containers" containerID="5d9cade133ca9e2ef2fa2b06d1a9326db5ccf508aed00ab66…
I0408 16:19:54.640705     111 event.go:389] "Event occurred" object="kube-system/openshell" fieldPath="" kind="HelmChart" apiVersion="helm.cattle.io/v1" type="Norm…
I0408 16:19:54.653268     111 event.go:389] "Event occurred" object="kube-system/openshell" fieldPath="" kind="HelmChart" apiVersion="helm.cattle.io/v1" type="Norm…
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────^C
cuda@cuda-lyy:~$ lsof -i:8080
cuda@cuda-lyy:~$ ^C
cuda@cuda-lyy:~$ nemoclaw onboard

  NemoClaw Onboarding
  ===================

  [1/7] Preflight checks
  ──────────────────────────────────────────────────
  ✓ Docker is running
  ✓ Container runtime: docker
  ✓ openshell CLI: openshell 0.0.20

  !! Port 8080 is not available.
     OpenShell gateway needs this port.

     Blocked by: docker-pr (PID 3486802)

     To fix, stop the conflicting process:

       sudo kill 3486802
       # or, if it's a systemd service:
       systemctl --user stop openclaw-gateway.service

     Detail: sudo lsof reports docker-pr (PID 3486802) listening on port 8080
cuda@cuda-lyy:~$ docker ps
CONTAINER ID   IMAGE                                     COMMAND                  CREATED         STATUS                   PORTS                     NAMES
979649bab68b   ghcr.io/nvidia/openshell/cluster:0.0.20   "/usr/local/bin/clus…"   3 minutes ago   Up 3 minutes (healthy)   0.0.0.0:8080->30051/tcp   openshell-cluster-nemoclaw

After fix — auto-cleanup and continue:

  cuda@cuda-lyy:~/NemoClaw$ node ~/NemoClaw/bin/nemoclaw.js onboard 
  NemoClaw Onboarding
  ===================

  [1/8] Preflight checks
  ──────────────────────────────────────────────────
  ✓ Docker is running
  ✓ Container runtime: docker
  ✓ openshell CLI: openshell 0.0.20
  Cleaning up orphaned gateway container (openshell-cluster-nemoclaw)...
openshell-cluster-nemoclaw
openshell-cluster-nemoclaw
  ✓ Port 8080 available after orphaned container cleanup (OpenShell gateway)
  ✓ Port 18789 available (NemoClaw dashboard)
  ⓘ No GPU detected — will use cloud inference
  ✓ Memory OK: 15617 MB RAM + 4095 MB swap

  [2/8] Starting OpenShell gateway
  ──────────────────────────────────────────────────
  Using pinned OpenShell gateway image: ghcr.io/nvidia/openshell/cluster:0.0.20
  Starting gateway cluster...
  Waiting for gateway health...
  Gateway start attempt 1 failed. 2 retries left...
• Destroying gateway nemoclaw...
✓ Gateway nemoclaw destroyed.
  Starting gateway cluster...
  Still starting gateway cluster... (5s elapsed)
  Still starting gateway cluster... (10s elapsed)
  Still starting gateway cluster... (20s elapsed)
  Still starting gateway cluster... (30s elapsed)
  Still starting gateway cluster... (40s elapsed)
  Starting OpenShell gateway pod...
  Installing OpenShell components...
  Starting OpenShell gateway pod...
  Still starting OpenShell gateway pod... (50s elapsed)
  Still starting OpenShell gateway pod... (60s elapsed)
  Waiting for gateway health...
  Waiting for gateway health...
  ✓ Gateway is healthy
✓ Active gateway set to 'nemoclaw'

Summary by CodeRabbit

  • Bug Fixes
    • Improved gateway cleanup to best-effort remove orphaned containers that can block setup, reducing leftover resources.
    • Enhanced port availability checks to retry automatically after cleanup, preventing setup failures when ports are occupied.

…1582)

When Ctrl+C interrupts gateway start, the Docker container
openshell-cluster-nemoclaw keeps running but OpenShell no longer
tracks it. Re-onboard then fails with "Port 8080 is not available".

Add cleanupOrphanedGatewayContainer() that detects and removes the
orphaned container. Called in two places:
- preflight port check: auto-cleanup before failing on port conflict
- destroyGateway(): ensure containers are removed alongside volumes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Contributor

coderabbitai bot commented Apr 8, 2026

Walkthrough

Detects and removes an orphaned Docker gateway container during preflight and gateway teardown. On port conflict, the preflight now attempts best-effort container cleanup and retries port availability before failing.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Docker Gateway Cleanup: bin/lib/onboard.js | Added cleanupOrphanedGatewayContainer() to locate openshell-cluster-${GATEWAY_NAME} containers and best-effort stop/remove them (errors ignored). Integrated cleanup call into preflight() so port checks retry after cleanup, and invoked from destroyGateway() for extended teardown beyond volumes. |

Sequence Diagram(s)

sequenceDiagram
  participant CLI as Onboard CLI
  participant Preflight as preflight()
  participant Docker as Docker Engine
  participant PortCheck as checkPortAvailable()

  CLI->>Preflight: start preflight
  Preflight->>PortCheck: check port
  alt port blocked
    Preflight->>Docker: query container openshell-cluster-${GATEWAY_NAME}
    Docker-->>Preflight: container found
    Preflight->>Docker: stop container (best-effort)
    Preflight->>Docker: remove container (best-effort)
    Preflight->>PortCheck: re-check port
    alt port freed
      PortCheck-->>Preflight: available
      Preflight-->>CLI: continue
    else still blocked
      PortCheck-->>Preflight: still blocked
      Preflight-->>CLI: fail preflight
    end
  else port available
    PortCheck-->>Preflight: available
    Preflight-->>CLI: continue
  end
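
The flow in the diagram can be sketched in code as follows. This is an illustration under stated assumptions, not the PR's implementation: `checkPortAvailable` and `cleanup` are injected stand-ins, and `preflightPort` is a hypothetical name for the per-port step inside preflight().

```javascript
// Sketch of the check -> cleanup -> re-check flow. The injected dependencies
// are assumptions; the real functions live in bin/lib/onboard.js.
async function preflightPort(port, label, { checkPortAvailable, cleanup }) {
  let portCheck = await checkPortAvailable(port);
  if (portCheck.ok) {
    return { ok: true, retried: false };
  }
  // Port is blocked: try removing an orphaned gateway container, then
  // re-check exactly once before giving up.
  let retried = false;
  if (cleanup()) {
    retried = true;
    portCheck = await checkPortAvailable(port);
    if (portCheck.ok) {
      console.log(`  ✓ Port ${port} available after orphaned container cleanup (${label})`);
      return { ok: true, retried };
    }
  }
  return { ok: false, retried };
}

// Simulated environment: the port stays blocked until cleanup has run.
let orphanPresent = true;
const fakeDeps = {
  checkPortAvailable: async () => ({ ok: !orphanPresent }),
  cleanup: () => {
    const found = orphanPresent;
    orphanPresent = false;
    return found;
  },
};

const preflightResult = preflightPort(8080, "OpenShell gateway", fakeDeps);
preflightResult.then((result) => console.log(result));
```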

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰
A little hop, a careful sniff,
I found the orphaned gateway adrift.
I nudged it gently, then made it cease,
Now onboarding hums in tidy peace.
🥕✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 60.00%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately describes the main change: automatically cleaning up orphaned gateway containers during re-onboarding, which directly addresses the core issue. |
| Linked Issues check | ✅ Passed | The PR implementation directly addresses issue #1582's requirement: detects orphaned containers in preflight checks and removes them automatically before port conflicts occur. |
| Out of Scope Changes check | ✅ Passed | All changes (cleanup function, preflight integration, and destroyGateway enhancement) are directly aligned with the linked issue's objective of auto-cleanup on re-onboard. |


Contributor

@coderabbitai coderabbitai bot left a comment

🧹 Nitpick comments (2)
bin/lib/onboard.js (2)

1565-1566: Prefer a single docker rm -f for cleanup.

This removes the container in one call and avoids a split stop/remove path.

♻️ Proposed simplification
-  run(`docker stop ${shellQuote(containerName)} 2>/dev/null || true`, { ignoreError: true });
-  run(`docker rm ${shellQuote(containerName)} 2>/dev/null || true`, { ignoreError: true });
+  run(`docker rm -f ${shellQuote(containerName)} 2>/dev/null || true`, { ignoreError: true });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bin/lib/onboard.js` around lines 1565 - 1566, Replace the two-step stop+rm
cleanup with a single forced remove call: update the invocation that uses run
and shellQuote with containerName so it calls docker rm -f on the container
(still preserving the 2>/dev/null || true redirection and the { ignoreError:
true } option). Locate the two lines using run(`docker stop
${shellQuote(containerName)} ...`) and run(`docker rm
${shellQuote(containerName)} ...`) and consolidate them into one run(...) that
performs docker rm -f ${shellQuote(containerName)} with the same error
suppression and options.
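
As a rough illustration of the consolidated command, the sketch below builds the single `docker rm -f` invocation as a string. The `shellQuote` shown here is a hypothetical POSIX single-quote escaper standing in for the PR's helper of the same name, whose actual implementation is not shown in this thread, and `forceRemoveCommand` is an illustrative name.

```javascript
// Hypothetical stand-in for the PR's shellQuote: wraps the value in single
// quotes and escapes any embedded single quote as '\''.
function shellQuote(value) {
  return `'${String(value).replace(/'/g, `'\\''`)}'`;
}

// Build the single forced-remove command suggested in the review, keeping
// the same stderr redirection and `|| true` error suppression.
function forceRemoveCommand(containerName) {
  return `docker rm -f ${shellQuote(containerName)} 2>/dev/null || true`;
}

console.log(forceRemoveCommand("openshell-cluster-nemoclaw"));
```

One trade-off to keep in mind: `docker stop` sends SIGTERM and waits for a grace period before killing, whereas `docker rm -f` kills the container immediately with SIGKILL, so the single call gives up the graceful shutdown.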

1733-1741: Scope orphan-container cleanup retry to port 8080 only.

The orphaned gateway container fix is specific to gateway-port conflicts; applying it to other ports (like 18789) is unnecessary and can trigger unrelated Docker operations.

🎯 Proposed scope tightening
-      if (cleanupOrphanedGatewayContainer()) {
+      if (port === 8080 && cleanupOrphanedGatewayContainer()) {
         portCheck = await checkPortAvailable(port);
         if (portCheck.ok) {
           console.log(`  ✓ Port ${port} available after orphaned container cleanup (${label})`);
           continue;
         }
       }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bin/lib/onboard.js` around lines 1733 - 1741, The orphaned gateway container
cleanup logic currently runs for every port; restrict it to only run when
handling the gateway port (8080) to avoid unnecessary Docker operations on other
ports. In the block that calls cleanupOrphanedGatewayContainer() and then
re-checks with checkPortAvailable(port), guard that entire sequence with a check
like if (port === 8080) (or compare against your GATEWAY_PORT constant) so
cleanupOrphanedGatewayContainer() and the subsequent port re-check only execute
for port 8080; keep the existing functions cleanupOrphanedGatewayContainer(),
checkPortAvailable(), and the console log (which references port and label)
unchanged otherwise.
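
To make the effect of the guard concrete, here is a small simulation, assuming a `GATEWAY_PORT` constant (a name the review only floats as an option) and injected stand-ins for `checkPortAvailable` and the cleanup helper. `checkPorts` is a hypothetical wrapper, not a function from the PR.

```javascript
// GATEWAY_PORT is an assumed constant name; the review suggests 8080.
const GATEWAY_PORT = 8080;

async function checkPorts(ports, { checkPortAvailable, cleanup }) {
  const failures = [];
  for (const { port, label } of ports) {
    let portCheck = await checkPortAvailable(port);
    // Only the gateway port can be held by the orphaned gateway container,
    // so Docker cleanup is skipped for every other port.
    if (!portCheck.ok && port === GATEWAY_PORT && cleanup()) {
      portCheck = await checkPortAvailable(port);
    }
    if (!portCheck.ok) failures.push(label);
  }
  return failures;
}

// Simulation: the dashboard port is blocked by something unrelated.
let cleanupCalls = 0;
const scopedResult = checkPorts(
  [
    { port: 8080, label: "OpenShell gateway" },
    { port: 18789, label: "NemoClaw dashboard" },
  ],
  {
    checkPortAvailable: async (port) => ({ ok: port !== 18789 }),
    cleanup: () => {
      cleanupCalls += 1;
      return true;
    },
  },
);
scopedResult.then((failures) => console.log(failures, cleanupCalls));
```

In this simulation the blocked dashboard port is reported as a failure without any cleanup attempt, so no misleading cleanup message would be printed.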

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 7d27dee1-3f8a-46e7-bdbb-3c03c66ee6f8

📥 Commits

Reviewing files that changed from the base of the PR and between 65248ce and 098c776.

📒 Files selected for processing (1)
  • bin/lib/onboard.js

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
bin/lib/onboard.js (1)

1558-1569: Add behavioral tests for cleanup execution and retry path.

The current gateway cleanup tests are static string checks; they don’t validate that cleanupOrphanedGatewayContainer() actually triggers docker stop/rm and enables the second checkPortAvailable() pass in preflight.

Also applies to: 1728-1742

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bin/lib/onboard.js` around lines 1558 - 1569, Add behavioral tests that
exercise cleanupOrphanedGatewayContainer() and the preflight retry path instead
of just checking strings: mock/stub runCapture to return a non-empty container
id to force the cleanup branch, and spy/mock run to verify docker stop and
docker rm are invoked for the containerName used in
cleanupOrphanedGatewayContainer(); additionally, test the preflight flow that
calls checkPortAvailable() twice by simulating the first check failing and the
second succeeding (or by advancing the mocked state after cleanup) so the code
path that rechecks the port is executed; reference the functions
cleanupOrphanedGatewayContainer, runCapture, run, and checkPortAvailable when
wiring the mocks and assertions.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@bin/lib/onboard.js`:
- Around line 1560-1563: The current use of runCapture with `docker ps -aq
--filter "name=^${containerName}$"` is unreliable because docker ps does
substring matching; replace that call with a deterministic inspect call: invoke
runCapture(`docker container inspect ${containerName}`, { ignoreError: true })
(or with a --format to return ID) and treat a successful inspect (non-empty
output / zero exit) as the container existing; update the variable `existing`
(and any downstream logic that uses it) to rely on the inspect result instead of
the substring-matching docker ps output while preserving error-ignore behavior
around `runCapture` and using the `containerName` identifier.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: d36b60a6-8aa1-4f7c-87f5-3867fe3b65ae

📥 Commits

Reviewing files that changed from the base of the PR and between 098c776 and e1d57e3.

📒 Files selected for processing (1)
  • bin/lib/onboard.js

Comment on lines +1560 to +1563
const existing = runCapture(
  `docker ps -aq --filter "name=^${containerName}$"`,
  { ignoreError: true },
).trim();
Contributor

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

In Docker CLI, for docker ps --filter "name=...", what is the documented exact-match pattern behavior? Is ^container-name$ always reliable, or is ^/container-name$ (or docker container inspect) recommended for exact name checks?

💡 Result:

  • Documented behavior: For docker ps / docker container ls, the name filter “matches on all or part of a container’s name” (i.e., substring match). Docker’s CLI docs do not document an “exact match” mode for name=. [1]
  • Regex anchoring (^...$) status: Docker’s general filtering docs say some filters support regex, but you must rely on each command’s own reference to know what’s supported—and the docker ps docs for the name filter only describe substring matching, not regex/exact semantics. [2][1]
    ⇒ So --filter "name=^container-name$" is not documented as a reliable exact-match contract.
  • Why people use ^/name$: In the Engine API, container “Names” are represented with a leading slash (e.g., "/coolName"). [4] This is why, in practice, exact-match regex attempts often use ^/container-name$—but that’s still implementation-dependent, not the CLI’s documented promise. [4][1]
  • Recommended exact-name existence check: Use docker inspect --type=container <name> (or docker container inspect <name>). docker inspect is documented to match objects by ID or name (and --type avoids ambiguity with non-container objects). [3]

Bottom line: The documented docker ps --filter name=... behavior is substring matching, not exact matching; for a dependable exact-name check, prefer docker (container) inspect <name> (optionally --type=container). [1][3]

Sources: [1] Docker CLI docker container ls (name filter) docs; [2] Docker “Filter commands” docs; [3] Docker CLI docker inspect docs; [4] Docker Engine API swagger (ContainerSummary.Names examples include leading /).


Replace docker ps substring filter with docker container inspect for reliable exact-name matching (Line 1561).

docker ps --filter "name=..." is documented as substring matching only; regex anchors like ^...$ are not part of the CLI's documented contract and can fail to detect containers. Use docker container inspect <name> for deterministic exact-name/ID lookups.

Suggested hardening
 function cleanupOrphanedGatewayContainer() {
   const containerName = `openshell-cluster-${GATEWAY_NAME}`;
-  const existing = runCapture(
-    `docker ps -aq --filter "name=^${containerName}$"`,
-    { ignoreError: true },
-  ).trim();
-  if (!existing) return false;
+  const exists = run(`docker container inspect ${shellQuote(containerName)} >/dev/null 2>&1`, {
+    ignoreError: true,
+  });
+  if (exists.status !== 0) return false;
   console.log(`  Cleaning up orphaned gateway container (${containerName})...`);
   run(`docker stop ${shellQuote(containerName)} 2>/dev/null || true`, { ignoreError: true });
   run(`docker rm ${shellQuote(containerName)} 2>/dev/null || true`, { ignoreError: true });
   return true;
 }
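
For comparison, the same existence check can be sketched without going through a shell at all. The version below calls `spawnSync` from Node's `child_process` with an argument array; the `containerExists` name and the injectable `spawn` parameter are assumptions added for illustration and are not part of the PR.

```javascript
const { spawnSync } = require("node:child_process");

// Sketch: exact-name existence check via `docker container inspect`.
// The `spawn` parameter is an assumption so the check can be faked in tests.
function containerExists(containerName, spawn = spawnSync) {
  // Passing arguments as an array avoids shell quoting entirely.
  const result = spawn("docker", ["container", "inspect", containerName], {
    stdio: "ignore",
  });
  // inspect exits 0 only when the container is found. If the docker binary
  // is missing, status is null, which is also treated as "not found".
  return result.status === 0;
}

// Fake spawn that knows about exactly one container name.
const fakeSpawn = (_cmd, args) => ({
  status: args[2] === "openshell-cluster-nemoclaw" ? 0 : 1,
});

console.log(containerExists("openshell-cluster-nemoclaw", fakeSpawn));
console.log(containerExists("openshell-cluster-nemoclaw-old", fakeSpawn));
```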
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bin/lib/onboard.js` around lines 1560 - 1563, The current use of runCapture
with `docker ps -aq --filter "name=^${containerName}$"` is unreliable because
docker ps does substring matching; replace that call with a deterministic
inspect call: invoke runCapture(`docker container inspect ${containerName}`, {
ignoreError: true }) (or with a --format to return ID) and treat a successful
inspect (non-empty output / zero exit) as the container existing; update the
variable `existing` (and any downstream logic that uses it) to rely on the
inspect result instead of the substring-matching docker ps output while
preserving error-ignore behavior around `runCapture` and using the
`containerName` identifier.

ericksoa previously approved these changes Apr 8, 2026
Contributor

@ericksoa ericksoa left a comment

Nice fix — this addresses a real user pain point cleanly. Approving as-is.

A few suggestions for a follow-up PR:

  1. docker ps --filter "name=^...$" vs docker container inspect — The ^...$ regex anchoring on Docker's name filter works in practice but isn't documented behavior. Would docker container inspect openshell-cluster-nemoclaw >/dev/null 2>&1 be a more future-proof way to check for the container? Just something to consider.

  2. Scope cleanup to port 8080 only — Right now the cleanup fires for every port that fails its availability check, but the orphaned gateway container only binds 8080. If 18789 is blocked by something else, the "Cleaning up orphaned gateway container..." message could mislead users. Scoping the call to port === 8080 would make the output clearer.

  3. Add a test for the new cleanup path: gateway-cleanup.test.js currently only does static string matching. Could a follow-up add an assertion for cleanupOrphanedGatewayContainer, or better yet, a behavioral test that mocks runCapture/run?

  4. Nit: docker stop + docker rm → docker rm -f? Is the graceful SIGTERM via stop intentional for the orphaned container, or would docker rm -f simplify things?

@ericksoa ericksoa dismissed their stale review April 8, 2026 18:08

Dismissing — CI checks are failing (checks and dco-check). Please fix before re-requesting review.

Contributor

@ericksoa ericksoa left a comment

CI is failing — please fix before re-requesting review:

  • dco-check: Commits need --signoff for DCO compliance.
  • checks: Test suite is failing.

The code feedback from my earlier review still applies — happy to re-approve once CI is green.

Contributor

cv commented Apr 8, 2026

Closing as duplicate of #1567, which has been merged. Thank you for the contribution @yanyunl1991 — both approaches addressed #1582.

@cv cv closed this Apr 8, 2026
Development

Successfully merging this pull request may close these issues.

[NemoClaw][All platforms] Re-onboard does not clean up orphaned gateway container from interrupted Ctrl+C

3 participants