Skip to content
Closed
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
35 changes: 32 additions & 3 deletions bin/lib/onboard.js
Original file line number Diff line number Diff line change
Expand Up @@ -1539,14 +1539,34 @@ function destroyGateway() {
if (destroyResult.status === 0) {
registry.clearAll();
}
// openshell gateway destroy doesn't remove Docker volumes, which leaves
// corrupted cluster state that breaks the next gateway start. Clean them up.
// openshell gateway destroy may leave behind Docker containers (e.g. after a
// Ctrl+C interrupt during gateway start) and volumes. Clean both up so the
// next gateway start doesn't hit port conflicts or corrupted state. (#1582)
cleanupOrphanedGatewayContainer();
run(
`docker volume ls -q --filter "name=openshell-cluster-${GATEWAY_NAME}" | grep . && docker volume ls -q --filter "name=openshell-cluster-${GATEWAY_NAME}" | xargs docker volume rm || true`,
{ ignoreError: true },
);
}

/**
* Stop and remove any orphaned Docker container left by a previous gateway
* that OpenShell no longer tracks (e.g. after Ctrl+C during gateway start).
* Returns true if a container was found and removed. (#1582)
*/
function cleanupOrphanedGatewayContainer() {
const containerName = `openshell-cluster-${GATEWAY_NAME}`;
const existing = runCapture(
`docker ps -aq --filter "name=^${containerName}$"`,
{ ignoreError: true },
).trim();
Comment on lines +1560 to +1563
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

In Docker CLI, for docker ps --filter "name=...", what is the documented exact-match pattern behavior? Is ^container-name$always reliable, or is^/container-name$(ordocker container inspect ) recommended for exact name checks?

💡 Result:

  • Documented behavior: For docker ps / docker container ls, the name filter “matches on all or part of a container’s name” (i.e., substring match). Docker’s CLI docs do not document an “exact match” mode for name=. [1]
  • Regex anchoring (^...$) status: Docker’s general filtering docs say some filters support regex, but you must rely on each command’s own reference to know what’s supported—and the docker ps docs for the name filter only describe substring matching, not regex/exact semantics. [2][1]
    ⇒ So --filter "name=^container-name$" is not documented as a reliable exact-match contract.
  • Why people use ^/name$: In the Engine API, container “Names” are represented with a leading slash (e.g., "/coolName"). [4] This is why, in practice, exact-match regex attempts often use ^/container-name$—but that’s still implementation-dependent, not the CLI’s documented promise. [4][1]
  • Recommended exact-name existence check: Use docker inspect --type=container <name> (or docker container inspect <name>). docker inspect is documented to match objects by ID or name (and --type avoids ambiguity with non-container objects). [3]

Bottom line: The documented docker ps --filter name=... behavior is substring matching, not exact matching; for a dependable exact-name check, prefer docker (container) inspect <name> (optionally --type=container). [1][3]

Sources: [1] Docker CLI docker container ls (name filter) docs; [2] Docker “Filter commands” docs; [3] Docker CLI docker inspect docs; [4] Docker Engine API swagger (ContainerSummary.Names examples include leading /).


Replace docker ps substring filter with docker container inspect for reliable exact-name matching (Line 1561).

docker ps --filter "name=..." is documented as substring matching only; regex anchors like ^...$ are not part of the CLI's documented contract and can fail to detect containers. Use docker container inspect <name> for deterministic exact-name/ID lookups.

Suggested hardening
 function cleanupOrphanedGatewayContainer() {
   const containerName = `openshell-cluster-${GATEWAY_NAME}`;
-  const existing = runCapture(
-    `docker ps -aq --filter "name=^${containerName}$"`,
-    { ignoreError: true },
-  ).trim();
-  if (!existing) return false;
+  const exists = run(`docker container inspect ${shellQuote(containerName)} >/dev/null 2>&1`, {
+    ignoreError: true,
+  });
+  if (exists.status !== 0) return false;
   console.log(`  Cleaning up orphaned gateway container (${containerName})...`);
   run(`docker stop ${shellQuote(containerName)} 2>/dev/null || true`, { ignoreError: true });
   run(`docker rm ${shellQuote(containerName)} 2>/dev/null || true`, { ignoreError: true });
   return true;
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@bin/lib/onboard.js` around lines 1560 - 1563, The current use of runCapture
with `docker ps -aq --filter "name=^${containerName}$"` is unreliable because
docker ps does substring matching; replace that call with a deterministic
inspect call: invoke runCapture(`docker container inspect ${containerName}`, {
ignoreError: true }) (or with a --format to return ID) and treat a successful
inspect (non-empty output / zero exit) as the container existing; update the
variable `existing` (and any downstream logic that uses it) to rely on the
inspect result instead of the substring-matching docker ps output while
preserving error-ignore behavior around `runCapture` and using the
`containerName` identifier.

if (!existing) return false;
console.log(` Cleaning up orphaned gateway container (${containerName})...`);
run(`docker stop ${shellQuote(containerName)} 2>/dev/null || true`, { ignoreError: true });
run(`docker rm ${shellQuote(containerName)} 2>/dev/null || true`, { ignoreError: true });
return true;
}

async function ensureNamedCredential(envName, label, helpUrl = null) {
let key = getCredential(envName);
if (key) {
Expand Down Expand Up @@ -1704,12 +1724,21 @@ async function preflight() {
{ port: 18789, label: "NemoClaw dashboard" },
];
for (const { port, label } of requiredPorts) {
const portCheck = await checkPortAvailable(port);
let portCheck = await checkPortAvailable(port);
if (!portCheck.ok) {
if ((port === 8080 || port === 18789) && gatewayReuseState === "healthy") {
console.log(` ✓ Port ${port} already owned by healthy NemoClaw runtime (${label})`);
continue;
}
// Auto-cleanup orphaned gateway container that may be holding the port
// (e.g. after Ctrl+C during a previous onboard). (#1582)
if (cleanupOrphanedGatewayContainer()) {
portCheck = await checkPortAvailable(port);
if (portCheck.ok) {
console.log(` ✓ Port ${port} available after orphaned container cleanup (${label})`);
continue;
}
}
console.error("");
console.error(` !! Port ${port} is not available.`);
console.error(` ${label} needs this port.`);
Expand Down
Loading