perf(e2e): skip redundant beforeAll onboard when TEST_SUITE=full#1631
perf(e2e): skip redundant beforeAll onboard when TEST_SUITE=full#1631brandonpelfrey merged 10 commits intomainfrom
Conversation
The full E2E test (test-full-e2e.sh) immediately destroys the sandbox created by beforeAll, then runs install.sh which creates its own from scratch. The beforeAll onboard wastes ~6 min (358s) of image build + upload time that is thrown away. Skip the beforeAll onboard when TEST_SUITE=full. Other suites (credential-sanitization, telegram-injection, all) still need it and are unaffected. Expected: TEST_SUITE=full drops from ~19 min to ~13-14 min. Fixes #1629 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughAdds a new Vitest end-to-end test suite that provisions a Brev VM, conditionally runs onboarding or deploy flows, executes remote test scripts over SSH for multiple TEST_SUITE variants, validates outputs, and cleans up the instance; also updates a regression guard to reference the new test file path. Changes
Sequence Diagram(s)sequenceDiagram
participant Runner as Test Runner
participant Brev as Brev CLI
participant VM as VM Instance
participant SSH as SSH Session
participant Remote as Remote Script
participant Registry as Sandbox Registry
Runner->>Runner: Validate env vars & brev auth
alt TEST_SUITE=deploy-cli
Runner->>Runner: Run local 'nemoclaw deploy' flow
else create/provision
Runner->>Brev: brev search cpu
Brev-->>Runner: CPU image
Runner->>Brev: brev create --startup-script (or URL)
Brev-->>VM: Provision VM
loop Wait for SSH
Runner->>VM: Poll SSH readiness
VM-->>Runner: SSH ready
end
Runner->>SSH: Download/execute startup script (if applicable)
SSH-->>VM: Run setup
loop Wait for sentinel
Runner->>VM: Check sentinel file
VM-->>Runner: Sentinel present
end
alt TEST_SUITE != full
Runner->>Brev: Start 'nemoclaw onboard' (background)
loop Poll sandbox
Runner->>Brev: openshell sandbox list
Brev-->>Runner: Sandbox status
end
Runner->>Registry: Verify sandbox registry (if applicable)
end
end
Runner->>SSH: Execute remote test script (per TEST_SUITE)
SSH-->>Runner: Streamed test output
Runner->>Runner: Assert 'PASS' present and no 'FAIL:'
alt KEEP_ALIVE != true
Runner->>Brev: brev delete instance (multi-attempt)
Brev-->>VM: Delete
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
🧹 Nitpick comments (1)
test/e2e/brev-e2e.test.js (1)
625-638: Consider extracting the skip condition for consistency.The condition
TEST_SUITE !== "full"is duplicated here becauseneedsBeforeAllSandboxis scoped inside theelseblock. This works correctly, but extracting the condition to the top ofbeforeAllwould improve maintainability if the skip logic ever needs to change.♻️ Optional: Hoist the condition for single source of truth
beforeAll(() => { const bootstrapStart = Date.now(); const elapsed = () => `${Math.round((Date.now() - bootstrapStart) / 1000)}s`; + // Skip beforeAll sandbox creation for full suite (it does its own via install.sh) + const needsBeforeAllSandbox = TEST_SUITE !== "full"; // Pre-cleanup: delete any leftover instance with the same name. // ...existing code...Then at line 461 and line 626, reference the same variable:
- if (TEST_SUITE !== "full") { + if (needsBeforeAllSandbox) { console.log(`[${elapsed()}] Verifying sandbox registry...`);🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/brev-e2e.test.js` around lines 625 - 638, Extract the duplicated condition TEST_SUITE !== "full" into a single boolean variable (e.g., needsBeforeAllSandbox) declared near the top of the test file or at the start of the beforeAll block, then replace the direct uses of TEST_SUITE !== "full" (including the check inside beforeAll and the registry verification block) with that variable; update references to TEST_SUITE in the sandbox-related checks to use needsBeforeAllSandbox so the skip logic is centralized and consistent across beforeAll and the sandbox registry verification.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@test/e2e/brev-e2e.test.js`:
- Around line 625-638: Extract the duplicated condition TEST_SUITE !== "full"
into a single boolean variable (e.g., needsBeforeAllSandbox) declared near the
top of the test file or at the start of the beforeAll block, then replace the
direct uses of TEST_SUITE !== "full" (including the check inside beforeAll and
the registry verification block) with that variable; update references to
TEST_SUITE in the sandbox-related checks to use needsBeforeAllSandbox so the
skip logic is centralized and consistent across beforeAll and the sandbox
registry verification.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 2b80321b-293a-4953-8335-0a61a56dd898
📒 Files selected for processing (1)
test/e2e/brev-e2e.test.js
|
✅ Brev E2E (full): PASSED on branch |
|
✅ Brev E2E (full): PASSED on branch |
The Brev CLI v0.6.322 does not read BREV_API_TOKEN from the environment — it requires ~/.brev/credentials.json. Without it, `brev ls` fails in the test harness and the entire E2E suite skips. Write the credentials file during the "Install Brev CLI" step so the `hasAuthenticatedBrev` check in brev-e2e.test.js passes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
🧹 Nitpick comments (1)
.github/workflows/e2e-brev.yaml (1)
155-159: Harden credentials file creation (valid JSON + fail-fast + file perms).This block should defensively validate the token, serialize JSON safely, and lock down permissions on
~/.brev/credentials.json.🔧 Proposed patch
- mkdir -p ~/.brev - printf '{"refresh_token":"%s"}' "$BREV_API_TOKEN" > ~/.brev/credentials.json + test -n "$BREV_API_TOKEN" || { echo "BREV_API_TOKEN is required"; exit 1; } + mkdir -p ~/.brev + python - <<'PY' +import json, os, pathlib +path = pathlib.Path.home() / ".brev" / "credentials.json" +path.write_text(json.dumps({"refresh_token": os.environ["BREV_API_TOKEN"]}), encoding="utf-8") +os.chmod(path, 0o600) +PY🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.github/workflows/e2e-brev.yaml around lines 155 - 159, Ensure the BREv credentials creation is defensive: validate that the BREv token variable (BREV_API_TOKEN) is non-empty and exit with an error if missing, serialize JSON using a safe serializer (e.g., jq -n --arg refresh_token "$BREV_API_TOKEN" '{refresh_token:$refresh_token}') rather than naive printf to guarantee valid JSON and proper escaping, write to ~/.brev/credentials.json only after creating ~/.brev (mkdir -p ~/.brev) and ensure the write succeeded, and tighten file permissions with chmod 600 ~/.brev/credentials.json so credentials are not world-readable; also run in fail-fast mode (set -e) or explicitly check each command’s exit status and abort on failure.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In @.github/workflows/e2e-brev.yaml:
- Around line 155-159: Ensure the BREv credentials creation is defensive:
validate that the BREv token variable (BREV_API_TOKEN) is non-empty and exit
with an error if missing, serialize JSON using a safe serializer (e.g., jq -n
--arg refresh_token "$BREV_API_TOKEN" '{refresh_token:$refresh_token}') rather
than naive printf to guarantee valid JSON and proper escaping, write to
~/.brev/credentials.json only after creating ~/.brev (mkdir -p ~/.brev) and
ensure the write succeeded, and tighten file permissions with chmod 600
~/.brev/credentials.json so credentials are not world-readable; also run in
fail-fast mode (set -e) or explicitly check each command’s exit status and abort
on failure.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 616fe699-e66b-448c-9032-6acc49ac68ec
📒 Files selected for processing (1)
.github/workflows/e2e-brev.yaml
cv
left a comment
There was a problem hiding this comment.
LGTM — security review WARNING (non-blocking).
Minor suggestions:
- Consider
jq -n --arg t "$BREV_API_TOKEN" '{"refresh_token":$t}'instead ofprintffor safer JSON generation chmod 600 ~/.brev/credentials.jsonfor good hygiene (ephemeral runner, so low risk)
Otherwise clean:
- No false-negative risk — full suite exercises entire install-to-inference pipeline independently
- No test isolation regression — other suites still run full beforeAll
- Credential handling follows existing workflow pattern
- CI failure is pre-existing (ESLint issue from #1564, fix in #1634)
No blocking concerns.
|
Thanks for the PR. Since this branch does not allow maintainer edits, I can't port it across the recently merged mechanical JS→TS migration stack for you. Please merge or rebase If your branch touches migrated root CLI files or renamed root tests, please run: git fetch origin
git merge origin/main
npm run ts-migration:assist -- --base origin/main --write
npm run build:cli
npm run typecheck:cli
npm run lint
npm testThen push the updated branch. If your branch does not touch migrated root CLI paths, a normal merge/rebase from |
Keep main's hardened Brev credential setup (token validation, chmod 600, onboarding_step.json, brev ls smoke check). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The TS migration guard blocks edits to test/*.test.js files. Move the skip-redundant-beforeAll changes to the .ts counterpart and reset the .js file to match main. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
🧹 Nitpick comments (5)
test/e2e/brev-e2e.test.ts (5)
80-86: Consider logging errors instead of silently returning empty array.Swallowing all exceptions makes it harder to debug when
brev lsfails unexpectedly (e.g., auth expiry, CLI issues). A debug log would help:♻️ Log errors for debugging
function listBrevInstances() { try { return JSON.parse(brev("ls", "--json")); - } catch { + } catch (err) { + console.debug(`[listBrevInstances] brev ls failed: ${err.message}`); return []; } }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/brev-e2e.test.ts` around lines 80 - 86, The function listBrevInstances currently swallows all exceptions and returns an empty array, which hides failures from brev("ls", "--json"); update the catch to capture the error (e) and log a useful debug message (e.g., include context "listBrevInstances: brev ls failed" and the error) before returning []; reference the listBrevInstances function and the brev("ls", "--json") invocation and use the existing test logger or console.error for the log so failures are visible during test runs.
1-1: Consider removing@ts-nocheckand adding proper type annotations.Using
@ts-nocheckin a TypeScript file disables all type checking, negating the benefits of the TS migration. Variables likeremoteDir(line 63) and function parameters throughout lack type annotations, which could hide runtime type errors.If full typing is deferred, consider at minimum adding types for the key helper functions and module-level variables, then replacing
@ts-nocheckwith targeted@ts-expect-errorcomments for specific untyped areas.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/brev-e2e.test.ts` at line 1, The test file starts with a blanket "@ts-nocheck" which disables all TypeScript checking; remove that and add targeted types for key module-level variables and helper functions (e.g., type remoteDir, any async helpers and function parameters used in the test suite) so the compiler can validate them; if there are small, unavoidable type errors keep the file checked-in by replacing "@ts-nocheck" with specific "@ts-expect-error" comments only where necessary and export or annotate helper functions and variables (like remoteDir, any setup/teardown helpers, and test utility functions) with appropriate types/interfaces to restore meaningful type checking.
123-132: Consider validatingINSTANCE_NAMEto prevent shell injection.
INSTANCE_NAMEis interpolated directly into the shell command at line 128 without escaping. While this is CI-controlled, adding validation ensures safety if the value comes from untrusted sources in the future:🛡️ Add validation for INSTANCE_NAME
const INSTANCE_NAME = process.env.INSTANCE_NAME; +// Validate INSTANCE_NAME contains only safe characters (alphanumeric, dash, underscore) +if (INSTANCE_NAME && !/^[\w-]+$/.test(INSTANCE_NAME)) { + throw new Error(`Invalid INSTANCE_NAME: must contain only alphanumeric, dash, or underscore characters`); +}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/brev-e2e.test.ts` around lines 123 - 132, The ssh helper interpolates INSTANCE_NAME directly into a shell command, risking shell injection; add validation in the ssh function before building the command: verify INSTANCE_NAME is present and matches a safe pattern (e.g. permitted characters like alphanumerics, dots, dashes, underscores, @ and colon/port) or otherwise throw an error, or alternatively switch to a non-shell exec variant that accepts argv; update the validation logic near the top of function ssh and reference the INSTANCE_NAME variable to ensure only allowed values are used.
644-655: Consider adding a timeout toafterAllto prevent indefinite hangs.If
deleteBrevInstanceencounters issues, the test could hang indefinitely. Vitest allows specifying a timeout forafterAll:♻️ Add timeout to afterAll
- afterAll(() => { + afterAll(async () => { if (!instanceCreated) return; // ... existing logic ... - }); + }, 300_000); // 5 min timeout for cleanup🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/brev-e2e.test.ts` around lines 644 - 655, The afterAll cleanup can hang if deleteBrevInstance stalls; change the call to use Vitest's timeout signature by passing a timeout (e.g. 30_000 ms) as the second argument to afterAll and make the callback async so you can await deleteBrevInstance(INSTANCE_NAME) if it returns a promise; also wrap the delete call in try/catch and rethrow on failure so the test teardown fails fast (refer to afterAll, deleteBrevInstance, INSTANCE_NAME, process.env.KEEP_ALIVE).
76-78: Consider using a native synchronous sleep instead of spawning a shell process.
execSync("sleep ...")forks a shell process for each sleep. A lighter approach:♻️ Native synchronous sleep
function sleep(seconds) { - execSync(`sleep ${seconds}`); + Atomics.wait(new Int32Array(new SharedArrayBuffer(4)), 0, 0, seconds * 1000); }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/brev-e2e.test.ts` around lines 76 - 78, The sleep function currently spawns a shell via execSync which is heavy; replace the execSync(`sleep ${seconds}`) call in the sleep function with a native synchronous delay using Atomics.wait on a SharedArrayBuffer (compute ms = seconds*1000 and call Atomics.wait on a 1-int32 buffer with the timeout) so the function blocks synchronously without forking a process; update any imports/usages if necessary and keep the function name sleep unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@test/e2e/brev-e2e.test.ts`:
- Around line 80-86: The function listBrevInstances currently swallows all
exceptions and returns an empty array, which hides failures from brev("ls",
"--json"); update the catch to capture the error (e) and log a useful debug
message (e.g., include context "listBrevInstances: brev ls failed" and the
error) before returning []; reference the listBrevInstances function and the
brev("ls", "--json") invocation and use the existing test logger or
console.error for the log so failures are visible during test runs.
- Line 1: The test file starts with a blanket "@ts-nocheck" which disables all
TypeScript checking; remove that and add targeted types for key module-level
variables and helper functions (e.g., type remoteDir, any async helpers and
function parameters used in the test suite) so the compiler can validate them;
if there are small, unavoidable type errors keep the file checked-in by
replacing "@ts-nocheck" with specific "@ts-expect-error" comments only where
necessary and export or annotate helper functions and variables (like remoteDir,
any setup/teardown helpers, and test utility functions) with appropriate
types/interfaces to restore meaningful type checking.
- Around line 123-132: The ssh helper interpolates INSTANCE_NAME directly into a
shell command, risking shell injection; add validation in the ssh function
before building the command: verify INSTANCE_NAME is present and matches a safe
pattern (e.g. permitted characters like alphanumerics, dots, dashes,
underscores, @ and colon/port) or otherwise throw an error, or alternatively
switch to a non-shell exec variant that accepts argv; update the validation
logic near the top of function ssh and reference the INSTANCE_NAME variable to
ensure only allowed values are used.
- Around line 644-655: The afterAll cleanup can hang if deleteBrevInstance
stalls; change the call to use Vitest's timeout signature by passing a timeout
(e.g. 30_000 ms) as the second argument to afterAll and make the callback async
so you can await deleteBrevInstance(INSTANCE_NAME) if it returns a promise; also
wrap the delete call in try/catch and rethrow on failure so the test teardown
fails fast (refer to afterAll, deleteBrevInstance, INSTANCE_NAME,
process.env.KEEP_ALIVE).
- Around line 76-78: The sleep function currently spawns a shell via execSync
which is heavy; replace the execSync(`sleep ${seconds}`) call in the sleep
function with a native synchronous delay using Atomics.wait on a
SharedArrayBuffer (compute ms = seconds*1000 and call Atomics.wait on a 1-int32
buffer with the timeout) so the function blocks synchronously without forking a
process; update any imports/usages if necessary and keep the function name sleep
unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 331b3164-4ad4-45e7-80fe-7dedc213f131
📒 Files selected for processing (1)
test/e2e/brev-e2e.test.ts
The TS migration guard blocks edits to test/*.test.js files. Rename .js → .ts, apply the skip-redundant-beforeAll changes to the canonical TypeScript path, and update runner.test.ts references to the new filename. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…b.com/NVIDIA/NemoClaw into perf/skip-redundant-beforeall-onboard
The merge with remote reverted the skip-redundant-beforeAll changes. Re-apply them to the .ts file after the merge. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
beforeAllonboard whenTEST_SUITE=full— the full test immediately destroys the sandbox and rebuilds from scratch viainstall.sh --non-interactivecredential-sanitization,telegram-injection,all) are unaffected — they still get thebeforeAllsandboxTEST_SUITE=fulldrops from ~19 min to ~13-14 min (~5 min saved)Evidence
From run 23957694836:
install.sh --non-interactivewhich creates its ownThe beforeAll onboard is redundant for the
fullsuite.Test plan
TEST_SUITE=fullE2E — confirm all 17 checks still PASSTEST_SUITE=credential-sanitizationE2E — confirm beforeAll onboard still runsFixes #1629
Summary by CodeRabbit