Skip to content

LCORE-1589: Human-in-the-Loop Spike#1459

Open
jrobertboos wants to merge 6 commits intolightspeed-core:mainfrom
jrobertboos:lcore-1589-spike-human-in-the-loop
Open

LCORE-1589: Human-in-the-Loop Spike#1459
jrobertboos wants to merge 6 commits intolightspeed-core:mainfrom
jrobertboos:lcore-1589-spike-human-in-the-loop

Conversation

@jrobertboos
Copy link
Copy Markdown
Contributor

@jrobertboos jrobertboos commented Apr 6, 2026

Description

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement
  • Benchmarks improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Assisted-by: (e.g., Claude, CodeRabbit, Ollama, etc., N/A if not used)
  • Generated by: (e.g., tool name and version; N/A if not used)

Related Tickets & Documents

  • Related Issue #
  • Closes #

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • Documentation
    • Added comprehensive Human-in-the-Loop design docs and a design spike describing approval-gated control flow for MCP tool calls.
    • Defines client-visible behaviors: asynchronous "requires_action" responses, streaming approval events, list/view/submit approval APIs, per-request approval scope, TTL/expiration, and approval timeouts.
    • Documents expected responses, error cases, auth boundaries, storage/cleanup, backward-compatible defaults, and next implementation/test/operator work items.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai bot commented Apr 6, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review

Walkthrough

Adds two documentation-only Human-in-the-Loop design artifacts specifying an asynchronous per-invocation MCP approval flow: new requires_action response shape, required_action payload, /approvals APIs, approval persistence with TTL, config options (including require_approval default "never"), auth/error semantics, and an implementation roadmap.

Changes

Cohort / File(s) Summary
Human-in-the-Loop Design Docs
docs/design/human-in-the-loop/human-in-the-loop-spike.md, docs/design/human-in-the-loop/human-in-the-loop.md
Adds a spike and full spec for async MCP approval: behavior when mcp_approval_request is emitted, HTTP 200 + status:"requires_action" with required_action payload, new /approvals endpoints (GET /approvals, GET /approvals/{id}, POST /approvals/{id}), approval persistence (cache-backed TTL or new approval_requests table), approval timeout default (300s), per-request scoping, allow/deny permanence rules, authz and error matrix, streaming behavior, and a list of implementation, testing, and docs work items.

Sequence Diagram(s)

sequenceDiagram
  participant Client as Client
  participant LCS as LCS (server)
  participant Llama as Llama Stack (agent)
  participant Store as Approval Storage (cache/DB)
  participant MCP as MCP Tool/Server

  Client->>LCS: query/stream request (may invoke MCP)
  LCS->>Llama: forward request
  Llama->>LCS: emits mcp_approval_request (requires_action)
  LCS->>Store: persist approval request (TTL)
  LCS-->>Client: HTTP 200 / stream event with status:"requires_action" + required_action
  Client->>LCS: POST /approvals/{id} (approve/deny)
  LCS->>Store: validate & fetch approval request
  LCS->>Llama: send mcp_approval_response (approved/denied)
  alt approved
    Llama->>MCP: resume/execute MCP call
    MCP->>Llama: tool response
    Llama->>LCS: final output / continue stream
    LCS-->>Client: return/stream result
  else denied
    Llama->>LCS: handle denial (terminate or continue)
    LCS-->>Client: return error/status update
  end
Loading

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'LCORE-1589: Human-in-the-Loop Spike' directly references the spike document that is the core change in this PR, accurately reflecting the main contribution.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
✨ Simplify code
  • Create PR with simplified code

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@jrobertboos jrobertboos marked this pull request as ready for review April 6, 2026 15:41
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 6

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/design/human-in-the-loop/human-in-the-loop-spike.md`:
- Around line 137-139: Replace all placeholder ticket keys "LCORE-????" in this
document (notably in the story header "Add `require_approval` configuration for
MCP servers" and the other repeated headings/keys) with real issue keys or
explicitly mark them as TODO outside merge scope; update every instance so the
design links to a traceable ticket and regenerate any related anchors/README
references if used to ensure cross-references remain valid.
- Line 72: Fix the markdown fence lint violations by adding explicit language
identifiers to all fenced code blocks and ensuring required blank lines before
and after each fence; e.g., replace bare/unnamed fences near the table row
containing | B. "always" | Secure by default, opt-out for trusted tools | and
the suggested fences like ```text and ```sql with properly spaced fences (blank
line above and below) and add a language tag (e.g., ```text, ```sql) so every
fenced block in human-in-the-loop-spike.md (also at the other listed
occurrences) follows markdownlint MD040/MD031 rules.

In `@docs/design/human-in-the-loop/human-in-the-loop.md`:
- Around line 337-355: The docs introduce a new "requires_action" response shape
for /query and /streaming_query (payload keys like conversation_id,
status="requires_action", required_action.type="mcp_approval", approvals[] with
id/server_label/tool_name/arguments/expires_at) but the doc still claims "No API
changes"; update the API contract by explicitly modeling the response as a
union/oneOf (e.g., QueryResponse = oneOf{QueryResult, RequiresActionResult}) or
call out version-gating; clearly list both shapes and add OpenAPI/typed-client
guidance so consumers of query and streaming_query know to handle both variants
(or upgrade the endpoint version).
- Around line 307-309: Choose one authorization error semantics (either "403
Forbidden for unauthorized/ownership failures" or "404 Not Found for not owned
by user") and apply it consistently: update the lines containing "Response
(404): Approval not found or not owned by user" and any other endpoint
descriptions that mention ownership/authorization to use the chosen status, and
update the error matrix entry that currently lists 403 so it matches. Search for
the exact phrase "Approval not found or not owned by user" and the error-matrix
rows that mention 403/404 (sections referenced around the existing text) and
change the status codes and any explanatory text so all endpoint sections and
the matrix use the same policy.
- Around line 392-393: Update the phrasing in the "5. **Rate limiting**" bullet
so the sentence reads "Approval endpoints should be rate-limited to prevent
abuse (handled by existing rate limiting infrastructure)." — find the bullet
titled "Rate limiting" and replace the unhyphenated "should be rate limited" (or
similar) with the hyphenated "should be rate-limited" to ensure grammatical
consistency.
- Around line 93-103: The approval continuation flow is ambiguous: clarify
whether POST /approvals/{id} resumes the agent and returns the assistant's final
output or if the client must call /query again. Update human-in-the-loop.md (and
the related diagram lines ~310-329) to explicitly state the contract for POST
/approvals/{id}: either (A) POST /approvals/{id} returns 200 with approval
metadata plus resumed agent output in a named field (e.g., mcp_approval_response
includes "resumed_output" or "final_response" and any tool result), or (B) POST
/approvals/{id} returns only approval status metadata and the client is required
to re-issue a GET/POST to /query to retrieve the agent’s continued output. Pick
one behavior, update the sequence diagram labels (mcp_approval_response, tool
result + response) and document exact JSON shape and status codes for the chosen
contract so clients know how to continue.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 9d842f74-dc75-4099-be6a-11e98dcf5265

📥 Commits

Reviewing files that changed from the base of the PR and between 7b29914 and 6d6f244.

📒 Files selected for processing (2)
  • docs/design/human-in-the-loop/human-in-the-loop-spike.md
  • docs/design/human-in-the-loop/human-in-the-loop.md
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
  • GitHub Check: E2E: server mode / ci
🧰 Additional context used
🪛 LanguageTool
docs/design/human-in-the-loop/human-in-the-loop.md

[style] ~61-~61: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...re interaction - U3: As a user, I want to see what a tool will do before it runs,...

(REP_WANT_TO_VB)


[style] ~63-~63: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...unwanted changes - U4: As a user, I want to deny a tool invocation, so that I can s...

(REP_WANT_TO_VB)


[grammar] ~393-~393: Use a hyphen to join words.
Context: ...event abuse (handled by existing rate limiting infrastructure). ### Migration...

(QB_NEW_EN_HYPHEN)

🪛 markdownlint-cli2 (0.22.0)
docs/design/human-in-the-loop/human-in-the-loop-spike.md

[warning] 72-72: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 128-128: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 150-150: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 272-272: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 294-294: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 315-315: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 323-323: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 338-338: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 358-358: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

| Option | Description |
|--------|-------------|
| A. `"never"` (current behavior) | Backwards compatible, opt-in HIL |
| B. `"always"` | Secure by default, opt-out for trusted tools |
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fix markdown fence lint violations (MD040/MD031).

There is at least one fenced block without a language (Line 72) and multiple fences missing required blank lines around them. This will keep docs lint noisy/failing where markdownlint is enforced.

Suggested doc cleanup
-```
+```text
 ...
-```
+```

+ 
 ```sql
 ...
</details>




Also applies to: 128-128, 150-150, 272-272, 294-294, 315-315, 323-323, 338-338, 358-358

<details>
<summary>🧰 Tools</summary>

<details>
<summary>🪛 markdownlint-cli2 (0.22.0)</summary>

[warning] 72-72: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

</details>

</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed.

In @docs/design/human-in-the-loop/human-in-the-loop-spike.md at line 72, Fix the
markdown fence lint violations by adding explicit language identifiers to all
fenced code blocks and ensuring required blank lines before and after each
fence; e.g., replace bare/unnamed fences near the table row containing | B.
"always" | Secure by default, opt-out for trusted tools | and the suggested
fences like text and sql with properly spaced fences (blank line above and
below) and add a language tag (e.g., text, sql) so every fenced block in
human-in-the-loop-spike.md (also at the other listed occurrences) follows
markdownlint MD040/MD031 rules.


</details>

<!-- fingerprinting:phantom:triton:hawk:9b03e7ec-f5cf-4fea-8293-25b4eb3bd5f8 -->

<!-- This is an auto-generated comment by CodeRabbit -->

Comment on lines +137 to +139
<!-- type: Story -->
<!-- key: LCORE-???? -->
### LCORE-???? Add `require_approval` configuration for MCP servers
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Replace placeholder ticket keys before merge.

LCORE-???? placeholders make the execution plan non-actionable and break traceability from design to delivery. Please resolve these to real issue keys (or explicitly mark as TODO outside merge scope).

Also applies to: 167-169, 194-196, 222-224, 250-253, 275-277, 301-304

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/design/human-in-the-loop/human-in-the-loop-spike.md` around lines 137 -
139, Replace all placeholder ticket keys "LCORE-????" in this document (notably
in the story header "Add `require_approval` configuration for MCP servers" and
the other repeated headings/keys) with real issue keys or explicitly mark them
as TODO outside merge scope; update every instance so the design links to a
traceable ticket and regenerate any related anchors/README references if used to
ensure cross-references remain valid.

Comment on lines +93 to +103
│ │ POST /approvals/{id} │ │
│ │ ────────────────────────────► │
│ │ {approve: true} │ │
│ │ │ │
│ │ │ ───── mcp_approval_response ──────────►│
│ │ │ │
│ │ │ ◄────── tool result + response ────────┤
│ │ │ │
│ │ ◄─── 200 OK ────────────────┤ │
│ │ (final response) │ │
│ │
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Approval continuation flow is ambiguous after POST /approvals/{id}.

The diagram implies a final assistant response is returned after approval, but the endpoint example returns only approval status metadata. Please define the exact continuation contract (e.g., whether client must call /query again, or approval endpoint can return resumed agent output).

Also applies to: 310-329

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/design/human-in-the-loop/human-in-the-loop.md` around lines 93 - 103,
The approval continuation flow is ambiguous: clarify whether POST
/approvals/{id} resumes the agent and returns the assistant's final output or if
the client must call /query again. Update human-in-the-loop.md (and the related
diagram lines ~310-329) to explicitly state the contract for POST
/approvals/{id}: either (A) POST /approvals/{id} returns 200 with approval
metadata plus resumed agent output in a named field (e.g., mcp_approval_response
includes "resumed_output" or "final_response" and any tool result), or (B) POST
/approvals/{id} returns only approval status metadata and the client is required
to re-issue a GET/POST to /query to retrieve the agent’s continued output. Pick
one behavior, update the sequence diagram labels (mcp_approval_response, tool
result + response) and document exact JSON shape and status codes for the chosen
contract so clients know how to continue.

Comment on lines +307 to +309
Response (404): Approval not found or not owned by user
Response (410): Approval expired

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Resolve conflicting authorization error semantics (404 vs 403).

Line 307/331 state “not found or not owned by user” as 404, but the error matrix includes 403 forbidden for unauthorized access. Pick one policy and apply it consistently across endpoint sections and the matrix.

Also applies to: 331-334, 369-375

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/design/human-in-the-loop/human-in-the-loop.md` around lines 307 - 309,
Choose one authorization error semantics (either "403 Forbidden for
unauthorized/ownership failures" or "404 Not Found for not owned by user") and
apply it consistently: update the lines containing "Response (404): Approval not
found or not owned by user" and any other endpoint descriptions that mention
ownership/authorization to use the chosen status, and update the error matrix
entry that currently lists 403 so it matches. Search for the exact phrase
"Approval not found or not owned by user" and the error-matrix rows that mention
403/404 (sections referenced around the existing text) and change the status
codes and any explanatory text so all endpoint sections and the matrix use the
same policy.

Comment on lines +337 to +355
Query and streaming_query endpoints may return:
```json
{
"conversation_id": "conv_xyz",
"status": "requires_action",
"required_action": {
"type": "mcp_approval",
"approvals": [
{
"id": "apr_abc123",
"server_label": "github-mcp",
"tool_name": "create_issue",
"arguments": {"title": "Bug report", "body": "..."},
"expires_at": "2026-04-01T10:05:00Z"
}
]
}
}
```
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Clarify response contract and versioning for requires_action.

This introduces a second response shape for /query/streaming while Line 401 says “No API changes to existing endpoints.” That is effectively an API contract change for typed clients and OpenAPI consumers. Please explicitly define this as a union/oneOf response model (or version-gate it) to avoid runtime/client breakage.

Also applies to: 397-402

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/design/human-in-the-loop/human-in-the-loop.md` around lines 337 - 355,
The docs introduce a new "requires_action" response shape for /query and
/streaming_query (payload keys like conversation_id, status="requires_action",
required_action.type="mcp_approval", approvals[] with
id/server_label/tool_name/arguments/expires_at) but the doc still claims "No API
changes"; update the API contract by explicitly modeling the response as a
union/oneOf (e.g., QueryResponse = oneOf{QueryResult, RequiresActionResult}) or
call out version-gating; clearly list both shapes and add OpenAPI/typed-client
guidance so consumers of query and streaming_query know to handle both variants
(or upgrade the endpoint version).

Comment thread docs/design/human-in-the-loop/human-in-the-loop.md
Comment thread docs/design/human-in-the-loop/human-in-the-loop-spike.md Outdated
Comment thread docs/design/human-in-the-loop/human-in-the-loop-spike.md Outdated
Comment thread docs/design/human-in-the-loop/human-in-the-loop-spike.md
Comment on lines +74 to +75
**Recommendation**: Option A (`"never"`). Maintains backwards compatibility.
Operators explicitly enable HIL for servers with write operations.
Copy link
Copy Markdown
Contributor

@anik120 anik120 Apr 6, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we concerned with backward compatibility yet? If we're still in the prototype landscape with our clients, is it better to switch this to "always" so that this change is more immediately surfaced to our clients?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we should maintain backwards compatibility instead of introducing any breaking changes.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tisnik do you have any thoughts on this?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unfortunately we are now in "stable" and "long term support" state, so any breaking changes CAN be made, but they should be configurable (and opt-in, not opt-out). @jrobertboos

Comment thread docs/design/human-in-the-loop/human-in-the-loop.md Outdated
Comment thread docs/design/human-in-the-loop/human-in-the-loop.md
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (4)
docs/design/human-in-the-loop/human-in-the-loop.md (3)

93-103: ⚠️ Potential issue | 🟠 Major

Clarify the continuation contract after POST /approvals/{id}.

The doc still implies two behaviors (approval endpoint returns final response vs client must re-query). Define one explicit contract and keep examples/diagram consistent.

Also applies to: 322-329, 556-558

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/design/human-in-the-loop/human-in-the-loop.md` around lines 93 - 103,
The documentation is ambiguous about whether POST /approvals/{id} returns the
final response or requires the client to re-query; pick one behavior and update
the contract and diagrams to match: if you choose synchronous final response,
state that POST /approvals/{id} returns 200 with the final tool result
(mcp_approval_response) and remove any arrows/notes implying re-query; if you
choose asynchronous, state that POST /approvals/{id} returns an acknowledgment
and clients must GET /approvals/{id} (or another polling/webhook) to receive the
final tool result, and update the diagram and all examples (references to POST
/approvals/{id}, mcp_approval_response, and "tool result + response") to
consistently reflect that single chosen flow.

337-355: ⚠️ Potential issue | 🟠 Major

Document /query/streaming_query as a response union instead of “no API changes.”

Adding requires_action introduces a new response variant; this is an API contract change for typed consumers and should be modeled explicitly.

Also applies to: 397-402

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/design/human-in-the-loop/human-in-the-loop.md` around lines 337 - 355,
The docs currently state “no API changes” but the addition of the
requires_action response is a new response variant for the /query and
/streaming_query endpoints; update the documentation in human-in-the-loop.md to
document the endpoints’ response as a union type that includes the new
requires_action shape (with conversation_id, status, and
required_action.mcp_approval structure including approvals items and expires_at)
instead of claiming no API changes, and add the same union-style documentation
for the other occurrence referenced (around lines ~397-402) so typed consumers
can see the new contract.

307-308: ⚠️ Potential issue | 🟠 Major

Unify authorization semantics (404 vs 403) across endpoint docs and error matrix.

The endpoint sections and error table still conflict on ownership/authorization failures. Pick one policy and apply it consistently.

Also applies to: 331-332, 369-375

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/design/human-in-the-loop/human-in-the-loop.md` around lines 307 - 308,
The docs currently mix 404 and 403 for authorization/ownership failures (e.g.,
the lines "Response (404): Approval not found or not owned by user" and
"Response (410): Approval expired"); pick one canonical policy (recommend using
403 for "not owned/forbidden" and 404 only for truly non-existent resources) and
apply it consistently across the endpoint descriptions and the error matrix
including the other affected sections (around the blocks noted at 331-332 and
369-375); update the response codes and any corresponding error text so all
references to ownership/authorization failures use the chosen status code and
adjust examples/messages to match.
docs/design/human-in-the-loop/human-in-the-loop-spike.md (1)

138-139: ⚠️ Potential issue | 🟠 Major

Replace unresolved LCORE-???? placeholders with real ticket IDs before merge.

These placeholders still make the implementation plan non-actionable and untraceable to delivery work items.

Also applies to: 167-168, 194-195, 222-223, 251-252, 276-277, 302-303

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/design/human-in-the-loop/human-in-the-loop-spike.md` around lines 138 -
139, Replace every unresolved placeholder "LCORE-????" in this document (e.g.,
the heading "### LCORE-???? Add `require_approval` configuration for MCP
servers" and the other repeated occurrences) with the actual ticket/issue IDs
that track the work; update all instances so the headings and inline keys
reference real, authoritative ticket numbers before merging.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/design/human-in-the-loop/human-in-the-loop.md`:
- Around line 72-105: The ASCII diagram code fence in the human-in-the-loop doc
is unlabeled and triggers markdown lint; update the fenced block around the
Request Flow diagram in human-in-the-loop.md by adding a language identifier
(e.g., ```text) to the opening fence so the diagram is treated as plain text and
linting passes, leaving the content unchanged.

---

Duplicate comments:
In `@docs/design/human-in-the-loop/human-in-the-loop-spike.md`:
- Around line 138-139: Replace every unresolved placeholder "LCORE-????" in this
document (e.g., the heading "### LCORE-???? Add `require_approval` configuration
for MCP servers" and the other repeated occurrences) with the actual
ticket/issue IDs that track the work; update all instances so the headings and
inline keys reference real, authoritative ticket numbers before merging.

In `@docs/design/human-in-the-loop/human-in-the-loop.md`:
- Around line 93-103: The documentation is ambiguous about whether POST
/approvals/{id} returns the final response or requires the client to re-query;
pick one behavior and update the contract and diagrams to match: if you choose
synchronous final response, state that POST /approvals/{id} returns 200 with the
final tool result (mcp_approval_response) and remove any arrows/notes implying
re-query; if you choose asynchronous, state that POST /approvals/{id} returns an
acknowledgment and clients must GET /approvals/{id} (or another polling/webhook)
to receive the final tool result, and update the diagram and all examples
(references to POST /approvals/{id}, mcp_approval_response, and "tool result +
response") to consistently reflect that single chosen flow.
- Around line 337-355: The docs currently state “no API changes” but the
addition of the requires_action response is a new response variant for the
/query and /streaming_query endpoints; update the documentation in
human-in-the-loop.md to document the endpoints’ response as a union type that
includes the new requires_action shape (with conversation_id, status, and
required_action.mcp_approval structure including approvals items and expires_at)
instead of claiming no API changes, and add the same union-style documentation
for the other occurrence referenced (around lines ~397-402) so typed consumers
can see the new contract.
- Around line 307-308: The docs currently mix 404 and 403 for
authorization/ownership failures (e.g., the lines "Response (404): Approval not
found or not owned by user" and "Response (410): Approval expired"); pick one
canonical policy (recommend using 403 for "not owned/forbidden" and 404 only for
truly non-existent resources) and apply it consistently across the endpoint
descriptions and the error matrix including the other affected sections (around
the blocks noted at 331-332 and 369-375); update the response codes and any
corresponding error text so all references to ownership/authorization failures
use the chosen status code and adjust examples/messages to match.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 22330491-d6b6-4565-b487-db4717d039c7

📥 Commits

Reviewing files that changed from the base of the PR and between 6d6f244 and 0ed23a6.

📒 Files selected for processing (2)
  • docs/design/human-in-the-loop/human-in-the-loop-spike.md
  • docs/design/human-in-the-loop/human-in-the-loop.md
📜 Review details
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2026-04-05T12:19:36.009Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-05T12:19:36.009Z
Learning: Applies to src/**/*.py : Use `from llama_stack_client import AsyncLlamaStackClient` for Llama Stack imports

Applied to files:

  • docs/design/human-in-the-loop/human-in-the-loop-spike.md
🪛 LanguageTool
docs/design/human-in-the-loop/human-in-the-loop.md

[style] ~61-~61: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...re interaction - U3: As a user, I want to see what a tool will do before it runs,...

(REP_WANT_TO_VB)


[style] ~63-~63: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...unwanted changes - U4: As a user, I want to deny a tool invocation, so that I can s...

(REP_WANT_TO_VB)


[grammar] ~393-~393: Use a hyphen to join words.
Context: ...event abuse (handled by existing rate limiting infrastructure). ### Migration...

(QB_NEW_EN_HYPHEN)

Comment on lines +72 to +105
```
┌──────────────────────────────────────────────────────────────────────────────┐
│ Request Flow │
├──────────────────────────────────────────────────────────────────────────────┤
│ │
│ ┌────────┐ POST /query ┌─────────────────┐ responses.create ┌───────────┐
│ │ Client │ ───────────────► │ LCS Query │ ───────────────────► │ Llama │
│ │ │ │ Endpoint │ │ Stack │
│ └────────┘ └─────────────────┘ └───────────┘
│ │ │ │
│ │ │ ◄─────── mcp_approval_request ─────────┤
│ │ │ │
│ │ ┌──────▼──────┐ │
│ │ │ Store │ │
│ │ │ Approval │ │
│ │ │ Request │ │
│ │ └──────┬──────┘ │
│ │ │ │
│ │ ◄─── 200 OK ────────────────┤ │
│ │ status: requires_action│ │
│ │ │ │
│ │ POST /approvals/{id} │ │
│ │ ────────────────────────────► │
│ │ {approve: true} │ │
│ │ │ │
│ │ │ ───── mcp_approval_response ──────────►│
│ │ │ │
│ │ │ ◄────── tool result + response ────────┤
│ │ │ │
│ │ ◄─── 200 OK ────────────────┤ │
│ │ (final response) │ │
│ │
└──────────────────────────────────────────────────────────────────────────────┘
```
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Add a language identifier to the ASCII diagram code fence.

This fenced block is unlabeled and will trigger markdown lint in stricter setups.

Suggested fix
-```
+```text
 ┌──────────────────────────────────────────────────────────────────────────────┐
 │                              Request Flow                                    │
 ...
 └──────────────────────────────────────────────────────────────────────────────┘
</details>

<!-- suggestion_start -->

<details>
<summary>📝 Committable suggestion</summary>

> ‼️ **IMPORTANT**
> Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

```suggestion

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/design/human-in-the-loop/human-in-the-loop.md` around lines 72 - 105,
The ASCII diagram code fence in the human-in-the-loop doc is unlabeled and
triggers markdown lint; update the fenced block around the Request Flow diagram
in human-in-the-loop.md by adding a language identifier (e.g., ```text) to the
opening fence so the diagram is treated as plain text and linting passes,
leaving the content unchanged.

@tisnik
Copy link
Copy Markdown
Contributor

tisnik commented Apr 7, 2026

PR Reviewer Guide 🔍

Here are some key observations to aid the review process:

⏱️ Estimated effort to review: 1 🔵⚪⚪⚪⚪
🧪 No relevant tests
🔒 No security concerns identified
⚡ No major issues detected

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
docs/design/human-in-the-loop/human-in-the-loop-spike.md (2)

156-159: ⚠️ Potential issue | 🟡 Minor

Fix MD031: fenced blocks need blank lines around them.

Several fenced blocks still violate markdownlint MD031, which will keep docs lint noisy/failing where enforced.

Suggested markdown cleanup (example pattern)
 **Agentic tool instruction**:
+
 ```text
 Read the "Configuration" section in docs/design/human-in-the-loop/human-in-the-loop.md.
 Key files: src/models/config.py, src/configuration.py.
</details>

 


Also applies to: 183-186, 211-214, 240-243, 265-268, 291-294, 315-318, 343-355, 437-444

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed.

In @docs/design/human-in-the-loop/human-in-the-loop-spike.md around lines 156 -
159, Several fenced code blocks in the human-in-the-loop spike doc are missing
blank lines before/after the fences (MD031); update each fenced block (e.g., the block starting withtext containing "Read the "Configuration"
section...") to ensure there is a blank line both immediately before the opening
and immediately after the closing; apply the same fix to the other
listed blocks (the repeated occurrences noted in the comment) so every
triple-backtick fenced block in that document has surrounding blank lines.


</details>

---

`135-136`: _⚠️ Potential issue_ | _🟠 Major_

**Replace all `LCORE-????` placeholders before merge.**

These placeholders still make the execution plan non-traceable and non-actionable across implementation tasks.

 


Also applies to: 164-165, 191-193, 219-221, 248-250, 273-275, 299-301

<details>
<summary>🤖 Prompt for AI Agents</summary>

```
Verify each finding against the current code and only fix it if needed.

In `@docs/design/human-in-the-loop/human-in-the-loop-spike.md` around lines 135 -
136, The document contains placeholder task keys "LCORE-????" (e.g., in the
heading "### LCORE-???? Add `require_approval` configuration for MCP servers"
and the other occurrences around the file) that must be replaced with the real
issue/ticket IDs before merging; update every instance of "LCORE-????" to the
correct ticket key(s), ensure the headings remain consistent with those IDs (and
update any cross-references or TOC entries that reference them), and verify
there are no remaining "LCORE-????" placeholders anywhere in the file.
```

</details>

</blockquote></details>

</blockquote></details>

<details>
<summary>🤖 Prompt for all review comments with AI agents</summary>

Verify each finding against the current code and only fix it if needed.

Inline comments:
In @docs/design/human-in-the-loop/human-in-the-loop-spike.md:

  • Around line 198-201: Update the POST /approvals/{id} spec to define
    idempotency and terminal-state semantics: state that approvals have terminal
    states (APPROVED, DENIED, EXPIRED) which are immutable once set; require clients
    to send an Idempotency-Key header (or client-provided request_id) and specify
    that repeated POSTs with the same key return the original outcome (200 with the
    decision payload) rather than re-applying; specify concurrent-submit handling
    via optimistic concurrency (include a version/etag on GET /approvals/{id} and
    require it or use server-side compare-and-set) and that CAS failures return 409
    Conflict with the current state; define expired/already-decided behavior to
    return 410 Gone for expired or 409/200 depending on idempotency key for
    already-decided requests, and include clear response payload fields (decision,
    decided_by, decided_at, idempotency_key, version) and the exact HTTP status
    codes for each case so clients can implement deterministic retry logic.

Duplicate comments:
In @docs/design/human-in-the-loop/human-in-the-loop-spike.md:

  • Around line 156-159: Several fenced code blocks in the human-in-the-loop spike
    doc are missing blank lines before/after the fences (MD031); update each fenced block (e.g., the block starting withtext containing "Read the
    "Configuration" section...") to ensure there is a blank line both immediately
    before the opening and immediately after the closing; apply the same fix
    to the other listed blocks (the repeated occurrences noted in the comment) so
    every triple-backtick fenced block in that document has surrounding blank lines.
  • Around line 135-136: The document contains placeholder task keys "LCORE-????"
    (e.g., in the heading "### LCORE-???? Add require_approval configuration for
    MCP servers" and the other occurrences around the file) that must be replaced
    with the real issue/ticket IDs before merging; update every instance of
    "LCORE-????" to the correct ticket key(s), ensure the headings remain consistent
    with those IDs (and update any cross-references or TOC entries that reference
    them), and verify there are no remaining "LCORE-????" placeholders anywhere in
    the file.

</details>

<details>
<summary>🪄 Autofix (Beta)</summary>

Fix all unresolved CodeRabbit comments on this PR:

- [ ] <!-- {"checkboxId": "4b0d0e0a-96d7-4f10-b296-3a18ea78f0b9"} --> Push a commit to this branch (recommended)
- [ ] <!-- {"checkboxId": "ff5b1114-7d8c-49e6-8ac1-43f82af23a33"} --> Create a new PR with the fixes

</details>

---

<details>
<summary>ℹ️ Review info</summary>

<details>
<summary>⚙️ Run configuration</summary>

**Configuration used**: Path: .coderabbit.yaml

**Review profile**: ASSERTIVE

**Plan**: Pro

**Run ID**: `f6ddc1f4-e798-4040-acd4-4ab04f00222c`

</details>

<details>
<summary>📥 Commits</summary>

Reviewing files that changed from the base of the PR and between 0ed23a66a0b55616cb3b6f444e54013160bf24f8 and 263999f699abb081a24259b165279d6e291a10fb.

</details>

<details>
<summary>📒 Files selected for processing (1)</summary>

* `docs/design/human-in-the-loop/human-in-the-loop-spike.md`

</details>

</details>

<details>
<summary>📜 Review details</summary>

<details>
<summary>⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)</summary>

* GitHub Check: unit_tests (3.12)
* GitHub Check: build-pr
* GitHub Check: Pylinter
* GitHub Check: E2E: library mode / ci
* GitHub Check: E2E: server mode / ci
* GitHub Check: E2E Tests for Lightspeed Evaluation job
* GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request

</details>

<details>
<summary>🧰 Additional context used</summary>

<details>
<summary>🧠 Learnings (2)</summary>

<details>
<summary>📚 Learning: 2026-04-05T12:19:36.009Z</summary>

```
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-05T12:19:36.009Z
Learning: Applies to src/**/*.py : Use `from llama_stack_client import AsyncLlamaStackClient` for Llama Stack imports
```

**Applied to files:**
- `docs/design/human-in-the-loop/human-in-the-loop-spike.md`

</details>
<details>
<summary>📚 Learning: 2026-04-06T20:18:11.336Z</summary>

```
Learnt from: major
Repo: lightspeed-core/lightspeed-stack PR: 1463
File: src/app/endpoints/rlsapi_v1.py:266-271
Timestamp: 2026-04-06T20:18:11.336Z
Learning: In the lightspeed-stack codebase (src/app/endpoints/), `tools: Optional[list[Any]]` for MCP tool definitions is an intentional, consistent pattern used across all inference endpoints (query, responses, streaming_query, rlsapi_v1). Do not flag this as a typing issue — changing it in isolation would break consistency.
```

**Applied to files:**
- `docs/design/human-in-the-loop/human-in-the-loop-spike.md`

</details>

</details><details>
<summary>🪛 markdownlint-cli2 (0.22.0)</summary>

<details>
<summary>docs/design/human-in-the-loop/human-in-the-loop-spike.md</summary>

[warning] 156-156: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

---

[warning] 183-183: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

---

[warning] 211-211: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

---

[warning] 240-240: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

---

[warning] 265-265: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

---

[warning] 291-291: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

---

[warning] 315-315: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

---

[warning] 343-343: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

---

[warning] 437-437: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

</details>

</details>

</details>

</details>

<!-- This is an auto-generated comment by CodeRabbit for review status -->

Comment on lines +198 to +201
- `GET /approvals` - List pending approvals (filter by conversation_id)
- `GET /approvals/{id}` - Get approval request details
- `POST /approvals/{id}` - Submit approval/denial
- Authorization checks (user can only access own approvals)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Define idempotency and terminal-state semantics for approval submission.

POST /approvals/{id} needs explicit behavior for duplicate submits, concurrent submits, and expired/already-decided requests; otherwise clients can observe non-deterministic outcomes.

Proposed acceptance-criteria additions
 **Acceptance criteria**:
 - [ ] Endpoints return appropriate HTTP status codes
 - [ ] Authorization enforced (user scope)
 - [ ] OpenAPI documentation generated
 - [ ] Integration tests for happy path and error cases
+- [ ] `POST /approvals/{id}` is idempotent for repeated identical decisions
+- [ ] Second conflicting decision on an already-decided request returns a deterministic error (e.g., 409)
+- [ ] Expired requests return a deterministic error contract (status + error code)
+- [ ] Concurrent submissions are serialized/guarded so only one terminal decision is persisted

Also applies to: 205-209, 229-237

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/design/human-in-the-loop/human-in-the-loop-spike.md` around lines 198 -
201, Update the POST /approvals/{id} spec to define idempotency and
terminal-state semantics: state that approvals have terminal states (APPROVED,
DENIED, EXPIRED) which are immutable once set; require clients to send an
Idempotency-Key header (or client-provided request_id) and specify that repeated
POSTs with the same key return the original outcome (200 with the decision
payload) rather than re-applying; specify concurrent-submit handling via
optimistic concurrency (include a version/etag on GET /approvals/{id} and
require it or use server-side compare-and-set) and that CAS failures return 409
Conflict with the current state; define expired/already-decided behavior to
return 410 Gone for expired or 409/200 depending on idempotency key for
already-decided requests, and include clear response payload fields (decision,
decided_by, decided_at, idempotency_key, version) and the exact HTTP status
codes for each case so clients can implement deterministic retry logic.

@jrobertboos jrobertboos requested a review from anik120 April 14, 2026 13:58
Comment on lines +467 to +470
- **Scheduled task**: Run cleanup periodically (e.g., hourly or daily)
- **Lazy cleanup**: Trigger cleanup during read operations when record count
exceeds a threshold
- **Database-level**: Use PostgreSQL's `pg_cron` or application-level scheduler
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm struggling to envision what this scheduled task will look like exactly.

Do Lightspeed-stack deployments have any cron job infrastructure currently? If not, an infrastructure would need to be set up, which is significant effort.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so we have an existing scheduler for the quota, its not a cron job but just looks likes a daemon for the quota db.
I think it might be good to just kind of reuse that pattern in the approvals db
wdyt?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

kind of like #684

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yea that could totally work. Do you mind adding this detail to this section so that we have a record of it please

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (4)
docs/design/human-in-the-loop/human-in-the-loop.md (3)

354-356: ⚠️ Potential issue | 🟠 Major

Resolve inconsistent authorization error semantics (404 vs 403).

The endpoint specifications state:

  • Line 354: GET /approvals/{id} → "Response (404): Approval not found or not owned by user"
  • Line 378: POST /approvals/{id} → "Response (404): Approval not found or not owned by user"

However, the error handling matrix (line 421) defines:

  • 403 forbidden for "Not authorized to access this approval"

Choose one authorization policy and apply it consistently:

Option 1: Use 404 for ownership violations (security through obscurity — don't reveal existence of resources user can't access)
Option 2: Use 403 for ownership violations (explicit — resource exists but user lacks permission)

Update either the endpoint response descriptions or the error matrix to match, and ensure the implementation follows the chosen convention.

Also applies to: 378-380

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/design/human-in-the-loop/human-in-the-loop.md` around lines 354 - 356,
Pick a single authorization policy (either Option 1: return 404 for
ownership/authorization violations or Option 2: return 403 for
ownership/authorization violations) and apply it consistently across the docs
and implementation: update the endpoint descriptions for GET /approvals/{id} and
POST /approvals/{id} (currently stating "Response (404): Approval not found or
not owned by user") and the error handling matrix entry "Not authorized to
access this approval" so they match the chosen policy, and then adjust the
approval-access checks in the implementation (the approval lookup/ownership
check in the handlers for GET and POST approvals) to emit the same status code
for ownership/authorization failures.

487-494: ⚠️ Potential issue | 🟠 Major

Document the API contract change for requires_action response.

Line 493 states "No API changes to existing endpoints: New status: "requires_action" is additive", but introducing a new response shape with different fields (required_action, approvals[]) is an API contract change for typed clients.

Clients consuming /query and /streaming_query currently expect a response with fields like response, conversation_id, etc. The new requires_action response has a different structure (status, required_action.type, required_action.approvals).

Please update the documentation to:

  1. Model the response as a union type: Document that query endpoints now return QueryResult | RequiresActionResult (use oneOf in OpenAPI)
  2. Add versioning guidance: If the API uses semantic versioning, clarify whether this is a minor (backward-compatible) or major (breaking) change
  3. Provide client migration guidance: Show how typed clients should handle both response shapes (e.g., discriminate on status field)

This ensures API consumers understand they must update their models to handle both variants.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/design/human-in-the-loop/human-in-the-loop.md` around lines 487 - 494,
Update the docs to explicitly document the new response shape as an
additive-but-changing API contract by modeling the query endpoints (/query and
/streaming_query) as returning a union type QueryResult | RequiresActionResult
(use OpenAPI oneOf), describe the RequiresActionResult fields (status,
required_action.type, required_action.approvals, etc.), add versioning guidance
stating whether this change is treated as a minor (backward-compatible) or major
(breaking) release under your semantic versioning policy, and add concise client
migration guidance showing typed clients should discriminate on the top-level
status field to handle both QueryResult and RequiresActionResult shapes.

93-103: ⚠️ Potential issue | 🟠 Major

Clarify the approval continuation flow after POST /approvals/{id}.

The sequence diagram (lines 97-102) shows that after approval submission, mcp_approval_response is sent to Llama Stack, followed by "tool result + response" and "200 OK (final response)" to the user. However, the POST /approvals/{id} endpoint specification (lines 369-376) only returns approval status metadata (id, status, decided_at), not the resumed agent output.

Please explicitly document whether:

  • Option A: POST /approvals/{id} is synchronous — it waits for Llama Stack to complete execution and returns the final agent response (requires adding a field like agent_response to the 200 response schema)
  • Option B: POST /approvals/{id} is asynchronous — it only records the decision and returns immediately; clients must poll or call /query again to retrieve the continued conversation

Update both the diagram labels and the endpoint response schema to match the chosen behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/design/human-in-the-loop/human-in-the-loop.md` around lines 93 - 103,
Decide whether POST /approvals/{id} is synchronous or asynchronous and make the
docs consistent: if you choose synchronous (Option A), update the sequence
diagram around mcp_approval_response and "tool result + response" to show the
server waiting and add an agent_response field (e.g., agent_response: {content,
tool_output, metadata}) to the 200 schema returned by POST /approvals/{id} and
describe blocking behavior; if you choose asynchronous (Option B), update the
diagram to show POST /approvals/{id} returning immediately with only approval
metadata (id, status, decided_at), explicitly state clients must poll /query or
fetch the resumed conversation, and remove any implication that the final agent
response is returned by this endpoint. Ensure the diagram labels and the POST
/approvals/{id} response schema and text mention the chosen flow and reference
mcp_approval_response and /query where applicable.
docs/design/human-in-the-loop/human-in-the-loop-spike.md (1)

136-136: ⚠️ Potential issue | 🟡 Minor

Replace or acknowledge placeholder JIRA ticket keys.

The spike document contains seven placeholder ticket keys (LCORE-????) for the proposed implementation stories. Before merging this design document, please either:

Option A: Create the actual JIRA tickets and replace placeholders with real keys (e.g., LCORE-1590, LCORE-1591, etc.)
Option B: Add a note at the top of the "Proposed JIRAs" section stating: "Note: Ticket keys are placeholders and will be assigned when stories are created post-spike approval"

Option A is preferred for traceability, but Option B is acceptable if ticket creation is part of the post-spike workflow.

Also applies to: 165-165, 196-196, 224-224, 253-253, 278-278, 304-304

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/design/human-in-the-loop/human-in-the-loop-spike.md` at line 136, The
document contains seven placeholder ticket keys marked as "LCORE-????" (e.g.,
the entry "LCORE-???? Add `require_approval` configuration for MCP servers") in
the "Proposed JIRAs" section; either replace each "LCORE-????" with the real
JIRA keys once you create the tickets (preferred) or add a single clarifying
note at the top of the "Proposed JIRAs" section stating: "Note: Ticket keys are
placeholders and will be assigned when stories are created post-spike approval"
so the placeholders are acknowledged; update all occurrences (the seven
instances referenced) to ensure consistency.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@docs/design/human-in-the-loop/human-in-the-loop-spike.md`:
- Line 136: The document contains seven placeholder ticket keys marked as
"LCORE-????" (e.g., the entry "LCORE-???? Add `require_approval` configuration
for MCP servers") in the "Proposed JIRAs" section; either replace each
"LCORE-????" with the real JIRA keys once you create the tickets (preferred) or
add a single clarifying note at the top of the "Proposed JIRAs" section stating:
"Note: Ticket keys are placeholders and will be assigned when stories are
created post-spike approval" so the placeholders are acknowledged; update all
occurrences (the seven instances referenced) to ensure consistency.

In `@docs/design/human-in-the-loop/human-in-the-loop.md`:
- Around line 354-356: Pick a single authorization policy (either Option 1:
return 404 for ownership/authorization violations or Option 2: return 403 for
ownership/authorization violations) and apply it consistently across the docs
and implementation: update the endpoint descriptions for GET /approvals/{id} and
POST /approvals/{id} (currently stating "Response (404): Approval not found or
not owned by user") and the error handling matrix entry "Not authorized to
access this approval" so they match the chosen policy, and then adjust the
approval-access checks in the implementation (the approval lookup/ownership
check in the handlers for GET and POST approvals) to emit the same status code
for ownership/authorization failures.
- Around line 487-494: Update the docs to explicitly document the new response
shape as an additive-but-changing API contract by modeling the query endpoints
(/query and /streaming_query) as returning a union type QueryResult |
RequiresActionResult (use OpenAPI oneOf), describe the RequiresActionResult
fields (status, required_action.type, required_action.approvals, etc.), add
versioning guidance stating whether this change is treated as a minor
(backward-compatible) or major (breaking) release under your semantic versioning
policy, and add concise client migration guidance showing typed clients should
discriminate on the top-level status field to handle both QueryResult and
RequiresActionResult shapes.
- Around line 93-103: Decide whether POST /approvals/{id} is synchronous or
asynchronous and make the docs consistent: if you choose synchronous (Option A),
update the sequence diagram around mcp_approval_response and "tool result +
response" to show the server waiting and add an agent_response field (e.g.,
agent_response: {content, tool_output, metadata}) to the 200 schema returned by
POST /approvals/{id} and describe blocking behavior; if you choose asynchronous
(Option B), update the diagram to show POST /approvals/{id} returning
immediately with only approval metadata (id, status, decided_at), explicitly
state clients must poll /query or fetch the resumed conversation, and remove any
implication that the final agent response is returned by this endpoint. Ensure
the diagram labels and the POST /approvals/{id} response schema and text mention
the chosen flow and reference mcp_approval_response and /query where applicable.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: f10736b7-a54f-4622-ae61-704c090cf071

📥 Commits

Reviewing files that changed from the base of the PR and between 263999f and e8b73ad.

📒 Files selected for processing (2)
  • docs/design/human-in-the-loop/human-in-the-loop-spike.md
  • docs/design/human-in-the-loop/human-in-the-loop.md
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: E2E: library mode / ci / group 3
  • GitHub Check: E2E: server mode / ci / group 1
  • GitHub Check: E2E: server mode / ci / group 3
  • GitHub Check: E2E: library mode / ci / group 1
  • GitHub Check: E2E: server mode / ci / group 2
  • GitHub Check: E2E: library mode / ci / group 2
  • GitHub Check: Pylinter
  • GitHub Check: build-pr
  • GitHub Check: E2E Tests for Lightspeed Evaluation job
  • GitHub Check: unit_tests (3.12)
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2026-04-06T20:18:11.336Z
Learnt from: major
Repo: lightspeed-core/lightspeed-stack PR: 1463
File: src/app/endpoints/rlsapi_v1.py:266-271
Timestamp: 2026-04-06T20:18:11.336Z
Learning: In the lightspeed-stack codebase (src/app/endpoints/), `tools: Optional[list[Any]]` for MCP tool definitions is an intentional, consistent pattern used across all inference endpoints (query, responses, streaming_query, rlsapi_v1). Do not flag this as a typing issue — changing it in isolation would break consistency.

Applied to files:

  • docs/design/human-in-the-loop/human-in-the-loop.md
  • docs/design/human-in-the-loop/human-in-the-loop-spike.md
📚 Learning: 2026-04-05T12:19:36.009Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-05T12:19:36.009Z
Learning: Applies to src/**/*.py : Use `from llama_stack_client import AsyncLlamaStackClient` for Llama Stack imports

Applied to files:

  • docs/design/human-in-the-loop/human-in-the-loop-spike.md
🪛 LanguageTool
docs/design/human-in-the-loop/human-in-the-loop.md

[style] ~61-~61: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...re interaction - U3: As a user, I want to see what a tool will do before it runs,...

(REP_WANT_TO_VB)


[style] ~63-~63: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...unwanted changes - U4: As a user, I want to deny a tool invocation, so that I can s...

(REP_WANT_TO_VB)


[grammar] ~440-~440: Use a hyphen to join words.
Context: ...event abuse (handled by existing rate limiting infrastructure). ### Data rete...

(QB_NEW_EN_HYPHEN)

🪛 markdownlint-cli2 (0.22.0)
docs/design/human-in-the-loop/human-in-the-loop-spike.md

[warning] 72-72: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 128-128: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 150-150: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 299-299: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 319-319: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 341-341: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 362-362: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 370-370: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 385-385: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 405-405: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

Copy link
Copy Markdown
Contributor

@anik120 anik120 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay I think this looks solid now.

/lgtm

- **U4:** As a user, I want to deny a tool invocation, so that I can stop an
action I don't want
- **U5:** As a developer, I want approval requests to expire, so that abandoned
sessions don't leave pending approvals indefinitely
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to rephrase U5 as "As an operator, I want to limit the amount of storage pending approvals take or else LCS can consume an unbounded amount of storage"?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that, its more detached from the actual implementation details than talking about the expirations.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

♻️ Duplicate comments (6)
docs/design/human-in-the-loop/human-in-the-loop.md (4)

354-356: ⚠️ Potential issue | 🟠 Major

Resolve conflicting authorization error semantics (404 vs 403).

Line 354 states "Approval not found or not owned by user" returns 404, but the error handling matrix at line 421 defines 403 with error code forbidden for "Not authorized to access this approval." Conflating "not found" with "unauthorized" in a 404 response can leak information about resource existence. Choose one consistent policy.

Also applies to: 378, 421, 427

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/design/human-in-the-loop/human-in-the-loop.md` around lines 354 - 356,
The docs currently mix 404 and 403 semantics for approval access (e.g., the
"Approval not found or not owned by user" response vs the error matrix entry
using the error code "forbidden"), which can leak resource existence; pick a
single consistent policy (either "404 for not found/ownership mismatch" or "403
for unauthorized access") and update all related occurrences ("Approval not
found or not owned by user", the error matrix entry with error code "forbidden",
and the other referenced locations at lines 378, 421, 427) so the message text,
HTTP status, and error code align across the doc, and add a one-line note
explaining the chosen policy for future contributors.

72-105: ⚠️ Potential issue | 🟡 Minor

Add language identifier to ASCII diagram fence.

The fenced code block containing the ASCII flow diagram should specify a language (e.g., text) to satisfy markdown linting rules (MD040).

🔧 Suggested fix
-```
+```text
 ┌──────────────────────────────────────────────────────────────────────────────┐
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/design/human-in-the-loop/human-in-the-loop.md` around lines 72 - 105,
The fenced ASCII diagram under the "Request Flow" section is missing a language
identifier which triggers markdown lint MD040; update the opening code fence for
that diagram (the triple backticks immediately before the box starting with
"┌──────────────────────────────────────────────────────────────────────────────┐"
and the "Request Flow" header) to include a language identifier such as text
(i.e., replace ``` with ```text) so the block is lint-compliant.

384-402: ⚠️ Potential issue | 🟠 Major

Clarify that requires_action is an additive API contract change.

While line 493 states "No API changes to existing endpoints," introducing a new status: "requires_action" response shape for /query and /streaming_query effectively changes the response contract. Typed clients and OpenAPI consumers need to handle this as a union/discriminated response type. Consider explicitly documenting this as an additive change that existing clients can safely ignore (they won't encounter it unless approval is configured).

Also applies to: 493-494

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/design/human-in-the-loop/human-in-the-loop.md` around lines 384 - 402,
The docs currently introduce a new response shape with status "requires_action"
for the /query and /streaming_query endpoints (including the required_action
object and approvals array) which is an additive API contract change; update the
text to explicitly call this out as a new discriminated-union response variant
clients must handle (e.g., typed clients/OpenAPI users should treat responses as
a union on status) and note it is additive and safe to ignore for existing
clients unless approval is configured, and include the exact response fields
("status":"requires_action", "required_action", "approvals" with
id/server_label/tool_name/arguments/expires_at) so implementers can update
schemas and clients accordingly.

548-555: ⚠️ Potential issue | 🟠 Major

Streaming approval continuation flow needs clarification.

The section describes yielding an approval_required event and tracking incomplete response, but doesn't specify what happens after the client submits approval via POST /approvals/{id}. Does the client need to reconnect to the stream, or does LCS automatically resume and send additional events? This relates to the continuation flow ambiguity flagged in past reviews.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/design/human-in-the-loop/human-in-the-loop.md` around lines 548 - 555,
Clarify the streaming approval continuation: update the section around
streaming_query.py (the branch where output_item.type == "mcp_approval_request")
to explicitly state what happens after the client POSTs to /approvals/{id} —
whether the LCS automatically resumes the existing stream and continues emitting
events for that same session or whether the client must reconnect to receive
further events; describe the mechanism that resumes the stream (e.g., the cache
backend entry in approval_requests being marked approved and the streaming logic
polling/observing that change) and reference the approval_required event, the
approval_requests table cached entry, and the POST /approvals/{id} handler so
readers can find the code paths that perform the resume or reconnection
behavior.
docs/design/human-in-the-loop/human-in-the-loop-spike.md (2)

134-136: ⚠️ Potential issue | 🟠 Major

Replace placeholder ticket keys before merge.

The LCORE-???? placeholders prevent traceability and make the execution plan non-actionable. These should be resolved to real issue keys.

Also applies to: 165, 195, 223, 252, 277, 303

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/design/human-in-the-loop/human-in-the-loop-spike.md` around lines 134 -
136, Replace all placeholder ticket keys "LCORE-????" in the document (e.g., the
heading "### LCORE-???? Add `require_approval` configuration for MCP servers"
and the other occurrences flagged) with the actual tracking issue keys used by
our team (e.g., LCORE-1234) so the headings and references are actionable;
ensure you use the canonical issue-key format consistently across the file,
update any internal links or references that used the placeholder, and run a
quick search to confirm no "LCORE-????" strings remain before merging.

156-159: ⚠️ Potential issue | 🟡 Minor

Add blank lines around fenced code blocks.

Markdown lint requires blank lines before and after fenced code blocks (MD031). This applies to all text code blocks used for "Agentic tool instruction" sections throughout the document.

Also applies to: 186-189, 215-218, 244-247, 269-272, 295-298, 319-322, 347-359

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/design/human-in-the-loop/human-in-the-loop-spike.md` around lines 156 -
159, The markdown fenced code blocks marked with ```text in
human-in-the-loop-spike.md are missing required blank lines before and/or after,
triggering MD031; add a single blank line immediately before and immediately
after each fenced code block (especially the "Agentic tool instruction"
sections) so every ```text block is surrounded by one empty line; apply this to
all occurrences referenced in the review (and scan the document for any other
```text blocks) to satisfy the linter.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/design/human-in-the-loop/human-in-the-loop-spike.md`:
- Around line 347-359: get_mcp_tools currently hardcodes
InputToolMCP(require_approval="never"); change it to derive the value from the
MCP server config or caller policy (e.g., use a property like
mcp_server.require_approval or an injected parameter named
require_approval_policy), defaulting to "never" only when no config is present,
and pass that variable into InputToolMCP(...); update the get_mcp_tools
signature if needed to accept the policy and ensure any callers/tests are
adjusted to provide or expect the configurable value.

In `@docs/design/human-in-the-loop/human-in-the-loop.md`:
- Line 655: The file docs/design/human-in-the-loop/human-in-the-loop.md is
missing a trailing newline; open the file and ensure it ends with a single
newline character (add one blank line break at EOF) so it complies with markdown
lint rule MD047; save the file after adding the newline.
- Around line 128-169: Add a CHECK constraint to the approval_requests table to
enforce status can only be 'pending', 'approved', 'denied', or 'expired': update
the CREATE TABLE definitions for approval_requests (both SQLite and PostgreSQL
variants) to include a CHECK on the status column (reference: table name
approval_requests and column status) so invalid status values are rejected at
the DB level; ensure the constraint is named (e.g.,
approval_requests_status_check) or included inline and that existing indexes
(idx_approval_requests_conversation, idx_approval_requests_expires) remain
unchanged.

---

Duplicate comments:
In `@docs/design/human-in-the-loop/human-in-the-loop-spike.md`:
- Around line 134-136: Replace all placeholder ticket keys "LCORE-????" in the
document (e.g., the heading "### LCORE-???? Add `require_approval` configuration
for MCP servers" and the other occurrences flagged) with the actual tracking
issue keys used by our team (e.g., LCORE-1234) so the headings and references
are actionable; ensure you use the canonical issue-key format consistently
across the file, update any internal links or references that used the
placeholder, and run a quick search to confirm no "LCORE-????" strings remain
before merging.
- Around line 156-159: The markdown fenced code blocks marked with ```text in
human-in-the-loop-spike.md are missing required blank lines before and/or after,
triggering MD031; add a single blank line immediately before and immediately
after each fenced code block (especially the "Agentic tool instruction"
sections) so every ```text block is surrounded by one empty line; apply this to
all occurrences referenced in the review (and scan the document for any other
```text blocks) to satisfy the linter.

In `@docs/design/human-in-the-loop/human-in-the-loop.md`:
- Around line 354-356: The docs currently mix 404 and 403 semantics for approval
access (e.g., the "Approval not found or not owned by user" response vs the
error matrix entry using the error code "forbidden"), which can leak resource
existence; pick a single consistent policy (either "404 for not found/ownership
mismatch" or "403 for unauthorized access") and update all related occurrences
("Approval not found or not owned by user", the error matrix entry with error
code "forbidden", and the other referenced locations at lines 378, 421, 427) so
the message text, HTTP status, and error code align across the doc, and add a
one-line note explaining the chosen policy for future contributors.
- Around line 72-105: The fenced ASCII diagram under the "Request Flow" section
is missing a language identifier which triggers markdown lint MD040; update the
opening code fence for that diagram (the triple backticks immediately before the
box starting with
"┌──────────────────────────────────────────────────────────────────────────────┐"
and the "Request Flow" header) to include a language identifier such as text
(i.e., replace ``` with ```text) so the block is lint-compliant.
- Around line 384-402: The docs currently introduce a new response shape with
status "requires_action" for the /query and /streaming_query endpoints
(including the required_action object and approvals array) which is an additive
API contract change; update the text to explicitly call this out as a new
discriminated-union response variant clients must handle (e.g., typed
clients/OpenAPI users should treat responses as a union on status) and note it
is additive and safe to ignore for existing clients unless approval is
configured, and include the exact response fields ("status":"requires_action",
"required_action", "approvals" with
id/server_label/tool_name/arguments/expires_at) so implementers can update
schemas and clients accordingly.
- Around line 548-555: Clarify the streaming approval continuation: update the
section around streaming_query.py (the branch where output_item.type ==
"mcp_approval_request") to explicitly state what happens after the client POSTs
to /approvals/{id} — whether the LCS automatically resumes the existing stream
and continues emitting events for that same session or whether the client must
reconnect to receive further events; describe the mechanism that resumes the
stream (e.g., the cache backend entry in approval_requests being marked approved
and the streaming logic polling/observing that change) and reference the
approval_required event, the approval_requests table cached entry, and the POST
/approvals/{id} handler so readers can find the code paths that perform the
resume or reconnection behavior.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: a77661eb-f41a-4bd2-a420-e26cb164184f

📥 Commits

Reviewing files that changed from the base of the PR and between e8b73ad and b99d29f.

📒 Files selected for processing (2)
  • docs/design/human-in-the-loop/human-in-the-loop-spike.md
  • docs/design/human-in-the-loop/human-in-the-loop.md
📜 Review details
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (9)
  • GitHub Check: radon
  • GitHub Check: list_outdated_dependencies
  • GitHub Check: E2E: library mode / ci / group 3
  • GitHub Check: E2E: server mode / ci / group 3
  • GitHub Check: E2E Tests for Lightspeed Evaluation job
  • GitHub Check: mypy
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
  • GitHub Check: Pylinter
  • GitHub Check: build-pr
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2026-04-06T20:18:11.336Z
Learnt from: major
Repo: lightspeed-core/lightspeed-stack PR: 1463
File: src/app/endpoints/rlsapi_v1.py:266-271
Timestamp: 2026-04-06T20:18:11.336Z
Learning: In the lightspeed-stack codebase (src/app/endpoints/), `tools: Optional[list[Any]]` for MCP tool definitions is an intentional, consistent pattern used across all inference endpoints (query, responses, streaming_query, rlsapi_v1). Do not flag this as a typing issue — changing it in isolation would break consistency.

Applied to files:

  • docs/design/human-in-the-loop/human-in-the-loop.md
  • docs/design/human-in-the-loop/human-in-the-loop-spike.md
📚 Learning: 2026-04-05T12:19:36.009Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-05T12:19:36.009Z
Learning: Applies to src/**/*.py : Use `from llama_stack_client import AsyncLlamaStackClient` for Llama Stack imports

Applied to files:

  • docs/design/human-in-the-loop/human-in-the-loop-spike.md
🪛 LanguageTool
docs/design/human-in-the-loop/human-in-the-loop.md

[style] ~61-~61: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...re interaction - U3: As a user, I want to see what a tool will do before it runs,...

(REP_WANT_TO_VB)


[style] ~63-~63: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ...unwanted changes - U4: As a user, I want to deny a tool invocation, so that I can s...

(REP_WANT_TO_VB)


[style] ~65-~65: You have already used this phrasing in nearby sentences. Consider replacing it to add variety to your writing.
Context: ... don't want - U5: As an operator, I want to limit the amount of storage pending app...

(REP_WANT_TO_VB)


[uncategorized] ~440-~440: If this is a compound adjective that modifies the following noun, use a hyphen.
Context: ...o prevent abuse (handled by existing rate limiting infrastructure). ### Data retention A...

(EN_COMPOUND_ADJECTIVE_INTERNAL)

🪛 markdownlint-cli2 (0.22.0)
docs/design/human-in-the-loop/human-in-the-loop.md

[warning] 72-72: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


[warning] 128-128: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 150-150: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 299-299: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 319-319: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 341-341: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 362-362: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 370-370: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 385-385: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 405-405: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 464-464: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 473-473: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 655-655: Files should end with a single newline character

(MD047, single-trailing-newline)

docs/design/human-in-the-loop/human-in-the-loop-spike.md

[warning] 156-156: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 186-186: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 215-215: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 244-244: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 269-269: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 295-295: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 319-319: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 347-347: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 441-441: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

🔇 Additional comments (6)
docs/design/human-in-the-loop/human-in-the-loop-spike.md (2)

1-47: Design decisions are well-structured and justified.

The overview clearly articulates the problem (hardcoded require_approval="never" with no approval mechanism) and the recommended asynchronous /approvals API solution. Decision tables provide clear options with pros/cons, and the per-request approval scope (Decision 2, Option C) aligns well with the security reasoning discussed in past review comments.


324-341: PoC validation correctly references existing implementation.

The claims about existing Llama Stack approval types and LCS parsing support are accurate. The code reference to responses.py:1067-1094 correctly identifies the build_tool_call_summary() function that already handles mcp_approval_request and mcp_approval_response types, as confirmed by the relevant code snippet provided.

docs/design/human-in-the-loop/human-in-the-loop.md (4)

1-67: Requirements and use cases are comprehensive and well-articulated.

The document clearly defines the problem space, requirements (R1-R8), and user stories (U1-U5). The per-request approval scope (R7) and backward compatibility default (R8) align well with the security-conscious decisions in the spike document.


224-296: Configuration classes are well-designed and follow codebase patterns.

The ApprovalFilter and ApprovalsConfiguration classes properly extend ConfigurationBase, use Field() with descriptive metadata, and include appropriate validation (the validate_no_overlap validator at lines 248-256 prevents configuration errors). The conversion logic from config ApprovalFilter to Llama Stack's format (shown later at lines 528-533) correctly handles the empty list to None mapping.


564-578: Test patterns and HIL-specific considerations are thorough.

The testing guidance correctly references the codebase's pytest + pytest-asyncio + pytest-mock stack, provides accurate mocking patterns, and includes HIL-specific test considerations (mock mcp_approval_request events, test TTL expiration, test authorization checks). This will help ensure comprehensive test coverage.


520-546: 🧹 Nitpick | 🔵 Trivial

Code example should include necessary imports.

The example uses LlamaStackApprovalFilter (line 530) but doesn't show the import statement. For completeness, the example should include:

📦 Suggested import addition
from llama_stack_api.openai_responses import (
    OpenAIResponseInputToolMCP as InputToolMCP,
)
from llama_stack_api.openai_responses import (
    ApprovalFilter as LlamaStackApprovalFilter,
)
from models.config import ApprovalFilter

async def get_mcp_tools(...) -> list[InputToolMCP]:
    # ... rest of example

Based on learnings: Use from llama_stack_client import AsyncLlamaStackClient for Llama Stack imports (though this specific case imports types from llama_stack_api.openai_responses).

⛔ Skipped due to learnings
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-05T12:19:36.009Z
Learning: Applies to src/**/*.py : Use `from llama_stack_client import AsyncLlamaStackClient` for Llama Stack imports
Learnt from: asimurka
Repo: lightspeed-core/lightspeed-stack PR: 1211
File: src/models/responses.py:8-16
Timestamp: 2026-02-25T07:46:39.608Z
Learning: In the lightspeed-stack codebase, src/models/requests.py uses OpenAIResponseInputTool as Tool while src/models/responses.py uses OpenAIResponseTool as Tool. This type difference is intentional - input tools and output/response tools have different schemas in llama-stack-api.
Learnt from: major
Repo: lightspeed-core/lightspeed-stack PR: 1463
File: src/app/endpoints/rlsapi_v1.py:266-271
Timestamp: 2026-04-06T20:18:11.336Z
Learning: In the lightspeed-stack codebase (src/app/endpoints/), `tools: Optional[list[Any]]` for MCP tool definitions is an intentional, consistent pattern used across all inference endpoints (query, responses, streaming_query, rlsapi_v1). Do not flag this as a typing issue — changing it in isolation would break consistency.
Learnt from: jrobertboos
Repo: lightspeed-core/lightspeed-stack PR: 1396
File: src/app/endpoints/conversations_v1.py:6-6
Timestamp: 2026-04-07T15:03:11.530Z
Learning: In the `llama_stack_api` package, all imports MUST use the flat form `from llama_stack_api import <symbol>`. Sub-module imports (e.g., `from llama_stack_api.common.errors import ConversationNotFoundError`) are explicitly NOT supported and considered a code smell, as stated in `llama_stack_api/__init__.py` lines 15-19. Do not flag or suggest changing root-package imports to sub-module imports for this package.

Comment on lines +347 to +359
```python
async def get_mcp_tools(...) -> list[InputToolMCP]:
# ...
tools.append(
InputToolMCP(
type="mcp",
server_label=mcp_server.name,
server_url=mcp_server.url,
require_approval="never", # <-- Hardcoded, needs to be configurable
# ...
)
)
```
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Verify the current hardcoded require_approval value in get_mcp_tools

# Search for require_approval in the get_mcp_tools function
rg -n -A3 -B3 'require_approval.*=.*"never"' src/utils/responses.py

Repository: lightspeed-core/lightspeed-stack

Length of output: 375


Make require_approval configurable instead of hardcoded.

The require_approval="never" value is hardcoded at line 739 in src/utils/responses.py within the get_mcp_tools function's InputToolMCP initialization. This should be made configurable based on MCP server settings or user policy rather than always defaulting to "never".

🧰 Tools
🪛 markdownlint-cli2 (0.22.0)

[warning] 347-347: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/design/human-in-the-loop/human-in-the-loop-spike.md` around lines 347 -
359, get_mcp_tools currently hardcodes InputToolMCP(require_approval="never");
change it to derive the value from the MCP server config or caller policy (e.g.,
use a property like mcp_server.require_approval or an injected parameter named
require_approval_policy), defaulting to "never" only when no config is present,
and pass that variable into InputToolMCP(...); update the get_mcp_tools
signature if needed to accept the policy and ensure any callers/tests are
adjusted to provide or expect the configurable value.

Comment on lines +128 to +169
```sql
CREATE TABLE IF NOT EXISTS approval_requests (
id TEXT PRIMARY KEY,
conversation_id TEXT NOT NULL,
user_id TEXT NOT NULL,
server_label TEXT NOT NULL,
tool_name TEXT NOT NULL,
arguments TEXT NOT NULL, -- JSON
status TEXT NOT NULL DEFAULT 'pending', -- pending, approved, denied, expired
created_at TEXT NOT NULL,
expires_at TEXT NOT NULL,
decision_reason TEXT,
decided_at TEXT
);

CREATE INDEX IF NOT EXISTS idx_approval_requests_conversation
ON approval_requests(conversation_id, status);
CREATE INDEX IF NOT EXISTS idx_approval_requests_expires
ON approval_requests(expires_at) WHERE status = 'pending';
```

PostgreSQL schema:
```sql
CREATE TABLE IF NOT EXISTS approval_requests (
id TEXT PRIMARY KEY,
conversation_id TEXT NOT NULL,
user_id TEXT NOT NULL,
server_label TEXT NOT NULL,
tool_name TEXT NOT NULL,
arguments JSONB NOT NULL,
status TEXT NOT NULL DEFAULT 'pending',
created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(),
expires_at TIMESTAMPTZ NOT NULL,
decision_reason TEXT,
decided_at TIMESTAMPTZ
);

CREATE INDEX IF NOT EXISTS idx_approval_requests_conversation
ON approval_requests(conversation_id, status);
CREATE INDEX IF NOT EXISTS idx_approval_requests_expires
ON approval_requests(expires_at) WHERE status = 'pending';
```
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

Consider adding CHECK constraints for status field integrity.

The status field (lines 136, 158) is documented to accept only 'pending', 'approved', 'denied', 'expired', but the schema doesn't enforce this at the database level. Adding a CHECK constraint would prevent invalid status values from being inserted.

🛡️ Optional schema enhancement

SQLite:

 CREATE TABLE IF NOT EXISTS approval_requests (
     id                TEXT PRIMARY KEY,
     conversation_id   TEXT NOT NULL,
     user_id           TEXT NOT NULL,
     server_label      TEXT NOT NULL,
     tool_name         TEXT NOT NULL,
     arguments         TEXT NOT NULL,  -- JSON
-    status            TEXT NOT NULL DEFAULT 'pending',  -- pending, approved, denied, expired
+    status            TEXT NOT NULL DEFAULT 'pending' CHECK (status IN ('pending', 'approved', 'denied', 'expired')),
     created_at        TEXT NOT NULL,
     expires_at        TEXT NOT NULL,
     decision_reason   TEXT,
     decided_at        TEXT
 );

PostgreSQL:

-    status            TEXT NOT NULL DEFAULT 'pending',
+    status            TEXT NOT NULL DEFAULT 'pending' CHECK (status IN ('pending', 'approved', 'denied', 'expired')),
🧰 Tools
🪛 markdownlint-cli2 (0.22.0)

[warning] 128-128: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)


[warning] 150-150: Fenced code blocks should be surrounded by blank lines

(MD031, blanks-around-fences)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/design/human-in-the-loop/human-in-the-loop.md` around lines 128 - 169,
Add a CHECK constraint to the approval_requests table to enforce status can only
be 'pending', 'approved', 'denied', or 'expired': update the CREATE TABLE
definitions for approval_requests (both SQLite and PostgreSQL variants) to
include a CHECK on the status column (reference: table name approval_requests
and column status) so invalid status values are rejected at the DB level; ensure
the constraint is named (e.g., approval_requests_status_check) or included
inline and that existing indexes (idx_approval_requests_conversation,
idx_approval_requests_expires) remain unchanged.


# 4. Continue conversation (or re-query)
# The next query in the same conversation will continue from where it left off
``` No newline at end of file
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Add trailing newline to file.

Files should end with a single newline character per markdown linting rules (MD047).


<details>
<summary>🧰 Tools</summary>

<details>
<summary>🪛 markdownlint-cli2 (0.22.0)</summary>

[warning] 655-655: Files should end with a single newline character

(MD047, single-trailing-newline)

</details>

</details>

<details>
<summary>🤖 Prompt for AI Agents</summary>

Verify each finding against the current code and only fix it if needed.

In @docs/design/human-in-the-loop/human-in-the-loop.md at line 655, The file
docs/design/human-in-the-loop/human-in-the-loop.md is missing a trailing
newline; open the file and ensure it ends with a single newline character (add
one blank line break at EOF) so it complies with markdown lint rule MD047; save
the file after adding the newline.


</details>

<!-- fingerprinting:phantom:triton:puma:f8184a1a-8e7c-498a-af6b-832e82b2c0ed -->

<!-- This is an auto-generated comment by CodeRabbit -->

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants