LCORE-1583: Allowed reasoning and max_output_tokens attributes #1483
asimurka wants to merge 1 commit into lightspeed-core:main
Conversation
Walkthrough
These changes remove deprecated field-clearing logic from the responses endpoint handler and streamline test configurations by eliminating a skipped test scenario and removing unsupported …
Changes
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~8 minutes
Pre-merge checks: ✅ 2 passed | ❌ 1 inconclusive
Force-pushed from 7cd8248 to e89b907
PR Reviewer Guide 🔍 (Review updated until commit e89b907)
Here are some key observations to aid the review process:
Persistent review updated to latest commit e89b907
@coderabbitai help
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@lightspeed-stack.yaml`:
- Around line 13-15: The YAML contains a duplicated key use_as_library_client
with conflicting values (false and true); remove one of the entries so the map
contains a single use_as_library_client key with the intended boolean value,
ensuring the remaining value reflects the desired behavior and resolving the
parser/lint error.
In `@src/app/endpoints/responses.py`:
- Around line 165-170: The code currently nulls out responses_request.reasoning
(responses_request.reasoning = None) and mutates the input before copying,
preventing the field from reaching ResponsesApiParams.model_validate(...) and
client.responses.create(...); remove the in-place assignment so reasoning is
preserved, log a warning if present but do not modify responses_request, and
ensure you perform the deep copy (responses_request =
responses_request.model_copy(deep=True)) before any non-mutating checks or logs
so the original object is not altered and the reasoning value is passed through
to downstream validation and client.responses.create.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: ASSERTIVE
Plan: Pro
Run ID: 8abc4275-a00c-4ee0-8518-6e7140b44683
📒 Files selected for processing (6)
- lightspeed-stack.yaml
- run.yaml
- src/app/endpoints/responses.py
- src/utils/types.py
- tests/e2e/features/responses.feature
- tests/e2e/test_list.txt
💤 Files with no reviewable changes (3)
- run.yaml
- tests/e2e/test_list.txt
- tests/e2e/features/responses.feature
📜 Review details
🧰 Additional context used
📓 Path-based instructions (2)
src/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/**/*.py: Use absolute imports for internal modules (e.g., `from authentication import get_auth_dependency`)
Use `from llama_stack_client import AsyncLlamaStackClient` for Llama Stack imports
Check `constants.py` for shared constants before defining new ones
All modules start with descriptive docstrings explaining purpose
Use `logger = get_logger(__name__)` from `log.py` for module logging
Type aliases defined at module level for clarity
All functions require docstrings with brief descriptions
Use complete type annotations for function parameters and return types
Use union types with modern syntax: `str | int` instead of `Union[str, int]`
Use `Optional[Type]` for optional types in type annotations
Use snake_case with descriptive, action-oriented names for functions (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns: return new data structures instead of modifying parameters
Use `async def` for I/O operations and external API calls
Handle `APIConnectionError` from Llama Stack in error handling
Use `logger.debug()` for detailed diagnostic information
Use `logger.info()` for general information about program execution
Use `logger.warning()` for unexpected events or potential problems
Use `logger.error()` for serious problems that prevented function execution
All classes require descriptive docstrings explaining purpose
Use PascalCase for class names with descriptive names and standard suffixes: `Configuration`, `Error`/`Exception`, `Resolver`, `Interface`
Use complete type annotations for all class attributes; avoid using `Any`
Follow Google Python docstring conventions for all modules, classes, and functions
Include `Parameters:`, `Returns:`, `Raises:` sections in function docstrings as needed
Files:
- src/app/endpoints/responses.py
- src/utils/types.py
src/app/**/*.py
📄 CodeRabbit inference engine (AGENTS.md)
src/app/**/*.py: Use `from fastapi import APIRouter, HTTPException, Request, status, Depends` for FastAPI dependencies
Use FastAPI `HTTPException` with appropriate status codes for API endpoint error handling
Files:
src/app/endpoints/responses.py
🧠 Learnings (3)
📚 Learning: 2026-04-06T20:18:07.852Z
Learnt from: major
Repo: lightspeed-core/lightspeed-stack PR: 1463
File: src/app/endpoints/rlsapi_v1.py:266-271
Timestamp: 2026-04-06T20:18:07.852Z
Learning: In the lightspeed-stack codebase, within `src/app/endpoints/` inference/MCP endpoints, treat `tools: Optional[list[Any]]` in MCP tool definitions as an intentional, consistent typing pattern (used across `query`, `responses`, `streaming_query`, `rlsapi_v1`). Do not raise or suggest this as a typing issue during code review; changing it in isolation could break endpoint typing consistency across the codebase.
Applied to files:
src/app/endpoints/responses.py
📚 Learning: 2025-12-18T10:21:09.038Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:09.038Z
Learning: In Llama Stack version 0.3.x, telemetry provider configuration is not supported under the `providers` section in run.yaml configuration files. Telemetry can be enabled with just `telemetry.enabled: true` without requiring an explicit provider block.
Applied to files:
lightspeed-stack.yaml
📚 Learning: 2026-04-05T12:19:36.009Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: AGENTS.md:0-0
Timestamp: 2026-04-05T12:19:36.009Z
Learning: Applies to src/**/*.py : Use `from llama_stack_client import AsyncLlamaStackClient` for Llama Stack imports
Applied to files:
lightspeed-stack.yaml
🪛 YAMLlint (1.38.0)
lightspeed-stack.yaml
[error] 15-15: duplication of key "use_as_library_client" in mapping
(key-duplicates)
🔇 Additional comments (2)
lightspeed-stack.yaml (1)
16-18: Library mode wiring looks correct once the duplicate key is removed.
`library_client_config_path: run.yaml` plus disabled URL/API key matches the intended library-client startup path.
src/utils/types.py (1)
217-220: Good contract alignment for `safety_identifier` passthrough.
This field addition cleanly matches the request/response models and allows forwarding without special-case logic.
```yaml
use_as_library_client: false
# Alternative for "as library use"
# use_as_library_client: true
# library_client_config_path: <path-to-llama-stack-run.yaml-file>
url: http://llama-stack:8321
api_key: xyzzy
use_as_library_client: true
```
Remove duplicated use_as_library_client key (conflicting values).
This map defines use_as_library_client twice (false then true). It is ambiguous, fails lint, and parser behavior can differ. Keep only one value.
Suggested fix

```diff
 llama_stack:
-  # Uses a remote llama-stack service
-  # The instance would have already been started with a llama-stack-run.yaml file
-  use_as_library_client: false
-  # Alternative for "as library use"
+  # Use llama-stack in library-client mode
   use_as_library_client: true
   library_client_config_path: run.yaml
   # url: http://llama-stack:8321
   # api_key: xyzzy
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```yaml
llama_stack:
  # Use llama-stack in library-client mode
  use_as_library_client: true
  library_client_config_path: run.yaml
  # url: http://llama-stack:8321
  # api_key: xyzzy
```
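The "parser behavior can differ" point is worth making concrete: PyYAML, for instance, does not reject a duplicated mapping key by default and silently keeps the last value, which is why the flagged config behaves as if `use_as_library_client: true` had been set. A minimal demonstration, assuming PyYAML is installed (the snippet uses an inline document, not the real config file):

```python
# PyYAML keeps the *last* value for a duplicated key instead of erroring,
# so the duplicated mapping silently resolves to use_as_library_client: true.
import yaml

doc = """
llama_stack:
  use_as_library_client: false
  url: http://llama-stack:8321
  api_key: xyzzy
  use_as_library_client: true
"""

parsed = yaml.safe_load(doc)
print(parsed["llama_stack"]["use_as_library_client"])  # True
```

Stricter tooling (such as yamllint's key-duplicates rule, which produced the error above) flags the duplication instead, so the only safe fix is to keep a single key.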
```python
if responses_request.reasoning is not None:
    logger.warning("reasoning is not yet supported in LCORE and will be ignored")
    responses_request.reasoning = None
if responses_request.max_output_tokens is not None:
    logger.warning(
        "max_output_tokens is not yet supported in LCORE and will be ignored"
    )
    responses_request.max_output_tokens = None

responses_request = responses_request.model_copy(deep=True)
```
Do not strip reasoning; this currently breaks the advertised passthrough behavior.
Lines 165-167 still nullify responses_request.reasoning, so the field can never reach ResponsesApiParams.model_validate(...) (lines 302 and 657) or client.responses.create(...). This contradicts LCORE-1583, and it also mutates the request object in place before the copy on line 169.
✅ Suggested fix

```diff
-    # Known LLS bug: https://redhat.atlassian.net/browse/LCORE-1583
-    if responses_request.reasoning is not None:
-        logger.warning("reasoning is not yet supported in LCORE and will be ignored")
-        responses_request.reasoning = None
-
-    responses_request = responses_request.model_copy(deep=True)
+    responses_request = responses_request.model_copy(deep=True)
```

As per coding guidelines: "Avoid in-place parameter modification anti-patterns: return new data structures instead of modifying parameters."
Description
Type of change
Tools used to create PR
Identify any AI code assistants used in this PR (for transparency and review context)
Related Tickets & Documents
Checklist before requesting a review
Testing
Summary by CodeRabbit
Bug Fixes
New Features
Tests