diff --git a/docs/evaluate/user-sim.md b/docs/evaluate/user-sim.md
index 0ca2c8d20..fd41d8e94 100644
--- a/docs/evaluate/user-sim.md
+++ b/docs/evaluate/user-sim.md
@@ -190,7 +190,7 @@ The below `EvalConfig` shows the default user simulator configuration:
     # same as before
   },
   "user_simulator_config": {
-    "model": "gemini-2.5-flash",
+    "model": "gemini-3.1-pro-preview",
     "model_configuration": {
       "thinking_config": {
         "include_thoughts": true,
@@ -256,3 +256,39 @@ Example of a custom persona definition:
 }
 ```
 
+## Generating Evaluation Cases via User Simulation
+
+Writing evaluation cases manually can be time-consuming and may not cover all potential failure modes. ADK provides a command to automatically generate diverse and realistic conversation scenarios based on your agent's definition using the Vertex AI Eval SDK.
+
+!!! warning "Prerequisites: Vertex AI Credentials"
+    Generating evaluation cases uses the [Vertex Gen AI Evaluation Service API](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/evaluation). You must have a Google Cloud project with the Vertex AI API enabled and valid Application Default Credentials (ADC) configured in your environment.
+
+
+### Command Syntax
+
+```bash
+adk eval_set generate_eval_cases \
+    \
+    \
+    --user_simulation_config_file=
+```
+
+### Configuration File Format
+
+The `--user_simulation_config_file` expects a JSON file matching the `ConversationGenerationConfig` schema:
+
+```json
+{
+  "count": 5,
+  "generation_instruction": "Generate scenarios where the user asks to control home devices under different conditions.",
+  "environment_context": "Available devices: device_1 (Light), device_2 (Thermostat).",
+  "model_name": "gemini-3.1-pro-preview"
+}
+```
+
+### Configuration Fields
+
+* **`count`** (required): The number of conversation scenarios to generate.
+* **`generation_instruction`** (optional): A natural language prompt guiding the specific types of scenarios or goals you want to test.
+* **`environment_context`** (optional): Context describing the backend data or state accessible to the agent's tools. This helps the generator create queries that are grounded in realistic data (e.g., valid device IDs).
+* **`model_name`** (required): The Gemini model used for generation (e.g., `gemini-3.1-pro-preview`).
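
To illustrate the configuration fields listed in the added docs, here is a minimal sketch of building and sanity-checking a config file before passing it to `--user_simulation_config_file`. It relies only on the field names and required/optional flags stated in the diff. The helpers `validate_generation_config` and `write_generation_config` are hypothetical and not part of ADK, and the rule that `count` must be positive is an assumption rather than something the docs state.

```python
from __future__ import annotations

import json
import tempfile
from pathlib import Path

# Required/optional flags as described in "Configuration Fields".
REQUIRED_FIELDS = {"count": int, "model_name": str}
OPTIONAL_FIELDS = {"generation_instruction": str, "environment_context": str}


def _type_ok(value, expected) -> bool:
    # bool is a subclass of int in Python, so reject it explicitly.
    return isinstance(value, expected) and not isinstance(value, bool)


def validate_generation_config(config: dict) -> list[str]:
    """Hypothetical helper: return a list of problems (empty means OK)."""
    problems = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in config:
            problems.append(f"missing required field: {name}")
        elif not _type_ok(config[name], expected):
            problems.append(f"{name} must be of type {expected.__name__}")
    for name, expected in OPTIONAL_FIELDS.items():
        if name in config and not _type_ok(config[name], expected):
            problems.append(f"{name} must be of type {expected.__name__}")
    # Assumption: a non-positive scenario count is not meaningful.
    if _type_ok(config.get("count"), int) and config["count"] <= 0:
        problems.append("count must be positive")
    return problems


def write_generation_config(config: dict, path: str | Path) -> Path:
    """Hypothetical helper: validate, then write the config as JSON."""
    problems = validate_generation_config(config)
    if problems:
        raise ValueError("; ".join(problems))
    path = Path(path)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return path


if __name__ == "__main__":
    example = {
        "count": 5,
        "generation_instruction": "Generate scenarios where the user asks "
        "to control home devices under different conditions.",
        "environment_context": "Available devices: device_1 (Light), "
        "device_2 (Thermostat).",
        "model_name": "gemini-3.1-pro-preview",
    }
    with tempfile.TemporaryDirectory() as tmp:
        out = write_generation_config(example, Path(tmp) / "gen_config.json")
        print(out.read_text(encoding="utf-8"))
```

The check runs locally and does not contact Vertex AI, so it only catches missing or mistyped fields. Whether the real `ConversationGenerationConfig` schema accepts additional fields is not covered here.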