diff --git a/NEW-AGENT.md b/NEW-AGENT.md
new file mode 100644
index 0000000000..a1575c0c48
--- /dev/null
+++ b/NEW-AGENT.md
@@ -0,0 +1,242 @@
+# Tasks & AI Agent
+
+This branch adds AI-powered Tasks to Studio. A Task is a chat session with a Claude agent, tied to a WordPress site. Users can create tasks from the sidebar or site overview, chat with the agent, and the agent can manage their site via MCP tools and file operations.
+
+## Status
+
+**Proof of Concept** — The full pipeline is functional: task creation, sidebar navigation, chat UI, agent integration via the Claude Agent SDK in the Electron main process, and desktop-native MCP tools (site_list, site_info, site_start, site_stop, wp_cli). Tasks persist across app restarts. Permission prompts for filesystem operations outside the site directory are wired but not yet polished.
+
+## Architecture
+
+### Agent Execution Model
+
+The Claude Agent SDK (`@anthropic-ai/claude-agent-sdk`) runs directly in the Electron main process. The SDK internally spawns its own subprocess (a bundled Claude Code CLI), so there's no double-nesting of child processes. The desktop app passes `pathToClaudeCodeExecutable` explicitly because Vite bundling breaks the SDK's internal path resolution.
+
+```
+Renderer (React)                       Main Process (Node.js)
+  |                                    |
+  |-- ipcApi.startTaskAgentHandler --->|-- query({ prompt, mcpServers, ... })
+  |                                    |   SDK spawns its own subprocess
+  |<-- 'task-message' events ----------|-- for await (msg of query) {
+  |<-- 'task-message' events ----------|     sendIpcEventToRenderer('task-message', msg)
+  |                                    |   }
+  |-- ipcApi.interruptTaskHandler ---->|-- query.interrupt()
+```
+
+### Provider Resolution
+
+Authentication mirrors the CLI's provider fallback chain (`provider-resolver.ts`):
+
+1. **WordPress.com** — Uses the shared auth token from `readSharedConfig()`. Proxies through the wpcom AI gateway.
+2. **Anthropic Claude auth** — Checks for local Claude Code authentication via `claude auth status`.
+3. **Anthropic API key** — Direct API key (not yet wired to UI for input).
+
+### MCP Tools (Desktop-Native)
+
+The agent has access to 12 Studio tools via an MCP server (`tools.ts`), all using `SiteServer` directly (not the CLI daemon):
+
+- **site_list** — Lists all sites with status, paths, URLs.
+- **site_info** — Detailed info for a specific site (path, URL, credentials, PHP version).
+- **site_start** / **site_stop** — Start or stop a site's server.
+- **wp_cli** — Execute WP-CLI commands on a running site (plugin install, post create, etc.).
+- **post_blocks_read** — List all Gutenberg blocks in a post/page with indices, types, attributes, and content previews. Uses WordPress's `parse_blocks()` via `wp eval`.
+- **post_block_update** — Replace a specific block by index with new block markup. Uses `parse_blocks()` / `serialize_blocks()` / `wp_update_post()` via `wp eval`. Markup is base64-encoded to avoid escaping issues.
+- **browser_navigate** — Navigate the site preview browser to a URL or path. Syncs the visible preview iframe.
+- **browser_reload** — Reload the current page. Syncs the visible preview iframe.
+- **browser_screenshot** — Take a PNG screenshot via a hidden BrowserWindow's `webContents.capturePage()`. Returns an MCP image content block.
+- **browser_read_page** — Read page title, URL, text content, and a structural DOM outline (headings, links, forms, buttons).
+- **browser_console** — Read recent console messages (log/warning/error) with optional clear.
+
+The browser tools use a `BrowserInspector` singleton (`browser-inspector.ts`) that manages hidden `BrowserWindow` instances per site. See `NEW-BROWSER.md` for details on the architecture.
+
+The agent also has Claude Code's built-in file tools (Read, Write, Edit, Glob, Grep, Bash) for direct file manipulation. The agent's `cwd` is set to the task's site path so file operations work relative to the site.
+
+### Permission Model
+
+Read-only tools (Read, Glob, Grep, MCP tools) are auto-approved. Write operations (Write, Edit, Bash) within the site directory or temp directories are auto-approved. Write operations outside trusted roots trigger a permission prompt:
+
+1. Main process `canUseTool` callback creates a pending Promise
+2. Sends `task-permission-request` IPC event to renderer
+3. Renderer shows inline permission dialog (Allow once / Allow for session / Deny)
+4. User's response resolves the Promise via `respondToPermissionRequestHandler` IPC
+
+Session-level approvals are cached per tool name so the user isn't prompted repeatedly.
+
+### Data Model
+
+**TaskMetadata** (persisted in `appdata-v1.json` alongside site data):
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `id` | `string` | UUID |
+| `siteId` | `string` | Associated site ID |
+| `title` | `string` | Auto-generated from first message |
+| `status` | `'in-progress' \| 'waiting' \| 'done'` | Current state |
+| `archived` | `boolean` | Hidden from sidebar when true |
+| `createdAt` | `number` | Timestamp |
+| `updatedAt` | `number` | Timestamp |
+| `sessionId` | `string?` | SDK session ID for resuming conversations |
+
+**TaskMessage** (persisted to `localStorage` via Redux listener, survives app restarts):
+
+| Field | Type | Description |
+|-------|------|-------------|
+| `id` | `string` | Message ID (from SDK UUID or generated) |
+| `role` | `'user' \| 'assistant' \| 'tool' \| 'system'` | Message type |
+| `content` | `string` | Text content |
+| `toolName` | `string?` | Tool name for tool-type messages |
+| `toolInput` | `unknown?` | Tool input parameters |
+| `toolResult` | `string?` | Tool execution result |
+| `images` | `ImageAttachment[]?` | Base64 image attachments (user messages) |
+| `elements` | `ElementAttachment[]?` | Selected DOM elements (user messages) |
+| `isStreaming` | `boolean?` | Currently being streamed |
+| `isError` | `boolean?` | Error message |
+
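The pending-Promise flow described under Permission Model can be sketched roughly as follows. This is a minimal illustration, not the code in `agent-manager.ts`: the `PermissionGate` class, the response strings, and the `notifyRenderer` callback are hypothetical names, and the path check assumes paths are already absolute and normalized.

```typescript
// Hypothetical sketch of the permission flow; none of these names come from the codebase.
type PermissionResponse = "allow-once" | "allow-session" | "deny";
type NotifyRenderer = (requestId: string, toolName: string, targetPath: string) => void;

interface PendingRequest {
  toolName: string;
  resolve: (allowed: boolean) => void;
}

class PermissionGate {
  private readonly trustedRoots: string[];
  private readonly notifyRenderer: NotifyRenderer;
  private readonly pending = new Map<string, PendingRequest>();
  private readonly sessionApproved = new Set<string>();
  private nextId = 1;

  constructor(trustedRoots: string[], notifyRenderer: NotifyRenderer) {
    this.trustedRoots = trustedRoots;
    this.notifyRenderer = notifyRenderer;
  }

  // Assumes absolute, normalized paths; a real check would also resolve ".." and symlinks.
  private isTrusted(targetPath: string): boolean {
    return this.trustedRoots.some((root) => {
      const prefix = root.endsWith("/") ? root : root + "/";
      return targetPath === root || targetPath.startsWith(prefix);
    });
  }

  // Called for a write-style tool; resolves immediately or waits for the user's answer.
  request(toolName: string, targetPath: string): Promise<boolean> {
    if (this.isTrusted(targetPath) || this.sessionApproved.has(toolName)) {
      return Promise.resolve(true);
    }
    const requestId = `perm-${this.nextId++}`;
    return new Promise<boolean>((resolve) => {
      this.pending.set(requestId, { toolName, resolve });
      this.notifyRenderer(requestId, toolName, targetPath);
    });
  }

  // Called when the renderer reports the user's choice.
  respond(requestId: string, response: PermissionResponse): void {
    const entry = this.pending.get(requestId);
    if (!entry) return;
    this.pending.delete(requestId);
    if (response === "allow-session") this.sessionApproved.add(entry.toolName);
    entry.resolve(response !== "deny");
  }
}
```

Keeping the session cache keyed by tool name mirrors the note above that approvals are cached per tool, which means one "Allow for session" answer covers later calls to the same tool on other paths.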
+### Navigation State
+
+Task selection lives in Redux (`tasks-slice.ts`) via `selectedTaskId`. This is separate from site selection (which uses `useSiteDetails` context). The two are mutually exclusive in the UI:
+
+- Clicking a task sets `selectedTaskId`, primary panel shows `TaskChatPanel`
+- Clicking a site clears `selectedTaskId`, primary panel shows `SiteContentTabs`
+- Site list items check `selectedTaskId` to suppress their highlight when a task is active
+
+### System Prompt
+
+The agent uses the full CLI system prompt from `tools/common/ai/system-prompt.ts` (shared between CLI and desktop). This includes detailed workflow steps, design guidelines, block content rules, and all tool descriptions. The desktop app appends site-specific context (site name, path, instructions to use wp_cli for content and check revisions).
+
+### Message Serialization
+
+SDK messages (`SDKMessage` union type) are converted to flat `TaskMessage` objects by `message-serializer.ts`. The serializer handles:
+
+- `assistant` messages — extracts text content and tool use blocks
+- `user` messages (synthetic) — extracts tool result content blocks with `tool_use_id` matching
+- `result` messages — surfaces error messages only (successful results are already shown via the preceding `assistant` message)
+- `tool_use_summary` — shows tool execution summaries
+- `tool_progress` — shows tool running indicators
+
+Tool results are merged back onto their invocation message in Redux by matching `tool_use_id`, so each tool call appears as a single expandable card with both input and output.
+
+Other SDK message types (system/init, stream events, auth status) are filtered out.
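The merge of tool results onto their invocation message could look something like the sketch below. It is an assumption-laden illustration: `ChatMessage` is a trimmed stand-in for `TaskMessage`, the `toolUseId` field and the `applyIncoming` helper are hypothetical, and the real reducer in `tasks-slice.ts` may be structured differently.

```typescript
// Trimmed stand-in for TaskMessage; `toolUseId` is a hypothetical field used for matching.
interface ChatMessage {
  id: string;
  role: "user" | "assistant" | "tool" | "system";
  content: string;
  toolName?: string;
  toolInput?: unknown;
  toolResult?: string;
  toolUseId?: string;
}

// Returns a new list: a tool result is folded into its pending invocation when one
// with the same toolUseId exists, otherwise the incoming message is appended.
function applyIncoming(messages: ChatMessage[], incoming: ChatMessage): ChatMessage[] {
  const isResult =
    incoming.role === "tool" &&
    incoming.toolUseId !== undefined &&
    incoming.toolResult !== undefined;

  if (isResult) {
    const index = messages.findIndex(
      (m) =>
        m.role === "tool" &&
        m.toolUseId === incoming.toolUseId &&
        m.toolResult === undefined
    );
    if (index !== -1) {
      const merged: ChatMessage = { ...messages[index], toolResult: incoming.toolResult };
      return [...messages.slice(0, index), merged, ...messages.slice(index + 1)];
    }
  }
  return [...messages, incoming];
}
```

Returning a fresh array rather than mutating in place keeps the helper usable inside a reducer either way, and the fallback append means an unmatched result is still visible instead of being dropped.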
+
+### Redux State
+
+The `tasks` slice (`tasks-slice.ts`) manages:
+
+- `tasks: TaskMetadata[]` — All task metadata, loaded from IPC on app init
+- `selectedTaskId: string | null` — Currently viewed task
+- `messagesByTask: Record` — Chat messages per task
+- `streamingByTask: Record` — Per-task streaming indicator
+- `queuedMessagesByTask: Record` — Messages queued while agent is busy (transient, not persisted)
+- `pendingPermissions: PermissionRequest[]` — Pending permission dialogs
+
+IPC event listeners in `stores/index.ts` dispatch actions for `task-updated`, `task-deleted`, `task-message`, `task-status-changed`, `task-permission-request`, and `task-error` events from the main process. The `task-status-changed` listener also handles auto-sending the next queued message when the agent transitions to `waiting`.
+
+## UI Components
+
+### Sidebar Task List
+
+The sidebar's Tasks section (`tasks/task-list.tsx`) replaces the former placeholder:
+
+- Header with "Tasks" label, archive toggle, and `+` button for creating new tasks
+- Clicking `+` enters a "pending new task" state — the primary panel shows a site picker dropdown ("A new task for... [choose a site]") instead of an inline sidebar picker
+- Task items show title, site name, and a status dot (blue pulsing = in-progress, gray = waiting, green = done)
+- Archive button appears when hovering the status dot (not the whole row), replacing the dot
+- Non-archived tasks sorted by `updatedAt` descending
+- Archive toggle opens a Popover flyout listing archived tasks with count and "Clear all" button
+
+### Chat Panel
+
+When a task is selected, the primary panel renders `TaskChatPanel` instead of `SiteContentTabs`:
+
+- **Message list** — User messages (right-aligned, themed) and assistant messages (left-aligned). Tool call messages are filtered out of the main conversation view — they only appear in the activity indicator.
+- **Auto-scroll** — Scrolls to bottom on new messages
+- **Streaming indicator** — Bouncing dots while agent is responding
+- **Activity indicator** — Compact status bar above the chat input showing the agent's current state ("Thinking", "Editing", "Searching files", etc.) with an elapsed time counter. Clicking opens a flyout panel listing all activity (user messages, assistant responses, tool calls) with relative timestamps. Tool entries in the log are expandable to show full input/output details. Uses a blue pulsing dot only while the agent is actively streaming; no dot when idle.
+- **Permission prompt** — Inline amber dialog above the input when the agent needs filesystem approval
+- **Input** — Textarea with Enter-to-send (Shift+Enter for newlines). Supports image attachments via file picker, clipboard paste, or browser area capture. Images exceeding the API's 5MB base64 limit are automatically resized using a canvas (`resize-image.ts`).
+- **Message queue** — Users can type and send follow-up messages while the agent is streaming. Queued messages appear as compact chips above the input. They auto-send one-by-one as the agent completes each turn. Clicking a chip restores it to the input for editing; the X button dismisses it. Auto-send pauses if the agent's last message was an error.
+
+### Site Overview Integration
+
+The site overview tab includes a "New task" button in the shortcuts section that creates a task pre-bound to that site.
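The message queue behaviour in the Chat Panel list (auto-send one at a time when the agent becomes idle, pause after an error) can be captured in a small pure function. This is only a sketch under assumptions: `QueueState` and `takeNextQueued` are hypothetical names, and the actual listener in `stores/index.ts` may carry more state.

```typescript
type TaskStatus = "in-progress" | "waiting" | "done";

// Hypothetical view of the per-task queue state needed for the decision.
interface QueueState {
  queued: string[];
  lastMessageWasError: boolean;
}

interface QueueDecision {
  toSend: string | null; // message to dispatch now, or null when nothing should be sent
  queued: string[];      // queue contents after the decision
}

// Decides what to do when a task's status changes: send the oldest queued message
// only when the agent has just become idle and its last message was not an error.
function takeNextQueued(state: QueueState, newStatus: TaskStatus): QueueDecision {
  if (newStatus !== "waiting" || state.lastMessageWasError || state.queued.length === 0) {
    return { toSend: null, queued: state.queued };
  }
  const [next, ...rest] = state.queued;
  return { toSend: next, queued: rest };
}
```

Sending one message per `waiting` transition keeps the queue strictly first-in, first-out, which matches the one-by-one behaviour described above.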
+
+## IPC Interface
+
+### Handlers (invoke-style, return values)
+
+| Handler | Description |
+|---------|-------------|
+| `createTask(siteId)` | Create task metadata, returns `TaskMetadata` |
+| `getAllTasks()` | Returns all tasks |
+| `updateTask(taskId, updates)` | Partial update (title, status, archived, sessionId) |
+| `archiveTask(taskId)` | Set archived=true |
+| `deleteTask(taskId)` | Remove from appdata |
+| `clearArchivedTasks()` | Delete all archived tasks, returns removed IDs |
+| `updateTaskStatus(taskId, status)` | Update status field |
+| `startTaskAgentHandler(taskId, prompt, resumeSessionId?)` | Start agent session |
+| `sendTaskMessageHandler(taskId, message)` | Send follow-up message |
+| `interruptTaskHandler(taskId)` | Interrupt active agent |
+| `respondToPermissionRequestHandler(requestId, response, taskId?)` | Resolve permission |
+
+### Events (main -> renderer)
+
+| Event | Payload | Description |
+|-------|---------|-------------|
+| `task-updated` | `TaskMetadata` | Task metadata changed |
+| `task-deleted` | `string` (taskId) | Task removed |
+| `task-message` | `{ taskId, message: TaskMessage }` | New chat message |
+| `task-status-changed` | `{ taskId, status }` | Agent status transition |
+| `task-permission-request` | `PermissionRequest` | Agent needs user approval |
+| `task-error` | `{ taskId, error }` | Agent error |
+| `browser-navigate` | `{ siteId, url }` | Agent navigated the browser (syncs preview iframe) |
+
+## Key Files
+
+```
+apps/studio/src/
+├── modules/ai/
+│   ├── types.ts                      # TaskMetadata, TaskMessage, PermissionRequest
+│   └── lib/
+│       ├── ipc-handlers.ts           # Task CRUD + agent lifecycle handlers
+│       ├── agent-manager.ts          # Active Query management, message loop, permissions
+│       ├── tools.ts                  # Desktop MCP tools (site_list, wp_cli, etc.)
+│       ├── browser-inspector.ts      # Hidden BrowserWindow manager for agent inspection
+│       ├── browser-tools.ts          # Browser MCP tools (navigate, screenshot, etc.)
+│       ├── provider-resolver.ts      # Auth provider fallback (wpcom/claude/api-key)
+│       └── message-serializer.ts     # SDKMessage -> TaskMessage conversion
+├── components/site-menu.tsx          # Updated: clears task selection on site click
+├── stores/
+│   └── tasks-slice.ts                # Redux state for tasks, messages, permissions
+├── components/new-ui/tasks/
+│   ├── task-list.tsx                 # Sidebar task list with + button and archive flyout
+│   ├── task-list-item.tsx            # Individual task item with status dot
+│   ├── task-new-panel.tsx            # New task site picker (primary panel)
+│   ├── task-chat-panel.tsx           # Chat panel (primary panel replacement)
+│   ├── task-chat-input.tsx           # Message input with agent IPC + queue-on-send
+│   ├── task-queued-messages.tsx      # Queued message chips (click to restore, X to dismiss)
+│   ├── task-message-list.tsx         # Message bubbles (user/assistant only, no tool cards)
+│   ├── task-activity-indicator.tsx   # Status bar + expandable activity log flyout
+│   └── task-permission-prompt.tsx    # Inline permission dialog
+├── lib/resize-image.ts               # Auto-resize base64 images exceeding API limit
+├── storage/storage-types.ts          # UserData.tasks field added
+├── ipc-handlers.ts                   # Re-exports task handlers
+├── ipc-utils.ts                      # Task IPC event types
+├── preload.ts                        # Task IPC bridge methods
+└── stores/index.ts                   # Tasks reducer + IPC event listeners
+
+tools/common/ai/
+└── system-prompt.ts                  # Shared system prompt (used by CLI and desktop)
+
+apps/cli/ai/
+└── system-prompt.ts                  # Re-exports from tools/common/ai/
+```
+
+## What's Next
+
+- Session resume on app restart (sessionId is persisted, `query({ resume })` is wired)
+- More MCP tools (site_create, site_delete, preview_create/update/delete, validate_blocks)
+- Markdown rendering in assistant messages
+- Streaming partial text (SDKPartialAssistantMessage events)
+- ~~Auto-title refinement (use agent to generate a better title after first exchange)~~ Done — uses Haiku to generate titles
+- Error recovery UI (provider not available, rate limits, etc.). Agent errors now send `ai:done` after `ai:error` so the desktop cleans up properly and follow-ups can resume via session ID.
+- Keyboard shortcuts (Escape to interrupt, etc.)
diff --git a/NEW-BROWSER.md b/NEW-BROWSER.md
new file mode 100644
index 0000000000..d987981d5b
--- /dev/null
+++ b/NEW-BROWSER.md
@@ -0,0 +1,217 @@
+# Browser Panel
+
+The secondary panel in the new Studio UI is an embedded browser that previews the currently active WordPress site. It supports multiple tabs so users can keep several pages open simultaneously (e.g. the site frontend in one tab and wp-admin in another).
+
+## How It Works
+
+### Auto-Authentication
+
+The iframe loads through the `/studio-auto-login` endpoint rather than the site URL directly. This authenticates the user automatically, which means:
+
+- The WordPress admin bar is visible on the frontend, giving quick access to wp-admin pages, the site editor, customizer, etc.
+- No separate login step is needed — the user is always authenticated when previewing.
+
+### Site Resolution
+
+The browser panel resolves which site to display based on context:
+
+1. **Task selected** — If a task is selected in the sidebar, the browser shows the site associated with that task (via `task.siteId`).
+2. **Site selected** — Otherwise, it shows the currently selected site from the sidebar.
+3. **Site not running** — If the resolved site isn't running, the panel shows an empty state with "Start the site to preview it".
+
+This logic lives in the `useBrowserPanel` hook (`apps/studio/src/hooks/use-browser-panel.ts`).
+
+### Tabs
+
+The browser supports multiple tabs per site. Each tab is its own iframe that preserves scroll position, form state, and navigation history independently.
+
+**State model** — Each tab tracks its own `id`, `displayUrl`, `title`, `isLoading`, and `isInitialLoad`. Tab state lives entirely in the `useBrowserPanel` hook (not Redux) since it's local UI state. Iframe element refs are stored in a `Map`.
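The three-step order under Site Resolution can be written as a small pure function. This is a sketch rather than the logic in `useBrowserPanel`: the `SiteSummary`, `TaskSummary`, and `BrowserTarget` types and the `resolveBrowserTarget` name are hypothetical, and falling back to the selected site when the selected task cannot be found is an assumption.

```typescript
// Hypothetical, trimmed-down shapes for illustration only.
interface SiteSummary {
  id: string;
  name: string;
  running: boolean;
}

interface TaskSummary {
  id: string;
  siteId: string;
}

type BrowserTarget =
  | { kind: "none" }
  | { kind: "not-running"; site: SiteSummary } // panel shows the "Start the site" empty state
  | { kind: "preview"; site: SiteSummary };

function resolveBrowserTarget(
  sites: SiteSummary[],
  tasks: TaskSummary[],
  selectedTaskId: string | null,
  selectedSiteId: string | null
): BrowserTarget {
  // 1. A selected task wins: use the site it is bound to.
  const task = selectedTaskId ? tasks.find((t) => t.id === selectedTaskId) : undefined;
  // 2. Otherwise fall back to the site selected in the sidebar.
  const siteId = task ? task.siteId : selectedSiteId;
  const site = siteId ? sites.find((s) => s.id === siteId) : undefined;
  if (!site) return { kind: "none" };
  // 3. A resolved site that is not running yields the empty state.
  return site.running ? { kind: "preview", site } : { kind: "not-running", site };
}
```

Modelling the outcome as a discriminated union keeps the empty state explicit, so a caller cannot accidentally render an iframe for a stopped site.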
+
+**Multiple hidden iframes** — All tab iframes are rendered simultaneously. The active tab is visible; inactive tabs use `display: none`. This means switching tabs is instant — scroll position, form inputs, and history are fully preserved because the iframe stays mounted.
+
+**Tab lifecycle:**
+
+- **Site loads** → one tab opens at the homepage (via auto-login URL)
+- **New tab (`+` button)** → opens at the site homepage, becomes active
+- **Switch** → CSS visibility swap, toolbar updates to show the active tab's URL and loading state
+- **Close** → tab removed, its left neighbor activates (or first tab if leftmost was closed). The last remaining tab cannot be closed
+- **Site changes** (sidebar selection or task switch) → all tabs reset to a single fresh tab
+- **Soft cap** — maximum of 8 tabs to limit iframe resource usage
+
+**Keyboard shortcuts:**
+
+- `Cmd+Shift+[` / `Cmd+Shift+]` — switch to previous/next tab (wraps around)
+
+### CSP and Framing
+
+WordPress sends headers that block iframe embedding (`X-Frame-Options` and `Content-Security-Policy` with `frame-ancestors`). The Electron main process strips these headers from all localhost responses so wp-admin pages can load in the iframe. The app's own CSP includes `frame-src http://localhost:*` to allow framing local sites.
+
+See the `onHeadersReceived` handler in `apps/studio/src/index.ts`.
+
+### Loading States
+
+Loading states are tracked per-tab. The toolbar reflects whichever tab is active.
+
+- **Initial load** — Shows a WordPress `` centered on the dark panel background. The iframe is hidden (`opacity-0`) until it fires `onLoad`. The toolbar is always visible so tabs remain accessible.
+- **In-page navigation** — When the user clicks links inside the iframe, a `beforeunload` listener detects the navigation start and shows an indeterminate progress bar (bouncing left-to-right) at the bottom of the toolbar. The bar disappears when the iframe's `onLoad` fires. A blue pulsing dot also appears in the tab strip next to the loading tab's title.
+
+### URL Bar
+
+The toolbar at the top of the panel includes an editable URL input. It updates automatically to reflect the iframe's current URL (read from `contentWindow.location.href` on each load). The user can also type a URL and press Enter to navigate the iframe directly.
+
+### Navigation Controls
+
+Back, Forward, and Reload buttons in the toolbar use the iframe's `contentWindow.history` API. Reload calls `contentWindow.location.reload()` to refresh the current page in-place (not the homepage).
+
+## Key Files
+
+- `apps/studio/src/hooks/use-browser-panel.ts` — Hook encapsulating all browser/tab state, site resolution, navigation handlers, tab management (`addTab`, `closeTab`, `selectTab`), keyboard shortcuts, and per-tab loading state.
+- `apps/studio/src/components/new-ui/panel-layout.tsx` — Renders the browser panel in the secondary panel slot, composing the toolbar, tab bar, and iframe container.
+- `apps/studio/src/components/new-ui/browser-tab-bar.tsx` — Compact tab strip component showing tab titles, close buttons, loading indicators, and new-tab button.
+- `apps/studio/src/components/new-ui/browser-iframe-container.tsx` — Renders all tab iframes simultaneously (active visible, inactive hidden). Each iframe is wrapped in a `BrowserIframe` sub-component that manages its own `beforeunload` listener lifecycle.
+- `apps/studio/src/index.ts` — Main process header stripping for iframe compatibility.
+- `apps/studio/index.html` — Static CSP with `frame-src http://localhost:*`.
+- `apps/studio/src/index.css` — `browser-progress` keyframe animation for the loading bar.
+
+## Agent Browser Control
+
+The AI agent can navigate and reload the in-app browser preview. These tools control the actual preview the user sees — when the agent navigates, the user watches it happen in their active tab.
+
+### Tools
+
+| Tool | Description |
+|------|-------------|
+| `browser_navigate` | Navigate the active tab to a URL or path (e.g. `/wp-admin/`). Relative paths are resolved against the site's base URL. |
+| `browser_reload` | Reload the current page in the active tab. |
+
+Screenshots and page reading use separate Playwright-based tools (`take_screenshot`, `validate_blocks`) that run in the CLI subprocess.
+
+### IPC Flow
+
+The agent runs in a forked CLI subprocess. Browser control messages flow through an IPC bridge:
+
+1. CLI tool calls `process.send({ type: 'ai:browser-navigate', url })` or `process.send({ type: 'ai:browser-reload' })`
+2. Desktop main process (`agent-manager.ts`) receives the message, resolves the `siteId` from the task metadata, resolves relative paths to full URLs using the site's base URL
+3. Main process sends `browser-navigate` or `browser-reload` IPC event to the renderer via `sendIpcEventToRenderer`
+4. Store listener (`stores/index.ts`) dispatches a `studio:browser-navigate` or `studio:browser-reload` custom DOM event
+5. `useBrowserPanel` hook handles the event — navigates or reloads the active tab's iframe
+
+### Key Files
+
+- `apps/cli/ai/tools.ts` — `browserNavigateTool` and `browserReloadTool` definitions (send IPC via `process.send`).
+- `apps/studio/src/modules/ai/lib/agent-manager.ts` — Handles `ai:browser-navigate` and `ai:browser-reload` messages from the CLI child process.
+- `apps/studio/src/ipc-utils.ts` — Defines `browser-navigate` and `browser-reload` IPC event types.
+- `apps/studio/src/stores/index.ts` — Subscribes to IPC events and dispatches DOM custom events.
+- `apps/studio/src/hooks/use-browser-panel.ts` — Listens for custom events and controls the active tab's iframe.
+- `tools/common/ai/system-prompt.ts` — Agent system prompt listing available tools.
+
+## Element Selection for AI Context
+
+Users can select elements from the browser preview to provide as structured context when chatting with the AI agent. This bridges the gap between "I want to change this thing" and the agent knowing exactly which element, styles, and block to target.
+
+### How It Works
+
+A WordPress mu-plugin (`0-element-selector-bridge.php`) injects a lightweight JavaScript bridge into every page via `wp_footer` and `admin_footer`. The script is dormant until activated — zero overhead on normal browsing.
+
+Communication between the Electron renderer and the iframe uses `postMessage`, which works cross-origin by design (unlike direct DOM access via `contentDocument`/`contentWindow.location`, which is blocked).
+
+### User Flow
+
+1. User clicks the **Add notes** button (`sidesAxial` icon) in the browser toolbar — always visible, not just during active tasks.
+2. **Without an active task** — standard selection mode. The renderer posts `studio:select-element:activate` to the active tab's iframe. Any previous selection highlight is cleared.
+3. **With an active task** — persistent selection mode. The renderer posts `studio:select-element:activate-persistent`. Selection mode stays active after each click so users can rapidly annotate multiple elements.
+4. Inside the iframe, hovering highlights elements with a blue overlay. Smart targeting resolves clicks on inline text to their parent semantic element (button, link, etc.).
+5. User clicks an element → the bridge applies a persistent `outline` highlight directly on the element, extracts metadata, and posts `studio:select-element:selected` back to the parent.
+6. A glassmorphic floating chat input appears near the selected element in the browser panel. The input is available in both task-chat and project-detail modes:
+   - **Without an active task** — sending creates a new task and starts the agent.
+   - **With an active task** — sending routes the message through the existing task chat (or queues it if the agent is busy). The element stays highlighted with a numbered floating label showing truncated note text.
+7. **Multi-note flow** — With an active task, users can keep selecting elements and sending notes in rapid succession. Each note is numbered sequentially. If the agent is already responding, notes are queued and auto-sent when the agent finishes each turn.
+8. **Note lifecycle** — When the agent completes a response for a note, that note's highlight turns green (success state) and fades away after ~2 seconds.
+9. Pressing **Escape** clears the selection highlight and exits selection mode (handled in both the iframe and the parent frame).
+
+### Element Data Captured
+
+| Field | Description |
+|-------|-------------|
+| `cssSelector` | Unique selector path using id, classes, and nth-of-type |
+| `tagName` | HTML tag name (lowercase) |
+| `outerHTML` | Element markup, truncated to 2000 chars |
+| `textContent` | Text content, truncated to 500 chars |
+| `computedStyles` | Key CSS properties: color, background, font, padding, margin, border, display, position, dimensions |
+| `boundingBox` | Position and size from `getBoundingClientRect()` |
+| `domPath` | Ancestor chain, e.g. `["body", "div.site", "main", "section.hero", "h1"]` |
+| `wpBlockName` | WordPress block name from `data-type` attribute or `wp-block-*` class pattern |
+
+### Data Flow
+
+```
+mu-plugin JS (iframe)
+  → window.parent.postMessage({ type: 'studio:select-element:selected', element })
+  → useElementSelector hook (renderer) stores single element (replaces any previous)
+  → Floating input displays chip (always shown — both task-chat and project-detail modes)
+  → On send: element threaded through IPC (same path as images)
+  → headless.ts serializes as text block prepended to the prompt
+  → Agent receives structured element data alongside the user's message
+```
+
+### Selection Highlight
+
+The selected element keeps a visible `outline` directly applied to the DOM element (not a separate overlay div). This is more reliable than positioned overlays — outlines aren't clipped by `overflow: hidden` on ancestors and don't require z-index management.
+
+Highlights are cleared via `studio:select-element:clear` postMessage in these cases:
+- Entering selection mode (to replace any stale highlight)
+- Selecting a new element (clear previous before applying new)
+- Pressing Escape during selection
+- Dismissing the element chip or sending the message
+
+### Note Overlays
+
+When a note is sent from the floating input during an active task, the selected element gets a persistent overlay managed by the iframe bridge:
+
+- **`studio:notes:add`** — Creates an absolutely-positioned highlight div (blue outline) and a label pill showing `#N: truncated text` at the element's top-right corner.
+- **`studio:notes:complete`** — Turns the highlight green and prepends a checkmark to the label (success state).
+- **`studio:notes:remove`** — Fades out and removes the overlay DOM elements.
+- **`studio:notes:clear`** — Removes all note overlays (e.g. when switching tasks).
+
+Overlays reposition on scroll/resize via debounced `querySelector` + `getBoundingClientRect`. If the element is removed by the agent, the overlay cleans itself up gracefully.
+
+Note state is tracked in Redux (`activeNotesByTask` in `tasks-slice.ts`) with an `ActiveNote` interface that stores `id`, `number`, `text`, `cssSelector`, `boundingBox`, and `status` (`active` → `completed` → `fading`). The `BrowserNoteOverlays` component bridges Redux state to iframe postMessages, handling the timed fade lifecycle.
+
+### Floating Input
+
+A compact floating input appears positioned near the selected element (below it, or above if insufficient space). It uses glassmorphism (`backdrop-blur-xl` with semi-transparent background) to stay readable over any page content. The input auto-focuses and supports Enter to send, Escape to dismiss.
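The "below it, or above if insufficient space" placement rule can be illustrated with a small pure function. This is a sketch under assumptions, not the code in `browser-floating-input.tsx`: `Box`, `placeFloatingInput`, and the 8px default gap are hypothetical, and coordinates are taken to be relative to the top of the browser panel.

```typescript
// Hypothetical box shape, in pixels relative to the top-left of the browser panel.
interface Box {
  top: number;
  left: number;
  width: number;
  height: number;
}

interface Placement {
  top: number;
  placement: "below" | "above";
}

// Prefers the space under the element; flips above when the input would overflow the panel.
function placeFloatingInput(
  target: Box,
  panelHeight: number,
  inputHeight: number,
  gap: number = 8 // assumed spacing between element and input
): Placement {
  const belowTop = target.top + target.height + gap;
  if (belowTop + inputHeight <= panelHeight) {
    return { top: belowTop, placement: "below" };
  }
  // Clamp at zero so the input never starts above the panel's top edge.
  const aboveTop = Math.max(0, target.top - gap - inputHeight);
  return { top: aboveTop, placement: "above" };
}
```

For example, with a 600px panel and an 80px input, an element at `top: 100, height: 50` gets the input below it, while one at `top: 550, height: 40` flips to above.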
+
+The floating input is always visible regardless of whether a task is active:
+- **Without an active task** — Sending creates a new task and starts the agent.
+- **With an active task** — Sending routes the message through the existing task chat system. If the agent is streaming, messages are queued via the existing `enqueueMessage` mechanism and auto-sent when the agent finishes. After sending, persistent selection mode re-activates so the user can immediately pick another element.
+
+### Key Files
+
+- `tools/common/lib/mu-plugins.ts` — `0-element-selector-bridge.php` mu-plugin with the in-iframe JS bridge (hover overlay, click handler, element data extraction, postMessage communication).
+- `apps/studio/src/hooks/use-element-selector.ts` — Hook managing selection state, postMessage listener, and selected elements collection.
+- `apps/studio/src/hooks/use-browser-panel.ts` — Exposes `getActiveIframe()` for the element selector to post messages to the active tab.
+- `apps/studio/src/components/new-ui/browser-floating-input.tsx` — Floating chat input that appears in the browser panel when elements are selected. Routes to new task creation (project-detail mode) or existing task chat with queueing support (task-chat mode).
+- `apps/studio/src/components/new-ui/browser-note-overlays.tsx` — Lifecycle bridge between Redux note state and iframe postMessages. Handles completed→fading→removed transitions with timed delays.
+- `apps/studio/src/components/new-ui/panel-layout.tsx` — Add Notes button in the browser toolbar.
+- `apps/studio/src/modules/ai/types.ts` — `ElementAttachment` interface.
+- `apps/cli/commands/ai/headless.ts` — `serializeElementContext()` and updated `buildContentBlocks()` for threading element data into agent prompts.
+- `tools/common/ai/system-prompt.ts` — Documents element context usage for the agent.
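To make the "serialized as a text block prepended to the prompt" step concrete, here is a rough sketch of turning captured element data into prompt text. It is not the actual `serializeElementContext()` from `headless.ts`: the `ElementContext` shape is a subset of the captured fields, `describeElement` and `truncate` are hypothetical helpers, the output layout is invented for illustration, and the 2000/500 limits are re-applied here as a defensive assumption even though the document says truncation happens at capture time.

```typescript
// Subset of the captured element fields; the real ElementAttachment has more.
interface ElementContext {
  cssSelector: string;
  tagName: string;
  outerHTML: string;
  textContent: string;
  wpBlockName?: string;
}

// Limits taken from the "Element Data Captured" table.
const MAX_OUTER_HTML = 2000;
const MAX_TEXT = 500;

function truncate(value: string, max: number): string {
  return value.length > max ? value.slice(0, max) : value;
}

// Builds a plain-text description that could be placed ahead of the user's message.
function describeElement(el: ElementContext): string {
  const lines: string[] = [
    `Selected element: <${el.tagName}>`,
    `Selector: ${el.cssSelector}`,
  ];
  if (el.wpBlockName) {
    lines.push(`Block: ${el.wpBlockName}`);
  }
  lines.push(`Text: ${truncate(el.textContent, MAX_TEXT)}`);
  lines.push(`HTML: ${truncate(el.outerHTML, MAX_OUTER_HTML)}`);
  return lines.join("\n");
}

function buildPrompt(userMessage: string, el: ElementContext): string {
  return `${describeElement(el)}\n\n${userMessage}`;
}
```

Capping the markup keeps a single large element from dominating the prompt, which is presumably the motivation for the limits in the table above.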
+ +## Design Decisions + +- **iframe over webview/BrowserView** — Simpler integration, works within the existing React panel layout, and localhost sites don't have the security concerns that would warrant a separate process. +- **Multiple hidden iframes for tabs** — All tab iframes stay mounted in the DOM (inactive ones use `display: none`). This is the simplest way to preserve full page state (scroll position, form inputs, JavaScript state, history) when switching tabs. Memory cost is negligible for the expected 2–5 tabs. +- **Tab state in the hook, not Redux** — Browser tabs are local UI state within the panel. No other component needs to know about them, so Redux would be unnecessary indirection. +- **Agent controls the real preview, not a hidden browser** — Navigate and reload go through the in-app browser so the user sees the agent's actions in real-time. Heavier inspection (screenshots, block validation) uses Playwright in the CLI subprocess, which doesn't need Electron APIs. +- **Dark toolbar (`#1d2327`)** — Matches the WordPress admin bar color so the toolbar and admin bar blend together visually. +- **Auto-login on every load** — The iframe `src` always goes through `/studio-auto-login` to ensure the session stays authenticated, even after reload. +- **No fake progress bar** — The loading indicator is an honest indeterminate bar (bouncing animation) since iframes don't expose real loading progress. +- **Tabs reset on site switch** — When the user selects a different site or task, all tabs are replaced with a single fresh tab. Preserving tabs across site switches would be confusing since each site has its own localhost URL. +- **postMessage for element selection** — The iframe's cross-origin restrictions block direct DOM access, but `postMessage` works across origins by design. A mu-plugin injects the bridge script into every WordPress page, keeping the iframe architecture intact while enabling rich element inspection. 
This avoids a costly migration to `<webview>` or BrowserView. +- **Add Notes as unified entry point** — The button is always visible and adapts to context. Without a task, it creates one. With a task, it feeds into the existing chat/queue system. This keeps the interaction discoverable and reduces friction — users don't need to think about which mode they're in. +- **Persistent selection mode for tasks** — When a task is active, clicking "Add notes" enters persistent selection mode. After selecting an element and sending a note, selection mode stays active so users can immediately pick another element. This supports the rapid multi-note workflow without requiring repeated button clicks. +- **One element per note, multiple notes per task** — Each note targets a single element to keep the agent's context focused. But users can fire off multiple notes in rapid succession — they're queued and processed sequentially, with numbered overlays tracking which note is active. +- **Note overlays in the iframe** — Post-send note highlights and labels are rendered as absolutely-positioned divs inside the iframe (via the mu-plugin bridge) rather than as React components over the iframe. This means they scroll with the page content and stay aligned with their target elements without requiring cross-frame coordinate translation. +- **FIFO note completion** — Note lifecycle maps 1:1 with agent response cycles. Each `task-status-changed` → `waiting` event completes the oldest active note. This works because the message queue is also FIFO — the order of note creation matches the order of agent processing. +- **Outline over overlay divs for initial selection highlight** — The selection-mode highlight uses `el.style.outline` directly on the element rather than a positioned overlay div. Outlines are immune to `overflow: hidden` clipping, don't require scroll-position calculations, and don't need z-index management.
Note overlays (post-send) use positioned divs instead, since they need labels and color transitions. +- **Glassmorphic floating input** — The browser panel floating input uses `backdrop-blur-xl` with a semi-transparent background so it stays readable regardless of the page content behind it. Positioned near the selected element to keep the spatial connection obvious. diff --git a/NEW-SITE-CREATION.md b/NEW-SITE-CREATION.md new file mode 100644 index 0000000000..4119ea5770 --- /dev/null +++ b/NEW-SITE-CREATION.md @@ -0,0 +1,272 @@ +# New Project Creation Flow + +## Vision + +Studio builds things **powered by WordPress**. Not just sites — anything. A blog, an online store, a social network, a mobile app, a game, a newspaper, a community hub, an API. WordPress is the foundation: data, auth, APIs, media, admin. The frontend can be anything. + +Today Studio builds WordPress block themes. But the architecture supports headless builds too — WordPress as the backend with React, Vue, or whatever serves the frontend. The creation flow is the same either way. The agent just uses different tools depending on the stack. + +## v1 Scope + +**Fully working:** WordPress block theme path — spec → design → build → done. + +**Functional but basic:** Headless path — spec → design → `site_create` with default theme → enable REST API → scaffold connected starter app. Polished headless frontend building is future work. + +**Existing and unchanged:** Import from Jetpack Backup, WordPress Export (.xml), and WordPress.com sync. These flows use the current `AddSiteModal` module. + +**New but stubbed:** "Import anything" via URL — paste any website URL and the agent scrapes design/content to recreate it locally. v1 can fake this with a placeholder flow that captures the URL and creates a basic site inspired by it. + +## Implementation Status + +### Done + +- **Add-site window removed.** Dedicated Electron window (`add-site-window.ts`, `add-site-root.tsx`) deleted. 
IPC handlers, preload bridge, renderer routing, and `isAddSiteVisible` menu state all cleaned up. +- **Creation flow in main window.** `CreateProjectFlow` renders in the primary panel with DotGrid background. "Something new" and "Bring something you already have" chooser cards with frosted glass styling. Triggered via sidebar button, Cmd+N menu, `create-project` IPC event, or auto-start for new users. +- **Placeholder task system.** "Something new" creates a task with `SETUP_SITE_ID` (`__project-setup__`) — no site on disk. Task list shows setup tasks ungrouped (no site header). When the agent calls `site_create`, the `createSite` handler automatically migrates setup tasks to the real site. +- **First-time user experience.** Auto-starts creation flow when there are no sites AND no tasks. Sidebar and browser panel collapse for distraction-free full-width experience. +- **Questionnaire system.** Separate IPC channel (`ai:question-request` / `ai:question-response`) independent from permissions. Full round-trip: CLI headless → agent-manager → IPC event → Redux `pendingQuestions` → `TaskQuestionPrompt` UI → response back to CLI. Options render as a vertical text list with hover-to-theme styling. Chat input doubles as free-form answer when a question is pending. +- **Browser preview infrastructure.** Preview server (`src/lib/preview-server.ts`) serves local HTML files via `http://localhost:` to satisfy CSP. Agent-manager converts file paths to localhost URLs. Browser panel accepts preview tabs without a running site (`hasContent` flag). Panel auto-expands when preview content arrives. `browser_navigate` tool description updated to mention local file paths. +- **Task chat max-width.** Messages and input capped at `max-w-3xl` (768px). +- **Site-spec skill rewritten.** Conversational, not a form. Leads with "read what the user already told you" — only asks about what's missing. Design preview phase is mandatory with explicit `browser_navigate` instructions. 
+- **Sidebar layout.** Tasks section scrolls when long; projects section stays visible at bottom. "Add project" button shows active state when creation mode is on. +- **Floating tour component.** Built (`src/components/new-ui/floating-tour.tsx`) with step-by-step tooltips, arrow positioning, dismiss persistence. Not yet integrated into the post-creation transition. +- **Import step UI.** Shows WordPress.com, Pressable, Jetpack Backup, WordPress Export, and URL options. Handlers are stubbed (TODO). +- **Orphan task cleanup.** Task list filters out tasks whose site no longer exists. + +### Not yet done + +- **Import flow wiring.** The import options render but `handleSelect` is a TODO. Needs to connect to existing `AddSiteModal` flows for backup/xml/wpcom, and build the URL import agent flow. +- **Floating tour integration.** Component exists but is never triggered. Needs to fire after the user commits to a design and the build starts — point to the build task and project detail. +- **Post-creation transition.** Sidebar should auto-expand when the build starts. User should land on Project Detail view. "Your project is ready!" moment when build completes. None of this is wired. +- **Headless path.** Stack choice exists in the skill but no build logic. Needs: React/Vue scaffold tools, REST API enablement, frontend starter project generation. +- **Design iteration UX.** No structured "commit" moment — the agent just waits for text. Could benefit from a clear "Ready to build?" UI element. +- **Preview storage cleanup.** Files in `~/Studio/previews/` persist but are never cleaned up when a project is deleted. + +## Flow + +The creation flow is the user's **first Task**. It runs in the main window using the existing Task system — primary panel for the spec and chat, browser panel for design previews. When creation is done, the task persists in the sidebar like any other task. 
+ +``` +Welcome / Auth (existing, unchanged) + │ + ▼ +┌─────────────────────────────────────────────────────────┐ +│ WHAT DO YOU WANT TO BUILD? │ +│ (sidebar hidden, full-width, dot grid bg) │ +│ │ +│ ┌──────────────────┐ ┌──────────────────────┐ │ +│ │ Something │ │ Bring something │ │ +│ │ new │ │ you already have │ │ +│ │ │ │ │ │ +│ │ A blog, a game, │ │ From WordPress.com, │ │ +│ │ an app, a │ │ a backup, a URL, │ │ +│ │ store, a social │ │ or anywhere else. │ │ +│ │ network — │ │ │ │ +│ │ anything. │ │ │ │ +│ └────────┬─────────┘ └──────────┬───────────┘ │ +└────────────┼─────────────────────────┼───────────────────┘ + │ │ + ▼ ▼ +┌───────────────────────┐ ┌──────────────────────────────┐ +│ PROJECT SPEC (chat) │ │ WHERE IS IT? │ +│ │ │ │ +│ Agent reads the │ │ ○ WordPress.com │ +│ user's initial │ │ ○ Pressable │ +│ message and only │ │ ○ Jetpack Backup │ +│ asks about what's │ │ ○ WordPress Export (.xml) │ +│ genuinely missing. │ │ ○ URL │ +│ │ │ Paste any website and │ +│ (see site-spec │ │ we'll pull it in │ +│ skill) │ └──────────┬───────────────────┘ +│ │ │ +│ │ ▼ +│ │ ┌──────────────────────────────┐ +│ │ │ IMPORT (chat) │ +│ │ │ Existing imports: standard │ +│ │ │ modal flow (backup, xml, │ +│ │ │ wpcom sync). │ +│ │ │ URL import: agent scrapes │ +│ │ │ and recreates (stubbed v1). │ +│ │ └──────────────────────────────┘ +└───────────┬───────────┘ + │ + ▼ +┌─────────────────────────────────────────────────────────┐ +│ DESIGN OPTIONS (chat + browser panel) │ +│ │ +│ Agent generates 2-3 polished design directions │ +│ as HTML/CSS/JS, rendered in the browser panel via │ +│ preview server (http://localhost:). │ +│ │ +│ Files stored in ~/Studio/previews/. │ +│ Each option is a standalone .html file. An index.html │ +│ ties them together for side-by-side comparison. │ +│ Agent calls browser_navigate with the absolute file │ +│ path to show previews in the browser panel. 
│ +│ │ +│ User picks / iterates / rejects via chat │ +│ User commits: "Let's go with this one" │ +└─────────────────────────┼────────────────────────────────┘ + │ + ▼ +┌─────────────────────────────────────────────────────────┐ +│ BUILD │ +│ │ +│ Agent calls site_create → task migrated from │ +│ __project-setup__ to real site automatically. │ +│ Block theme → wp_cli → validate → screenshot │ +│ │ +│ TODO: Sidebar expansion, floating tour, Project │ +│ Detail holding pattern. │ +│ │ +│ "Your project is ready!" │ +└─────────────────────────────────────────────────────────┘ +``` + +## First-Time User Experience + +New users see the Welcome and Permissions screens (existing, unchanged — handles WP.com auth). After that: + +1. **Creation flow auto-starts.** The "What do you want to build?" screen fills the main window with a dot grid background. Sidebar and browser panel are hidden — no distractions. Triggers when there are no sites AND no tasks. +2. **Spec and design happen in the primary + browser panels.** The user's first interaction with Studio is a conversation. They don't need to learn the app's full UI yet. The user's initial description appears as the first chat message. +3. **After committing to a design,** the sidebar expands for the first time. A floating tour component (built, not yet integrated) points to key elements: + - The build task in the sidebar ("Studio is building your site") + - The Project Detail view ("Check out what your project can do") +4. **Project Detail is the holding pattern.** While the agent builds the theme (5+ minutes), the user explores publishing to WordPress.com/Pressable, sync, and preview links. This is productive wait time. (TODO: not yet wired) +5. **Build completes.** The task updates, the browser panel shows the real site. The user can start a new task to iterate. + +Returning users with existing projects skip straight to the main app. 
The "+ Add project" button in the sidebar triggers the same creation flow, but with the sidebar visible. + +## Project Spec Skill + +The `site-spec` skill (`apps/cli/ai/plugin/skills/site-spec/SKILL.md`) gathers everything the agent needs before designing. It's a conversation, not a form. + +**Key principle:** The agent reads the user's initial message first and extracts everything it can — name, purpose, audience, tone, references. It only asks about what's genuinely missing. If the user front-loaded everything, it skips straight to design. + +**What the agent needs** (asks only for gaps): +- **Name** — suggest one if the user didn't provide it +- **Goals & Context** — purpose, audience, references +- **Structure** — one-page or multi-page (AskUserQuestion if unclear) +- **Stack** — WordPress theme, React+WP, Vue+WP, or "whatever works best" (AskUserQuestion, skip if obvious) +- **Tone & Style** — visual direction, colors, fonts, inspirations + +### Content Strategy + +Real-ish copy matching the project's purpose and tone, not lorem ipsum. Stock photos for imagery based on the project description. + +## Questionnaire System + +The agent uses `AskUserQuestion` to ask structured questions (stack choice, structure, etc.). This is a **separate IPC channel** from the permission system. 
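
A separate question channel needs some way to match each answer to the question that triggered it. The sketch below shows one way to do that by keying pending questions on a request ID. It is an assumption-laden illustration, not the Studio implementation: the `PendingQuestions` class and its method names are hypothetical, and the `QuestionRequest` shape is inferred from the fields this document lists (request ID, task ID, question, options).

```typescript
// Assumed shape: options render as a label plus an optional description.
interface QuestionOption {
  label: string;
  description?: string;
}

// Assumed shape, based on the fields named in this document.
interface QuestionRequest {
  requestId: string;
  taskId: string;
  question: string;
  options: QuestionOption[];
}

// Hypothetical registry that pairs each outgoing question with its answer.
class PendingQuestions {
  private resolvers = new Map<string, (answer: string) => void>();

  // Registers the question, forwards it via `send`, and resolves once answered.
  ask(request: QuestionRequest, send: (req: QuestionRequest) => void): Promise<string> {
    return new Promise<string>((resolve) => {
      this.resolvers.set(request.requestId, resolve);
      send(request);
    });
  }

  // Delivers an answer; returns false if the request ID is unknown or already answered.
  respond(requestId: string, answer: string): boolean {
    const resolve = this.resolvers.get(requestId);
    if (!resolve) {
      return false;
    }
    this.resolvers.delete(requestId);
    resolve(answer);
    return true;
  }

  // Number of questions still waiting for an answer.
  get size(): number {
    return this.resolvers.size;
  }
}
```

Because answers are looked up by request ID rather than by arrival order, a typed free-form answer and a clicked option can share the same response path.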
+ +### Architecture + +| Layer | Component | What it does | +|---|---|---| +| CLI | `headless.ts` `createIpcAskUserHandler()` | Sends `ai:question-request`, awaits `ai:question-response` | +| Main | `agent-manager.ts` | Forwards `ai:question-request` → renderer, `ai:question-response` → CLI | +| IPC | `task-question-request` event | Carries `QuestionRequest` (requestId, taskId, question, options) | +| Redux | `pendingQuestions` in tasks slice | Stores pending questions | +| UI | `TaskQuestionPrompt` | Renders question text + option buttons | +| UI | `TaskChatInput` | Detects pending question, routes typed text as answer | +| IPC | `respondToQuestionHandler` | Sends answer back through main → CLI | + +### Rendering + +- Question text shown above options +- Options render as a vertical list: bold label + muted description, hover turns text theme color +- No borders, no cards — clean text list +- Chat input placeholder changes to "Or type your own answer..." when a question is pending +- Typing in the chat input and submitting answers the question (bypasses the queue) + +## Design Previews + +### What they are + +Polished, impressive HTML/CSS/JS mockups. Not wireframes — these should look like real sites. The agent generates standalone `.html` files for each option, plus an `index.html` that shows all options side-by-side in iframes. + +### How they render + +The agent calls `browser_navigate` with the absolute file path (e.g., `/Users/.../previews/index.html`). The agent-manager detects the local file path, starts the preview server (`src/lib/preview-server.ts`), converts the path to `http://localhost:/index.html`, and sends the URL to the browser panel. CSP allows `http://localhost:*` in iframes. + +For setup tasks (no real site), the agent-manager sends the task ID instead of `SETUP_SITE_ID` so the browser panel's event matching works correctly. + +### Storage + +Preview files stored in `~/Studio/previews/`. Not temp — users can revisit. 
TODO: cleanup when a project is deleted. + +### Iteration + +Via chat — "Make the hero bigger", "More whitespace", "Can you try a dark version?". When ready, the user commits: "Let's go with this one" / "Build it". + +## Post-Creation Transition + +When the user commits to a design: + +1. **Agent calls `site_create`.** The `createSite` IPC handler automatically migrates any tasks with `SETUP_SITE_ID` to the newly created real site. The task now belongs to the real project. +2. **Sidebar should expand** (TODO: not yet automated). +3. **Floating tour should fire** (TODO: component exists at `src/components/new-ui/floating-tour.tsx` but not integrated). +4. **User should land on Project Detail view** (TODO: not wired). +5. **Build completes.** Task status updates. Browser panel can show the real running site. + +## Cleanup: Remove Add-Site Window — DONE + +The dedicated add-site Electron window has been fully removed: + +- **Deleted:** `add-site-window.ts`, `add-site-root.tsx` +- **Removed:** `openAddSiteWindow` / `closeAddSiteWindow` IPC handlers, preload bridge, renderer `view === 'add-site'` routing, `isAddSiteVisible` menu state +- **Redirected:** Sidebar button → `setCreatingProject(true)`, Menu → `create-project` IPC event, Deeplink → `create-project` route +- **Kept:** `CreateProjectFlow`, `ImportProjectStep`, `AddSiteModal` (existing imports), blueprint deeplink handler + +## Architecture + +### Main Window + +The creation flow runs in the main app window using the existing panel layout: + +- **Primary panel** — "What do you want to build?" chooser, then Task chat for spec conversation +- **Browser panel** — Design previews via preview server. Auto-expands when content arrives. +- **Sidebar** — Hidden during first-time creation. "Add project" button shows active state. + +### Components + +| Component | File | Status | +|---|---|---| +| Chooser (new vs. 
import) | `src/components/new-ui/create-project/create-project-flow.tsx` | Done | +| Import source picker | `src/components/new-ui/create-project/import-project-step.tsx` | UI done, handlers stubbed | +| Floating tour | `src/components/new-ui/floating-tour.tsx` | Built, not integrated | +| Task chat panel | `src/components/new-ui/tasks/task-chat-panel.tsx` | Done (max-width added) | +| Question prompt | `src/components/new-ui/tasks/task-question-prompt.tsx` | Done | +| Panel layout | `src/components/new-ui/panel-layout.tsx` | Done (auto-start, browser auto-expand) | +| Sidebar | `src/components/new-ui/sidebar.tsx` | Done (scroll fix, active state) | +| Task list | `src/components/new-ui/tasks/task-list.tsx` | Done (setup tasks, orphan cleanup) | +| Preview server | `src/lib/preview-server.ts` | Done | +| Site menu | `src/components/site-menu.tsx` | Done (deselect during creation) | + +### AI Pipeline + +| Component | File | Status | +|---|---|---| +| System prompt | `tools/common/ai/system-prompt.ts` | Updated (browser_navigate mentions file paths) | +| Site-spec skill | `apps/cli/ai/plugin/skills/site-spec/SKILL.md` | Rewritten (conversational, design phase mandatory) | +| Agent tools | `apps/cli/ai/tools.ts` | Updated (browser_navigate supports local files) | +| Agent manager | `apps/studio/src/modules/ai/lib/agent-manager.ts` | Updated (question IPC, preview server, SETUP_SITE_ID handling) | +| Headless agent | `apps/cli/commands/ai/headless.ts` | Updated (question channel, empty text fix) | +| IPC handlers | `apps/studio/src/modules/ai/lib/ipc-handlers.ts` | Updated (respondToQuestionHandler, task migration) | + +### Data + +- **Task metadata** — Stored in `appdata-v1.json`. Setup tasks use `siteId: '__project-setup__'`, migrated to real site on creation. 
+- **Chat messages** — Stored in `localStorage` via Redux listener +- **Design previews** — Stored in `~/Studio/previews/`, served via preview server +- **Session resume** — `TaskMetadata.sessionId` enables resuming if app closes mid-flow + +### Constants + +- `SETUP_SITE_ID = '__project-setup__'` — placeholder siteId for tasks before a real site exists + +## Terminology + +- **"Project"** in user-facing copy, not "site" +- Internal code keeps existing naming to avoid churn +- Sidebar uses "Projects" as the section header, "Add project" button +- Menu uses "New Project..." (Cmd+N) diff --git a/NEW-UI.md b/NEW-UI.md new file mode 100644 index 0000000000..f6c63310ba --- /dev/null +++ b/NEW-UI.md @@ -0,0 +1,107 @@ +# New UI Redesign + +This branch (`new-app-interface`) is a ground-up redesign of the Studio desktop app interface. The goal is to replace the existing single-sidebar layout with a flexible, modern three-panel architecture while establishing a proper design token system. + +## Status + +**Proof of Concept** — The panel structure, color system, navigation sidebar, settings window, and embedded browser are functional. The nav panel displays real site data with full site management controls. The primary panel renders SiteContentTabs or the Task chat. The secondary panel is a live browser preview of the active site with auto-authentication. The overview tab has been simplified to only show "Open in..." shortcuts (theme screenshot preview, Customize section, and all related thumbnail/theme-details infrastructure have been removed). + +## Architecture + +### Three-Panel Layout + +The app is built around three resizable panels using `react-resizable-panels`: + +- **PanelNavigation** — Left sidebar with Tasks and Sites sections. Shows real site data via `SiteMenu` with drag-and-drop reordering and start/stop controls. Collapsible via `Cmd+B` or by dragging narrow. +- **PanelPrimary** — Main content area. White/frame background. Always visible. 
Renders `SiteContentTabs` for the selected site, or `TaskChatPanel` when a task is selected. +- **PanelSecondary** — Embedded browser preview of the active site. Auto-authenticates via `/studio-auto-login` so the WP admin bar is visible. Includes back/forward/reload controls and an editable URL bar. Collapsible via `Cmd+Shift+B`. See `NEW-BROWSER.md` for full details. + +Each panel has min/max width constraints and drag-to-resize handles between them. Collapse/expand is animated. The primary panel toolbar adapts on macOS to make room for traffic lights when the nav panel is collapsed (including when collapsed via drag). + +**Panel persistence**: Panel sizes are saved/restored via `react-resizable-panels`' `useDefaultLayout` hook using `localStorage`. Collapsed state is tracked separately under the `panelLayout:collapsed` key so panels restore correctly on reload. + +### Navigation Sidebar + +The sidebar (`sidebar.tsx`) has two sections: + +- **Tasks** (top) — Task list with create, archive, and clear-archived controls. See `NEW-AGENT.md` for full details. +- **Sites** (bottom, pinned) — Header with site count and Start all / Stop all toggle. Renders the existing `SiteMenu` component which provides drag-and-drop reordering, context menus, start/stop controls, and spinner states for operations in progress. + +### Site Menu Updates + +The `SiteMenu` component (`site-menu.tsx`) has been updated for the new UI: + +- **Design tokens**: All hardcoded hex colors replaced with `chrome-*` tokens for proper light/dark mode support. +- **Animated drag-and-drop**: Items animate into position during drag using `translateY` transitions. An `orderMap` tracks each site's visual position while a `previewSites` state shows the reordered list. No empty spacer elements needed. + +### Toolbar + +A shared `Toolbar` component provides three slots -- `start`, `middle`, `end` -- where `middle` is absolutely centered regardless of the other slots' content width. 
Each panel has its own toolbar: + +- **Nav toolbar**: Settings button (end) +- **Primary toolbar**: Nav toggle (start), project name (middle), secondary toggle (end) +- **Secondary toolbar**: Back/forward/reload (start), editable URL input (fills remaining space). Styled with WP admin bar colors (`#1d2327` bg, `#a7aaad` text). Hidden during initial load; shows an indeterminate progress bar at the bottom during navigation. + +Toolbars use `@wordpress/components` `Button` with `icon` prop and icons from `@wordpress/icons`. + +### Color System + +Two token families, both defined as CSS custom properties with light/dark mode variants: + +**Chrome tokens** (`--color-chrome-*`) -- For the window background and navigation panel. Light mode uses a warm gray; dark mode uses near-black with white text at varying opacities. + +**Frame tokens** (`--color-frame-*`) -- For content panels. These existed before and are unchanged. + +All tokens are mapped to Tailwind classes (e.g., `bg-chrome`, `text-chrome-text-secondary`, `bg-frame`, `text-frame-text`). Panel separator handles also use these tokens. + +### Settings Window + +Settings opens in its own `BrowserWindow` rather than an in-app overlay. The renderer routes between the main app and settings based on a `?view=settings` URL parameter. The old modal-based settings UI has been fully removed — all settings now live in this window. + +**Tabs (user-facing):** +- **General** — Appearance (color scheme), language, code editor, terminal, Studio CLI toggle. All settings save instantly on change (no Save/Cancel buttons). +- **Account** — User info with Gravatar, logout, preview site quota/management, AI assistant prompt usage. Shows login prompt when not authenticated. +- **Skills** — Global WordPress skills management (install/remove across all sites). +- **MCP** — MCP server configuration JSON with copy button. 
+ +**Tabs (dev-only, development builds):** +- **Automattician** — Platform override for UI testing (replaces the old floating DevController). +- **Colors** — Color token reference documentation. +- **WP Components** — WordPress component library showcase. +- **Studio Components** — Studio component demos with mock data. + +**Tab deep-linking:** `openSettingsWindow('skills')` opens the window directly to a specific tab via URL parameter (`?view=settings&tab=skills`). If the window is already open, it reloads to the requested tab and focuses. + +**Providers:** The settings window wraps with Redux, I18n, and Auth providers (minimal subset of the main app's provider stack — no site/onboarding providers needed). + +The **Studio Components** tab renders real Studio components (SiteMenu, Sidebar) with mock data using `MockProviders` that supply a fake `siteDetailsContext`, Redux store, and other required providers. + +## Key Files + +``` +apps/studio/src/ +├── components/ +│ ├── app.tsx # Root -- keyboard shortcuts, panel refs +│ ├── site-menu.tsx # Site list with drag-and-drop (updated for tokens) +│ └── new-ui/ +│ ├── panel-layout.tsx # Three-panel layout with persistence + browser +│ ├── toolbar.tsx # Start/middle/end toolbar component +│ ├── sidebar.tsx # Navigation: Tasks + Sites sections +├── hooks/ +│ └── use-browser-panel.ts # Browser panel state, navigation, auto-login +│ ├── settings-root.tsx # Settings window with all app settings +│ ├── studio-component-library.tsx # Studio component demos with mock data +│ └── color-system-reference.tsx # Color token docs (in settings) +├── settings-window.ts # Electron BrowserWindow for settings +├── index.css # Color token definitions +└── renderer.ts # Entry point with view routing +``` + +## Dev Tools + +Platform switching for UI testing lives in the settings window's **Automattician** tab (dev builds only). Changes propagate to the main window via `localStorage` cross-window events. 
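
The cross-window propagation mentioned above presumably relies on the browser `storage` event, which fires in other same-origin windows when `localStorage` changes. The following is a minimal sketch under that assumption. The key name `dev:platformOverride`, the set of platform values, and the function name are hypothetical and not taken from the Studio source.

```typescript
// Hypothetical key; the real key used by Studio is not shown in this document.
const PLATFORM_OVERRIDE_KEY = 'dev:platformOverride';

// Hypothetical set of override values.
type Platform = 'darwin' | 'win32' | 'linux';
const PLATFORMS: readonly Platform[] = ['darwin', 'win32', 'linux'];

// Minimal subset of the DOM StorageEvent fields this sketch reads.
interface StorageChange {
  key: string | null;
  newValue: string | null;
}

// Returns undefined for unrelated keys, null when the override is cleared
// or unrecognized, and the platform when a valid override is set.
function platformFromStorageChange(change: StorageChange): Platform | null | undefined {
  if (change.key !== PLATFORM_OVERRIDE_KEY) {
    return undefined;
  }
  const value = change.newValue;
  if (value === null) {
    return null;
  }
  return PLATFORMS.find((p) => p === value) ?? null;
}

// In the main window's renderer this could be wired up roughly as:
// window.addEventListener('storage', (event) => {
//   const next = platformFromStorageChange(event);
//   if (next !== undefined) applyPlatformOverride(next); // hypothetical helper
// });
```

The `storage` event is not delivered to the window that made the change, so the settings window would still need to apply the override locally.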
+ +## What's Next + +- Add site creation flow to the sidebar +- Port remaining functionality from the legacy UI diff --git a/apps/cli/ai/agent.ts b/apps/cli/ai/agent.ts index f953cae088..d6cdb497ae 100644 --- a/apps/cli/ai/agent.ts +++ b/apps/cli/ai/agent.ts @@ -1,5 +1,6 @@ import path from 'path'; import { query, type Query } from '@anthropic-ai/claude-agent-sdk'; +import { buildSystemPrompt } from '@studio/common/ai/system-prompt'; import { ALLOWED_TOOLS, STUDIO_ROOT, @@ -7,13 +8,12 @@ import { promptForApproval, type AskUserQuestion, } from 'cli/ai/security'; -import { buildSystemPrompt } from 'cli/ai/system-prompt'; import { createStudioTools } from 'cli/ai/tools'; export type { AskUserQuestion } from 'cli/ai/security'; export interface AiAgentConfig { - prompt: string; + prompt: string | AsyncIterable< import('@anthropic-ai/claude-agent-sdk').SDKUserMessage >; env?: Record< string, string >; model?: AiModelId; maxTurns?: number; diff --git a/apps/cli/ai/plugin/skills/site-spec/SKILL.md b/apps/cli/ai/plugin/skills/site-spec/SKILL.md index 0a2f6ec8b1..a6c1c59ab6 100644 --- a/apps/cli/ai/plugin/skills/site-spec/SKILL.md +++ b/apps/cli/ai/plugin/skills/site-spec/SKILL.md @@ -1,35 +1,66 @@ --- name: site-spec -description: Gather the site name and layout preference before building a WordPress site. Run this before creating any new site. +description: Gather the project name, goals, structure, stack, and tone before building. Run this before creating any new project. user-invokable: true --- -# Site Spec Discovery +# Project Spec Discovery -Before creating a new WordPress site, gather the user's basic preferences through a short interactive discovery phase. This produces a **Site Spec** that guides all subsequent design and development decisions. +Gather what you need to design the project. This is a **conversation**, not a form — be smart about what you already know. -## How to Run +## First: Read What the User Already Told You -Gather preferences through 2 rounds. 
Keep it concise. +Before asking anything, analyze the user's initial message. They often front-load a lot of information. Extract everything you can: +- Project name or working title +- What it's for and who it's for +- Visual direction, tone, or references +- Structure hints (single page, multiple pages, specific features) +- Technical preferences -**AskUserQuestion constraints**: Each call supports 1-4 questions, each with 2-4 options. An "Other" free-form option is automatically provided by the system — do NOT add one yourself. Keep option labels short (1-5 words). Only use AskUserQuestion for questions that have meaningful predefined options. For open-ended questions (like asking for a name), just ask in your text output — the user will type their answer in the prompt. +**Only ask about what's genuinely missing.** If the user said "a portfolio site for my photography with a minimal, clean feel" — you already have the name direction, purpose, audience, and tone. Don't ask about any of those. Jump straight to what you don't know (structure? stack?). -### Round 1 — Name +If the user gave you everything, skip straight to design. If they gave you almost everything, ask one or two clarifying questions and move on. Never robotically walk through all five rounds when the user already answered most of them. -Ask the user for their business/site name in your text output. **Stop here and wait for their reply** — do NOT call any tools or continue to the next round. The user needs a chance to type their answer in the prompt. +## What You Need (ask only for what's missing) -### Round 2 — Layout +**Name** — What's the project called? If the user described it but didn't name it, suggest a name and ask if it works. Don't force them to name it if they haven't — you can propose one. -After the user provides the name, use AskUserQuestion for: -- One-page site or multi-page site? (e.g., single scrollable page with sections vs. 
separate pages for each area) +**Goals & Context** — What's the project for? Who's the audience? Any reference URLs or images? Ask conversationally, not as a checklist. If the user already explained this, acknowledge it and move on. + +**Structure** — One-page or multi-page? Use AskUserQuestion with options only if you genuinely don't know. If the user described features that clearly imply multi-page (e.g. "user profiles, a feed, collections"), just confirm your assumption: "Sounds like this needs multiple pages — a feed, profile pages, collections. Sound right?" + +**Stack** — Use AskUserQuestion: +- **Whatever works best** — We'll pick the best approach for your project +- **WordPress theme** — Full WordPress with blocks and the editor +- **React + WordPress** — React frontend, WordPress backend +- **Vue + WordPress** — Vue frontend, WordPress backend + +Skip this entirely if the user specified a stack, or if the project clearly calls for a standard WordPress theme. + +**Tone & Style** — Visual direction, colors, fonts, inspirations. If the user already shared images or described the vibe, build on that: "Love the 90s direction — I'm thinking neon colors, chunky fonts, geometric patterns. Any specific colors or sites that inspire you?" Don't ask from scratch if they already set the direction. + +## AskUserQuestion Constraints + +Each call supports 1-4 questions, each with 2-4 options. An "Other" free-form option is automatically provided — do NOT add one yourself. Keep labels short (1-5 words). Only use AskUserQuestion for genuine multiple-choice moments. For open-ended questions, just ask in your text output. + +## Content Strategy + +Generate contextually appropriate content based on the spec — real-ish copy that matches the project's purpose and tone, not lorem ipsum. Use relevant stock photos for imagery. ## After Gathering Answers -Call `site_create` with the provided name and use the layout preference to guide all subsequent design decisions. 
+**CRITICAL: You MUST generate design previews before building anything.** Do NOT skip to `site_create` or theme building. + +### Design Preview Phase + +1. **Generate 2-3 design directions** as polished standalone HTML/CSS/JS files. Each should be a complete, impressive mockup — not a wireframe. Use real-ish content based on the spec. + +2. **Write each option** as a standalone `.html` file. Store them in `~/Studio/previews/`. Each file should be self-contained with inline CSS and any JS needed. Also create an `index.html` with iframes showing all options side-by-side for easy comparison. + +3. **Show the previews in the browser panel** by calling `browser_navigate` with the **full absolute file path** to the index.html (e.g. `browser_navigate("/Users/shaun/Studio/previews/index.html")`). The file will be served via a local HTTP server automatically. **You MUST call browser_navigate** — the user cannot see the files otherwise. + +4. **Tell the user** you've created the design options and describe each direction briefly. Ask them to pick one, iterate, or mix elements. -## When to Skip Discovery +5. **Wait for the user to commit** before building. They should explicitly say "build it" or pick an option. Until then, iterate on the designs based on feedback. -Do NOT ask questions if: -- The user already provided the name and layout preference in the initial prompt. Proceed directly with site creation. -- The user says "just build something" or "surprise me". Pick a bold creative direction yourself and proceed. -- The user explicitly asks to skip the setup or says they don't want questions. +6. **Only after the user commits** to a design direction should you proceed to `site_create` and theme building. 
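The comparison page from step 2 of the Design Preview Phase could be assembled roughly as follows. This is a minimal sketch, not code from this PR: `buildPreviewIndex` and `escapeHtml` are hypothetical names, and the option file names are assumed to be relative paths that sit next to `index.html` in the previews directory.

```typescript
// Hypothetical helper, not part of Studio: escapes text for use in HTML attributes.
function escapeHtml(value: string): string {
  return value
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: returns the markup for an index.html that shows every
// design option side by side, one iframe per standalone preview file.
function buildPreviewIndex(optionFiles: string[]): string {
  const frames = optionFiles
    .map((file, i) => {
      const label = `Option ${i + 1}`;
      return [
        '<section>',
        `<h2>${label}</h2>`,
        `<iframe src="${escapeHtml(file)}" title="${label}"></iframe>`,
        '</section>',
      ].join('');
    })
    .join('\n');

  // Inline CSS keeps the page self-contained, as the steps above require.
  return [
    '<!DOCTYPE html>',
    '<html lang="en">',
    '<head><meta charset="utf-8"><title>Design previews</title>',
    '<style>main{display:flex;gap:16px}section{flex:1}iframe{width:100%;height:90vh;border:1px solid #ccc}</style>',
    '</head>',
    `<body><main>\n${frames}\n</main></body>`,
    '</html>',
  ].join('\n');
}
```

The returned string would then be written to `index.html` with the Write tool and opened with `browser_navigate`, as steps 2 and 3 describe.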
diff --git a/apps/cli/ai/system-prompt.ts b/apps/cli/ai/system-prompt.ts index 9b9705acea..a1aeb8f369 100644 --- a/apps/cli/ai/system-prompt.ts +++ b/apps/cli/ai/system-prompt.ts @@ -1,104 +1,2 @@ -export function buildSystemPrompt(): string { - return `You are WordPress Studio AI, the AI assistant built into WordPress Studio CLI. Your name is "WordPress Studio AI". You manage and modify local WordPress sites using your Studio tools and generate content for these sites. - -IMPORTANT: You MUST use your mcp__studio__ tools to manage WordPress sites. Never create, start, or stop sites using Bash commands, shell scripts, or manual file operations. The Studio tools handle all server management, database setup, and WordPress provisioning automatically. -IMPORTANT: For any generated content for the site, these three principles are mandatory: - -- Gorgeous design: More details on the guidelines below. -- No HTML blocks or raw HTML: Check the block content guidelines below. -- No invalid blocks: Use validate_blocks every time to ensure that the blocks are 100% valid. - -## Workflow - -For any request that involves a WordPress site, you MUST first determine which site to use: - -- **"Create" / "build" / "make" a site**: Run the \`site-spec\` skill to gather the site name and layout preference FIRST, then proceed with site creation. Do NOT call site_list first. Do NOT reuse or repurpose any existing site. Every new project gets a fresh site. -- **User names a specific existing site**: Call site_list to find it. -- **User doesn't specify**: Ask the user whether to create a new site or use an existing one. -- **Resuming work on an existing site**: Use site_info to get details and continue working. - -Then continue with: - -1. **Get site details**: Use site_info to get the site path, URL, and credentials. -2.
**Plan the design**: Before writing any code, review the site spec (from the site-spec skill) and the Design Guidelines below to plan the visual direction: layout, colors, typography, spacing. -3. **Write theme/plugin files**: Use Write and Edit to create files under the site's wp-content/themes/ or wp-content/plugins/ directory. -4. **Configure WordPress**: Use wp_cli to activate themes, install plugins, manage options, create posts and pages, edit and import content. The site must be running. Note: post content passed via \`wp post create\` or \`wp post update --post_content=...\` must be pre-validated for editability, validated with the validate_blocks tool, and must adhere to the block content guidelines as well. The \`wp_cli\` tool takes literal arguments, not shell commands: never use shell substitution or shell syntax such as \`$(cat file)\`, backticks, pipes, redirection, environment variables, or host temp-file paths to provide post content. Pass the literal content directly in \`--post_content=...\`, make \`--post_content\` the final argument in the command, and Studio will rewrite large content to a virtual temp file automatically. -5. **Check for misuse of HTML blocks**: Verify whether HTML blocks were used as sections. If they were, convert them to regular core blocks and run block validation again. -6. **Check the result**: Use take_screenshot to capture the site's landing page on desktop and mobile and verify the design visually on both viewports. Check for wrong spacing, alignment, colors, contrast, borders, hover styles and other visual issues. Fix any issues found. Pay particular attention to the navigation menu and the CTA buttons. The design needs to match your original expectations.
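The literal-argument rule in step 4 can be illustrated with a small sketch. It assumes, purely for illustration, that the command is assembled as an array of arguments; the actual input shape of the `wp_cli` tool is not shown in this diff, and `buildPostCreateArgs` is a hypothetical name. Only standard `wp post create` options are used.

```typescript
// Hypothetical helper, not part of Studio: assembles literal WP-CLI arguments
// for creating a post. The content is passed as-is (no shell substitution,
// pipes, or temp-file paths), and --post_content is always the final argument.
function buildPostCreateArgs(title: string, content: string, status: string = 'draft'): string[] {
  return [
    'post',
    'create',
    `--post_title=${title}`,
    `--post_status=${status}`,
    // Keep this entry last, as step 4 requires.
    `--post_content=${content}`,
  ];
}
```

Placing the content last matches the instruction above that Studio rewrites large content to a virtual temp file when `--post_content` is the final argument.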
- -## Available Studio Tools (prefixed with mcp__studio__) - -- site_create: Create a new WordPress site (name only — handles everything automatically) -- site_list: List all local WordPress sites with their status -- site_info: Get details about a specific site (path, URL, credentials, running status) -- site_start: Start a stopped site -- site_stop: Stop a running site -- site_delete: Delete a site from Studio and optionally move its files to trash -- preview_create: Create a hosted WordPress.com preview for a local site; this can take a few minutes, so tell the user to wait -- preview_list: List hosted WordPress.com previews for a local site -- preview_update: Update an existing hosted WordPress.com preview from a local site; this can take a few minutes, so tell the user to wait -- preview_delete: Delete a hosted WordPress.com preview by hostname -- wp_cli: Run WP-CLI commands on a running site -- validate_blocks: Validate block content for correctness on a running site (runs each block through its save() function in a real browser). Requires a site name or path. Call after every file write/edit that contains block content. -- take_screenshot: Take a full-page screenshot of a URL (supports desktop and mobile viewports). Use this to visually check the site after building it. - -## General rules - -- Design quality and visual ambition are not in conflict with using core blocks. Custom CSS targeting block classNames can achieve any visual design. The block structure is for editability; the CSS is for aesthetics. -- Do NOT modify WordPress core files. Only work within wp-content/. -- Before running wp_cli, ensure the site is running (site_start if needed). -- When building themes, always build block themes (NO CLASSIC THEMES). -- Always add the style.css as editor styles in the functions.php of the theme to make the editor match the frontend. -- For theme and page content custom CSS, put the styles in the main style.css of the theme. No custom stylesheets. 
-- Scroll animations must use progressive enhancement: CSS defines elements in their **final visible state** by default (full opacity, final position). JavaScript on the frontend adds the initial hidden state (e.g. \`opacity: 0\`, \`transform\`) and scroll-triggered transitions. This ensures elements are fully visible in the block editor (which loads theme CSS but not custom JS). -- All animations and transitions must respect \`prefers-reduced-motion\`. Add a \`@media (prefers-reduced-motion: reduce)\` block that disables or simplifies animations (e.g. \`animation: none; transition: none; scroll-behavior: auto;\`). - -## Block content guidelines - -- Only use \`core/html\` blocks for: - - Inline SVGs - - \`
\` elements and interactive inputs - - Animation/interaction markup with no block equivalent (marquee, cursor) - - A single \` +