
[Website][Experiment] Share Playground #3170

Draft
adamziel wants to merge 23 commits into trunk from add-share-button-php-relay

Conversation


adamziel commented Jan 22, 2026

Motivation for the change, related issues

You boot a Playground in your browser, click Share, and a friend visiting the link sees and interacts with the same WordPress instance — without you spinning up a server, exposing a port, or installing anything beyond the page they already have. The peer-to-peer plumbing is a PHP relay that long-polls between the two browsers: the host keeps a /poll connection open, the guest's iframe sends requests through /relay/<sid>/request/..., the relay shuttles them to the host, the host processes them through its own in-browser PHP-WASM, and the response comes back the same way.

The relay is just one self-contained PHP file so it can drop into WP.com Atomic, a static host with PHP, or anywhere PHP runs. There is no Node, no database, no bespoke server.

What's in this PR

The relay itself. packages/playground/website/public/relay.php implements the full wire protocol: session creation, host long-poll, guest request tunneling, response delivery, status / heartbeat, and explicit close. Sessions live under a per-system temp directory (never under the public web root). Concurrent dispatch is flock()-protected so two pollers can't double-deliver the same request. Hosts that stop polling are marked dead within ~40s, in-flight guest requests fail fast instead of hanging for 30s, and a guest opening the share link a few ms ahead of the host's first poll waits briefly instead of getting an immediate 503.
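The wire protocol above boils down to a handful of URL shapes. A minimal TypeScript sketch, where the helper names (and the exact placement of the session id in the host endpoints) are assumptions rather than the relay.php contract:

```typescript
// Hypothetical helpers illustrating the endpoint layout described above.
type HostEndpoint = "poll" | "status" | "close";

function hostUrl(base: string, sid: string, endpoint: HostEndpoint): string {
  // Host-side endpoints: long-poll for work, heartbeat, explicit close.
  return `${base.replace(/\/$/, "")}/relay/${sid}/${endpoint}`;
}

function guestRequestUrl(base: string, sid: string, path: string): string {
  // Guest traffic is tunneled under /relay/<sid>/request/<original path>.
  return `${base.replace(/\/$/, "")}/relay/${sid}/request/${path.replace(/^\//, "")}`;
}
```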

A prominent Share button. Previously buried inside Site Manager → Additional actions, it now lives directly in the main toolbar where people can actually see it.

A live collaborator list. Each guest tab generates a stable UUID, sends it on every status heartbeat, and gets a sticky "Guest 1", "Guest 2", … label. The host's modal polls /status every 3s and shows the live count plus labels; guests are pruned ~10s after they stop checking in.
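A minimal sketch of that heartbeat bookkeeping, assuming a roster class (the class and method names are hypothetical; the real tracking lives in relay.php):

```typescript
// Assumed shapes: track guests by UUID, hand out sticky "Guest N" labels in
// arrival order, and prune anyone quiet for longer than ttlMs.
type Guest = { label: string; lastSeen: number };

class GuestRoster {
  private guests = new Map<string, Guest>();
  private nextLabel = 1;

  // Called on every /status heartbeat; returns the guest's sticky label.
  heartbeat(guestId: string, now: number): string {
    const existing = this.guests.get(guestId);
    if (existing) {
      existing.lastSeen = now;
      return existing.label;
    }
    const label = `Guest ${this.nextLabel++}`;
    this.guests.set(guestId, { label, lastSeen: now });
    return label;
  }

  // Drops guests that stopped checking in; returns the surviving labels.
  prune(now: number, ttlMs = 10_000): string[] {
    for (const [id, guest] of this.guests) {
      if (now - guest.lastSeen > ttlMs) this.guests.delete(id);
    }
    return [...this.guests.values()].map((g) => g.label);
  }
}
```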

A host-disconnected overlay on the guest. When the host stops sharing — clicked Stop Sharing, closed the tab, or just walked away — the guest's banner flips from "● Connected" to "● Host disconnected" and a frozen-iframe overlay explains what happened, instead of the iframe sitting on a stale request that times out after 30s.

One relay code path everywhere. Earlier iterations of this PR ran an in-memory TypeScript relay middleware in dev and relay.php only in production, which is exactly the "works in dev, broken in prod" arrangement we want to avoid. Dev now spawns php -S 127.0.0.1:5264 relay.php automatically alongside the vite server (see the dev:relay-php nx target) and proxies /relay/* to it. Same code, same wire protocol, same failure modes — there is no other implementation to drift from.
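The dev-time proxy arrangement could look roughly like this in vite.config.ts; the port matches the PR text, but this is a sketch and the actual config block in the repo is richer:

```typescript
import { defineConfig } from "vite";

// Sketch: forward /relay/* to the php -S process spawned by the
// dev:relay-php target, so dev and prod exercise the same relay.php.
export default defineConfig({
  server: {
    proxy: {
      "/relay": {
        target: "http://127.0.0.1:5264",
        changeOrigin: true,
      },
    },
  },
});
```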

End-to-end tests. packages/playground/website/playwright/e2e/sharing.spec.ts covers opening the modal from both the toolbar and the dropdown, starting and stopping a share, copying the share URL, the single-guest happy path through the relay, the multi-guest collaborator list growing from 0 → 1 → 2 and shrinking back to 1 when a guest closes its tab, and the host-disconnected overlay appearing on the guest after the host stops sharing. Each multi-tab test opens its guests in isolated BrowserContexts and drives the heartbeat by hand from the test (see pingGuestHeartbeat) — same-context tabs in headless Chromium would otherwise compete for active-tab focus and starve each other's setInterval.

Testing Instructions

npm run dev

Then in two browser tabs:

  1. Open http://localhost:5400/website-server/. Wait for WordPress to load. Click Share in the toolbar, then Start Sharing, and copy the link.
  2. Open the share link in a second tab. The guest should connect within a few seconds and render the same WordPress site, with the host's admin bar visible.
  3. Back in the host tab, the modal should now say "1 collaborator connected" and show "Guest 1".
  4. Open the share link in a third tab. Host modal flips to "2 collaborators connected" with both Guest 1 and Guest 2.
  5. Close the third tab. Host modal should drop back to "1 collaborator connected" within ~10s.
  6. Click Stop Sharing on the host. The remaining guest should immediately flip to "● Host disconnected" with a frozen-iframe overlay.

To run the automated suite:

npx nx run playground-website:e2e:playwright -- --project=chromium sharing.spec.ts

Possible follow-ups

  • Direct WebRTC peer-to-peer so the relay only carries the initial handshake.
  • Letting the guest keep using the shared Playground after the host disappears.
  • Read-only mode for guests.
  • Surfacing relay errors in the host modal instead of only on the guest.

adamziel added 10 commits April 7, 2026 12:26
Enables sharing a Playground instance with others through an HTTP long-polling relay. The host browser processes WordPress requests and sends responses back through the relay to guest browsers.

Key components:
- Relay middleware for Vite dev server
- TunnelHost class for host-side request processing
- SharedPlaygroundViewer for guest-side rendering
- Share modal UI with copy-to-clipboard functionality
- URL rewriting for HTML, CSS, and redirect headers

The sharing feature now persists when the modal is closed, allowing
users to share their Playground in the background. A status indicator
in the toolbar shows when sharing is active. Clicking it reopens the
share modal for management.

Also adds a PHP implementation of the relay server for production
deployments where Node.js isn't available. The JavaScript relay
continues to be used for development.

The php-relay-middleware and relay-middleware files use Node.js modules
(fs) and should not be exported from the client-facing barrel file.
This was causing "fs.readFileSync" errors in the browser.

When a host long-poll times out, the middleware tried to remove its
resolver from the pollResolvers array by searching for the request
Promise via indexOf. The array actually holds resolver functions, so
the lookup never matched and stale entries piled up. A later guest
request would then shift() a dead resolver instead of the live one,
silently dropping the request until it hit the 30s timeout and bubbled
up as a 504 Gateway Timeout. Keep a direct reference to the resolver function
and use that for the cleanup so the array stays accurate.
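The shape of the fix, sketched in TypeScript (only pollResolvers and the timeout path come from the commit text; the surrounding middleware is elided):

```typescript
// Hold a direct reference to the resolver function pushed into pollResolvers
// and remove exactly that entry on timeout.
type Resolver = (request: unknown) => void;
const pollResolvers: Resolver[] = [];

function waitForRequest(timeoutMs: number): Promise<unknown> {
  return new Promise((resolve, reject) => {
    const resolver: Resolver = resolve;
    pollResolvers.push(resolver);
    setTimeout(() => {
      // Before the fix this searched for the request Promise, never matched,
      // and left a dead resolver behind. indexOf on the function itself
      // always finds the right entry.
      const i = pollResolvers.indexOf(resolver);
      if (i !== -1) {
        pollResolvers.splice(i, 1);
        reject(new Error("poll timeout"));
      }
    }, timeoutMs);
  });
}
```

A dispatcher delivers a request by shifting a resolver off the array and calling it; the later timeout then finds nothing to remove and stays silent.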
Closing the host tab used to leave the guest staring at a live-looking
shell that would silently time out on the next click, surfacing a raw
"Gateway timeout" JSON blob thirty seconds later. Now the relay tracks
when the host last polled, the host fires a sendBeacon to an explicit
/close endpoint on pagehide, and the guest polls a small /status
endpoint and drops a friendly "Host disconnected" overlay as soon as
the session goes cold.

The Share action used to live three clicks deep inside the Site Manager
site-info panel, where nobody was likely to find it. Promote it to a
primary button in the browser-chrome toolbar so a first-time visitor
sees it right next to Save and the site switcher.

The service worker also starts letting /relay/* traffic pass through to
the network instead of trying to serve it from the cache, so the host
can actually open a sharing session from the same tab.

When a host clicks Share they are mostly flying blind — there is no
way to tell whether anyone actually joined. This turns the share modal
into a live collaborator panel: the relay tracks each guest tab by a
stable UUID it heartbeats on every /status poll, and the host re-polls
that same endpoint to render a pill for every guest (anonymous 'Guest
1', 'Guest 2' labels are plenty for now). Guests that go quiet for
more than ten seconds drop off on their own, so closing a tab visibly
shrinks the list.

Two things broke when the PHP relay became the only relay path in dev
mode. First, the host's TunnelHost kicks off /poll in the background
and returns from startSharing immediately, so a guest opening the
share link a few milliseconds later races the host's first poll, and
the guest's /request/ can land on the relay before hostConnected is
true. With the in-process TS middleware everything was synchronous
and the race was invisible; the file-based PHP relay loses it
routinely. Wait briefly for the host to show up before bailing.

Second, vite's proxy uses changeOrigin and rewrites the Host header
to the relay's own port, so when the host's TunnelHost rewrites
absolute WordPress URLs in the response HTML it misses every one of
them and the iframe loads a broken page. Forward X-Forwarded-Host
through as the Host header in the tunnel request when present.
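A minimal sketch of that header preference, with a hypothetical helper name:

```typescript
// Prefer X-Forwarded-Host whenever the proxy supplies it, so URL rewriting
// sees the browser-facing hostname instead of the relay's own port.
function effectiveHost(
  headers: Record<string, string | undefined>
): string | undefined {
  return headers["x-forwarded-host"] ?? headers["host"];
}
```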

While we're here, parallel playwright workers were starving the PHP
CLI server's pool because each long-poll holds its worker for 25s.
Bump PHP_CLI_SERVER_WORKERS to 20 so 3 simultaneous tests have room
to breathe.

php-relay-middleware.ts was an in-vite bridge that ran relay.php under
a single shared PHP-WASM instance. Now that the dev server proxies
/relay/* straight to a real php -S running in its own process, that
bridge is dead code — and it couldn't have served the long-polling
relay anyway, since one in-flight host poll would block every other
request through the shared instance.

The barrel re-exports of QueuedRequest and TunnelSession went with
it: those are server-side session-state types that nothing on the
client side ever touched.

adamziel force-pushed the add-share-button-php-relay branch from f30170e to c95c53b on April 7, 2026 17:45
adamziel added 13 commits April 7, 2026 20:38
The website's lint job runs with maxWarnings=0 and was failing on
leftover diagnostic console.log calls from when the relay was being
debugged interactively, on parameter properties (which the project
forbids because Node.js type stripping can't handle them), on a
couple of `Function` types in the TunnelHost listener bag, and on a
handful of small things — an `import()` type annotation in the
sharing test, an unused catch binding, a `let` that should have been
a `const`. Routine cleanup, no behavior changes.

The Playwright e2e config that CI uses spins up a static `vite
preview` server with the cors-proxy next to it, but no PHP relay —
so every share test that gets past the modal step (start-sharing,
stop-sharing, copy-to-clipboard, the multi-tab flows) was failing
because /relay/* hit the static preview server and got back HTML
instead of JSON. The CI orchestration script now boots a real
`php -S` for the relay alongside the cors-proxy, the vite preview
block proxies /relay/* through to it, and the relay advertises
share URLs at the right base for whichever context it's running in
(127.0.0.1 in CI, 127.0.0.1:5400/website-server/ in dev).

Two webkit-specific things tripped on the CI matrix that don't show
up locally on chromium.

The clipboard tests were trying to grant clipboard-write, which
doesn't exist in webkit's permission table. The "should start
sharing" test wasn't even using the clipboard, so the grant just
gets dropped there. The "should copy" test legitimately needs
clipboard-read to verify the copied URL, so it now asks for only
clipboard-read on webkit (which is what webkit accepts) and the
React handler's writeText() still works because it runs from a
real user gesture.

The two tests that open a guest tab with `context.newPage()` —
"should allow guest to view host playground" and the host-
disconnected overlay test — were deterministically failing on
webkit because same-context tabs in headless webkit compete for
focus and starve each other's poll loops. The earlier multi-guest
test already worked around this by giving each guest its own
BrowserContext; the same fix applies here.

navigator.clipboard.writeText is the modern path but webkit's
headless mode (and firefox in some configurations) rejects it with
NotAllowedError even after the right permission is granted. The
share modal now falls back to a hidden textarea + execCommand('copy')
when writeText is unavailable, and flips the Copy button to "Copied!"
either way so the click always feels responsive.

The e2e test stops trying to read the OS clipboard everywhere — it
now asserts on the user-visible "Copied!" label, which works in all
three browsers, and only on chromium (where Playwright's clipboard
permission grant is honored end to end) does it additionally read
the clipboard back to confirm the URL was actually placed there.

The relay's session, request and response state used to live as JSON
files under DATA_DIR with flock() handling concurrent access. That
works fine for single-host setups (dev, Atomic, anywhere every PHP
worker shares a disk) but it's not portable to multi-host
deployments where workers can't see each other's filesystems.

Pull all the storage operations behind a small RelayStorage interface
with two interchangeable backends: the existing flock-protected
FileRelayStorage (still the default, so an out-of-the-box checkout
keeps working without any database setup), and a new MysqlRelayStorage
that runs the same operations on InnoDB tables. The atomicity
guarantees are equivalent — flock(LOCK_EX) on the session file maps
to SELECT ... FOR UPDATE inside a short transaction, and the
non-blocking try-lock that prevents two pollers from grabbing the
same request maps to the same FOR UPDATE pattern on a single-row
SELECT against the requests table.

Pick a backend with the PLAYGROUND_RELAY_BACKEND env var. The MySQL
class reads its credentials from the standard WordPress DB_HOST,
DB_USER, DB_PASSWORD, DB_NAME and DB_PORT constants when defined —
so it can drop into a wp-config environment with zero extra wiring —
and falls back to env vars of the same name otherwise. The schema
is created lazily on first connect via CREATE TABLE IF NOT EXISTS.
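The real interface and both backends are PHP; as a language-neutral sketch of the seam's shape (method names are assumptions), an in-memory backend is enough to show the claim-once contract that flock() and SELECT ... FOR UPDATE both enforce:

```typescript
// Assumed shapes, illustrating the storage seam described above.
interface RelayStorage {
  enqueueRequest(sid: string, body: string): void;
  // Returns the next pending request exactly once, or undefined if none.
  claimNextRequest(sid: string): string | undefined;
}

class MemoryRelayStorage implements RelayStorage {
  private queues = new Map<string, string[]>();

  enqueueRequest(sid: string, body: string): void {
    const queue = this.queues.get(sid) ?? [];
    queue.push(body);
    this.queues.set(sid, queue);
  }

  claimNextRequest(sid: string): string | undefined {
    // In-process, shift() is atomic; the file backend needs flock() and the
    // MySQL backend a FOR UPDATE row lock to get the same guarantee.
    return this.queues.get(sid)?.shift();
  }
}
```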
Stand a real MySQL service container up alongside the playwright
runner, install pdo_mysql, wait for the server to come up, and
hand PLAYGROUND_RELAY_BACKEND=mysql plus the DB_* credentials to
the playwright subprocess so the relay's mysql storage class is
the one exercised end-to-end by sharing.spec.ts. The variables go
through on the sudo command line rather than via `sudo -E`
because Ubuntu's default sudoers policy resets the environment.

The file backend keeps its local round-trip smoke test, but every
share test that runs in CI now drives the mysql code path —
session create, withSession's SELECT ... FOR UPDATE, the
claimNextRequest dispatch race, and the cleanup query.

Adds an end-to-end test that does the thing the share feature
is supposed to do: the host edits a post in its in-browser
WordPress while a guest is connected through the relay, then
the guest navigates to that post and sees the new title. The
update goes through window.playgroundSites.getClient().run(),
the same path a real collaborative tool would use to mutate
host state, and the verification is a fresh navigation through
the relay tunnel — so we're catching anything that could break
live propagation, not just initial page delivery.

While editing the relay itself, two small things that didn't
sit right:

The MySQL backend used to fall back to localhost / root / empty
password / "playground_relay" if the credentials weren't set,
which is the kind of "helpful" default that hides
misconfiguration until something silently connects to the wrong
database. It now refuses to start without DB_HOST, DB_USER,
DB_PASSWORD and DB_NAME and tells the operator which one is
missing. DB_PORT still defaults to 3306 because that's the
universal MySQL port and not really a credential.

The session timeout was 30 minutes, which made no sense once
HOST_DEAD_AFTER_MS detected silent hosts in 40 seconds and
guests flipped to the disconnect overlay seconds after that.
Sessions only need to survive long enough for guests to render
the right UI — five minutes is comfortably more than that and
short enough that abandoned sessions stop piling up.

`npm run dev` boots five processes in parallel and the website
server (port 5400) used to start with a fixed `sleep 1` ahead of
it. That's not actually a readiness check — it's a guess — and
when the remote dev server (port 4400) is slow to bind, the very
first request the browser makes hits the website's vite proxy,
which forwards everything that isn't /website-server or /relay
to the remote, gets ECONNREFUSED, and prints a confusing
"http proxy error: /manifest.json" line before everything self-
heals on the next request.

Replace the sleep with a tiny portable port wait that polls
127.0.0.1:4400 until it accepts a TCP connection or times out at
30s. The dev:standalone target only starts once the remote is
genuinely ready, so the first navigation no longer races startup.
Two sources of noise: the PHP built-in server prints a "Development
Server started" banner per worker — twenty-one lines at startup with
PHP_CLI_SERVER_WORKERS=20 — plus an "Accepted"/"Closing" pair on
every request and a periodic "Failed to poll event" warning. And
when the relay is briefly unreachable (mid-restart, killed worker,
whatever), vite logs an "http proxy error" stack trace once per
guest poll — every three seconds, forever, for as long as the
browser tab stays open.

Wrap `php -S relay.php` in a small node helper that filters the
known-noise lines off stderr and forwards everything else, so real
PHP errors and our own error_log() output still surface. Same
wrapper for the dev:relay-php and preview:relay-php targets so CI
benefits too.

For the proxy noise, the relay proxy block in vite.config.ts now
has its own error handler that returns a clean 502 to the client,
and a custom logger filters the matching "http proxy error" line
out of vite's terminal output. Other proxy errors still log
normally.

The guest viewer used setInterval to drive its /status?gid= polling
loop, recomputed the request URL on every render, and depended on
those URLs in two separate useEffects. The combination meant that
every state update — flipping to "connected", an error message,
anything — produced new string references for relayBaseUrl and
statusUrl, both effects tore down and re-ran, fired a brand-new
fetch immediately, and the previous in-flight fetch was only
"logically" cancelled via a closure flag while the network request
kept running in the background. On a fresh share-URL load this
piled up several /status requests that the JS would never wait
for, the guest stayed in "connecting" forever, and only a manual
page refresh broke out of it.

Replace the two effects with a single self-scheduling loop:

- relayBaseUrl and statusUrl are now memoised so their references
  are stable across re-renders and the effect only re-runs when
  sessionId or guestId actually changes.
- A shared AbortController cancels the in-flight fetch the moment
  the component unmounts, instead of leaving it to the network.
- The polling rhythm is "fetch → wait for the response → setTimeout
  the next call" so two /status requests can never overlap.
- The initial /request/ probe and the /status loop share the same
  cancellation, the same controller, and the same sawHostAlive
  state — the previous code reset that flag on every effect re-run,
  which is also what made the host-disconnected detection brittle
  in the first place.
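The self-scheduling rhythm can be sketched like this (the function and callback names are assumptions, not the viewer's actual code):

```typescript
// One fetch at a time: fetch, wait for the response, then setTimeout the
// next call. A shared AbortController stops everything on unmount.
function startStatusLoop(
  statusUrl: string,
  onStatus: (status: unknown) => void,
  intervalMs = 3000
): () => void {
  const controller = new AbortController();
  const tick = async () => {
    try {
      const res = await fetch(statusUrl, { signal: controller.signal });
      onStatus(await res.json());
    } catch {
      if (controller.signal.aborted) return; // unmounted: stop quietly
    }
    if (!controller.signal.aborted) {
      setTimeout(tick, intervalMs); // never overlaps the previous request
    }
  };
  tick();
  return () => controller.abort(); // call from the effect cleanup
}
```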
The host's polling loop was firing handleRequest() concurrently for every
request it claimed from the relay, ignoring the requestQueue/processQueue
machinery sitting right next to it. PHP-Wasm in the host iframe is single-
threaded and not reentrant, so as soon as a guest opened a share URL and
the iframe fanned out a dozen sub-resource fetches, the host deadlocked
and every request 504'd. The visible symptom was a guest stuck on
"Connecting...".

Route the polled request through queueRequest() so handlers run one at a
time, which is what the existing queue was always meant to do.
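A promise-chain queue is one way to get that one-at-a-time behavior; this sketch borrows the queueRequest name from the commit text but none of its surrounding machinery:

```typescript
// Serialize handlers so the single-threaded PHP-WASM instance only ever
// sees one request at a time.
class RequestQueue {
  private tail: Promise<unknown> = Promise.resolve();

  queueRequest<T>(handler: () => Promise<T>): Promise<T> {
    // Run after whatever is currently queued, whether it succeeded or not.
    const result = this.tail.then(handler, handler);
    this.tail = result.catch(() => undefined); // a failure must not stall the queue
    return result;
  }
}
```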
The previous URL rewriter swept the response body with a handful of
regexes that could lose attributes the moment a perfectly legal HTML
construct showed up — a `>` inside a title attribute, an unquoted
src, a comment containing a fake tag, a URL string sitting inside a
<script> body. None of those are exotic; WordPress and its themes
emit them every day. Worse, the regexes happily rewrote URL-shaped
substrings inside JS literals, silently corrupting the script.

Use a real HTML parser instead. The host runs in a browser tab so
DOMParser is always there; the unit test runs under jsdom so the
parser shape is the same in both environments. The new module
classifies every URL through one isRewritableUrl() gate so href,
srcset, inline style, and standalone CSS all answer the same
question the same way: leave anchors, protocol-relative, data:,
javascript:, mailto:, tel:, third-party, and already-relayed URLs
strictly alone.
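A sketch of that single gate, with an illustrative subset of the rules (the PR's actual rule set may differ in detail):

```typescript
// Classify a URL found in href/src/srcset/style/CSS. Everything that is not
// same-origin or path-relative is left strictly alone.
function isRewritableUrl(
  url: string,
  siteOrigin: string,
  relayPrefix = "/relay/"
): boolean {
  const trimmed = url.trim();
  if (trimmed === "" || trimmed.startsWith("#")) return false; // anchors
  if (trimmed.startsWith("//")) return false; // protocol-relative
  if (/^(data|javascript|mailto|tel):/i.test(trimmed)) return false;
  if (trimmed.startsWith(relayPrefix)) return false; // already relayed
  if (/^https?:/i.test(trimmed)) {
    try {
      return new URL(trimmed).origin === siteOrigin; // third-party stays
    } catch {
      return false; // malformed absolute URL: do not touch it
    }
  }
  return true; // relative URL within the shared site
}
```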

The accompanying spec is intentionally adversarial. Every case is
there because at least one obvious regex approach gets it wrong —
keep it that way the next time someone is tempted to "simplify"
this back into a one-liner.

Two follow-ups to the queue fix that the security review on the
queue approach surfaced.

Stop Sharing used to be a soft suggestion. If a guest request was
already mid-flight when the user clicked stop, the in-flight
handleRequest would cheerfully complete its PHP run, build a
response, and POST it to /relay/null/response/... — emitting a
misleading error event from a session the user had already torn
down. Worse, since the host's WordPress is logged in as admin, a
guest write request landing 50 ms before the click could still
mutate the host's filesystem after the user thought they had cut
the connection.

We can't actually cancel a PHP request once it's running in the
worker, but we can refuse to forward its result. Each request now
gets its own AbortController, stopSharing() trips it, every await
checkpoint in handleRequest() and sendResponse() bails on a torn-
down session, and we double-check the session id matches the one
we started with so a fast Stop → Start cycle can't accidentally
deliver an old guest's response into a new session.

The polling loop also used to drain the relay as fast as it could,
appending to an unbounded in-memory queue. A misbehaving guest (or
just a slow PHP run) could grow that queue without limit. Cap it
at 32 entries and pause polling while it's full so the relay's
long-poll keeps the next request waiting on its side instead of
piling bytes into our RAM.
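The guard at each await checkpoint reduces to a small predicate; the names here are assumptions:

```typescript
// A response is only forwarded when sharing has not been aborted and the
// session it was produced for is still the active one, so a fast
// Stop -> Start cycle cannot leak a stale response into the new session.
function shouldDeliver(
  signal: AbortSignal,
  startedSessionId: string,
  currentSessionId: string | null
): boolean {
  return !signal.aborted && currentSessionId === startedSessionId;
}
```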