feat: add Model Serving connector and plugin #239

Merged
pkosiec merged 6 commits into main from pkosiec/serving-1-core
Apr 10, 2026

pkosiec commented Apr 3, 2026

Summary

  • Add serving connector layer wrapping the Databricks SDK for endpoint invocation (invoke + SSE stream)
  • Add serving plugin with Express routes for /api/serving/:alias/invoke and /api/serving/:alias/stream
  • Add UPSTREAM_ERROR SSE error code for propagating Databricks API errors
  • Support named endpoint aliases for routing to multiple serving endpoints
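
The alias-based routing in the summary above can be pictured as a lookup from the `:alias` path parameter to a configured endpoint name. The following is only a minimal sketch under assumptions: `EndpointAliases`, `resolveEndpoint`, `servingPath`, and `UnknownAliasError` are hypothetical names for illustration and are not taken from this PR; only the two route shapes come from the description.

```typescript
// Hypothetical alias map: alias used in the URL -> serving endpoint name.
type EndpointAliases = Record<string, string>;

// Hypothetical error type for aliases that are not configured.
class UnknownAliasError extends Error {
  constructor(alias: string) {
    super(`Unknown serving endpoint alias: ${alias}`);
    this.name = "UnknownAliasError";
  }
}

// Resolve the :alias route parameter to an endpoint name, or fail loudly.
function resolveEndpoint(aliases: EndpointAliases, alias: string): string {
  // hasOwnProperty avoids matching inherited keys such as "toString".
  if (!Object.prototype.hasOwnProperty.call(aliases, alias)) {
    throw new UnknownAliasError(alias);
  }
  return aliases[alias];
}

// Build the client-facing path for the two routes named in the summary.
function servingPath(alias: string, action: "invoke" | "stream"): string {
  return `/api/serving/${encodeURIComponent(alias)}/${action}`;
}
```

Rejecting unknown aliases before any upstream call keeps the proxy from forwarding arbitrary endpoint names supplied by the client.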

Demo

model-serving-demo-compressed.mp4

PR Stack — Model Serving

| # | PR      | Description                            |
|---|---------|----------------------------------------|
| 1 | this PR | Serving connector & plugin             |
| 2 | #240    | Type generator, Vite plugin & UI hooks |
| 3 | #241    | Dev-playground, template & docs        |

pkosiec force-pushed the pkosiec/serving-1-core branch from 218f9b5 to cab05df on April 9, 2026 14:14
pkosiec added 5 commits on April 9, 2026 19:38

Add the core Model Serving plugin that provides an authenticated proxy
to Databricks Model Serving endpoints. Includes the connector layer
(SDK client wrapper) and the plugin layer (Express routes for
invoke/stream). Also adds UPSTREAM_ERROR SSE error code for propagating
API errors.

Signed-off-by: Pawel Kosiec <pawel.kosiec@databricks.com>

The serving plugin was not forwarding the abort signal to the serving
connector, unlike the genie plugin. Without the signal, the connector's
fetch request cannot be cancelled and the abort-check loop never triggers.

Signed-off-by: Pawel Kosiec <pawel.kosiec@databricks.com>
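
The fix described in the commit above amounts to passing the request's `AbortSignal` all the way down to the outgoing call. Here is a dependency-free sketch of that idea. All names (`FetchLike`, `queryEndpoint`, `createRequestScope`, `hangingFetch`) are hypothetical and do not reflect the plugin's actual code.

```typescript
// Minimal fetch-like signature so the sketch stays dependency-free.
type FetchLike = (url: string, init: { signal?: AbortSignal }) => Promise<string>;

// Hypothetical connector call: the important part is that `signal` is passed through.
async function queryEndpoint(
  fetchImpl: FetchLike,
  url: string,
  signal?: AbortSignal,
): Promise<string> {
  return fetchImpl(url, { signal });
}

// Hypothetical route-level helper: ties one AbortController to the client connection.
// `onClientClose` stands in for something like a "close" event subscription.
function createRequestScope(onClientClose: (listener: () => void) => void): AbortSignal {
  const controller = new AbortController();
  onClientClose(() => controller.abort());
  return controller.signal;
}

// Fake upstream used only for demonstration: it never resolves and rejects on abort.
const hangingFetch: FetchLike = (_url, { signal }) =>
  new Promise<string>((_resolve, reject) => {
    if (signal?.aborted) {
      reject(new Error("aborted"));
      return;
    }
    signal?.addEventListener("abort", () => reject(new Error("aborted")), { once: true });
  });
```

If `queryEndpoint` dropped the `signal` argument, the call to `hangingFetch` would stay pending after the client disconnects, which is the behaviour the commit message describes.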
- Use SDK servingEndpoints.query() for invoke instead of raw fetch
- Use SDK apiClient.request({ raw: true }) for streaming SSE
- Fix exports() to support asUser via files plugin pattern
- Rename DATABRICKS_SERVING_ENDPOINT to DATABRICKS_SERVING_ENDPOINT_NAME
- Throw error on SSE buffer overflow instead of silent discard
- Add OBO rationale comment in injectRoutes
- Add SSE spec comments for empty line handling
- Add ServingEndpointHandle type with asUser support

Signed-off-by: Pawel Kosiec <pawel.kosiec@databricks.com>
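
Two items in the list above, throwing on SSE buffer overflow and treating an empty line as the event boundary, can be illustrated with a small incremental parser. This is a simplified sketch, not the connector's code: `SseLineBuffer`, `SseBufferOverflowError`, and the `MAX_BUFFER_CHARS` limit are assumptions, and only `data:` fields are handled.

```typescript
// Assumed limit for illustration; the PR does not state the actual buffer size.
const MAX_BUFFER_CHARS = 1024;

// Hypothetical error type raised instead of silently discarding data.
class SseBufferOverflowError extends Error {
  constructor(size: number) {
    super(`SSE buffer exceeded ${MAX_BUFFER_CHARS} characters (got ${size})`);
    this.name = "SseBufferOverflowError";
  }
}

class SseLineBuffer {
  private buffer = "";
  private dataLines: string[] = [];

  // Feed a decoded text chunk; returns the data payloads of events completed by it.
  push(chunk: string): string[] {
    this.buffer += chunk;
    const events: string[] = [];
    let newline = this.buffer.indexOf("\n");
    while (newline !== -1) {
      // Tolerate CRLF line endings by dropping a trailing "\r".
      const line = this.buffer.slice(0, newline).replace(/\r$/, "");
      this.buffer = this.buffer.slice(newline + 1);
      if (line === "") {
        // In the SSE format, an empty line dispatches the pending event.
        if (this.dataLines.length > 0) {
          events.push(this.dataLines.join("\n"));
          this.dataLines = [];
        }
      } else if (line.startsWith("data:")) {
        // A single leading space after the colon is not part of the value.
        this.dataLines.push(line.slice(5).replace(/^ /, ""));
      }
      // Other fields (event, id, retry) and comments are ignored in this sketch.
      newline = this.buffer.indexOf("\n");
    }
    // Whatever remains is an incomplete line; refuse to let it grow without bound.
    if (this.buffer.length > MAX_BUFFER_CHARS) {
      throw new SseBufferOverflowError(this.buffer.length);
    }
    return events;
  }
}
```

Throwing here surfaces a malformed or oversized upstream response to the caller, whereas silently discarding the buffer would produce truncated events with no indication of failure.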
The SDK's readableToWeb() closes the controller on "end" event.
Calling reader.cancel() unconditionally in the finally block causes
a "Controller is already closed" error on subsequent requests.
Only cancel when the signal was actually aborted (early termination).

Signed-off-by: Pawel Kosiec <pawel.kosiec@databricks.com>
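
The conditional cancel described in the commit above can be sketched as a read loop whose `finally` block only cancels on early termination. This sketch uses a standard web `ReadableStream` and a hypothetical `drain` helper. It does not reproduce the SDK's `readableToWeb()` behaviour or the "Controller is already closed" error, and only shows the control flow.

```typescript
// Hypothetical helper: reads a stream to completion or until the signal aborts.
async function drain(
  stream: ReadableStream<Uint8Array>,
  signal: AbortSignal,
  onChunk: (chunk: Uint8Array) => void,
): Promise<void> {
  const reader = stream.getReader();
  try {
    // Abort-check loop: stop reading as soon as the caller aborts.
    while (!signal.aborted) {
      const result = await reader.read();
      if (result.done) break;
      onChunk(result.value);
    }
  } finally {
    if (signal.aborted) {
      // Early termination only: ask the source to stop producing.
      await reader.cancel().catch(() => {});
    }
    // On normal completion the stream is already closed, so no cancel is issued.
    reader.releaseLock();
  }
}
```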
…ndle

Signed-off-by: Pawel Kosiec <pawel.kosiec@databricks.com>

pkosiec force-pushed the pkosiec/serving-1-core branch from 2080612 to 1996ad6 on April 9, 2026 17:40
pkosiec force-pushed the pkosiec/serving-1-core branch from 22df022 to ac93525 on April 10, 2026 08:22

Eliminate double parse/serialize cycle in the serving streaming path.
Previously, the connector parsed upstream SSE into JS objects, then
StreamManager re-serialized them back to SSE for the browser. Since
none of the StreamManager features (reconnection replay, event
buffering, multi-client broadcast) are meaningful for serving streams,
pipe the raw ReadableStream directly to the Express response instead.

- Connector stream() now returns raw ReadableStream<Uint8Array>
- Plugin _handleStream pipes bytes via node:stream/promises pipeline
- Remove parsed SSE generator, ServingStreamOptions, mapUpstreamError
- Remove stream from ServingEndpointMethods (no programmatic consumers)
- Remove servingStreamDefaults (StreamManager no longer used)

Signed-off-by: Pawel Kosiec <pawel.kosiec@databricks.com>
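
The raw pass-through described in the commit above could look roughly like the following, assuming a Node.js runtime where `Readable.fromWeb` is available. `pipeRawStream` is a hypothetical name; the actual `_handleStream` implementation, its headers, and its error handling are not shown in this PR text.

```typescript
import { Readable, Writable } from "node:stream";
import { pipeline } from "node:stream/promises";
import { ReadableStream } from "node:stream/web";

// Hypothetical helper: forwards upstream SSE bytes to a writable response unchanged.
async function pipeRawStream(
  upstream: ReadableStream<Uint8Array>,
  res: Writable,
  signal?: AbortSignal,
): Promise<void> {
  // Readable.fromWeb adapts a web ReadableStream to a Node.js Readable.
  const source = Readable.fromWeb(upstream);
  // pipeline handles backpressure and tears down both streams on error or abort.
  await pipeline(source, res, { signal });
}
```

Because the bytes are never decoded, the upstream SSE framing reaches the browser exactly as the endpoint produced it, which is what removes the parse and re-serialize step.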
pkosiec force-pushed the pkosiec/serving-1-core branch from ac93525 to 99bbfad on April 10, 2026 08:25
pkosiec enabled auto-merge (squash) on April 10, 2026 13:23
pkosiec merged commit 9dc35f1 into main on Apr 10, 2026
7 checks passed
pkosiec deleted the pkosiec/serving-1-core branch on April 10, 2026 13:23