
nvproxy: reject opaque GSP legacy and NV2081_BINAPI control forwarding #12921

Open
ibondarenko1 wants to merge 1 commit into google:master from ibondarenko1:security/nvproxy-block-opaque-gsp-binapi

Conversation

@ibondarenko1

Previously, RM control commands with RM_GSS_LEGACY_MASK (bit 15) set, or issued against the NV2081_BINAPI class, were forwarded to the host NVIDIA driver via rmControlSimple(), which copies up to 1 MB of guest-controlled opaque bytes without content validation.

While the typed handler map (controlCmd) validates parameters for all 183 known control commands, these two paths bypassed validation entirely. This is inconsistent with gVisor's defense-in-depth approach, where tpuproxy (TPU/VFIO passthrough) validates ALL ioctl parameters with typed handlers and rejects unknown commands.

This change rejects GSP legacy and NV2081_BINAPI controls with NV_ERR_NOT_SUPPORTED instead of forwarding opaque bytes. Standard CUDA/ML workloads should not be affected, as these are deprecated/undocumented interfaces. If legitimate use cases are identified, specific commands can be allowlisted with typed parameter validation.

Security impact: reduces the attack surface exposed to sandboxed workloads by preventing arbitrary opaque data from reaching the host NVIDIA kernel driver's GSP handler.
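
The dispatch behavior described above can be sketched roughly as follows. This is a minimal standalone illustration, not the nvproxy code: the `dispatch` function, the `handler` type, the example command number, and the choice to return a status code directly are all hypothetical, and the numeric constants are assumptions derived from the names and the "bit 15" note in the description rather than values confirmed by this PR.

```go
package main

import "fmt"

// All values here are assumptions for illustration only.
const (
	rmGSSLegacyMask   uint32 = 1 << 15 // "bit 15", per the PR description
	nv2081Binapi      uint32 = 0x2081  // class number suggested by the name NV2081_BINAPI
	nvOK              uint32 = 0x0
	nvErrNotSupported uint32 = 0x56 // assumed numeric value of NV_ERR_NOT_SUPPORTED
)

// handler is a hypothetical typed handler: it receives the raw parameter
// bytes for one known command and is responsible for validating them.
type handler func(params []byte) uint32

// dispatch sketches the post-change control path: opaque GSP legacy and
// NV2081_BINAPI controls are rejected up front, known commands go to their
// typed handler, and anything else is rejected as well.
func dispatch(class, cmd uint32, params []byte, typed map[uint32]handler) uint32 {
	if cmd&rmGSSLegacyMask != 0 || class == nv2081Binapi {
		// Previously this case forwarded params to the host driver unvalidated.
		return nvErrNotSupported
	}
	if h, ok := typed[cmd]; ok {
		return h(params)
	}
	return nvErrNotSupported
}

func main() {
	// 0x00800102 is a made-up command number with bit 15 clear.
	typed := map[uint32]handler{
		0x00800102: func(params []byte) uint32 { return nvOK },
	}
	fmt.Printf("typed command:   %#x\n", dispatch(0x80, 0x00800102, nil, typed))
	fmt.Printf("GSP legacy cmd:  %#x\n", dispatch(0x80, 0x00808102, nil, typed))
	fmt.Printf("BINAPI class:    %#x\n", dispatch(nv2081Binapi, 0x00800102, nil, typed))
}
```

The point of the sketch is ordering: the opaque-path check runs before any handler lookup, so no guest-controlled bytes reach the host driver for those two cases.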

@google-cla

google-cla bot commented Apr 10, 2026

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@ayushr2
Collaborator

ayushr2 commented Apr 11, 2026

Hi @ibondarenko1. Thanks for the patch. In general, I agree with the motivation and direction of this patch, and it will definitely reduce the driver surface we expose to the application. We aim to expose only as much as is needed to get most CUDA workloads running.

> Standard CUDA/ML workloads should not be affected, as these are deprecated/undocumented interfaces.

I hope this is true. I have not tested this explicitly, but at least in the 525 driver, NV2081_BINAPI commands were still being used. Now most users are on 580+ in GKE. So the user-mode drivers might have been updated by now.

> If legitimate use cases are identified, specific commands can be allowlisted with typed parameter validation.

Per the comment, these control commands are undocumented so we can't do the typed parameter validation.

@nixprime what do you think?
