Cherry-pick: PR #39925: [ROCm] Remove deprecated hipCtx_t and HIP context APIs #794

Open
magaonka-amd wants to merge 1 commit into ROCm:rocm-jaxlib-v0.8.2 from magaonka-amd:fix/remove-hipctx-v082

@magaonka-amd

Imported from GitHub PR openxla#39925

Summary

Remove deprecated hipCtx_t and HIP context APIs from the ROCm backend, replacing them with hipSetDevice/hipGetDevice as recommended by AMD since ROCm 1.9.

  • Simplify RocmContext to a header-only class with inline methods
  • Make RocmContext a value field of RocmExecutor (no heap allocation)
  • Remove 7 deprecated HIP context API calls
  • Remove 6 deprecated API wrappers from rocm_driver_wrapper.h
  • Delete rocm_context.cc

Motivation

AMD has deprecated all hipCtx* and hipDevicePrimaryCtx* APIs since ROCm 1.9 with the warning "might not be supported in future releases." On ROCm, hipCtx_t is a thin wrapper around a device ordinal -- the entire context lifecycle is a no-op. A future ROCm release could remove these APIs and break the build.

Test plan

  • Tested GPU device discovery (8x MI300X)
  • Tested matmul, memory allocation, multi-GPU peer transfer, device synchronization, rapid context switching across 8 GPUs

Copybara import of the project:

--
d32fefc by Pham Binh <phambinh@amd.com>:

[ROCm] Remove deprecated hipCtx_t and HIP context APIs

Replace all deprecated hipCtx_t-based context management with hipSetDevice/hipGetDevice as recommended by AMD since ROCm 1.9.

On ROCm, hipCtx_t is a thin wrapper around a device ordinal and the entire context lifecycle (Retain/Release/SetCurrent/GetCurrent) is a no-op. AMD has deprecated these APIs with the warning "might not be supported in future releases."
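
As a rough illustration of the replacement pattern described above, the sketch below activates a device by ordinal and restores the previous one on scope exit. It is not code from this PR. The `fake_hip` namespace is a stand-in so the example compiles without ROCm; it mirrors the shape of the real `hipSetDevice(int)` and `hipGetDevice(int*)` calls, with `0` playing the role of `hipSuccess`. The names `ScopedDevice`, `CurrentDevice`, and `kDeviceCount` are hypothetical.

```cpp
#include <cassert>

// Stand-ins for the HIP runtime (assumption: real code would call
// hipSetDevice / hipGetDevice and compare against hipSuccess).
namespace fake_hip {
constexpr int kDeviceCount = 8;  // hypothetical device count

// The current device is tracked per thread, as in the HIP runtime.
inline int& CurrentDeviceSlot() {
  static thread_local int device = 0;
  return device;
}

inline int SetDevice(int ordinal) {
  if (ordinal < 0 || ordinal >= kDeviceCount) return 1;  // invalid device
  CurrentDeviceSlot() = ordinal;
  return 0;
}

inline int GetDevice(int* ordinal) {
  if (ordinal == nullptr) return 1;  // invalid argument
  *ordinal = CurrentDeviceSlot();
  return 0;
}
}  // namespace fake_hip

// Hypothetical RAII guard: makes `ordinal` the current device for the
// lifetime of the object, then restores the previously current device.
class ScopedDevice {
 public:
  explicit ScopedDevice(int ordinal) {
    int status = fake_hip::GetDevice(&previous_);
    assert(status == 0);
    (void)status;  // silence unused warning when NDEBUG is defined
    ok_ = (fake_hip::SetDevice(ordinal) == 0);
  }
  ~ScopedDevice() { fake_hip::SetDevice(previous_); }

  ScopedDevice(const ScopedDevice&) = delete;
  ScopedDevice& operator=(const ScopedDevice&) = delete;

  // True if the requested device was successfully made current.
  bool ok() const { return ok_; }

 private:
  int previous_ = 0;
  bool ok_ = false;
};

// Convenience helper: returns the current device ordinal, or -1 on error.
inline int CurrentDevice() {
  int device = -1;
  if (fake_hip::GetDevice(&device) != 0) return -1;
  return device;
}
```

With a guard like this, no context handle needs to be retained or released; the device ordinal alone identifies the target.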

Changes:

  • rocm_context.h: Simplify RocmContext to a trivial class holding only device_ordinal_. Remove Create(), context map, mutex, GetDeviceMemoryUsage, GetDeviceTotalMemory.
  • rocm_context.cc: Deleted. Method implementations (SetActive, IsActive, Synchronize) moved into rocm_executor.cc.
  • rocm_executor.h: Change RocmContext* to a value field initialized in the constructor. No heap allocation or factory needed.
  • rocm_executor.cc: Update all usages from pointer to address-of. Inline DeviceMemoryUsage and GetDeviceTotalMemory. Simplify DeviceFromContext() to use context->device_ordinal().
  • rocm_driver_wrapper.h: Remove 6 deprecated API wrappers (hipCtxGetDevice, hipCtxSetCurrent, hipDevicePrimaryCtxGetState, hipDevicePrimaryCtxSetFlags, hipDevicePrimaryCtxRetain, hipDevicePrimaryCtxRelease).
  • BUILD: Remove rocm_context.cc from srcs, simplify deps.
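
The shape of the rocm_context.h and rocm_executor.h changes listed above might look roughly like the following. This is a simplified sketch, not the actual XLA code: the `Sketch`-suffixed class names are hypothetical, the real `DeviceFromContext()` may return a different type, and the real classes carry many more members.

```cpp
#include <cassert>

// Hypothetical stand-in for the simplified context: it holds only a
// device ordinal, with no context map, mutex, or factory function.
class RocmContextSketch {
 public:
  explicit RocmContextSketch(int device_ordinal)
      : device_ordinal_(device_ordinal) {}

  int device_ordinal() const { return device_ordinal_; }

 private:
  const int device_ordinal_;
};

// Mirrors the description "DeviceFromContext() uses
// context->device_ordinal()"; the int return type is an assumption.
inline int DeviceFromContext(const RocmContextSketch* context) {
  assert(context != nullptr);
  return context->device_ordinal();
}

// Hypothetical stand-in for the executor: the context is a value field
// initialized in the constructor, so no heap allocation is involved.
class RocmExecutorSketch {
 public:
  explicit RocmExecutorSketch(int device_ordinal)
      : context_(device_ordinal) {}

  // Call sites that previously passed a stored pointer now take the
  // address of the value field.
  int device() const { return DeviceFromContext(&context_); }
  const RocmContextSketch* context() const { return &context_; }

 private:
  RocmContextSketch context_;
};
```

Because the context lives inside the executor, its lifetime matches the executor's and there is nothing to release separately.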

Merging this change closes openxla#39925

COPYBARA_INTEGRATE_REVIEW=openxla#39925 from ROCm:phambinh/remove-deprecated-hip-ctx-apis d32fefc
PiperOrigin-RevId: 895760864
