fea(car): support custom group device #2703

Open
TennyWang1223 wants to merge 6 commits into main from support_custom_group
@TennyWang1223
Contributor

Motivation

ep_group can't run custom allreduce.
Support customizable communication group setup, so the user can choose which devices belong to which group.

Technical Details

Add more if/else branches and assertions, and create a custom_all_reduce interface.
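
As an illustration of the kind of assertions a custom group layout might need, here is a minimal sketch in plain Python. It is not the code from this PR: `validate_custom_groups` is a hypothetical helper, and the rule that a rank belongs to at most one group within a single layout is an assumption made for the example.

```python
from typing import Dict, Sequence


def validate_custom_groups(
    groups: Sequence[Sequence[int]], world_size: int
) -> Dict[int, int]:
    """Check a hypothetical custom group layout; map each rank to its group index."""
    assert world_size > 0, "world_size must be positive"
    rank_to_group: Dict[int, int] = {}
    for group_idx, ranks in enumerate(groups):
        assert len(ranks) > 0, f"group {group_idx} is empty"
        for rank in ranks:
            assert 0 <= rank < world_size, f"rank {rank} is out of range"
            # Assumption: within one layout a rank appears in at most one group.
            assert rank not in rank_to_group, f"rank {rank} is listed twice"
            rank_to_group[rank] = group_idx
    return rank_to_group


# Example: 8 devices split into two non-contiguous groups of 4.
layout = [[0, 1, 4, 5], [2, 3, 6, 7]]
mapping = validate_custom_groups(layout, world_size=8)
```

Failing early with a clear message on a bad layout is usually cheaper than debugging a hung collective later.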

Test Plan

See op_tests/multigpu_tests/test_custom_group_allreduce.py.

Test Result

Ran with AITER_LOG_MORE=1 and confirmed that all launched kernels come from aiter custom_all_reduce.
Accuracy passes and the groups are set correctly.
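
For readers unfamiliar with what a per-group accuracy check looks like, the sketch below computes a CPU reference for a grouped all-reduce in plain Python. It is not taken from the test file above: `reference_group_allreduce` and the sample inputs are hypothetical. A real test would compare device outputs against such a reference within a tolerance.

```python
from typing import Dict, List, Sequence


def reference_group_allreduce(
    per_rank: Dict[int, List[float]], groups: Sequence[Sequence[int]]
) -> Dict[int, List[float]]:
    """Sum inputs element-wise within each group; every member gets its group's sum."""
    out: Dict[int, List[float]] = {}
    for ranks in groups:
        length = len(per_rank[ranks[0]])
        total = [sum(per_rank[r][i] for r in ranks) for i in range(length)]
        for r in ranks:
            out[r] = list(total)  # copy so ranks do not share one list object
    return out


# Four ranks; rank r holds [r, 1.0]. Groups are {0, 2} and {1, 3}.
inputs = {rank: [float(rank), 1.0] for rank in range(4)}
expected = reference_group_allreduce(inputs, [[0, 2], [1, 3]])
```

If the groups were set incorrectly, for example all four ranks reduced together, every rank would hold the same sum and the comparison against this reference would fail.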

Submission Checklist

Signed-off-by: TennyWang1223 <root@hjbog-srdc-24.amd.com>
@TennyWang1223 TennyWang1223 requested review from a team and valarLip April 12, 2026 04:32
@github-actions
Contributor

🏷️ CI Guide

Runs automatically on every PR:

  • ✅ Pre-checks (submodule verification, code formatting)
  • ✅ Aiter op tests (gfx942 + gfx950)
  • ✅ Triton tests (only when aiter/ops/triton/** or related paths are changed)

Extended tests (opt-in via labels):

| Label | Tests |
| --- | --- |
| ci:triton-355 | Run Triton tests on MI355 in addition to MI325 |
| ci:sglang | SGLang integration tests |
| ci:atom | ATOM benchmark (DeepSeek-R1 + GPT-OSS) |
| ci:vllm | vLLM benchmark |
| ci:all | All of the above |

Add labels via the sidebar or gh pr edit 2703 --add-label <label>

TennyWang1223 and others added 5 commits April 12, 2026 04:36
Signed-off-by: TennyWang1223 <root@hjbog-srdc-24.amd.com>
Signed-off-by: TennyWang1223 <root@hjbog-srdc-24.amd.com>
Signed-off-by: TennyWang1223 <root@hjbog-srdc-24.amd.com>
Signed-off-by: TennyWang1223 <root@hjbog-srdc-24.amd.com>