merge amd-main into amd-staging by z1-cciauto · Pull Request #2171 · ROCm/llvm-project

z1-cciauto · 2026-04-13T15:06:29Z

No description provided.

…0409183835

…0409202645

…0410122621

…lvm#191395) AArch64::getSVEPseudoMap calls are visible in compile-time profiles even on non-SVE targets. I think CodeGenMapTable could be improved: it currently emits a constexpr array sorted by opcode and a hand-rolled binary search over that array. However, the AArch64ExpandPseudoInsts pass is missing a simple check for pseudo instructions before expanding. Adding that check avoids the compile-time cost. https://llvm-compile-time-tracker.com/compare.php?from=0d42811ea4658b3e86a3801b3bc848324f8540f8&to=9e2434de84577ca1c5e6de8fe8d75c6b8e282b3f&stat=instructions%3Au
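The idea of guarding a sorted-table lookup with a cheap pseudo-instruction check can be sketched as follows. This is a minimal standalone illustration, not the actual LLVM code: `MapEntry`, `InstrDesc`, `kPseudoMap`, `lookupPseudoMap`, `expand`, and the opcode values are all hypothetical stand-ins for the generated map and the pass.

```cpp
#include <algorithm>
#include <array>
#include <cstdint>

// Hypothetical table entry; the real generated table has a different layout.
struct MapEntry {
  std::uint16_t pseudo_opcode;
  std::uint16_t real_opcode;
};

// Toy table sorted by pseudo opcode, standing in for the generated map.
constexpr std::array<MapEntry, 3> kPseudoMap = {
    {{100, 10}, {105, 11}, {230, 12}}};

// Hypothetical instruction descriptor carrying an "is pseudo" bit.
struct InstrDesc {
  std::uint16_t opcode;
  bool is_pseudo;
};

// Binary search over the sorted table; returns -1 when no mapping exists.
int lookupPseudoMap(std::uint16_t opcode) {
  auto it = std::lower_bound(
      kPseudoMap.begin(), kPseudoMap.end(), opcode,
      [](const MapEntry &e, std::uint16_t op) { return e.pseudo_opcode < op; });
  if (it == kPseudoMap.end() || it->pseudo_opcode != opcode)
    return -1;
  return it->real_opcode;
}

// Counts how often the table search actually runs.
int g_lookups = 0;

// Returns the mapped opcode, or -1 if the instruction needs no expansion.
int expand(const InstrDesc &desc) {
  if (!desc.is_pseudo) // cheap early exit: non-pseudos never have a mapping
    return -1;
  ++g_lookups;
  return lookupPseudoMap(desc.opcode);
}
```

In this sketch, ordinary instructions return before the search, so only pseudo instructions pay for the binary search.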

…lvm#191119) Instead of bailing out if the original divisor exceeds HBitWidth, allow divisors that fit in HBitWidth after removing trailing zeros. PartialRem now needs a low and high part. Shifting RemL left now needs to handle shifting into RemH. Assisted-by: Claude Sonnet 4.5
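The relaxed divisor condition can be illustrated with a small standalone sketch. It assumes a 64-bit value and a caller-supplied half width, and all function names are hypothetical. It shows only the "fits after removing trailing zeros" test and the standard quotient identity `n / (d << k) == (n >> k) / d` for unsigned values; it does not reproduce the patch's low/high remainder handling.

```cpp
#include <cstdint>

// Number of trailing zero bits in a non-zero value (portable loop).
unsigned countTrailingZeros(std::uint64_t v) {
  unsigned n = 0;
  while ((v & 1u) == 0) {
    v >>= 1;
    ++n;
  }
  return n;
}

// True when the divisor, with its trailing zeros shifted out, fits in
// half_bits bits. A zero divisor or an out-of-range width is rejected.
bool fitsAfterStrippingZeros(std::uint64_t divisor, unsigned half_bits) {
  if (divisor == 0 || half_bits == 0 || half_bits >= 64)
    return false;
  std::uint64_t stripped = divisor >> countTrailingZeros(divisor);
  return (stripped >> half_bits) == 0;
}

// Unsigned quotient computed by first shifting out the divisor's trailing
// zeros: floor(n / (d * 2^k)) == floor(floor(n / 2^k) / d).
// Precondition: divisor is non-zero.
std::uint64_t divideViaShift(std::uint64_t n, std::uint64_t divisor) {
  unsigned k = countTrailingZeros(divisor);
  return (n >> k) / (divisor >> k);
}
```

For example, `3 << 40` does not fit in 32 bits, but its stripped value `3` does, so a check of this shape would accept it.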

) Fixes llvm#191292

…ine-packing) (llvm#188192) A transform pass to lower a flat-layout `vector.contract` operation to (a) amx.tile_mulf for BF16, or (b) amx.tile_muli for Int8 packed types via `online` packing. TODO: a follow-up patch is planned to refactor this pass and retire the `convert-vector-to-amx` pass.

Part of the work to remove trivial VP intrinsics from the RISC-V backend, see https://discourse.llvm.org/t/rfc-remove-codegen-support-for-trivial-vp-intrinsics-in-the-risc-v-backend/87999 This splits off 4 intrinsics from llvm#179622.

llvm#190397) Implement Loop-splitting #pragma omp split construct with counts clause. Posting this PR after the revert of PR ([llvm#183261](llvm#183261))
Changes:
1. Added `openmp/runtime/test/transform/split/lit.local.cfg`
2. Enforced ICE for `counts` clause items in `SemaOpenMP.cpp` (minor change)
3. Updated tests `split_messages.cpp`, `split_omp_fill.cpp`, `split_diag_errors.c`.
4. Removed `nonconstant_count.cpp`

Similar to fp16 ldexp, we cannot create illegal types for bf16 during lowering so should promote.

…89319) Narrow the new setImplicitDefaultValue() guard so existing default bindings are preserved only for aggregate-like cases. The previous change was too broad and regressed normal zero-initialization, causing new int[10]{} to be modeled as undefined and emit a garbage-value warning instead of the expected analyzer reports.

This patch encapsulates the check for whether a `SUnit` is clustered, rather than letting it scatter across call sites. Currently there is only a single user, but more users can show up, and I think it provides a cleaner API even for that single user.

…k. NFC (llvm#191767)

…#191300) Updates the matmul verifier to check input and output shapes are valid. Also adds some tests for verifier failures which were previously not covered.

…lvm#191768) hasScalarTail currently returns incorrect results when queried after runtime checks have been added. Generalize and harden by checking if the middle block is a predecessor of the scalar preheader.

…integer type (llvm#191696) Fix llvm#191337

llvm#191766) The direct base of those pointers is not a union, i.e. `getRecord()` returns `nullptr`.

This patch removes the `-aarch64-new-sme-abi=<true/false>` option (which has been defaulted to "true" since LLVM 22), and removes the Selection DAG lowering for the SME ABI. There should be no functional changes for the default path (`-aarch64-new-sme-abi=true`).

This PR intends to add the log2p1f16 function to shared math, along with adding tests for it and Bazel which was missed in [f0ce26d](llvm@f0ce26d).

This PR intends to add the log10p1f16 function to shared math, along with adding tests for it and Bazel which was missed in [a7d1a87](llvm@a7d1a87).

ConceptDecl doesn't have an associated template declaration, and it doesn't introduce a type either. Fixes llvm#188914

…#191771)

…m#191703) Fixes: llvm#191688

Construct the OptimizationRemarkEmitter in AArch64StackTagging on demand if it is not available and is required; this avoids computing several analyses in all pipelines.

llvm#183310) … features and test suite This involved:
- Implementing the `__sanitizer_print_stack_trace` interface
- Adding the common signal handlers
- Correctly setting the tool name
- Caching the binary name before running

… failure (llvm#188973) DimOfReifyRankedShapedTypeOpInterface::matchAndRewrite called reifyDimOfResult via the PatternRewriter. Some implementations delegate to the coarse-grained reifyResultShapes, which creates ops for ALL dimensions (e.g. a tensor.dim) before discovering that a specific dimension is not reifiable (signalled by an empty OpFoldResult). The pattern then returned failure() once it saw the empty OpFoldResult, but the newly created ops were already in the IR. Under MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS this triggered "pattern returned failure but IR did change". Fix: record the op immediately before the matched dim op, so we can identify ops inserted during the reification attempt. If reification returns an empty (unreifiable) OpFoldResult, erase those newly created ops before returning failure, restoring the IR to its original state. Assisted-by: Claude Code

…m#188968) Two bugs were introduced/revealed by MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS:
1. `ACCLoopTilingImpl::matchAndRewrite` returned `success()` for loops with no tile values, triggering "pattern returned success but IR did not change". Fixed by returning `failure()` instead.
2. `moveOpsAndReplaceIVs` moved ops between blocks via `splice()` and updated operands via `replaceAllUsesInRegionWith()` without notifying the rewriter. This caused "operation fingerprint changed" errors since the moved ops' parent op and operands changed without `startOpModification`/`finalizeOpModification`. Fixed by wrapping all moved ops (and their nested ops) with rewriter modification notifications.
Assisted-by: Claude Code

This helps to not lower, improving the number of nodes that we expand into and improving the quality of the generated code. Originally a part of llvm#177158 by Ryan Cowan

…#191778)

…vm#189554) Replace `dyn_cast<IntrinsicInst> + getIntrinsicID()` chains with PatternMatch combinators where applicable

…eger sampled type (llvm#190742) Fix spirv-val failure:
```
error: line 20: Capability Int64ImageEXT is required when using Sampled Type of 64-bit int
  %spirv_Image = OpTypeImage %ulong 3D 0 0 0 0 Unknown ReadOnly
```
Detect when OpTypeImage uses a 64-bit integer as its sampled type and automatically add the Int64ImageEXT capability and SPV_EXT_shader_image_int64 extension.
Related to llvm#190736

The documentation incorrectly stated that this metadata enables/disables vectorization, but it actually controls predication. Also clarify that enabling predication implicitly enables vectorization.

Use `{{.*}}` instead of `i64` for `memset` size type, as 32-bit platforms use `i32`.

…rations (llvm#188743) UpdateVCEPass only queried capabilities via QueryCapabilityInterface and SPIRV type capabilities, but did not check capabilities implied by decoration attributes on ops. Specifically, the DescriptorSet and Binding decorations—represented by the `binding` and `descriptor_set` attributes on `spirv.GlobalVariable`—require the `Shader` capability per the SPIR-V spec Decoration table, but this was not deduced. Add an explicit check in UpdateVCEPass: when a `spirv.GlobalVariable` has a `binding` or `descriptor_set` attribute, require the `Shader` capability. Part of llvm#168357 Assisted-by: Claude Code

Fixes unused-but-set globals on non-Unix paths in kmp_alloc.cpp

… unit-test compiles (llvm#191564)
1. `check_cxx_compiler_flag` marked `COMPILER_RT_HAS_EXTERNAL_FLAG` true for clang-cl, so `/experimental:external` and `/external:anglebrackets` were passed and clang-cl warned they were unused. Now only probe with real MSVC (not Clang); set the flag false otherwise; rely on that in asan/interception/ubsan.
2. Custom compile lines for unit tests didn’t get C++17, so headers hit `-Wc++17-extensions` (e.g. `constexpr if` / message-less `static_assert` in FuzzedDataProvider / asan_fake_stack). Now append `-std=c++${CMAKE_CXX_STANDARD}` or `-std=c++17` for C++ sources in `clang_compile()` for both standalone and non-standalone builds.

Previously clang-cl was generating "warning: comparison of integers of different signs".

Improve error handling with the following changes:
- If `kvm_open2()` fails in `DoLoadCore()`, log an error with the message obtained from the function.
- Return false or continue the loop if a memory read fails.
- Replace `!error.Success()` with `error.Fail()` for readability (NFC).
Signed-off-by: Minsoo Choo <minsoochoo0122@proton.me>

…-type te…" (llvm#191798) Reverts llvm#189510 Crashes lldb on certain types of debug info. See llvm#189510 (comment) for more details.

…n assemblyFormat (llvm#188726) Using an optional attribute directly in assemblyFormat (e.g., `$attr attr-dict`) without wrapping it in an optional group causes the generated printer to call `printAttribute` with a null `Attribute` when the attribute is absent. This leads to a crash in the alias initializer when it calls `getAlias` on a null attribute. Add a validation check in `OpFormatParser::verifyAttributes` that detects this pattern and emits a diagnostic error with a helpful note pointing users to the correct `($attr^)?` syntax. Fixes llvm#58064 Assisted-by: Claude Code

…7281) Add binary arithmetic multiplication, division, remainder to DIL. This patch also passes DILMode to the parser to check if binary multiplication is allowed by the mode. This cannot be done in the lexer alone, because it allows token `*` as a dereference operator for legacy mode, but that token could also be a binary multiplication allowed only in full mode.

…sa) (llvm#188741) The "riscv-isa" LLVM module flag stores its value as an MDTuple containing MDStrings (e.g. `!{!"rv64i2p1", !"m2p0"}`). Previously, this fell through the unrecognized-key path in `convertModuleFlagValueFromMDTuple`, which emitted a warning and dropped the flag during import. This patch adds generic handling for MDTuples whose operands are all MDStrings:
- Import: convert to `ArrayAttr<StringAttr>` in `convertModuleFlagValueFromMDTuple`
- Export: convert `ArrayAttr<StringAttr>` back to an MDTuple of MDStrings in `convertModuleFlagValue`, enabling a lossless round-trip
- Verifier: allow `ArrayAttr<StringAttr>` as a valid `ModuleFlagAttr` value for keys not otherwise handled by specific verifier branches
Fixes llvm#188122 Assisted-by: Claude Code

…part 43) (llvm#191753) Tests converted from test/Lower/Intrinsics: transpose.f90, transpose_opt.f90, trim.f90, ubound.f90, ubound01.f90

llvm#191385) …n test suite Secondary PR to enable tests after llvm#183310 enables the features

This patch adds the `strnlen_s` function from Annex K. In order to reduce duplication between `strnlen` and `strnlen_s`, the common logic has been extracted to a new internal function which both now call. In addition to the function definition, the patch adds a unit test and a fuzzing test.
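The shared-helper structure described above can be sketched as follows. This is a minimal illustration, not the libc patch itself: the namespace and the names `bounded_length`, `my_strnlen`, and `my_strnlen_s` are hypothetical. The only behavior taken from Annex K is that `strnlen_s` returns 0 for a null pointer and otherwise returns the string length capped at the given maximum.

```cpp
#include <cstddef>

namespace sketch {

// Shared helper (hypothetical name): length of s, capped at max_len.
// Precondition: s is non-null.
inline std::size_t bounded_length(const char *s, std::size_t max_len) {
  std::size_t len = 0;
  while (len < max_len && s[len] != '\0')
    ++len;
  return len;
}

// strnlen-style wrapper: the caller guarantees that s is non-null.
inline std::size_t my_strnlen(const char *s, std::size_t max_len) {
  return bounded_length(s, max_len);
}

// strnlen_s-style wrapper: a null pointer yields 0 instead of being read.
inline std::size_t my_strnlen_s(const char *s, std::size_t max_len) {
  return s == nullptr ? 0 : bounded_length(s, max_len);
}

} // namespace sketch
```

With this split, the only difference between the two entry points is the null check, so the scanning loop exists in one place.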

…on (llvm#191792) OpReturnValue with a pointer type is invalid in SPIR-V Logical addressing model (Vulkan). The functions in the test return OpAccessChain results, which are pointers. Related to llvm#190736

…hLR (llvm#175991) When users request branch protection with PAuthLR on targets that do not support the PAuthLR instructions, the PAUTH_EPILOGUE falls back to using hint-space instructions. This fallback sequence uses X16 as a temporary register, but X16 was not listed in the clobber set. Because Speculative Load Hardening uses X16, this omission made SLH incompatible with this PAUTH_EPILOGUE path. Mark X16 as clobbered so the compiler does not assume X16 is preserved across the epilogue, restoring compatibility with Speculative Load Hardening and avoiding incorrect register liveness assumptions. The clobber is added in C++ rather than TableGen, as X16 is only clobbered when PAuthLR is requested as a branch protection variation and should not be treated as clobbered unconditionally.

…edure pointers. (llvm#183268) Fixes llvm#177505. This patch updates an existing external procedure symbol with the correct function signature and argument attributes, so it can be safely used as a proc_target without signature conflicts. --------- Co-authored-by: jeanPerier <jean.perier.polytechnique@gmail.com>

…0413123215

z1-cciauto · 2026-04-13T15:11:37Z

PSDB Link: https://compiler-ci.amd.com/job/compiler-psdb-amd-staging/5102

github-actions bot and others added 30 commits April 9, 2026 18:38

Merge remote-tracking branch 'upstream/main' into upstream_merge_2026…

278adb1

…0409183835

Upstream merge 2026-04-09 14:38 EDT (#2119)

6bc78cd

Merge remote-tracking branch 'upstream/main' into upstream_merge_2026…

1cdc824

…0409202645

Upstream merge 2026-04-09 16:26 EDT (#2122)

e18ebca

Merge remote-tracking branch 'upstream/main' into upstream_merge_2026…

5f9027d

…0410122621

Upstream merge 2026-04-10 08:26 EDT (#2131)

b723246

[SLP] Fix handling of strided loads during re-vectorization (llvm#191294

f647f0c

) Fixes llvm#191292

[AArch64] Fix legalization of bf16 ldexp. (llvm#190805)

82442a5

Similar to fp16 ldexp, we cannot create illegal types for bf16 during lowering so should promote.

[VPlan] Assert ComputeReductionResult isn't predicated in middle bloc…

a959796

…k. NFC (llvm#191767)

[mlir][tosa] Improve matmul verifier to check shape information (llvm…

53e01f1

…#191300) Updates the matmul verifier to check input and output shapes are valid. Also adds some tests for verifier failures which were previously not covered.

[VPlan] Directly check if middle block is pred of scalar preheader. (l…

6073fde

…lvm#191768) hasScalarTail currently returns incorrect results when queried after runtime checks have been added. Generalize and harden by checking if the middle block is a predecessor of the scalar preheader.

[clang-tidy] Fix a false positive when converting a bool to a signed …

abece58

…integer type (llvm#191696) Fix llvm#191337

[clang][bytecode] Fix placement new on multidimensional array elements (

923340b

llvm#191766) The direct base of those pointers is not a union, i.e. `getRecord()` returns `nullptr`.

[libc][math] Fix: add log2p1f16 to shared math (llvm#189179)

ec94434

This PR intends to add the log2p1f16 function to shared math, along with adding tests for it and Bazel which was missed in [f0ce26d](llvm@f0ce26d).

[libc][math] Fix: add log10p1f16 to shared math (llvm#189185)

bdc1192

This PR intends to add the log10p1f16 function to shared math, along with adding tests for it and Bazel which was missed in [a7d1a87](llvm@a7d1a87).

[Clangd] Don't traverse ConceptDecl in typeForNode (llvm#191654)

f65301d

ConceptDecl doesn't have an associated template declaration, and it doesn't introduce a type either. Fixes llvm#188914

[ORC] Forward declare DylibManager in ExecutorProcessControl.h. (llvm…

70b9fec

…#191771)

[libc++] Fix the mdspan ElementType complete object type mandate (llv…

df6c820

…m#191703) Fixes: llvm#191688

[AArch64] Don't forcefully add ORE to O0 pipeline (llvm#191476)

75ca0f7

Construct the OptimizationRemarkEmitter in AArch64StackTagging on demand if it is not available and is required; this avoids computing several analyses in all pipelines.

[libc][bazel][math][NFC] Fix deps (llvm#191785)

358f3d7

joker-eph and others added 25 commits April 13, 2026 12:20

[AArch64][GISel] Clamp bitcast to v2i64 (llvm#191360)

72d3533

This helps to not lower, improving the number of nodes that we expand into and improving the quality of the generated code. Originally a part of llvm#177158 by Ryan Cowan

[ORC] Forward declare MemoryAccess in ExecutorProcessControl.h. (llvm…

6dbf9d1

…#191778)

[NFC][SPIR-V] Use PatternMatch combinators in SPIRVEmitIntrinsics (ll…

4c3ef83

…vm#189554) Replace `dyn_cast<IntrinsicInst> + getIntrinsicID()` chains with PatternMatch combinators where applicable

[LangRef] Correct description of predicate.enable MD (llvm#191496)

8dcd471

The documentation incorrectly stated that this metadata enables/disables vectorization, but it actually controls predication. Also clarify that enabling predication implicitly enables vectorization.

[clang][test] Fix 32-bit bot failures after cc419f1 (llvm#191551)

137c53c

Use `{{.*}}` instead of `i64` for `memset` size type, as 32-bit platforms use `i32`.

[openmp] Silence warnings when building on Windows (llvm#191556)

988e00e

Fixes unused-but-set globals on non-Unix paths in kmp_alloc.cpp

[GSYM] Silence cast warning (llvm#191561)

d9a9962

Previously clang-cl was generating "warning: comparison of integers of different signs".

Revert "[lldb][DWARFASTParserClang] Handle pointer-to-member-data non…

a98ecc9

…-type te…" (llvm#191798) Reverts llvm#189510 Crashes lldb on certain types of debug info. See llvm#189510 (comment) for more details.

[flang][NFC] Converted five tests from old lowering to new lowering (…

848bf3e

…part 43) (llvm#191753) Tests converted from test/Lower/Intrinsics: transpose.f90, transpose_opt.f90, trim.f90, ubound.f90, ubound01.f90

[TySan][Sanitizer Common] Enable TySan testing in the sanitizer commo… (

dd0c5eb

llvm#191385) …n test suite Secondary PR to enable tests after llvm#183310 enables the features

[NFC][SPIR-V] Fix logical-struct-access.ll to pass spirv-val validati…

e027a17

…on (llvm#191792) OpReturnValue with a pointer type is invalid in SPIR-V Logical addressing model (Vulkan). The functions in the test return OpAccessChain results, which are pointers. Related to llvm#190736

Merge remote-tracking branch 'upstream/main' into upstream_merge_2026…

498e792

…0413123215

Upstream merge 2026-04-13 08:32 EDT (#2169)

00f5c9a

merge amd-main into amd-staging

14bebf9

z1-cciauto requested review from antiagainst and kuhar as code owners April 13, 2026 15:06

z1-cciauto requested a review from a team April 13, 2026 15:06

z1-cciauto wants to merge 55 commits into amd-staging from upstream_merge_202604131106


20 participants
