Skip to content

merge amd-main into amd-staging#2171

Open
z1-cciauto wants to merge 55 commits intoamd-stagingfrom
upstream_merge_202604131106
Open

merge amd-main into amd-staging#2171
z1-cciauto wants to merge 55 commits intoamd-stagingfrom
upstream_merge_202604131106

Conversation

@z1-cciauto
Copy link
Copy Markdown
Collaborator

No description provided.

github-actions bot and others added 30 commits April 9, 2026 18:38
…lvm#191395)

AArch64::getSVEPseudoMap calls are visible in compile-time profiles even on
non-SVE targets. I think CodeGenMapTable could be improved, it's currently
emitting a constexpr array sorted by opcode and a hand-rolled binary search
over that array, however the AArch64ExpandPseudoInsts pass is missing a simple
check for pseudo instructions before expanding. This avoids the compile-time
cost.

https://llvm-compile-time-tracker.com/compare.php?from=0d42811ea4658b3e86a3801b3bc848324f8540f8&to=9e2434de84577ca1c5e6de8fe8d75c6b8e282b3f&stat=instructions%3Au
…lvm#191119)

Instead of bailing out if the original divisor exceeds HBitWidth,
allow divisors that fit in HBitWidth after removing trailing zeros.

PartialRem now needs a low and high part. Shifting RemL left
now needs to handle shifting into RemH.

Assisted-by: Claude Sonnet 4.5
…ine-packing) (llvm#188192)

A transform pass to lower flat layout `vector.contract` operation to (a)
amx.tile_mulf for BF16, or (b) amx.tile_muli for Int8 packed types via
`online` packing.

TODOs: On an another `patch` planned to re-factor this pass + retiring
`convert-vector-to-amx` pass.
llvm#190397)

Implement Loop-splitting #pragma omp split construct with counts clause.
Posting this PR after the revert of PR
([llvm#183261](llvm#183261))

Changes: 

1. Added `openmp/runtime/test/transform/split/lit.local.cfg`
2. Enforced ICE for `counts` clause items in `SemaOpenMP.cpp` (minor
change)
3. Updated tests `split_messages.cpp`, `split_omp_fill.cpp`,
`split_diag_errors.c`.
4. Removed `nonconstant_count.cpp`
Similar to fp16 ldexp, we cannot create illegal types for bf16 during
lowering so should promote.
…89319)

Narrow the new setImplicitDefaultValue() guard so existing default
bindings are preserved only for aggregate-like cases.

The previous change was too broad and regressed normal
zero-initialization, causing new int[10]{} to be modeled as undefined
and emit a garbage-value warning instead of the expected analyzer
reports.
This patch encapsulates the check for wether a `SUnit` is clustered,
rather than letting it scatter across call sites. Currently there is
only a single user, but more users can show up, and I think it provides
a cleaner API even for that single user.
…#191300)

Updates the matmul verifier to check input and output shapes are valid.

Also adds some tests for verifier failures which were previously not
covered.
…lvm#191768)

hasScalarTail currently returns incorrect results when queried after
runtime checks have been added. Generalize and harden by checking if the
middle block is a predecessor of the scalar preheader.
llvm#191766)

The direct base of those pointers is not a union, i.e. `getRecord()`
returns `nullptr`.
This patch removes the `-aarch64-new-sme-abi=<true/false>` option (which
has been defaulted to "true" since LLVM 22), and removes the Selection
DAG lowering for the SME ABI.

There should be no functional changes for the default path
(`-aarch64-new-sme-abi=true`).
This PR intends to add the log2p1f16 function to shared math, along with
adding tests for it and bazel which was missed in
[f0ce26d](llvm@f0ce26d).
This PR intends to add the log10p1f16 function to shared math, along
with adding tests for it and Bazel which was missed in
[a7d1a87](llvm@a7d1a87).
ConceptDecl doesn't have an associated template declaration, and it
doesn't introduce a type either.

Fixes llvm#188914
Construct the OptimizationRemarkEmitter in AArch64StackTagging on demand
if not available and requires, this avoids computing several analysis in
all pipelines.
llvm#183310)

… features and test suite
This involved:
- Implementing the `__sanitizer_print_stack_trace` interface
- Adding the common signal handlers
- Correctly set the tool name
- Cache the binary name before running
… failure (llvm#188973)

DimOfReifyRankedShapedTypeOpInterface::matchAndRewrite called
reifyDimOfResult via the PatternRewriter. Some implementations delegate
to the coarse-grained reifyResultShapes, which creates ops for ALL
dimensions (e.g. a tensor.dim) before discovering that a specific
dimension is not reifiable (signalled by an empty OpFoldResult).

The pattern then returned failure() once it saw the empty OpFoldResult,
but the newly created ops were already in the IR. Under
MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS this triggered "pattern
returned failure but IR did change".

Fix: record the op immediately before the matched dim op, so we can
identify ops inserted during the reification attempt. If reification
returns an empty (unreifiable) OpFoldResult, erase those newly created
ops before returning failure, restoring the IR to its original state.

Assisted-by: Claude Code
joker-eph and others added 25 commits April 13, 2026 12:20
…m#188968)

Two bugs were introduced/revealed by
MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS:

1. `ACCLoopTilingImpl::matchAndRewrite` returned `success()` for loops
with no tile values, triggering "pattern returned success but IR did not
change". Fixed by returning `failure()` instead.

2. `moveOpsAndReplaceIVs` moved ops between blocks via `splice()` and
updated operands via `replaceAllUsesInRegionWith()` without notifying
the rewriter. This caused "operation fingerprint changed" errors since
the moved ops' parent op and operands changed without
`startOpModification`/ `finalizeOpModification`. Fixed by wrapping all
moved ops (and their nested ops) with rewriter modification
notifications.

Assisted-by: Claude Code
This helps to not lower, improving the number of nodes that we expand
into and improving the quality of the generated code.

Originally a part of llvm#177158 by Ryan Cowan
…vm#189554)

Replace `dyn_cast<IntrinsicInst> + getIntrinsicID()` chains with
PatternMatch combinators where applicable
…eger sampled type (llvm#190742)

Fix spirv-val failure:
```
error: line 20: Capability Int64ImageEXT is required when using Sampled Type of 64-bit int
  %spirv_Image = OpTypeImage %ulong 3D 0 0 0 0 Unknown ReadOnly
```

Detect when OpTypeImage uses a 64-bit integer as its sampled type and
automatically add the Int64ImageEXT capability and
SPV_EXT_shader_image_int64 extension

related to llvm#190736
The documentation incorrectly stated that this metadata enables/disables
vectorization, but it actually controls predication. 
Also clarify that enabling predication implicitly enables vectorization.
Use `{{.*}}` instead of `i64` for `memset` size type, as 32-bit
platforms use `i32`.
…rations (llvm#188743)

UpdateVCEPass only queried capabilities via QueryCapabilityInterface and
SPIRV type capabilities, but did not check capabilities implied by
decoration attributes on ops. Specifically, the DescriptorSet and
Binding decorations—represented by the `binding` and `descriptor_set`
attributes on `spirv.GlobalVariable`—require the `Shader` capability per
the SPIR-V spec Decoration table, but this was not deduced.

Add an explicit check in UpdateVCEPass: when a `spirv.GlobalVariable`
has a `binding` or `descriptor_set` attribute, require the `Shader`
capability.

Part of llvm#168357

Assisted-by: Claude Code
Fixes unused-but-set globals on non-Unix paths in kmp_alloc.cpp
… unit-test compiles (llvm#191564)

1. `check_cxx_compiler_flag` marked `COMPILER_RT_HAS_EXTERNAL_FLAG` true
for clang-cl, so `/experimental:external` and `/external:anglebrackets`
were passed and clang-cl warned they were unused. Now only probe with
real MSVC (not Clang); set the flag false otherwise; rely on that in
asan/interception/ubsan.

2. Custom compile lines for unit tests didn’t get C++17, so headers hit
`-Wc++17-extensions` (e.g. `constexpr if` / message-less `static_assert`
in FuzzedDataProvider / asan_fake_stack). Now append
`-std=c++${CMAKE_CXX_STANDARD}` or `-std=c++17` for C++ sources in
`clang_compile()` for both standalone and non-standalone builds.
Previously clang-cl was generating "warning: comparison of integers of
different signs".
Improve error handling with the following changes:

- If `kvm_open2()` fails in `DoLoadCore()`, log error with a message
obtained from the function.
- Return false or continue for loop if memory read fails.
- Rename `!error.Success()` to `error.Fail()` for readability (NFC).

Signed-off-by: Minsoo Choo <minsoochoo0122@proton.me>
…-type te…" (llvm#191798)

Reverts llvm#189510

Crashes lldb on certain type of debug info. See
llvm#189510 (comment)
for more details.
…n assemblyFormat (llvm#188726)

Using an optional attribute directly in assemblyFormat (e.g., `$attr
attr-dict`) without wrapping it in an optional group causes the
generated printer to call `printAttribute` with a null `Attribute` when
the attribute is absent. This leads to a crash in the alias initializer
when it calls `getAlias` on a null attribute.

Add a validation check in `OpFormatParser::verifyAttributes` that
detects this pattern and emits a diagnostic error with a helpful note
pointing users to the correct `($attr^)?` syntax.

Fixes llvm#58064

Assisted-by: Claude Code
…7281)

Add binary arithmetic multiplication, division, remainder to DIL.
This patch also passes DILMode to the parser to check if binary
multiplication is allowed by the mode. This cannot be done in the lexer
alone, because it allows token `*` as a dereference operator for legacy
mode, but that token could also be a binary multiplication allowed only
in full mode.
…sa) (llvm#188741)

The "riscv-isa" LLVM module flag stores its value as an MDTuple
containing MDStrings (e.g. `\!{\!"rv64i2p1", \!"m2p0"}`). Previously,
this fell through the unrecognized-key path in
`convertModuleFlagValueFromMDTuple`, which emitted a warning and dropped
the flag during import.

This patch adds generic handling for MDTuples whose operands are all
MDStrings:
- Import: convert to `ArrayAttr<StringAttr>` in
`convertModuleFlagValueFromMDTuple`
- Export: convert `ArrayAttr<StringAttr>` back to an MDTuple of
MDStrings in `convertModuleFlagValue`, enabling a lossless round-trip
- Verifier: allow `ArrayAttr<StringAttr>` as a valid `ModuleFlagAttr`
value for keys not otherwise handled by specific verifier branches

Fixes llvm#188122

Assisted-by: Claude Code
…part 43) (llvm#191753)

Tests converted from test/Lower/Intrinsics: transpose.f90,
transpose_opt.f90, trim.f90, ubound.f90, ubound01.f90
llvm#191385)

…n test suite

Secondary pr to enable tests after
llvm#183310 enables the features
This patch adds the `strnlen_s` function from Annex K.

In order to reduce duplication between `strnlen` and `strnlen_s`, the
common logic has been extracted to a new internal function which both
now call.

In addition to the function definition, the patch adds a unit test and a
fuzzing test.
…on (llvm#191792)

OpReturnValue with a pointer type is invalid in SPIR-V Logical
addressing model (Vulkan). The functions in the test return
OpAccessChain results, which are pointers

related to llvm#190736
…hLR (llvm#175991)

When users request branch protection with PAuthLR on targets that do not
support the PAuthLR instructions, the PAUTH_EPILOGUE falls back to using
hint-space instructions. This fallback sequence uses X16 as a temporary
register, but X16 was not listed in the clobber set.

Because Speculative Load Hardening uses X16, this omission made SLH
incompatible with this PAUTH_EPILOGUE path.

Mark X16 as clobbered so the compiler does not assume X16 is preserved
across the epilogue, restoring compatibility with Speculative Load
Hardening and avoiding incorrect register liveness assumptions. The
clobber is added in C++ rather than TableGen, as X16 is only clobbered
when PAuthLR is requested as a branch protection variation and should
not be treated as clobbered unconditionally.
…edure pointers. (llvm#183268)

Fixes llvm#177505.

This patch updates an existing external procedure symbol with the
correct function signature and argument attributes, so it can be safely
used as a proc_target without signature conflicts.

---------

Co-authored-by: jeanPerier <jean.perier.polytechnique@gmail.com>
@z1-cciauto z1-cciauto requested a review from a team April 13, 2026 15:06
@z1-cciauto
Copy link
Copy Markdown
Collaborator Author

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.