merge main into amd-main by z1-cciauto · Pull Request #2175 · ROCm/llvm-project

z1-cciauto · 2026-04-13T16:17:36Z

No description provided.

…ks (llvm#189522) Add `utils::diagDeprecatedCheckAlias` so checks can detect whether they are running under a deprecated name without enabling the new names. This commit also comes with an example using the `zircon` module. It was deprecated in the 22 release, but we didn't provide a note for it before.

…#191797) These #includes are only needed in the SimpleRemoteEPC.cpp implementation.

…vm#191622) They have never existed since the initial public checkin.

For primitive array elements, we would accidentally activate the element and then immediately deactivate the array root, which is wrong. Ignore the element from the beginning so the later check never even compares with the element.

It's fine if they are uninitialized.

…ax. (llvm#191799) Extend test coverage with dedicated epilogue vectorization tests for dead first-order recurrences and FMinMaxNum reductions. Add users to FORs in existing tests where the dead FORs appeared unintentional.

@erichkeane

As requested by @erichkeane here: llvm#190329 (comment)

…pe builtins (llvm#190969) When promoting scalar arguments to vectors for builtins like `ldexp`, `pown`, and `rootn`, use the correct vector type matching the argument element type instead of always using the return type: these builtins take an integer argument but have a floating-point return type. Fix the `ldexp` test that does not pass spirv-val and add similar tests for `pown` and `rootn`. Related to llvm#190736.

Pass GISelValueTracking* through isKnownNeverNaN and isKnownNeverSNaN so that the implementation can call computeKnownFPClass to derive NaN information from value tracking, rather than only looking at flags and direct constant definitions. Update all callers. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…lvm#190519) Most clients don't have a notion of "address" and pass arbitrary values (including `0` and `sizeof(void *)`) to `DataExtractor` constructors. This makes address-extraction methods dangerous to use. Those clients that do have a notion of address can use other methods like `getUnsigned()` to extract an address, or they can derive from `DataExtractor` and add convenience methods if extracting an address is routine. `DWARFDataExtractor` is an example, where the removed methods were actually moved. This does not remove `AddressSize` argument of `DataExtractor` constructors yet, but makes it unused and overloads constructors in preparation for their deletion. I'll be removing uses of the to-be-deleted constructors in follow-up patches.

Enables LC_UUID load commands to be added with the addLoadCommand method. This will be used in future MachOPlatform changes to add support for adding UUIDs to MachO JITDylibs.

…#175870) Replace the manual check in `verifyRemoved()` with `AssertingVH` instrumentation. For cases where the leader table becomes very large, this is a cheaper way to verify we don't have dangling entries in the leader table. For this change, we must implement a move constructor for `AssertingVH` so that we can keep the first entry as an inline-allocated node that will be handled correctly as the table grows.

Part of the work to remove trivial VP intrinsics from the RISC-V backend, see https://discourse.llvm.org/t/rfc-remove-codegen-support-for-trivial-vp-intrinsics-in-the-risc-v-backend/87999 This PR expands four intrinsics before codegen, but doesn't remove the codegen handling yet as both DAGCombiner and type legalization can create these nodes. vp.fneg and vp.fpext are expanded in lockstep with the fma/fmuladd intrinsics since some test cases for vfmacc etc. also use these intrinsics, and mixing dynamic and constant vls causes some of the more complex patterns to be missed. The fixed-length VP vfmacc, vfmsac, vfnmacc and vfnmsac tests also need to replace the EVL of the vp.merge/vp.select with an immediate otherwise the resulting vmerge.vvm can't be folded into them. This only happens for fixed vector intrinsics with no passthru, since we end up with a constant vl from the fixed vector and dynamic vl from the vp.merge that prevents folding. As far as I'm aware we don't emit fixed length vp.merges in practice, since we only emit vp.merge in the loop vectorizer, and we only use it with EVL tail folding which requires a scalable VF.

This patch fixes 2 problems in the lldb-server argument parser:

1. Let's try to start lldb-server with incorrect arguments
```
./lldb-server platform --listen *:1111--server
```
Current behavior
* lldb-server runs in gdbserver mode with port 1111

Expected behavior
* fail, as `1111--server` is not a number

2. And try to start lldb-server with a host:port specification without a colon
```
./lldb-server gdbserver 1111 ./test
Launched './test' as process 186...
lldb-server-local_build
lldb-server: llvm-project/lldb/source/Host/common/TCPSocket.cpp:245: virtual Status lldb_private::TCPSocket::Listen(llvm::StringRef, int): Assertion `error.Fail()' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0. Program arguments: ./lldb-server gdbserver 1111 ./test
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
0 lldb-server 0x0000002ab86d0ca2
1 lldb-server 0x0000002ab86ced06
2 lldb-server 0x0000002ab86d1428
3 linux-vdso.so.1 0x0000003f8e7fd800 __vdso_rt_sigreturn + 0
4 libc.so.6 0x0000003f8e2b264a
5 libc.so.6 0x0000003f8e27b1ac gsignal + 18
6 libc.so.6 0x0000003f8e26c14c abort + 180
7 libc.so.6 0x0000003f8e2760cc
8 libc.so.6 0x0000003f8e27610e __assert_perror_fail + 0
9 lldb-server 0x0000002ab86eb628
10 lldb-server 0x0000002ab86f1010
11 lldb-server 0x0000002ab86eeee0
12 lldb-server 0x0000002ab86eee5c
13 lldb-server 0x0000002ab863ef3a
14 lldb-server 0x0000002ab864067c
15 lldb-server 0x0000002ab86438da
16 libc.so.6 0x0000003f8e26c476
17 libc.so.6 0x0000003f8e26c51e __libc_start_main + 116
18 lldb-server 0x0000002ab863ce64
Aborted
```
We expect to see an error instead of an lldb-server crash in this case.
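The two failure modes described above can be illustrated with a minimal sketch of strict `host:port` parsing. This is not the lldb-server implementation: `parseHostPort` is a hypothetical helper written for illustration, and it assumes that a missing colon and any non-numeric port text should both be reported as errors.

```cpp
#include <charconv>
#include <cstdint>
#include <optional>
#include <string>
#include <string_view>
#include <system_error>
#include <utility>

// Hypothetical helper (not lldb-server's actual parser).
// Accepts "host:port" and returns std::nullopt for anything malformed.
std::optional<std::pair<std::string, std::uint16_t>>
parseHostPort(std::string_view spec) {
  // Require an explicit colon: a bare "1111" is rejected instead of guessed at.
  const std::size_t colon = spec.rfind(':');
  if (colon == std::string_view::npos)
    return std::nullopt;

  const std::string_view host = spec.substr(0, colon);
  const std::string_view portText = spec.substr(colon + 1);
  if (portText.empty())
    return std::nullopt;

  std::uint16_t port = 0;
  const char *first = portText.data();
  const char *last = first + portText.size();
  const auto [ptr, ec] = std::from_chars(first, last, port);

  // Reject out-of-range values and trailing garbage such as "1111--server":
  // the whole port text must be consumed by the numeric parse.
  if (ec != std::errc() || ptr != last)
    return std::nullopt;

  return std::make_pair(std::string(host), port);
}
```

With this sketch, `parseHostPort("*:1111--server")` and `parseHostPort("1111")` both yield an empty optional, while `parseHostPort("*:1111")` yields the host `*` and port 1111.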

…#191359) Both implementations are currently equivalent. This is likely a leftover from the past, when `llvm::Optional` existed.

Expose existing trylock internal operation to posix interface. POSIX.1-2024 only specifies the `EBUSY` error case. Assisted-by: Codex with gpt-5.4 default fast
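As a rough caller-side illustration of the `EBUSY` case, the sketch below assumes the interface in question behaves like the standard POSIX `pthread_mutex_trylock`. The `TryLockResult` enum and `tryAcquire` wrapper are hypothetical names introduced only for this example.

```cpp
#include <cerrno>
#include <pthread.h>

// Hypothetical result type for illustration only.
enum class TryLockResult { Acquired, Busy, Error };

// Attempts to take the mutex without blocking.
// EBUSY means the mutex is already locked; any other nonzero code is
// reported as a generic error.
TryLockResult tryAcquire(pthread_mutex_t &mutex) {
  const int rc = pthread_mutex_trylock(&mutex);
  if (rc == 0)
    return TryLockResult::Acquired;
  if (rc == EBUSY)
    return TryLockResult::Busy;
  return TryLockResult::Error;
}
```

A caller that gets `Busy` can do other work and retry later, which is the main reason to prefer a trylock over a blocking lock.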

Part of the work to remove trivial VP intrinsics from the RISC-V backend, see https://discourse.llvm.org/t/rfc-remove-codegen-support-for-trivial-vp-intrinsics-in-the-risc-v-backend/87999 This splits off 4 intrinsics from llvm#179622.

…191819) This change aims to make it easier for MachOPlatform clients to customize JITDylib MachO headers. At MachOPlatform construction time clients can now supply a MachOPlatform::HeaderOptionsBuilder. The supplied callback will be called by setupJITDylib to create the HeaderOptions for the JITDylib being set up. No testcase: Constructing a MachOPlatform instance requires the ORC runtime, which we can't require for LLVM unit or regression suite tests. We should look at testing this functionality in the new ORC runtime once it's ready.

For a loop-nest-generating construct this function returns the number of loops in the generated loop nest. A loop-nest-transformation construct can be thought of as replacing N nested loops with K nested loops, where N = GetAffectedNestDepthWithReason and K = GetGeneratedNestDepthWithReason.

This change improves the lifetime safety checker to detect when constructor parameters escape to class fields and suggest appropriate `[[clang::lifetimebound]]` annotations.
```cpp
struct A {
  View v;
  A(const MyObj& obj) : v(obj) {} // Now suggests [[clang::lifetimebound]]
};
```
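For reference, a sketch of what the example might look like once the suggested annotation is applied. The `MyObj` and `View` definitions are assumptions (the commit message does not show them), and the `LIFETIMEBOUND` macro is a hypothetical convenience so the snippet also compiles with compilers that do not know the Clang attribute.

```cpp
// Hypothetical macro: expands to the Clang attribute where available,
// and to nothing on other compilers.
#if defined(__clang__)
#define LIFETIMEBOUND [[clang::lifetimebound]]
#else
#define LIFETIMEBOUND
#endif

// Assumed definitions; the original example does not show these types.
struct MyObj {
  int value = 0;
};

// A non-owning view that keeps a pointer to the object it was built from.
struct View {
  const MyObj *ptr;
  View(const MyObj &obj LIFETIMEBOUND) : ptr(&obj) {}
};

struct A {
  View v;
  // The parameter escapes into the field `v`, so it is annotated to tell
  // the checker that `A` must not outlive `obj`.
  A(const MyObj &obj LIFETIMEBOUND) : v(obj) {}
};
```

With the annotation in place, constructing an `A` from a temporary `MyObj` is the kind of misuse Clang can then diagnose at the call site.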

…lvm#191834) …cess). 6dbf9d1 forward declared the MemoryAccess class in ExecutorProcessControl.h, breaking some examples that were depending on the transitive include. (See e.g. https://lab.llvm.org/buildbot/#/builders/80/builds/21875). This commit adds the missing #includes to the broken examples.

These are now listed in the asciidoc spec here https://github.com/riscv/riscv-p-spec I got some help on this from AI, but I reviewed it. Test cases were fully generated with AI.

…m#191818) Add GICv5 `ICH_PPI_HVIR{0,1}_EL2` system registers (Interrupt Controller PPI Hide Virtual Interrupt Registers). These registers are added because a hypervisor may want to only expose a subset of the PPIs to the virtual machine and hide the remaining PPIs. The only way the hypervisor can do this is by trapping all the PPI ICV registers which leads to additional code complexity and adds performance overhead especially for nested virtualization. These are documented here: https://developer.arm.com/documentation/111107/latest/AArch64-Registers/ICH-PPI-HVIR-n--EL2--Interrupt-Controller-PPI-Hide-Virtual-Interrupt-Registers

…follow LLVM conventions (llvm#191134) Follow-up to llvm#189948 (comment). Addresses review feedback. Co-authored-by: padivedi <padivedi@amd.com>

…kernels (llvm#191770) Don't use the L0 heuristics if all the dimensions are specified by the user code.

…#190026) Add translation from the MLIR OpenMP depend clause with iterator modifier to LLVM IR. `buildDependData` (in OpenMPToLLVMIRTranslation) allocates a single `kmp_depend_info` array sized to hold both locator (non-iterated) and iterated entries. Locator dependencies use the existing static path (a vector of `DependData`), while iterated dependencies use a dynamically-sized path (`DepArray`, `NumDeps`). The reason both paths are not unified under the dynamic allocation is that the existing locator path emits actual `kmp_depend_info` entries inside OMPIRBuilder methods (`createTask`, `createTarget`), whereas the iterator path must emit the iterator loop in OpenMPToLLVMIRTranslation (since the convention is to not pass MLIR ops into the OMPIRBuilder). Unifying them would require modifying existing depend clause tests. The `OMPIRBuilder::DependenciesInfo` struct is extended to hold either a `SmallVector<DependData>` (locator path) or a pre-built `{DepArray, NumDeps}` pair (iterator path). The single-entry `emitTaskDependency` helper is made public so the translation layer can fill individual `kmp_depend_info` entries inside the iterator loop body. This patch is part of the feature work for llvm#188061. Assisted with copilot.

This is a set of squashed reverts of recent clang-doc patches, since it's breaking something on Darwin builders: https://lab.llvm.org/buildbot/#/builders/23/builds/19172

Revert "[clang-doc][nfc] Default initialize all StringRef members (llvm#191641)" This reverts commit 155b9b3.
Revert "[clang-doc] Initialize StringRef members in Info types (llvm#191637)" This reverts commit 489dab3.
Revert "[clang-doc] Initialize member variable (llvm#191570)" This reverts commit 5d64a44.
Revert "[clang-doc] Merge data into persistent memory (llvm#190056)" This reverts commit 21e0034.
Revert "[clang-doc] Support deep copy between arenas for merging (llvm#190055)" This reverts commit c70dae8.

This PR improves native binary generation by avoiding the `llvm::sys::ExecuteAndWait` call for ocloc and instead using `oclocInvoke()`, which consumes an in-memory SPIR-V string. Co-authored-by: Artem Kroviakov <artem.kroviakov@intel.com>

…91003)

Refactors the copysign math family to be header-only.

Closes llvm#182136

Target Functions:
- copysign
- copysignbf16
- copysignf
- copysignf128
- copysignf16
- copysignl

---------

Co-authored-by: bassiounix <muhammad.m.bassiouni@gmail.com>

…profiles (llvm#191523) This change optimizes the basename matching logic in `SampleProfileMatcher::matchFunctionsWithoutProfileByBasename` by replacing the existing O(N*M) nested loop with an O(N+M) hash-based lookup, while strictly preserving the original matching semantics. The previous implementation relied on a substring heuristic (`ProfName.contains(BaseName)`) to bypass expensive demangling operations during the nested iteration; however, in codebases with common or overlapping function names, this heuristic frequently evaluated to true, resulting in redundant demangling and quadratic time complexity. The updated approach demangles each profile name exactly once and utilizes a `StringMap` to perform O(1) lookups against the orphan functions. This eliminates the need for the substring pre-check while maintaining the exact same constraints: establishing a strict 1:1 mapping between orphaned IR functions and profile entries, and correctly identifying and rejecting ambiguous matches where multiple entities share the same demangled basename. Results in a 9x speedup on a large module with common basenames.
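The general shape of the hash-based approach can be sketched as follows. This is not the `SampleProfileMatcher` code: `std::unordered_map` stands in for `StringMap`, `baseNameOf` is a hypothetical stand-in for demangling that simply takes the text after the last `::`, and `Slot` and `matchByBaseName` are names invented for the example.

```cpp
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Stand-in for real demangling (an assumption for this sketch):
// the basename is whatever follows the last "::".
std::string baseNameOf(const std::string &name) {
  const std::size_t pos = name.rfind("::");
  return pos == std::string::npos ? name : name.substr(pos + 2);
}

// Bookkeeping for one basename.
struct Slot {
  std::string profile;  // last profile name seen with this basename
  int orphanCount = 0;  // orphan functions sharing this basename
  int profileCount = 0; // profile names sharing this basename
};

// Returns (orphan function, profile name) pairs for unambiguous matches only.
std::vector<std::pair<std::string, std::string>>
matchByBaseName(const std::vector<std::string> &orphans,
                const std::vector<std::string> &profiles) {
  std::unordered_map<std::string, Slot> slots;

  // Pass 1, O(N): index the orphan functions by basename.
  for (const std::string &fn : orphans)
    ++slots[baseNameOf(fn)].orphanCount;

  // Pass 2, O(M): compute each profile basename once and look it up.
  for (const std::string &p : profiles) {
    auto it = slots.find(baseNameOf(p));
    if (it == slots.end())
      continue;
    it->second.profile = p;
    ++it->second.profileCount;
  }

  // Keep only strict 1:1 matches; a basename shared by several orphans or
  // several profile names is ambiguous and therefore rejected.
  std::vector<std::pair<std::string, std::string>> matches;
  for (const std::string &fn : orphans) {
    const Slot &s = slots.find(baseNameOf(fn))->second;
    if (s.orphanCount == 1 && s.profileCount == 1)
      matches.emplace_back(fn, s.profile);
  }
  return matches;
}
```

Each name is hashed a constant number of times, so the expected cost is linear in the number of orphans plus the number of profile names, and no substring pre-check is needed.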

z1-cciauto · 2026-04-13T16:23:35Z

PSDB Build Link: http://mlse-bdc-20dd129:8065/#/builders/11/builds/208

zeyi2 and others added 30 commits April 13, 2026 12:43

[ORC] Sink a #include in SimpleRemoteEPC.h, and remove another. (llvm…

f4da0ca

…#191797) These #includes are only needed in the SimpleRemoteEPC.cpp implementation.

[lldb] Remove declarations of two non-existent constructors (NFC) (ll…

80d72ae

…vm#191622) They have never existed since the initial public checkin.

[clang][bytecode] Don't check anonymous union in memcpy op (llvm#191783)

14f2556

It's fine if they are uninitialized.

XFAIL symbolizer test for TySan (llvm#191810)

7083e9d

[CIR][NFC] Add NYI for OMPSplitDirective stmt (llvm#191791)

a042785

As requested by @erichkeane here: llvm#190329 (comment)

[LV] Add test for reverse load with scatter store. nfc (llvm#189928)

28e237a

[ORC] Add MachOBuilder support for LC_UUID load commands. (llvm#191807)

a2bf43d

Enables LC_UUID load commands to be added with the addLoadCommand method. This will be used in future MachOPlatform changes to add support for adding UUIDs to MachO JITDylibs.

[NFC] Replace expectedToStdOptional with expectedToOptional (llvm…

52a250e

…#191359) Both implementations are currently equivalent. This is likely a leftover from the past, when `llvm::Optional` existed.

[libc] add posix_mutex_trylock support (llvm#191531)

91c0fdf

Expose existing trylock internal operation to posix interface. POSIX.1-2024 only specifies the `EBUSY` error case. Assisted-by: Codex with gpt-5.4 default fast

[RISCV] Add an initial set of InstAliases for P extension. (llvm#180315)

3f45921

These are now listed in the asciidoc spec here https://github.com/riscv/riscv-p-spec I got some help on this from AI, but I reviewed it. Test cases were fully generated with AI.

[NFC][UniformityAnalysis] Rename variables in uniformity analysis to …

f058eaa

…follow LLVM conventions (llvm#191134) Follow-up to llvm#189948 (comment). Addresses review feedback. Co-authored-by: padivedi <padivedi@amd.com>

[OFFLOAD][L0] Handle group sizes correctly for multidimensional bare …

e0adc50

…kernels (llvm#191770) Don't use the L0 heuristics if all the dimensions are specified by the user code.

AlexisPerry and others added 4 commits April 13, 2026 11:55

[flang] Adding meeting notes for the April 8, 2026 Flang call (llvm#1…

058d80d

…91003)

merge main into amd-main

2e0ebfa

z1-cciauto requested a review from a team April 13, 2026 16:17

z1-cciauto wants to merge 34 commits into amd-main from upstream_merge_202604131217

20 participants
