Merged
…189657)" (llvm#191912) This reverts commit 67c893e due to buildbot breakage (llvm#189657 (comment), llvm#189657 (comment)).
Extend copyMetadata to every call-to-call replacement in AMDGPULowerIntrinsics, not just the single-wave s_barrier → wave_barrier path. This covers:
- s_cluster_barrier → wave_barrier (single-wave)
- s_cluster_barrier → signal_isfirst + wait + signal + wait (multi-wave)
- s_barrier → signal + wait (split barriers)

Add GFX11 and GFX12 RUN lines and test functions for all lowering paths to verify metadata preservation.

Made-with: Cursor
…1925) Closes llvm#191910 --------- Co-authored-by: Joseph Huber <huberjn@outlook.com>
…lvm#191745) This reverts commit 4abb927. The code has not been needed since 121f5a9, because the C compiler is now always the just-built clang in an in-tree build. In addition, CMAKE_AR is llvm-ar and CMAKE_RANLIB is llvm-ranlib.
…fier (llvm#191849) Currently the signature of `result(..)` is:

```python
result(*, infer_type: bool = False, default_factory: Callable[[], Any] | None = None, kw_only: bool = False) -> Result
```

So when users write `result(infer_type=True)`, type checkers still see `kw_only=False` (from the signature), although `kw_only` should actually be `True` (it should follow the value of `infer_type`). Users can write `result(infer_type=True, kw_only=True)`, but that is unnecessarily verbose.

This may introduce an incompatibility when we start to use `dataclass_transform`. Currently it is fine because we do not use `dataclass_transform`, but once we do, we may require a breaking change. This PR migrates such uses to a new field specifier named `infer_result()`.
These seemed to have gotten removed here.
…llvm#188189) Upstreaming clangIR PR: llvm/clangir#2092

This PR adds support for emitting llvm.used and llvm.compiler.used global arrays in CIR. Added addUsedGlobal() and addCompilerUsedGlobal() methods to CIRGenModule. Adds __hip_cuid_* to llvm.compiler.used for HIP compilation. Followed the OGCG implementation in clang/lib/CodeGen/CodeGenModule.cpp.
…test (llvm#191936) Using -fopenmp uses the default openmp lib, which defaults to libomp but may be something else. This test only passes with libomp, so it passes when using default, but fails downstream if configured for something else, like libgomp.
) When the linker is specified as ld, the toolchain applies special handling by invoking (triple)-ld instead of resolving ld via the standard PATH lookup. This causes a GNU ld installed via the system package manager to take precedence (since (triple)-ld appears earlier in the search path), effectively overriding ld.lld. As a result, we set the default linker on FreeBSD to ld.lld to indicate that we want to use lld by default.
…lvm#190250) A bare `!$omp declare target` could incorrectly mark `_QQmain` as `omp.declare_target` when it appeared in an interface body inside a named main program. That pulled host-only callees into device compilation and caused offload link failures.

Fix this by skipping main programs in the implicit-capture path. Also add a regression test for the named-main interface case and update `real10.f90` to use a valid container for the bare `declare target` form.

This fixes offload link failures where `_QQmain` was incorrectly treated as a device function and pulled in host-only symbols such as Fortran I/O runtime calls.

Minimal reproducer:

```fortran
program named_main
  interface
    subroutine sub_a(x)
      !$omp declare target
      integer, intent(inout) :: x
    end subroutine
  end interface
  integer :: v
  !$omp target
  call sub_a(v)
  !$omp end target
end program
```
…m#191841) `AddressSize` parameter is not used by `DataExtractor` and will be removed in the future. See llvm#190519 for more context. I took the liberty of switching from using the `StringRef` constructor overload to `ArrayRef` where appropriate.
…m#191864) `AddressSize` parameter is not used by `DataExtractor` and will be removed in the future. See llvm#190519 for more context.
Updated [TidyFastCheck.inc](https://github.com/llvm/llvm-project/blob/main/clang-tools-extra/clangd/TidyFastChecks.inc#L1) that has been stale for a while using this [script](https://github.com/llvm/llvm-project/blob/main/clang-tools-extra/clangd/TidyFastChecks.py), as discussed in llvm#190531. In the thread, there was some conversation on the limitations of doing this manually at every new release (adding the script to the release checklist would definitely help) but it seems like this is the only low-risk solution for now.
In the RVV Clang builtins generator, a new prototype descriptor `d` was added to represent vectors with `2 x LMUL`. The `.ll` tests were generated by an LLM and I have reviewed them. The `.c` tests were generated by riscv-non-isa/riscv-rvv-intrinsic-doc#431.
llvm#191731) Add `UseFact`s for field origins when calling instance methods. Fixes llvm#182945 --------- Co-authored-by: Utkarsh Saxena <usx@google.com>
…191956) This reduces the bytecode output for the copy constructor of a struct such as:

```c++
struct Buffer {
  struct {
    char D[N];
  } V;
  Buffer() = default;
};
```

from

```
Buffer<5>::(unnamed struct)::(unnamed struct at array.cpp:873:3) 0x7d38d2de3f80
frame size: 104
arg size: 96
rvo: 0
this arg: 1
0 GetPtrThisField 16
16 GetParamPtr 0
32 GetPtrFieldPop 16
48 InitScope 0
64 SetLocalPtr 40
80 GetLocalPtr 40
96 ArrayDecay
104 ExpandPtr
112 ConstUint64 0
128 ArrayElemPtrPopUint64
136 LoadPopSint8
144 InitElemSint8 0
160 GetLocalPtr 40
176 ArrayDecay
184 ExpandPtr
192 ConstUint64 1
208 ArrayElemPtrPopUint64
216 LoadPopSint8
224 InitElemSint8 1
240 GetLocalPtr 40
256 ArrayDecay
264 ExpandPtr
272 ConstUint64 2
288 ArrayElemPtrPopUint64
296 LoadPopSint8
304 InitElemSint8 2
320 GetLocalPtr 40
336 ArrayDecay
344 ExpandPtr
352 ConstUint64 3
368 ArrayElemPtrPopUint64
376 LoadPopSint8
384 InitElemSint8 3
400 GetLocalPtr 40
416 ArrayDecay
424 ExpandPtr
432 ConstUint64 4
448 ArrayElemPtrPopUint64
456 LoadPopSint8
464 InitElemSint8 4
480 FinishInitPop
488 Destroy 0
504 Destroy 0
520 RetVoid
```

(where `N = 5`) to:

```
Buffer<5>::(unnamed struct)::(unnamed struct at array.cpp:873:3) 0x7c85b9fe3f80
frame size: 0
arg size: 96
rvo: 0
this arg: 1
0 GetPtrThisField 16
16 GetParamPtr 0
32 GetPtrFieldPop 16
48 CopyArraySint8 0 0 5
80 FinishInitPop
88 RetVoid
```
…llvm#186593) Add new SelectionDAG pattern matchers for funnel shifts:
- m_FShL and m_FShR as ternary wrappers for ISD::FSHL/ISD::FSHR
- m_FShLLike and m_FShRLike to match:
  - direct FSHL/FSHR nodes
  - ROTL/ROTR equivalents (binding both X and Y to the same rotate operand)
  - OR(SHL(X, C), SRL(Y, BW - C)) forms (including commuted OR)

Also add unit tests covering positive and negative cases for:
- direct funnel-shift matching
- rotate equivalence matching
- OR-based funnel-shift-like patterns

Fixes llvm#185880
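The three forms that the "like" matchers accept compute the same value, which a small scalar model can illustrate. The sketch below is not the SelectionDAG matcher code; `fshl32`, `fshr32`, `rotl32`, and `orShlSrl32` are hypothetical helper names, the bit width is fixed at 32, and the OR-based form is only meaningful for `0 < C < 32`.

```cpp
#include <cstdint>

// Funnel shift left: concatenate X:Y (X in the high half), shift left by
// C modulo 32, and keep the high 32 bits.
constexpr uint32_t fshl32(uint32_t X, uint32_t Y, unsigned C) {
  C %= 32;
  return C == 0 ? X : (X << C) | (Y >> (32 - C));
}

// Funnel shift right: concatenate X:Y, shift right by C modulo 32, and
// keep the low 32 bits.
constexpr uint32_t fshr32(uint32_t X, uint32_t Y, unsigned C) {
  C %= 32;
  return C == 0 ? Y : (X << (32 - C)) | (Y >> C);
}

// A rotate is a funnel shift with both inputs bound to the same operand.
constexpr uint32_t rotl32(uint32_t X, unsigned C) { return fshl32(X, X, C); }

// The OR(SHL(X, C), SRL(Y, BW - C)) form with BW = 32.
// Assumption: the caller guarantees 0 < C < 32, so neither shift is by 32.
constexpr uint32_t orShlSrl32(uint32_t X, uint32_t Y, unsigned C) {
  return (X << C) | (Y >> (32 - C));
}
```

For in-range constants, `orShlSrl32(X, Y, C)` and `fshl32(X, Y, C)` agree, which is why a matcher can treat the OR form as funnel-shift-like.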
Fixes llvm#190502

Added implementation of helper combineOrWithGF2P8AFFINEQB and wired the logic with combineOrXorWithSETCC.

Fold: (GF2P8AFFINEQB(X, Y, Imm) or_disjoint SplatVal) -> GF2P8AFFINEQB(X, Y, Imm ^ SplatVal)

When OR is disjoint (no common bits), the splat constant can be folded directly into the GF2P8AFFINEQB immediate via XOR.
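The fold works because the immediate is XORed into every result byte, and a disjoint OR is equivalent to an XOR. Below is a scalar sketch of one result byte, assuming the per-byte definition from Intel's documentation (result bit `i` is the parity of matrix byte `7 - i` ANDed with the source byte, XORed with immediate bit `i`). The names `parity8`, `affineByte`, and `foldHoldsForAllInputs` are hypothetical and not part of the patch.

```cpp
#include <cstdint>

// Parity (XOR of all bits) of a byte.
constexpr uint8_t parity8(uint8_t V) {
  V ^= V >> 4;
  V ^= V >> 2;
  V ^= V >> 1;
  return V & 1;
}

// One byte of the affine transform: matrix A (8 rows packed in a qword),
// source byte X, immediate Imm XORed into the result.
constexpr uint8_t affineByte(uint64_t A, uint8_t X, uint8_t Imm) {
  uint8_t R = 0;
  for (int I = 0; I < 8; ++I) {
    uint8_t Row = static_cast<uint8_t>(A >> (8 * (7 - I)));
    uint8_t Bit = parity8(Row & X) ^ ((Imm >> I) & 1);
    R |= static_cast<uint8_t>(Bit << I);
  }
  return R;
}

// Checks the fold for every source byte. Returns false if the OR is not
// disjoint for some input (the precondition fails) or if the folded form
// disagrees with the original OR.
bool foldHoldsForAllInputs(uint64_t A, uint8_t Imm, uint8_t Splat) {
  for (unsigned X = 0; X < 256; ++X) {
    uint8_t Before = affineByte(A, static_cast<uint8_t>(X), Imm);
    if (Before & Splat)
      return false; // common bits: the OR is not disjoint
    if ((Before | Splat) !=
        affineByte(A, static_cast<uint8_t>(X), Imm ^ Splat))
      return false;
  }
  return true;
}
```

With a matrix whose rows for the upper four result bits are zero and an immediate confined to the low nibble, OR-ing with `0xF0` is always disjoint, so the check succeeds for every input byte.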
Fixes a problem where tryCompressVPMOVPattern incorrectly folds instructions using extended registers into VEX. Introduced relevant tests in MIR. AI Statement: I used AI to write the tests. Fixes llvm#191304
…mdspan` taking `(data_handle_type, mapping_type, accessor_type)` and the corresponding constructor (llvm#191950) No functional change; this only removes a redundant const qualifier. Fixes: llvm#189860
…190838) Almost all recipes now go through ::computeCost to properly compute their costs using the VPlan-based cost model. There are currently no known cases where the VPlan-based cost model returns an incorrect cost vs the legacy cost model. I checked the remaining open issues that report the assertion triggering, and in all cases the VPlan-based cost model is more accurate, which is what causes the divergence. There are still some fall-back paths, mostly via precomputeCosts, but those cannot be easily removed without triggering the assert, as the VPlan-based cost model is more accurate for those cases. An example of this is llvm#187056. Fixes llvm#38575. Fixes llvm#149651. Fixes llvm#182646. Fixes llvm#183739. Fixes llvm#187523. PR: llvm#190838
…lvm#190139) For some cores it is preferable to choose a destination predicate register that does not match the governing predicate. The hint is conservative in that it tries not to pick a callee-save register if it's not already used/allocated for other purposes, as that would introduce new spills/fills. Note that this might be preferable if the instruction is executed in a loop, but it might also be less preferable for small functions that have an SVE interface (p4-p15 are caller-preserved). It is enabled for all cores by default, but it can be disabled by adding the `disable-distinct-dst-reg-cmp-match` feature. This feature can also be added to specific cores if this behaviour is undesirable.
Extend the existing NonNarrowingCastsOptimization to also cover casts between the floating point types f32, f16, bf16, f8E4M3FN and f8E5M2. Avoid introducing direct casts between f8 types since those are not allowed in TOSA. Also expand the set of cases that are considered non-narrowing by only checking whether the cast we're trying to remove is non-narrowing. For example, i16 -> i32 -> i8 would have been rejected before, but it is now safely converted to a single i16 -> i8 tosa.cast, since the behaviour should be identical for the entire input space. Finally, disallow the optimization when the cast that we would remove involves integer types of different signedness. Signed-off-by: Ian Tayler Lessa <ian.taylerlessa@arm.com>
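The integer example can be checked exhaustively in a small model. The sketch below uses C++ integer conversions as a stand-in for tosa.cast, which is an assumption: it presumes wrap-around truncation on narrowing and sign extension on widening. The function names are hypothetical. The second pair of functions shows why a removed cast that changes signedness is not safe to drop.

```cpp
#include <cstdint>

// Exhaustively compares i16 -> i32 -> i8 against a direct i16 -> i8 cast.
// Assumption: narrowing wraps around (modular truncation), as C++ integer
// conversions do on two's-complement targets.
bool wideningThenNarrowingIsRedundant() {
  for (int32_t V = INT16_MIN; V <= INT16_MAX; ++V) {
    int16_t In = static_cast<int16_t>(V);
    int8_t ViaI32 = static_cast<int8_t>(static_cast<int32_t>(In));
    int8_t Direct = static_cast<int8_t>(In);
    if (ViaI32 != Direct)
      return false;
  }
  return true;
}

// i8 -> u8 -> i32: the intermediate unsigned type turns sign extension
// into zero extension.
int32_t widenViaUnsigned(int8_t In) {
  return static_cast<int32_t>(static_cast<uint8_t>(In));
}

// i8 -> i32 directly: sign extension.
int32_t widenDirect(int8_t In) { return static_cast<int32_t>(In); }
```

For an input of -1 the two widening paths disagree (255 versus -1), so removing the intermediate cast would change the result.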
…lvm#191820) SPIR-V cannot encode hidden for now, which leads to quirky errors. For now we deal with this at run time, as part of JIT. Once SPIR-V learns about `hidden` it'll be revisited.
This patch builds on llvm#184659 and llvm#184649 and adds cost modelling for the new dot instruction variants that those patches generate code for.
…#186896) This builds on the MCLFIRewriter infrastructure to add the AArch64-specific LFI rewriter, which rewrites AArch64 instructions for LFI sandboxing during the assembler step. The initial rewriter handles system instructions: system calls, thread pointer accesses, and also rejects modifications to reserved registers.
llvm#191814) Initially such ops were marked Pure wrongly since they could overflow or underflow the accumulator and result in undefined behavior. Signed-off-by: Davide Grohmann <davide.grohmann@arm.com>
This relates to llvm#35980.
…1576) This introduces two macros that do the same job as the `UnwindLogMsg()`/`UnwindLogMsgVerbose()` functions but allow `formatv()`-style formatting. In addition to the benefits that `formatv()` provides, this makes `log enable -F lldb unwind` print the correct names of the methods from which the messages originate (previously, it printed the name of one of those two helper methods). I didn't replace all function calls with macros because there are too many of them for one PR. This only replaces calls whose format string contains no specifiers or only '%s' specifiers.
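Why a macro reports the right method name while a helper function does not can be shown with a minimal sketch. This is not the lldb code: `logViaHelper`, `UNWIND_LOG_SKETCH`, and the two caller functions are hypothetical, the sketch returns strings instead of writing to a log, and it does not use `formatv()`.

```cpp
#include <string>

// A helper function sees its own name in __func__, so every message is
// attributed to the helper rather than to the caller.
std::string logViaHelper(const std::string &Msg) {
  return std::string(__func__) + ": " + Msg;
}

// A macro expands at the call site, so __func__ names the calling
// function. (The exact text of __func__ is implementation-defined; common
// compilers use the unqualified function name.)
#define UNWIND_LOG_SKETCH(Msg) (std::string(__func__) + ": " + (Msg))

std::string stepFrameWithMacro() {
  return UNWIND_LOG_SKETCH("no frame");
}

std::string stepFrameWithHelper() {
  return logViaHelper("no frame");
}
```

With common compilers, the macro version attributes the message to `stepFrameWithMacro`, while the helper version attributes it to `logViaHelper`.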
This relates to llvm#35980.
This relates to llvm#35980.
This relates to llvm#35980.
This relates to llvm#35980.
This relates to llvm#35980.
This relates to llvm#35980.
This relates to llvm#35980.
This relates to llvm#35980.
Ret was uint32_t truncating the uint64_t __readlink return, and was compared against the unrelated getdents64 BufSize (1024) instead of sizeof(TargetPath) (NameMax, 4096). A truncated readlink of exactly NameMax bytes also wrote one byte past TargetPath.
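The three problems described above (truncating the return value, comparing against the wrong size, and writing the terminator out of bounds) can be avoided with one check on the full-width result. The sketch below is not the patched code; `terminateLinkTarget` is a hypothetical helper, and it assumes a raw readlink-style call that returns a 64-bit value with errors encoded as negative numbers.

```cpp
#include <cstddef>
#include <cstdint>

// Validates a raw readlink-style return value and NUL-terminates the
// buffer. Ret stays 64 bits wide, so large values are not truncated.
// Returns false on error or when the result may have been truncated.
bool terminateLinkTarget(uint64_t Ret, char *Buf, size_t BufSize) {
  // Assumption: errors come back as negative values in the 64-bit result.
  if (static_cast<int64_t>(Ret) < 0)
    return false;
  // Compare against the size of the destination buffer itself. A result of
  // exactly BufSize bytes leaves no room for the terminator.
  if (Ret >= BufSize)
    return false;
  Buf[Ret] = '\0';
  return true;
}
```

A value such as `(1ULL << 32) + 3` would look like 3 after truncation to `uint32_t`; keeping the full width lets the size check reject it.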
…lvm#192032) Replace calls to `UnwindLogMsg()`/`UnwindLogMsgVerbose()` with the `UNWIND_LOG`/`UNWIND_LOG_VERBOSE` macros introduced in 8417922. This replaces calls whose format string contains only '%d' and sometimes '%s' specifiers; the rest will be addressed in a future patch. As a result of this change, `UnwindLogMsgVerbose()` is no longer used and has been removed.
…#189657)" (llvm#191939) This reverts commit bfff42c.
Collaborator
Author
ronlieb
approved these changes
Apr 14, 2026
No description provided.