…lvm#191395) AArch64::getSVEPseudoMap calls are visible in compile-time profiles even on non-SVE targets. I think CodeGenMapTable could be improved: it currently emits a constexpr array sorted by opcode and a hand-rolled binary search over that array. However, the AArch64ExpandPseudoInsts pass is also missing a simple check for pseudo instructions before expanding. Adding that check avoids the compile-time cost. https://llvm-compile-time-tracker.com/compare.php?from=0d42811ea4658b3e86a3801b3bc848324f8540f8&to=9e2434de84577ca1c5e6de8fe8d75c6b8e282b3f&stat=instructions%3Au
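A minimal sketch of the idea, not the actual LLVM code: a cheap "is this a pseudo?" test runs before the binary search over the sorted table, so the common non-pseudo case never pays for the lookup. Every name here (`MapEntry`, `PseudoMap`, `isPseudoOpc`, `lookupSorted`, `expandsTo`) and the contiguous pseudo opcode range are assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical opcode layout: this sketch assumes pseudo opcodes occupy one
// contiguous range. A real backend would consult its instruction flags.
constexpr std::uint16_t FirstPseudoOpc = 100;
constexpr std::uint16_t LastPseudoOpc = 199;

struct MapEntry {
  std::uint16_t PseudoOpc; // key; the table is sorted by this field
  std::uint16_t RealOpc;   // opcode the pseudo expands to
};

// Stand-in for a generated mapping table, sorted by PseudoOpc.
constexpr std::array<MapEntry, 3> PseudoMap = {{{101, 7}, {120, 9}, {150, 12}}};

// Cheap filter: a pair of comparisons instead of a table search.
inline bool isPseudoOpc(std::uint16_t Opc) {
  return Opc >= FirstPseudoOpc && Opc <= LastPseudoOpc;
}

// Hand-rolled binary search over the sorted table; returns -1 if unmapped.
inline int lookupSorted(std::uint16_t Opc) {
  assert(isPseudoOpc(Opc) && "callers are expected to filter first");
  std::size_t Lo = 0, Hi = PseudoMap.size();
  while (Lo < Hi) {
    const std::size_t Mid = Lo + (Hi - Lo) / 2;
    if (PseudoMap[Mid].PseudoOpc == Opc)
      return PseudoMap[Mid].RealOpc;
    if (PseudoMap[Mid].PseudoOpc < Opc)
      Lo = Mid + 1;
    else
      Hi = Mid;
  }
  return -1;
}

// Early-out first, table lookup only for opcodes that can be in the table.
inline int expandsTo(std::uint16_t Opc) {
  if (!isPseudoOpc(Opc))
    return -1;
  return lookupSorted(Opc);
}
```

In this sketch `expandsTo(120)` reaches the table and yields 9, while `expandsTo(5)` returns -1 without touching it.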
…lvm#191119) Instead of bailing out if the original divisor exceeds HBitWidth, allow divisors that fit in HBitWidth after removing trailing zeros. PartialRem now needs a low and high part. Shifting RemL left now needs to handle shifting into RemH. Assisted-by: Claude Sonnet 4.5
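The arithmetic that makes trailing zeros removable can be sketched on plain 64-bit integers. This is only the underlying identity, not the patch's expansion code: for `D = Narrow << K`, the remainder is `((N >> K) % Narrow) << K` combined with the low `K` bits of `N`. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Count trailing zero bits of a nonzero value with a portable loop.
inline unsigned countTrailingZeros(std::uint64_t V) {
  assert(V != 0 && "value must be nonzero");
  unsigned K = 0;
  while ((V & 1) == 0) {
    V >>= 1;
    ++K;
  }
  return K;
}

// Compute N % D by dividing only by D with its trailing zeros removed.
inline std::uint64_t remViaShiftedDivisor(std::uint64_t N, std::uint64_t D) {
  assert(D != 0 && "division by zero");
  const unsigned K = countTrailingZeros(D);
  const std::uint64_t Narrow = D >> K;                       // narrower divisor
  const std::uint64_t LowMask = (std::uint64_t{1} << K) - 1; // low K bits
  const std::uint64_t High = (N >> K) % Narrow;              // high remainder
  // The low K bits of N pass through unchanged; the high part shifts back up.
  return (High << K) | (N & LowMask);
}
```

For example, with `N = 1000` and `D = 24` (so `K = 3`, `Narrow = 3`), the high part is `125 % 3 = 2`, which shifts back to 16, matching `1000 % 24`.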
…ine-packing) (llvm#188192) A transform pass to lower flat-layout `vector.contract` operations to (a) amx.tile_mulf for BF16, or (b) amx.tile_muli for Int8 packed types via `online` packing. TODO: a follow-up patch is planned to refactor this pass and retire the `convert-vector-to-amx` pass.
Part of the work to remove trivial VP intrinsics from the RISC-V backend, see https://discourse.llvm.org/t/rfc-remove-codegen-support-for-trivial-vp-intrinsics-in-the-risc-v-backend/87999 This splits off 4 intrinsics from llvm#179622.
llvm#190397) Implement the loop-splitting `#pragma omp split` construct with the `counts` clause. Posting this PR after the revert of PR ([llvm#183261](llvm#183261)).
Changes:
1. Added `openmp/runtime/test/transform/split/lit.local.cfg`
2. Enforced ICE for `counts` clause items in `SemaOpenMP.cpp` (minor change)
3. Updated tests `split_messages.cpp`, `split_omp_fill.cpp`, `split_diag_errors.c`
4. Removed `nonconstant_count.cpp`
Similar to fp16 ldexp, we cannot create illegal types for bf16 during lowering, so we should promote instead.
…89319) Narrow the new setImplicitDefaultValue() guard so existing default bindings are preserved only for aggregate-like cases. The previous change was too broad and regressed normal zero-initialization, causing new int[10]{} to be modeled as undefined and emit a garbage-value warning instead of the expected analyzer reports.
This patch encapsulates the check for whether a `SUnit` is clustered, rather than letting it scatter across call sites. Currently there is only a single user, but more users may show up, and I think it provides a cleaner API even for that single user.
…#191300) Updates the matmul verifier to check input and output shapes are valid. Also adds some tests for verifier failures which were previously not covered.
…lvm#191768) hasScalarTail currently returns incorrect results when queried after runtime checks have been added. Generalize and harden by checking if the middle block is a predecessor of the scalar preheader.
llvm#191766) The direct base of those pointers is not a union, i.e. `getRecord()` returns `nullptr`.
This patch removes the `-aarch64-new-sme-abi=<true/false>` option (which has been defaulted to "true" since LLVM 22), and removes the Selection DAG lowering for the SME ABI. There should be no functional changes for the default path (`-aarch64-new-sme-abi=true`).
This PR adds the log2p1f16 function to shared math, along with tests for it and the Bazel changes that were missed in [f0ce26d](llvm@f0ce26d).
This PR adds the log10p1f16 function to shared math, along with tests for it and the Bazel changes that were missed in [a7d1a87](llvm@a7d1a87).
ConceptDecl doesn't have an associated template declaration, and it doesn't introduce a type either. Fixes llvm#188914
Construct the OptimizationRemarkEmitter in AArch64StackTagging on demand if it is not available and is required; this avoids computing several analyses in all pipelines.
llvm#183310) … features and test suite
This involved:
- Implementing the `__sanitizer_print_stack_trace` interface
- Adding the common signal handlers
- Correctly setting the tool name
- Caching the binary name before running
… failure (llvm#188973) DimOfReifyRankedShapedTypeOpInterface::matchAndRewrite called reifyDimOfResult via the PatternRewriter. Some implementations delegate to the coarse-grained reifyResultShapes, which creates ops for ALL dimensions (e.g. a tensor.dim) before discovering that a specific dimension is not reifiable (signalled by an empty OpFoldResult). The pattern then returned failure() once it saw the empty OpFoldResult, but the newly created ops were already in the IR. Under MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS this triggered "pattern returned failure but IR did change". Fix: record the op immediately before the matched dim op, so we can identify ops inserted during the reification attempt. If reification returns an empty (unreifiable) OpFoldResult, erase those newly created ops before returning failure, restoring the IR to its original state. Assisted-by: Claude Code
…m#188968) Two bugs were introduced/revealed by MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS:
1. `ACCLoopTilingImpl::matchAndRewrite` returned `success()` for loops with no tile values, triggering "pattern returned success but IR did not change". Fixed by returning `failure()` instead.
2. `moveOpsAndReplaceIVs` moved ops between blocks via `splice()` and updated operands via `replaceAllUsesInRegionWith()` without notifying the rewriter. This caused "operation fingerprint changed" errors since the moved ops' parent op and operands changed without `startOpModification`/`finalizeOpModification`. Fixed by wrapping all moved ops (and their nested ops) with rewriter modification notifications.

Assisted-by: Claude Code
This helps to not lower, improving the number of nodes that we expand into and improving the quality of the generated code. Originally a part of llvm#177158 by Ryan Cowan
…vm#189554) Replace `dyn_cast<IntrinsicInst> + getIntrinsicID()` chains with PatternMatch combinators where applicable
…eger sampled type (llvm#190742) Fix spirv-val failure:
```
error: line 20: Capability Int64ImageEXT is required when using Sampled Type of 64-bit int
  %spirv_Image = OpTypeImage %ulong 3D 0 0 0 0 Unknown ReadOnly
```
Detect when OpTypeImage uses a 64-bit integer as its sampled type and automatically add the Int64ImageEXT capability and the SPV_EXT_shader_image_int64 extension.

Related to llvm#190736
The documentation incorrectly stated that this metadata enables/disables vectorization, but it actually controls predication. Also clarify that enabling predication implicitly enables vectorization.
Use `{{.*}}` instead of `i64` for `memset` size type, as 32-bit
platforms use `i32`.
…rations (llvm#188743) UpdateVCEPass only queried capabilities via QueryCapabilityInterface and SPIRV type capabilities, but did not check capabilities implied by decoration attributes on ops. Specifically, the DescriptorSet and Binding decorations—represented by the `binding` and `descriptor_set` attributes on `spirv.GlobalVariable`—require the `Shader` capability per the SPIR-V spec Decoration table, but this was not deduced. Add an explicit check in UpdateVCEPass: when a `spirv.GlobalVariable` has a `binding` or `descriptor_set` attribute, require the `Shader` capability. Part of llvm#168357 Assisted-by: Claude Code
Fixes unused-but-set globals on non-Unix paths in kmp_alloc.cpp
… unit-test compiles (llvm#191564)
1. `check_cxx_compiler_flag` marked `COMPILER_RT_HAS_EXTERNAL_FLAG` true for clang-cl, so `/experimental:external` and `/external:anglebrackets` were passed and clang-cl warned they were unused. Now only probe with real MSVC (not Clang); set the flag false otherwise; rely on that in asan/interception/ubsan.
2. Custom compile lines for unit tests didn’t get C++17, so headers hit `-Wc++17-extensions` (e.g. `constexpr if` / message-less `static_assert` in FuzzedDataProvider / asan_fake_stack). Now append `-std=c++${CMAKE_CXX_STANDARD}` or `-std=c++17` for C++ sources in `clang_compile()` for both standalone and non-standalone builds.
Previously clang-cl was generating "warning: comparison of integers of different signs".
Improve error handling with the following changes:
- If `kvm_open2()` fails in `DoLoadCore()`, log an error with the message obtained from the function.
- Return false or continue the loop if a memory read fails.
- Replace `!error.Success()` with `error.Fail()` for readability (NFC).

Signed-off-by: Minsoo Choo <minsoochoo0122@proton.me>
…-type te…" (llvm#191798) Reverts llvm#189510. It crashes lldb on certain types of debug info. See llvm#189510 (comment) for more details.
…n assemblyFormat (llvm#188726) Using an optional attribute directly in assemblyFormat (e.g., `$attr attr-dict`) without wrapping it in an optional group causes the generated printer to call `printAttribute` with a null `Attribute` when the attribute is absent. This leads to a crash in the alias initializer when it calls `getAlias` on a null attribute. Add a validation check in `OpFormatParser::verifyAttributes` that detects this pattern and emits a diagnostic error with a helpful note pointing users to the correct `($attr^)?` syntax. Fixes llvm#58064 Assisted-by: Claude Code
…7281) Add binary arithmetic multiplication, division, and remainder to DIL. This patch also passes DILMode to the parser to check if binary multiplication is allowed by the mode. This cannot be done in the lexer alone, because it allows token `*` as a dereference operator for legacy mode, but that token could also be a binary multiplication allowed only in full mode.
…sa) (llvm#188741) The "riscv-isa" LLVM module flag stores its value as an MDTuple containing MDStrings (e.g. `!{!"rv64i2p1", !"m2p0"}`). Previously, this fell through the unrecognized-key path in `convertModuleFlagValueFromMDTuple`, which emitted a warning and dropped the flag during import. This patch adds generic handling for MDTuples whose operands are all MDStrings:
- Import: convert to `ArrayAttr<StringAttr>` in `convertModuleFlagValueFromMDTuple`
- Export: convert `ArrayAttr<StringAttr>` back to an MDTuple of MDStrings in `convertModuleFlagValue`, enabling a lossless round-trip
- Verifier: allow `ArrayAttr<StringAttr>` as a valid `ModuleFlagAttr` value for keys not otherwise handled by specific verifier branches

Fixes llvm#188122
Assisted-by: Claude Code
…part 43) (llvm#191753) Tests converted from test/Lower/Intrinsics: transpose.f90, transpose_opt.f90, trim.f90, ubound.f90, ubound01.f90
llvm#191385) …n test suite Secondary PR to enable the tests now that llvm#183310 enables the features.
This patch adds the `strnlen_s` function from Annex K. In order to reduce duplication between `strnlen` and `strnlen_s`, the common logic has been extracted to a new internal function which both now call. In addition to the function definition, the patch adds a unit test and a fuzzing test.
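The shared-helper structure described above might look roughly like the following. This is a sketch, not the libc patch itself: `boundedLength`, `sketch_strnlen`, and `sketch_strnlen_s` are hypothetical names. The only behavior relied on is the Annex K rule that `strnlen_s` returns 0 for a null pointer and otherwise behaves like a bounded length scan.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical shared helper: count characters before the terminator,
// reading at most MaxLen characters.
inline std::size_t boundedLength(const char *S, std::size_t MaxLen) {
  std::size_t I = 0;
  while (I < MaxLen && S[I] != '\0')
    ++I;
  return I;
}

// strnlen-like wrapper: the pointer must be valid.
inline std::size_t sketch_strnlen(const char *S, std::size_t MaxLen) {
  assert(S != nullptr && "strnlen requires a non-null pointer");
  return boundedLength(S, MaxLen);
}

// strnlen_s-like wrapper: a null pointer yields 0 instead of being an error.
inline std::size_t sketch_strnlen_s(const char *S, std::size_t MaxSize) {
  return S == nullptr ? 0 : boundedLength(S, MaxSize);
}
```

Both wrappers differ only in how they treat a null pointer, which is why a single internal scan function can serve them both.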
…on (llvm#191792) OpReturnValue with a pointer type is invalid in the SPIR-V Logical addressing model (Vulkan). The functions in the test return OpAccessChain results, which are pointers.

Related to llvm#190736
…hLR (llvm#175991) When users request branch protection with PAuthLR on targets that do not support the PAuthLR instructions, the PAUTH_EPILOGUE falls back to using hint-space instructions. This fallback sequence uses X16 as a temporary register, but X16 was not listed in the clobber set. Because Speculative Load Hardening uses X16, this omission made SLH incompatible with this PAUTH_EPILOGUE path. Mark X16 as clobbered so the compiler does not assume X16 is preserved across the epilogue, restoring compatibility with Speculative Load Hardening and avoiding incorrect register liveness assumptions. The clobber is added in C++ rather than TableGen, as X16 is only clobbered when PAuthLR is requested as a branch protection variation and should not be treated as clobbered unconditionally.
…edure pointers. (llvm#183268) Fixes llvm#177505. This patch updates an existing external procedure symbol with the correct function signature and argument attributes, so it can be safely used as a proc_target without signature conflicts. --------- Co-authored-by: jeanPerier <jean.perier.polytechnique@gmail.com>
No description provided.