cuda::is_trivially_copyable
#8265
Open
fbusato wants to merge 24 commits into NVIDIA:main from fbusato:relaxed-type-traits
Changes from 12 commits
ea3f956  is_trivially_copyable_relaxed  (fbusato)
9dd60db  add documentation  (fbusato)
179a81b  describe custom specialization  (fbusato)
2c33c2e  move to internal function  (fbusato)
fbade34  address padding  (fbusato)
deb622c  fix clang  (fbusato)
a553bb6  do not handle volatile  (fbusato)
de42a45  unused-local-typedef  (fbusato)
4e7873d  fix clang pragma  (fbusato)
6e5021f  simplify conditions  (fbusato)
8ff54f4  improve documentation  (fbusato)
c4c1504  fix operator==  (fbusato)
e603a96  Update docs/libcudacxx/extended_api/type_traits/is_trivially_copyable…  (fbusato)
cde3d1e  Update docs/libcudacxx/extended_api/type_traits/is_trivially_copyable…  (fbusato)
c20fb89  add recursive struct check  (fbusato)
35f9d15  add comment  (fbusato)
4db130a  Merge branch 'relaxed-type-traits' of github.com:fbusato/cccl into re…  (fbusato)
1f6254c  fix nvrtc  (fbusato)
ab184ff  rename to cuda::is_trivially_copyable  (fbusato)
6c5f19e  update documentation  (fbusato)
79f4310  test nvfp only in CUDA >= 12.3  (fbusato)
4506e40  update bit_cast implementation  (fbusato)
80b09fa  add documentation  (fbusato)
cd776c9  fix compile warnings/errors  (fbusato)
110 changes: 110 additions & 0 deletions
docs/libcudacxx/extended_api/type_traits/is_trivially_copyable_relaxed.rst
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,110 @@ | ||
| .. _libcudacxx-extended-api-type_traits-is_trivially_copyable_relaxed: | ||
|
|
||
| ``cuda::is_trivially_copyable_relaxed`` | ||
| ======================================= | ||
|
|
||
| Defined in the ``<cuda/type_traits>`` header. | ||
|
|
||
| .. code:: cuda | ||
|
|
||
| namespace cuda { | ||
|
|
||
| template <typename T> | ||
| constexpr bool is_trivially_copyable_relaxed_v = /* see below */; | ||
|
|
||
| template <typename T> | ||
| using is_trivially_copyable_relaxed = cuda::std::bool_constant<is_trivially_copyable_relaxed_v<T>>; | ||
|
|
||
| } // namespace cuda | ||
|
|
||
| ``cuda::is_trivially_copyable_relaxed_v<T>`` is a variable template that extends ``cuda::std::is_trivially_copyable`` to also recognize CUDA extended floating-point scalar and vector types as trivially copyable. | ||
|
|
||
| A type ``T`` satisfies ``cuda::is_trivially_copyable_relaxed`` if any of the following holds: | ||
|
|
||
| - ``T`` is trivially copyable. | ||
| - ``T`` is an extended floating-point scalar type (e.g. ``__half``, ``__nv_bfloat16``, ``__nv_fp8_e4m3``). | ||
| - ``T`` is an extended floating-point vector type (e.g. ``__half2``, ``__nv_bfloat162``, ``__nv_fp8x2_e4m3``). | ||
|
|
||
| The trait also propagates through composite types: | ||
|
|
||
| - C-style arrays: ``T[N]`` and ``T[]`` are relaxed trivially copyable when ``T`` is. | ||
| - ``cuda::std::array<T, N>``: relaxed trivially copyable when ``T`` is. | ||
| - ``cuda::std::pair<T1, T2>``: relaxed trivially copyable when both ``T1`` and ``T2`` are and the object has no padding. | ||
| - ``cuda::std::tuple<Ts...>``: relaxed trivially copyable when all ``Ts...`` are and the object has no padding. | ||
|
|
||
| ``const``-qualified types are treated the same as their unqualified counterparts, while the result for ``volatile``-qualified types is compiler dependent. | ||
|
|
||
| .. note:: | ||
|
|
||
| The type trait cannot determine whether a structure (``struct`` or ``class``) contains extended floating-point types, so it reports such a structure as relaxed trivially copyable only if the compiler already considers it trivially copyable. Otherwise, the user must manually specialize the type trait for the type. | ||
|
|
||
| Custom Specialization | ||
| --------------------- | ||
|
|
||
| Users may specialize ``cuda::is_trivially_copyable_relaxed_v`` for types whose semantics allow copying with ``memcpy``, but which the compiler does not consider to be trivially copyable. | ||
|
|
||
| A `trivially copyable <https://en.cppreference.com/w/cpp/language/classes.html>`__ class is a class that | ||
|
|
||
| - has at least one eligible copy constructor, move constructor, copy assignment operator, or move assignment operator, | ||
| - each eligible copy constructor is trivial | ||
| - each eligible move constructor is trivial | ||
| - each eligible copy assignment operator is trivial | ||
| - each eligible move assignment operator is trivial, and | ||
| - has a non-deleted trivial destructor. | ||
|
|
||
| .. warning:: | ||
|
|
||
| The user is responsible for ensuring that the type can actually be copied bytewise (for example with ``memcpy``) when specializing this variable template. Otherwise, the behavior is undefined. | ||
|
|
||
| A common case is a type that wraps extended floating-point fields and provides user-defined copy operations | ||
| solely to add ``__host__ __device__`` annotations: | ||
|
|
||
| .. code:: cuda | ||
|
|
||
| struct HalfWrapper { | ||
| __half value; | ||
| }; | ||
|
|
||
| struct NonTriviallyCopyable { | ||
| __host__ __device__ NonTriviallyCopyable(const NonTriviallyCopyable&) {} | ||
| }; | ||
|
|
||
| // Specializing the variable template | ||
| template <> | ||
| constexpr bool cuda::is_trivially_copyable_relaxed_v<HalfWrapper> = true; | ||
|
|
||
| template <> | ||
| constexpr bool cuda::is_trivially_copyable_relaxed_v<NonTriviallyCopyable> = true; | ||
|
|
||
| static_assert(cuda::is_trivially_copyable_relaxed_v<HalfWrapper>); | ||
| static_assert(cuda::is_trivially_copyable_relaxed_v<NonTriviallyCopyable>); | ||
|
|
||
| Examples | ||
| -------- | ||
|
|
||
| .. code:: cuda | ||
|
|
||
| #include <cuda/type_traits> | ||
| #include <cuda/std/array> | ||
| #include <cuda/std/tuple> | ||
| #include <cuda/std/utility> | ||
|
|
||
| #include <cuda_fp16.h> | ||
|
|
||
| // Standard trivially copyable types | ||
| static_assert(cuda::is_trivially_copyable_relaxed_v<int>); | ||
| static_assert(cuda::is_trivially_copyable_relaxed_v<float>); | ||
|
|
||
| // Extended floating-point types | ||
| static_assert(cuda::is_trivially_copyable_relaxed_v<__half>); | ||
| static_assert(cuda::is_trivially_copyable_relaxed_v<__nv_bfloat16>); | ||
| static_assert(cuda::is_trivially_copyable_relaxed_v<__half2>); | ||
|
|
||
| // Padding-free composite types containing extended floating-point types | ||
| static_assert(cuda::is_trivially_copyable_relaxed_v<__half[4]>); | ||
| static_assert(cuda::is_trivially_copyable_relaxed_v<cuda::std::array<__half, 4>>); | ||
| static_assert(cuda::is_trivially_copyable_relaxed_v<cuda::std::pair<__half, __half>>); | ||
| static_assert(cuda::is_trivially_copyable_relaxed_v<cuda::std::tuple<__half, __half>>); | ||
|
|
||
| // Composites with padding are not relaxed trivially copyable | ||
| static_assert(!cuda::is_trivially_copyable_relaxed_v<cuda::std::pair<__half, int>>); | ||
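The documented rules (fallback to the standard trait, propagation through arrays and pairs, the no-padding condition, and user opt-in) can be sketched in plain C++17 without the CUDA toolkit. This is only an illustration under stated assumptions, not the implementation in this PR: `sketch::fake_half`, `sketch::is_relaxed_v`, `sketch::copy_bytes`, `sketch::roundtrip_bits`, and `sketch::self_check` are hypothetical names, `fake_half` stands in for a type such as `__half`, and the `sizeof`-based padding test is one possible way to express "the object has no padding".

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>
#include <utility>

namespace sketch {

// Hypothetical stand-in for an extended floating-point type: the
// user-provided copy constructor makes it non-trivially copyable for the
// compiler, although copying its bytes preserves its value.
struct fake_half {
  unsigned short bits = 0;
  fake_half() = default;
  fake_half(const fake_half& other) : bits(other.bits) {}
  fake_half& operator=(const fake_half&) = default;
};

// Primary template: fall back to the standard trait, ignoring const.
template <typename T>
constexpr bool is_relaxed_v = std::is_trivially_copyable_v<std::remove_const_t<T>>;

// Opt-in for the stand-in type, in the style of a user specialization.
template <>
constexpr bool is_relaxed_v<fake_half> = true;

// Propagation through arrays.
template <typename T, std::size_t N>
constexpr bool is_relaxed_v<T[N]> = is_relaxed_v<T>;

template <typename T, std::size_t N>
constexpr bool is_relaxed_v<std::array<T, N>> = is_relaxed_v<T>;

// Propagation through pairs, with an assumed padding test: the pair must be
// exactly as large as its two members together.
template <typename T1, typename T2>
constexpr bool is_relaxed_v<std::pair<T1, T2>> =
    is_relaxed_v<T1> && is_relaxed_v<T2> &&
    sizeof(std::pair<T1, T2>) == sizeof(T1) + sizeof(T2);

// A bytewise copy that is only enabled when the trait allows it.
template <typename T>
void copy_bytes(T& dst, const T& src) {
  static_assert(is_relaxed_v<T>, "T must be relaxed trivially copyable");
  std::memcpy(static_cast<void*>(&dst), static_cast<const void*>(&src), sizeof(T));
}

// Copies a value through copy_bytes and returns the bits that arrive.
inline unsigned short roundtrip_bits(unsigned short bits) {
  fake_half src;
  src.bits = bits;
  fake_half dst;
  copy_bytes(dst, src);
  return dst.bits;
}

inline void self_check() {
  assert(roundtrip_bits(0x3c00) == 0x3c00);
}

static_assert(!std::is_trivially_copyable_v<fake_half>);
static_assert(is_relaxed_v<fake_half>);
static_assert(is_relaxed_v<const int>);
static_assert(is_relaxed_v<std::array<fake_half, 4>>);
static_assert(is_relaxed_v<std::pair<fake_half, fake_half>>);

} // namespace sketch
```

The last two assertions rely on the layout that mainstream standard library implementations use for `std::pair`; the language does not guarantee it. With a 4-byte `int`, `std::pair<fake_half, int>` contains two bytes of padding, so the sketch rejects it, which mirrors the final assertion in the documentation example above.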
72 changes: 72 additions & 0 deletions
libcudacxx/include/cuda/__type_traits/is_trivially_copyable_relaxed.h
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,72 @@ | ||
| //===----------------------------------------------------------------------===// | ||
| // | ||
| // Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. | ||
| // See https://llvm.org/LICENSE.txt for license information. | ||
| // SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception | ||
| // SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. | ||
| // | ||
| //===----------------------------------------------------------------------===// | ||
|
|
||
| #ifndef __CUDA__TYPE_TRAITS_IS_TRIVIALLY_COPYABLE_RELAXED_H | ||
| #define __CUDA__TYPE_TRAITS_IS_TRIVIALLY_COPYABLE_RELAXED_H | ||
|
|
||
| #include <cuda/std/detail/__config> | ||
|
|
||
| #if defined(_CCCL_IMPLICIT_SYSTEM_HEADER_GCC) | ||
| # pragma GCC system_header | ||
| #elif defined(_CCCL_IMPLICIT_SYSTEM_HEADER_CLANG) | ||
| # pragma clang system_header | ||
| #elif defined(_CCCL_IMPLICIT_SYSTEM_HEADER_MSVC) | ||
| # pragma system_header | ||
| #endif // no system header | ||
|
|
||
| #include <cuda/__type_traits/is_vector_type.h> | ||
| #include <cuda/std/__cstddef/types.h> | ||
| #include <cuda/std/__fwd/array.h> | ||
| #include <cuda/std/__fwd/pair.h> | ||
| #include <cuda/std/__fwd/tuple.h> | ||
| #include <cuda/std/__type_traits/integral_constant.h> | ||
| #include <cuda/std/__type_traits/is_extended_floating_point.h> | ||
| #include <cuda/std/__type_traits/is_trivially_copyable.h> | ||
| #include <cuda/std/__type_traits/remove_const.h> | ||
|
|
||
| #include <cuda/std/__cccl/prologue.h> | ||
|
|
||
| _CCCL_BEGIN_NAMESPACE_CUDA | ||
|
|
||
| //! Users are allowed to specialize this variable template for their own types | ||
| template <typename _Tp> | ||
| constexpr bool is_trivially_copyable_relaxed_v = | ||
| ::cuda::std::is_trivially_copyable_v<::cuda::std::remove_const_t<_Tp>> | ||
| || ::cuda::std::__is_extended_floating_point_v<::cuda::std::remove_const_t<_Tp>> | ||
| #if _CCCL_HAS_CTK() | ||
| || ::cuda::is_extended_fp_vector_type_v<::cuda::std::remove_const_t<_Tp>> | ||
| #endif // _CCCL_HAS_CTK() | ||
| ; | ||
|
|
||
| template <typename _Tp> | ||
| constexpr bool is_trivially_copyable_relaxed_v<_Tp[]> = is_trivially_copyable_relaxed_v<_Tp>; | ||
|
|
||
| template <typename _Tp, ::cuda::std::size_t _Size> | ||
| constexpr bool is_trivially_copyable_relaxed_v<_Tp[_Size]> = is_trivially_copyable_relaxed_v<_Tp>; | ||
|
|
||
| template <typename _Tp, ::cuda::std::size_t _Size> | ||
| constexpr bool is_trivially_copyable_relaxed_v<::cuda::std::array<_Tp, _Size>> = is_trivially_copyable_relaxed_v<_Tp>; | ||
|
|
||
| template <typename _T1, typename _T2> | ||
| constexpr bool is_trivially_copyable_relaxed_v<::cuda::std::pair<_T1, _T2>> = | ||
| is_trivially_copyable_relaxed_v<_T1> && is_trivially_copyable_relaxed_v<_T2>; | ||
|
|
||
| template <typename... _Ts> | ||
| constexpr bool is_trivially_copyable_relaxed_v<::cuda::std::tuple<_Ts...>> = | ||
| (is_trivially_copyable_relaxed_v<_Ts> && ...); | ||
|
|
||
| // defined as alias so users cannot specialize it (they should specialize the variable template instead) | ||
| template <typename _Tp> | ||
| using is_trivially_copyable_relaxed = ::cuda::std::bool_constant<is_trivially_copyable_relaxed_v<_Tp>>; | ||
|
|
||
| _CCCL_END_NAMESPACE_CUDA | ||
|
|
||
| #include <cuda/std/__cccl/epilogue.h> | ||
|
|
||
| #endif // __CUDA__TYPE_TRAITS_IS_TRIVIALLY_COPYABLE_RELAXED_H | ||
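The header exposes the customization point as a variable template and derives the class-style trait from it through an alias template. An alias template cannot be specialized, so a user specialization of the variable is the only entry point and the alias always agrees with it. A minimal standard C++17 sketch of that design follows; the names `alias_sketch::relaxed_v`, `alias_sketch::relaxed`, and `alias_sketch::wrapper` are hypothetical and are not part of libcu++.

```cpp
#include <type_traits>

namespace alias_sketch {

// Hypothetical user type with a user-provided copy constructor, which makes
// it non-trivially copyable for the compiler.
struct wrapper {
  wrapper() = default;
  wrapper(const wrapper&) {}
};

// Customization point: a variable template that users may specialize.
template <typename T>
constexpr bool relaxed_v = std::is_trivially_copyable_v<std::remove_const_t<T>>;

// Derived trait: an alias template, which cannot be specialized.
template <typename T>
using relaxed = std::bool_constant<relaxed_v<T>>;

// User opt-in through the variable template only.
template <>
constexpr bool relaxed_v<wrapper> = true;

static_assert(!std::is_trivially_copyable_v<wrapper>);
static_assert(relaxed<wrapper>::value);                      // alias follows the variable
static_assert(std::is_same_v<relaxed<int>, std::true_type>); // plain bool_constant
static_assert(relaxed<const int>::value);                    // const is ignored

} // namespace alias_sketch
```

Because the specialization only matches the exact type `wrapper`, `relaxed_v<const wrapper>` still falls through to the primary template in this sketch.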