Conversation
|
I wanted to start a discussion for some time to add such a trait, but I would really keep this feature internally for now until it is sufficiently baked. I would really like this feature to make |
davebayer
left a comment
There was a problem hiding this comment.
I am not a fan of this trait. We bend C++ rules to fix poorly designed nvfp types. We would have to basically change every use of is_trivially_copyable and is_trivially_copy_constructible to this trait to make it work consistently in libcu++.
I don't think this is a good idea, we should rather insist of the nvfp types being fixed.
docs/libcudacxx/extended_api/type_traits/is_trivially_copyable_relaxed.rst
Outdated
Show resolved
Hide resolved
| Users may specialize ``cuda::is_trivially_copyable_relaxed`` for their own types whose memory representation is safe to copy | ||
| with ``memcpy`` but that the compiler does not consider trivially copyable. |
There was a problem hiding this comment.
not for the types that we care about. Said that, the user could provide an object that triggers UB. I can highlight this point in the documentation but we cannot do anything to explicitly prevent it.
Integrated and extended the tests that you point out. Everything works |
the actual issue is that the user can extend the trait to custom types. It needs to be public |
there are no plan for that. nvfp types are not trivially copyable for optimization purposes. |
This isn't going to happen and we shouldn't delude ourselves into thinking it ever will. |
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Please move this out of the PR.
This is a nontrivial piece of code and the current implementation seems really costly.
I want this in a separate PR so that we can properly review it
There was a problem hiding this comment.
Fine, I based the implementation on the code from stdexec. Eric already review it
There was a problem hiding this comment.
I am not against merging it, but I want it in a separate PR
libcudacxx/test/libcudacxx/cuda/type_traits/is_trivially_copyable.aggr.pass.cpp
Show resolved
Hide resolved
libcudacxx/test/libcudacxx/cuda/type_traits/is_trivially_copyable.pass.cpp
Show resolved
Hide resolved
Co-authored-by: Michael Schellenberger Costa <miscco@nvidia.com>
…laxed-type-traits
| #if defined(_CCCL_BUILTIN_BIT_CAST) | ||
| return _CCCL_BUILTIN_BIT_CAST(_To, __from); | ||
| #else // ^^^ _CCCL_BUILTIN_BIT_CAST ^^^ / vvv !_CCCL_BUILTIN_BIT_CAST vvv | ||
| static_assert(is_trivially_default_constructible_v<_To>, |
There was a problem hiding this comment.
C++ specification does not impose this constrain https://eel.is/c++draft/bit.cast. is_trivially_default_constructible_v is too strict. cuda::std::complex fails for example.
We only need to check if _To __temp; compiles. The best match is C++20 std::default_initializable
This comment has been minimized.
This comment has been minimized.
…elaxed-type-traits # Conflicts: # libcudacxx/include/cuda/std/__cccl/builtin.h
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
😬 CI Workflow Results🟥 Finished in 3h 41m: Pass: 92%/162 | Total: 8d 13h | Max: 3h 41m | Hits: 38%/392424See results here. |
Description
Followup of the discussion in
warp_shuffleoriginal behavior (revert #8210) #8254The PR introduces
cuda::is_trivially_copyable_relaxedto support types that are actually trivially copyable but not recognized by the C++std::is_trivially_copyabletype trait.The new trait supports:
cuda::std::array.cuda::std::pair.cuda::std::tuple.Potentially affected paths by
std::is_trivially_copyableunsupported cases:
cuda/std/__atomic/types/base.h:39cuda/std/__atomic/types/reference.h:39cuda/__memcpy_async/memcpy_async_barrier.h:61cuda/__memcpy_async/memcpy_async_tx.h:58cuda/__container/buffer.h:111cuda/__algorithm/copy.h:73cuda/__algorithm/fill.h:42cuda/std/string_view:147fallback to slower path:
cuda/std/__algorithm/copy.h:102cuda/std/__algorithm/copy_backward.h:50cuda/std/__algorithm/move.h:54cuda/std/__algorithm/move_backward.h:53cuda/std/__bit/bit_cast.h:55cub/device/dispatch/kernels/kernel_histogram.cuh:81cub/detail/uninitialized_copy.cuh:33thrust/type_traits/is_trivially_relocatable.h:213cudax/include/cuda/experimental/__kernel/kernel_ref.cuh:54Require #8347