Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Atomics backend refactor #1631

Merged
merged 74 commits into from
May 10, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
74 commits
Select commit Hold shift + click to select a range
32a0227
Delete <cuda/std/atomic> header
wmaxey Apr 11, 2024
4be887d
Move atomic from libcxx to top-level
wmaxey Apr 11, 2024
b36fec6
Move PTX backends from libcxx to <cuda/std/__atomic/...>
wmaxey Apr 11, 2024
52a60bb
Delete remaining atomics backends. Move MSVC backend
wmaxey Apr 11, 2024
76294d1
First pass at making atomic use new backends
wmaxey Apr 13, 2024
a780c26
Change atomic_storage operator()() to get()
wmaxey Apr 18, 2024
217527d
Fixup: Change desired of compexch to accept by value.
wmaxey Apr 18, 2024
eaaa670
Fix merge conflicts (LIBCUDACXX->CCCL)
wmaxey Apr 18, 2024
452fc3b
Fix another merge conflict (LIBCUDACXX->CCCL)
wmaxey Apr 18, 2024
91f8b11
Simplify tag dispatch in the atomic backend
wmaxey Apr 18, 2024
0e6c0c0
Make tests work when full path is specified to lit.
wmaxey Apr 23, 2024
a370a02
Update barrier, latch, and semaphore, to use new atomic_impl.
wmaxey Apr 26, 2024
bf801ce
Make changes to atomic work.
wmaxey Apr 26, 2024
64b31af
Rearrange headers and update latch/barrier.
wmaxey May 2, 2024
a625d3f
Update codegen to reflect new header layout.
wmaxey May 2, 2024
4a4782b
Make platform.h define `LIBCUDACXX_ATOMIC_BLAH_LOCK_FREE`.
wmaxey May 2, 2024
fc97437
Fix missing <cstdint> in generated ptx file.
wmaxey May 3, 2024
f91a7c1
`__cuda_std__` mode does not require use of host atomics checks.
wmaxey May 3, 2024
8fb4c6e
Fix missing `_If` in types.h.
wmaxey May 3, 2024
c4546f5
Fix missing <cstdint> in derived PTX file.
wmaxey May 3, 2024
077e3d3
Remove uneeded headers from base.h.
wmaxey May 3, 2024
b38a43a
Fix type mixup in `__atomic_wait`.
wmaxey May 3, 2024
6e4e947
Change heterogeneous tests to permutate over H/D launcher combinations.
wmaxey May 3, 2024
936daaf
Change tests to use `validate_pinned` API.
wmaxey May 3, 2024
6a76eb5
Fix a couple issues with `__atomic_underlying_t` in __atomics/types
wmaxey May 3, 2024
639f41f
Merge branch 'main' into fea/atomic_refactor_simplify
wmaxey May 3, 2024
2b3ed47
[pre-commit.ci] auto code formatting
pre-commit-ci[bot] May 3, 2024
5164484
Fix `<atomic>` header include guard.
wmaxey May 3, 2024
9241238
Move thread_count trait around and remove unused sink.
wmaxey May 3, 2024
835cea3
Fix mistakes from merging clang-format changes.
wmaxey May 3, 2024
bff1381
[pre-commit.ci] auto code formatting
pre-commit-ci[bot] May 3, 2024
608e63c
Merge branch 'main' into fea/atomic_refactor_simplify
wmaxey May 8, 2024
4a478b9
Merge branch 'main' into fea/atomic_refactor_simplify
wmaxey May 8, 2024
198490e
Add system header guards to every new `__atomic` header
wmaxey May 8, 2024
321706f
Delete `<atomic>` header synopsis.
wmaxey May 8, 2024
a5e3d88
Fix push/pop macros in `<atomic>`.
wmaxey May 8, 2024
836addb
`ATOMIC_VAR_INIT->LIBCUDACXX_ATOMIC_VAR_INIT`
wmaxey May 8, 2024
137b854
Make `<cuda/atomic>` tests include the correct header.
wmaxey May 8, 2024
38f188a
Fix typing with volatile atomic types.
wmaxey May 8, 2024
634dfae
Include correct header for `cuda::atomic`.
wmaxey May 8, 2024
e417e9b
Fix underlying_t in `notify_wait.h`
wmaxey May 8, 2024
90a182c
Revert using `volatile` in latch.
wmaxey May 8, 2024
8929189
Make helpers more useful in tests.
wmaxey May 8, 2024
6cdbf6b
Prevent non-CUDA compilers from seeing PTX.
wmaxey May 9, 2024
a886bae
Make the MSVC atomic header a little more friendly.
wmaxey May 9, 2024
38d5f36
Make the derived PTX header only visible to CUDA compilers.
wmaxey May 9, 2024
80978c3
Remove the defaulted scope specifier on the atomic type layer.
wmaxey May 9, 2024
52bbcf1
Remove the defaulted scope specifier from the atomics API layer.
wmaxey May 9, 2024
8e9cd55
Fix missing cassert in several tests.
wmaxey May 9, 2024
ef66a51
Revert mistaken `LIBCUDACXX_ATOMIC_FLAG_INIT` change.
wmaxey May 9, 2024
83d3414
Fix bad atomic alignment errors.
wmaxey May 9, 2024
55cd1ec
[pre-commit.ci] auto code formatting
pre-commit-ci[bot] May 9, 2024
b3b8e60
Make internal owned memory atomic APIs have a default ctor.
wmaxey May 9, 2024
50e37d7
Fully qualify the atomics APIs in the cuda/atomic header.
wmaxey May 9, 2024
1cdffdb
Add missing type_traits to host.h
wmaxey May 9, 2024
082137f
Use `_LIBCUDACXX_INLINE_VISIBILITY` for API functions.
wmaxey May 9, 2024
45a75d2
Reorder derived PTX functions attribute declarations.
wmaxey May 9, 2024
aaf5e94
traits fixups in `__atomic/types`
wmaxey May 9, 2024
a015830
Default ctors and sprinkle noexcept around on some `__atomic/types` A…
wmaxey May 9, 2024
8d90f56
Apply suggestions to common.h.
wmaxey May 9, 2024
63b6f5d
Remove full namespace qualifier in atomic storage trait.
wmaxey May 9, 2024
6620229
modernization fixes to order.h.
wmaxey May 9, 2024
e00ebf3
Move includes cuda/atomic.h and cuda/barrier.h.
wmaxey May 9, 2024
310f06a
Adjust header error block and add missing includes in `<cuda/std/atom…
wmaxey May 9, 2024
1398c3c
Add missing system header block in cuda/atomic
wmaxey May 10, 2024
678006c
Fix invalid use of typename.
wmaxey May 10, 2024
b3a24fe
`_LIBCUDACXX_TRAITS->_CCCL_TRAIT`
wmaxey May 10, 2024
923f61e
[pre-commit.ci] auto code formatting
pre-commit-ci[bot] May 10, 2024
3ade90a
Fix return type of host atomics.
wmaxey May 10, 2024
b297715
Fix cassert missing from generated atomic header.
wmaxey May 10, 2024
f40a59f
Fix visibility of host atomics in NVRTC build.
wmaxey May 10, 2024
b841304
Add more tests to bad_atomic_alignment.pass.cpp
wmaxey May 10, 2024
52788ff
Fix alignment warnings in host compare_exchange layer.
wmaxey May 10, 2024
044b350
[pre-commit.ci] auto code formatting
pre-commit-ci[bot] May 10, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
10 changes: 5 additions & 5 deletions libcudacxx/codegen/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -19,8 +19,8 @@ target_compile_features(

add_dependencies(libcudacxx.atomics.codegen codegen)

set(atomic_generated_output "${libcudacxx_BINARY_DIR}/codegen/atomic_cuda_generated.h")
set(atomic_install_location "${libcudacxx_SOURCE_DIR}/include/cuda/std/detail/libcxx/include/support/atomic")
set(atomic_generated_output "${libcudacxx_BINARY_DIR}/codegen/cuda_ptx_generated.h")
set(atomic_install_location "${libcudacxx_SOURCE_DIR}/include/cuda/std/__atomic/functions")

add_custom_target(
libcudacxx.atomics.codegen.execute
Expand All @@ -32,13 +32,13 @@ add_dependencies(libcudacxx.atomics.codegen libcudacxx.atomics.codegen.execute)

add_custom_target(
libcudacxx.atomics.codegen.install
COMMAND ${CMAKE_COMMAND} -E copy "${atomic_generated_output}" "${atomic_install_location}/atomic_cuda_generated.h"
BYPRODUCTS "${atomic_install_location}/atomic_cuda_generated.h"
COMMAND ${CMAKE_COMMAND} -E copy "${atomic_generated_output}" "${atomic_install_location}/cuda_ptx_generated.h"
BYPRODUCTS "${atomic_install_location}/cuda_ptx_generated.h"
)

add_dependencies(libcudacxx.atomics.codegen.install libcudacxx.atomics.codegen.execute)

add_test(
NAME libcudacxx.atomics.codegen.diff
COMMAND ${CMAKE_COMMAND} -E compare_files "${atomic_install_location}/atomic_cuda_generated.h" "${atomic_generated_output}"
COMMAND ${CMAKE_COMMAND} -E compare_files "${atomic_install_location}/cuda_ptx_generated.h" "${atomic_generated_output}"
)
39 changes: 35 additions & 4 deletions libcudacxx/codegen/codegen.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -66,7 +66,7 @@ int main()

std::vector<std::string> cv_qualifier{"volatile ", ""};

std::ofstream out("atomic_cuda_generated.h");
std::ofstream out("cuda_ptx_generated.h");

out << R"XXX(//===----------------------------------------------------------------------===//
//
Expand All @@ -78,8 +78,36 @@ int main()
//
//===----------------------------------------------------------------------===//

// This is a autogenerated file, we want to ensure that it contains exactly the contentes we want to generate
// This is an autogenerated file, we want to ensure that it contains exactly the contents we want to generate
// clang-format off

#ifndef _LIBCUDACXX___ATOMIC_FUNCTIONS_CUDA_PTX_GENERATED_H
#define _LIBCUDACXX___ATOMIC_FUNCTIONS_CUDA_PTX_GENERATED_H

#include <cuda/std/detail/__config>

#if defined(_CCCL_IMPLICIT_SYSTEM_HEADER_GCC)
# pragma GCC system_header
#elif defined(_CCCL_IMPLICIT_SYSTEM_HEADER_CLANG)
# pragma clang system_header
#elif defined(_CCCL_IMPLICIT_SYSTEM_HEADER_MSVC)
# pragma system_header
#endif // no system header

#include <cuda/std/cassert>
#include <cuda/std/cstdint>
wmaxey marked this conversation as resolved.
Show resolved Hide resolved

#include <cuda/std/__type_traits/enable_if.h>
#include <cuda/std/__type_traits/is_signed.h>
#include <cuda/std/__type_traits/is_unsigned.h>

#include <cuda/std/__atomic/scopes.h>
#include <cuda/std/__atomic/order.h>

_LIBCUDACXX_BEGIN_NAMESPACE_STD

#if defined(_CCCL_CUDA_COMPILER)

)XXX";

auto scopenametag = [&](auto scope) {
Expand Down Expand Up @@ -302,11 +330,11 @@ int main()
{
out << "template<class _Type, _CUDA_VSTD::__enable_if_t<sizeof(_Type)==" << sz / 8 << ", int> = 0>\n";
out << "_CCCL_DEVICE bool __atomic_compare_exchange_cuda(" << cv
<< "_Type *__ptr, _Type *__expected, const _Type *__desired, bool, int __success_memorder, int "
<< "_Type *__ptr, _Type *__expected, const _Type __desired, bool, int __success_memorder, int "
"__failure_memorder, "
<< scopenametag(s.first) << ") {\n";
out << " uint" << sz << "_t __tmp = 0, __old = 0, __old_tmp;\n";
out << " memcpy(&__tmp, __desired, " << sz / 8 << ");\n";
out << " memcpy(&__tmp, &__desired, " << sz / 8 << ");\n";
out << " memcpy(&__old, __expected, " << sz / 8 << ");\n";
out << " __old_tmp = __old;\n";
out << " NV_DISPATCH_TARGET(\n";
Expand Down Expand Up @@ -503,6 +531,9 @@ int main()
}
}

out << "\n#endif // defined(_CCCL_CUDA_COMPILER)\n";
out << "\n_LIBCUDACXX_END_NAMESPACE_STD\n";
out << "\n#endif // _LIBCUDACXX___ATOMIC_FUNCTIONS_CUDA_PTX_GENERATED_H\n";
out << "\n// clang-format on\n";

return 0;
Expand Down
6 changes: 3 additions & 3 deletions libcudacxx/examples/rtc_example.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -50,11 +50,11 @@ template<class T> static constexpr T min(T a, T b) { return a < b ? a : b; }

struct trie {
struct ref {
cuda::std::atomic<trie*> ptr = ATOMIC_VAR_INIT(nullptr);
cuda::std::atomic<trie*> ptr = LIBCUDACXX_ATOMIC_VAR_INIT(nullptr);
// the flag will protect against multiple pointer updates
cuda::std::atomic_flag flag = ATOMIC_FLAG_INIT;
cuda::std::atomic_flag flag = LIBCUDACXX_ATOMIC_FLAG_INIT;
} next[26];
cuda::std::atomic<int> count = ATOMIC_VAR_INIT(0);
cuda::std::atomic<int> count = LIBCUDACXX_ATOMIC_VAR_INIT(0);
};
__host__ __device__
int index_of(char c) {
Expand Down
6 changes: 3 additions & 3 deletions libcudacxx/examples/trie.cu
Original file line number Diff line number Diff line change
Expand Up @@ -36,11 +36,11 @@ struct trie
{
struct ref
{
cuda::atomic<trie*, cuda::thread_scope_device> ptr = ATOMIC_VAR_INIT(nullptr);
cuda::atomic<trie*, cuda::thread_scope_device> ptr = LIBCUDACXX_ATOMIC_VAR_INIT(nullptr);
// the flag will protect against multiple pointer updates
cuda::std::atomic_flag flag = ATOMIC_FLAG_INIT;
cuda::std::atomic_flag flag = LIBCUDACXX_ATOMIC_FLAG_INIT;
} next[26];
cuda::std::atomic<short> count = ATOMIC_VAR_INIT(0);
cuda::std::atomic<short> count = LIBCUDACXX_ATOMIC_VAR_INIT(0);
};
__host__ __device__ int index_of(char c)
{
Expand Down
6 changes: 3 additions & 3 deletions libcudacxx/examples/trie_mt.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -36,11 +36,11 @@ struct trie
{
struct ref
{
std::atomic<trie*> ptr = ATOMIC_VAR_INIT(nullptr);
std::atomic<trie*> ptr = LIBCUDACXX_ATOMIC_VAR_INIT(nullptr);
// the flag will protect against multiple pointer updates
std::atomic_flag flag = ATOMIC_VAR_INIT(0);
std::atomic_flag flag = LIBCUDACXX_ATOMIC_VAR_INIT(0);
} next[26];
std::atomic<int> count = ATOMIC_VAR_INIT(0);
std::atomic<int> count = LIBCUDACXX_ATOMIC_VAR_INIT(0);
};
int index_of(char c)
{
Expand Down
10 changes: 9 additions & 1 deletion libcudacxx/include/cuda/atomic
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,14 @@
#ifndef _CUDA_ATOMIC
#define _CUDA_ATOMIC

#include <cuda/std/atomic>
#include <cuda/std/__cuda/atomic.h>

#if defined(_CCCL_IMPLICIT_SYSTEM_HEADER_GCC)
# pragma GCC system_header
#elif defined(_CCCL_IMPLICIT_SYSTEM_HEADER_CLANG)
# pragma clang system_header
#elif defined(_CCCL_IMPLICIT_SYSTEM_HEADER_MSVC)
# pragma system_header
#endif // no system header

#endif // _CUDA_ATOMIC
Loading
Loading