Fix stream not being set when calling hipMemsetAsync (#3244)
* Fix stream not being set when calling hipMemsetAsync

* fix clang format issue

* Fix missing handle for additional ZeroOutBuffer call in 6.2

---------

Co-authored-by: Jun Liu <[email protected]>
BrianHarrisonAMD and junliume authored Sep 10, 2024
1 parent e275d9d commit d62f773
Showing 1 changed file with 5 additions and 4 deletions.
9 changes: 5 additions & 4 deletions src/include/miopen/solver/implicitgemm_ck_util.hpp
@@ -376,9 +376,10 @@ class TransposeInstance
     Run(handle, kernels, out_ptr, buf_handle.get());
 }

-void ZeroOutBuffer()
+void ZeroOutBuffer(const Handle& handle)
 {
-    [[maybe_unused]] auto status = hipMemset(buf_handle.get(), 0, tensor_sz);
+    [[maybe_unused]] auto status =
+        hipMemsetAsync(buf_handle.get(), 0, tensor_sz, handle.GetStream());
     assert(status == hipSuccess);
 }

@@ -702,7 +703,7 @@ ConvSolution InitInvokerFactoryNCHW(const ExecutionContext& ctx,
 /// \todo: Will need SetTensor() to properly zero out non-packed tensors
 if(output_tr_inst.GetConvOperandTag() == internal::ConvOperandTag::Weights)
 {
-    output_tr_inst.ZeroOutBuffer();
+    output_tr_inst.ZeroOutBuffer(handle);
 }

 std::array<internal::TransposeInstanceTagged*, 3> tr_ptrs = {
@@ -845,7 +846,7 @@ ConvSolution InitInvokerFactoryWrwNCHW(const ExecutionContext& ctx,
 /// \todo: Will need SetTensor() to properly zero out non-packed tensors
 if(output_tr_inst.GetConvOperandTag() == internal::ConvOperandTag::Weights)
 {
-    output_tr_inst.ZeroOutBuffer();
+    output_tr_inst.ZeroOutBuffer(handle);
 }

 std::array<internal::TransposeInstanceTagged*, 3> tr_ptrs = {
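
Note on the change: passing the handle lets the zero-fill be enqueued on the same stream as the work that follows it, so the two are ordered on that stream. The host-only sketch below models just that in-order property. It makes no HIP calls, and `FakeStream`, `FakeHandle`, `AccumulateOnes`, and `RunZeroThenAccumulate` are hypothetical names invented for illustration; they are not part of MIOpen or HIP.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical stand-in for a GPU stream: a FIFO of deferred host tasks.
class FakeStream
{
public:
    void Enqueue(std::function<void()> task) { tasks.push(std::move(task)); }

    // Runs every queued task in submission order (loosely, a stream sync).
    void Synchronize()
    {
        while(!tasks.empty())
        {
            auto task = std::move(tasks.front());
            tasks.pop();
            task();
        }
    }

private:
    std::queue<std::function<void()>> tasks;
};

// Hypothetical stand-in for a handle that owns one stream.
class FakeHandle
{
public:
    FakeStream& GetStream() const { return stream; }

private:
    mutable FakeStream stream;
};

// Same shape as the patched ZeroOutBuffer: the fill is enqueued on the
// handle's stream rather than executed outside of it.
void ZeroOutBuffer(const FakeHandle& handle, std::vector<float>& buf)
{
    handle.GetStream().Enqueue([&buf] { std::fill(buf.begin(), buf.end(), 0.0f); });
}

// Illustrative follow-up work that depends on the buffer starting at zero.
void AccumulateOnes(const FakeHandle& handle, std::vector<float>& buf)
{
    handle.GetStream().Enqueue([&buf] {
        for(auto& v : buf)
            v += 1.0f;
    });
}

// Enqueues zero-fill, then accumulation, on one stream and drains it.
std::vector<float> RunZeroThenAccumulate(std::size_t n)
{
    FakeHandle handle;
    std::vector<float> buf(n, 42.0f); // stale contents before zeroing
    ZeroOutBuffer(handle, buf);
    AccumulateOnes(handle, buf);
    handle.GetStream().Synchronize();
    return buf;
}
```

Because both tasks go through the same queue, the accumulation always sees a zeroed buffer. The model says nothing about how a real device orders work across different streams; that depends on the HIP runtime's stream semantics.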