CUDA memory allocation error in mexWtW2 #32

Open
marius10p opened this issue Oct 2, 2020 · 3 comments
Labels
bug Something isn't working

Comments

@marius10p
Collaborator

Sometimes this happens:

Finding merges: 100%|################################################################| 508/508 [00:06<00:00, 84.13it/s]

Traceback (most recent call last):
  File "d:\github\pykilosort\pykilosort\gui\sorter.py", line 108, in run
    self.context = run_spikesort(self.context)
  File "d:\github\pykilosort\pykilosort\main.py", line 434, in run_spikesort
    out = splitAllClusters(ctx, False)
  File "d:\github\pykilosort\pykilosort\postprocess.py", line 755, in splitAllClusters
    WtW, iList = getMeWtW(W.astype(cp.float32), U.astype(cp.float32), Nnearest)
  File "d:\github\pykilosort\pykilosort\learn.py", line 523, in getMeWtW
    wtw0 = mexWtW2(Params, W[:, :, i], W[:, :, j], utu0)
  File "d:\github\pykilosort\pykilosort\learn.py", line 485, in mexWtW2
    d_Params = cp.asarray(Params, dtype=np.float64, order='F')
  File "C:\Users\Marius\anaconda3\envs\pyks2\lib\site-packages\cupy\creation\from_data.py", line 66, in asarray
    return core.array(a, dtype, False, order)
  File "cupy\core\core.pyx", line 1692, in cupy.core.core.array
  File "cupy\core\core.pyx", line 1744, in cupy.core.core.array
  File "cupy\core\core.pyx", line 1741, in cupy.core.core.array
  File "cupy\cuda\pinned_memory.pyx", line 212, in cupy.cuda.pinned_memory.alloc_pinned_memory
  File "cupy\cuda\pinned_memory.pyx", line 286, in cupy.cuda.pinned_memory.PinnedMemoryPool.malloc
  File "cupy\cuda\pinned_memory.pyx", line 306, in cupy.cuda.pinned_memory.PinnedMemoryPool.malloc
  File "cupy\cuda\pinned_memory.pyx", line 303, in cupy.cuda.pinned_memory.PinnedMemoryPool.malloc
  File "cupy\cuda\pinned_memory.pyx", line 177, in cupy.cuda.pinned_memory._malloc
  File "cupy\cuda\pinned_memory.pyx", line 178, in cupy.cuda.pinned_memory._malloc
  File "cupy\cuda\pinned_memory.pyx", line 29, in cupy.cuda.pinned_memory.PinnedMemory.__init__
  File "cupy\cuda\runtime.pyx", line 239, in cupy.cuda.runtime.hostAlloc
  File "cupy\cuda\runtime.pyx", line 145, in cupy.cuda.runtime.check_status
cupy.cuda.runtime.CUDARuntimeError: cudaErrorIllegalAddress: an illegal memory access was encountered

@rossant
Collaborator

rossant commented Oct 2, 2020

Is this happening on a rerun of a given dataset, or a new run in a fresh directory without any remaining cache files?

@marius10p
Collaborator Author

Fresh run from inside the GUI. @shashwatsridhar has also gotten this before on a different dataset, though not on this one.

@alexmorley
Collaborator

Given d_Params = cp.asarray(Params, dtype=np.float64, order='F'), I strongly suspect that something in Params isn't a float (or is NaN).

We should probably check the types and values of all the arrays before they end up on the GPU. This could work nicely as a wrapper around all of the CUDA kernel calls.
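
A rough sketch of what such a check could look like, for host-side inputs like Params only. Everything here is hypothetical: check_host_array, validate_inputs and fake_kernel are made-up names, not part of pykilosort or CuPy, and fake_kernel just stands in for a real kernel call. It uses plain NumPy so it runs without a GPU. Arrays that already live on the device would need a device-side check instead (for example with cp.isfinite).

```python
import functools

import numpy as np


def check_host_array(name, value):
    """Return `value` as a float64 NumPy array, or raise if it is not
    real-valued numeric data or contains NaN/inf. (Hypothetical helper.)"""
    arr = np.asarray(value)
    is_real = (np.issubdtype(arr.dtype, np.integer)
               or np.issubdtype(arr.dtype, np.floating))
    if not is_real:
        raise TypeError(f"{name}: expected int/float data, got dtype {arr.dtype}")
    arr = arr.astype(np.float64)
    bad = np.flatnonzero(~np.isfinite(arr))
    if bad.size:
        raise ValueError(f"{name}: non-finite values at flat indices {bad.tolist()}")
    return arr


def validate_inputs(func):
    """Decorator: validate every argument before `func` is called.
    (Hypothetical wrapper; assumes all arguments are host-side arrays.)"""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        args = [check_host_array(f"arg{i}", a) for i, a in enumerate(args)]
        kwargs = {k: check_host_array(k, v) for k, v in kwargs.items()}
        return func(*args, **kwargs)
    return wrapper


@validate_inputs
def fake_kernel(params):
    # Stand-in for a real GPU call; it only sums the validated parameters.
    return float(params.sum())
```

With this, fake_kernel([1.0, float("nan")]) raises a ValueError naming the offending index, and fake_kernel([1, None]) raises a TypeError, both before anything would be copied to the GPU.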

@alexmorley added the bug label on Nov 2, 2020