
sind and cosd not working with Float32 oneArrays #463

Closed
pvillacorta opened this issue Sep 11, 2024 · 1 comment
pvillacorta commented Sep 11, 2024

Suppose four different arrays:

using CUDA, oneAPI

a = CuArray([0.1, 0.2, 0.3])
b = CuArray([0.1f0, 0.2f0, 0.3f0])
c = oneArray([0.1, 0.2, 0.3])
d = oneArray([0.1f0, 0.2f0, 0.3f0])

We calculate sind by broadcasting:

julia> sind.(a) #works
3-element CuArray{Float64, 1, CUDA.DeviceMemory}:
 0.001745328365898309
 0.0034906514152237326
 0.00523596383141958

julia> sind.(b) #works
3-element CuArray{Float32, 1, CUDA.DeviceMemory}:
 0.0017453284
 0.0034906515
 0.005235964

julia> sind.(c) #works
3-element oneArray{Float64, 1, oneAPI.oneL0.DeviceBuffer}:
 0.001745328365898309
 0.0034906514152237326
 0.00523596383141958

These three work. However, broadcasting over a oneArray composed of Float32 elements throws the following error:

julia> sind.(d)
ERROR: InvalidIRError: compiling MethodInstance for (::GPUArrays.var"#34#36")(::oneAPI.oneKernelContext, ::oneDeviceVector{…}, ::Base.Broadcast.Broadcasted{…}, ::Int64) resulted in invalid LLVM IR
Reason: unsupported call to an unknown function (call to gpu_malloc)
Stacktrace:
  [1] malloc
    @ ~/.julia/packages/GPUCompiler/Y4hSX/src/runtime.jl:88
  [2] macro expansion
    @ ~/.julia/packages/GPUCompiler/Y4hSX/src/runtime.jl:183
  [3] macro expansion
    @ ./none:0
  [4] box
    @ ./none:0
  [5] box_float32
    @ ~/.julia/packages/GPUCompiler/Y4hSX/src/runtime.jl:212
  [6] sind
    @ ./special/trig.jl:1183
  [7] _broadcast_getindex_evalf
    @ ./broadcast.jl:709
  [8] _broadcast_getindex
    @ ./broadcast.jl:682
  [9] getindex
    @ ./broadcast.jl:636
 [10] #34
    @ ~/.julia/packages/GPUArrays/qt4ax/src/host/broadcast.jl:59
Hint: catch this exception as `err` and call `code_typed(err; interactive = true)` to introspect the erronous code with Cthulhu.jl
Stacktrace:
  [1] check_ir(job::GPUCompiler.CompilerJob{GPUCompiler.SPIRVCompilerTarget, oneAPI.oneAPICompilerParams}, args::LLVM.Module)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/Y4hSX/src/validation.jl:147
  [2] macro expansion
    @ ~/.julia/packages/GPUCompiler/Y4hSX/src/driver.jl:458 [inlined]
  [3] macro expansion
    @ ~/.julia/packages/TimerOutputs/Lw5SP/src/TimerOutput.jl:253 [inlined]
  [4] macro expansion
    @ ~/.julia/packages/GPUCompiler/Y4hSX/src/driver.jl:457 [inlined]
  [5] 
    @ GPUCompiler ~/.julia/packages/GPUCompiler/Y4hSX/src/utils.jl:103
  [6] emit_llvm
    @ ~/.julia/packages/GPUCompiler/Y4hSX/src/utils.jl:97 [inlined]
  [7] 
    @ GPUCompiler ~/.julia/packages/GPUCompiler/Y4hSX/src/driver.jl:136
  [8] codegen
    @ ~/.julia/packages/GPUCompiler/Y4hSX/src/driver.jl:115 [inlined]
  [9] 
    @ GPUCompiler ~/.julia/packages/GPUCompiler/Y4hSX/src/driver.jl:111
 [10] compile
    @ ~/.julia/packages/GPUCompiler/Y4hSX/src/driver.jl:103 [inlined]
 [11] #58
    @ ~/.julia/packages/oneAPI/z4Axk/src/compiler/compilation.jl:81 [inlined]
 [12] JuliaContext(f::oneAPI.var"#58#59"{GPUCompiler.CompilerJob{…}}; kwargs::@Kwargs{})
    @ GPUCompiler ~/.julia/packages/GPUCompiler/Y4hSX/src/driver.jl:52
 [13] JuliaContext(f::Function)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/Y4hSX/src/driver.jl:42
 [14] compile(job::GPUCompiler.CompilerJob)
    @ oneAPI ~/.julia/packages/oneAPI/z4Axk/src/compiler/compilation.jl:80
 [15] actual_compilation(cache::Dict{…}, src::Core.MethodInstance, world::UInt64, cfg::GPUCompiler.CompilerConfig{…}, compiler::typeof(oneAPI.compile), linker::typeof(oneAPI.link))
    @ GPUCompiler ~/.julia/packages/GPUCompiler/Y4hSX/src/execution.jl:237
 [16] cached_compilation(cache::Dict{…}, src::Core.MethodInstance, cfg::GPUCompiler.CompilerConfig{…}, compiler::Function, linker::Function)
    @ GPUCompiler ~/.julia/packages/GPUCompiler/Y4hSX/src/execution.jl:151
 [17] macro expansion
    @ ~/.julia/packages/oneAPI/z4Axk/src/compiler/execution.jl:203 [inlined]
 [18] macro expansion
    @ ./lock.jl:267 [inlined]
 [19] zefunction(f::GPUArrays.var"#34#36", tt::Type{Tuple{…}}; kwargs::@Kwargs{})
    @ oneAPI ~/.julia/packages/oneAPI/z4Axk/src/compiler/execution.jl:198
 [20] zefunction
    @ ~/.julia/packages/oneAPI/z4Axk/src/compiler/execution.jl:195 [inlined]
 [21] macro expansion
    @ ~/.julia/packages/oneAPI/z4Axk/src/compiler/execution.jl:66 [inlined]
 [22] #launch_heuristic#93
    @ ~/.julia/packages/oneAPI/z4Axk/src/gpuarrays.jl:17 [inlined]
 [23] launch_heuristic
    @ ~/.julia/packages/oneAPI/z4Axk/src/gpuarrays.jl:15 [inlined]
 [24] _copyto!
    @ ~/.julia/packages/GPUArrays/qt4ax/src/host/broadcast.jl:78 [inlined]
 [25] copyto!
    @ ~/.julia/packages/GPUArrays/qt4ax/src/host/broadcast.jl:44 [inlined]
 [26] copy
    @ ~/.julia/packages/GPUArrays/qt4ax/src/host/broadcast.jl:29 [inlined]
 [27] materialize(bc::Base.Broadcast.Broadcasted{oneAPI.oneArrayStyle{…}, Nothing, typeof(sind), Tuple{…}})
    @ Base.Broadcast ./broadcast.jl:903
 [28] top-level scope
    @ REPL[10]:1
Some type information was truncated. Use `show(err)` to see complete types.

The same happens with cosd.
I do not know whether this also occurs with AMDGPU.jl and/or Metal.jl, as I am not able to test those.
Thank you!
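As a possible workaround (my suggestion, not from the thread): composing `sin` with `deg2rad` computes the same quantity while skipping `sind`'s exception-throwing branch, so it may avoid the failing code path. A sketch, assuming an oneAPI-capable Intel GPU:

```julia
using oneAPI  # assumes an Intel GPU is available

d = oneArray([0.1f0, 0.2f0, 0.3f0])

# sin ∘ deg2rad skips the isinf/DomainError check inside sind,
# whose Float32 boxing is what triggers the gpu_malloc failure
sin.(deg2rad.(d))
```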

@maleadt
Member

maleadt commented Sep 11, 2024

Yeah, this is somewhat known: #65
We currently do not support device-side exceptions when they require allocations. We ought to implement a bump-pointer allocator or something like that, but that hasn't happened yet.

So I think we can close this in favor of #65?
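For illustration (my sketch, not from the thread): the issue is generic to any kernel whose code path can construct an exception from a runtime `Float32`, because building the `DomainError` boxes the value, which requires a GC allocation the SPIR-V target cannot lower. A hypothetical minimal reproducer, assuming an oneAPI-capable device, that should hit the same `InvalidIRError`:

```julia
using oneAPI  # assumes an Intel GPU is available

# Mirrors the isinf guard in sind: constructing the DomainError
# from a runtime Float32 boxes the value, calling gpu_malloc
g(x) = isinf(x) ? throw(DomainError(x, "infinite input")) : x

g.(oneArray([0.1f0]))  # should fail the same way as sind.(d)
```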

@maleadt maleadt closed this as completed Sep 11, 2024