Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[low priority] CategoricalArrays support #761

Open
4 of 7 tasks
asinghvi17 opened this issue Sep 24, 2024 · 7 comments
Open
4 of 7 tasks

[low priority] CategoricalArrays support #761

asinghvi17 opened this issue Sep 24, 2024 · 7 comments

Comments

@asinghvi17
Copy link
Collaborator

asinghvi17 commented Sep 24, 2024

Right now, some in memory operations work with a Raster that wraps a CategoricalArray.

Here's an mwe:

using Rasters, CategoricalArrays

grain_order = ["clay", "silt", "sand"]
grain_char = rand(grain_order, 6, 6)
grain_fact = CategoricalArray(grain_char, levels = grain_order)

# wrap the categorical array in a Raster object
grain = Raster(grain_fact, (X(LinRange(-1.5, 1.5, 6)), Y(LinRange(-1.5, 1.5, 6))))

This works fine, I'll update the issue to check if things like zonal work as intended.

Writing does not work:

using ArchGDAL
write(grain, "grain.tif")

ERROR: MethodError: Cannot `convert` an object of type Type{CategoricalValue{String, UInt32}} to an object of type ArchGDAL.GDALDataType

Closest candidates are:
  convert(::Type{ArchGDAL.GDALDataType}, ::Type{UInt32})
   @ ArchGDAL ~/.julia/packages/ArchGDAL/xZUPv/src/utils.jl:142
  convert(::Type{ArchGDAL.GDALDataType}, ::Type{UInt16})
   @ ArchGDAL ~/.julia/packages/ArchGDAL/xZUPv/src/utils.jl:142
  convert(::Type{ArchGDAL.GDALDataType}, ::Type{UInt8})
   @ ArchGDAL ~/.julia/packages/ArchGDAL/xZUPv/src/utils.jl:142
  ...

Stacktrace:
  [1] unsafe_create(filename::String; driver::ArchGDAL.Driver, width::Int64, height::Int64, nbands::Int64, dtype::DataType, options::Vector{…})
    @ ArchGDAL ~/.julia/packages/ArchGDAL/xZUPv/src/dataset.jl:407
  [2] unsafe_create
    @ ~/.julia/packages/ArchGDAL/xZUPv/src/dataset.jl:398 [inlined]
  [3] create(f::RastersArchGDALExt.var"#37#40"{}, args::String; kwargs::@Kwargs{})
    @ ArchGDAL ~/.julia/packages/ArchGDAL/xZUPv/src/context.jl:266
  [4] _create_with_driver(f::RastersArchGDALExt.var"#16#18"{}, filename::String, dims::Tuple{…}, T::Type, missingval::Nothing; options::Dict{…}, driver::String, _block_template::Raster{…}, chunks::Rasters.NoKW, kw::@Kwargs{})
    @ RastersArchGDALExt ~/.julia/dev/Rasters/ext/RastersArchGDALExt/gdal_source.jl:375
  [5] _create_with_driver
    @ ~/.julia/dev/Rasters/ext/RastersArchGDALExt/gdal_source.jl:351 [inlined]
  [6] write(filename::String, ::Rasters.GDALsource, A::Raster{…}; force::Bool, verbose::Bool, missingval::Rasters.NoKW, kw::@Kwargs{})
    @ RastersArchGDALExt ~/.julia/dev/Rasters/ext/RastersArchGDALExt/gdal_source.jl:59
  [7] write
    @ ~/.julia/dev/Rasters/ext/RastersArchGDALExt/gdal_source.jl:50 [inlined]
  [8] write(filename::String, A::Raster{…}; source::Rasters.GDALsource, kw::@Kwargs{})
    @ Rasters ~/.julia/dev/Rasters/src/write.jl:78
  [9] write(filename::String, A::Raster{…})
    @ Rasters ~/.julia/dev/Rasters/src/write.jl:74
 [10] top-level scope
    @ REPL[52]:1
Some type information was truncated. Use `show(err)` to see complete types.

so maybe there need to be some more specific dispatches there.

@asinghvi17
Copy link
Collaborator Author

asinghvi17 commented Sep 24, 2024

Also, holy shit, polygonize just works :O
Screenshot 2024-09-23 at 7 03 09 PM

@asinghvi17
Copy link
Collaborator Author

asinghvi17 commented Sep 24, 2024

zonal works but polygonize does not roundtrip through it, not sure what's going on there.

julia> using StatsBase; Rasters.zonal(countmap, grain; of = GO.polygonize(grain))
3-element Vector{Union{Missing, Dict{CategoricalValue{String, UInt32}, Int64}}}:
 Dict{CategoricalValue{String, UInt32}, Int64}("sand" => 1, "clay" => 1)
 Dict{CategoricalValue{String, UInt32}, Int64}("sand" => 1)
 Dict{CategoricalValue{String, UInt32}, Int64}()

@asinghvi17
Copy link
Collaborator Author

Aaaah I didn't realize DimensionalArrays has a CategoricalArrays extension! This makes much more sense now :D

@rafaqz
Copy link
Owner

rafaqz commented Sep 24, 2024

Yeah maybe @tiemvanderdeure added that

@rafaqz
Copy link
Owner

rafaqz commented Sep 24, 2024

What do you want write to do. Write strings? Or an Integer?

Ok seems a UInt8 or UInt16 if there are more than 256 are what to do.

The dbf table trick is kinda weird but we could write one in a RastersCategoricalArraysExt extension, and have a categorical keyword to force reading it?

Maybe netcdf can handle categories without that

@tiemvanderdeure
Copy link
Contributor

Yes I added a small extensions that defines the basic operation of CategoricalArrays on a DimArray, which is why you can call categorical and a Raster and it will behave. I never got round to making a similar extension for Rasters though.

@tiemvanderdeure
Copy link
Contributor

Another one to add to the list is to make missingval make sense for categorical rasters - we probably need some dispatch of _convert_missingval.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants