Listing every format that could be represented as virtual zarr #218
Labels: help wanted (extra attention is needed), Kerchunk (relating to the kerchunk library / specification itself), references generation (reading byte ranges from archival files), usage example (real world use case examples)
Let's list all the file formats that could potentially be represented efficiently as "virtual zarr" - i.e. zarr + chunk manifests.
The important criterion here is that the format must store data in a small number of contiguous chunks, so that access via HTTP range requests to object storage is efficient. This rules out some formats; for example, I don't think we can efficiently access the format that @kmuehlbauer mentioned over in openradar/xradar#187 (comment):
If we start thinking of Zarr as a "SuperFormat" (super as in superset, not as in super-duper), then this is the list of existing formats that make up that set, i.e. everything that can be referenced using chunk manifests (see zarr-developers/zarr-specs#287).
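To make the "zarr + chunk manifests" idea concrete, here is a minimal sketch of what a manifest lookup amounts to. The manifest layout (a dict with `path`, `offset` and `length` fields) and the helper names `read_chunk` and `range_header` are assumptions made up for this illustration, not the layout proposed in zarr-specs#287, and a local file stands in for object storage.

```python
import os
import tempfile


def range_header(offset, length):
    """Build the HTTP Range header value for one chunk (byte ranges are inclusive)."""
    return f"bytes={offset}-{offset + length - 1}"


def read_chunk(manifest, key):
    """Fetch one chunk's bytes using only its manifest entry (hypothetical layout)."""
    entry = manifest[key]
    with open(entry["path"], "rb") as f:
        f.seek(entry["offset"])
        return f.read(entry["length"])


# Fake "archival" file: a 16-byte header followed by two contiguous 8-byte chunks.
fd, demo_path = tempfile.mkstemp(suffix=".bin")
with os.fdopen(fd, "wb") as f:
    f.write(b"H" * 16)
    f.write(b"A" * 8)
    f.write(b"B" * 8)

# Hypothetical manifest: zarr chunk key -> where that chunk lives in the file.
manifest = {
    "0.0": {"path": demo_path, "offset": 16, "length": 8},
    "0.1": {"path": demo_path, "offset": 24, "length": 8},
}

chunk_a = read_chunk(manifest, "0.0")
chunk_b = read_chunk(manifest, "0.1")
header_a = range_header(16, 8)
os.remove(demo_path)
```

Each chunk costs exactly one range request, which is why the criterion above matters: a format that scatters one logical chunk across many small byte ranges would need many requests per chunk.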
Definitely can support:
Probably can support:
- .npz files
Maybe can support?
- .mat files (specification documented here)
Probably can't support:
(The checkboxes indicate whether or not a working implementation already exists, either going through kerchunk's in-memory format as an intermediate or creating a ManifestArray directly.)
cc @jhamman @d-v-b
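As a rough illustration of why .npz looks tractable: an .npz file is a zip archive of .npy members, and an uncompressed (stored) member occupies one contiguous byte range. The sketch below finds that range using only the standard library. The helper name `stored_member_range` is hypothetical, the member payload is fake bytes rather than a real .npy file, and this is not how any existing implementation does it.

```python
import io
import struct
import zipfile


def stored_member_range(zip_bytes, name):
    """Return (offset, length) of an uncompressed zip member's raw bytes."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        info = zf.getinfo(name)
    if info.compress_type != zipfile.ZIP_STORED:
        raise ValueError("member is compressed, so it is not a plain byte range")
    # The local file header is 30 fixed bytes; the filename length and the
    # extra-field length sit at bytes 26-29, and the data follows both fields.
    header = zip_bytes[info.header_offset:info.header_offset + 30]
    name_len, extra_len = struct.unpack("<HH", header[26:30])
    return info.header_offset + 30 + name_len + extra_len, info.file_size


# Build a small stored zip in memory; the member content is a stand-in payload.
fake_member = b"fake-npy-header" + bytes(range(16))
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_STORED) as zf:
    zf.writestr("arr_0.npy", fake_member)
archive = buf.getvalue()

offset, length = stored_member_range(archive, "arr_0.npy")
payload = archive[offset:offset + length]
```

In a real .npz the member itself starts with the .npy header, so the array data begins a little after `offset`. Archives written with compression (e.g. `np.savez_compressed`) would not expose raw array bytes this way.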