Unexpected results? Strange array shapes? Missing data? Read here first. #152

Open
gjoseph92 opened this issue May 16, 2022 · 0 comments
gjoseph92 commented May 16, 2022

If stackstac.stack is producing unexpected results, it's possible (but not certain!) that the problem is that the STAC metadata doesn't match up with the actual GeoTIFFs.

stackstac determines the resolution, bounds, CRS, array size, etc. up front only from the STAC metadata—it's careful to not look at the underlying data (GeoTIFFs) at all. If the STAC metadata says, for example, that an item is 1024x1024 pixels at 5m resolution, but the GeoTIFF is actually 1m resolution, then stackstac will pick an output bounding box and resolution 5x larger than what the actual data calls for.

Additionally, while compute-ing each dask chunk, stackstac skips even opening files that don't spatially overlap with the chunk, according to STAC metadata. If the STAC metadata is wrong, and a file does in fact overlap, then stackstac will never know, and your result may have unexpected sections of NaNs/missing data.
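To make that failure mode concrete, here is a minimal sketch of a bounding-box overlap test. It illustrates the idea only and is not stackstac's actual code; the function name `bounds_overlap` and all the coordinates are made up for the example.

```python
from typing import Tuple

Bounds = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def bounds_overlap(a: Bounds, b: Bounds) -> bool:
    """Return True if the two boxes share any area (touching edges don't count)."""
    return a[0] < b[2] and a[2] > b[0] and a[1] < b[3] and a[3] > b[1]


# Hypothetical numbers: one dask chunk, plus one file's footprint as claimed
# by the STAC metadata versus where the GeoTIFF's pixels really are.
chunk = (0.0, 0.0, 1024.0, 1024.0)
stac_claimed = (2000.0, 0.0, 3024.0, 1024.0)
actual = (512.0, 0.0, 1536.0, 1024.0)

# The decision is made from the metadata alone, so the file is skipped...
skipped = not bounds_overlap(chunk, stac_claimed)
# ...even though its real footprint does cover part of the chunk.
has_data = bounds_overlap(chunk, actual)
```

With these numbers `skipped` and `has_data` are both true: the file is never opened, and the part of the chunk it should have filled stays NaN.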

What can you do about it?

  • Verify whether the STAC metadata and the actual data really disagree. Use gdalinfo or xr.open_rasterio (rioxarray.open_rasterio in recent xarray versions, which have removed xr.open_rasterio) to look at the spatial parameters of a few of the files in question, and compare them to the STAC entries for those items. This is a pretty manual process.
  • Try setting resolution=, epsg=, and bounds= explicitly.
  • If you still get missing data with that, you'll need to pre-process the STAC metadata and correct it yourself before passing it into stackstac.stack.
  • In all cases, please open an issue with the data provider (Microsoft Planetary Computer, etc.).
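The manual comparison in the first step can be partly scripted. The sketch below assumes the items carry the STAC projection extension fields `proj:epsg`, `proj:shape`, and `proj:transform` in their properties (some catalogs put them on the assets instead, or omit them). The helper names `read_actual_grid` and `compare_grid` are hypothetical, and the sample values are invented.

```python
from typing import Any, Dict, List


def read_actual_grid(href: str) -> Dict[str, Any]:
    """Read the real spatial parameters of one GeoTIFF (needs rasterio installed)."""
    import rasterio  # imported lazily so compare_grid works without it

    with rasterio.open(href) as src:
        return {
            "epsg": src.crs.to_epsg() if src.crs else None,
            "shape": list(src.shape),              # [rows, cols]
            "transform": list(src.transform)[:6],  # affine coefficients a..f
        }


def compare_grid(stac_properties: Dict[str, Any], actual: Dict[str, Any],
                 tol: float = 1e-9) -> List[str]:
    """List the ways the STAC projection metadata disagrees with the file."""
    problems = []
    epsg = stac_properties.get("proj:epsg")
    if epsg is not None and epsg != actual["epsg"]:
        problems.append(f"epsg: STAC says {epsg}, file has {actual['epsg']}")
    shape = stac_properties.get("proj:shape")
    if shape is not None and list(shape) != list(actual["shape"]):
        problems.append(f"shape: STAC says {list(shape)}, file has {actual['shape']}")
    transform = stac_properties.get("proj:transform")
    if transform is not None:
        pairs = zip(list(transform)[:6], list(actual["transform"])[:6])
        if any(abs(s - a) > tol for s, a in pairs):
            problems.append(
                f"transform: STAC says {list(transform)[:6]}, "
                f"file has {list(actual['transform'])[:6]}"
            )
    return problems


# Invented example: metadata claims 5 m pixels, the file really has 1 m pixels.
stac_props = {
    "proj:epsg": 32613,
    "proj:shape": [1024, 1024],
    "proj:transform": [5.0, 0.0, 500000.0, 0.0, -5.0, 4000000.0],
}
actual_grid = {
    "epsg": 32613,
    "shape": [1024, 1024],
    "transform": [1.0, 0.0, 500000.0, 0.0, -1.0, 4000000.0],
}
problems = compare_grid(stac_props, actual_grid)
```

In real use you would build `actual_grid` with `read_actual_grid(asset_href)` for a handful of assets. Any non-empty result is worth including in the issue you open with the data provider.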

If there isn't a mismatch between STAC metadata and actual data, and you're getting unexpected results, there are a couple other things to be aware of:

  • If you're seeing half-pixel offsets, be aware of the xy_coords='topleft' default. If you're working with rioxarray, you may want to use stackstac.stack(..., xy_coords='center').
  • Also be aware of the snap_bounds=True default, especially if you're passing custom bounds.
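Both defaults come down to simple arithmetic, sketched below with invented numbers. `pixel_coords` and `snap` are hypothetical helpers, not stackstac functions. The rounding direction in `snap` (minimums down, maximums up) is an assumption about how snapping to whole multiples of the resolution behaves, so check the stackstac documentation for the exact rule.

```python
import math


def pixel_coords(origin, resolution, n, xy_coords="topleft"):
    """x coordinate labels for n pixels, at either the pixel edge or its center."""
    offset = 0.5 if xy_coords == "center" else 0.0
    return [origin + (i + offset) * resolution for i in range(n)]


# Same three pixels, labeled two ways: the labels differ by half a pixel (5 m).
topleft = pixel_coords(500000.0, 10.0, 3, "topleft")
center = pixel_coords(500000.0, 10.0, 3, "center")


def snap(bounds, resolution):
    """Expand bounds outward to whole multiples of the resolution (assumed rule)."""
    xmin, ymin, xmax, ymax = bounds
    return (
        math.floor(xmin / resolution) * resolution,
        math.floor(ymin / resolution) * resolution,
        math.ceil(xmax / resolution) * resolution,
        math.ceil(ymax / resolution) * resolution,
    )


# Custom bounds that don't fall on the 10 m grid get moved onto it.
requested = (500003.0, 3999996.0, 500027.0, 4000021.0)
snapped = snap(requested, 10.0)
```

Here `topleft` starts at 500000.0 and `center` at 500005.0, which is the half-pixel shift you would see when comparing against center-labeled coordinates. `snapped` is slightly larger than `requested`, so an output grid built from it will not begin exactly at the bounds that were passed in.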

If you've checked all those things, and you're still getting unexpected results, then there's probably a stackstac bug. Please open an issue!

Past issues ultimately due to incorrect STAC metadata or xy_coords/snap_bounds:
