Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Black stripes on mosaicking #217

Closed
lhcorralo opened this issue Sep 7, 2023 · 3 comments · Fixed by #219
Closed

Black stripes on mosaicking #217

lhcorralo opened this issue Sep 7, 2023 · 3 comments · Fixed by #219

Comments

@lhcorralo
Copy link

lhcorralo commented Sep 7, 2023

Hi!

Description

I am running some tests with stackstac, and I have an issue regarding the mosaicking: some stripes appear. Note that I did the tests in a Jupyter notebook, despite I guess it will not be any difference

Code

The code is

from dask.distributed import Client, LocalCluster
from pystac import ItemCollection
from pystac_client import Client
import stackstac
import numpy as np
from matplotlib import pyplot as plt
  
# Init Dask
cluster = LocalCluster(processes=False, n_workers=2, threads_per_worker=2, memory_limit='8GiB')
client = Client(cluster)

# Search parameters
bbox = [-3.5, 40.0, -1.0, 41.5]

# Get items
catalog = Client.open("https://planetarycomputer.microsoft.com/api/stac/v1")
search = catalog.search(collections=["cop-dem-glo-30"], bbox=bbox)
items_pc: ItemCollection = search.item_collection()

# Stack and mosaic
stack = stackstac.stack(items_pc)
mosaic = stackstac.mosaic(stack)

# plot
fig = plt.figure()
plt.imshow(mosaic[0].compute())

and the result is as appears in
mosaic

As the files are DEM files, mosaicking results should be similar using another functions, so I tried max in order to evaluate if the problem was regarding stacking/input sources or mosaicking.

mosaic = stack.max(dim='time')

# plot
fig = plt.figure()
plt.imshow(mosaic[0].compute())

In this case, the operation seems to give good results

max

Let me know if I can provide more info

Thanks!

@gjoseph92
Copy link
Owner

Thanks for the good report @lhcorralo. It looks like the underlying issue is rasterio/rasterio#2916.

Short explanation: the xarray is basically composed of multiple tiles, spatially. It's also reading from multiple items, which cover different areas. When a tile doesn't overlap with an item at all, stackstac has an optimization to return all NaNs for the tile. But when the tile overlaps with an item, we use rasterio to read it. It looks like in this particular dataset, the GeoTIFFs don't have a nodata value set, so GDAL is filling in 0s for the pixels outside the bounds of the item, but not actually telling us that they should be masked out. However, the mosaic operation is only treating NaNs as nodata, so the 0s are left in, resulting in the stripes. That's also why max works: any actual data value is greater than 0.

I'll push up a fix for this, but for now, you can probably work around this with:

stack = stackstac.stack(items_pc, fill_value=0)
mosaic = stackstac.mosaic(stack, nodata=0)

That will make stackstac use 0s for empty tiles, matching up with GDAL's nodata value. Then you mosaic out all the 0s.

As a sidenote, this is yet another reason to move away from VRTs, which already needs to happen: #196.

@lhcorralo
Copy link
Author

lhcorralo commented Sep 8, 2023

Thanks!

Yes, it works as a workaround: it is easy to get a fill_value/nodata for elevation values (I have seen -9999 sometimes), so no problem.

@gjoseph92
Copy link
Owner

FYI @lhcorralo, the fix is now released in 0.5.0, so you should be able to go back to using mosaic. Thanks for the great bug report!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants