Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Error raised when array has additional 2D coordinates #166

Open
santisoler opened this issue Nov 11, 2021 · 7 comments
Open

Error raised when array has additional 2D coordinates #166

santisoler opened this issue Nov 11, 2021 · 7 comments

Comments

@santisoler
Copy link
Contributor

santisoler commented Nov 11, 2021

In Harmonica we usually produce 2D grids out of scattered data whose dimensions are northing and easting. In addition, we often have an extra coordinate: the upward, which is the vertical coordinate of each point of the grid. The upward coordinate is included into the list of coordinates inside the array, but as a 2D array with the same dimensions as the grid itself.

When trying to compute the fft over this kind of arrays, xrft is raising and error because it recognizes the upward as a bad coordinate. I leave a minimal working example to reproduce the error:

import numpy as np
import xarray as xr
import xrft

# Build a sample array
# --------------------
# Generate easting and northing
easting = np.linspace(-5e3, 3e3, 91)
northing = np.linspace(4e3, 10e3, 61)

# Generate sample data and the upward coordinate
ee, nn = np.meshgrid(easting, northing)
upward = np.ones_like(ee)
data = ee ** 2 + nn ** 2

# Build the array
dims = ("northing", "easting")
da = xr.DataArray(
    data,
    coords={"easting": easting, "northing": northing, "upward": (dims, upward)},
    dims=dims,
)
print(da)

# Compute FFT
# -----------
xrft.fft(da, true_phase=True, true_amplitude=True)
<xarray.DataArray (northing: 61, easting: 91)>
array([[4.10000000e+07, 4.01190123e+07, 3.92538272e+07, ...,
        2.39649383e+07, 2.44745679e+07, 2.50000000e+07],
       [4.18100000e+07, 4.09290123e+07, 4.00638272e+07, ...,
        2.47749383e+07, 2.52845679e+07, 2.58100000e+07],
       [4.26400000e+07, 4.17590123e+07, 4.08938272e+07, ...,
        2.56049383e+07, 2.61145679e+07, 2.66400000e+07],
       ...,
       [1.21040000e+08, 1.20159012e+08, 1.19293827e+08, ...,
        1.04004938e+08, 1.04514568e+08, 1.05040000e+08],
       [1.23010000e+08, 1.22129012e+08, 1.21263827e+08, ...,
        1.05974938e+08, 1.06484568e+08, 1.07010000e+08],
       [1.25000000e+08, 1.24119012e+08, 1.23253827e+08, ...,
        1.07964938e+08, 1.08474568e+08, 1.09000000e+08]])
Coordinates:
  * easting   (easting) float64 -5e+03 -4.911e+03 -4.822e+03 ... 2.911e+03 3e+03
  * northing  (northing) float64 4e+03 4.1e+03 4.2e+03 ... 9.8e+03 9.9e+03 1e+04
    upward    (northing, easting) float64 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0
Traceback (most recent call last):
  File "/home/santi/tmp/test.py", line 27, in <module>
    xrft.fft(da, true_phase=True, true_amplitude=True)
  File "/home/santi/.miniforge3/envs/default/lib/python3.9/site-packages/xrft/xrft.py", line 392, in fft
    raise ValueError(
ValueError: The input array contains coordinate variable(s) (['upward']) whose dims include the transform dimension(s) `northing`. Please drop these coordinates (`.drop(['upward']`) before invoking xrft.

Version of xrft: v0.4.0


I think we could improve the detection of the bad coordinates in a way that it doesn't include any n-dimensional coordinate (where n > 1) so we don't need to drop them before passing them to fft.

What do you think?

@santisoler
Copy link
Contributor Author

Another idea would be to apply fft over indexes only, instead of looking for coordinates.

@roxyboy
Copy link
Member

roxyboy commented Nov 11, 2021

Another idea would be to apply fft over indexes only, instead of looking for coordinates.

Can you elaborate on this? xrft uses the raw coordinates to inform the coordinates after the fft is taken.

@santisoler
Copy link
Contributor Author

santisoler commented Nov 11, 2021

Can you elaborate on this? xrft uses the raw coordinates to inform the coordinates after the fft is taken.

Sure! Sorry for the brevity, now that I reread it sounds like I was thinking out loud.

One alternative way to select the possible coordinates along which the fft can be performed could be through the indexes.
If you see the output of the previous print statement, the indexes are those coordinates that has an * before their name.

For instance, if I have an array with several coordinates:

  • northing and easting as indexes,
  • northing_km and easting_km as extra coordinates with northing and easting dims, respectively,
  • and the same upward coordinate.
import numpy as np
import xarray as xr

# Build a sample array
# --------------------
# Generate easting and northing, and their versions in km
easting = np.linspace(-5e3, 3e3, 91)
northing = np.linspace(4e3, 10e3, 61)
easting_km, northing_km = easting * 1e-3, northing * 1e-3

# Generate sample data and the upward coordinate
ee, nn = np.meshgrid(easting, northing)
upward = np.ones_like(ee)
data = ee ** 2 + nn ** 2

# Build the array
dims = ("northing", "easting")
da = xr.DataArray(
    data,
    coords={
        "easting": easting,
        "northing": northing,
        "northing_km": ("northing", northing_km),
        "easting_km": ("easting", easting_km),
        "upward": (dims, upward),
    },
    dims=dims,
)
print(da)

print("\nIndexes:", tuple(i for i in da.indexes))
<xarray.DataArray (northing: 61, easting: 91)>
array([[4.10000000e+07, 4.01190123e+07, 3.92538272e+07, ...,
        2.39649383e+07, 2.44745679e+07, 2.50000000e+07],
       [4.18100000e+07, 4.09290123e+07, 4.00638272e+07, ...,
        2.47749383e+07, 2.52845679e+07, 2.58100000e+07],
       [4.26400000e+07, 4.17590123e+07, 4.08938272e+07, ...,
        2.56049383e+07, 2.61145679e+07, 2.66400000e+07],
       ...,
       [1.21040000e+08, 1.20159012e+08, 1.19293827e+08, ...,
        1.04004938e+08, 1.04514568e+08, 1.05040000e+08],
       [1.23010000e+08, 1.22129012e+08, 1.21263827e+08, ...,
        1.05974938e+08, 1.06484568e+08, 1.07010000e+08],
       [1.25000000e+08, 1.24119012e+08, 1.23253827e+08, ...,
        1.07964938e+08, 1.08474568e+08, 1.09000000e+08]])
Coordinates:
  * easting      (easting) float64 -5e+03 -4.911e+03 ... 2.911e+03 3e+03
  * northing     (northing) float64 4e+03 4.1e+03 4.2e+03 ... 9.9e+03 1e+04
    northing_km  (northing) float64 4.0 4.1 4.2 4.3 4.4 ... 9.6 9.7 9.8 9.9 10.0
    easting_km   (easting) float64 -5.0 -4.911 -4.822 -4.733 ... 2.822 2.911 3.0
    upward       (northing, easting) float64 1.0 1.0 1.0 1.0 ... 1.0 1.0 1.0 1.0

Indexes: ('easting', 'northing')

If I run xrft.fft over this array, I would get instructed to drop the upward, the northing_km and the easting_km. If we instead look for the coordinates along which to perform the fft through the indexes, we avoid that and return the FT array with just freq_northing and freq_easting, ignoring the other coords.

This is an unpolished idea, so feel free to disagree and/or offer another way to solve this issue.

@santisoler
Copy link
Contributor Author

If working with indexes is too complex, maybe we could just ignore these extra coordinates and apply the FFT only on the ones that are named after the dimensions of the array.

I think my main issue is with the error against bad coordinates. Is there any special reason for it to exist?

@roxyboy
Copy link
Member

roxyboy commented Nov 11, 2021

PR #163 was where we implemented the error detection as an easy fix to issue #162 .

@roxyboy
Copy link
Member

roxyboy commented Nov 11, 2021

If working with indexes is too complex, maybe we could just ignore these extra coordinates and apply the FFT only on the ones that are named after the dimensions of the array.

I personally like this idea.

@santisoler
Copy link
Contributor Author

I've just noticed that pad has a similar problem: it raises a cryptic xarray error when trying to pad a grid with the extra upward coordinate. So probably a single private function should take care of the task of ignoring additional coordinates for both cases, and only a simple call to it inside fft and pad should do the trick.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants