Compute Grid Centerpoint using Welzl's algorithm #811
base: main
Conversation
…centerpoint. Need to use great circle distance and add/fix tests and data types in the algo
…x test case asserts
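For readers unfamiliar with the algorithm the PR title references: Welzl's algorithm finds the smallest enclosing circle of a point set in expected linear time. Below is a minimal, illustrative planar sketch (the PR itself works on the sphere and needs great-circle distances, per the commit message above; all names here are my own, not uxarray's):

```python
def _circle_from(points):
    """Circle (center, radius) defined by 0-3 boundary points."""
    if not points:
        return ((0.0, 0.0), 0.0)
    if len(points) == 1:
        return (points[0], 0.0)
    if len(points) == 2:
        (ax, ay), (bx, by) = points
        cx, cy = (ax + bx) / 2, (ay + by) / 2
        return ((cx, cy), ((ax - cx) ** 2 + (ay - cy) ** 2) ** 0.5)
    # circumcircle of three (non-collinear) points
    (ax, ay), (bx, by), (cx, cy) = points
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    ux = ((ax**2 + ay**2) * (by - cy) + (bx**2 + by**2) * (cy - ay)
          + (cx**2 + cy**2) * (ay - by)) / d
    uy = ((ax**2 + ay**2) * (cx - bx) + (bx**2 + by**2) * (ax - cx)
          + (cx**2 + cy**2) * (bx - ax)) / d
    return ((ux, uy), ((ax - ux) ** 2 + (ay - uy) ** 2) ** 0.5)

def _inside(circle, p, eps=1e-12):
    (cx, cy), r = circle
    return (p[0] - cx) ** 2 + (p[1] - cy) ** 2 <= (r + eps) ** 2

def welzl(points, boundary=()):
    """Smallest enclosing circle; `boundary` holds points known to lie on it."""
    if not points or len(boundary) == 3:
        return _circle_from(list(boundary))
    p, rest = points[0], points[1:]
    circle = welzl(rest, boundary)
    if _inside(circle, p):
        return circle
    # p must lie on the boundary of the enclosing circle
    return welzl(rest, boundary + (p,))

center, radius = welzl([(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0)])
print(center, radius)  # -> (1.0, 1.0) 1.4142135623730951
```

Randomizing the point order (which this sketch omits) is what gives the expected linear runtime; correctness does not depend on it.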
I'm not a big fan of the Maybe we should consider the following design?
This would make the workflow look something like the following:

# get the value of face_lon without needing to specify an algorithm,
# will use either the stored value or cartesian average
uxgrid.face_lon

# I want to explicitly set the algorithm to be Welzl
uxgrid.populate_face_coordinates(method='welzl')

# value will now be populated using your approach
uxgrid.face_lon

# I want to re-populate again using cartesian averaging
uxgrid.populate_face_coordinates(method='cartesian average')

# value will now be populated using cartesian average
uxgrid.face_lon

This allows us to not need to define any new properties and to better stick to the UGRID conventions. What do you think?
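The proposed workflow could be backed by a simple dispatch-and-cache pattern. The sketch below is purely hypothetical (the class, method bodies, and return values are placeholders, not the actual uxarray API):

```python
# Hypothetical sketch of populate_face_coordinates(method=...) dispatch;
# Grid, _cartesian_average and _welzl_centerpoint are illustrative names.
class Grid:
    def __init__(self):
        self._face_lon = None

    def populate_face_coordinates(self, method="cartesian average"):
        dispatch = {
            "cartesian average": self._cartesian_average,
            "welzl": self._welzl_centerpoint,
        }
        if method not in dispatch:
            raise ValueError(f"Unknown method: {method!r}")
        self._face_lon = dispatch[method]()

    @property
    def face_lon(self):
        # lazily populate with the default method on first access
        if self._face_lon is None:
            self.populate_face_coordinates()
        return self._face_lon

    def _cartesian_average(self):
        return "lon via cartesian average"  # placeholder

    def _welzl_centerpoint(self):
        return "lon via welzl centerpoint"  # placeholder

g = Grid()
print(g.face_lon)                          # default method used on first access
g.populate_face_coordinates(method="welzl")
print(g.face_lon)                          # now reflects the welzl result
```

A design like this keeps `face_lon` a single UGRID-conforming attribute while letting the population strategy vary.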
…dependency (use with arcs, and arcs use coordinates). Remove new routine in favor of using the existing angle between vectors to calculate distance.
During my testing, and sometimes when testing the face geometry, both centerpoint and centroid might be needed. When working with a mesh I wanted to check how much one deviated from the other and whether one or the other made more sense. We might be able to get both with the way you propose also, but with two calls to We can get another name for
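Measuring that deviation amounts to taking the great-circle (angular) distance between the two candidate centers. A standalone sketch, with made-up coordinate values for illustration:

```python
import numpy as np

def great_circle_distance(lon1, lat1, lon2, lat2):
    """Angle in radians between two points given as lon/lat in radians."""
    def to_xyz(lon, lat):
        return np.array([np.cos(lon) * np.cos(lat),
                         np.sin(lon) * np.cos(lat),
                         np.sin(lat)])
    v1, v2 = to_xyz(lon1, lat1), to_xyz(lon2, lat2)
    # atan2 form is well-conditioned for both tiny and large angles
    return np.arctan2(np.linalg.norm(np.cross(v1, v2)), np.dot(v1, v2))

# illustrative values only, not uxarray output
centroid = (np.deg2rad(10.0), np.deg2rad(20.0))
centerpoint = (np.deg2rad(10.5), np.deg2rad(20.2))
deviation = great_circle_distance(*centroid, *centerpoint)
print(np.rad2deg(deviation))  # angular deviation in degrees
```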
My main concern with breaking up the different types of coordinates into separate attributes is that it'll add extra overhead for us to ensure that the coordinates we read match the ones that we want to store, not to mention needing to redefine / extend the UGRID conventions further past what we've already done. Even with this (and say some other method down the line), this could end up looking like:
Consider the case where two UGRID (or any other format) grid files are loaded into UXarray. If we move forward with a split-attribute approach, we'd need to ensure that the coordinates we are reading either go into I'm still in favor of keeping @paullric @rljacob Is there ever a scenario where we would want to have more than one definition of a "center" coordinate attached to a grid at a time?
The code looks good to me. Please address just a few simple comments below.
The code looks good to me. The only thing remaining was the reduced code coverage, but apparently codecov can't track test cases written for the njit-decorated functions. After we figure out a path forward with that, I am happy to approve this.
uxarray/grid/utils.py
Outdated
        Cartesian y coordinate
    z: float
        Cartesian z coordinate

    Returns
    -------
    lon : float
        Longitude in radians
    lat: float
        Latitude in radians
    """

    lon = math.atan2(y, x)
    lat = math.asin(z)

    # set longitude range to [0, 2 * pi]
    lon = np.mod(lon, 2 * np.pi)

    z_mask = np.abs(z) > 1.0 - ERROR_TOLERANCE

    lat = np.where(z_mask, np.sign(z) * np.pi / 2, lat)
    lon = np.where(z_mask, 0.0, lon)

    return lon, lat


def _normalize_xyz(
    x: Union[np.ndarray, float],
    y: Union[np.ndarray, float],
    z: Union[np.ndarray, float],
) -> tuple[np.ndarray, np.ndarray, np.ndarray]:
    """Normalizes a set of Cartesian coordinates."""
    denom = np.linalg.norm(
        np.asarray(np.array([x, y, z]), dtype=np.float64), ord=2, axis=0
    )

    x_norm = x / denom
    y_norm = y / denom
    z_norm = z / denom
    return x_norm, y_norm, z_norm


@njit(cache=True)
def _lonlat_rad_to_xyz(
    lon: Union[np.ndarray, float],
    lat: Union[np.ndarray, float],
) -> tuple[np.ndarray, np.ndarray, np.ndarray]:
    """Converts Spherical lon and lat coordinates into Cartesian x, y, z
    coordinates."""
    x = np.cos(lon) * np.cos(lat)
    y = np.sin(lon) * np.cos(lat)
    z = np.sin(lat)

    return x, y, z


def _xyz_to_lonlat_deg(
    x: Union[np.ndarray, float],
    y: Union[np.ndarray, float],
    z: Union[np.ndarray, float],
    normalize: bool = True,
) -> tuple[np.ndarray, np.ndarray]:
    """Converts Cartesian x, y, z coordinates into Spherical longitude and
    latitude coordinates in degrees.

    Parameters
    ----------
    x : Union[np.ndarray, float]
        Cartesian x coordinates
    y: Union[np.ndarray, float]
        Cartesian y coordinates
    z: Union[np.ndarray, float]
        Cartesian z coordinates
    normalize: bool
        Flag to select whether to normalize the coordinates

    Returns
    -------
    lon : Union[np.ndarray, float]
        Longitude in degrees
    lat: Union[np.ndarray, float]
        Latitude in degrees
    """
    lon_rad, lat_rad = _xyz_to_lonlat_rad(x, y, z, normalize=normalize)

    lon = np.rad2deg(lon_rad)
    lat = np.rad2deg(lat_rad)

    lon = (lon + 180) % 360 - 180
    return lon, lat


@njit
def _normalize_xyz_scalar(x: float, y: float, z: float):
    denom = np.linalg.norm(np.asarray(np.array([x, y, z]), dtype=np.float64), ord=2)
    x_norm = x / denom
    y_norm = y / denom
    z_norm = z / denom
    return x_norm, y_norm, z_norm
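A quick way to sanity-check a conversion pair like the one above is a round trip. The snippet below re-implements minimal versions of the two conversions so it runs without uxarray (the pole clamping via `ERROR_TOLERANCE` is omitted):

```python
import numpy as np

# Minimal stand-ins for _lonlat_rad_to_xyz / _xyz_to_lonlat_rad,
# re-implemented here for a self-contained round-trip check.
def lonlat_rad_to_xyz(lon, lat):
    return np.cos(lon) * np.cos(lat), np.sin(lon) * np.cos(lat), np.sin(lat)

def xyz_to_lonlat_rad(x, y, z):
    lon = np.mod(np.arctan2(y, x), 2 * np.pi)  # wrap into [0, 2*pi)
    lat = np.arcsin(np.clip(z, -1.0, 1.0))     # clip guards rounding past +/-1
    return lon, lat

lon0, lat0 = np.deg2rad(45.0), np.deg2rad(30.0)
lon1, lat1 = xyz_to_lonlat_rad(*lonlat_rad_to_xyz(lon0, lat0))
assert np.isclose(lon1, lon0) and np.isclose(lat1, lat0)
```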
I'm still a bit hesitant about moving these. Can you try moving them back into coordinates.py and provide a bit more info on the cyclical dependency issue?
All tests should fail now; please check if there is a better fix. I thought these better belonged in grid/utils.py, as they can run as utilities and are not tied to other infrastructure:
# _xyz_to_lonlat_rad,
# _lonlat_rad_to_xyz,
# _xyz_to_lonlat_deg,
# _normalize_xyz,
I'm still a bit hesitant in moving these. Can you try moving them back into coordinates.py and provide a bit more info on the cyclical dependencies issue?
coordinates.py needs "from uxarray.grid.arcs import _angle_of_2_vectors"
arcs.py needs these from coordinates.py:
from uxarray.grid.coordinates import (
_xyz_to_lonlat_rad_scalar,
_normalize_xyz_scalar,
)
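One standard way out of such a cycle is a deferred (function-level) import. The snippet below is a hypothetical minimal reproduction of the coordinates ↔ arcs situation using two throwaway modules, `demo_coords` and `demo_arcs` (the names and functions are illustrative, not uxarray's):

```python
import pathlib, sys, tempfile, textwrap

# demo_arcs imports demo_coords at module level (one direction is fine);
# demo_coords defers its import of demo_arcs into the function body, so
# that import only runs once both modules are fully initialized.
tmp = pathlib.Path(tempfile.mkdtemp())
(tmp / "demo_coords.py").write_text(textwrap.dedent("""
    def to_lonlat(x, y):
        return (x, y)

    def angle_via_arcs(v1, v2):
        from demo_arcs import angle  # deferred: breaks the cycle
        return angle(v1, v2)
"""))
(tmp / "demo_arcs.py").write_text(textwrap.dedent("""
    from demo_coords import to_lonlat  # top-level import in one direction only

    def angle(v1, v2):
        return abs(v1 - v2)
"""))
sys.path.insert(0, str(tmp))

import demo_coords
print(demo_coords.angle_via_arcs(3.0, 1.0))  # -> 2.0
```

Note that numba's nopython mode may still reject patterns like this inside an `@njit` function body, which matches the "Numba doesn't like uxarray" observation later in the thread.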
Once we resolve this circular dependency issue, I will disable NUMBA to check codecov; it might improve the coverage percentage.
How about doing it this way: at the beginning of each individual case that tests an njit-decorated function, disable numba, and at the end, enable it again? That way, whenever we see one of these disable/enable pairs, we know right away it is a test case for an njit-decorated function.
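An alternative to toggling numba per test is to exercise the pure-Python version that numba exposes on every dispatcher as `.py_func`, which coverage tools can trace. A sketch (the function below is my own example; the try/except fallback just lets the snippet run where numba is not installed):

```python
try:
    from numba import njit
except ImportError:  # fallback: behave like an undecorated function
    def njit(*args, **kwargs):
        def wrap(func):
            func.py_func = func  # mimic numba's dispatcher attribute
            return func
        return wrap(args[0]) if args and callable(args[0]) else wrap

@njit(cache=True)
def normalize_scalar(x, y, z):
    d = (x * x + y * y + z * z) ** 0.5
    return x / d, y / d, z / d

# In a test, call the uncompiled version so line coverage is recorded:
uncompiled = normalize_scalar.py_func
print(uncompiled(3.0, 0.0, 4.0))  # -> (0.6, 0.0, 0.8)
```

Setting the `NUMBA_DISABLE_JIT=1` environment variable before the test session starts is another option, but it applies globally rather than per test case.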
I removed all the numba stuff locally and the coverage (85% total and 95% for coordinates.py; see pic below) is considerably higher for sha:
from uxarray.conventions import ugrid

from typing import Union
from uxarray.grid.arcs import _angle_of_2_vectors
This appears to be the root cause of the circular import.
To avoid this, can you try using the following in this module when calling this function:
uxarray.grid.arcs._angle_of_2_vectors
This appears to be the root cause of the circular import.
To avoid this, can you try using the following in this module when calling this function:
uxarray.grid.arcs._angle_of_2_vectors
Doesn't work. Numba doesn't like uxarray...
def _replace_fill_values(grid_var, original_fill, new_fill, new_dtype=None):
    """Replaces all instances of the current fill value (``original_fill``) in
    (``grid_var``) with (``new_fill``) and converts to the dtype defined by
    (``new_dtype``).

    Parameters
    ----------
    grid_var : np.ndarray
        grid variable to be modified
    original_fill : constant
        original fill value used in (``grid_var``)
    new_fill : constant
        new fill value to be used in (``grid_var``)
    new_dtype : np.dtype, optional
        new data type to convert (``grid_var``) to

    Returns
    -------
    grid_var : np.ndarray
        Input grid variable with the updated fill value and dtype
    """

    # locations of fill values
    if original_fill is not None and np.isnan(original_fill):
        fill_val_idx = np.isnan(grid_var)
    else:
        fill_val_idx = grid_var == original_fill

    # convert to new data type
    if new_dtype != grid_var.dtype and new_dtype is not None:
        grid_var = grid_var.astype(new_dtype)

    # ensure fill value can be represented with current integer data type
    if np.issubdtype(new_dtype, np.integer):
        int_min = np.iinfo(grid_var.dtype).min
        int_max = np.iinfo(grid_var.dtype).max
        # ensure new_fill is in range [int_min, int_max]
        if new_fill < int_min or new_fill > int_max:
            raise ValueError(
                f"New fill value: {new_fill} not representable by"
                f" integer dtype: {grid_var.dtype}"
            )

    # ensure non-nan fill value can be represented with current float data type
    elif np.issubdtype(new_dtype, np.floating):
        # nan is always representable by a floating dtype
        if not np.isnan(new_fill):
            float_min = np.finfo(grid_var.dtype).min
            float_max = np.finfo(grid_var.dtype).max
            # ensure new_fill is in range [float_min, float_max]
            if new_fill < float_min or new_fill > float_max:
                raise ValueError(
                    f"New fill value: {new_fill} not representable by"
                    f" float dtype: {grid_var.dtype}"
                )
    else:
        raise ValueError(
            f"Data type {grid_var.dtype} not supported for grid variables"
        )

    # replace the old fill values with the new fill value
    grid_var[fill_val_idx] = new_fill

    return grid_var
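The important ordering in this routine is subtle: the old fill locations are recorded *before* the dtype cast, because the cast could otherwise mangle the fill values (e.g. NaN has no integer representation). A standalone illustration of the same pattern:

```python
import numpy as np

# Self-contained sketch of the mark-then-cast-then-write pattern used by
# _replace_fill_values (not an import of the actual routine).
grid_var = np.array([1.0, np.nan, 3.0, np.nan])
original_fill, new_fill = np.nan, -1

# 1) mark old fill locations while the original dtype can still express them
fill_idx = np.isnan(grid_var) if np.isnan(original_fill) else grid_var == original_fill
# 2) cast to the target dtype (the garbage the NaNs become is overwritten next)
grid_var = grid_var.astype(np.int64)
# 3) write the new fill value at the recorded locations
grid_var[fill_idx] = new_fill

print(grid_var)  # -> [ 1 -1  3 -1]
```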
This shouldn't be moved either. Please refer to the above comment if the circular import issues still persist.
I checked; it is not used. What is used is in connectivity.
Co-authored-by: Philip Chmielowiec <[email protected]>
No description provided.