Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PR]: Improving regrid2 performance #533

Merged
merged 27 commits into from
Mar 8, 2024
Merged

Conversation

jasonb5
Copy link
Collaborator

@jasonb5 jasonb5 commented Aug 17, 2023

Description

After some analysis, I determined we were losing some performance moving back and forth between xarray and numpy.

The first fix ensures we're doing all the heavy computation in numpy. This has reduce the example time from ~4.8583 to ~1.6833.

There's still some more room for improvement.

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • My changes generate no new warnings
  • Any dependent changes have been merged and published in downstream modules

If applicable:

  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass with my changes (locally and CI/CD build)
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have noted that this is a breaking change for a major release (fix or feature that would cause existing functionality to not work as expected)

@jasonb5 jasonb5 changed the title Regrid2 performance Improving regrid2 performance Aug 17, 2023
@codecov
Copy link

codecov bot commented Aug 17, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (78dedf0) to head (e2125f9).

Additional details and impacted files
@@            Coverage Diff            @@
##              main      #533   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           15        15           
  Lines         1602      1588   -14     
=========================================
- Hits          1602      1588   -14     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@tomvothecoder
Copy link
Collaborator

Hey @jasonb5 just checking in to see your estimated timeline for when this will be ready for review and merge.

I'm shooting to have xCDAT v0.6.0 released in the next week or so.

@tomvothecoder tomvothecoder added this to the v0.6.0 milestone Sep 27, 2023
@tomvothecoder tomvothecoder added type: docs Updates to documentation type: enhancement New enhancement request and removed type: docs Updates to documentation labels Sep 27, 2023
@tomvothecoder
Copy link
Collaborator

Notes from 9/13/23 meeting:

  • Xarray performance overhead might be from Pandas Index, other things like dims, labels, etc.
    • Creating the objects might have an overhead
    • Xarray operations should be equivalent to numpy performance, but creating
    • Do more research here for

@tomvothecoder
Copy link
Collaborator

tomvothecoder commented Oct 11, 2023

10/11/23 Meeting Notes:

Next steps:

  1. Focus on Python optimizations first
    • Overhead in unavoidable operations: constructing Xarray objects with numpy arrays
    • 12 calls to get_bounds(), .2 sec
    • Explore compatibility with ufunc to take advantage of parallelization
  2. @lee1043 will test performance with Python optimizations to see if it is now reasonable with large datasets
    • If still not reasonable, we can look into C based optimizations
    • Most time spent is on calculating means
      • Cython is a possible solution, changes build process -- @jasonb5 already started looking into Cython, might be able to fit it in this PR if implementation and build process are straightforward and performance is dramatically improved
      • Numba is another option (just-in-time compiler) -- consider adding on top of Cython implementation for more performance improvements

@tomvothecoder tomvothecoder added priority: soon Should be addressed soon. priority: now Requires immediate attention and removed priority: soon Should be addressed soon. labels Dec 19, 2023
@tomvothecoder
Copy link
Collaborator

Any status updates here?

@jasonb5
Copy link
Collaborator Author

jasonb5 commented Feb 12, 2024

@tomvothecoder @lee1043 @chengzhuzhang @pochedls
Here's the latest performance result's

Input shape Output shape CDMS2 (seconds) xCDAT (seconds) Diff Mean Diff Max Diff Min
0 (50, 20, 180, 288) (50, 20, 362, 720) 36.0516 16.3418 3.1614e-08 8.39233e-05 -0.000119209
1 (50, 180, 288) (50, 362, 720) 0.414103 6.87195 3.1614e-08 8.39233e-05 -0.000119209
2 (180, 288) (362, 720) 0.0105002 5.15758 3.1614e-08 8.39233e-05 -0.000119209
3 (50, 20, 180, 288) (50, 20, 121, 240) 2.22065 3.87692 4.32748e-08 1.33514e-05 -3.24249e-05
4 (50, 180, 288) (50, 121, 240) 0.0767616 1.66283 4.32748e-08 1.33514e-05 -3.24249e-05
5 (180, 288) (121, 240) 0.00450883 0.74982 4.32748e-08 1.33514e-05 -3.24249e-05

I placed the notebook under /shared/regrid2_perf on jupyterhub if you want to checkout the notebook.

If everything looks alright lets merge and I'll get out the next few fixes and continue working on performance.

@jasonb5 jasonb5 marked this pull request as ready for review February 12, 2024 22:41
@tomvothecoder
Copy link
Collaborator

tomvothecoder commented Feb 12, 2024

Thanks @jasonb5. I will review the code changes this week.

Maybe @lee1043 can test this branch on his code?

@lee1043
Copy link
Collaborator

lee1043 commented Feb 12, 2024

Thank you for the update. Sure, I can test the branch.

@tomvothecoder tomvothecoder changed the title Improving regrid2 performance [PR]: Improving regrid2 performance Feb 12, 2024
Copy link
Collaborator

@tomvothecoder tomvothecoder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @jasonb5, here's my initial code review with questions and minor suggestions.

xcdat/regridder/regrid2.py Outdated Show resolved Hide resolved
xcdat/regridder/regrid2.py Outdated Show resolved Hide resolved
xcdat/regridder/regrid2.py Outdated Show resolved Hide resolved
xcdat/regridder/regrid2.py Outdated Show resolved Hide resolved
xcdat/regridder/regrid2.py Outdated Show resolved Hide resolved
Comment on lines 271 to 286
mapping = [
np.where(
np.logical_and(
shifted_src_west < dst_east[i], shifted_src_east > dst_west[i]
shifted_src_west < dst_east[x], shifted_src_east > dst_west[x]
)
)[0]
for x in range(dst_length)
]

weight = np.minimum(dst_east[i], shifted_src_east[contrib]) - np.maximum(
dst_west[i], shifted_src_west[contrib]
)

weights.append(weight.values.reshape(1, contrib.shape[0]))

contrib += shift
weights = [
(
np.minimum(dst_east[x], shifted_src_east[y])
- np.maximum(dst_west[x], shifted_src_west[y])
).reshape((1, -1))
for x, y in enumerate(mapping)
]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment about adding comment to explain logic and purpose

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved with 7695e2b

Comment on lines 426 to 429
name = input_data_var.cf.axes[cf_axis_name]

if isinstance(name, list):
name = name[0]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless the intent here is to only interpret the axis CF attribute?

Instead of using cf_xarray directly, I think you can use xc.get_dim_keys() which can also interpret the standard_name attribute or use the xCDAT fall-back table of generally accepted axis names.

Copy link
Collaborator Author

@jasonb5 jasonb5 Feb 23, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could use get_dim_keys but we will be trading performance for robustness. If we accept this I'm fine making the change.

Copy link
Collaborator

@tomvothecoder tomvothecoder Feb 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Is the performance hit significant using get_dim_keys()? If so, I think it is fine to only interpret the axis attribute for performance.

@lee1043 any thoughts?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this part of the code is in the back-end level that less likely be accessed by users, I would prefer prioritizing performance, unless the robustness trading off is too significant. How much performance change this would make?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Try passing data variable directly to get_dim_keys()

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like using get_dim_keys works just fine and no decrease in performance, actually a small increase.

Comment on lines +437 to +440
try:
name = ds.cf.bounds[axis][0]
except (KeyError, IndexError):
raise RuntimeError(f"Could not determine {axis!r} bounds")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
try:
name = ds.cf.bounds[axis][0]
except (KeyError, IndexError):
raise RuntimeError(f"Could not determine {axis!r} bounds")
try:
name = ds.bounds.get_bounds(axis)
except (ValueError, KeyError):
raise RuntimeError(f"Could not determine {axis!r} bounds")

I think you can use xCDAT's ds.bounds.get_bounds().

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As from above I could use get_bounds but we're again trading performance for robustness.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. If the current implementation is faster and using .get_bounds() isn't necessary, we can keep your current implementation.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After some review I'm going to leave the ds.cf.bounds usage. In this case it's fine to use as it's being called on the input/output grid, both of which have been generated/validated by previous code. We can guarantee that both grid objects only have lat/lon coordinates/bounds with the correct metadata thus we do not need the robustness of ds.bounds.get_bounds.

xcdat/regridder/regrid2.py Show resolved Hide resolved
Comment on lines 127 to 154
for y in range(y_length):
y_seg = np.take(input_data, lat_mapping[y], axis=y_index)

for lon_index, lon_map in enumerate(self._lon_mapping):
lon_weight = self._lon_weights[lon_index]
for x in range(x_length):
x_seg = np.take(y_seg, lon_mapping[x], axis=x_index, mode="wrap")

dot_weight = np.dot(lat_weight, lon_weight)
cell_weight = np.dot(lat_weights[y], lon_weights[x])

cell_weight = np.sum(dot_weight)
output_seg_index = y * x_length + x

input_lon_segment = np.take(
input_lat_segment, lon_map, axis=input_lon_index
if is_2d:
output_data[output_seg_index] = np.divide(
np.sum(
np.multiply(x_seg, cell_weight),
axis=(y_index, x_index),
),
np.sum(cell_weight),
)

data = (
np.nansum(
np.multiply(input_lon_segment, dot_weight),
axis=(input_lat_index, input_lon_index),
)
/ cell_weight
else:
output_seg = output_data[output_seg_index]

np.divide(
np.sum(
np.multiply(x_seg, cell_weight),
axis=(y_index, x_index),
),
np.sum(cell_weight),
out=output_seg,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comments explaining the logic here would be good. Maybe in the docstring.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Resolved with 7695e2b

Copy link
Collaborator

@lee1043 lee1043 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delayed approval. I thought I have marked approval but missed.

@jasonb5 jasonb5 merged commit 38cdfd5 into xCDAT:main Mar 8, 2024
9 checks passed
@jasonb5 jasonb5 deleted the regrid2_performance branch March 8, 2024 21:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority: now Requires immediate attention type: enhancement New enhancement request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bug]: Regrid2Regridder yields NANs [Bug]: NaN handling in regrid2 [Bug]: slow regrid2 operation
3 participants