DRAFT: Weighted Average #833

Draft. Wants to merge 18 commits into main.

philipc2 (Member) commented Jul 2, 2024

Closes #826

Overview

Expected Usage

import uxarray as ux

grid_path = "/path/to/grid.nc"
data_path = "/path/to/data.nc"

uxds = ux.open_dataset(grid_path, data_path)

# this is how you use this function
some_output = uxds.some_function()

# this is another way to use this function
other_output = uxds.some_function(some_param=True)

PR Checklist

General

  • An issue is created and linked
  • Add appropriate labels
  • Filled out Overview and Expected Usage (if applicable) sections

Testing

  • Adequate tests are created if there is new functionality
  • Tests cover all possible logical paths in your function
  • Tests are not too basic (such as simply calling a function and nothing else)

Documentation

  • Docstrings have been added to all new functions
  • Docstrings have been updated with any function changes
  • Internal functions have a preceding underscore (_) and have been added to docs/internal_api/index.rst
  • User functions have been added to docs/user_api/index.rst

Examples

  • Any new notebook examples added to docs/examples/ folder
  • Clear the output of all cells before committing
  • New notebook files added to docs/examples.rst toctree
  • New notebook files added to new entry in docs/gallery.yml with appropriate thumbnail photo in docs/_static/thumbnails/


philipc2 (Member, Author) commented Jul 2, 2024

@rytam2

I've set up the boilerplate for the weighted mean functionality. This should be a good place to get started. We can go over this during today's meeting.

philipc2 (Member, Author) commented Jul 5, 2024

@rytam2

We have fixed the issue with the quad-hexagon grid. I've added it back to the test case.

@philipc2 philipc2 added the run-benchmark Run ASV benchmark workflow label Jul 17, 2024

github-actions bot commented Jul 17, 2024

ASV Benchmarking

Benchmark Comparison Results

Benchmarks that have improved:

| Change | Before [1a68daa] | After [06a301e] | Ratio | Benchmark (Parameter) |
|---|---|---|---|---|
| - | 400M | 347M | 0.87 | mpas_ocean.Integrate.peakmem_integrate('480km') |

Benchmarks that have stayed the same:

| Change | Before [1a68daa] | After [06a301e] | Ratio | Benchmark (Parameter) |
|---|---|---|---|---|
| | 376M | 376M | 1.00 | face_bounds.FaceBounds.peakmem_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/mpas/QU/oQU480.231010.nc')) |
| | 377M | 377M | 1.00 | face_bounds.FaceBounds.peakmem_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/scrip/outCSne8/outCSne8.nc')) |
| | 381M | 380M | 1.00 | face_bounds.FaceBounds.peakmem_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/ugrid/geoflow-small/grid.nc')) |
| | 379M | 378M | 1.00 | face_bounds.FaceBounds.peakmem_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/ugrid/quad-hexagon/grid.nc')) |
| | 1.42±0s | 1.45±0.01s | 1.02 | face_bounds.FaceBounds.time_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/mpas/QU/oQU480.231010.nc')) |
| | 183±0.5ms | 191±4ms | 1.04 | face_bounds.FaceBounds.time_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/scrip/outCSne8/outCSne8.nc')) |
| | 1.67±0s | 1.69±0.01s | 1.01 | face_bounds.FaceBounds.time_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/ugrid/geoflow-small/grid.nc')) |
| | 8.68±0.5ms | 7.92±0.08ms | 0.91 | face_bounds.FaceBounds.time_face_bounds(PosixPath('/home/runner/work/uxarray/uxarray/test/meshfiles/ugrid/quad-hexagon/grid.nc')) |
| | 1.65±0.01s | 1.63±0.01s | 0.99 | import.Imports.timeraw_import_uxarray |
| | 630±4ms | 616±5ms | 0.98 | mpas_ocean.ConnectivityConstruction.time_face_face_connectivity('120km') |
| | 38.8±0.2ms | 39.7±0.2ms | 1.02 | mpas_ocean.ConnectivityConstruction.time_face_face_connectivity('480km') |
| | 1.92±0.1ms | 1.80±0.01ms | 0.94 | mpas_ocean.ConnectivityConstruction.time_n_nodes_per_face('120km') |
| | 485±10μs | 476±10μs | 0.98 | mpas_ocean.ConnectivityConstruction.time_n_nodes_per_face('480km') |
| | 1.26±0μs | 1.19±0μs | 0.95 | mpas_ocean.ConstructTreeStructures.time_ball_tree('120km') |
| | 310±2ns | 306±2ns | 0.99 | mpas_ocean.ConstructTreeStructures.time_ball_tree('480km') |
| | 838±8ns | 862±2ns | 1.03 | mpas_ocean.ConstructTreeStructures.time_kd_tree('120km') |
| | 288±1ns | 282±8ns | 0.98 | mpas_ocean.ConstructTreeStructures.time_kd_tree('480km') |
| | 394M | 397M | 1.01 | mpas_ocean.GeoDataFrame.peakmem_to_geodataframe('120km', False) |
| | 382M | 384M | 1.01 | mpas_ocean.GeoDataFrame.peakmem_to_geodataframe('120km', True) |
| | 355M | 355M | 1.00 | mpas_ocean.GeoDataFrame.peakmem_to_geodataframe('480km', False) |
| | 354M | 354M | 1.00 | mpas_ocean.GeoDataFrame.peakmem_to_geodataframe('480km', True) |
| | 1.22±0s | 1.21±0.01s | 0.99 | mpas_ocean.GeoDataFrame.time_to_geodataframe('120km', False) |
| | 58.2±0.7ms | 56.8±0.4ms | 0.97 | mpas_ocean.GeoDataFrame.time_to_geodataframe('120km', True) |
| | 95.6±0.7ms | 94.7±0.5ms | 0.99 | mpas_ocean.GeoDataFrame.time_to_geodataframe('480km', False) |
| | 5.33±0.1ms | 5.28±0.1ms | 0.99 | mpas_ocean.GeoDataFrame.time_to_geodataframe('480km', True) |
| | 266M | 266M | 1.00 | mpas_ocean.Gradient.peakmem_gradient('120km') |
| | 244M | 243M | 1.00 | mpas_ocean.Gradient.peakmem_gradient('480km') |
| | 2.68±0.03ms | 2.64±0.01ms | 0.99 | mpas_ocean.Gradient.time_gradient('120km') |
| | 289±3μs | 282±0.6μs | 0.98 | mpas_ocean.Gradient.time_gradient('480km') |
| | 365M | 364M | 1.00 | mpas_ocean.Integrate.peakmem_integrate('120km') |
| | 176±2ms | 176±1ms | 1.00 | mpas_ocean.Integrate.time_integrate('120km') |
| | 11.9±0.02ms | 11.9±0.04ms | 1.00 | mpas_ocean.Integrate.time_integrate('480km') |
| | 346±2ms | 345±3ms | 1.00 | mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('120km', 'exclude') |
| | 348±1ms | 348±1ms | 1.00 | mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('120km', 'include') |
| | 347±3ms | 348±3ms | 1.00 | mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('120km', 'split') |
| | 21.9±0.07ms | 22.6±0.4ms | 1.03 | mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('480km', 'exclude') |
| | 22.1±0.09ms | 22.4±0.4ms | 1.01 | mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('480km', 'include') |
| | 22.2±0.3ms | 22.1±0.4ms | 1.00 | mpas_ocean.MatplotlibConversion.time_dataarray_to_polycollection('480km', 'split') |
| | 54.9±0.1ms | 54.9±0.2ms | 1.00 | mpas_ocean.RemapDownsample.time_inverse_distance_weighted_remapping |
| | 44.3±0.07ms | 44.4±0.2ms | 1.00 | mpas_ocean.RemapDownsample.time_nearest_neighbor_remapping |
| | 360±0.8ms | 360±0.5ms | 1.00 | mpas_ocean.RemapUpsample.time_inverse_distance_weighted_remapping |
| | 263±1ms | 263±0.2ms | 1.00 | mpas_ocean.RemapUpsample.time_nearest_neighbor_remapping |
| | failed | failed | n/a | mpas_ocean.WeightedMean.time_weighted_mean_face_centered('120km') |
| | failed | failed | n/a | mpas_ocean.WeightedMean.time_weighted_mean_face_centered('480km') |
| | 239M | 239M | 1.00 | quad_hexagon.QuadHexagon.peakmem_open_dataset |
| | 239M | 239M | 1.00 | quad_hexagon.QuadHexagon.peakmem_open_grid |
| | 7.24±0.2ms | 7.03±0.04ms | 0.97 | quad_hexagon.QuadHexagon.time_open_dataset |
| | 5.99±0.03ms | 6.15±0.08ms | 1.03 | quad_hexagon.QuadHexagon.time_open_grid |

rytam2 and others added 2 commits July 26, 2024 17:43
…rray/weighted-mean (#866)

* updated mean function with weighted arg

* updated weighted-mean functionality in dataarray.py

* edited weights to dask array

---------

Co-authored-by: Rachel Yuen Sum Tam <[email protected]>
Co-authored-by: Rachel Yuen Sum Tam <[email protected]>
@philipc2 philipc2 linked an issue Aug 12, 2024 that may be closed by this pull request
if self._face_centered():
    grid_dim = "n_face"
    # use face areas as weight
    weights = da.from_array(self.uxgrid.face_areas.values)
Suggested change (review comment from the PR author):
- weights = da.from_array(self.uxgrid.face_areas.values)
+ weights = self.uxgrid.face_areas

elif self._edge_centered():
    grid_dim = "n_edge"
    # use edge magnitude as weight
    weights = da.from_array(self.uxgrid.edge_node_distances.values)
Suggested change (review comment from the PR author):
- weights = da.from_array(self.uxgrid.edge_node_distances.values)
+ weights = self.uxgrid.edge_node_distances

total_weight = weights.sum()

# compute weighted mean; assumes the last dimension is the geometry (grid) dimension
weighted_mean = (self * weights).sum(dim=grid_dim) / total_weight
Suggested change (review comment from the PR author):
- weighted_mean = (self * weights).sum(dim=grid_dim) / total_weight
+ weighted_mean = (self.data * weights).sum(dim=grid_dim) / total_weight
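
For reference, the arithmetic discussed in the snippets above can be sketched with plain NumPy. This is only an illustration, not the code in this PR: the `weighted_mean` helper, the `face_areas` values, and the `values` array are all hypothetical, and it assumes, as the code comment does, that the last axis is the grid dimension.

```python
import numpy as np


def weighted_mean(data, weights):
    """Weighted mean over the last axis of `data`.

    `weights` is 1-D with one entry per element along that axis
    (for example, one area per face for face-centered data).
    """
    data = np.asarray(data, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if weights.ndim != 1 or data.shape[-1] != weights.shape[0]:
        raise ValueError("weights must be 1-D and match the size of the last axis")
    # Broadcasting multiplies every leading slice (e.g. each time step)
    # by the same per-element weights before reducing over the last axis.
    return (data * weights).sum(axis=-1) / weights.sum()


# Hypothetical example: 2 time steps on a 3-face grid.
face_areas = np.array([1.0, 1.0, 2.0])
values = np.array([[1.0, 2.0, 3.0],
                   [4.0, 4.0, 4.0]])
result = weighted_mean(values, face_areas)
```

A constant field (the second row) returns the same constant for any choice of weights, which makes a convenient sanity check for a test case.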

@philipc2 philipc2 removed the run-benchmark Run ASV benchmark workflow label Sep 15, 2024
Development

Successfully merging this pull request may close these issues.

Weighted Mean Functionality
2 participants