Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

update remfile verbiage in streaming.py #1823

Merged
merged 1 commit into from
Jan 14, 2024
Merged

update remfile verbiage in streaming.py #1823

merged 1 commit into from
Jan 14, 2024

Conversation

magland
Copy link
Contributor

@magland magland commented Jan 13, 2024

@rly

Here are some propsed updates to the remfile verbiage in pynwb.

The claims of fast initial load times (especially on a wifi network, but also on dandihub) are supported here.

Update: here are some further tests on a GitHub codespace, including ros3 method (which is way slower than the other two)

Copy link

codecov bot commented Jan 13, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (51a113f) 91.99% compared to head (f5fbf44) 91.99%.

Additional details and impacted files
@@           Coverage Diff           @@
##              dev    #1823   +/-   ##
=======================================
  Coverage   91.99%   91.99%           
=======================================
  Files          27       27           
  Lines        2623     2623           
  Branches      685      685           
=======================================
  Hits         2413     2413           
  Misses        138      138           
  Partials       72       72           
Flag Coverage Δ
integration 71.10% <ø> (ø)
unit 83.68% <ø> (ø)

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@rly
Copy link
Contributor

rly commented Jan 14, 2024

Thanks @magland !

@sinha-r @oruebel please see above for some initial streaming benchmarks from @magland

@rly rly merged commit 82c4f2b into NeurodataWithoutBorders:dev Jan 14, 2024
24 checks passed
@rly
Copy link
Contributor

rly commented Jan 14, 2024

@magland It's interesting to see ros3 so much slower and good to have hard numbers to back up the claims.

FYI This year as part of supplemental funding for our NWB U24 grant, we will be doing a set of comprehensive tests evaluating method (fsspec, ros3, remfile, kerchunk), environment (dandihub, nersc/home network), data type (ecephys, ophys, etc.), file settings (chunking, compression, etc.), and metrics (time of initial load, time to read various chunks of data). We haven't started running those tests yet, but when we do, it will be useful for us to compare your benchmark results with ours.

@oruebel
Copy link
Contributor

oruebel commented Jan 14, 2024

Thanks @magland My guess is that the large difference in the initial load for ros3 probably means that ros3 makes many more small web requests in sequence, i.e., I suspect network latency is a main factor here. Would be interesting to run these with a network profiler to get and idea of how many requests and data are being transferred with each method.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants