Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor for lazy loading #189

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

JoeZiminski
Copy link

Hi @adamltaylor and @boazmohar,

Hope things are good! Further to the discussion on incorporating PyWaveSurfer into python-neo PR 1040, it would be great if you would be happy to support a version of PyWaveSurfer that can operate in a lazy manner (i.e. only read and the data from the h5py file when it is requested).

Towards this, this PR proposes refactored version of PyWaveSurfer done in such a way that it supports lazy loading. It would be great to hear your thoughts on this and if you would be happy to support the package in this form. This refactors the existing code into a 'PyWaveSurferData' class that holds a reference to the open .h5 file, and reads the header information. However, the data is not read into the data_file_as_dict dictionary by default. Instead, a get_traces() function is introduced that allows dynamic retrieval of sections of sweep data.

However, for backwards compatability a load_all_data() on the class and general loadDataFile() function are provided, to fill the data_file_as_dict in it's entireity up front. So, while under the hood a get_traces() method is exposed, day-to-day the package can be used as it previously was. Also introduced are some new tests for the get_traces() method.

If you are interested, it would be great to get your feedback, I have some additional questions I can follow up with if so. After these changes it would not be difficult to incorpoate PyWaveSurfer into python-neo, which essentially would wrap the get_traces() method.

Best Wishes,
Joe

@coveralls
Copy link

coveralls commented Oct 16, 2023

Coverage Status

coverage: 93.125% (+2.9%) from 90.265% when pulling 7ad3e22 on JoeZiminski:refactor_for_lazy_loading into 68b5742 on JaneliaSciComp:master.

@boazmohar
Copy link
Collaborator

Hi @JoeZiminski,

Happy to see this is useful for python-neo, and thank you for all this work!

This looks good to me but I will try to get @adamltaylor to also have a look as there might be wavesurfer metadata things I am not aware of.

Could you also change the readme file to add the information in the PR description so that users will know about this new lazy loading ability even if they don't use python-neo (I will be happy if you point them to your repo as well).

Thanks again,
Boaz

@JoeZiminski
Copy link
Author

Thanks @boazmohar! That's great to hear, sure I'd be happy to update the readme file. I also had a metadata question, I saw the AreSweepsContinuous field and wondered what it indicated, in case this somehow interfered with slicing into the underlying raw data.

@adamltaylor
Copy link
Member

Sorry for the radio silence. Will look at this this week.

@boazmohar
Copy link
Collaborator

Hi @JoeZiminski,

Sorry for the delay, I am trying to see if we can dedicate some internal Janelia funds to get more eyes on this and get this over the finish line. Will update hopefully by the end of the week. Please @ me if I forget...

Thanks!
Boaz

@JoeZiminski
Copy link
Author

Thanks @boazmohar, in fact thinking about this, it might take a little more work on my end. I had based this around SpikeInterface's API but to work directory with NEO it make need a couple more adjustments.

@samuelgarcia this is a PR I wrote to rework PyWavesurfer to better work with Neo API. If you have a change could you take a quick look and make any recomedations on how it could be improved? Maybe it is more focused on the get_traces() function as related to SpikeInterface rather than the neo API, which would be more appropriate.

@boazmohar
Copy link
Collaborator

Hi @JoeZiminski, we have been able to get some of @adamltaylor's time towards the end of the month.
I think a probable approach he might take is to create a separate entry point for lazy loading so that API and documentation of current functionality would not be effected.
It would be great if you can provide what such an API is required to provide for proper integration.

@adamltaylor
Copy link
Member

Yeah so I looked at this a bit and probably the main issue is that it should support lazy-loading of digital traces, in addition to analog. I don't think the current test suite covers reading of digital traces, so that's another issue. I can address this, just not for a few weeks.

@adamltaylor
Copy link
Member

adamltaylor commented Nov 1, 2023

But thank you so much for all the great work you put in on this, @JoeZiminski !

@samuelgarcia
Copy link

Hi @JoeZiminski.
I will not have time to have a look very soon. Having the leazy loding is cool for neo.rawio
The only advice I would give : check that you are able to retrieve traces as "raw" (int16) and to able to apply the gain separatly. This is a need for neo.rawio.

@h-mayorquin
Copy link

Would be great to have this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants