[Documentation]: Add example for using family driver to modular storage docs #1948

Closed
3 tasks done
oruebel opened this issue Aug 19, 2024 · 0 comments · Fixed by #1949
Labels
priority: low alternative solution already working and/or relevant to only specific user(s) topic: docs issues related to documentation


oruebel commented Aug 19, 2024

What would you like changed or added to the documentation and why?

Add a section to https://pynwb.readthedocs.io/en/latest/tutorials/advanced_io/linking_data.html#sphx-glr-tutorials-advanced-io-linking-data-py to discuss how to use the family driver to automatically split data across multiple files.

Here is an example of how this should work:

import h5py
import numpy as np
from pynwb import NWBFile, NWBHDF5IO
from pynwb.base import TimeSeries
from datetime import datetime

# Maximum size of each member file in bytes
# (the number of files follows from the total amount of data written)
# chunk_size = 1 * 1024**3  # 1GB per file
chunk_size = 1024 * 1024  # 1MB for testing

# Create the HDF5 file using the family driver
with h5py.File('family_nwb_file_%d.h5', 'w', driver='family', memb_size=chunk_size) as f:
    # Create an NWBFile object
    nwbfile = NWBFile('session_description', 'identifier', datetime.now().astimezone())

    # Create some example data
    data = np.random.rand(500000)  # Example large dataset
    timestamps = np.arange(500000) / 1000.0  # Example timestamps in seconds

    # Create a TimeSeries object
    time_series = TimeSeries(name='example_timeseries',
                             data=data,
                             unit='mV',
                             timestamps=timestamps)

    # Add the TimeSeries to the NWBFile
    nwbfile.add_acquisition(time_series)

    # Use NWBHDF5IO to write the NWBFile to the HDF5 file
    with NWBHDF5IO(file=f, mode='w') as io:
        io.write(nwbfile)

print("NWB file created successfully with the family driver.")

# Open the HDF5 file using the family driver
# (memb_size must match the value used when the file was written)
with h5py.File('family_nwb_file_%d.h5', 'r', driver='family', memb_size=chunk_size) as f:
    # Use NWBHDF5IO to read the NWBFile from the HDF5 file
    with NWBHDF5IO(file=f, manager=None, mode='r') as io:
        nwbfile = io.read()
        print(nwbfile)

print("NWB file read successfully with the family driver.")
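
To check that the split behaved as expected, it can help to estimate how many member files the example should produce. The sketch below is only an illustration: `min_member_files` and `member_file_names` are hypothetical helpers, not part of h5py or PyNWB, and the estimate is a lower bound that assumes the arrays are stored uncompressed and ignores HDF5 metadata overhead. Member numbering starting at 0 follows the family driver's convention.

```python
import math


def min_member_files(payload_bytes, memb_size):
    """Lower bound on the number of family member files needed for the raw data."""
    if memb_size <= 0:
        raise ValueError("memb_size must be a positive number of bytes")
    return max(1, math.ceil(payload_bytes / memb_size))


def member_file_names(pattern, count):
    """Expand the printf-style pattern for member indices 0..count-1."""
    return [pattern % i for i in range(count)]


# Two float64 arrays (data and timestamps) of 500,000 elements each
payload_bytes = 2 * 500_000 * 8
n_files = min_member_files(payload_bytes, 1024 * 1024)
print(n_files)  # 8 (8,000,000 bytes / 1,048,576 bytes per file, rounded up)
print(member_file_names('family_nwb_file_%d.h5', n_files))
```

The actual number of files on disk may be somewhat higher because of the file's metadata.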

The creation of the NWBFile could probably move outside of the with h5py.File context for clarity. To make this even simpler, we would need to allow passing the memb_size option to NWBHDF5IO (and HDF5IO).
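
Until such an option exists, one way to keep the user-facing code short is a small wrapper. The following is a minimal sketch, not an existing PyNWB API: `family_file_kwargs` and `write_nwb_family` are hypothetical names. It assumes, as in the example above, that `NWBHDF5IO` accepts an open `h5py.File` through its `file` argument, and it expects the `NWBFile` to be built by the caller, outside of the `h5py.File` context.

```python
import re


def family_file_kwargs(path_pattern, memb_size):
    """Hypothetical helper: validate and bundle the h5py.File arguments for the family driver."""
    # The family driver needs a printf-style integer specifier (e.g. %d or %05d) in the name
    if not re.search(r"%\d*d", path_pattern):
        raise ValueError("path_pattern must contain an integer format specifier such as %d")
    if memb_size <= 0:
        raise ValueError("memb_size must be a positive number of bytes")
    return {"name": path_pattern, "driver": "family", "memb_size": memb_size}


def write_nwb_family(nwbfile, path_pattern, memb_size):
    """Hypothetical wrapper: write an already constructed NWBFile across family member files."""
    # Imported here so that the validation helper above works without h5py/pynwb installed
    import h5py
    from pynwb import NWBHDF5IO

    with h5py.File(mode="w", **family_file_kwargs(path_pattern, memb_size)) as f:
        with NWBHDF5IO(file=f, mode="w") as io:
            io.write(nwbfile)
```

With this in place, the write step of the example would reduce to `write_nwb_family(nwbfile, 'family_nwb_file_%d.h5', chunk_size)`.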

Do you have any interest in helping write or edit the documentation?

Yes.


@oruebel oruebel added topic: docs issues related to documentation priority: low alternative solution already working and/or relevant to only specific user(s) labels Aug 19, 2024