
[Bug]: get_default_backend_configuration: auto chunk not good for time series data #1099

Open
bendichter opened this issue Sep 24, 2024 · 0 comments

What happened?

When using get_default_backend_configuration for long time series, the recommended chunk shape mirrors the shape of the dataset itself, so a long, narrow dataset gets chunks that span hundreds of thousands of frames but only a few channels. Such chunks are sub-optimal for reading short windows of time across all channels, e.g. the way data is accessed in neurosift. A better chunking for time series would deviate from this shape-similarity convention and provide chunks that hold more (ideally all) channels.

Steps to Reproduce

import numpy as np
from pynwb.testing.mock.ecephys import mock_ElectricalSeries
from pynwb.testing.mock.file import mock_NWBFile
from neuroconv.tools.nwb_helpers import get_default_backend_configuration

# A long recording: 10 million frames x 128 channels
data = np.ones((10_000_000, 128))

nwbfile = mock_NWBFile()
ts = mock_ElectricalSeries(data=data, nwbfile=nwbfile)

backend_config = get_default_backend_configuration(nwbfile, backend="hdf5")
print(backend_config.dataset_configurations["acquisition/ElectricalSeries/data"].chunk_shape)

output: (312500, 4)
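
For comparison, here is a minimal sketch (not the neuroconv API) of a channel-spanning heuristic that keeps roughly the same per-chunk byte budget but spends it on the time axis instead. The helper name and the 10 MiB target are assumptions chosen only for illustration:

import numpy as np

def channel_spanning_chunk_shape(shape, dtype, target_bytes=10 * 1024**2):
    # Hypothetical helper: keep every channel in each chunk and size the
    # time axis so a chunk stays near the target byte budget.
    n_frames, n_channels = shape
    itemsize = np.dtype(dtype).itemsize
    frames_per_chunk = max(1, target_bytes // (n_channels * itemsize))
    return (min(n_frames, frames_per_chunk), n_channels)

print(channel_spanning_chunk_shape((10_000_000, 128), "float64"))
# -> (10240, 128): each ~10 MiB chunk holds all 128 channels for a ~10k-frame window

A chunk shape like (10240, 128) takes about the same space on disk as the auto-generated (312500, 4), but a viewer requesting a short window of time across all channels touches one or two chunks instead of 32.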

Traceback

No response

Operating System

macOS

Python Executable

Conda

Python Version

3.10

Package Versions

No response

Code of Conduct

@bendichter bendichter added the bug label Sep 24, 2024
@h-mayorquin h-mayorquin self-assigned this Sep 26, 2024