Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not allow write with a DynamicTableRegion index that is out of bounds #210

Open
rly opened this issue Nov 21, 2019 · 4 comments · May be fixed by #1168
Open

Do not allow write with a DynamicTableRegion index that is out of bounds #210

rly opened this issue Nov 21, 2019 · 4 comments · May be fixed by #1168
Assignees
Labels
category: bug errors in the code or code behavior priority: low alternative solution already working and/or relevant to only specific user(s)
Milestone

Comments

@rly
Copy link
Contributor

rly commented Nov 21, 2019

Do not allow writing of a file with a DynamicTableRegion index that is out of bounds. The file is unreadable. Attempting to read it results in a cryptic error message (#209).

@rly rly added category: enhancement improvements of code or code behavior category: bug errors in the code or code behavior and removed category: enhancement improvements of code or code behavior labels Nov 21, 2019
@ajtritt
Copy link
Contributor

ajtritt commented Jan 2, 2020

I believe that check already exists right here:

for idx in self.data:
if idx < 0 or idx >= len(val):
raise IndexError('The index ' + str(idx) +
' is out of range for this DynamicTable of length '
+ str(len(val)))

Anytime DynamicTableRegion data is set (i.e. read or write), this brute force check happen. For example, it took at least 10 minutes to open a file that had a DynamicTableRegion with more than 2.4M elements. So, I don't think we want to go this route for data integrity.

@bendichter
Copy link
Contributor

bendichter commented Jan 2, 2020

We found a number of files from the DataJoint team that had this error and were unopenable in pynwb. I'm not sure how they managed to create these files in the first place with this check in place. I'm pretty sure they were using pynwb to write. Was this check recently added?

As for files taking a long time to open, I think that is probably due to automatically dereferencing DynamicTableRegions during open, and I think we should reconsider doing that.

@ajtritt
Copy link
Contributor

ajtritt commented Jan 2, 2020

I think that is probably due to automatically dereferencing DynamicTableRegions during open

It’s reading one element at a time from the index—not the dereferenced data.

I think these checks need to move elsewhere, and shouldn’t be enforced in setters. It makes maintenance and tracking of these sort of rules difficult, and creates performance issues. We probably need some sort of constraints layer to enforce these rules.

@rly
Copy link
Contributor Author

rly commented Jan 5, 2023

We should do this for all DynamicTableRegions that are small enough (e.g., <10000 elements) and have that threshold (or the behavior) be configurable on write. It would be ideal to check this everytime a value is changed, but we cannot monitor all changes because we use numpy and h5py.dataset objects. A constraints layer would be nice, but for now, this can be done on write.

@rly rly added this to the Next Release milestone Jan 5, 2023
@rly rly self-assigned this Jan 5, 2023
@rly rly added the priority: low alternative solution already working and/or relevant to only specific user(s) label Jan 5, 2023
@rly rly linked a pull request Aug 10, 2024 that will close this issue
4 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
category: bug errors in the code or code behavior priority: low alternative solution already working and/or relevant to only specific user(s)
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants