Support modifications of a read file in an external overlay #676

Open
5 tasks done
rly opened this issue Nov 11, 2021 · 2 comments · May be fixed by #677

Labels: category: proposal (proposed enhancements or new features); priority: medium (non-critical problem and/or affecting only a small set of users)

Comments

@rly
Contributor

rly commented Nov 11, 2021

In neurophysiology, metadata is often not static. Sometimes the experiment description, related publications, or data annotations need to be changed after the file is written. Currently, users can open the file in HDMF in append mode and add containers. Users can also open the file in read mode, modify any part of a container in memory, and export the modified in-memory container to a new file on disk. However, simple changes to existing metadata cannot be made in append mode and require rewriting the whole file, which can be expensive (e.g., rewriting a 10 GB file to change one attribute). Metadata changes can be made directly with h5py, but that is hacky.

Data archives would also like to be able to support versioning in a lean way, where making small metadata changes would not require maintaining a complete copy of a file with each change.

The HDF Group is planning to add support for similar changes and versioning, but it will likely be a while before this feature is widely supported.

The NWB and DANDI teams have brainstormed several approaches. One is to maintain a human-readable sidecar JSON file, with the same name as the NWB file but with a .json suffix, that contains the sequence of changes to apply to the data after reading it. The original NWB file would stay intact.
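
To make the sidecar idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the change format (`op`, `path`, `value`), the top-level `changes` key, the function names, and the reading of ".json suffix" as replacing `.nwb` with `.json`. None of it is an existing HDMF or PyNWB API, and a nested dict stands in for the containers that a real read would produce.

```python
import json
from pathlib import Path


def sidecar_path(data_path):
    """Assumed naming rule: same file name, with the suffix replaced by .json."""
    return Path(data_path).with_suffix(".json")


def load_changes(data_path):
    """Return the ordered list of changes, or an empty list if no sidecar exists."""
    path = sidecar_path(data_path)
    if not path.exists():
        return []
    with open(path, encoding="utf-8") as f:
        return json.load(f)["changes"]  # hypothetical top-level key


def apply_changes(metadata, changes):
    """Return a copy of `metadata` with each change applied in order.

    The input (standing in for the data read from the original file)
    is never modified.
    """
    result = json.loads(json.dumps(metadata))  # deep copy of JSON-compatible data
    for change in changes:
        *parents, leaf = change["path"].strip("/").split("/")
        node = result
        for key in parents:
            node = node[key]
        if change["op"] == "set":
            node[leaf] = change["value"]
        elif change["op"] == "remove":
            del node[leaf]
        else:
            raise ValueError(f"unsupported op: {change['op']!r}")
    return result


# Usage: the original metadata stays intact; the overlay yields the updated view.
metadata = {"general": {"experiment_description": "original description"}}
changes = [
    {"op": "set", "path": "/general/experiment_description",
     "value": "revised description"},
]
updated = apply_changes(metadata, changes)
print(updated["general"]["experiment_description"])  # revised description
```

Because the changes are an ordered list, applying only the first N entries would reproduce the state after the Nth edit, which is one way the lean versioning mentioned above could fall out of this design.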

Other options should be explored too.

Checklist

  • Have you ensured the feature or change was not already reported?
  • Have you included a brief and descriptive title?
  • Have you included a clear description of the problem you are trying to solve?
  • Have you included a minimal code snippet that reproduces the issue you are encountering?
  • Have you checked our Contributing document?
rly linked a pull request on Nov 11, 2021 that will close this issue
rly added the labels category: proposal and priority: medium on Nov 19, 2021
@mavaylon1
Contributor

@rly where are we in our thoughts on a sidecar?

mavaylon1 added this to the Future milestone on Apr 16, 2024
@rly
Contributor Author

rly commented Apr 17, 2024

I think LINDI would be a better approach but let's discuss at the breakout session on Thursday and add notes here.
