Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
improve html representation of datasets #1100
base: dev
Are you sure you want to change the base?
improve html representation of datasets #1100
Changes from 17 commits
649f141
475cda9
7f3c94e
5128d53
21ae3cf
4eb2635
08292c6
59083c2
06a064e
7ce5b3f
fc14d71
a2931e2
96456a4
133e28d
ae21b61
28449a3
6e6a84c
89fd978
a0e1736
538ba98
e0ad0a1
9cbcf64
5b235e0
3813723
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be nice to avoid having logic that is specific for a particular io backend in the
Container
. The reason is that this inhibts implementing backends in a self-contained manner and stand-alone packages and requires updating many places throughout HDMF.The checks for
h5py.Dataset
andZarr.array
are really only relevant when a file was read from file. To help disentangle the dependencies, I'm wondering whether we could do the following:generate_dataset_html
toHDMFIO
that would then need to implemented byHDF5IO
andZarrIO
Container
you could then do something like:`
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, yes, it would be nice if we could decouple this. On the other hand, right now, if they do implement their own backend they will just lose the representation for datasets which is not critical.
Is it? Right now in the test we are passsing an hdf5 dataset as data without writing the nwbfile for testin the display. Is this not intended?
This proposal seems very good:
I see two downsides:
Is there any other backend in the works right now? If not, maybe we can do this simpler way and add the complexity once we are closer to need it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would worth giving this a try here. If it works for HDF5, we can then we can easily move the logic for Zarr to hdmf-zarr. I don't think it should be too hard to make this work right now, but these things tend to get hard to change later on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What about handling in-memory objects as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In-memory-only objects (i.e., numpy arrays and lists) can be handled here in the Container class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you point me to the PR you are referring to. I'm not sure what role
DataIO
plays for this PR.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I mean that we can implement the following strategy in another PR to add backend specific information:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes that looks good 👍 . Just to avoid confusion,
read_io
is an instance ofHDMFIO
(i.e.,HDF5IO
orZarrIO
) and notDataIO
. To implement the logic we would then need to:HDMFIO.generate_dataset_html(dataset)
which would implement just a minimalist representationHDF5IO.generate_dataset_html(h5py.Dataset)
to representh5py.Dataset
hdmf_zarr
implementZarrIO.generate_dataset_html(Zarr.array)
To simplify this implementation and generate consistent representations, we could make a utility function that would take information about a dataset (e.g,. name, shape, data type, etc.) as input and generate the html representation such that the individual
generate_data_html
on the I/O backends would just collect the information from the dataset and use the utility function to generate the actual html.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I realized after that I was confusing these two objects.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good. Thanks for your hard work on this PR and the fruitful discussion.