[Feature]: attribute reftype list #833

Open
3 tasks done
bendichter opened this issue Mar 13, 2023 · 1 comment
Labels
category: enhancement (improvements of code or code behavior); priority: low (alternative solution already working and/or relevant to only specific user(s)); topic: extension (issues related to extensions or dynamic class generation)

Comments

@bendichter
Contributor

What would you like to see added to HDMF?

I am creating an extension where I have a biosample that has a was_derived_from attribute which can point to other biosamples. I have tried to do this as follows:

NWBAttributeSpec(
    name="wasDerivedFrom",
    doc="Describes the hierarchy of sample derivation or aggregation.",
    required=False,
    shape=(None, ),
    dtype=NWBRefSpec(
        reftype="object",
        target_type="Biosample",
    )
),

The refspec part works as expected for a single biosample object, but I am unable to use the shape parameter to create a list of these references in the attribute.

  1. Is this even possible in HDF5?
  2. I am not aware of any place in the schema currently that uses this. Do we want to support it?

I suppose I could store the links in groups and use a MultiContainerInterface for this, so I guess it isn't strictly blocking me.

Is your feature request related to a problem?

No response

What solution would you like?

I would like to be able to store multiple references in a single attribute.

Do you have any interest in helping implement the feature?

Yes, but I would need guidance.

Code of Conduct

@rly
Contributor

rly commented May 10, 2023

It is possible to create an HDF5 attribute containing multiple references.

>>> import h5py
>>> myfile = h5py.File('myfile.hdf5', 'w')
>>> myfile.create_group("test_group")
<HDF5 group "/test_group" (0 members)>
>>> myfile.attrs.create("attr_of_refs", data=[myfile.ref, myfile["test_group"].ref], dtype=h5py.ref_dtype)
>>> myfile[myfile.attrs["attr_of_refs"][0]]
<HDF5 group "/" (1 members)>
>>> myfile[myfile.attrs["attr_of_refs"][1]]
<HDF5 group "/test_group" (0 members)>
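The same idea can be sketched as a standalone script that mirrors the biosample use case: the attribute is written, the file is closed, and the references are resolved again after reopening. The group names (`biosample_a`, `biosample_b`, `biosample_c`) and the temporary file path are hypothetical and only serve the illustration; this shows plain h5py behavior, not anything HDMF does today.

```python
import os
import tempfile

import h5py

# Hypothetical file location; any writable path would do.
path = os.path.join(tempfile.mkdtemp(), "biosamples.h5")

with h5py.File(path, "w") as f:
    # Hypothetical groups standing in for Biosample objects.
    parent_a = f.create_group("biosample_a")
    parent_b = f.create_group("biosample_b")
    child = f.create_group("biosample_c")

    # Store a 1-D array of object references in a single attribute,
    # using the same attrs.create call as in the session above.
    child.attrs.create(
        "wasDerivedFrom",
        data=[parent_a.ref, parent_b.ref],
        dtype=h5py.ref_dtype,
    )

with h5py.File(path, "r") as f:
    # Reading the attribute returns an array of references;
    # indexing the file with a reference resolves it to the target group.
    refs = f["biosample_c"].attrs["wasDerivedFrom"]
    derived_from = [f[ref].name for ref in refs]

print(derived_from)
```

The references survive closing and reopening the file, so the limitation discussed here is in the schema language and HDMF, not in HDF5 itself.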

Could we add this to HDMF? Yes.

Should we? I am concerned about adding complexity to HDMF and the schema language to support a single use case that has another solution. We would also need to support this data type in MatNWB and HDMF-Zarr. There are possible workarounds, but I understand that they are not as elegant:

  1. You can define your Biosample data type to contain one or more links to Biosample data types (though see open issue [Bug]: get_class can't work recursively #794)
  2. You can use a dataset spec instead of an attribute spec. A dataset of references is supported by HDMF and commonly used.

@rly added the labels category: enhancement, priority: low, and topic: extension on Jan 12, 2024