Specify format for experimenter name #528
Comments
An experimenter could have multiple roles in a project as well. See https://www.nature.com/nature-index/news-blog/researchers-embracing-visual-tools-contribution-matrix-give-fair-credit-authors-scientific-papers
This raises a separate issue of how much we want to include best practice suggestions in the NWB schema docs themselves. I think once a best practice suggestion is relatively stable, it should be mentioned in both the schema docs and the API docs. That may be tedious though.
I noticed this part as well. It came up back in the Best Practice discussion here but nothing was resolved, mainly because the DANDI metadata has a much richer schema for specifying all experimenter-related information, for example a contributor (which associates a role as an attribute) and a person (which includes associating people with an institution). Given everything else you point out about how strict the regex is for the name, I do think we should at the very least go with option (d) and remove the bit of text 'Can also specify roles of different people involved.' from the NWB schema instead of contriving an additional way of including that in the free-text string. Aside from that, I think the best way all-around would be similar to (c). (a) and (b) both feel like band-aids that wouldn't generalize to additional fields; the best solution would be to define a separate schema type for an actual Now, I know how to type something like that up in JSON but not so much in the NWB schema language...
To which I'd again mimic the DANDI structure (
This is really the big question at the heart of this. We can patch the Currently, I'd say the DANDI schema is much more strict both in its structure and in the form it imposes on content; they have regexes for just about everything. Whereas it's my impression that the goal of the NWB schema was always to not be 'too strict', so that people can use it in whatever way(s) they wish rather than for the explicit purpose of one day ending up on the archive. I know @bendichter's biggest concern about these things, and the motivation for making the NWB Inspector a separate tool from the core NWB stuff, is that we don't want to build too high of a 'wall'/'barrier' for people just trying to make a minimally working NWB file. Not every NWB file is created with the explicit purpose of passing DANDI validation (though any automatically generated via NeuroConv should be, in principle). If we want to change that philosophy, people are going to have to start doing a lot more work to insert and conform metadata to tighter standards just to create their first/initial NWB file.
The current schema for experimenter name:
nwb-schema/core/nwb.file.yaml
Lines 171 to 179 in b22fdb7
This makes the experimenter name impossible to parse by machines.
DANDI requires that the experimenter name match the regex pattern:
https://github.com/dandi/dandi-schema/blob/master/dandischema/models.py#L60
i.e., it has to be of the form `LastName, FirstName`.
Or more specifically, `[characters from set A], [characters from set A]`, where A = {a, b, ..., z, A, B, ..., Z, 0, 1, ..., 9, _, -, ., space}
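To make the constraint concrete, here is a minimal sketch of a check built only from the character set A described above. The pattern is an assumption reconstructed from that description; it is not copied from dandischema and may differ from the real one in details. The function name `split_experimenter_name` is hypothetical.

```python
import re

# Assumption: character class rebuilt from set A as described in this issue
# (letters, digits, underscore, hyphen, period, space).
NAME_CHARS = r"[A-Za-z0-9_\-. ]+"
NAME_PATTERN = re.compile(rf"({NAME_CHARS}), ({NAME_CHARS})")


def split_experimenter_name(name):
    """Return (last, first) if the name has the 'Last, First' form, else None."""
    match = NAME_PATTERN.fullmatch(name)
    if match is None:
        return None
    return match.group(1), match.group(2)
```

With a check like this, a tool importing names from NWB files could tell parseable entries from free-text ones before storing them.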
The NWB Inspector performs the same check and documents this as a best practice:
NeurodataWithoutBorders/nwbinspector#227
NeurodataWithoutBorders/nwbinspector#33
see also NeurodataWithoutBorders/nwbinspector#253
I think we should
Some options for 2:
a) change `experimenter` from a dataset of shape (N, ) to shape (N, 2) and make the first column be the experimenter name and the second column be the role (a breaking change)
b) add an optional separate dataset for `experimenter_role` that is aligned with `experimenter`
c) add an optional attribute on the `experimenter` dataset called `experimenter_role` that is aligned with `experimenter`
d) remove the suggestion that the role can be specified
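As a rough illustration of how the data would look under these options, the sketch below uses plain Python lists rather than the NWB schema language or any NWB API. The example names and roles are made up, and the helper names `pair_roles` and `to_two_column` are hypothetical.

```python
# Hypothetical example data; names and roles are invented for illustration.
experimenter = ["Doe, Jane", "Smith, John"]
experimenter_role = ["data collection", "analysis"]


def pair_roles(names, roles):
    """Options (b)/(c): roles live apart from names and are aligned by index."""
    if len(names) != len(roles):
        raise ValueError("experimenter_role must align with experimenter")
    return list(zip(names, roles))


def to_two_column(names, roles):
    """Option (a): one (N, 2) table whose rows are [name, role]."""
    return [[name, role] for name, role in pair_roles(names, roles)]
```

Under (b) and (c), existing readers of `experimenter` keep working and the alignment has to be validated separately; under (a), the pairing is structural but the shape change breaks existing readers.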
This comes up as https://github.com/LorenFrankLab/spyglass seeks to store names and import experimenter names from NWB files into its database.
@bendichter @CodyCBakerPhD @oruebel what do you think?