new neurodata_type: TTL pulse #306

bendichter · 2019-09-09T18:59:59Z

It would be nice to have an intuitive and machine-readable neurodata_type for TTL pulses. Something like

neurodata_type_def: TTL
neurodata_type_ind: TimeSeries
datasets:
  - name: data
    shape:
      - None
    dims:
      - pulses
    dtype: int
 - name: keys
   shape:
     - None
   dims:
     - unique_labels
   dtype: int
 - name: labels
   shape:
     - None
   dims:
     - unique_labels
   dtype: text

bendichter · 2019-09-09T19:03:41Z

@rly when you return I'd be interested in your thoughts

t-b · 2019-09-09T21:08:14Z

Jeep, that could be useful. For what would you use the labels and keys?

bendichter · 2019-09-10T05:20:45Z

The labels would give you a human readable description of what the events were. The keys would map those labels to values

t-b · 2019-09-10T13:09:30Z

@bendichter But you could use epochs for that as well or?

ajtritt · 2019-09-10T15:05:04Z

Sounds like you’re describing a CV, which is usually represented with a table. I’d recommend just using a DynamicTable for the mapping. And the data could be stored as a DynamicTableRegion. I’m not sure how that would play with the TimeSeries.data tho

bendichter · 2019-09-10T21:28:20Z

what do you mean by CV?

ajtritt · 2019-09-10T21:29:29Z

controlled vocabulary.

bendichter · 2019-09-10T21:32:56Z

Oh I see. Yeah very similar but I think the one difference is that the actual value of the int is important to the labs as well as the label.

ajtritt · 2019-09-10T22:20:01Z

That might not be a CV then. Are the values of the pulses enumerable?

bendichter · 2019-09-11T08:52:29Z

Enumerable? The ints are chosen by the researcher to signal specific events. There is a finite number of them and they generally are not sequential.

rly · 2019-09-12T19:18:01Z

Although TTL pulses are a sequence of events in time, I do not think inheriting this type from TimeSeries is most appropriate here. TTL pulses are not generic time series measurements (the data don't have units) and the timestamps are almost never uniformly sampled in time. It would be more appropriate to use or alias the LabeledEvents type https://github.com/NeurodataWithoutBorders/nwb-schema/pull/302/files which is similar to what you have here ("keys" = "labels", "labels" = "label_map")

In my experience, users sometimes work with all of the TTL pulses together in order of time, but more often they separate the pulses by value and work with only a subset of them, often in isolation (peri-stimulus time histogram), and sometimes together in order. Those use cases would be best served by two different storage options:

as @bendichter described, store a single array of N (timestamp, value) pairs
store K 1-D arrays of timestamps, where K is the number of unique pulse values.

For Option 1, if the user wants to get the data in the form of Option 2 (i.e. split the array by value into K 1-D arrays), a naive implementation takes O(NK) time - I'm not sure this can be done faster.

For Option 2, if the user wants to get the data in the form of Option 1 (i.e. create a single array of all of the pulse values in order of time), then they would have to do a K-way merge sort of the K 1-D arrays, which is O(N log K) time.

Given that difference in efficiency (at the limit) and that I think working with only one or a subset of pulse values is more common than working with all pulse values in order, I think Option 2 would be better.

ajtritt · 2019-09-12T19:44:10Z

store K 1-D arrays of timestamps, where K is the number of unique pulse values.

Could we store this in the same way we store spike units?

rly · 2019-09-12T23:04:02Z

Yes, a DynamicTable would work, but for some reason, it feels less intuitive to me (same for the units table). It's a little overkill, but it would organize the TTL pulses together. Granted, the user does not need to know (and to some degree should not need to know) the low-level details of how the data is structured.

Ultimately, I think that TTL pulses, spike unit times, and events in general, could, and probably should, be stored in the same way as each other. So, yes, option 3:

Store TTL pulses as a DynamicTable with indexed column pulse_times, and normal column label.

bendichter · 2019-09-13T08:01:53Z

I worry about the proliferating use of DynamicTables, because it sort of side-steps the whole point of NWB:N, which is to provide good standards for meta-data. DynamicTable is basically an open door that allows you to store any type of data you want with minimal constraints. Any open door will lead to the temptation to use it for every datatype that is not currently supported, and I worry about the long term effects of that. This is useful for purely-meta-data structures like trials, where this flexibility is important and adherence to a strict standard is less important, but I worry when they are used for annotating acquired data. When they are used to annotate data, DynamicTables on their own are invariably insufficient for the chosen type of data and we end up having to compensate with hacks. Even with Units there is an issue with storing the sampling rate of the waveforms and with linking individual spikes to snippets. The only way to resolve this flexibly would be to allow DTs to have global attributes, and at that point they would not enforce any structure at all!

I agree that we need a way to express time series objects that are not measurements and are unitless. I think the proposed Events and/or LabelledEvents addresses this problem nicely, and I like that it generalizes to cases outside of TTL pulses because this has come up a few times and will continue to be needed. The one issue I have with the proposed extension is the names of the datasets. To me, label should be a string, i.e. the value of the dictionary, not the key.

Then there is the question of whether we should store a one object that captures all TTL pulses or several objects that each store a TTL pulse event type. This comes down to whether we want to make the data closer to what is acquired or closer to the semantic meaning/how it will be used an analysis. In this case I vote to make it closer to acquired with LabelledEvents because I imagine TTL pulses going in /acquisition. If users want to pull out specific events, we could encourage them to do that with Events objects in a processing module.

rly · 2019-10-11T01:00:47Z

@bendichter Coming back to this, when you wrote:

When they are used to annotate data, DynamicTables on their own are invariably insufficient for the chosen type of data and we end up having to compensate with hacks. Even with Units there is an issue with storing the sampling rate of the waveforms and with linking individual spikes to snippets. The only way to resolve this flexibly would be to allow DTs to have global attributes, and at that point they would not enforce any structure at all!

What do you mean by global attributes? I think it is fine for subtypes of DynamicTables to define attributes that apply to all rows of the table.

rly · 2019-10-11T01:06:07Z

What is the issue with linking individual spikes to snippets? I saw a bunch of issues/PRs on UnitSeries and it looked like a reasonable solution was agreed upon #248

rly mentioned this issue Oct 4, 2019

Add Events type #301

Open

stephprince added category: proposal proposed enhancements or new features priority: medium non-critical problem and/or affecting only a small set of NWB users labels May 13, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

new neurodata_type: TTL pulse #306

new neurodata_type: TTL pulse #306

bendichter commented Sep 9, 2019

bendichter commented Sep 9, 2019

t-b commented Sep 9, 2019

bendichter commented Sep 10, 2019

t-b commented Sep 10, 2019

ajtritt commented Sep 10, 2019

bendichter commented Sep 10, 2019

ajtritt commented Sep 10, 2019

bendichter commented Sep 10, 2019

ajtritt commented Sep 10, 2019

bendichter commented Sep 11, 2019 •

edited

Loading

rly commented Sep 12, 2019 •

edited

Loading

ajtritt commented Sep 12, 2019 •

edited

Loading

rly commented Sep 12, 2019

bendichter commented Sep 13, 2019

rly commented Oct 11, 2019

rly commented Oct 11, 2019

new neurodata_type: TTL pulse #306

new neurodata_type: TTL pulse #306

Comments

bendichter commented Sep 9, 2019

bendichter commented Sep 9, 2019

t-b commented Sep 9, 2019

bendichter commented Sep 10, 2019

t-b commented Sep 10, 2019

ajtritt commented Sep 10, 2019

bendichter commented Sep 10, 2019

ajtritt commented Sep 10, 2019

bendichter commented Sep 10, 2019

ajtritt commented Sep 10, 2019

bendichter commented Sep 11, 2019 • edited Loading

rly commented Sep 12, 2019 • edited Loading

ajtritt commented Sep 12, 2019 • edited Loading

rly commented Sep 12, 2019

bendichter commented Sep 13, 2019

rly commented Oct 11, 2019

rly commented Oct 11, 2019

bendichter commented Sep 11, 2019 •

edited

Loading

rly commented Sep 12, 2019 •

edited

Loading

ajtritt commented Sep 12, 2019 •

edited

Loading