MRG: Update repository structure #21

mscheltienne · 2022-04-28T17:12:12Z

As per #19, the structure could be something like that.

I chose label as name for the entry-point: adding one for the entire package (mne_icalabel) and one for each model (submodule, only iclabel for now). If you have a better name in mind, please comment 😉
That would bring the public API to:

from mne_icalabel import label
label(raw, ica, method='iclabel')

And

from mne_icalabel.iclabel import label
label(raw, ica)

I added an underscore to all private functions. I chose to keep the feature extraction get_features, the network ICLabelNet, the forward pass run_iclabel public on top of the entry-point label.
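A minimal sketch of how the two entry points above could fit together, assuming the package-level function simply dispatches on `method` to the model-specific one. The `_METHODS` registry and the placeholder `_iclabel_label` are hypothetical names used for illustration, not the code in this PR.

```python
from typing import Any, Callable, Dict


def _iclabel_label(inst: Any, ica: Any) -> str:
    """Placeholder standing in for mne_icalabel.iclabel.label (hypothetical body)."""
    return "iclabel"


# Hypothetical registry mapping a method name to its model-specific entry point.
_METHODS: Dict[str, Callable[[Any, Any], Any]] = {"iclabel": _iclabel_label}


def label(inst: Any, ica: Any, method: str = "iclabel") -> Any:
    """Package-level entry point: dispatch to the requested model."""
    if method not in _METHODS:
        raise ValueError(
            f"Unknown method {method!r}; expected one of {sorted(_METHODS)}."
        )
    return _METHODS[method](inst, ica)
```

With a registry like this, adding a second model would only require registering its entry point under a new key.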

WDYT?


codecov-commenter · 2022-04-28T17:15:01Z

Codecov Report

Merging #21 (5efbd0c) into main (7bf7956) will decrease coverage by 0.49%.
The diff coverage is 93.15%.

@@            Coverage Diff             @@
##             main      #21      +/-   ##
==========================================
- Coverage   97.09%   96.60%   -0.50%     
==========================================
  Files           4        7       +3     
  Lines         379      412      +33     
==========================================
+ Hits          368      398      +30     
- Misses         11       14       +3

| Impacted Files | Coverage Δ |
|---|---|
| mne_icalabel/iclabel/features.py | `96.46% <87.09%> (ø)` |
| mne_icalabel/iclabel/utils.py | `92.39% <87.50%> (ø)` |
| mne_icalabel/__init__.py | `100.00% <100.00%> (ø)` |
| mne_icalabel/iclabel/__init__.py | `100.00% <100.00%> (ø)` |
| mne_icalabel/iclabel/label_components.py | `100.00% <100.00%> (ø)` |
| mne_icalabel/iclabel/network.py | `100.00% <100.00%> (ø)` |
| mne_icalabel/label_components.py | `100.00% <100.00%> (ø)` |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7bf7956...5efbd0c. Read the comment docs.

adam2392 · 2022-04-28T17:23:15Z

Thanks for getting to this right away!

I feel like label might be too generic and not explicit enough. Perhaps label_ic? WDYT?

mscheltienne · 2022-04-28T18:46:59Z

How about label_components since MNE has the methods get_components and plot_components for ICA?

adam2392 · 2022-04-28T18:52:09Z

> How about label_components since MNE has the methods get_components and plot_components for ICA?

I like it!

mscheltienne · 2022-04-28T20:26:55Z

Done, and I added the simplest of test cases, which runs label_components on the MNE raw sample dataset... and it's already failing! 😅

adam2392 · 2022-04-28T20:39:11Z

make isort and make black should do the auto-formatting now with #22.

make run-checks runs all the checks we want (e.g. pydocstyle, isort, flake8, black, check-manifest, and mypy).

Hopefully this tightens up the process and makes life easier. I think I made the MANIFEST accurately reflect what was happening in this PR too, but if not, I can fix it later too.

The docs folder was added, but since this is on Jacob's GitHub, we'll just wait until we migrate the repo to mne-tools to set up the actual CI to build the docs.

mscheltienne · 2022-04-28T21:03:49Z

For the error, it's because autocorrelation ends with a resampling:

# resample to 1 second at 100 samples/sec
resamp = resample_poly(ac.T, 100, np.round(raw.info["sfreq"])).T

The sample dataset sampling rate is 600.614990234375 Hz. resample_poly was complaining that the up-sampling and down-sampling factors must be integers, so I rounded raw.info['sfreq']. But that's not the correct fix: instead of resampling to 100 samples, it now resamples to 99, which makes the reshaping fail at a later stage.

I am not sure how to correctly fix this at the moment, I'll think about it.
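To make the off-by-one concrete, here is a small arithmetic sketch. It assumes that scipy.signal.resample_poly returns ceil(n_in * up / down) samples along the resampled axis, and it only reproduces that length computation. The input length of 601 lags is an assumption chosen for illustration (about one second of data plus lag zero), not a value taken from the code.

```python
def resampled_length(n_in: int, up: int, down: int) -> int:
    """Output length of a polyphase resampling, assumed to be ceil(n_in * up / down)."""
    return -(-(n_in * up) // down)


sfreq = 600.614990234375  # sampling rate of the MNE sample dataset
n_lags = 601  # assumed number of autocorrelation lags (illustrative)

print(resampled_length(n_lags, 100, round(sfreq)))  # down = 601 -> prints 100
print(resampled_length(n_lags, 100, int(sfreq)))  # down = 600 -> prints 101
```

Under these assumptions, rounding to 601 produces one sample fewer than flooring to 600, the same kind of off-by-one as described above.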

adam2392 · 2022-04-28T21:12:44Z

> For the error, it's because autocorrelation ends with a resampling:
>
> # resample to 1 second at 100 samples/sec
> resamp = resample_poly(ac.T, 100, np.round(raw.info["sfreq"])).T
>
> The sample dataset sampling rate is 600.614990234375 Hz. resample_poly was complaining that the up-sampling and down-sampling factors must be integers, so I rounded raw.info['sfreq']. But that's not the correct fix: instead of resampling to 100 samples, it now resamples to 99, which makes the reshaping fail at a later stage.
>
> I am not sure how to correctly fix this at the moment, I'll think about it.

I think (pretty sure) this is an "extreme" edge case, because an imperfect sampling rate usually arises from some machine precision error. E.g. in this case the true sampling rate is 600 Hz.

I'm leaning towards: the correct fix would be for the user to pass it in correctly. So info['sfreq'] is checked when the user calls label_ic, and a custom error message is raised explaining the issue for ICLabel. Then, in the test, we could test two cases:

- the error is raised
- the error is not raised and things "work" when you pass in np.floor(raw.info['sfreq'])

WDYT?
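A minimal sketch of the check proposed above, assuming the validation is a standalone helper. The name `check_integer_sfreq` and the error message are hypothetical, and `math.floor` stands in for `np.floor` to keep the example dependency-free.

```python
import math


def check_integer_sfreq(sfreq: float) -> int:
    """Return sfreq as an int, or raise if it is not integer-valued."""
    if not float(sfreq).is_integer():
        raise ValueError(
            f"ICLabel expects an integer sampling rate, got {sfreq} Hz."
        )
    return int(sfreq)


sfreq = 600.614990234375  # sampling rate of the MNE sample dataset

# Case 1: the error is raised for the raw, non-integer value.
try:
    check_integer_sfreq(sfreq)
except ValueError as err:
    print(f"raised: {err}")

# Case 2: no error once the floored value is passed in.
print(check_integer_sfreq(math.floor(sfreq)))
```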

mscheltienne · 2022-04-29T08:21:18Z

I agree that some error checking is necessary, with some warnings:

- Is the reference a common average? (we need more info than 'custom_ref_applied' provides; there is an issue somewhere on the main repo about storing the reference for EEG datasets)
- Is the sampling frequency an integer? (not sure, but I think a float sfreq would crash in EEGLAB)
- Is the dataset band-pass filtered between 0 and 100 Hz?

and some raises:

- Do we have the same channels in the ICA and in the instance? (ideally, you need to provide the instance used for fit)
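The checks listed above could be sketched roughly as follows. This is an illustration only: the helper name `check_inputs` and its plain-value signature are hypothetical, the band-pass check is interpreted as "low-pass edge at or below 100 Hz" (an assumption), and the common-average check is left out because, as noted, the required information is not available.

```python
import warnings
from typing import Sequence


def check_inputs(
    sfreq: float,
    lowpass: float,
    inst_ch_names: Sequence[str],
    ica_ch_names: Sequence[str],
) -> None:
    """Raise on channel mismatch; warn on a non-integer sfreq or a wide band."""
    # Hard failure: the ICA and the instance must describe the same channels.
    if list(inst_ch_names) != list(ica_ch_names):
        raise ValueError("The ICA and the instance do not have the same channels.")
    # Soft checks: warn, but let the user proceed.
    if not float(sfreq).is_integer():
        warnings.warn(f"The sampling rate ({sfreq} Hz) is not an integer.", RuntimeWarning)
    if lowpass > 100:
        warnings.warn(f"The low-pass edge ({lowpass} Hz) is above 100 Hz.", RuntimeWarning)


# Matching channels, integer rate, 100 Hz low-pass: passes without warnings.
check_inputs(600.0, 100.0, ["Fz", "Cz"], ["Fz", "Cz"])
```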

We'll discuss those bullet points later, but for the frequency, I don't think we should force it to be an integer, because a user with a dataset like the MNE sample cannot easily change the sampling frequency. The field .info['sfreq'] is among the locked fields that should not be tampered with, and it raises if you try to change it directly. Resampling doesn't seem like the way to go either, as it would alter the data... for no real reason.

Instead, I would simply figure out whether we have to floor or ceil the sampling frequency in the autocorrelation function to get the 100 samples. I am not (yet) convinced that flooring in every case is the correct method.

mscheltienne · 2022-04-29T09:27:33Z

I think that fixes it, but it's not super clean.

# the resampling must output an array of shape (components, 101), thus
# respecting '100 < ac.T.shape[0] * 100 / down <= 101'.
down = int(raw.info['sfreq'])
if 101 < ac.shape[1] * 100 / down:
    down += 1
elif ac.shape[1] * 100 / down <= 100:
    down -= 1
resamp = resample_poly(ac.T, 100, down).T

EDIT: Removed the elif statement, because the conversion to int floors, so we should never have to lower the sampling rate further by 1. The only case that can occur is that flooring lowered the value too much, so that the criterion is no longer met.
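A standalone sketch of the simplified rule described in the EDIT, with the resampling itself left out. The helper name `choose_down` and the input lengths used below are assumptions for illustration, not values from the code base.

```python
def choose_down(n_in: int, sfreq: float, up: int = 100) -> int:
    """Pick the integer down-sampling factor following the simplified rule."""
    down = int(sfreq)  # int() floors positive values
    # If flooring made the ratio too large (more than 101 samples), bump down by one.
    if 101 < n_in * up / down:
        down += 1
    return down


# Flooring alone is enough: 601 * 100 / 600 is about 100.17.
print(choose_down(601, 600.614990234375))  # prints 600
# Flooring overshoots: 1011 * 100 / 1000 = 101.1, so the factor is bumped.
print(choose_down(1011, 1000.9))  # prints 1001
```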

adam2392 · 2022-04-29T15:25:00Z

> I agree that some error checking is necessary with some warning:

Agreed w/ all your warning proposals. Shall we merge this and go for it in another PR?

> Do we have the same channel in the ICA and in the instance? (ideally, you need to provide the instance used for fit)

I wonder whether, if someone just passes in the instance, we should run ICA for them by default, i.e. make passing in the ICA instance optional.

> We'll discuss those bullet points later, but for the frequency, I don't think we should force it to be an integer, because a user with a dataset like the MNE sample cannot easily change the sampling frequency. The field .info['sfreq'] is among the locked fields that should not be tampered with, and it raises if you try to change it directly. Resampling doesn't seem like the way to go either, as it would alter the data... for no real reason.

Perhaps we could have an optional argument sfreq=None, which is used to override the sampling rate. I think intelligently flooring/ceiling it is not possible, since one could easily have a sampling rate of, for example, 127.9584 or 128.12040, which actually corresponds to 128 Hz. WDYT of this solution? The con is adding an additional kwarg, but I don't see an easy way around it w/o too much hacking.
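A small sketch of the point above, assuming a hypothetical helper `resolve_sfreq` that implements the proposed sfreq=None override. It prints the floor and ceil of the two example rates next to the overridden value.

```python
import math
from typing import Optional


def resolve_sfreq(info_sfreq: float, sfreq: Optional[float] = None) -> float:
    """Return the user-supplied override if given, else the measured rate."""
    return info_sfreq if sfreq is None else sfreq


for measured in (127.9584, 128.12040):
    print(
        measured,
        math.floor(measured),
        math.ceil(measured),
        resolve_sfreq(measured, sfreq=128),
    )
```

Flooring gives 127 and 128, and ceiling gives 128 and 129, so neither rule applied uniformly recovers 128 for both rates, whereas the explicit override does.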

adam2392 · 2022-04-29T15:27:17Z

mne_icalabel/iclabel/features.py

@@ -7,7 +7,7 @@
 from numpy.typing import NDArray
 from scipy.signal import resample_poly

-from .utils import _next_power_of_2, gdatav4, mne_to_eeglab_locs, pol2cart
+from .utils import _gdatav4, _mne_to_eeglab_locs, _next_power_of_2, _pol2cart


 def get_features(inst: Union[BaseRaw, BaseEpochs], ica: ICA):


I'm thinking maybe we even rename this to get_iclabel_features, since we'll presumably have other models which also have feature engineering which would have get_<model_X>_features.

Yep, good idea.

adam2392 · 2022-04-29T15:28:46Z

mne_icalabel/iclabel/tests/test_label_components.py

+@pytest.mark.filterwarnings("ignore::RuntimeWarning")
+@pytest.mark.filterwarnings("ignore::FutureWarning")


I will clean this up in the next PR.

Much appreciated!

adam2392 · 2022-04-29T15:32:04Z

Going to merge this for the sake of working on the docs to address Alexandre Gramfort's issues.

We can continue discussing some of the points I raised here and then handle in a downstream PR.

mscheltienne · 2022-04-29T15:41:41Z

For the warnings, yes I will add those later on.
For the optional ICA, sounds good to me! But we will have to document which settings we use by default.

For the sampling rate, I think intelligent flooring/ceiling is possible, but your example is a very good point and I suspect it will fail with the current code. I am not that convinced by the additional argument, which needlessly complicates the API.
I will add a test case for the resampling alone, and hopefully a solution that will convince both of us that correct flooring/ceiling is possible :)

adam2392 · 2022-04-29T16:11:41Z

Sounds good! FYI the repo is moved to my GitHub for now so I can work on the CI and docs before doing another migration to mne.tools :p

You'll have to reset your remote URL

git remote remove <current one>
then
git remote add <name> <new one>

Also renamed the repository to mne-icalabel to fit the naming scheme in mne.tools.

FYI: @anandsaini024 @mscheltienne @jacobf18 ^ you will need to reset your Git remote config. Sorry, but we will need a bit of this book-keeping until we officially migrate to the mne org.

mscheltienne · 2022-04-29T16:47:26Z

Perfect, don't worry about the move(s) ;)

mscheltienne added 3 commits April 28, 2022 18:55

more iclabel to a 'icabel' submodule and add '_' to all private funct…

641b6a1

…ion/class

add entrypoints to iclabel and to mne_icalabel

7fb7f91

black

e762c5e

mscheltienne mentioned this pull request Apr 28, 2022

Tidy Manifest #20

Closed

fix missed

156163f

mscheltienne added 2 commits April 28, 2022 21:37

rename to label_components

8d9c9ce

add tests

4474b5b

mscheltienne added 2 commits April 28, 2022 22:27

fix resample_poly that requires up and down as integers

8f36110

sort imports with isort

f95e8da

adam2392 mentioned this pull request Apr 28, 2022

[DOC] Adding doc folder and tightening makefile recipes for checking style #22

Merged

6 tasks

mscheltienne added 2 commits April 28, 2022 22:52

Merge branch 'main' into structure

4c62518

run sort and black

b6a6856

fix for resampling

2596739

run black

bb1584d

mscheltienne changed the title ~~Update repository structure~~ MRG: Update repository structure Apr 29, 2022

simpler

5efbd0c

adam2392 reviewed Apr 29, 2022

adam2392 approved these changes Apr 29, 2022

adam2392 merged commit 32edd7d into mne-tools:main Apr 29, 2022

mscheltienne deleted the structure branch April 29, 2022 15:41

mscheltienne mentioned this pull request Apr 29, 2022

Improve testing for resampling in the autocorrelation feature #26

Merged
