Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement eigh() fallback #493

Open
wants to merge 40 commits into
base: main
Choose a base branch
from

Conversation

veni-vidi-vici-dormivi
Copy link
Collaborator

@veni-vidi-vici-dormivi veni-vidi-vici-dormivi commented Aug 9, 2024

Implement eigh() as a fallback for loglikelihood estimation in covariance localization. Since physically, the covariance matrix can be positive semidefinite we should allow singular true. I also implemented some bug fixes mentioned in #413.

@veni-vidi-vici-dormivi
Copy link
Collaborator Author

We could cherry pick the rng commit to another PR, I just opened the branch for all thing covariance related as a tryout in the beginning.

@veni-vidi-vici-dormivi
Copy link
Collaborator Author

Before I fix the rest of the tests, could you look at it @mathause. I don't think it's pretty what I did to mesmer.utils.minimize_local_discrete but I couldn't think of a better way. maybe it would also be okay to check cholesky for every localization radius if it is rather likely that one where the matrix is pos def is the better one in which case we would only do the estimation with eigh once or max twice.

Copy link
Member

@mathause mathause left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I made a suggestion how we can potentially avoid changeable. It would be nice to have a separate PR about the random generator but I also see that they are intricately linked...

mesmer/core/utils.py Outdated Show resolved Hide resolved
mesmer/stats/_auto_regression.py Outdated Show resolved Hide resolved
mesmer/stats/_localized_covariance.py Outdated Show resolved Hide resolved
mesmer/stats/_localized_covariance.py Outdated Show resolved Hide resolved
@veni-vidi-vici-dormivi
Copy link
Collaborator Author

Note: diff works better with ignore whitespace

Copy link

codecov bot commented Aug 20, 2024

Codecov Report

Attention: Patch coverage is 98.00000% with 1 line in your changes missing coverage. Please review.

Project coverage is 61.51%. Comparing base (e0ca738) to head (234890e).

Files with missing lines Patch % Lines
mesmer/stats/_localized_covariance.py 96.96% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #493      +/-   ##
==========================================
+ Coverage   61.36%   61.51%   +0.14%     
==========================================
  Files          50       50              
  Lines        3572     3588      +16     
==========================================
+ Hits         2192     2207      +15     
- Misses       1380     1381       +1     
Flag Coverage Δ
unittests 61.51% <98.00%> (+0.14%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@mathause mathause left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks that looks good and clean. One last thing we could do (and I leave it up to you): define _ecov_crossvalidation = _EcovCrossvalidation() and rename crossvalidate to __call__, then we can call _ecov_crossvalidation(...) again as before (but then this function is purely internal...)

mesmer/core/utils.py Outdated Show resolved Hide resolved
mesmer/stats/_localized_covariance.py Outdated Show resolved Hide resolved
mesmer/stats/_localized_covariance.py Outdated Show resolved Hide resolved
@mathause
Copy link
Member

Oh and some of the tests emit warnings - can you suppress them? E.g.:

@pytest.mark.filterwarnings("ignore:The slope of")

or in a with pytest.warns(...): block,

Co-authored-by: Mathias Hauser <[email protected]>
@veni-vidi-vici-dormivi
Copy link
Collaborator Author

Stumbled upon #187. Just wanted to mention it here, but it seems like it does not necessarily speak against allowing singular true?

@mathause
Copy link
Member

I think the problem in #187 was that certain folds did not contribute to the nll. We should make sure this doesn't happen. Need to take a look at the code...

* date_range: update freq string

* add freq conditionals
* add mesmer_m example script

* changelog

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* remove unused imports

* adjustments

* replace script with notebook

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* restructure and add some headings

* remove inputs

* add some comments

* typos

* date_range: update freq string (MESMER-group#504)

* date_range: update freq string

* add freq conditionals

* reference: only year as link (MESMER-group#505)

* nits

* more info on time coordinate

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* update comment & clear output

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Mathias Hauser <[email protected]>
Co-authored-by: Mathias Hauser <[email protected]>
@veni-vidi-vici-dormivi
Copy link
Collaborator Author

Another not so nice side effect about all of this is that if we select a singular covariance matrix and use eigh for drawing samples this could lead to slight differences in the emulations between machines...

@veni-vidi-vici-dormivi
Copy link
Collaborator Author

Maybe this is the solution to go for. Also because an ostrich looks very similar to an emu.

@veni-vidi-vici-dormivi
Copy link
Collaborator Author

Implement switch to set that per default, it will error if the matrix is singular

@mathause
Copy link
Member

mathause commented Oct 1, 2024

I've avoided reviewing this because I am unsure what's best (different results on different OSs is annoying but prob. unavoidable, however, if #187 happens it would be bad). Should/ can we wait for #521 so we can easily play with different models / scenarios before deciding here? (I think there were models with singular matrices for Lea, maybe only in CMIP5.)

@veni-vidi-vici-dormivi
Copy link
Collaborator Author

Okay

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Raise LinAlg error in try Cholesky Reconsider try Cholesky approach
2 participants