Add Example for several Scenarios #521
Okay, so when I plot the residuals after removing the global trend (including volcanic forcing), treating the historical members as their own scenario, I get this "mismatch" at the transition from the historical to the projected period. At the moment I would say that this is not pretty, but for the fitting it should be fine as long as we keep treating the historical data as its own scenario for the AR processes. For the linear regressions and the variances/covariances it doesn't matter, since these do not consider time dependency. But I would be happy if a second brain went over this as well, @mathause 🙂 We should point out, however, that in the emulation process one should use a continuous time series to ensure continuity of the realization.
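To illustrate the continuity point, here is a minimal sketch in plain Python. It is not MESMER code: `draw_ar1`, the period lengths, and the AR(1) parameters are all hypothetical and chosen only for illustration. It compares one draw over the whole period with two separate draws for the historical and projected periods.

```python
import random


def draw_ar1(n_steps, phi, sigma, rng, start=0.0):
    """Draw an AR(1) series x[t] = phi * x[t-1] + eps[t], eps ~ N(0, sigma^2)."""
    series = []
    previous = start
    for _ in range(n_steps):
        previous = phi * previous + rng.gauss(0.0, sigma)
        series.append(previous)
    return series


# hypothetical lengths and parameters, for illustration only
N_HIST, N_PROJ = 165, 86
PHI, SIGMA = 0.9, 0.1

# continuous draw: one realization over the whole period
rng = random.Random(0)
continuous = draw_ar1(N_HIST + N_PROJ, PHI, SIGMA, rng)

# separate draws: the projection restarts from 0 and forgets the historical state
rng = random.Random(0)
hist = draw_ar1(N_HIST, PHI, SIGMA, rng)
proj = draw_ar1(N_PROJ, PHI, SIGMA, rng)
separate = hist + proj

# the first projected value differs by exactly the lost memory term phi * hist[-1]
jump = continuous[N_HIST] - separate[N_HIST]
```

With the same seed, both versions agree over the historical period; they differ from the first projected year onward because the separate draw drops the autoregressive memory at the transition.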
Codecov Report
All modified and coverable lines are covered by tests ✅
Coverage diff, main vs. #521:
Coverage: 49.76% → 49.77% (+0.01%)
Files: 50 → 50
Lines: 3563 → 3572 (+9)
Hits: 1773 → 1778 (+5)
Misses: 1790 → 1794 (+4)
Okay, I'm actually surprisingly happy with this approach and impressed by what xarray and datatree can do. I feel like the data tree approach I went for here (one dataset per scenario, with the members along a dimension) is nice. Nevertheless, I want to rewrite the autoregression functions to work on data trees instead of the arg list. For the linear regression and covariance, we could think about implementing functions that take care of the stacking and weighting automatically. Actually, I think this would be quite fun. But I want to focus on MESMER-X for the rest of the week.
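A rough sketch of what such a stacking and weighting helper could do, using a plain dict of lists as a stand-in for the data tree. The function name `stack_and_weight` and the weighting scheme (each sample weighted by one over the number of members in its scenario, so that large ensembles do not dominate) are assumptions for illustration, not the functions or scheme used in this PR.

```python
def stack_and_weight(tree):
    """Stack all members of all scenarios into one sample list with weights.

    `tree` maps a scenario name to a list of members; each member is a list
    of values along time. Assumed scheme: weight = 1 / n_members(scenario).
    """
    samples, weights = [], []
    for scenario, members in tree.items():
        n_members = len(members)
        for member in members:
            for value in member:
                samples.append(value)
                weights.append(1.0 / n_members)
    return samples, weights


# hypothetical toy data: scenario -> members -> values along time
tree = {
    "historical": [[0.1, 0.2, 0.3], [0.0, 0.2, 0.4]],
    "ssp126": [[0.5, 0.6]],
    "ssp585": [[0.7, 0.9], [0.8, 1.0], [0.6, 1.1]],
}

samples, weights = stack_and_weight(tree)
```

Because the stacked samples carry no time ordering, the historical period can sit next to the projections here without any special treatment, which matches the point that the regressions and covariances do not consider time dependency.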
@yquilcaille You can use this now. All functionality should stay the same as it is in here now; only some of the manual data prepping I do will be moved into functions, which needs more time to implement cleanly. One thing: if you calibrate on all the ESMs, could you tell me if you ever run into singular correlation matrices when fitting for the best localization radius? At the moment this should abort the fitting, and we are still debating whether it is worth implementing a version where singular matrices are allowed. Thank you!
Thanks @veni-vidi-vici-dormivi! The "surfer" looks good, no problem to add it. I agree that the preparation of the data should be moved into functions with the future cleaning. Also, some users may benefit from easy wrappers, like one for training, one for emulation.
I will now use this surfer to prepare the training of all ESMs and the emulations for FASTMIP. I promise: if any issue with singular matrices appears, I will let you know :)
Thanks! Cool that this works, and sorry for the late reply. I would like to see some changes before merging, though.
TODO: check the status of datatree in xarray - it would be good if we can enable using it from xarray (
Yes absolutely, have done this locally already, will push it soon. I am currently working on implementing the data tree approach in the repo and moving this into the integration tests.
It is not, because I treat the historical period as a completely independent scenario, i.e. I smooth historical and scenario separately, which leads to different values around the transition from the historical to the future period. As a result, the smoothed global mean differs from Lea's, and so do the residuals and everything thereafter, including all the parameters. What do you think about this? I think it is more elegant, as there is no duplication of the historical period. As I see it, Lea solved this by taking the median over the scenario hists before: mesmer/mesmer/calibrate_mesmer/train_gt.py Lines 100 to 105 in 72d2fd9
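The edge effect can be reproduced with a toy example. The sketch below uses a simple centered moving average as a stand-in for whatever smoother the example actually applies; the function `moving_average` and the numbers are hypothetical. It smooths the historical and projected series separately and jointly, then compares the values near the transition.

```python
def moving_average(values, half_window=2):
    """Centered moving average; the window shrinks at the series edges."""
    smoothed = []
    for i in range(len(values)):
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        window = values[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


# hypothetical global mean series with a steeper trend in the projection
hist = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
proj = [0.8, 1.1, 1.4, 1.7, 2.0]

# historical period treated as its own scenario: smoothed on its own
separate = moving_average(hist) + moving_average(proj)

# historical period concatenated with the scenario before smoothing
joint = moving_average(hist + proj)

# difference at the last historical time step (index len(hist) - 1)
edge_difference = joint[len(hist) - 1] - separate[len(hist) - 1]
```

Away from the transition both versions agree, since the smoothing window never reaches across it. At the last historical step the separate version only sees historical values, so the smoothed global mean, and therefore the residuals derived from it, differ there.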
Agree. Will do.
Will get back to this later
Ah right. If we want different historical variability, we just need different seeds for each scenario, right? So both are possible, depending on the seed?
If this works, we could also add an integration test for this.
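A small sketch of the seed idea, assuming the variability is drawn from a seeded random number generator. `draw_variability` and the scenario-to-seed mapping are hypothetical, and white noise stands in for the actual variability model.

```python
import random


def draw_variability(n_steps, seed, sigma=0.1):
    """Draw white-noise variability from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, sigma) for _ in range(n_steps)]


N_HIST = 5  # hypothetical number of historical time steps

# same seed for every scenario: identical historical variability
same = {scen: draw_variability(N_HIST, seed=0) for scen in ("ssp126", "ssp585")}

# one seed per scenario: different historical variability
seeds = {"ssp126": 0, "ssp585": 1}
different = {scen: draw_variability(N_HIST, seed=seed) for scen, seed in seeds.items()}
```

Reusing one seed reproduces the same historical realization in every scenario, while one seed per scenario gives independent historical variability, so either behaviour can be selected through the seeds.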
CHANGELOG.rst