OOS and pct_change coal mine disaster #56

Open
waudinio27 opened this issue Jan 30, 2023 · 4 comments

Comments

@waudinio27

waudinio27 commented Jan 30, 2023

Hello Osvaldo!

I am trying out BART and in my opinion it shines like a crown. Something like a real PyMC jewel :-D

You say at the end of the notebook that one needs to detrend. All of this is hard for me; I am not a very good programmer.

Could you show, with the coal mine disaster data or some other example, how to do out-of-sample predictions? Not with a train/test split but true OOS, maybe 10 steps ahead of the data, with no extra features. I do not need train/test; I see that this is working.

Also, could you show how to make the series stationary and afterwards reverse the process with an inverse transform and plot the final result, to make everything complete? Or would you take out the trend with a polynomial fit? This is done with LightGBM as well, which is why it is very popular. It was dominant in the M5 competition.

https://towardsdatascience.com/xgboost-for-timeseries-lightgbm-is-a-bigger-boat-197864013e88

Would be extremely helpful!

@juanitorduz
Contributor

Hey! An easy way to detrend the series is to take first differences, as described in https://otexts.com/fpp3/stationarity.html
To transform the series back you can take a cumulative sum.
Regarding forecasting with tree-based models, I guess you could prepare the data set via a reduction approach, which means creating a design matrix from the time series. Maybe you could use some tools from sktime (see here) and simply use the BART model as described in the notebook, where X is now the reduced (i.e. wrapped) time series. It is interesting that the article you shared uses linear models in the nodes, which is related to #51
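
As a rough illustration of the reduction idea above, here is a minimal NumPy sketch that turns a series into a lagged design matrix. The helper name `make_lag_matrix`, the choice of three lags, and the example counts are all made up for illustration; sktime provides its own tooling for this, which is not shown here.

```python
import numpy as np


def make_lag_matrix(series, n_lags):
    """Return (X, y) where each row of X holds the n_lags values that precede y."""
    series = np.asarray(series, dtype=float)
    if n_lags < 1 or n_lags >= len(series):
        raise ValueError("n_lags must be between 1 and len(series) - 1")
    # Row i is series[i : i + n_lags]; the last window is dropped because
    # it has no following value to use as a target.
    X = np.lib.stride_tricks.sliding_window_view(series, n_lags)[:-1]
    y = series[n_lags:]
    return X, y


counts = np.array([4, 5, 4, 1, 0, 4, 3, 4])  # made-up yearly counts
X, y = make_lag_matrix(counts, n_lags=3)
print(X.shape, y.shape)
```

`X` could then stand in for the design matrix passed to the model and `y` for the observed values, so the model learns to predict each value from the ones before it.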

I hope this helps :)

@waudinio27
Author

waudinio27 commented Feb 1, 2023

Hello Juan! Thank you for your reply. I can make the series stationary and do an inverse transform, starting from the last known real value, with a cumulative sum. But done just like this, I will lose a lot of structural information.

The PyMC team is fantastic, with people from Europe, South America and Asia, but forecasting with the program remains a big issue. More should be invested in UI and UX design and in examples that are easy to set up. Otherwise, people who do not want to go too deep into it will stay with AutoARIMA, Facebook Prophet or LightGBM. I stayed away from PyMC for a while because of this, came back out of curiosity, and got excited when I saw BART and Structural AR.

BART would be competitive with the trend alone, and even more so with seasonality added, but it will probably take time until this happens. If somebody wants to predict future river continuum or warehouse stock, the posterior is simply not enough. I will think about your idea of the design matrix and the reduction approach as a workaround. I will need time to judge whether this could be a way forward, because I do not know about overfitting in this case.
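
For reference, a minimal sketch of the differencing round trip described above, using NumPy only. The series and the forecast differences are made-up numbers; in practice the forecast differences would come from whatever model was fit on the differenced data.

```python
import numpy as np

y = np.array([10.0, 12.0, 15.0, 14.0, 18.0])  # made-up observed series
dy = np.diff(y)  # first differences: y[t] - y[t-1]

# Inverting the in-sample differences needs the first observed level.
y_rebuilt = np.concatenate([[y[0]], y[0] + np.cumsum(dy)])

# Hypothetical forecasts of the next three differences.
dy_forecast = np.array([1.0, -2.0, 3.0])

# Out-of-sample levels start from the last known observed value.
y_forecast = y[-1] + np.cumsum(dy_forecast)
print(y_forecast)
```

Because every forecast level is built on the previous one, errors in the forecast differences accumulate over the horizon.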

Best regards and greetings
Matthias

@waudinio27
Author

Here is an easy package for detrending and reverting the transform. I just saw it today and thought it would fit the discussion.

https://medium.com/towards-data-science/time-series-transformations-and-reverting-made-easy-f4f768c18f63

waudinio27 reopened this Mar 15, 2023
@waudinio27
Author

waudinio27 commented Mar 15, 2023

Dear Juan, you were right from the start. The data has to be prepared with the design matrix from sktime, or with a hand-written function that builds the lags for the trend.

I have adapted the coal mine model to use mutable data:

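# Mutable data goes in for x_data; after that you only need to plug in x_test below and off it goes ...
import pymc as pm
import pymc_bart as pmb

# x_data, y_data and RANDOM_SEED are defined earlier, as in the notebook.
with pm.Model() as model_coal:
    # Wrap the predictors in MutableData so they can be swapped after sampling.
    μ_ = pmb.BART("μ_", pm.MutableData("X", x_data), Y=y_data, m=20)
    # BART output is unconstrained, so take the absolute value to get a valid Poisson rate.
    μ = pm.Deterministic("μ", pm.math.abs(μ_))
    y_pred = pm.Poisson("y_pred", mu=μ, observed=y_data)
    idata_coal = pm.sample(random_seed=RANDOM_SEED)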

But I am not able to make out-of-sample predictions after training on the whole data.
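
One common way to get true multi-step forecasts from a model trained on lagged values is to predict one step, append the prediction to the lag window, and repeat. The sketch below shows only that loop, in plain NumPy. The function `recursive_forecast` and the stand-in predictor `mean_of_lags` are hypothetical names, and the mean of the lags is a placeholder, not BART. With the PyMC model, the one-step predictor would presumably be the posterior predictive mean obtained by swapping `X` with `pm.set_data` and calling `pm.sample_posterior_predictive`; as far as I know the likelihood also needs its shape tied to `μ` (for example `shape=μ.shape`) so that the new rows are accepted, but that part is an assumption and is not shown here.

```python
import numpy as np


def recursive_forecast(history, n_lags, steps, predict_one_step):
    """Forecast `steps` values ahead, feeding each prediction back in as a lag."""
    window = list(np.asarray(history, dtype=float)[-n_lags:])
    forecasts = []
    for _ in range(steps):
        # One design-matrix row holding the most recent n_lags values.
        x_next = np.array(window[-n_lags:]).reshape(1, -1)
        y_next = float(predict_one_step(x_next))
        forecasts.append(y_next)
        window.append(y_next)  # the prediction becomes the newest lag
    return np.array(forecasts)


def mean_of_lags(x_row):
    """Stand-in predictor (an assumption): average of the lagged values."""
    return x_row.mean()


history = np.array([4, 5, 4, 1, 0, 4, 3, 4])  # made-up yearly counts
forecast = recursive_forecast(history, n_lags=2, steps=3, predict_one_step=mean_of_lags)
print(forecast)
```

Only the last `n_lags` observed values are needed to start the loop, so the model can be trained on the whole series first.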

I want to do the same for the quantile regression notebook as well, which is a great notebook with great ideas.

Best regards

Matthias
