fit method fail for Timezone aware timeseries #871

Closed
hdattada opened this issue Jul 5, 2024 · 1 comment · Fixed by #872 or #876
hdattada commented Jul 5, 2024

Describe the bug
We currently use the ETS and DLT forecasters for our time series forecasting. When we pass a dataframe whose datetime column is timezone-aware, the fit method fails. The root cause is this line in orbit/utils/general.py (is_ordered_datetime): `return np.all(np.diff(array).astype(float) > 0)`

For a naive datetime series, numpy.diff returns a timedelta64 array, whereas for a timezone-aware series it returns an object array of pandas Timedelta objects. Hence the cast of the diff to float succeeds for the former and fails for the latter.
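The difference described above can be checked with a small sketch. It only uses pandas and NumPy, not orbit, and the exact dtypes printed may vary with the installed pandas version (the report was made against pandas 1.5.3), so no expected output is shown. The variable names are illustrative.

```python
import numpy as np
import pandas as pd

naive = pd.date_range("2021-01-01", periods=5, freq="D")
aware = pd.date_range("2021-01-01", periods=5, freq="D", tz="UTC")

# np.diff converts its input to a NumPy array before subtracting neighbours.
naive_diff = np.diff(naive)
aware_diff = np.diff(aware)

print(naive_diff.dtype)
print(aware_diff.dtype)

# The naive case is a timedelta64 array, which NumPy can cast to float.
naive_ok = bool(np.all(naive_diff.astype(float) > 0))

# The timezone-aware case is expected to fail on the cast, as in the report.
try:
    aware_ok = bool(np.all(aware_diff.astype(float) > 0))
except TypeError as exc:
    aware_ok = None
    print("cast failed:", exc)
```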

Any workaround to get over this issue is appreciated. Thank you!
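One possible workaround, sketched here as an assumption rather than the fix that landed in the linked pull requests, is to drop the timezone from the date column before calling fit. The column names `ds` and `y` are hypothetical, and the sketch converts to UTC first so that the resulting naive timestamps stay on a single consistent clock.

```python
import numpy as np
import pandas as pd

# Hypothetical training frame with a timezone-aware date column.
df = pd.DataFrame(
    {
        "ds": pd.date_range("2021-01-01", periods=5, freq="D", tz="UTC"),
        "y": [1.0, 2.0, 3.0, 4.0, 5.0],
    }
)

# Work on a copy: normalize to UTC, then strip the timezone information.
df_naive = df.copy()
df_naive["ds"] = df_naive["ds"].dt.tz_convert("UTC").dt.tz_localize(None)

# The same expression used by the failing check now operates on timedelta64.
ordered = bool(np.all(np.diff(df_naive["ds"].to_numpy()).astype(float) > 0))
print(ordered)
```

The naive copy would then be passed to the forecaster in place of the original dataframe. Forecast dates come back naive as well, so the timezone has to be re-applied afterwards if it is needed downstream.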

To Reproduce
Steps to reproduce the behavior:

import pandas as pd
from orbit.utils.general import is_ordered_datetime

df_tz_aware = pd.date_range("2021-01-01", periods=5, freq="D", tz="UTC")

print(is_ordered_datetime(df_tz_aware))

Expected behavior
The expected output for the above series is True, but the function throws the error below:

  File "orbit_ets.py", line 37, in orbit_ets_forecast
    ).fit(historic_data_df)
  File "python3.10/site-packages/orbit/forecaster/map.py", line 23, in fit
    super().fit(df, **kwargs)
  File "python3.10/site-packages/orbit/forecaster/forecaster.py", line 143, in fit
    self._validate_training_df(df)
  File "python3.10/site-packages/orbit/forecaster/forecaster.py", line 285, in _validate_training_df
    if not is_ordered_datetime(date_array):
  File "python3.10/site-packages/orbit/utils/general.py", line 18, in is_ordered_datetime
    return np.all(np.diff(array).astype(float) > 0)
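For comparison with the failing line in the traceback above, here is a minimal sketch of an ordering check that avoids the float cast entirely. It is not the change made in the linked pull requests; the function name `is_ordered_datetime_tz_safe` is hypothetical, and it treats "ordered" as strictly increasing, which matches the `> 0` comparison in the original line.

```python
import pandas as pd


def is_ordered_datetime_tz_safe(array):
    """Return True if the datetimes are strictly increasing.

    Works for both naive and timezone-aware input because the comparison
    is done by pandas rather than by casting differences to float.
    """
    index = pd.DatetimeIndex(array)
    # Strictly increasing means non-decreasing with no repeated values.
    return bool(index.is_monotonic_increasing and index.is_unique)


aware = pd.date_range("2021-01-01", periods=5, freq="D", tz="UTC")
naive = pd.date_range("2021-01-01", periods=5, freq="D")

print(is_ordered_datetime_tz_safe(aware))
print(is_ordered_datetime_tz_safe(naive))
print(is_ordered_datetime_tz_safe(aware[::-1]))
```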

Environment (please complete the following information):

  • OS: macOS
  • Python Version: 3.10.6
  • Versions of Major Dependencies (pandas, scikit-learn, cython): pandas:1.5.3, scikit-learn:1.1.3, orbit-ml:1.1.4.2

Additional context
An issue was already raised with NumPy (numpy/numpy#26838). The determination was that casting Timedelta to float will not work because NumPy is unaware of pandas types.

hdattada added the bug label Jul 5, 2024
swotai linked a pull request Jul 8, 2024 that will close this issue
swotai mentioned this issue Jul 8, 2024
swotai reopened this Jul 10, 2024
swotai self-assigned this Jul 10, 2024
hdattada (Author) commented

Thanks a ton @swotai for fixing this promptly. When can I expect a new release for this fix?
