The replication issues with the downscaling task. #37

Open
Tttizi opened this issue Dec 18, 2023 · 7 comments

Tttizi commented Dec 18, 2023

  1. Following the publicly available code on GitHub for the downscaling task, I could not reproduce the performance reported in the paper. Specifically, I obtained a Root Mean Squared Error (RMSE) of 6.08 for T2m, whereas the paper reports 2.79. Are there key points I should be aware of that would explain this discrepancy?

  2. I noticed some discrepancies between the paper and the provided code, such as the learning rate setting. Despite trying various combinations, I have been unable to reproduce the reported results. I would appreciate your advice on this.

  3. Which pre-trained model should I use: the 1.40625-degree model? I ran into some confusion during my attempts and would appreciate your guidance.

Collaborator

tung-nd commented Dec 21, 2023

Hi, thank you for your interest in ClimaX. My answers to your questions:

  1. Regarding the discrepancies between the paper and the code: can you elaborate on what the differences are?
  2. Regarding the pre-trained model: yes, you should use the 1.40625deg model. What issues did you run into when trying to use it?

Author

Tttizi commented Dec 22, 2023

Thank you very much for your response. I have noticed three differences between the paper and the code:

  1. The paper gives a learning rate of 5e-5 for the downscaling task, while the code sets it to 5e-4.
  2. The paper does not explicitly mention the warmup setting, but the code appears to use a warmup of more than 5 epochs.
  3. The paper states that you trained separate networks for different features, while the code predicts these features together.

I have tried adjusting these settings, but the performance is still unsatisfactory. Could you provide more details on the settings used for each feature, or more detailed guidance on how to reproduce the results from the paper?
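To make the learning-rate and warmup differences described in the comment above concrete, here is a minimal sketch of a generic linear-warmup plus cosine-annealing schedule. It is not the repository's scheduler: the function name `lr_at_epoch`, the 5-epoch warmup, and the 50-epoch training length are assumptions chosen for illustration. Only the two base learning rates (5e-5 and 5e-4) come from the thread.

```python
import math


def lr_at_epoch(epoch, base_lr, warmup_epochs, max_epochs,
                warmup_start_lr=0.0, eta_min=0.0):
    """Generic linear warmup followed by cosine annealing (illustrative only)."""
    if warmup_epochs > 0 and epoch < warmup_epochs:
        # Linear ramp from warmup_start_lr up to base_lr.
        frac = epoch / warmup_epochs
        return warmup_start_lr + frac * (base_lr - warmup_start_lr)
    # Cosine decay from base_lr down to eta_min over the remaining epochs.
    progress = (epoch - warmup_epochs) / max(1, max_epochs - warmup_epochs)
    return eta_min + 0.5 * (base_lr - eta_min) * (1.0 + math.cos(math.pi * progress))


# Hypothetical settings: 5 warmup epochs and 50 epochs in total.
WARMUP_EPOCHS = 5
MAX_EPOCHS = 50

for epoch in (0, 2, 5, 25, 50):
    paper_lr = lr_at_epoch(epoch, 5e-5, WARMUP_EPOCHS, MAX_EPOCHS)  # value quoted from the paper
    code_lr = lr_at_epoch(epoch, 5e-4, WARMUP_EPOCHS, MAX_EPOCHS)   # value quoted from the code
    print(f"epoch {epoch:2d}: paper lr = {paper_lr:.2e}, code lr = {code_lr:.2e}")
```

Under a schedule of this shape, the two settings differ by a constant factor of 10 wherever the rate is nonzero, so logging the effective learning rate per epoch is one way to confirm which configuration a given run actually used.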

Author

Tttizi commented Dec 22, 2023

I have another question regarding the data. There are two issues with the data provided at the Hugging Face link. First, it lacks the variables "10_m_u_component_of_wind" and "10_m_v_component_of_wind". Second, the data does not appear to match the WeatherBench dataset. Since there are no timestamps, I extracted one day of data and compared it against the WeatherBench data for the same year, but I could not find a match.
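The comparison described above, searching a reference year for the time step that matches an untimestamped field, could be sketched roughly as follows. This is a pure-Python illustration under stated assumptions: the helper names `max_abs_diff` and `find_best_match` are hypothetical, each field is represented as a flattened list of floats, both datasets are assumed to share the same grid and units, and the numbers are toy values rather than real data.

```python
def max_abs_diff(a, b):
    """Largest absolute element-wise difference between two flattened fields."""
    if len(a) != len(b):
        raise ValueError("fields must have the same number of grid points")
    return max(abs(x - y) for x, y in zip(a, b))


def find_best_match(sample, candidates):
    """Return (index, distance) of the candidate field closest to `sample`."""
    best_index, best_dist = None, float("inf")
    for i, field in enumerate(candidates):
        dist = max_abs_diff(sample, field)
        if dist < best_dist:
            best_index, best_dist = i, dist
    return best_index, best_dist


# Toy reference "year" with three time steps of a three-point field (made-up values).
reference_steps = [
    [280.0, 281.5, 279.2],
    [282.1, 283.0, 280.4],
    [275.3, 276.8, 274.9],
]
sample_field = [282.1, 283.0, 280.4]

index, distance = find_best_match(sample_field, reference_steps)
print(f"closest reference step: {index}, max abs difference: {distance}")
```

If the released files were normalized, regridded, or stored in different units, an exact match would not be expected even for the correct time step, so the smallest distance is more informative than a strict equality check.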

Author

Tttizi commented Jan 5, 2024

Hi, just wanted to check if there have been any updates on this issue.

Author

Tttizi commented Jan 26, 2024

Hi, just wanted to check if there have been any updates on this issue.

Author

Tttizi commented Feb 1, 2024

I noticed that the network initialization in the code includes a variable called land_sea_mask. Is this variable used in the downscaling task, and where is its data obtained from?


Escape142 commented May 22, 2024

@tung-nd, is Cli-ViT from the ClimaX paper the same as the ViT in the ClimateLearn paper?
