Cannot reproduce the reported results #1

Open
qujundong opened this issue Dec 25, 2020 · 6 comments
Comments

@qujundong

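Hi, I have a question. I ran both your code and the AoANet code, and neither reached the expected accuracy. The only difference from the provided settings is that I use the bottom-up trainval_36 features. My BLEU-4 is about 0.03 below the paper's: I only reach 0.361, which is quite far from the roughly 0.39 reported. Could you give me some advice on how to tune the hyperparameters?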

@qujundong qujundong changed the title from "跑步出那么好的效果" (a typo) to "跑不出那么好的效果" ("Cannot reproduce the reported results") Dec 25, 2020
@wtliao
Owner

wtliao commented Dec 28, 2020

@qujundong Hi! The results in the paper were obtained by running train_v3d1.sh (the hyperparameters are already set in that file). One small difference is that we first trained for 40 epochs without RL and then selected the best checkpoint (the 37th epoch for our model) as the base for further RL training. We got the best results at the 63rd epoch, i.e. after 26 epochs of RL fine-tuning. Moreover, we use the adaptive "10 to 100 features per image" features from bottom-up. The final results may deviate slightly because of different initialization. Thanks.
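The two-stage schedule described above can be sketched as a small checkpoint-selection helper. This is only an illustration: `pick_rl_base`, `rl_epochs_done`, and the validation scores are hypothetical and do not come from the repository or from train_v3d1.sh. The scores are placeholders chosen so that epoch 37 wins, as in the reply.

```python
# Hypothetical sketch of the schedule described in the reply:
# train without RL for 40 epochs, pick the best epoch on the val set,
# then continue with RL fine-tuning from that checkpoint.

def pick_rl_base(val_scores, xe_epochs=40):
    """Return the epoch with the highest val score among the non-RL epochs."""
    candidates = {e: s for e, s in val_scores.items() if e <= xe_epochs}
    return max(candidates, key=candidates.get)

def rl_epochs_done(base_epoch, current_epoch):
    """Number of RL fine-tuning epochs run on top of the base checkpoint."""
    return current_epoch - base_epoch

# Placeholder val scores (NOT real results), only so the sketch runs.
val_scores = {35: 0.1, 36: 0.2, 37: 0.5, 38: 0.4, 39: 0.3, 40: 0.2}

base = pick_rl_base(val_scores)
print(base)                      # epoch used as the RL starting point
print(rl_epochs_done(base, 63))  # RL epochs completed at epoch 63
```

With these placeholder scores the base is epoch 37, and epoch 63 corresponds to 26 RL epochs, matching the numbers quoted in the reply.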

@qujundong
Author

qujundong commented Dec 28, 2020 via email

@wtliao
Owner

wtliao commented Dec 28, 2020

@qujundong The results in Tab. 1 of the paper are evaluated on the val set, and those in Tab. 2 are on the test set via the online system. Could you report more of your results, e.g. the results before RL fine-tuning, so that I can give some suggestions for your training?

@qujundong
Author

qujundong commented Dec 29, 2020

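These are the results I got on the test set. On the val set, BLEU-4 plateaued at around 0.382. The problem may be the dataset I am using, but the test results are still too low.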
computing Bleu score...
{'testlen': 46716, 'reflen': 46696, 'guess': [46716, 41716, 36716, 31716], 'correct': [35785, 20061, 10268, 5115]}
ratio: 1.000428302210018
Bleu_1: 0.766
Bleu_2: 0.607
Bleu_3: 0.469
Bleu_4: 0.359
computing CIDEr score...
CIDEr: 1.154
loss: 0.0
{'Bleu_1': 0.7660116448325892, 'Bleu_2': 0.6069356468740359, 'Bleu_3': 0.4687830819485329, 'Bleu_4': 0.35902174599546705, 'CIDEr': 1.1537819229346586, 'bad_count_rate': 0.0024}
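As a sanity check on the log above, the BLEU scores can be recomputed from the printed n-gram counts. The sketch below assumes the standard corpus-level BLEU definition (geometric mean of clipped n-gram precisions times a brevity penalty) with no smoothing. The function name `bleu_from_counts` is hypothetical and is not part of the evaluation code that produced the log.

```python
import math

def bleu_from_counts(stats):
    """Corpus-level BLEU-1..4 from clipped n-gram counts (no smoothing)."""
    c, r = stats["testlen"], stats["reflen"]
    # Brevity penalty: 1 when the candidates are at least as long as the references.
    bp = 1.0 if c >= r else math.exp(1.0 - r / c)
    scores = []
    log_sum = 0.0
    pairs = zip(stats["correct"], stats["guess"])
    for n, (correct, guess) in enumerate(pairs, start=1):
        log_sum += math.log(correct / guess)       # log of the n-gram precision
        scores.append(bp * math.exp(log_sum / n))  # geometric mean up to order n
    return scores

# Counts copied from the log line above.
stats = {'testlen': 46716, 'reflen': 46696,
         'guess': [46716, 41716, 36716, 31716],
         'correct': [35785, 20061, 10268, 5115]}

for n, score in enumerate(bleu_from_counts(stats), start=1):
    print(f"Bleu_{n}: {score:.3f}")
```

Since the length ratio is slightly above 1, the brevity penalty is 1, and the recomputed values agree with the logged Bleu_1 to Bleu_4 to three decimals. The logged scores are therefore consistent with the counts, so the gap to the paper comes from the captions themselves rather than from how the score was computed.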

@wtliao
Owner

wtliao commented Dec 29, 2020

@qujundong Your results are very low. I have uploaded our trained models, both before and after RL fine-tuning. Sorry for my earlier incorrect reply: we validated our model offline on the so-called "test set" and used the "val set" to select the best-trained model.

@qujundong
Author

Thank you very much. Could we add each other on WeChat? I may have more questions for you in the future. My WeChat: 272233310
