question: expected performance of vq-bet? #341
Comments
Hi there, your results look slightly below what we achieved with our pre-trained policy here, though not too far off. How many eval episodes did you run? @alexander-soare probably has more insights on this.
@Jubayer-Hamid thanks for raising this. VQ-BeT and Diffusion Policy should give about the same results. In fact, the models we have on the hub (DP, VQ-BeT) both happen to give a 63.8% success rate over 500 evals. If you try running evals with:
Hi, thanks for the prompt response. After evaluating over 500 episodes, VQ-BeT's success rate came much closer to Diffusion Policy's.
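As a side note (a sketch, not part of the original thread): with 500 evaluation episodes, pure binomial noise around a ~64% success rate is about two percentage points of standard error, so run-to-run gaps of a couple of points between checkpoints or machines are expected. A minimal check:

```python
import math

def success_rate_std_err(p: float, n: int) -> float:
    """Standard error of an observed success rate p over n Bernoulli trials."""
    return math.sqrt(p * (1.0 - p) / n)

# Reported VQ-BeT success rate on PushT: 63.8% over 500 eval episodes.
se = success_rate_std_err(0.638, 500)
print(f"std err: {se:.3f}")                   # ~0.021, i.e. about 2.1 points
print(f"95% CI half-width: {1.96 * se:.3f}")  # ~0.042
```

This is why the maintainers suggest 500 episodes: smaller eval batches make small differences between policies indistinguishable from noise.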
Hi LeRobot authors, thank you for your fantastic repo! I wanted to follow up regarding the expected results of VQ-BeT. My collaborator and I ran your checkpoint and config across 500 episodes using the following command:
However, our results consistently came out lower than what’s reported on your HF page. Here are the results we obtained on two different GPU machines:
We're wondering if a recent code update might have impacted the evaluation. Could you please confirm the results for the released checkpoint? Thanks,
@YuejiangLIU I just ran:
I even ran it with. I'm on commit hash 2252b42. I'm wondering if this is somehow related to system configuration or hardware; I'm using an Nvidia RTX 3090 on Ubuntu 22. @aliberts, any other ideas? (For context, you just need to see @YuejiangLIU's last message and this one: they are falling short by a tiny amount on success rate.)
Hi,
Thank you to the LeRobot community for maintaining such a fantastic codebase. My research group and I have greatly benefited from your efforts. In my current project, I am using the repository primarily for analyzing algorithms across different environments. I wanted to raise an issue I am encountering with VQ-BeT. I have been using the model on PushT and I want to ensure that the results I am obtaining align with community expectations. If not, I might be using the VQ-BeT repository incorrectly and would appreciate any guidance.
I used the following command: `python lerobot/scripts/train.py vqbet pusht`
For VQ-BeT, it seems like the maximum success rate is exactly 60%, whereas for Diffusion Policy the maximum success rate is 74%. Below, I have attached the wandb figures for the success rate vs training steps (left is for VQ-BeT and right is for Diffusion Policy):
Are these results expected for the algorithm? If not, am I running the wrong commands to reproduce the SOTA results?
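One thing worth checking (my own hedged aside, not from the thread): training-time eval curves in wandb are typically computed over far fewer episodes than a final 500-episode eval, so point estimates like "exactly 60%" carry wide uncertainty. Assuming, for illustration, 50 episodes per training-time eval (an assumed count, not the repo's confirmed default), a Wilson score interval shows how loose that 60% really is:

```python
import math

def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score 95% confidence interval for a binomial proportion."""
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half, center + half

# 60% success over an assumed 50 training-time eval episodes:
lo, hi = wilson_interval(30, 50)
print(f"60% over 50 episodes -> 95% CI [{lo:.2f}, {hi:.2f}]")  # [0.46, 0.72]
```

With an interval that wide, a gap between a 60% training-time peak and a higher final number can be partly eval noise, which is why re-evaluating the released checkpoint over 500 episodes (as suggested above in the thread) is the more reliable comparison.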
Thank you for your assistance.