
how to run gail code #20

Open · lijinming2018 opened this issue Nov 19, 2023 · 4 comments

@lijinming2018

how to run gail code

@lijinming2018 (Author)

When I run

```
python maniskill2_learn/apis/run_rl.py configs/mfrl/gail/maniskill2_pn.py --work-dir ./logs/bc_PickCube_pointcloud_128bs_ee --gpu-ids 1 --sim-gpu-ids 0 --cfg-options "env_cfg.env_name=PickCube-v0" "env_cfg.obs_mode=pointcloud" "env_cfg.n_points=1200" "env_cfg.control_mode=pd_joint_delta_pos" "replay_cfg.buffer_filenames=../ManiSkill2/demos/rigid_body/PickCube-v0/trajectory.none.pd_joint_delta_pos_pointcloud_ee.h5" "env_cfg.obs_frame=ee" "eval_cfg.num=100" "eval_cfg.save_traj=False" "eval_cfg.save_video=False" "train_cfg.n_eval=5000" "train_cfg.total_steps=500000" "train_cfg.n_checkpoint=10000" "train_cfg.n_updates=500"
```

I get:
```
Traceback (most recent call last):
  File "maniskill2_learn/apis/run_rl.py", line 522, in <module>
    main()
  File "maniskill2_learn/apis/run_rl.py", line 486, in main
    run_one_process(0, 1, args, cfg)
  File "maniskill2_learn/apis/run_rl.py", line 461, in run_one_process
    main_rl(rollout, evaluator, replay, args, cfg, expert_replay=expert_replay, recent_traj_replay=recent_traj_replay)
  File "maniskill2_learn/apis/run_rl.py", line 296, in main_rl
    train_rl(
  File "/data/private/ljm/ManiSkill2-Learn/maniskill2_learn/apis/train_rl.py", line 209, in train_rl
    replay.push_batch(trajectories)
  File "/data/private/ljm/ManiSkill2-Learn/maniskill2_learn/env/replay_buffer.py", line 196, in push_batch
    self.memory.assign(slice(self.position, self.position + len(items)), items)
  File "/data/private/ljm/ManiSkill2-Learn/maniskill2_learn/utils/data/dict_array.py", line 830, in assign
    self.memory = self._assign(self.memory, indices, value)
  File "/data/private/ljm/ManiSkill2-Learn/maniskill2_learn/utils/data/dict_array.py", line 471, in _assign
    memory[key] = cls._assign(memory[key], indices, value[key], ignore_list)
  File "/data/private/ljm/ManiSkill2-Learn/maniskill2_learn/utils/data/dict_array.py", line 471, in _assign
    memory[key] = cls._assign(memory[key], indices, value[key], ignore_list)
  File "/data/private/ljm/ManiSkill2-Learn/maniskill2_learn/utils/data/dict_array.py", line 477, in _assign
    memory[indices] = value
ValueError: could not broadcast input array from shape (4000,1200,3) into shape (4000,1250,3)
Exception ignored in: <function SharedGDict.__del__ at 0x7f0da445a0d0>
Traceback (most recent call last):
  File "/data/private/ljm/ManiSkill2-Learn/maniskill2_learn/utils/data/dict_array.py", line 928, in __del__
  File "/data/private/ljm/ManiSkill2-Learn/maniskill2_learn/utils/data/dict_array.py", line 913, in _unlink
  File "/opt/conda/lib/python3.8/multiprocessing/shared_memory.py", line 239, in unlink
ImportError: sys.meta_path is None, Python is likely shutting down
Exception ignored in: <function SharedGDict.__del__ at 0x7f0da445a0d0>
Traceback (most recent call last):
  File "/data/private/ljm/ManiSkill2-Learn/maniskill2_learn/utils/data/dict_array.py", line 928, in __del__
  File "/data/private/ljm/ManiSkill2-Learn/maniskill2_learn/utils/data/dict_array.py", line 913, in _unlink
  File "/opt/conda/lib/python3.8/multiprocessing/shared_memory.py", line 239, in unlink
ImportError: sys.meta_path is None, Python is likely shutting down
/opt/conda/lib/python3.8/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 29 leaked shared_memory objects to clean up at shutdown
```

@xuanlinli17 (Collaborator)

I think you forgot `"env_cfg.n_goal_points=50"`, since your demos seem to contain the goal points (`n_points=1200` plus 50 goal points gives the 1250 in the expected shape of the error).
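
For reference, your original command with only that flag added would look like this (a sketch; every other flag is kept exactly as in your run above):

```
python maniskill2_learn/apis/run_rl.py configs/mfrl/gail/maniskill2_pn.py --work-dir ./logs/bc_PickCube_pointcloud_128bs_ee --gpu-ids 1 --sim-gpu-ids 0 --cfg-options "env_cfg.env_name=PickCube-v0" "env_cfg.obs_mode=pointcloud" "env_cfg.n_points=1200" "env_cfg.n_goal_points=50" "env_cfg.control_mode=pd_joint_delta_pos" "replay_cfg.buffer_filenames=../ManiSkill2/demos/rigid_body/PickCube-v0/trajectory.none.pd_joint_delta_pos_pointcloud_ee.h5" "env_cfg.obs_frame=ee" "eval_cfg.num=100" "eval_cfg.save_traj=False" "eval_cfg.save_video=False" "train_cfg.n_eval=5000" "train_cfg.total_steps=500000" "train_cfg.n_checkpoint=10000" "train_cfg.n_updates=500"
```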

@lijinming2018 (Author)

When I run

```
python maniskill2_learn/apis/run_rl.py configs/mfrl/gail/maniskill2_pn.py --work-dir ./logs/bc_PickCube_pointcloud_gail --gpu-ids 1 --sim-gpu-ids 0 --cfg-options "env_cfg.env_name=PickCube-v0" "env_cfg.obs_mode=pointcloud" "env_cfg.n_points=1200" "env_cfg.control_mode=pd_ee_delta_pose" "env_cfg.n_goal_points=50" "replay_cfg.buffer_filenames=../ManiSkill2/demos/rigid_body/PickCube-v0/trajectory.none.pd_ee_delta_pose_pointcloud3.h5" "env_cfg.obs_frame=ee" "eval_cfg.save_traj=False"
```

I get:

```
PickCube-v0-train - (run_rl.py:261) - INFO - 2023-11-24,14:02:35 - Num of parameters: 1.33M, Model Size: 5.32M
[2023-11-24 14:02:36.662] [svulkan2] [error] GLFW error: X11: Failed to open display 0
[2023-11-24 14:02:36.662] [svulkan2] [warning] Continue without GLFW.
[2023-11-24 14:02:36.745] [svulkan2] [error] GLFW error: X11: Failed to open display 0
[2023-11-24 14:02:36.745] [svulkan2] [warning] Continue without GLFW.
[2023-11-24 14:02:36.792] [svulkan2] [error] GLFW error: X11: Failed to open display 0
[2023-11-24 14:02:36.792] [svulkan2] [warning] Continue without GLFW.
[2023-11-24 14:02:36.897] [svulkan2] [error] GLFW error: X11: Failed to open display 0
[2023-11-24 14:02:36.897] [svulkan2] [warning] Continue without GLFW.
[2023-11-24 14:02:37.018] [svulkan2] [error] GLFW error: X11: Failed to open display 0
[2023-11-24 14:02:37.018] [svulkan2] [warning] Continue without GLFW.
PickCube-v0-train - (run_rl.py:289) - INFO - 2023-11-24,14:02:37 - Work directory of this run ./logs/bc_PickCube_pointcloud_gail
PickCube-v0-train - (run_rl.py:291) - INFO - 2023-11-24,14:02:37 - Train over GPU [1]!
PickCube-v0-train - (train_rl.py:180) - INFO - 2023-11-24,14:02:37 - Rollout state dim: {'xyz': (4, 1250, 3), 'rgb': (4, 1250, 3), 'frame_related_states': (4, 4, 3), 'to_frames': (4, 2, 4, 4), 'state': (4, 30)}, action dim: (4, 7)!
PickCube-v0-train - (train_rl.py:202) - INFO - 2023-11-24,14:02:37 - Begin 8000 warm-up steps with random policy!
Evaluation-PickCube-v0-train-env-0 - (evaluation.py:294) - INFO - 2023-11-24,14:02:40 - The Evaluation environment has seed in 345236826!
Evaluation-PickCube-v0-train-env-0 - (evaluation.py:330) - INFO - 2023-11-24,14:02:40 - Size of image in the rendered video (512, 512, 3)
PickCube-v0-train - (train_rl.py:210) - INFO - 2023-11-24,14:03:06 - Warm up samples stats: rewards:31.3[3.4, 88.9], max_single_R:0.52[0.18, 0.80], lens:200[200, 200], success:0.00!
{'obs': {'xyz': (8000, 1250, 3), 'rgb': (8000, 1250, 3), 'frame_related_states': (8000, 4, 3), 'to_frames': (8000, 2, 4, 4), 'state': (8000, 30)}, 'next_obs': {'xyz': (8000, 1250, 3), 'rgb': (8000, 1250, 3), 'frame_related_states': (8000, 4, 3), 'to_frames': (8000, 2, 4, 4), 'state': (8000, 30)}, 'actions': (8000, 7), 'rewards': (8000, 1), 'dones': (8000, 1), 'infos': {'elapsed_steps': (8000, 1), 'is_obj_placed': (8000, 1), 'is_robot_static': (8000, 1), 'success': (8000, 1), 'reward': (8000, 1), 'TimeLimit.truncated': (8000, 1)}, 'episode_dones': (8000, 1), 'worker_indices': (8000, 1)}
PickCube-v0-train - (train_rl.py:225) - INFO - 2023-11-24,14:03:06 - Finish 8000 warm-up steps!
PickCube-v0-train - (train_rl.py:244) - INFO - 2023-11-24,14:03:06 - Begin training!
PickCube-v0-train - (train_rl.py:285) - INFO - 2023-11-24,14:03:09 - Replay buffer shape: {'actions': (60000, 7), 'dones': (60000, 1), 'episode_dones': (60000, 1), 'is_truncated': (60000, 1), 'next_obs': {'frame_related_states': (60000, 4, 3), 'rgb': (60000, 1250, 3), 'state': (60000, 30), 'to_frames': (60000, 2, 4, 4), 'xyz': (60000, 1250, 3)}, 'obs': {'frame_related_states': (60000, 4, 3), 'rgb': (60000, 1250, 3), 'state': (60000, 30), 'to_frames': (60000, 2, 4, 4), 'xyz': (60000, 1250, 3)}, 'rewards': (60000, 1), 'worker_indices': (60000, 1)}.

PickCube-v0-train - (train_rl.py:374) - INFO - 2023-11-24,14:09:05 - 20800/20000000(0%) Passed time:5m58s ETA:6d11h33m36s samples_stats: rewards:22.4[2.5, 67.2], max_single_R:0.48[0.20, 0.85], lens:200[200, 200], success:0.00 gpu_mem_ratio: 41.3% gpu_mem: 9.92G gpu_mem_this: 0.00G gpu_util: 0% discriminator_rewards: 0.886 critic_loss: 0.224 max_critic_abs_err: 1.323 actor_loss: -4.383 alpha: 0.178 alpha_loss: 2.098 q: 3.531 q_target: 3.572 entropy: 4.753 target_entropy: -7.000 critic_grad: 10.267 actor_grad: 0.185 episode_time: 358.256 collect_sample_time: 96.606 memory: 18.54G
PickCube-v0-train - (train_rl.py:374) - INFO - 2023-11-24,14:15:00 - 33600/20000000(0%) Passed time:11m53s ETA:6d10h36m48s samples_stats: rewards:35.4[6.1, 70.8], max_single_R:0.66[0.19, 0.86], lens:200[200, 200], success:0.00 gpu_mem_ratio: 41.3% gpu_mem: 9.92G gpu_mem_this: 0.00G gpu_util: 21% discriminator_rewards: 1.462 critic_loss: 0.334 max_critic_abs_err: 2.835 actor_loss: -8.181 alpha: 0.144 alpha_loss: 1.674 q: 7.497 q_target: 7.524 entropy: 4.656 target_entropy: -7.000 critic_grad: 29.183 actor_grad: 0.138 episode_time: 354.416 collect_sample_time: 91.741 memory: 18.38G
PickCube-v0-train - (train_rl.py:374) - INFO - 2023-11-24,14:20:56 - 46400/20000000(0%) Passed time:17m49s ETA:6d10h24m23s samples_stats: rewards:40.2[2.4, 108.9], max_single_R:0.71[0.28, 2.43], lens:200[200, 200], success:0.00 gpu_mem_ratio: 41.3% gpu_mem: 9.92G gpu_mem_this: 0.00G gpu_util: 0% discriminator_rewards: 1.253 critic_loss: 0.882 max_critic_abs_err: 4.930 actor_loss: -9.764 alpha: 0.117 alpha_loss: 1.348 q: 9.191 q_target: 9.233 entropy: 4.484 target_entropy: -7.000 critic_grad: 54.392 actor_grad: 0.136 episode_time: 355.570 collect_sample_time: 92.506 memory: 18.38G
PickCube-v0-train - (train_rl.py:374) - INFO - 2023-11-24,14:26:54 - 59200/20000000(0%) Passed time:23m48s ETA:6d10h30m11s samples_stats: rewards:125.7[33.8, 169.0], max_single_R:1.53[0.82, 2.54], lens:200[200, 200], success:0.00 gpu_mem_ratio: 41.3% gpu_mem: 9.92G gpu_mem_this: 0.00G gpu_util: 0% discriminator_rewards: 0.380 critic_loss: 0.988 max_critic_abs_err: 5.150 actor_loss: -8.505 alpha: 9.713e-02 alpha_loss: 1.057 q: 7.967 q_target: 8.030 entropy: 3.869 target_entropy: -7.000 critic_grad: 47.301 actor_grad: 0.209 episode_time: 357.881 collect_sample_time: 94.864 memory: 18.38G
PickCube-v0-train - (train_rl.py:374) - INFO - 2023-11-24,14:32:51 - 72000/20000000(0%) Passed time:29m44s ETA:6d10h20m44s samples_stats: rewards:99.5[16.5, 250.2], max_single_R:1.25[0.50, 2.87], lens:200[200, 200], success:0.00 gpu_mem_ratio: 41.3% gpu_mem: 9.92G gpu_mem_this: 0.00G gpu_util: 0% discriminator_rewards: 0.206 critic_loss: 0.606 max_critic_abs_err: 3.439 actor_loss: -8.337 alpha: 8.139e-02 alpha_loss: 0.838 q: 7.814 q_target: 7.870 entropy: 3.288 target_entropy: -7.000 critic_grad: 19.566 actor_grad: 0.242 episode_time: 355.850 collect_sample_time: 93.060 memory: 18.38G
PickCube-v0-train - (train_rl.py:374) - INFO - 2023-11-24,14:39:58 - 84800/20000000(0%) Passed time:36m51s ETA:6d15h17m15s samples_stats: rewards:112.8[46.8, 187.7], max_single_R:1.22[0.77, 2.60], lens:200[200, 200], success:0.00 gpu_mem_ratio: 41.3% gpu_mem: 9.92G gpu_mem_this: 0.00G gpu_util: 0% discriminator_rewards: 0.192 critic_loss: 0.569 max_critic_abs_err: 3.061 actor_loss: -9.402 alpha: 6.837e-02 alpha_loss: 0.690 q: 8.959 q_target: 9.018 entropy: 3.085 target_entropy: -7.000 critic_grad: 21.825 actor_grad: 0.237 episode_time: 426.309 collect_sample_time: 94.114 memory: 21.33G
PickCube-v0-train - (train_rl.py:374) - INFO - 2023-11-24,14:45:55 - 97600/20000000(0%) Passed time:42m48s ETA:6d14h29m18s samples_stats: rewards:115.0[15.5, 192.7], max_single_R:1.49[0.75, 2.77], lens:200[200, 200], success:0.00 gpu_mem_ratio: 41.3% gpu_mem: 9.92G gpu_mem_this: 0.00G gpu_util: 0% discriminator_rewards: 0.204 critic_loss: 0.635 max_critic_abs_err: 3.376 actor_loss: -9.904 alpha: 5.747e-02 alpha_loss: 0.566 q: 9.537 q_target: 9.601 entropy: 2.841 target_entropy: -7.000 critic_grad: 26.638 actor_grad: 0.228 episode_time: 356.770 collect_sample_time: 93.589 memory: 21.33G
PickCube-v0-train - (train_rl.py:374) - INFO - 2023-11-24,14:51:52 - 110400/20000000(1%) Passed time:48m45s ETA:6d13h50m43s samples_stats: rewards:107.3[9.4, 244.3], max_single_R:1.56[0.54, 2.90], lens:200[200, 200], success:0.00 gpu_mem_ratio: 41.3% gpu_mem: 9.92G gpu_mem_this: 0.00G gpu_util: 0% discriminator_rewards: 0.280 critic_loss: 0.702 max_critic_abs_err: 3.900 actor_loss: -9.667 alpha: 4.821e-02 alpha_loss: 0.478 q: 9.375 q_target: 9.434 entropy: 2.903 target_entropy: -7.000 critic_grad: 29.599 actor_grad: 0.223 episode_time: 356.411 collect_sample_time: 93.377 memory: 21.33G
PickCube-v0-train - (train_rl.py:374) - INFO - 2023-11-24,14:57:52 - 123200/20000000(1%) Passed time:54m45s ETA:6d13h29m18s samples_stats: rewards:172.6[36.8, 403.4], max_single_R:1.75[0.46, 2.83], lens:200[200, 200], success:0.00 gpu_mem_ratio: 41.3% gpu_mem: 9.92G gpu_mem_this: 0.00G gpu_util: 0% discriminator_rewards: 0.373 critic_loss: 0.971 max_critic_abs_err: 4.578 actor_loss: -9.738 alpha: 4.059e-02 alpha_loss: 0.374 q: 9.436 q_target: 9.497 entropy: 2.210 target_entropy: -7.000 critic_grad: 39.959 actor_grad: 0.247 episode_time: 359.880 collect_sample_time: 96.928 memory: 21.33G
Traceback (most recent call last):
  File "maniskill2_learn/apis/run_rl.py", line 522, in <module>
    main()
  File "maniskill2_learn/apis/run_rl.py", line 486, in main
    run_one_process(0, 1, args, cfg)
  File "maniskill2_learn/apis/run_rl.py", line 461, in run_one_process
    main_rl(rollout, evaluator, replay, args, cfg, expert_replay=expert_replay, recent_traj_replay=recent_traj_replay)
  File "maniskill2_learn/apis/run_rl.py", line 296, in main_rl
    train_rl(
  File "/data/private/ljm/ManiSkill2-Learn/maniskill2_learn/apis/train_rl.py", line 313, in train_rl
    disc_update_applied = agent.update_discriminator(expert_replay, recent_traj_replay, n_ep)
  File "/data/private/ljm/ManiSkill2-Learn/maniskill2_learn/methods/mfrl/gail.py", line 142, in update_discriminator
    self.update_discriminator_helper(expert_replay, recent_traj_replay)
  File "/data/private/ljm/ManiSkill2-Learn/maniskill2_learn/methods/mfrl/gail.py", line 115, in update_discriminator_helper
    expert_sampled_batch = expert_replay.sample(self.discriminator_batch_size // 2).to_torch(
  File "/data/private/ljm/ManiSkill2-Learn/maniskill2_learn/env/replay_buffer.py", line 231, in sample
    assert self.position == 0, "cache size should equals to buffer size"
AssertionError: cache size should equals to buffer size
Exception ignored in: <function SharedGDict.__del__ at 0x7fe1a5891280>
Traceback (most recent call last):
  File "/data/private/ljm/ManiSkill2-Learn/maniskill2_learn/utils/data/dict_array.py", line 928, in __del__
  File "/data/private/ljm/ManiSkill2-Learn/maniskill2_learn/utils/data/dict_array.py", line 913, in _unlink
  File "/opt/conda/lib/python3.8/multiprocessing/shared_memory.py", line 239, in unlink
ImportError: sys.meta_path is None, Python is likely shutting down
Exception ignored in: <function SharedGDict.__del__ at 0x7fe1a5891280>
Traceback (most recent call last):
  File "/data/private/ljm/ManiSkill2-Learn/maniskill2_learn/utils/data/dict_array.py", line 928, in __del__
  File "/data/private/ljm/ManiSkill2-Learn/maniskill2_learn/utils/data/dict_array.py", line 913, in _unlink
  File "/opt/conda/lib/python3.8/multiprocessing/shared_memory.py", line 239, in unlink
ImportError: sys.meta_path is None, Python is likely shutting down
Exception ignored in: <function SharedGDict.__del__ at 0x7fe1a5891280>
Traceback (most recent call last):
  File "/data/private/ljm/ManiSkill2-Learn/maniskill2_learn/utils/data/dict_array.py", line 928, in __del__
  File "/data/private/ljm/ManiSkill2-Learn/maniskill2_learn/utils/data/dict_array.py", line 913, in _unlink
  File "/opt/conda/lib/python3.8/multiprocessing/shared_memory.py", line 239, in unlink
ImportError: sys.meta_path is None, Python is likely shutting down
/opt/conda/lib/python3.8/multiprocessing/resource_tracker.py:216: UserWarning: resource_tracker: There appear to be 50 leaked shared_memory objects to clean up at shutdown
```

@xuanlinli17 (Collaborator)

Looks like your config was modified from the original config, since the replay buffer size is different.

The error comes from `assert self.position == 0, "cache size should equals to buffer size"` in the expert replay. As mentioned in the README, configure the expert replay with `capacity == cache_size`, e.g.,

```python
demo_replay_cfg=dict(
    type="ReplayMemory",
    capacity=int(2e4),
    num_samples=-1,
    cache_size=int(2e4),
    dynamic_loading=True,
    synchronized=False,
    keys=["obs", "actions", "dones", "episode_dones"],
    buffer_filenames=[
        "PATH_TO_DEMO.h5",
    ],
),
```
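
With `dynamic_loading=True`, demos are reloaded into the buffer in chunks of `cache_size`, and `sample()` asserts that the buffer is realigned with the start of a chunk. A minimal sketch of that invariant (not the library's actual code; it only assumes `position` advances modulo `capacity` by one cache chunk per reload):

```python
def check_dynamic_loading(capacity: int, cache_size: int) -> None:
    """Simulate one dynamic-loading reload plus the sample-time check."""
    position = 0
    # One reload pushes a full cache chunk; position wraps modulo capacity.
    position = (position + cache_size) % capacity
    # sample() requires the buffer to be exactly realigned, which holds on
    # every reload only when capacity == cache_size.
    assert position == 0, "cache size should equals to buffer size"


check_dynamic_loading(capacity=int(2e4), cache_size=int(2e4))  # passes
check_dynamic_loading(capacity=int(6e4), cache_size=int(2e4))  # AssertionError
```

So with matching sizes the assertion always holds, while any mismatch trips it on the first discriminator update, as in your log.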
