I cannot reproduce the results from the paper. When training the jump policy, I always get a reward of 0.
The default values for learning the run-up policy are:
algorithm.max_iterations: 2000
experiment.env: jumper_run2
env.jumper_run2.angular_v: [-3.0, -3.0, 1.0]
env.jumper_run2.linear_v_z: -2.4
The jump policy uses the following parameters, which are the ones recommended for learning the Fosbury Flop:
algorithm.max_iterations: 12000
experiment.env: highjump
# initial state file generated by the run-up training
env.highjump.initial_state: results/runup-2022-Feb-10-175005/checkpoint_2000.tar.npy
# wall orientation in degrees
env.highjump.wall_rotation: -0.05
# must correspond to the training height of the checkpoint
env.highjump.initial_wall_height: 0.5
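One thing that may be worth ruling out before looking at the reward itself is whether the file named by env.highjump.initial_state is actually found from the directory where training is launched. The following is a minimal sketch of such a check, using only the Python standard library. The helper names `parse_overrides` and `check_initial_state` are hypothetical and are not part of the project's code. The sketch only tests that the file exists and is non-empty; it does not verify the file's contents.

```python
import os

# The jump-policy overrides from this post, copied verbatim.
JUMP_OVERRIDES = """
algorithm.max_iterations: 12000
experiment.env: highjump
# initial state file generated by the run-up training
env.highjump.initial_state: results/runup-2022-Feb-10-175005/checkpoint_2000.tar.npy
# wall orientation in degrees
env.highjump.wall_rotation: -0.05
# must correspond to the training height of the checkpoint
env.highjump.initial_wall_height: 0.5
"""


def parse_overrides(text):
    """Parse 'key: value' lines into a dict, skipping blank lines and '#' comments.

    Hypothetical helper; values are kept as raw strings.
    """
    overrides = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        overrides[key.strip()] = value.strip()
    return overrides


def check_initial_state(overrides, key="env.highjump.initial_state"):
    """Return a list of problems with the initial-state file (empty if none found)."""
    problems = []
    path = overrides.get(key)
    if path is None:
        problems.append(f"{key} is not set")
    elif not os.path.isfile(path):
        problems.append(f"{path} does not exist (relative to {os.getcwd()})")
    elif os.path.getsize(path) == 0:
        problems.append(f"{path} is empty")
    return problems


if __name__ == "__main__":
    for problem in check_initial_state(parse_overrides(JUMP_OVERRIDES)):
        print("WARNING:", problem)
```

Running this from the same working directory as the training command prints a warning if the relative path does not resolve there. No output means the file was found and is non-empty.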