---
env_name: Reacher-v5
tags:
- Reacher-v5
- ppo
- reinforcement-learning
- custom-implementation
- mujoco
- ddp
model-index:
- name: PPO-DDP-ReacherV5
  results:
  - task:
      type: reinforcement-learning
      name: reinforcement-learning
    dataset:
      name: Reacher-v5
      type: Reacher-v5
    metrics:
    - type: mean_reward
      value: -5.30 +/- 1.20
      name: mean_reward
      verified: false
---

# **PPO** Agent playing **Reacher-v5**

This is a trained model of a **PPO** agent playing **Reacher-v5**.

## Usage

### Create the conda env in https://github.com/GeneHit/drl_practice

```bash
conda create -n drl python=3.12
conda activate drl
python -m pip install -r requirements.txt
```

### Play with the full model

```python
import gymnasium as gym

# `load_from_hub` is not defined in this snippet; it is assumed to be the
# helper available in the drl_practice repo (run this from that environment).

# Load the full model.
model = load_from_hub(repo_id="winkin119/PPO-DDP-ReacherV5", filename="full_model.pt")

# Create the environment.
env = gym.make("Reacher-v5")
state, _ = env.reset()

# Query the trained policy for an action in the initial state.
action = model.action(state)
...
```

A state-dict version of the model is also available; see the corresponding model definition in the repo.
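
### Estimating the mean reward (sketch)

The `mean_reward` reported in the metadata (-5.30 +/- 1.20) is the kind of figure an evaluation loop over several episodes produces. The snippet below is a minimal sketch of such a loop, not the script used to obtain that number. The names `evaluate` and `ToyEnv` are hypothetical: `ToyEnv` is a stand-in that follows the Gymnasium `reset`/`step` interface so the example runs without MuJoCo, and the use of the population standard deviation is an assumption.

```python
import statistics
from typing import Any, Callable, Tuple


def evaluate(env: Any, policy: Callable[[Any], Any], n_episodes: int = 10) -> Tuple[float, float]:
    """Run `n_episodes` episodes and return (mean, population std) of the episode returns.

    `env` is expected to follow the Gymnasium API:
    reset() -> (obs, info) and step(a) -> (obs, reward, terminated, truncated, info).
    """
    returns = []
    for _ in range(n_episodes):
        state, _ = env.reset()
        done = False
        total = 0.0
        while not done:
            action = policy(state)
            state, reward, terminated, truncated, _ = env.step(action)
            total += float(reward)
            # An episode ends on either a terminal state or a time-limit truncation.
            done = terminated or truncated
        returns.append(total)
    return statistics.fmean(returns), statistics.pstdev(returns)


class ToyEnv:
    """Hypothetical stand-in environment: fixed-length episodes, reward -1 per step."""

    def __init__(self, horizon: int = 3) -> None:
        self.horizon = horizon
        self.t = 0

    def reset(self):
        self.t = 0
        return 0.0, {}

    def step(self, action):
        self.t += 1
        truncated = self.t >= self.horizon  # stop after `horizon` steps
        return float(self.t), -1.0, False, truncated, {}


# Smoke check with a constant policy; every episode returns -3.
mean_reward, std_reward = evaluate(ToyEnv(), policy=lambda state: 0.0, n_episodes=5)
print(f"mean_reward = {mean_reward:.2f} +/- {std_reward:.2f}")
# → mean_reward = -3.00 +/- 0.00
```

With the real agent, the same function could be called as `evaluate(gym.make("Reacher-v5"), model.action)`, assuming `model.action` returns an action in a form that `env.step` accepts (for example a NumPy array); if it returns a tensor, it would need to be converted first.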