# Physically Embedded Planning Environments

This repository contains the three environments introduced in 'Physically Embedded Planning Problems: New Challenges for Reinforcement Learning'.

If you use this package, please cite our accompanying [tech report]:

```
@misc{mirza2020physically,
    title={Physically Embedded Planning Problems: New Challenges for Reinforcement Learning},
    author={Mehdi Mirza and Andrew Jaegle and Jonathan J. Hunt and Arthur Guez and Saran Tunyasuvunakool and Alistair Muldal and Théophane Weber and Peter Karkus and Sébastien Racanière and Lars Buesing and Timothy Lillicrap and Nicolas Heess},
    year={2020},
    eprint={2009.05524},
    archivePrefix={arXiv},
    primaryClass={cs.AI}
}
```

## Requirements and Installation

This repository is divided into `mujoban` and `board_games` folders. Both are built on top of [dm_control], which requires MuJoCo. Please follow [these] instructions to install MuJoCo. Other dependencies can be installed with:

```
pip3 install -r requirements.txt
```

### Board games

The game logic is based on [open_spiel]. Please install it as instructed [here]. [gnugo] is required to play the game of Go against a non-random opponent. On Ubuntu, [gnugo] can be installed with:

```
apt install gnugo
```

The board game scripts expect the gnugo binary to be at `/usr/games/gnugo`. This path can be changed inside `board_games/go_logic.py`.

This library has only been tested on Ubuntu.

## Example usage

The code snippets below show how to instantiate each of the environments.

### Mujoban

```python
from dm_control import composer
from dm_control.locomotion import walkers

from physics_planning_games.mujoban.mujoban import Mujoban
from physics_planning_games.mujoban.mujoban_level import MujobanLevel
from physics_planning_games.mujoban.boxoban import boxoban_level_generator

walker = walkers.JumpingBallWithHead(add_ears=True, camera_height=0.25)
maze = MujobanLevel(boxoban_level_generator)
task = Mujoban(walker=walker,
               maze=maze,
               control_timestep=0.1,
               top_camera_height=96,
               top_camera_width=96)
env = composer.Environment(time_limit=1000, task=task)
```

### Board games

```python
from physics_planning_games import board_games

environment_name = 'go_7x7'
env = board_games.load(environment_name=environment_name)
```

### Stepping through an environment

The returned environments are of type `dm_env.Environment` and can be stepped through, as shown here with random actions:

```python
import numpy as np

timestep = env.reset()
action_spec = env.action_spec()

while True:
  # Sample a uniformly random action within the bounds of the action spec.
  action = np.stack([
      np.random.uniform(low=minimum, high=maximum)
      for minimum, maximum in zip(action_spec.minimum, action_spec.maximum)
  ])
  timestep = env.step(action)
  # Start a new episode once the current one terminates.
  if timestep.last():
    timestep = env.reset()
```

### Visualization

For visualization of the environments, `explore.py` loads them using the [viewer] from [dm_control]. A minimal sketch of launching the viewer directly is included at the end of this README.

## More details

For more details please refer to the [tech report], [dm_control] and [dm_env].

[tech report]: https://arxiv.org/abs/2009.05524
[dm_control]: https://github.com/deepmind/dm_control
[dm_env]: https://github.com/deepmind/dm_env
[gnugo]: https://www.gnu.org/software/gnugo/
[open_spiel]: https://github.com/deepmind/open_spiel
[here]: https://github.com/deepmind/open_spiel/blob/master/docs/install.md
[these]: https://github.com/deepmind/dm_control#requirements-and-installation
[viewer]: https://github.com/deepmind/dm_control/tree/master/dm_control/viewer
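
As an alternative to `explore.py`, an environment can also be passed straight to the [dm_control] viewer. The snippet below is a minimal sketch rather than part of this package: it reuses the `go_7x7` environment name from the example above and assumes `dm_control.viewer.launch` accepts a callable that builds a fresh environment.

```python
from dm_control import viewer

from physics_planning_games import board_games


def load_go():
  # Build a fresh board-game environment each time the viewer loads it.
  return board_games.load(environment_name='go_7x7')


# Opens an interactive window; close it to return control to the script.
viewer.launch(load_go)
```

Passing a loader rather than an environment instance lets the viewer rebuild the environment when a reload is requested from the UI.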