Physically Embedded Planning Environments
This repository contains the three environments introduced in 'Physically Embedded Planning Problems: New Challenges for Reinforcement Learning'.
If you use this package, please cite our accompanying tech report:
@misc{,
title={Physically Embedded Planning Problems: New Challenges for
Reinforcement Learning},
author={Mehdi Mirza and Andrew Jaegle and Jonathan J. Hunt and Arthur Guez and
Saran Tunyasuvunakool and Alistair Muldal and Théophane Weber and
Peter Karkus and Sébastien Racanière and Lars Buesing and
Timothy Lillicrap and Nicolas Heess},
year={2020},
eprint={},
archivePrefix={arXiv},
primaryClass={cs.RO}
}
Requirements and Installation
This repository is divided into 'mujoban' and 'board_games' folders. Both are built on top of dm_control, which requires MuJoCo. Please follow these instructions to install MuJoCo. Other dependencies can be installed with:
pip3 install -r requirements.txt
Board games
The game logic is based on open_spiel. Please install it as instructed here. gnugo is required to play the game of Go against a non-random opponent. On Ubuntu, gnugo can be installed with:
apt install gnugo
The board game scripts expect the gnugo binary to be at /usr/games/gnugo. Users can
change this path inside board_games/go_logic.py.
This library has only been tested on Ubuntu.
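Before loading a Go environment with the non-random opponent, it can help to confirm that a gnugo binary is actually present. The following is a minimal sketch using only the Python standard library; `find_gnugo` and `EXPECTED_GNUGO_PATH` are hypothetical names introduced here for illustration and are not part of this repository.

```python
import os
import shutil

# Path quoted in this README; the constant name itself is hypothetical.
EXPECTED_GNUGO_PATH = '/usr/games/gnugo'


def find_gnugo(expected_path=EXPECTED_GNUGO_PATH):
    """Returns a path to an executable gnugo binary, or None if none is found."""
    # First check the location the board game scripts expect.
    if os.path.isfile(expected_path) and os.access(expected_path, os.X_OK):
        return expected_path
    # Otherwise fall back to searching the directories listed in PATH.
    return shutil.which('gnugo')
```

If `find_gnugo()` returns a path other than /usr/games/gnugo, that is the value to set in board_games/go_logic.py; if it returns `None`, gnugo is probably not installed.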
Example usage
The code snippets below show examples of instantiating each of the environments.
Mujoban
from dm_control import composer
from dm_control.locomotion import walkers
from physics_planning_games.mujoban.mujoban import Mujoban
from physics_planning_games.mujoban.mujoban_level import MujobanLevel
from physics_planning_games.mujoban.boxoban import boxoban_level_generator
walker = walkers.JumpingBallWithHead(add_ears=True, camera_height=0.25)
maze = MujobanLevel(boxoban_level_generator)
task = Mujoban(walker=walker,
               maze=maze,
               control_timestep=0.1,
               top_camera_height=96,
               top_camera_width=96)
env = composer.Environment(time_limit=1000, task=task)
Board games
from physics_planning_games import board_games
environment_name = 'go_7x7'
env = board_games.load(environment_name=environment_name)
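Once an environment has been created, it can be useful to check what it expects and returns before writing an agent. The sketch below assumes only that `env` exposes the `action_spec()` and `observation_spec()` methods of dm_env.Environment, and that the observation spec is a dict-like mapping from names to specs with `shape` and `dtype` attributes. `describe_specs` is a hypothetical helper, not part of this repository.

```python
def describe_specs(env):
    """Returns one human-readable line per action and observation spec."""
    action_spec = env.action_spec()
    lines = [f'action: shape={action_spec.shape}, dtype={action_spec.dtype}']
    # Assumption: observation_spec() returns a dict-like mapping of
    # observation names to specs.
    for name, spec in env.observation_spec().items():
        lines.append(
            f'observation[{name}]: shape={spec.shape}, dtype={spec.dtype}')
    return lines
```

Printing each line of `describe_specs(env)` gives a quick summary of the action and observation shapes for either the Mujoban or the board game environments.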
Stepping through the environment
The returned environments are of type dm_env.Environment and can be stepped
through with random actions as shown here:
import numpy as np
timestep = env.reset()
action_spec = env.action_spec()
# Loops indefinitely; interrupt the program to stop.
while True:
    # Sample each action dimension uniformly within its bounds.
    action = np.stack([
        np.random.uniform(low=minimum, high=maximum)
        for minimum, maximum in zip(action_spec.minimum, action_spec.maximum)
    ])
    timestep = env.step(action)
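The loop above runs indefinitely. A bounded variant that stops at the end of an episode and accumulates the reward might look like the sketch below. It relies on `timestep.last()` and `timestep.reward` from dm_env, and assumes scalar rewards and a bounded action spec with `shape`, `minimum` and `maximum` attributes. `run_random_episode` and its `max_steps` and `seed` parameters are hypothetical names introduced for illustration.

```python
import numpy as np


def run_random_episode(env, max_steps=1000, seed=0):
    """Runs one episode with uniform random actions.

    Returns a tuple (episode_return, num_steps).
    """
    rng = np.random.default_rng(seed)
    action_spec = env.action_spec()
    timestep = env.reset()
    episode_return = 0.0
    num_steps = 0
    # Stop at the end of the episode or after max_steps, whichever is first.
    while not timestep.last() and num_steps < max_steps:
        # Sample uniformly within the action bounds.
        action = rng.uniform(low=action_spec.minimum,
                             high=action_spec.maximum,
                             size=action_spec.shape)
        timestep = env.step(action)
        # The reward is None on the first timestep of an episode.
        if timestep.reward is not None:
            episode_return += float(timestep.reward)
        num_steps += 1
    return episode_return, num_steps
```

Calling `run_random_episode(env)` on either environment gives a rough random-policy baseline against which a learned agent can be compared.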
Visualization
To visualize the environments, explore.py loads them using the viewer
from dm_control.
More details
For more details, please refer to the tech report, dm_control, and dm_env.