Release of code and dataset accompanying the SIGGRAPH 2020 publication "Catch & Carry: Reusable Neural Controllers for Vision-Guided Whole-Body Tasks".

PiperOrigin-RevId: 325790467
This commit is contained in:
Saran Tunyasuvunakool
2020-08-10 14:19:22 +01:00
parent d687ecc9fb
commit 7a07dc8e47
14 changed files with 1960 additions and 0 deletions
+1
View File
@@ -24,6 +24,7 @@ https://deepmind.com/research/publications/
## Projects
* [Catch & Carry: Reusable Neural Controllers for Vision-Guided Whole-Body Tasks](catch_carry), SIGGRAPH 2020
* [MEMO: A Deep Network For Flexible Combination Of Episodic Memories](memo), ICLR 2020
* [RL Unplugged: Benchmarks for Offline Reinforcement Learning](rl_unplugged)
* [Disentangling by Subspace Diffusion (GEOMANCER)](geomancer)
+123
View File
@@ -0,0 +1,123 @@
# Catch & Carry: Reusable Neural Controllers for Vision-Guided Whole-Body Tasks
This package contains motion capture data and tasks associated with "Catch &
Carry: Reusable Neural Controllers for Vision-Guided Whole-Body Tasks"
(https://arxiv.org/abs/1911.06636), which was published at SIGGRAPH 2020.
This is research code, and has dependencies on more stable code that is
available as part of [`dm_control`], in particular upon components in
[`dm_control.locomotion`].
To get access to preconfigured python environments for the "warehouse" and "ball
toss" tasks, see the `task_examples.py` file. To use the MuJoCo interactive
viewer (from dm_control) to load the environments, see `explore.py`.
<p float="left">
<img src="tasks.png" height="200">
</p>
## Installation instructions
1. Download [MuJoCo Pro](https://mujoco.org/) and extract the zip archive as
`~/.mujoco/mujoco200_$PLATFORM` where `$PLATFORM` is one of `linux`,
`macos`, or `win64`.
2. Ensure that a valid MuJoCo license key file is located at
`~/.mujoco/mjkey.txt`.
3. Clone the `deepmind-research` repository:
```shell
git clone https://github.com/deepmind/deepmind-research.git
cd deepmind-research
```
4. Create and activate a Python virtual environment:
```shell
python3 -m virtualenv catch_carry
source catch_carry/bin/activate
```
5. Install the package:
```shell
pip install ./catch_carry
```
## Quickstart
To instantiate and step through the warehouse task:
```python
from catch_carry import task_examples
import numpy as np
# Build an example environment.
env = task_examples.build_vision_warehouse()
# Get the `action_spec` describing the control inputs.
action_spec = env.action_spec()
# Step through the environment for one episode with random actions.
time_step = env.reset()
while not time_step.last():
action = np.random.uniform(action_spec.minimum, action_spec.maximum,
size=action_spec.shape)
time_step = env.step(action)
print("reward = {}, discount = {}, observations = {}.".format(
time_step.reward, time_step.discount, time_step.observation))
```
The above code snippet can also be used for the ball toss task by replacing
`build_vision_warehouse` with `build_vision_toss`.
## Visualization
[`dm_control.viewer`] can be used to visualize and interact with the
environment. We provide the `explore.py` script specifically for this. If you
followed our installation instructions above, this can be launched for the
warehouse task via:
```shell
python3 -m catch_carry.explore --task=warehouse
```
and for the ball toss task via:
```shell
python3 -m catch_carry.explore --task=toss
```
## Citation
If you use the code or data in this package, please cite:
```
@article{merel2020catch,
title = {Catch \& Carry: Reusable Neural Controllers for
Vision-Guided Whole-Body Tasks},
author = {Merel, Josh and
Tunyasuvunakool, Saran and
Ahuja, Arun and
Tassa, Yuval and
Hasenclever, Leonard and
Pham, Vu and
Erez, Tom and
Wayne, Greg and
Heess, Nicolas},
journal = {ACM Trans. Graph.},
issue_date = {July 2020},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {39},
number = {4},
numpages = {14},
issn = {0730-0301},
year = {2020},
month = jul,
}
```
[`dm_control`]: https://github.com/deepmind/dm_control
[`dm_control.locomotion`]: https://github.com/deepmind/dm_control/tree/master/dm_control/locomotion
[`dm_control.viewer`]: https://github.com/deepmind/dm_control/tree/master/dm_control/viewer
+13
View File
@@ -0,0 +1,13 @@
# Copyright 2020 DeepMind Technologies Limited
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
+171
View File
@@ -0,0 +1,171 @@
# Copyright 2020 Deepmind Technologies Limited.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Utility for opening arms until they are not in contact with a prop."""
import contextlib
from dm_control.mujoco.wrapper import mjbindings
import numpy as np
_MAX_IK_ATTEMPTS = 100
_IK_MAX_CORRECTION_WEIGHT = 0.1
_JOINT_LIMIT_TOLERANCE = 1e-4
_GAP_TOLERANCE = 0.1
class _ArmPropContactRemover(object):
"""Helper class for removing contacts between an arm and a prop via IK."""
def __init__(self, physics, arm_root, prop, gap):
arm_geoms = arm_root.find_all('geom')
self._arm_geom_ids = set(physics.bind(arm_geoms).element_id)
arm_joints = arm_root.find_all('joint')
self._arm_joint_ids = list(physics.bind(arm_joints).element_id)
self._arm_qpos_indices = physics.model.jnt_qposadr[self._arm_joint_ids]
self._arm_dof_indices = physics.model.jnt_dofadr[self._arm_joint_ids]
self._prop_geoms = prop.find_all('geom')
self._prop_geom_ids = set(physics.bind(self._prop_geoms).element_id)
self._arm_joint_min = np.full(len(self._arm_joint_ids), float('-inf'),
dtype=physics.model.jnt_range.dtype)
self._arm_joint_max = np.full(len(self._arm_joint_ids), float('inf'),
dtype=physics.model.jnt_range.dtype)
for i, joint_id in enumerate(self._arm_joint_ids):
if physics.model.jnt_limited[joint_id]:
self._arm_joint_min[i], self._arm_joint_max[i] = (
physics.model.jnt_range[joint_id])
self._gap = gap
def _contact_pair_is_relevant(self, contact):
set1 = self._arm_geom_ids
set2 = self._prop_geom_ids
return ((contact.geom1 in set1 and contact.geom2 in set2) or
(contact.geom2 in set1 and contact.geom1 in set2))
def _forward_and_find_next_contact(self, physics):
"""Forwards the physics and finds the next contact to handle."""
physics.forward()
next_contact = None
for contact in physics.data.contact:
if (self._contact_pair_is_relevant(contact) and
(next_contact is None or contact.dist < next_contact.dist)):
next_contact = contact
return next_contact
def _remove_contact_ik_iteration(self, physics, contact):
"""Performs one linearized IK iteration to remove the specified contact."""
if contact.geom1 in self._arm_geom_ids:
sign = -1
geom_id = contact.geom1
else:
sign = 1
geom_id = contact.geom2
body_id = physics.model.geom_bodyid[geom_id]
normal = sign * contact.frame[:3]
jac_dtype = physics.data.qpos.dtype
jac = np.empty((6, physics.model.nv), dtype=jac_dtype)
jac_pos, jac_rot = jac[:3], jac[3:]
mjbindings.mjlib.mj_jacPointAxis(
physics.model.ptr, physics.data.ptr,
jac_pos, jac_rot,
contact.pos + (contact.dist / 2) * normal, normal, body_id)
# Calculate corrections w.r.t. all joints, disregarding joint limits.
delta_xpos = normal * max(0, self._gap - contact.dist)
jac_all_joints = jac_pos[:, self._arm_dof_indices]
update_unfiltered = np.linalg.lstsq(
jac_all_joints, delta_xpos, rcond=None)[0]
# Filter out joints at limit that are corrected in the "wrong" direction.
initial_qpos = np.array(physics.data.qpos[self._arm_qpos_indices])
min_filter = np.logical_and(
initial_qpos - self._arm_joint_min < _JOINT_LIMIT_TOLERANCE,
update_unfiltered < 0)
max_filter = np.logical_and(
self._arm_joint_max - initial_qpos < _JOINT_LIMIT_TOLERANCE,
update_unfiltered > 0)
active_joints = np.where(
np.logical_not(np.logical_or(min_filter, max_filter)))[0]
# Calculate corrections w.r.t. valid joints only.
active_dof_indices = self._arm_dof_indices[active_joints]
jac_joints = jac_pos[:, active_dof_indices]
update_filtered = np.linalg.lstsq(jac_joints, delta_xpos, rcond=None)[0]
update_nv = np.zeros(physics.model.nv, dtype=jac_dtype)
update_nv[active_dof_indices] = update_filtered
# Calculate maximum correction weight that does not violate joint limits.
weights = np.full_like(update_filtered, _IK_MAX_CORRECTION_WEIGHT)
active_initial_qpos = initial_qpos[active_joints]
active_joint_min = self._arm_joint_min[active_joints]
active_joint_max = self._arm_joint_max[active_joints]
for i in range(len(weights)):
proposed_update = update_filtered[i]
if proposed_update > 0:
max_allowed_update = active_joint_max[i] - active_initial_qpos[i]
weights[i] = min(max_allowed_update / proposed_update, weights[i])
elif proposed_update < 0:
min_allowed_update = active_joint_min[i] - active_initial_qpos[i]
weights[i] = min(min_allowed_update / proposed_update, weights[i])
weight = min(weights)
# Integrate the correction into `qpos`.
mjbindings.mjlib.mj_integratePos(
physics.model.ptr, physics.data.qpos, update_nv, weight)
# "Paranoid" clip the modified joint `qpos` to within joint limits.
active_qpos_indices = self._arm_qpos_indices[active_joints]
physics.data.qpos[active_qpos_indices] = np.clip(
physics.data.qpos[active_qpos_indices],
active_joint_min, active_joint_max)
@contextlib.contextmanager
def _override_margins_and_gaps(self, physics):
"""Context manager that overrides geom margins and gaps to `self._gap`."""
prop_geom_bindings = physics.bind(self._prop_geoms)
original_margins = np.array(prop_geom_bindings.margin)
original_gaps = np.array(prop_geom_bindings.gap)
prop_geom_bindings.margin = self._gap * (1 - _GAP_TOLERANCE)
prop_geom_bindings.gap = self._gap * (1 - _GAP_TOLERANCE)
yield
prop_geom_bindings.margin = original_margins
prop_geom_bindings.gap = original_gaps
physics.forward()
def remove_contacts(self, physics):
with self._override_margins_and_gaps(physics):
for _ in range(_MAX_IK_ATTEMPTS):
contact = self._forward_and_find_next_contact(physics)
if contact is None:
return
self._remove_contact_ik_iteration(physics, contact)
contact = self._forward_and_find_next_contact(physics)
if contact and contact.dist < 0:
raise RuntimeError(
'Failed to remove contact with prop after {} iterations. '
'Final contact distance is {}.'.format(
_MAX_IK_ATTEMPTS, contact.dist))
def open_arms_for_prop(physics, left_arm_root, right_arm_root, prop, gap):
"""Opens left and right arms so as to leave a specified gap with the prop."""
left_arm_opener = _ArmPropContactRemover(physics, left_arm_root, prop, gap)
left_arm_opener.remove_contacts(physics)
right_arm_opener = _ArmPropContactRemover(physics, right_arm_root, prop, gap)
right_arm_opener.remove_contacts(physics)
+319
View File
@@ -0,0 +1,319 @@
# Copyright 2020 Deepmind Technologies Limited.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""A ball-tossing task."""
import collections
from dm_control import composer
from dm_control import mjcf
from dm_control.composer import variation
from dm_control.composer.observation import observable
from dm_control.locomotion.arenas import floors
from dm_control.locomotion.mocap import loader as mocap_loader
import numpy as np
from catch_carry import mocap_data
from catch_carry import props
from catch_carry import trajectories
_PHYSICS_TIMESTEP = 0.005
_BUCKET_SIZE = (0.2, 0.2, 0.02)
# Magnitude of the sparse reward.
_SPARSE_REWARD = 1.0
class BallToss(composer.Task):
"""A task involving catching and throwing a ball."""
def __init__(self, walker,
proto_modifier=None,
negative_reward_on_failure_termination=True,
priority_friction=False,
bucket_offset=1.,
y_range=0.5,
toss_delay=0.5,
randomize_init=False,
):
"""Initialize ball tossing task.
Args:
walker: the walker to be used in this task.
proto_modifier: function to modify trajectory proto.
negative_reward_on_failure_termination: flag to provide negative reward
as task fails.
priority_friction: sets friction priority thereby making prop objects have
higher friction.
bucket_offset: distance in meters to push bucket (away from walker)
y_range: range (uniformly sampled) of distance in meters the ball is
thrown left/right of the walker.
toss_delay: time in seconds to delay after catching before changing reward
to encourage throwing the ball.
randomize_init: flag to randomize initial pose.
"""
self._proto_modifier = proto_modifier
self._negative_reward_on_failure_termination = (
negative_reward_on_failure_termination)
self._priority_friction = priority_friction
self._bucket_rewarded = False
self._bucket_offset = bucket_offset
self._y_range = y_range
self._toss_delay = toss_delay
self._randomize_init = randomize_init
# load a clip to grab a ball prop and initializations
loader = mocap_loader.HDF5TrajectoryLoader(
mocap_data.H5_PATH, trajectories.WarehouseTrajectory)
clip_number = 54
self._trajectory = loader.get_trajectory(
mocap_data.IDENTIFIER_TEMPLATE.format(clip_number))
# create the floor arena
self._arena = floors.Floor()
self._walker = walker
self._walker_geoms = tuple(self._walker.mjcf_model.find_all('geom'))
self._feet_geoms = (
walker.mjcf_model.find('body', 'lfoot').find_all('geom') +
walker.mjcf_model.find('body', 'rfoot').find_all('geom'))
self._lhand_geoms = (
walker.mjcf_model.find('body', 'lhand').find_all('geom'))
self._rhand_geoms = (
walker.mjcf_model.find('body', 'rhand').find_all('geom'))
# resize the humanoid based on the motion capture data subject
self._trajectory.configure_walkers([self._walker])
walker.create_root_joints(self._arena.attach(walker))
control_timestep = self._trajectory.dt
self.set_timesteps(control_timestep, _PHYSICS_TIMESTEP)
# build and attach the bucket to the arena
self._bucket = props.Bucket(_BUCKET_SIZE)
self._arena.attach(self._bucket)
self._prop = self._trajectory.create_props(
priority_friction=self._priority_friction)[0]
self._arena.add_free_entity(self._prop)
self._task_observables = collections.OrderedDict()
# define feature based observations (agent may or may not use these)
def ego_prop_xpos(physics):
prop_xpos, _ = self._prop.get_pose(physics)
walker_xpos = physics.bind(self._walker.root_body).xpos
return self._walker.transform_vec_to_egocentric_frame(
physics, prop_xpos - walker_xpos)
self._task_observables['prop_{}/xpos'.format(0)] = (
observable.Generic(ego_prop_xpos))
def prop_zaxis(physics):
prop_xmat = physics.bind(
mjcf.get_attachment_frame(self._prop.mjcf_model)).xmat
return prop_xmat[[2, 5, 8]]
self._task_observables['prop_{}/zaxis'.format(0)] = (
observable.Generic(prop_zaxis))
def ego_bucket_xpos(physics):
bucket_xpos, _ = self._bucket.get_pose(physics)
walker_xpos = physics.bind(self._walker.root_body).xpos
return self._walker.transform_vec_to_egocentric_frame(
physics, bucket_xpos - walker_xpos)
self._task_observables['bucket_{}/xpos'.format(0)] = (
observable.Generic(ego_bucket_xpos))
for obs in (self._walker.observables.proprioception +
self._walker.observables.kinematic_sensors +
self._walker.observables.dynamic_sensors +
list(self._task_observables.values())):
obs.enabled = True
@property
def root_entity(self):
return self._arena
@property
def task_observables(self):
return self._task_observables
@property
def name(self):
return 'ball_toss'
def initialize_episode_mjcf(self, random_state):
self._reward = 0.0
self._discount = 1.0
self._should_terminate = False
self._prop.detach()
if self._proto_modifier:
trajectory = self._trajectory.get_modified_trajectory(
self._proto_modifier)
self._prop = trajectory.create_props(
priority_friction=self._priority_friction)[0]
self._arena.add_free_entity(self._prop)
# set the bucket position for this episode
bucket_distance = 1.*random_state.rand()+self._bucket_offset
mjcf.get_attachment_frame(self._bucket.mjcf_model).pos = [bucket_distance,
0, 0]
def initialize_episode(self, physics, random_state):
self._ground_geomid = physics.bind(
self._arena.mjcf_model.worldbody.geom[0]).element_id
self._feet_geomids = set(physics.bind(self._feet_geoms).element_id)
self._lhand_geomids = set(physics.bind(self._lhand_geoms).element_id)
self._rhand_geomids = set(physics.bind(self._rhand_geoms).element_id)
self._walker_geomids = set(physics.bind(self._walker_geoms).element_id)
self._bucket_rewarded = False
if self._randomize_init:
timestep_ind = random_state.randint(
len(self._trajectory._proto.timesteps)) # pylint: disable=protected-access
else:
timestep_ind = 0
walker_init_timestep = self._trajectory._proto.timesteps[timestep_ind] # pylint: disable=protected-access
prop_init_timestep = self._trajectory._proto.timesteps[0] # pylint: disable=protected-access
self._walker.set_pose(
physics,
position=walker_init_timestep.walkers[0].position,
quaternion=walker_init_timestep.walkers[0].quaternion)
self._walker.set_velocity(
physics, velocity=walker_init_timestep.walkers[0].velocity,
angular_velocity=walker_init_timestep.walkers[0].angular_velocity)
physics.bind(self._walker.mocap_joints).qpos = (
walker_init_timestep.walkers[0].joints)
physics.bind(self._walker.mocap_joints).qvel = (
walker_init_timestep.walkers[0].joints_velocity)
initial_prop_pos = np.copy(prop_init_timestep.props[0].position)
initial_prop_pos[0] += 1. # move ball (from mocap) relative to origin
initial_prop_pos[1] = 0 # align ball with walker along y-axis
self._prop.set_pose(
physics,
position=initial_prop_pos,
quaternion=prop_init_timestep.props[0].quaternion)
# specify the distributions of ball velocity componentwise
x_vel_mag = 4.5*random_state.rand()+1.5 # m/s
x_dist = 3 # approximate initial distance from walker to ball
self._t_dist = x_dist/x_vel_mag # target time at which to hit the humanoid
z_offset = .4*random_state.rand()+.1 # height at which to hit person
# compute velocity to satisfy desired projectile trajectory
z_vel_mag = (4.9*(self._t_dist**2) + z_offset)/self._t_dist
y_range = variation.evaluate(self._y_range, random_state=random_state)
y_vel_mag = y_range*random_state.rand()-y_range/2
trans_vel = [-x_vel_mag, y_vel_mag, z_vel_mag]
ang_vel = 1.5*random_state.rand(3)-0.75
self._prop.set_velocity(
physics,
velocity=trans_vel,
angular_velocity=ang_vel)
def after_step(self, physics, random_state):
# First we check for failure termination (walker or ball touches ground).
ground_failure = False
for contact in physics.data.contact:
if ((contact.geom1 == self._ground_geomid and
contact.geom2 not in self._feet_geomids) or
(contact.geom2 == self._ground_geomid and
contact.geom1 not in self._feet_geomids)):
ground_failure = True
break
contact_features = self._evaluate_contacts(physics)
prop_lhand, prop_rhand, bucket_prop, bucket_walker, walker_prop = contact_features
# or also fail if walker hits bucket
if ground_failure or bucket_walker:
if self._negative_reward_on_failure_termination:
self._reward = -_SPARSE_REWARD
else:
self._reward = 0.0
self._should_terminate = True
self._discount = 0.0
return
self._reward = 0.0
# give reward if prop is in bucket (prop touching bottom surface of bucket)
if bucket_prop:
self._reward += _SPARSE_REWARD/10
# shaping reward for being closer to bucket
if physics.data.time > (self._t_dist + self._toss_delay):
bucket_xy = physics.bind(self._bucket.geom).xpos[0][:2]
prop_xy = self._prop.get_pose(physics)[0][:2]
xy_dist = np.sum(np.array(np.abs(bucket_xy - prop_xy)))
self._reward += np.exp(-xy_dist/3.)*_SPARSE_REWARD/50
else:
# bonus for hands touching ball
if prop_lhand:
self._reward += _SPARSE_REWARD/100
if prop_rhand:
self._reward += _SPARSE_REWARD/100
# combined with penalty for other body parts touching the ball
if walker_prop:
self._reward -= _SPARSE_REWARD/100
def get_reward(self, physics):
return self._reward
def get_discount(self, physics):
return self._discount
def should_terminate_episode(self, physics):
return self._should_terminate
def _evaluate_contacts(self, physics):
prop_elem_id = physics.bind(self._prop.geom).element_id
bucket_bottom_elem_id = physics.bind(self._bucket.geom[0]).element_id
bucket_any_elem_id = set(physics.bind(self._bucket.geom).element_id)
prop_lhand_contact = False
prop_rhand_contact = False
bucket_prop_contact = False
bucket_walker_contact = False
walker_prop_contact = False
for contact in physics.data.contact:
has_prop = (contact.geom1 == prop_elem_id or
contact.geom2 == prop_elem_id)
has_bucket_bottom = (contact.geom1 == bucket_bottom_elem_id or
contact.geom2 == bucket_bottom_elem_id)
has_bucket_any = (contact.geom1 in bucket_any_elem_id or
contact.geom2 in bucket_any_elem_id)
has_lhand = (contact.geom1 in self._lhand_geomids or
contact.geom2 in self._lhand_geomids)
has_rhand = (contact.geom1 in self._rhand_geomids or
contact.geom2 in self._rhand_geomids)
has_walker = (contact.geom1 in self._walker_geomids or
contact.geom2 in self._walker_geomids)
if has_prop and has_bucket_bottom:
bucket_prop_contact = True
if has_walker and has_bucket_any:
bucket_walker_contact = True
if has_walker and has_prop:
walker_prop_contact = True
if has_prop and has_lhand:
prop_lhand_contact = True
if has_prop and has_rhand:
prop_rhand_contact = True
return (prop_lhand_contact, prop_rhand_contact, bucket_prop_contact,
bucket_walker_contact, walker_prop_contact)
+37
View File
@@ -0,0 +1,37 @@
# Copyright 2020 Deepmind Technologies Limited.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Simple script to launch viewer with an example environment."""
from absl import app
from absl import flags
from dm_control import viewer
from catch_carry import task_examples
FLAGS = flags.FLAGS
flags.DEFINE_enum('task', 'warehouse', ['warehouse', 'toss'],
'The task to visualize.')
TASKS = {
'warehouse': task_examples.build_vision_warehouse,
'toss': task_examples.build_vision_toss,
}
def main(unused_argv):
viewer.launch(environment_loader=TASKS[FLAGS.task])
if __name__ == '__main__':
app.run(main)
Binary file not shown.
+187
View File
@@ -0,0 +1,187 @@
# Copyright 2020 Deepmind Technologies Limited.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Metadata for mocap clips that correspond to a walker carrying a prop."""
import collections
import enum
import os
from dm_control.locomotion.mocap import loader as mocap_loader
from catch_carry import trajectories
H5_DIR = os.path.dirname(__file__)
H5_PATH = os.path.join(H5_DIR, 'mocap_data.h5')
IDENTIFIER_PREFIX = 'DeepMindCatchCarry'
IDENTIFIER_TEMPLATE = IDENTIFIER_PREFIX + '-{:03d}'
ClipInfo = collections.namedtuple(
'ClipInfo', ('clip_identifier', 'num_steps', 'dt', 'flags'))
class Flag(enum.IntEnum):
BOX = 1 << 0
BALL = 1 << 1
LIGHT_PROP = 1 << 2
HEAVY_PROP = 1 << 3
SMALL_PROP = 1 << 4
LARGE_PROP = 1 << 5
FLOOR_LEVEL = 1 << 6
MEDIUM_PEDESTAL = 1 << 7
HIGH_PEDESTAL = 1 << 8
_ALL_CLIPS = None
def _get_clip_info(loader, clip_number, flags):
clip = loader.get_trajectory(IDENTIFIER_TEMPLATE.format(clip_number))
return ClipInfo(
clip_identifier=clip.identifier,
num_steps=clip.num_steps,
dt=clip.dt,
flags=flags)
def _get_all_clip_infos_if_necessary():
"""Creates the global _ALL_CLIPS list if it has not already been created."""
global _ALL_CLIPS
if _ALL_CLIPS is None:
loader = mocap_loader.HDF5TrajectoryLoader(
H5_PATH, trajectories.WarehouseTrajectory)
clip_numbers = (1, 2, 3, 4, 5, 6, 9, 10,
11, 12, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28,
29, 30, 31, 32, 33, 34, 35, 36,
37, 38, 39, 40, 42, 43, 44, 45,
46, 47, 48, 49, 50, 51, 52, 53)
clip_infos = []
for i, clip_number in enumerate(clip_numbers):
flags = 0
if i in _FLOOR_LEVEL:
flags |= Flag.FLOOR_LEVEL
elif i in _MEDIUM_PEDESTAL:
flags |= Flag.MEDIUM_PEDESTAL
elif i in _HIGH_PEDESTAL:
flags |= Flag.HIGH_PEDESTAL
if i in _LIGHT_PROP:
flags |= Flag.LIGHT_PROP
elif i in _HEAVY_PROP:
flags |= Flag.HEAVY_PROP
if i in _SMALL_BOX:
flags |= Flag.SMALL_PROP
flags |= Flag.BOX
elif i in _LARGE_BOX:
flags |= Flag.LARGE_PROP
flags |= Flag.BOX
elif i in _SMALL_BALL:
flags |= Flag.SMALL_PROP
flags |= Flag.BALL
elif i in _LARGE_BALL:
flags |= Flag.LARGE_PROP
flags |= Flag.BALL
clip_infos.append(_get_clip_info(loader, clip_number, flags))
_ALL_CLIPS = tuple(clip_infos)
def _assert_partitions_all_clips(*args):
"""Asserts that a given set of subcollections partitions ALL_CLIPS."""
sets = tuple(set(arg) for arg in args)
# Check that the union of all the sets is ALL_CLIPS.
union = set()
for subset in sets:
union = union | set(subset)
assert union == set(range(48))
# Check that the sets are pairwise disjoint.
for i in range(len(sets)):
for j in range(i + 1, len(sets)):
assert sets[i] & sets[j] == set()
_FLOOR_LEVEL = tuple(range(0, 16))
_MEDIUM_PEDESTAL = tuple(range(16, 32))
_HIGH_PEDESTAL = tuple(range(32, 48))
_assert_partitions_all_clips(_FLOOR_LEVEL, _MEDIUM_PEDESTAL, _HIGH_PEDESTAL)
_LIGHT_PROP = (0, 1, 2, 3, 8, 9, 12, 13, 16, 17, 18, 19, 24,
25, 26, 27, 34, 35, 38, 39, 42, 43, 46, 47)
_HEAVY_PROP = (4, 5, 6, 7, 10, 11, 14, 15, 20, 21, 22, 23, 28,
29, 30, 31, 32, 33, 36, 37, 40, 41, 44, 45)
_assert_partitions_all_clips(_LIGHT_PROP, _HEAVY_PROP)
_SMALL_BOX = (0, 1, 4, 5, 16, 17, 20, 21, 34, 35, 36, 37)
_LARGE_BOX = (2, 3, 6, 7, 18, 19, 22, 23, 32, 33, 38, 39)
_SMALL_BALL = (8, 9, 10, 11, 24, 25, 30, 31, 40, 41, 46, 47)
_LARGE_BALL = (12, 13, 14, 15, 26, 27, 28, 29, 42, 43, 44, 45)
_assert_partitions_all_clips(_SMALL_BOX, _LARGE_BOX, _SMALL_BALL, _LARGE_BALL)
def all_clips():
_get_all_clip_infos_if_necessary()
return _ALL_CLIPS
def floor_level():
clips = all_clips()
return tuple(clips[i] for i in _FLOOR_LEVEL)
def medium_pedestal():
clips = all_clips()
return tuple(clips[i] for i in _MEDIUM_PEDESTAL)
def high_pedestal():
clips = all_clips()
return tuple(clips[i] for i in _HIGH_PEDESTAL)
def light_prop():
clips = all_clips()
return tuple(clips[i] for i in _LIGHT_PROP)
def heavy_prop():
clips = all_clips()
return tuple(clips[i] for i in _HEAVY_PROP)
def small_box():
clips = all_clips()
return tuple(clips[i] for i in _SMALL_BOX)
def large_box():
clips = all_clips()
return tuple(clips[i] for i in _LARGE_BOX)
def small_ball():
clips = all_clips()
return tuple(clips[i] for i in _SMALL_BALL)
def large_ball():
clips = all_clips()
return tuple(clips[i] for i in _LARGE_BALL)
+86
View File
@@ -0,0 +1,86 @@
# Copyright 2020 Deepmind Technologies Limited.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""A rectangular pedestal."""
from dm_control import composer
from dm_control import mjcf
class Pedestal(composer.Entity):
"""A rectangular pedestal."""
def _build(self, size=(.2, .3, .05), rgba=(0, .5, 0, 1), name='pedestal'):
self._mjcf_root = mjcf.RootElement(model=name)
self._geom = self._mjcf_root.worldbody.add(
'geom', type='box', size=size, name='geom', rgba=rgba)
@property
def mjcf_model(self):
return self._mjcf_root
@property
def geom(self):
return self._geom
def after_compile(self, physics, unused_random_state):
super(Pedestal, self).after_compile(physics, unused_random_state)
self._body_geom_ids = set(
physics.bind(geom).element_id
for geom in self.mjcf_model.find_all('geom'))
@property
def body_geom_ids(self):
return self._body_geom_ids
class Bucket(composer.Entity):
"""A rectangular bucket."""
def _build(self, size=(.2, .3, .05), rgba=(0, .5, 0, 1), name='pedestal'):
self._mjcf_root = mjcf.RootElement(model=name)
self._geoms = []
self._geoms.append(self._mjcf_root.worldbody.add(
'geom', type='box', size=size, name='geom_bottom', rgba=rgba))
self._geoms.append(self._mjcf_root.worldbody.add(
'geom', type='box', size=(size[2], size[1], size[0]), name='geom_s1',
rgba=rgba, pos=[size[0], 0, size[0]]))
self._geoms.append(self._mjcf_root.worldbody.add(
'geom', type='box', size=(size[2], size[1], size[0]), name='geom_s2',
rgba=rgba, pos=[-size[0], 0, size[0]]))
self._geoms.append(self._mjcf_root.worldbody.add(
'geom', type='box', size=(size[0], size[2], size[0]), name='geom_s3',
rgba=rgba, pos=[0, size[1], size[0]]))
self._geoms.append(self._mjcf_root.worldbody.add(
'geom', type='box', size=(size[0], size[2], size[0]), name='geom_s4',
rgba=rgba, pos=[0, -size[1], size[0]]))
@property
def mjcf_model(self):
return self._mjcf_root
@property
def geom(self):
return self._geoms
def after_compile(self, physics, unused_random_state):
super(Bucket, self).after_compile(physics, unused_random_state)
self._body_geom_ids = set(
physics.bind(geom).element_id
for geom in self.mjcf_model.find_all('geom'))
@property
def body_geom_ids(self):
return self._body_geom_ids
+35
View File
@@ -0,0 +1,35 @@
# Copyright 2020 DeepMind Technologies Limited
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Setup for pip package."""
from setuptools import find_packages
from setuptools import setup
REQUIRED_PACKAGES = ['absl-py', 'dm_control', 'numpy']
setup(
name='catch_carry',
version='0.1',
description='Whole-body object manipulation tasks and motion capture data.',
url='https://github.com/deepmind/deepmind-research/catch_carry',
author='DeepMind',
author_email='stunya@google.com',
# Contained modules and scripts.
packages=find_packages(),
install_requires=REQUIRED_PACKAGES,
platforms=['any'],
license='Apache 2.0',
)
+82
View File
@@ -0,0 +1,82 @@
# Copyright 2020 Deepmind Technologies Limited.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Functions that build representative tasks."""
from dm_control import composer
from dm_control.composer.variation import distributions
from dm_control.locomotion.mocap import loader as mocap_loader
from dm_control.locomotion.walkers import cmu_humanoid
from catch_carry import ball_toss
from catch_carry import warehouse
def build_vision_warehouse(random_state=None):
"""Build canonical 4-pedestal, 2-prop task."""
# Build a position-controlled CMU humanoid walker.
walker = cmu_humanoid.CMUHumanoidPositionControlled(
observable_options={'egocentric_camera': dict(enabled=True)})
# Build the task.
size_distribution = distributions.Uniform(low=0.75, high=1.25)
mass_distribution = distributions.Uniform(low=2, high=7)
prop_resizer = mocap_loader.PropResizer(size_factor=size_distribution,
mass=mass_distribution)
task = warehouse.PhasedBoxCarry(
walker=walker,
num_props=2,
num_pedestals=4,
proto_modifier=prop_resizer,
negative_reward_on_failure_termination=True)
# return the environment
return composer.Environment(
time_limit=15,
task=task,
random_state=random_state,
strip_singleton_obs_buffer_dim=True,
max_reset_attempts=float('inf'))
def build_vision_toss(random_state=None):
"""Build canonical ball tossing task."""
# Build a position-controlled CMU humanoid walker.
walker = cmu_humanoid.CMUHumanoidPositionControlled(
observable_options={'egocentric_camera': dict(enabled=True)})
# Build the task.
size_distribution = distributions.Uniform(low=0.95, high=1.5)
mass_distribution = distributions.Uniform(low=2, high=4)
prop_resizer = mocap_loader.PropResizer(size_factor=size_distribution,
mass=mass_distribution)
task = ball_toss.BallToss(
walker=walker,
proto_modifier=prop_resizer,
negative_reward_on_failure_termination=True,
priority_friction=True,
bucket_offset=3.,
y_range=0.5,
toss_delay=1.5,
randomize_init=True)
# return the environment
return composer.Environment(
time_limit=6,
task=task,
random_state=random_state,
strip_singleton_obs_buffer_dim=True,
max_reset_attempts=float('inf'))
Binary file not shown.

After

Width:  |  Height:  |  Size: 181 KiB

+225
View File
@@ -0,0 +1,225 @@
# Copyright 2020 Deepmind Technologies Limited.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Mocap trajectory that assumes props start stationary on pedestals."""
import copy
import enum
import itertools
from dm_control.locomotion.mocap import mocap_pb2
from dm_control.locomotion.mocap import trajectory
from dm_control.utils import transformations
import numpy as np
_PEDESTAL_SIZE = (0.2, 0.2, 0.02)
_MAX_SETTLE_STEPS = 100
@enum.unique
class ClipSegment(enum.Enum):
"""Annotations for subsegments within a warehouse clips."""
# Clip segment corresponding to a walker approaching an object
APPROACH = 1
# Clip segment corresponding to a walker picking up an object.
PICKUP = 2
# Clip segment corresponding to the "first half" of the walker carrying an
# object, beginning from the walker backing away from a pedestal with
# object in hand.
CARRY1 = 3
# Clip segment corresponding to the "second half" of the walker carrying an
# object, ending in the walker approaching a pedestal the object in hand.
CARRY2 = 4
# Clip segment corresponding to a walker putting down an object on a pedestal.
PUTDOWN = 5
# Clip segment corresponding to a walker backing off after successfully
# placing an object on a pedestal.
BACKOFF = 6
def _get_rotated_bounding_box(size, quaternion):
"""Calculates the bounding box of a rotated 3D box.
Args:
size: An array of length 3 specifying the half-lengths of a box.
quaternion: A unit quaternion specifying the box's orientation.
Returns:
An array of length 3 specifying the half-lengths of the bounding box of
the rotated box.
"""
corners = ((size[0], size[1], size[2]),
(size[0], size[1], -size[2]),
(size[0], -size[1], size[2]),
(-size[0], size[1], size[2]))
rotated_corners = tuple(
transformations.quat_rotate(quaternion, corner) for corner in corners)
return np.amax(np.abs(rotated_corners), axis=0)
def _get_prop_z_extent(prop_proto, quaternion):
"""Calculates the "z-extent" of the prop in given orientation.
This is the distance from the centre of the prop to its lowest point in the
world frame, taking into account the prop's orientation.
Args:
prop_proto: A `mocap_pb2.Prop` protocol buffer defining a prop.
quaternion: A unit quaternion specifying the prop's orientation.
Returns:
the distance from the centre of the prop to its lowest point in the
world frame in the specified orientation.
"""
if prop_proto.shape == mocap_pb2.Prop.BOX:
return _get_rotated_bounding_box(prop_proto.size, quaternion)[2]
elif prop_proto.shape == mocap_pb2.Prop.SPHERE:
return prop_proto.size[0]
else:
raise NotImplementedError(
'Unsupported prop shape: {}'.format(prop_proto.shape))
class WarehouseTrajectory(trajectory.Trajectory):
"""Mocap trajectory that assumes props start stationary on pedestals."""
def infer_pedestal_positions(self, num_averaged_steps=30,
ground_height_tolerance=0.1,
proto_modifier=None):
proto = self._proto
if proto_modifier is not None:
proto = copy.copy(proto)
proto_modifier(proto)
if not proto.props:
return []
positions = []
for timestep in itertools.islice(proto.timesteps, num_averaged_steps):
positions_for_timestep = []
for prop_proto, prop_timestep in zip(proto.props, timestep.props):
z_extent = _get_prop_z_extent(prop_proto, prop_timestep.quaternion)
positions_for_timestep.append([prop_timestep.position[0],
prop_timestep.position[1],
prop_timestep.position[2] - z_extent])
positions.append(positions_for_timestep)
median_positions = np.median(positions, axis=0)
median_positions[:, 2][median_positions[:, 2] < ground_height_tolerance] = 0
return median_positions
def get_props_z_extent(self, physics):
timestep = self._proto.timesteps[self._get_step_id(physics.time())]
out = []
for prop_proto, prop_timestep in zip(self._proto.props, timestep.props):
z_extent = _get_prop_z_extent(prop_proto, prop_timestep.quaternion)
out.append(z_extent)
return out
class SinglePropCarrySegmentedTrajectory(WarehouseTrajectory):
"""A mocap trajectory class that automatically segments prop-carry clips.
The algorithm implemented in the class only works if the trajectory consists
of exactly one walker and one prop. The value of `pedestal_zone_distance`
the exact nature of zone crossings are determined empirically from the
DeepMindCatchCarry dataset, and are likely to not work well outside of this
setting.
"""
def __init__(self,
proto,
start_time=None,
end_time=None,
pedestal_zone_distance=0.65,
start_step=None,
end_step=None,
zero_out_velocities=True):
super(SinglePropCarrySegmentedTrajectory, self).__init__(
proto, start_time, end_time, start_step=start_step, end_step=end_step,
zero_out_velocities=zero_out_velocities)
self._pedestal_zone_distance = pedestal_zone_distance
self._generate_segments()
def _generate_segments(self):
pedestal_position = self.infer_pedestal_positions()[0]
# First we find the timesteps at which the walker cross the pedestal's
# vicinity zone. This should happen exactly 4 times: enter it to pick up,
# leave it, enter it again to put down, and leave it again.
was_in_pedestal_zone = False
crossings = []
for i, timestep in enumerate(self._proto.timesteps):
pedestal_dist = np.linalg.norm(
timestep.walkers[0].position[:2] - pedestal_position[:2])
if pedestal_dist > self._pedestal_zone_distance and was_in_pedestal_zone:
crossings.append(i)
was_in_pedestal_zone = False
elif (pedestal_dist <= self._pedestal_zone_distance and
not was_in_pedestal_zone):
crossings.append(i)
was_in_pedestal_zone = True
if len(crossings) < 3:
raise RuntimeError(
'Failed to segment the given trajectory: '
'walker should cross the pedestal zone\'s boundary >= 3 times '
'but got {}'.format(len(crossings)))
elif len(crossings) == 3:
crossings.append(len(self._proto.timesteps) - 1)
elif len(crossings) > 4:
crossings = [crossings[0], crossings[1], crossings[-2], crossings[-1]]
# Identify the pick up event during the first in-zone interval.
start_position = np.array(self._proto.timesteps[0].props[0].position)
end_position = np.array(self._proto.timesteps[-1].props[0].position)
pick_up_step = crossings[1] - 1
while pick_up_step > crossings[0]:
prev_position = self._proto.timesteps[pick_up_step - 1].props[0].position
if np.linalg.norm(start_position[2] - prev_position[2]) < 0.001:
break
pick_up_step -= 1
# Identify the put down event during the second in-zone interval.
put_down_step = crossings[2]
while put_down_step <= crossings[3]:
next_position = self._proto.timesteps[put_down_step + 1].props[0].position
if np.linalg.norm(end_position[2] - next_position[2]) < 0.001:
break
put_down_step += 1
carry_halfway_step = int((crossings[1] + crossings[2]) / 2)
self._segment_intervals = {
ClipSegment.APPROACH: (0, crossings[0]),
ClipSegment.PICKUP: (crossings[0], pick_up_step),
ClipSegment.CARRY1: (pick_up_step, carry_halfway_step),
ClipSegment.CARRY2: (carry_halfway_step, crossings[2]),
ClipSegment.PUTDOWN: (crossings[2], put_down_step),
ClipSegment.BACKOFF: (put_down_step, len(self._proto.timesteps))
}
def segment_interval(self, segment):
start_step, end_step = self._segment_intervals[segment]
return (start_step * self._proto.dt, (end_step - 1) * self._proto.dt)
def get_random_timestep_in_segment(self, segment, random_step):
return self._proto.timesteps[
random_step.randint(*self._segment_intervals[segment])]
File diff suppressed because it is too large Load Diff