Update citation and add demo results for no-TVT with gamma<1.

PiperOrigin-RevId: 281522361
2026-05-31 13:05:40 +08:00 · 2019-11-20 16:07:58 +00:00
parent 5c9f992652
commit 94505a89e6
37 changed files with 5447 additions and 0 deletions
@@ -0,0 +1,66 @@
+# DM Lab Tasks
+
+## General Structure
+
+There are 7 [DM Lab](https://github.com/deepmind/lab) tasks presented here.
+Each level is composed of 3 distinct phases (except `Key To Door To Match`
+which has 5 phases). The first phase is the 'explore' phase, where the agent
+should learn a piece of information or do something. For all tasks, the 2nd
+phase is the 'distractor' phase, where the agent collects apples for rewards.
+The 3rd phase is the 'exploit' phase, where the agent gets rewards based on the
+knowledge acquired or actions performed in phase 1.
+
+## Specific Tasks
+
+### Passive Visual Match
+
+* Phase 1: A colour square is shown directly in front of the agent.
+* Phase 2: Apple collection.
+* Phase 3: Choose the colour square matching the one in Phase 1 from 4 options.
+
+### Active Visual Match
+
+* Phase 1: A colour square is randomly placed in a pair of connected rooms.
+* Phase 2: Apple collection.
+* Phase 3: Choose the colour square matching the one in Phase 1 from 4 options.
+
+### Key To Door
+
+* Phase 1: A key is randomly placed in a pair of connected rooms.
+* Phase 2: Apple collection.
+* Phase 3: A small room with a door. If the agent has the key, it can open the
+           door and reach the goal behind it to get the reward.
+
+### Key To Door Bluekey
+
+Identical to `Key To Door` above, except that the key is blue instead of
+black.
+
+### Two Negative Keys
+
+* Phase 1: A blue key and a red key are placed in a small room. The agent can
+           pick up only one of the keys.
+* Phase 2: Apple collection.
+* Phase 3: A small room with a door. If the agent has either key, it can open
+           the door to get the reward. The reward depends on which key it
+           picked up in Phase 1. All the rewards are negative in this level.
+
+### Latent Information Acquisition
+
+* Phase 1: Three randomly sampled objects are randomly placed in a small room.
+           When the agent touches an object, a red or green cue appears,
+           indicating the reward associated with it in this episode. No
+           rewards are given in this phase.
+* Phase 2: Apple collection.
+* Phase 3: The same three objects from Phase 1 are randomly placed in the room
+           again. The agent gets positive rewards for picking up objects that
+           had green cues in Phase 1, and negative rewards for red-cued ones.
+
+### Key To Door To Match
+
+* Phase 1: A key is randomly placed in a room. The agent can pick it up.
+* Phase 2: Apple collection.
+* Phase 3: A colour square behind a door. If the agent has the key from Phase
+           1, it can open the door to see the colour.
+* Phase 4: Apple collection.
+* Phase 5: Choose the colour square matching the one in Phase 3 from 4 options.
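The three-phase structure described in the README can be sketched as a timeline. This is a minimal, standalone illustration and not part of the commit: `phaseSchedule` is a hypothetical helper, the timing values are those of the `Key To Door` config, and it assumes the phases run back to back with the episode timeout closing the exploit phase (map loading time is ignored, and the exploit phase may end earlier once the goal is collected).

```lua
-- Hypothetical helper (not part of the commit): lays out the phase windows
-- implied by the timing fields that the level configs pass to the factories.
local function phaseSchedule(config)
  local exploreEnd = config.exploreLengthSeconds
  local distractorEnd = exploreEnd + config.distractorLengthSeconds
  return {
      {name = 'explore', startSeconds = 0, endSeconds = exploreEnd},
      {name = 'distractor', startSeconds = exploreEnd,
       endSeconds = distractorEnd},
      -- Assumption: the exploit phase lasts until the episode timeout.
      {name = 'exploit', startSeconds = distractorEnd,
       endSeconds = config.episodeLengthSeconds},
  }
end

-- Timing values taken from the Key To Door level config.
local schedule = phaseSchedule{
    episodeLengthSeconds = 37,
    exploreLengthSeconds = 5,
    distractorLengthSeconds = 30,
}

for _, phase in ipairs(schedule) do
  print(string.format('%-10s %2d-%2ds', phase.name, phase.startSeconds,
                      phase.endSeconds))
end
```

Under these assumptions the exploit window is only the last 2 seconds of the 37-second episode, so the 30-second distractor phase dominates the episode.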
@@ -0,0 +1,24 @@
+-- Copyright 2019 DeepMind Technologies Limited. All Rights Reserved.
+-- Licensed under the Apache License, Version 2.0 (the "License");
+-- you may not use this file except in compliance with the License.
+-- You may obtain a copy of the License at
+--    http://www.apache.org/licenses/LICENSE-2.0
+-- Unless required by applicable law or agreed to in writing, software
+-- distributed under the License is distributed on an "AS IS" BASIS,
+-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or  implied.
+-- See the License for the specific language governing permissions and
+-- limitations under the License.
+-- ============================================================================
+local factory = require 'visual_match_factory'
+
+return factory.createLevelApi{
+    exploreMapMode = 'TWO_ROOMS',
+    episodeLengthSeconds = 40,
+    exploreLengthSeconds = 5,
+    distractorLengthSeconds = 30,
+
+    differentDistractRoomTexture = true,
+    differentRewardRoomTexture = true,
+    correctReward = 10,
+    incorrectReward = 1,
+}
@@ -0,0 +1,42 @@
+-- Copyright 2019 DeepMind Technologies Limited. All Rights Reserved.
+-- Licensed under the Apache License, Version 2.0 (the "License");
+-- you may not use this file except in compliance with the License.
+-- You may obtain a copy of the License at
+--    http://www.apache.org/licenses/LICENSE-2.0
+-- Unless required by applicable law or agreed to in writing, software
+-- distributed under the License is distributed on an "AS IS" BASIS,
+-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or  implied.
+-- See the License for the specific language governing permissions and
+-- limitations under the License.
+-- ============================================================================
+local tensor = require 'dmlab.system.tensor'
+
+local utils = {}
+utils.COLORS = {
+    {0, 0, 0},
+    {0, 0, 170},
+    {0, 170, 0},
+    {0, 170, 170},
+    {170, 0, 0},
+    {170, 0, 170},
+    {170, 85, 0},
+    {170, 170, 170},
+    {85, 85, 85},
+    {85, 85, 255},
+    {85, 255, 85},
+    {85, 255, 255},
+    {255, 85, 85},
+    {255, 85, 255},
+    {255, 255, 85},
+    {255, 255, 255},
+}
+
+function utils:createByteImage(h, w, rgb)
+  return tensor.ByteTensor(h, w, 4):fill{rgb[1], rgb[2], rgb[3], 255}
+end
+
+function utils:createTransparentImage(h, w)
+  return tensor.ByteTensor(h, w, 4):fill{127, 127, 127, 0}
+end
+
+return utils
@@ -0,0 +1,20 @@
+-- Copyright 2019 DeepMind Technologies Limited. All Rights Reserved.
+-- Licensed under the Apache License, Version 2.0 (the "License");
+-- you may not use this file except in compliance with the License.
+-- You may obtain a copy of the License at
+--    http://www.apache.org/licenses/LICENSE-2.0
+-- Unless required by applicable law or agreed to in writing, software
+-- distributed under the License is distributed on an "AS IS" BASIS,
+-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or  implied.
+-- See the License for the specific language governing permissions and
+-- limitations under the License.
+-- ============================================================================
+local factory = require 'key_to_door_factory'
+
+return factory.createLevelApi{
+    episodeLengthSeconds = 37,
+    exploreLengthSeconds = 5,
+    distractorLengthSeconds = 30,
+    differentDistractRoomTexture = true,
+    differentRewardRoomTexture = true,
+}
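This config does not override any apple settings, so the factory defaults (`DISTRACTOR_ROOM_SIZE`, `PROB_APPLE_IN_DISTRACTOR_MAP`, `APPLE_REWARD`, `GOAL_REWARD` in `key_to_door_factory`) apply. The following standalone sketch, which is not part of the commit, works out how much distractor reward is available on average compared with the goal reward. It assumes every candidate cell spawns an apple independently and that `appleRewardProb` stays at its default of 1.0.

```lua
-- Values copied from DEFAULTS in key_to_door_factory.
local distractorRoomSize = {11, 11}
local probAppleInDistractorMap = 0.3
local appleReward = 5
local goalReward = 10

-- Every cell except the spawn cell is a candidate apple location,
-- matching numPossibleAppleLocations in the factory.
local candidateCells = distractorRoomSize[1] * distractorRoomSize[2] - 1

-- Expected number of apples spawned, and the score they carry in total.
local expectedApples = candidateCells * probAppleInDistractorMap
local expectedAppleScore = expectedApples * appleReward

print(candidateCells, expectedApples, expectedAppleScore, goalReward)
```

With 120 candidate cells, about 36 apples spawn on average, worth roughly 180 points if all were collected. That upper bound is far larger than the goal reward of 10; how many apples an agent actually reaches in 30 seconds is not something this sketch estimates.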
@@ -0,0 +1,22 @@
+-- Copyright 2019 DeepMind Technologies Limited. All Rights Reserved.
+-- Licensed under the Apache License, Version 2.0 (the "License");
+-- you may not use this file except in compliance with the License.
+-- You may obtain a copy of the License at
+--    http://www.apache.org/licenses/LICENSE-2.0
+-- Unless required by applicable law or agreed to in writing, software
+-- distributed under the License is distributed on an "AS IS" BASIS,
+-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or  implied.
+-- See the License for the specific language governing permissions and
+-- limitations under the License.
+-- ============================================================================
+local factory = require 'key_to_door_factory'
+
+return factory.createLevelApi{
+    keyColor = {0, 0, 255},
+    episodeLengthSeconds = 37,
+    exploreLengthSeconds = 5,
+    distractorLengthSeconds = 30,
+    differentDistractRoomTexture = true,
+    differentRewardRoomTexture = true,
+}
+
@@ -0,0 +1,459 @@
+-- Copyright 2019 DeepMind Technologies Limited. All Rights Reserved.
+-- Licensed under the Apache License, Version 2.0 (the "License");
+-- you may not use this file except in compliance with the License.
+-- You may obtain a copy of the License at
+--    http://www.apache.org/licenses/LICENSE-2.0
+-- Unless required by applicable law or agreed to in writing, software
+-- distributed under the License is distributed on an "AS IS" BASIS,
+-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or  implied.
+-- See the License for the specific language governing permissions and
+-- limitations under the License.
+-- ============================================================================
+local make_map = require 'common.make_map'
+local custom_observations = require 'decorators.custom_observations'
+local debug_observations = require 'decorators.debug_observations'
+local game = require 'dmlab.system.game'
+local map_maker = require 'dmlab.system.map_maker'
+local maze_generation = require 'dmlab.system.maze_generation'
+local pickup_decorator = require 'decorators.human_recognisable_pickups'
+local random = require 'common.random'
+local setting_overrides = require 'decorators.setting_overrides'
+local texture_sets = require 'themes.texture_sets'
+local themes = require 'themes.themes'
+local hrp = require 'common.human_recognisable_pickups'
+
+local DEFAULTS = {
+    EPISODE_LENGTH_SECONDS = 15,
+    EXPLORE_LENGTH_SECONDS = 5,
+    DISTRACTOR_LENGTH_SECONDS = 5,
+    REWARD_LENGTH_SECONDS = nil,
+    SHOW_KEY_COLOR_SQUARE_SECONDS = 1,
+    PROB_APPLE_IN_DISTRACTOR_MAP = 0.3,
+    APPLE_REWARD = 5,
+    APPLE_REWARD_PROB = 1.0,
+    APPLE_EXTRA_REWARD_RANGE = 0,
+    GOAL_REWARD = 10,
+    DISTRACTOR_ROOM_SIZE = {11, 11},
+    DIFFERENT_DISTRACT_ROOM_TEXTURE = false,
+    DIFFERENT_REWARD_ROOM_TEXTURE = false,
+    KEY_COLOR = {0, 0, 0},
+}
+
+local APPLE_ID = 998
+local GOAL_ID = 999
+local KEY_SPAWN_ID = 1000
+local DOOR_ID = 1001
+
+local KEY_CUE_RECTANGLE_WIDTH = 600
+local KEY_CUE_RECTANGLE_HEIGHT = 200
+
+-- Table that maps from full decal name to decal index number.
+local decalIndices = {}
+
+local EXPLORE_MAP = "exploreMap"
+local DISTRACTOR_MAP = "distractorMap"
+local REWARD_MAP = "rewardMap"
+
+-- Set texture set for all maps.
+local textureSet = texture_sets.PACMAN
+local secondTextureSet = texture_sets.TETRIS
+local thirdTextureSet = texture_sets.TRON
+
+local REWARD_ROOM =[[
+***
+*P*
+*H*
+*G*
+***
+]]
+
+local OPEN_TWO_ROOM = [[
+*********
+*********
+*PKK*KKK*
+*KKKKKKK*
+*KKK*KKK*
+*********
+]]
+local N_KEY_POS_IN_TWO_ROOM = 18  -- # of K in OPEN_TWO_ROOM
+
+local function createDistractorMaze(opts)
+    -- Example room with height = 2, width = 3
+    -- A are possible apple locations (everywhere)
+    -- *****
+    -- *APA*
+    -- *AAA*
+    -- *****
+
+    local roomHeight = opts.roomSize[1]
+    local roomWidth = opts.roomSize[2]
+    local centerWidth = 1 + math.ceil(roomWidth / 2)
+    local maze = maze_generation:mazeGeneration{
+        height = roomHeight + 2,  -- +2 for the walls on both sides
+        width = roomWidth + 2
+    }
+
+    -- Fill the room with 'A' for apples. updateSpawnVars decides which to keep.
+    for i = 2, roomHeight + 1 do
+      for j = 2, roomWidth + 1 do
+        maze:setEntityCell(i, j, 'A')
+      end
+    end
+    -- Override one cell with 'P' for spawn point.
+    maze:setEntityCell(2, centerWidth, 'P')
+    return maze
+end
+
+local function numPossibleAppleLocations(distractorRoomSize)
+  return distractorRoomSize[1] * distractorRoomSize[2] - 1
+end
+
+local factory = {}
+game:console('cg_drawScriptRectanglesAlways 1')
+
+function factory.createLevelApi(kwargs)
+  kwargs.episodeLengthSeconds = kwargs.episodeLengthSeconds or
+                                DEFAULTS.EPISODE_LENGTH_SECONDS
+  kwargs.exploreLengthSeconds = kwargs.exploreLengthSeconds or
+                                DEFAULTS.EXPLORE_LENGTH_SECONDS
+  kwargs.rewardLengthSeconds = kwargs.rewardLengthSeconds or
+                               DEFAULTS.REWARD_LENGTH_SECONDS
+  kwargs.distractorLengthSeconds = kwargs.distractorLengthSeconds or
+                                   DEFAULTS.DISTRACTOR_LENGTH_SECONDS
+  kwargs.distractorRoomSize = kwargs.distractorRoomSize or
+                              DEFAULTS.DISTRACTOR_ROOM_SIZE
+
+  kwargs.appleReward = kwargs.appleReward or DEFAULTS.APPLE_REWARD
+  kwargs.appleRewardProb = kwargs.appleRewardProb or DEFAULTS.APPLE_REWARD_PROB
+  kwargs.probAppleInDistractorMap = kwargs.probAppleInDistractorMap or
+                                    DEFAULTS.PROB_APPLE_IN_DISTRACTOR_MAP
+
+  kwargs.appleExtraRewardRange =
+      kwargs.appleExtraRewardRange or DEFAULTS.APPLE_EXTRA_REWARD_RANGE
+
+  kwargs.differentDistractRoomTexture = kwargs.differentDistractRoomTexture or
+                                        DEFAULTS.DIFFERENT_DISTRACT_ROOM_TEXTURE
+
+  kwargs.differentRewardRoomTexture = kwargs.differentRewardRoomTexture or
+                                      DEFAULTS.DIFFERENT_REWARD_ROOM_TEXTURE
+
+  kwargs.showKeyColorSquareSeconds = kwargs.showKeyColorSquareSeconds or
+                                     DEFAULTS.SHOW_KEY_COLOR_SQUARE_SECONDS
+  kwargs.goalReward = kwargs.goalReward or DEFAULTS.GOAL_REWARD
+  kwargs.keyColor = kwargs.keyColor or DEFAULTS.KEY_COLOR
+
+  local api = {}
+
+  function api:init(params)
+    self:_createExploreMap()
+    self:_createDistractorMap()
+    self:_createRewardMap()
+
+    local keyInfo = {
+        shape='key',
+        pattern='solid',
+        color1 = kwargs.keyColor,
+        color2 = kwargs.keyColor
+    }
+    self._keyObject = hrp.create(keyInfo)
+    self._keyCueRgba = {
+        kwargs.keyColor[1]/255,
+        kwargs.keyColor[2]/255,
+        kwargs.keyColor[3]/255,
+        1
+    }
+  end
+
+  function api:_createRewardMap()
+    self._rewardMap = map_maker:mapFromTextLevel{
+        mapName = REWARD_MAP,
+        entityLayer = REWARD_ROOM,
+    }
+
+    -- Create map theme and override default wall decal placement.
+    local texture = textureSet
+    if kwargs.differentRewardRoomTexture then
+      texture = thirdTextureSet
+    end
+    local rewardMapTheme = themes.fromTextureSet{
+        textureSet = texture,
+        decalFrequency = 0.0,
+        floorModelFrequency = 0.0,
+    }
+
+    self._rewardMap = map_maker:mapFromTextLevel{
+        mapName = REWARD_MAP,
+        entityLayer = REWARD_ROOM,
+        theme = rewardMapTheme,
+        callback = function (i, j, c, maker)
+          local pickup = self:_makePickup(c)
+          if pickup then
+            return maker:makeEntity{i = i, j = j, classname = pickup}
+          end
+        end
+    }
+  end
+
+  function api:_createExploreMap()
+    local exploreMapInfo = {map = OPEN_TWO_ROOM}
+
+    -- Create map theme and override default wall decal placement.
+    local exploreMapTheme = themes.fromTextureSet{
+        textureSet = textureSet,
+        decalFrequency = 0.0,
+        floorModelFrequency = 0.0,
+    }
+
+    self._exploreMap = map_maker:mapFromTextLevel{
+        mapName = EXPLORE_MAP,
+        entityLayer = exploreMapInfo.map,
+        theme = exploreMapTheme,
+        callback = function (i, j, c, maker)
+          local pickup = self:_makePickup(c)
+          if pickup then
+            return maker:makeEntity{i = i, j = j, classname = pickup}
+          end
+        end
+    }
+  end
+
+  function api:_createDistractorMap()
+    -- Create maze to be converted into map.
+    local maze = createDistractorMaze{roomSize = kwargs.distractorRoomSize}
+
+    -- Create map theme with no wall decals.
+    local texture = textureSet
+    if kwargs.differentDistractRoomTexture then
+      texture = secondTextureSet
+    end
+    local distractorMapTheme = themes.fromTextureSet{
+        textureSet = texture,
+        decalFrequency = 0.0,
+        floorModelFrequency = 0.0,
+    }
+
+    self._distractorMap = map_maker:mapFromTextLevel{
+        mapName = DISTRACTOR_MAP,
+        entityLayer = maze:entityLayer(),
+        theme = distractorMapTheme,
+        callback = function (i, j, c, maker)
+          local pickup = self:_makePickup(c)
+          if pickup then
+            return maker:makeEntity{i = i, j = j, classname = pickup}
+          end
+        end
+    }
+  end
+
+  function api:start(episode, seed)
+    random:seed(seed)
+
+    self._map = nil
+    self._time = 0
+    self._holdingKey = false
+    self._keyPosCount = 0
+    self._collectedGoal = false
+
+    if kwargs.distractorLengthSecondsRange then
+      self._distractorLen = random:uniformReal(
+          kwargs.distractorLengthSecondsRange[1],
+          kwargs.distractorLengthSecondsRange[2])
+    else
+      self._distractorLen = kwargs.distractorLengthSeconds
+    end
+
+    -- Sample the key position in phase 1.
+    self._keyPosition = random:uniformInt(1, N_KEY_POS_IN_TWO_ROOM)
+
+    -- Default instruction channel to 0 (indicating the rewards in final phase.)
+    self.setInstruction(tostring(0))
+  end
+
+  function api:filledRectangles(args)
+    if self._showKeyCue then
+      return {{
+          x = 12,
+          y = 12,
+          width = KEY_CUE_RECTANGLE_WIDTH,
+          height = KEY_CUE_RECTANGLE_HEIGHT,
+          rgba = self._keyCueRgba
+      }}
+    end
+    return {}
+  end
+
+  function api:nextMap()
+    -- 1. Decide what is the next map.
+    if self._map == nil then
+      self._map = EXPLORE_MAP
+    elseif self._map == DISTRACTOR_MAP then
+      self._map = REWARD_MAP
+    elseif self._map == EXPLORE_MAP then
+      if self._distractorLen > 0.0 then
+        self._map = DISTRACTOR_MAP
+      else
+        self._map = REWARD_MAP
+      end
+    elseif self._map == REWARD_MAP then
+      -- Stay in distractor map till end of episode.
+      self._map = DISTRACTOR_MAP
+      self._collectedGoal = true
+    end
+
+    -- 2. Set up timeout for the up-coming map.
+    if self._map == DISTRACTOR_MAP and self._collectedGoal then
+      if not self._timeOut then -- don't override any existing timeout
+        self._timeOut = self._time + 0.1
+      end
+    elseif self._map == EXPLORE_MAP then
+      self._timeOut = self._time + kwargs.exploreLengthSeconds
+    elseif self._map == DISTRACTOR_MAP then
+      self._timeOut = self._time + self._distractorLen
+    elseif self._map == REWARD_MAP then
+      if kwargs.rewardLengthSeconds then
+        self._timeOut = self._time + kwargs.rewardLengthSeconds
+      else
+        self._timeOut = nil
+      end
+    end
+
+    return self._map
+  end
+
+  -- PICKUP functions ---------------------------------------------------------
+
+  function api:_makePickup(c)
+    if c == 'K' then
+      return 'key'
+    end
+    if c == 'G' then
+      return 'goal'
+    end
+    if c == 'A' then
+      return 'apple_reward'
+    end
+  end
+
+  function api:pickup(spawnId)
+    if spawnId == GOAL_ID then
+      local goalReward = kwargs.goalReward
+      game:addScore(goalReward - 10)  -- Offset the default +10 for goal.
+      self.setInstruction(tostring(goalReward))
+      game:finishMap()
+    end
+    if spawnId == KEY_SPAWN_ID then
+      self._holdingKey = true
+      self._holdingKeyTime = self._time  -- When the avatar got the key.
+      self._showKeyCue = true
+    end
+
+    if spawnId == APPLE_ID then
+      if kwargs.appleRewardProb >= 1 or
+         random:uniformReal(0, 1) < kwargs.appleRewardProb then
+        -- The -1 is to offset the default 1 point for apple in dmlab
+        local appleReward = kwargs.appleReward +
+            random:uniformInt(0, kwargs.appleExtraRewardRange) - 1
+        game:addScore(appleReward)
+      else
+        -- The -1 is to offset the default 1 point for apple in dmlab
+        game:addScore(-1)
+      end
+    end
+  end
+
+  -- TRIGGER functions ---------------------------------------------------------
+
+  function api:canTrigger(teleportId, targetName)
+    if string.sub(targetName, 1, 4) == 'door' then
+      if self._holdingKey then
+        return true
+      else
+        return false
+      end
+    end
+    return true
+  end
+
+  function api:trigger(teleportId, targetName)
+    if string.sub(targetName, 1, 4) == 'door' then
+      -- When door opened, stop showing key cue, and set holding key to false.
+      self._showKeyCue = false
+      self._holdingKey = false
+      return
+    end
+  end
+
+  function api:hasEpisodeFinished(timeSeconds)
+    self._time = timeSeconds
+
+    if self._map == REWARD_MAP or self._collectedGoal then
+      return self._timeOut and timeSeconds > self._timeOut
+    end
+
+    -- Control the timing of showing key cue.
+    if self._holdingKey then
+      local showTime = self._time - self._holdingKeyTime
+      if showTime > kwargs.showKeyColorSquareSeconds then
+        self._showKeyCue = false
+      end
+    end
+
+    if self._map == EXPLORE_MAP or self._map == DISTRACTOR_MAP then
+      if timeSeconds > self._timeOut then
+        game:finishMap()
+      end
+      return false
+    end
+  end
+
+  -- END TRIGGER functions -----------------------------------------------------
+
+  function api:updateSpawnVars(spawnVars)
+    local classname = spawnVars.classname
+    if classname == "info_player_start" then
+      -- Spawn facing South.
+      spawnVars.angle = "-90"
+      spawnVars.randomAngleRange = "0"
+    elseif classname == "func_door" then
+      spawnVars.id = tostring(DOOR_ID)
+      spawnVars.wait = "1000000" -- Keep the door open for a long time.
+    elseif classname == "goal" then
+      spawnVars.id = tostring(GOAL_ID)
+    elseif classname == "apple_reward" then
+      -- The avatar respawns in the distractor room after reaching the goal;
+      -- no more apples are spawned in that case.
+      if self._collectedGoal == true then
+        return nil
+      end
+      local useApple = false
+      if kwargs.probAppleInDistractorMap > 0 then
+        useApple = random:uniformReal(0, 1) < kwargs.probAppleInDistractorMap
+      end
+      if useApple then
+        spawnVars.id = tostring(APPLE_ID)
+      else
+        return nil
+      end
+    elseif classname == "key" then
+      self._keyPosCount = self._keyPosCount + 1
+      if self._keyPosition == self._keyPosCount then
+        spawnVars.id = tostring(KEY_SPAWN_ID)
+        spawnVars.classname = self._keyObject
+      else
+        return nil
+      end
+    end
+    return spawnVars
+  end
+
+  custom_observations.decorate(api)
+  pickup_decorator.decorate(api)
+  setting_overrides.decorate{
+      api = api,
+      apiParams = kwargs,
+      decorateWithTimeout = true
+  }
+  return api
+end
+
+return factory
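The map sequencing in `api:nextMap()` is easier to follow as a pure transition function. The sketch below is a standalone illustration, not part of the commit: the free function `nextMap` and its explicit `collectedGoal` argument are hypothetical stand-ins for the `self._map` and `self._collectedGoal` fields, and the timeout handling from the second half of the method is left out.

```lua
local EXPLORE, DISTRACTOR, REWARD = 'exploreMap', 'distractorMap', 'rewardMap'

-- Hypothetical pure function mirroring the branch order of api:nextMap().
-- Returns the next map name and the updated collectedGoal flag.
local function nextMap(current, distractorLen, collectedGoal)
  if current == nil then
    return EXPLORE, collectedGoal
  elseif current == DISTRACTOR then
    return REWARD, collectedGoal
  elseif current == EXPLORE then
    -- A distractor length of zero skips straight to the reward room.
    if distractorLen > 0 then
      return DISTRACTOR, collectedGoal
    end
    return REWARD, collectedGoal
  elseif current == REWARD then
    -- Leaving the reward room parks the avatar in the distractor room and
    -- sets the flag that stops further apples from spawning.
    return DISTRACTOR, true
  end
end

local map, collected = nil, false
local visited = {}
for _ = 1, 4 do
  map, collected = nextMap(map, 30, collected)
  visited[#visited + 1] = map
end
print(table.concat(visited, ' -> '), collected)
```

In the factory, the final distractor map is given a 0.1-second timeout once the flag is set, so the episode ends almost immediately after the reward room instead of looping back.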
@@ -0,0 +1,28 @@
+-- Copyright 2019 DeepMind Technologies Limited. All Rights Reserved.
+-- Licensed under the Apache License, Version 2.0 (the "License");
+-- you may not use this file except in compliance with the License.
+-- You may obtain a copy of the License at
+--    http://www.apache.org/licenses/LICENSE-2.0
+-- Unless required by applicable law or agreed to in writing, software
+-- distributed under the License is distributed on an "AS IS" BASIS,
+-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or  implied.
+-- See the License for the specific language governing permissions and
+-- limitations under the License.
+-- ============================================================================
+local factory = require 'visual_match_factory'
+
+return factory.createLevelApi{
+    exploreMapMode = 'KEY_TO_COLOR',
+    episodeLengthSeconds = 45,
+    secondOrderExploreLengthSeconds = 5,
+    preExploreDistractorLengthSeconds = 15,
+    exploreLengthSeconds = 5,
+    distractorLengthSeconds = 15,
+
+    differentDistractRoomTexture = true,
+    differentRewardRoomTexture = true,
+    differentSecondOrderRoomTexture = true,
+    secondOrderExploreRoomSize = {4, 4},
+    correctReward = 10,
+    incorrectReward = 1,
+}
@@ -0,0 +1,23 @@
+-- Copyright 2019 DeepMind Technologies Limited. All Rights Reserved.
+-- Licensed under the Apache License, Version 2.0 (the "License");
+-- you may not use this file except in compliance with the License.
+-- You may obtain a copy of the License at
+--    http://www.apache.org/licenses/LICENSE-2.0
+-- Unless required by applicable law or agreed to in writing, software
+-- distributed under the License is distributed on an "AS IS" BASIS,
+-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or  implied.
+-- See the License for the specific language governing permissions and
+-- limitations under the License.
+-- ============================================================================
+local factory = require 'latent_information_acquisition_factory'
+
+return factory.createLevelApi{
+    episodeLengthSeconds = 40,
+    exploreLengthSeconds = 5,
+    distractorLengthSeconds = 30,
+    numObjects = 3,
+    probGoodObject = 0.5,
+    correctReward = 20,
+    incorrectReward = -10,
+    differentDistractRoomTexture = true,
+}
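The reward settings in this config decide how much remembering the Phase 1 cues is worth. The standalone sketch below is not part of the commit. It compares two idealised agents under the assumption that each object is independently "good" with probability `probGoodObject` (the guarantee counts default to 0 in the factory) and that the agent has time to reach every object it wants in Phase 3.

```lua
-- Values copied from the Latent Information Acquisition level config.
local config = {
    numObjects = 3,
    probGoodObject = 0.5,
    correctReward = 20,
    incorrectReward = -10,
}

-- Agent that remembers every cue and picks up only green-cued objects.
local informed =
    config.numObjects * config.probGoodObject * config.correctReward

-- Agent that ignores the cues and picks up every object.
local perObject = config.probGoodObject * config.correctReward +
    (1 - config.probGoodObject) * config.incorrectReward
local indiscriminate = config.numObjects * perObject

print(informed, indiscriminate)
```

Under these assumptions the informed agent expects 30 points from Phase 3 and the indiscriminate one 15, so the information gathered in Phase 1 doubles the expected exploit reward.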
@@ -0,0 +1,418 @@
+-- Copyright 2019 DeepMind Technologies Limited. All Rights Reserved.
+-- Licensed under the Apache License, Version 2.0 (the "License");
+-- you may not use this file except in compliance with the License.
+-- You may obtain a copy of the License at
+--    http://www.apache.org/licenses/LICENSE-2.0
+-- Unless required by applicable law or agreed to in writing, software
+-- distributed under the License is distributed on an "AS IS" BASIS,
+-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or  implied.
+-- See the License for the specific language governing permissions and
+-- limitations under the License.
+-- ============================================================================
+local make_map = require 'common.make_map'
+local custom_decals = require 'decorators.custom_decals_decoration'
+local custom_entities = require 'common.custom_entities'
+local custom_observations = require 'decorators.custom_observations'
+local datasets_selector = require 'datasets.selector'
+local game = require 'dmlab.system.game'
+local maze_generation = require 'dmlab.system.maze_generation'
+local pickup_decorator = require 'decorators.human_recognisable_pickups'
+local random = require 'common.random'
+local setting_overrides = require 'decorators.setting_overrides'
+local texture_sets = require 'themes.texture_sets'
+local themes = require 'themes.themes'
+local hrp = require 'common.human_recognisable_pickups'
+
+local SHOW_COLOR_CUE_SECOND = 0.25
+local EPISODE_LENGTH_SECONDS = 30
+local EXPLORE_LENGTH_SECONDS = 10
+local DISTRACTOR_LENGTH_SECONDS = 10
+local NUM_OBJECTS = 3
+local PROB_GOOD_OBJECT = 0.5
+local GAURANTEE_GOOD_OBJECTS = 0
+local GAURANTEE_BAD_OBJECTS = 0
+
+local PROB_APPLE_IN_DISTRACTOR_MAP = 0.3
+local APPLE_REWARD = 5
+local APPLE_EXTRA_REWARD_RANGE = 0
+local DISTRACTOR_ROOM_SIZE = {11, 11}
+local APPLE_ID = 1000
+local CORRECT_REWARD = 2
+local INCORRECT_REWARD = -1
+local ROOM_SIZE = {3, 5}
+local OBJECT_SCALE = 1.62
+
+local EXPLORE_MAP = "exploreMap"
+local DISTRACTOR_MAP = "distractorMap"
+local EXPLOIT_MAP = "exploitMap"
+
+
+local DIFFERENT_DISTRACT_ROOM_TEXTURE = false
+
+-- Set texture set for all maps.
+local textureSet = texture_sets.TRON
+local secondTextureSet = texture_sets.TETRIS
+
+-- Takes goal/location:i -> i
+local function nameToLocationId(name)
+  return tonumber(name:match('^.+:(%d+)$'))
+end
+
+-- Takes goal/location:i -> goal/pickup
+local function nameToLocationClass(name)
+  return name:match('^(.+):%d+$')
+end
+
+local factory = {}
+game:console('cg_drawScriptRectanglesAlways 1')
+
+function factory.createLevelApi(kwargs)
+  kwargs.episodeLengthSeconds = kwargs.episodeLengthSeconds or
+                                EPISODE_LENGTH_SECONDS
+  kwargs.exploreLengthSeconds = kwargs.exploreLengthSeconds or
+                                EXPLORE_LENGTH_SECONDS
+  if kwargs.distractorLengthSeconds == 0 then
+    kwargs.skipDistractor = true
+  else
+    kwargs.distractorLengthSeconds = kwargs.distractorLengthSeconds or
+                                     DISTRACTOR_LENGTH_SECONDS
+  end
+  kwargs.numObjects = kwargs.numObjects or NUM_OBJECTS
+  kwargs.probGoodObject = kwargs.probGoodObject or PROB_GOOD_OBJECT
+  kwargs.guaranteeGoodObjects = kwargs.guaranteeGoodObjects or
+                                GAURANTEE_GOOD_OBJECTS
+  kwargs.guaranteeBadObjects = kwargs.guaranteeBadObjects or
+                               GAURANTEE_BAD_OBJECTS
+  kwargs.correctReward = kwargs.correctReward or CORRECT_REWARD
+  kwargs.incorrectReward = kwargs.incorrectReward or INCORRECT_REWARD
+  kwargs.roomSize = kwargs.roomSize or ROOM_SIZE
+  kwargs.distractorRoomSize = kwargs.distractorRoomSize or DISTRACTOR_ROOM_SIZE
+  kwargs.probAppleInDistractorMap = kwargs.probAppleInDistractorMap or
+                                    PROB_APPLE_IN_DISTRACTOR_MAP
+  kwargs.differentDistractRoomTexture = kwargs.differentDistractRoomTexture or
+                                        DIFFERENT_DISTRACT_ROOM_TEXTURE
+  kwargs.appleReward = kwargs.appleReward or APPLE_REWARD
+  kwargs.appleExtraRewardRange = kwargs.appleExtraRewardRange or
+                                 APPLE_EXTRA_REWARD_RANGE
+  kwargs.objectScale = kwargs.objectScale or OBJECT_SCALE
+
+  local api = {}
+
+  function api:init(params)
+    self:_createExploreMap()
+    self:_createDistractorMap()
+    self:_createExploitMap()
+  end
+
+  function api:pickup(spawnId)
+    if self._map == EXPLORE_MAP then
+      -- Setup to show color cue.
+      self._showObjectCue = true
+      self._cueColor = self._objects[spawnId].cueColor
+      self._cueStartTime = self._time
+    elseif self._map == EXPLOIT_MAP then
+      -- Give corresponding reward and terminate when all good objects collected
+      game:addScore(self._objects[spawnId].reward)
+      -- Update the instruction channel (to record final phase rewards.)
+      self._finalRewardMainTask = (
+          self._finalRewardMainTask  + self._objects[spawnId].reward)
+      self.setInstruction(tostring(self._finalRewardMainTask))
+    end
+
+    if spawnId == APPLE_ID then
+      -- note the -1 to offset default 1 point for apple in dmlab
+      local appleReward = kwargs.appleReward +
+          random:uniformInt(0, kwargs.appleExtraRewardRange) - 1
+      game:addScore(appleReward)
+    end
+  end
+
+  function api:_createRoomCommon()
+    local roomHeight = kwargs.roomSize[1]
+    local roomWidth = kwargs.roomSize[2]
+    local maze = maze_generation:mazeGeneration{
+        height = roomHeight + 2,
+        width = roomWidth + 2
+    }
+
+    -- Set (2,2) as 'P' for the avatar location.
+    -- Set (i,j) as 'O' for possible object location if i%2 == 0 && j%2 == 0.
+    -- Otherwise, fill with '.' for empty location.
+    self._numLocations = 0
+    for i = 2, roomHeight + 1 do
+      for j = 2, roomWidth + 1 do
+        if i == 2 and j == 2 then
+          maze:setEntityCell(i, j, 'P')
+        elseif i % 2 == 0 and j % 2 == 0 then
+          maze:setEntityCell(i, j, 'O')
+          self._numLocations = self._numLocations + 1
+        else
+          maze:setEntityCell(i, j, '.')
+        end
+      end
+    end
+
+    return maze
+  end
+
+  function api:_createExploreMap()
+    local maze = self:_createRoomCommon()
+    print('Generated explore maze with entity layer:')
+    print(maze:entityLayer())
+    io.flush()
+
+    local mapTheme = themes.fromTextureSet{
+        textureSet = textureSet,
+        decalFrequency = 0.0,
+    }
+
+    local counter = 1
+    self._exploreMap = make_map.makeMap{
+        mapName = EXPLORE_MAP,
+        mapEntityLayer = maze:entityLayer(),
+        theme = mapTheme,
+        callback = function (i, j, c, maker)
+          if c == 'O' then
+            local pickup = 'location:' .. counter
+            counter = counter + 1
+            return maker:makeEntity{i = i, j = j, classname = pickup}
+          end
+        end
+    }
+  end
+
+  function api:_createDistractorMap()
+    -- Create map theme with no wall decals.
+    local distractorMapTheme = themes.fromTextureSet{
+        textureSet = textureSet,
+        decalFrequency = 0.0,
+    }
+
+    -- Example room with height = 2, width = 3
+    -- *****
+    -- *APA*
+    -- *AAA*
+    -- *****
+    local roomHeight = kwargs.distractorRoomSize[1]
+    local roomWidth = kwargs.distractorRoomSize[2]
+    local centerWidth = 1 + math.ceil(roomWidth / 2)
+    local maze = maze_generation:mazeGeneration{
+        height = roomHeight + 2,
+        width = roomWidth + 2
+    }
+
+    -- Fill the room with 'A' for apples. updateSpawnVars decides which to use.
+    for i = 2, roomHeight + 1 do
+      for j = 2, roomWidth + 1 do
+        maze:setEntityCell(i, j, 'A')
+      end
+    end
+    -- Override one cell with 'P' for spawn point.
+    maze:setEntityCell(2, centerWidth, 'P')
+
+    print('Generated distractor maze with entity layer:')
+    print(maze:entityLayer())
+    io.flush()
+
+    local texture = textureSet
+    if kwargs.differentDistractRoomTexture then
+      texture = secondTextureSet
+    end
+    local mapTheme = themes.fromTextureSet{
+        textureSet = texture,
+        decalFrequency = 0.0,
+    }
+    self._distractMap = make_map.makeMap{
+        mapName = DISTRACTOR_MAP,
+        mapEntityLayer = maze:entityLayer(),
+        theme = mapTheme,
+    }
+  end
+
+  function api:_createExploitMap()
+    local maze = self:_createRoomCommon()
+    print('Generated exploit maze with entity layer:')
+    print(maze:entityLayer())
+    io.flush()
+
+    local mapTheme = themes.fromTextureSet{
+        textureSet = textureSet,
+        decalFrequency = 0.0,
+    }
+
+    local counter = 1
+    self.exploitMap = make_map.makeMap{
+        mapName = EXPLOIT_MAP,
+        mapEntityLayer = maze:entityLayer(),
+        theme = mapTheme,
+        useSkybox = false,
+        callback = function (i, j, c, maker)
+          if c == 'O' then
+            local pickup = 'location:' .. counter
+            counter = counter + 1
+            return maker:makeEntity{i = i, j = j, classname = pickup}
+          end
+        end
+    }
+  end
+
+  function api:_generateRandomObjects()
+    -- 1. Generate a shuffled list of valences, `objectValence` (1 = good,
+    -- -1 = bad), from numObjects, guaranteeGoodObjects, guaranteeBadObjects
+    -- and probGoodObject.
+
+    local objectValence = {}
+    for i = 1, kwargs.numObjects do
+      if i <= kwargs.guaranteeGoodObjects then
+        objectValence[i] = 1
+      elseif i <= kwargs.guaranteeGoodObjects + kwargs.guaranteeBadObjects then
+        objectValence[i] = -1
+      else
+        if random:uniformReal(0, 1) < kwargs.probGoodObject then
+          objectValence[i] = 1
+        else
+          objectValence[i] = -1
+        end
+      end
+    end
+    random:shuffleInPlace(objectValence)
+
+    -- 2. Generate random objects and link to the object valence above.
+    local objects = hrp.uniquelyShapedPickups(kwargs.numObjects)
+    for i = 1, kwargs.numObjects do
+      objects[i].scale = kwargs.objectScale
+    end
+
+    self._objects = {}
+    for i, object in ipairs(objects) do
+      self._objects[i] = {}
+      self._objects[i].data = hrp.create(object)
+      if objectValence[i] == 1 then
+        self._objects[i].isGoodObject = true
+        self._objects[i].reward = kwargs.correctReward
+        self._objects[i].cueColor = {0, 1, 0, 1} -- green means good
+      else
+        self._objects[i].isGoodObject = false
+        self._objects[i].reward = kwargs.incorrectReward
+        self._objects[i].cueColor = {1, 0, 0, 1} -- red means bad
+      end
+    end
+  end
+
+  function api:start(episode, seed)
+    random:seed(seed)
+
+    -- Set up a random mapping from locationId to pickupId.
+    -- There should be more locationIds than pickupIds.
+    -- A location mapped to pickupId == 0 has no object placed there.
+    self._mapLocationIdToPickupId = {}
+    for i = 1, self._numLocations do
+      if i <= kwargs.numObjects then
+        self._mapLocationIdToPickupId[i] = i
+      else
+        self._mapLocationIdToPickupId[i] = 0
+      end
+    end
+    random:shuffleInPlace(self._mapLocationIdToPickupId)
+
+    self:_generateRandomObjects()
+    self._map = nil
+    self._numTrials = 0
+    self._timeOut = kwargs.exploreLengthSeconds
+
+    -- Set the instruction channel to record the rewards in the final phase.
+    self._finalRewardMainTask = 0
+    self.setInstruction("0")
+  end
+
+  function api:nextMap()
+    if self._map == nil then  -- Start of episode.
+      self._map = EXPLORE_MAP
+    elseif not kwargs.skipDistractor and self._map == EXPLORE_MAP then
+      -- Move from explore to distractor.
+      self._map = DISTRACTOR_MAP
+      self._timeOut = self._time + kwargs.distractorLengthSeconds
+    elseif (kwargs.skipDistractor and self._map == EXPLORE_MAP)
+           or self._map == DISTRACTOR_MAP then
+      -- Move from distractor or explore map to exploit map.
+      self._map = EXPLOIT_MAP
+      random:shuffleInPlace(self._mapLocationIdToPickupId)
+      self._timeOut = nil
+    end
+
+    return self._map
+  end
+
+  function api:hasEpisodeFinished(timeSeconds)
+    self._time = timeSeconds
+    if self._showObjectCue then
+      if self._time - self._cueStartTime > SHOW_COLOR_CUE_SECOND then
+        self._showObjectCue = false
+      end
+    end
+
+    if self._map == EXPLORE_MAP or self._map == DISTRACTOR_MAP then
+      if timeSeconds > self._timeOut then
+        game:finishMap()
+      end
+      return false
+    end
+  end
+
+  -- END TRIGGER functions -----------------------------------------------------
+  function api:filledRectangles(args)
+    if self._map == EXPLORE_MAP and self._showObjectCue then
+      return {{
+          x = 12,
+          y = 12,
+          width = 600,
+          height = 300,
+          rgba = self._cueColor,
+      }}
+    end
+    return {}
+  end
+
+  function api:updateSpawnVars(spawnVars)
+    local classname = spawnVars.classname
+    if classname == "info_player_start" then
+      -- Spawn facing South.
+      spawnVars.angle = "-90"
+      spawnVars.randomAngleRange = "0"
+    elseif classname == "apple_reward" then
+      local useApple = false
+      if kwargs.probAppleInDistractorMap > 0 then
+        useApple = random:uniformReal(0, 1) < kwargs.probAppleInDistractorMap
+        spawnVars.id = tostring(APPLE_ID)
+      end
+      if not useApple then
+        return nil
+      end
+    else
+      -- Allocate objects onto the map by mapLocationIdToPickupId.
+      local locationClass = nameToLocationClass(classname)
+      if locationClass then
+        local locationId = nameToLocationId(classname)
+        local id = self._mapLocationIdToPickupId[locationId]
+        if id == 0 then
+          return nil
+        else
+          spawnVars.classname = self._objects[id].data
+          spawnVars.id = tostring(id)
+        end
+      end
+    end
+
+    return spawnVars
+  end
+
+  custom_observations.decorate(api)
+  pickup_decorator.decorate(api)
+  setting_overrides.decorate{
+      api = api,
+      apiParams = kwargs,
+      decorateWithTimeout = true
+  }
+  return api
+end
+
+return factory
@@ -0,0 +1,24 @@
+-- Copyright 2019 DeepMind Technologies Limited. All Rights Reserved.
+-- Licensed under the Apache License, Version 2.0 (the "License");
+-- you may not use this file except in compliance with the License.
+-- You may obtain a copy of the License at
+--    http://www.apache.org/licenses/LICENSE-2.0
+-- Unless required by applicable law or agreed to in writing, software
+-- distributed under the License is distributed on an "AS IS" BASIS,
+-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+-- See the License for the specific language governing permissions and
+-- limitations under the License.
+-- ============================================================================
+local factory = require 'visual_match_factory'
+
+return factory.createLevelApi{
+    exploreMapMode = 'PASSIVE',
+    episodeLengthSeconds = 40,
+    exploreLengthSeconds = 5,
+    distractorLengthSeconds = 30,
+
+    differentDistractRoomTexture = true,
+    differentRewardRoomTexture = true,
+    correctReward = 10,
+    incorrectReward = 1,
+}
@@ -0,0 +1,19 @@
+-- Copyright 2019 DeepMind Technologies Limited. All Rights Reserved.
+-- Licensed under the Apache License, Version 2.0 (the "License");
+-- you may not use this file except in compliance with the License.
+-- You may obtain a copy of the License at
+--    http://www.apache.org/licenses/LICENSE-2.0
+-- Unless required by applicable law or agreed to in writing, software
+-- distributed under the License is distributed on an "AS IS" BASIS,
+-- WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+-- See the License for the specific language governing permissions and
+-- limitations under the License.
+-- ============================================================================
+local factory = require 'two_keys_to_choose_factory'
+
+return factory.createLevelApi{
+    episodeLengthSeconds = 37,
+    exploreLengthSeconds = 5,
+    distractorLengthSeconds = 30,
+    differentDistractRoomTexture = true,
+}