Description

PressurePlate is a multi-agent environment that requires agents to cooperate during the traversal of a gridworld. The grid is partitioned into several rooms, and each room contains a plate and a closed doorway. Before episodes begin, each agent is assigned a plate that only they can activate. For the group of agents to proceed into the next room, an agent must remain behind, standing on their assigned plate. The task is considered solved when the goal (depicted with a treasure chest) is reached.

Currently, PressurePlate supports four-, five-, and six-player levels but is easily configurable for
custom scenarios. See Customizing Scenarios for more information.

Observation Space

Each agent has a distance-limited view of the environment, as defined by the sensor_range attribute of the PressurePlate
class. The PressurePlate world is made of several 2D grids, where each grid corresponds to an entity type. For example,
one grid corresponds to walls, one to plates, and so on. When queried, the environment extracts, for each agent, the
subsection of each grid that falls within that agent’s viewing range. Next, these subsections are flattened and concatenated together.
Finally, the agent’s (x,y) coordinates are concatenated to the end of the observation vector.
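
The crop, flatten, and concatenate steps described above can be sketched roughly as follows. This is a minimal illustration, not the repository's implementation: the function names, the square (2 * sensor_range + 1) window centred on the agent, and the zero padding for cells outside the grid are all assumptions.

```python
def crop_view(grid, x, y, sensor_range, pad_value=0):
    """Return the flattened window of `grid` centred on column x, row y."""
    view = []
    for row in range(y - sensor_range, y + sensor_range + 1):
        for col in range(x - sensor_range, x + sensor_range + 1):
            inside = 0 <= row < len(grid) and 0 <= col < len(grid[0])
            # Cells outside the world are padded (an assumption of this sketch).
            view.append(grid[row][col] if inside else pad_value)
    return view


def build_observation(grids, agent_xy, sensor_range):
    """Concatenate one flattened view per entity grid, then append (x, y)."""
    x, y = agent_xy
    obs = []
    for grid in grids:
        obs.extend(crop_view(grid, x, y, sensor_range))
    obs.extend([x, y])
    return obs


# Toy 5x5 world with two entity grids; the contents are illustrative only.
walls = [[1] * 5] + [[0] * 5 for _ in range(4)]
plates = [[0] * 5 for _ in range(5)]
plates[2][3] = 1  # a plate at column 3, row 2

obs = build_observation([walls, plates], agent_xy=(2, 2), sensor_range=1)
print(len(obs))  # 2 grids * 9 cells + 2 coordinates
```

Under these assumptions the observation length is fixed for a given sensor_range and number of entity grids, which is what lets each agent's observation be fed to a fixed-size policy input.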

See the figure below for a depiction of this process for Agent 0 and the Doors grid.

Action Space

PressurePlate’s action space is discrete and has five options: up, down, left, right, and no-op (do nothing).

On each call to .step(), the order in which the agents' actions are executed is randomized.
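
To illustrate why the execution order matters, here is a small sketch of five discrete actions applied in a shuffled order. The integer codes, the coordinate convention, and the collision rule (an occupied cell blocks a move) are assumptions made for this example and may differ from the environment's actual behavior.

```python
import random
from enum import IntEnum


class Action(IntEnum):
    # The integer assigned to each action is an assumption for this sketch.
    UP = 0
    DOWN = 1
    LEFT = 2
    RIGHT = 3
    NOOP = 4


# (dx, dy) offsets, assuming y grows downward as in image coordinates.
DELTAS = {
    Action.UP: (0, -1),
    Action.DOWN: (0, 1),
    Action.LEFT: (-1, 0),
    Action.RIGHT: (1, 0),
    Action.NOOP: (0, 0),
}


def step_positions(positions, actions, rng=random):
    """Apply one action per agent in a freshly shuffled order."""
    positions = list(positions)
    order = list(range(len(positions)))
    rng.shuffle(order)  # new random execution order on every call
    for i in order:
        dx, dy = DELTAS[Action(actions[i])]
        target = (positions[i][0] + dx, positions[i][1] + dy)
        # Assumed collision rule: a cell already held by an agent blocks the move.
        if target not in positions:
            positions[i] = target
    return positions


# Both agents want cell (1, 0); whichever is processed first gets it.
print(step_positions([(0, 0), (2, 0)], [Action.RIGHT, Action.LEFT]))
```

Because the winner of a contested cell depends on the shuffle, the same joint action can lead to different next states, so the environment is stochastic from the agents' point of view.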

Reward Function

Each agent receives its reward independently of the other agents. If an agent is in the room that contains their assigned plate,
their reward is the negative normalized Manhattan distance between their current position and the plate. Otherwise, their reward is
the negative of the number of rooms between their current room and the room that contains their assigned plate, so agents farther from their plate are penalized more.
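
The two reward cases can be written as a short function. This is a sketch under stated assumptions rather than the environment's code: the argument names, the use of integer room indices, and the max_dist normalizer are hypothetical, and the room-count term is treated as a penalty (negated) so that it is consistent with the in-room case.

```python
def agent_reward(agent_xy, plate_xy, agent_room, plate_room, max_dist):
    """Per-agent reward; rooms are integer indices along the level."""
    if agent_room == plate_room:
        # In the plate's room: negative Manhattan distance, scaled by max_dist.
        manhattan = abs(agent_xy[0] - plate_xy[0]) + abs(agent_xy[1] - plate_xy[1])
        return -manhattan / max_dist
    # Elsewhere: penalize by how many rooms away the agent is (sign assumed).
    return -abs(agent_room - plate_room)


# Same room, 3 cells away, with an assumed normalizer of 10.
print(agent_reward((1, 1), (3, 2), agent_room=0, plate_room=0, max_dist=10))
# Two rooms away from the assigned plate.
print(agent_reward((0, 0), (4, 9), agent_room=2, plate_room=0, max_dist=10))
```

With a normalizer at least as large as the longest in-room distance, the in-room reward stays within [-1, 0], so any position inside the plate's room scores at least as well as any position outside it.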

Installation

After cloning the repo, cd into pressureplate and run:

pip install -e .

Using PressurePlate

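Within your Python script, import gym and pressureplate (importing the package registers the tasks), then create any of
the three currently available tasks as follows:

import gym
import pressureplate

env = gym.make('pressureplate-linear-4p-v0')
env = gym.make('pressureplate-linear-5p-v0')
env = gym.make('pressureplate-linear-6p-v0')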

The PressurePlate environment is implemented within the Gym paradigm, and therefore uses the usual .step(),
.reset(), and .render() methods.
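
As a rough usage sketch, the helper below rolls out random joint actions against any object that exposes .reset() and .step(). It assumes the older Gym convention in which .step() takes one integer action per agent and returns (observations, rewards, dones, info) with per-agent lists; the helper name and its parameters are hypothetical.

```python
import random

N_ACTIONS = 5  # up, down, left, right, no-op


def run_random_episode(env, n_agents, max_steps=100, rng=random):
    """Roll out uniformly random joint actions; return each agent's summed reward."""
    env.reset()
    totals = [0.0] * n_agents
    for _ in range(max_steps):
        # One discrete action per agent, sampled independently.
        actions = [rng.randrange(N_ACTIONS) for _ in range(n_agents)]
        _obs, rewards, dones, _info = env.step(actions)
        totals = [t + r for t, r in zip(totals, rewards)]
        if all(dones):  # assumed: one done flag per agent
            break
    return totals


# Intended usage, assuming gym and pressureplate are installed:
#   import gym
#   import pressureplate
#   env = gym.make('pressureplate-linear-4p-v0')
#   print(run_random_episode(env, n_agents=4))
```

A random policy will rarely solve a level, but a rollout like this is a quick way to check that the installation works and to inspect the shape of the returned observations and rewards.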

Customizing Scenarios

To create a custom PressurePlate layout, add a layout dictionary to the pressureplate/assets.py file.
The layout needs a unique identifier (e.g., 'FOUR_PLAYERS') and must contain lists of (x,y) coordinates for each of the
following elements:

'WALLS'
'DOORS'
'PLATES'
'AGENTS'
'GOAL'
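
A layout along these lines might look like the sketch below, together with a small check that the required keys are present. The 'TWO_PLAYERS' identifier, every coordinate, the nesting of the element lists under the identifier, and the check_layout helper are all made up for illustration; the docstring in pressureplate/assets.py remains the authoritative description of the format.

```python
REQUIRED_KEYS = ('WALLS', 'DOORS', 'PLATES', 'AGENTS', 'GOAL')

# Hypothetical layout: identifier and coordinates are illustrative, not a tested level.
CUSTOM = {
    'TWO_PLAYERS': {
        'WALLS': [(0, 3), (1, 3), (3, 3), (4, 3)],
        'DOORS': [(2, 3)],
        'PLATES': [(1, 5), (3, 1)],
        'AGENTS': [(1, 6), (3, 6)],
        'GOAL': [(2, 0)],
    }
}


def check_layout(layout):
    """Return a list of problems found in one layout's element dictionary."""
    problems = []
    for key in REQUIRED_KEYS:
        if key not in layout:
            problems.append(f"missing key {key!r}")
            continue
        for coord in layout[key]:
            is_pair = isinstance(coord, (tuple, list)) and len(coord) == 2
            if not (is_pair and all(isinstance(v, int) for v in coord)):
                problems.append(f"{key}: {coord!r} is not an (x, y) pair of ints")
    return problems


print(check_layout(CUSTOM['TWO_PLAYERS']))  # an empty list means no problems found
```

Running a check like this before registering a new task catches a missing key or a malformed coordinate early, rather than at environment construction time.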

Additionally, you will need to register the new task as a gym environment within pressureplate/__init__.py.
Finally, edit the PressurePlate class in pressureplate/environment.py to load your custom layout into the
self.layout attribute.
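
For the registration step, one possible shape is sketched below. The helper only builds the arguments for Gym's register function; the id pattern mirrors the existing task names, and the entry point follows from the PressurePlate class living in pressureplate/environment.py. The keyword arguments under 'kwargs' are hypothetical placeholders, since the constructor's real parameters should be taken from the existing entries in pressureplate/__init__.py.

```python
def registration_spec(n_agents, layout_name):
    """Build keyword arguments for gym.envs.registration.register."""
    return {
        'id': f'pressureplate-{layout_name}-{n_agents}p-v0',
        'entry_point': 'pressureplate.environment:PressurePlate',
        # Hypothetical constructor arguments; copy the real ones from __init__.py.
        'kwargs': {'n_agents': n_agents, 'layout': layout_name},
    }


spec = registration_spec(n_agents=2, layout_name='custom')
print(spec['id'])

# Inside pressureplate/__init__.py this could then be used as:
#   from gym.envs.registration import register
#   register(**spec)
```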

For detailed instructions, please refer to the docstring within pressureplate/assets.py.