Properties (dimensions) of task environments
- Fully observable (vs. partially observable): An agent’s sensors give it access to the complete state of the environment at each point in time.
- Deterministic (vs. stochastic): The next state of the environment is completely determined by the current state and the action executed by the agent. (If the environment is deterministic except for the actions of other agents, then the environment is strategic.)
- Episodic (vs. sequential): The agent’s experience is divided into atomic “episodes” (each episode consists of the agent perceiving and then performing a single action), and the choice of action in each episode depends only on the episode itself.
- Static (vs. dynamic): The environment is unchanged while an agent is deliberating. (The environment is semidynamic if the environment itself does not change with the passage of time but the agent’s performance score does.)
- Discrete (vs. continuous): The environment provides a limited number of distinct, clearly defined percepts and actions.
- Single agent (vs. multiagent): An agent operating by itself in an environment.
Example:
| | Chess with a clock | Chess without a clock | Taxi driving |
|---|---|---|---|
| Fully observable | Yes | Yes | No |
| Deterministic | Strategic | Strategic | No |
| Episodic | No | No | No |
| Static | Semi | Yes | No |
| Discrete | Yes | Yes | No |
| Single agent | No | No | No |
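The six dimensions and the example entries above can be sketched as a small data structure. This is an illustrative sketch only: the class, field names, and helper function below are hypothetical, not part of any standard library.

```python
from dataclasses import dataclass

@dataclass
class TaskEnvironment:
    # Hypothetical representation of the six dimensions. Deterministic and
    # static take a third value ("strategic" / "semi") per the table.
    name: str
    fully_observable: bool
    deterministic: str   # "yes", "strategic", or "no"
    episodic: bool
    static: str          # "yes", "semi", or "no"
    discrete: bool
    single_agent: bool

# Two rows of the table encoded as instances:
chess_with_clock = TaskEnvironment(
    "chess with a clock", True, "strategic", False, "semi", True, False)
taxi_driving = TaskEnvironment(
    "taxi driving", False, "no", False, "no", False, False)

def hardest_case(env: TaskEnvironment) -> bool:
    """True when every dimension takes its harder value, as in the real
    world: partially observable, stochastic, sequential, dynamic,
    continuous, multiagent."""
    return (not env.fully_observable and env.deterministic == "no"
            and not env.episodic and env.static == "no"
            and not env.discrete and not env.single_agent)

print(hardest_case(taxi_driving))      # True
print(hardest_case(chess_with_clock))  # False
```

Encoding the dimensions explicitly like this makes the point of the table concrete: taxi driving is the hardest case on every axis, while chess with a clock is easy on most of them.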
The environment type largely determines the agent design.
The real world is (of course) partially observable, stochastic, sequential, dynamic, continuous, and multiagent.