Q-learning implementation for Taxi-v3 environment

### Proposal

### Code Overview

1. **Q-Learning Agent (`QLearningAgent` class)**:
   - Implements a Q-learning algorithm with epsilon-greedy exploration
   - Maintains a Q-table to learn state-action values
   - Features include:
     - Epsilon decay for reducing exploration over time
     - Handling of action masks (valid actions)
     - Learning rate and discount factor configuration

2. **Training Function (`train_taxi()`)**:
   - Trains the agent for a specified number of episodes
   - Uses a progress bar to track training
   - Tracks and stores episode rewards
   - Periodically reports average reward and current epsilon value

3. **Testing Function (`test_agent()`)**:
   - Evaluates the trained agent in the Taxi environment
   - Renders the environment for visual demonstration
   - Prints total reward for each episode
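The three components above could fit together roughly as follows. This is a minimal sketch rather than the proposed implementation: the method names, the hyperparameter defaults, and the `run_episode` helper are assumptions, and the evaluation function is called `evaluate_agent` here (instead of `test_agent()`) only so that test runners do not collect it. The environment is assumed to follow the Gymnasium API, where `reset()` returns `(observation, info)`, `step()` returns a five-tuple, and Taxi-v3 exposes valid actions under `info["action_mask"]`. The progress bar is left out to keep the sketch dependency-free.

```python
import random


class QLearningAgent:
    """Tabular Q-learning with epsilon-greedy exploration (illustrative sketch)."""

    def __init__(self, n_states, n_actions, learning_rate=0.1, discount=0.99,
                 epsilon=1.0, epsilon_decay=0.999, epsilon_min=0.05, seed=None):
        # All default values here are assumptions, not tuned settings.
        self.n_actions = n_actions
        self.lr, self.gamma = learning_rate, discount
        self.epsilon = epsilon
        self.epsilon_decay, self.epsilon_min = epsilon_decay, epsilon_min
        self.rng = random.Random(seed)
        # Q-table: one row per state, one column per action, all zeros at start.
        self.q_table = [[0.0] * n_actions for _ in range(n_states)]

    def valid_actions(self, action_mask=None):
        # A mask is a 0/1 sequence; with no mask (or an all-zero one) allow everything.
        if action_mask is None:
            return list(range(self.n_actions))
        valid = [a for a, ok in enumerate(action_mask) if ok]
        return valid or list(range(self.n_actions))

    def select_action(self, state, action_mask=None, greedy=False):
        valid = self.valid_actions(action_mask)
        if not greedy and self.rng.random() < self.epsilon:
            return self.rng.choice(valid)  # explore
        return max(valid, key=lambda a: self.q_table[state][a])  # exploit

    def update(self, state, action, reward, next_state, terminated, next_mask=None):
        # Terminal states have no future value, so do not bootstrap from them.
        best_next = 0.0 if terminated else max(
            self.q_table[next_state][a] for a in self.valid_actions(next_mask))
        td_error = reward + self.gamma * best_next - self.q_table[state][action]
        self.q_table[state][action] += self.lr * td_error

    def decay_epsilon(self):
        self.epsilon = max(self.epsilon_min, self.epsilon * self.epsilon_decay)


def run_episode(env, agent, learn):
    """Play one episode; update the agent if `learn`, else act greedily."""
    state, info = env.reset()
    total_reward, done = 0.0, False
    while not done:
        action = agent.select_action(state, info.get("action_mask"), greedy=not learn)
        next_state, reward, terminated, truncated, info = env.step(action)
        if learn:
            agent.update(state, action, reward, next_state, terminated,
                         info.get("action_mask"))
        state = next_state
        total_reward += reward
        done = terminated or truncated
    return total_reward


def train_taxi(env, agent, n_episodes=5000, report_every=500):
    """Train for `n_episodes` and return the per-episode rewards."""
    rewards = []
    for episode in range(1, n_episodes + 1):
        rewards.append(run_episode(env, agent, learn=True))
        agent.decay_epsilon()
        if episode % report_every == 0:
            recent = rewards[-report_every:]
            print(f"episode {episode}: avg reward {sum(recent) / len(recent):.2f}, "
                  f"epsilon {agent.epsilon:.3f}")
    return rewards


def evaluate_agent(env, agent, n_episodes=3):
    """Run greedy episodes, print each total reward, and return them."""
    totals = [run_episode(env, agent, learn=False) for _ in range(n_episodes)]
    for i, total in enumerate(totals, start=1):
        print(f"Episode {i}: total reward = {total}")
    return totals
```

With Gymnasium installed, something like `env = gym.make("Taxi-v3")`, `agent = QLearningAgent(env.observation_space.n, env.action_space.n)` and `train_taxi(env, agent)` should exercise the sketch. For a visual demonstration, the evaluation environment can be created with `gym.make("Taxi-v3", render_mode="human")`. Evaluation relies on the environment truncating long episodes, which the registered Taxi-v3 does through its step limit.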

### Environment Details

The Taxi-v3 environment is a grid-world problem where an agent must:
- Pick up a passenger at one of four locations
- Drop the passenger at another specified location
- Navigate efficiently while avoiding invalid moves
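For sizing the Q-table, it may help to see how Taxi-v3 packs its observation into a single integer. As far as the Gymnasium documentation describes it, the grid is 5x5, the passenger location takes 5 values (four landmarks plus "in the taxi"), and the destination takes 4, giving 500 discrete states and 6 actions. The helper names below are illustrative stand-ins that mirror that documented formula, not part of the proposed code.

```python
def encode(taxi_row, taxi_col, passenger_location, destination):
    """Pack the four state components into one integer in [0, 500)."""
    return ((taxi_row * 5 + taxi_col) * 5 + passenger_location) * 4 + destination


def decode(state):
    """Inverse of `encode`: recover (row, col, passenger_location, destination)."""
    state, destination = divmod(state, 4)
    state, passenger_location = divmod(state, 5)
    taxi_row, taxi_col = divmod(state, 5)
    return taxi_row, taxi_col, passenger_location, destination


# Taxi at row 3, column 1; passenger at landmark 2; destination is landmark 0.
state = encode(3, 1, 2, 0)
print(state, decode(state))
```

A Q-table for this environment therefore needs 500 rows and 6 columns.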





### Motivation

This would improve the agent-training examples, and I can extend the same approach to other agents, such as a Cliff Walking agent.

### Pitch

_No response_

### Alternatives

_No response_

### Additional context

_No response_

### Checklist

- [X] I have checked that there is no similar [issue](https://github.com/Farama-Foundation/Gymnasium/issues) in the repo


Q-learning implementation for Taxi-v3 environment #1274
