Launch separate TaskTracker instances for Map and Reduce slots

With the recently landed enhancements that free up resources when a TaskTracker becomes idle, Hadoop is a little less greedy about holding onto cluster resources it isn't actually using. However, because resources are only released when the whole TaskTracker is idle, we miss chances to free them when a TT holds mixed slots (both map and reduce).

We should launch separate TTs for map and reduce slots. To do this effectively, we probably want to bunch up as many map or reduce slots onto each node as possible, as opposed to the current logic, which applies the map/reduce slot ratio to each incoming offer. Take the following example...

---

1 Slot = 1 CPU and 1GB RAM

Offers:
- Slave (1) 10 CPUs, 10GB RAM
- Slave (2) 10 CPUs, 10GB RAM
- Slave (3) 10 CPUs, 10GB RAM

Pending tasks:
- 1000 Map
- 100 Reduce

Current result:
- Slave(1) -> TaskTracker(9 Map, 1 Reduce)
- Slave(2) -> TaskTracker(9 Map, 1 Reduce)
- Slave(3) -> TaskTracker(9 Map, 1 Reduce)

Ideal result:
- Slave(1) -> TaskTracker(10 Map)
- Slave(2) -> TaskTracker(10 Map)
- Slave(3) -> TaskTracker(7 Map)
- Slave(3) -> TaskTracker(3 Reduce)
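
To make the difference concrete, here is a minimal Python sketch of both allocation strategies. It is not the scheduler's actual code: the names `Offer`, `plan_per_offer`, and `plan_packed` are hypothetical, the slot size is taken from the example above (1 CPU and 1GB RAM), and rounding the map share down is an assumption chosen because it reproduces the numbers in the example.

```python
from dataclasses import dataclass

# Slot size from the example above: 1 slot = 1 CPU and 1GB RAM.
SLOT_CPUS = 1
SLOT_MEM_GB = 1


@dataclass
class Offer:
    """Hypothetical stand-in for a resource offer from one slave."""
    slave: str
    cpus: int
    mem_gb: int


def slots_in(offer):
    """Number of whole slots that fit in an offer."""
    return min(offer.cpus // SLOT_CPUS, offer.mem_gb // SLOT_MEM_GB)


def plan_per_offer(offers, pending_map, pending_reduce):
    """Current behaviour: apply the map/reduce ratio to each offer,
    producing one mixed TaskTracker per slave."""
    pending = pending_map + pending_reduce
    if pending == 0:
        return []
    plan = []
    for offer in offers:
        slots = slots_in(offer)
        map_slots = slots * pending_map // pending  # assumed: round down
        plan.append((offer.slave, map_slots, slots - map_slots))
    return plan


def plan_packed(offers, pending_map, pending_reduce):
    """Proposed behaviour: apply the ratio once across all offers, then
    pack map slots onto as few slaves as possible before placing reduce
    slots. Each (slave, kind, slots) tuple is a separate TaskTracker."""
    capacity = [slots_in(o) for o in offers]
    total = sum(capacity)
    pending = pending_map + pending_reduce
    if total == 0 or pending == 0:
        return []
    map_slots = min(pending_map, total * pending_map // pending)
    reduce_slots = min(pending_reduce, total - map_slots)

    plan = []
    for kind, remaining in (("map", map_slots), ("reduce", reduce_slots)):
        for i, offer in enumerate(offers):
            take = min(remaining, capacity[i])
            if take > 0:
                plan.append((offer.slave, kind, take))
                capacity[i] -= take
                remaining -= take
    return plan


offers = [Offer("Slave(1)", 10, 10),
          Offer("Slave(2)", 10, 10),
          Offer("Slave(3)", 10, 10)]

current = plan_per_offer(offers, pending_map=1000, pending_reduce=100)
ideal = plan_packed(offers, pending_map=1000, pending_reduce=100)
```

With the offers and pending tasks from the example, `current` contains one mixed (9 Map, 1 Reduce) entry per slave, while `ideal` contains map-only TaskTrackers on Slave(1) and Slave(2) plus a 7-slot map TaskTracker and a 3-slot reduce TaskTracker on Slave(3). Both strategies hand out 27 map and 3 reduce slots in total. The difference is that the packed plan leaves only one reduce TaskTracker that has to stay alive once the map phase finishes.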

Launch separate TaskTracker instances for Map and Reduce slots #47