Although the MapReduce programming model forces one to express algorithms in terms of a small set of rigidly defined components, there are many tools at one's disposal to shape the flow of computation. Ultimately, this boils down to effective use of the following techniques:
- Constructing complex keys and values that bring together data necessary for a computation.
- Executing user-specified initialization and termination code in either the mapper or reducer. For example, in-mapper combining depends on emitting intermediate key-value pairs from the map task's termination code.
- Preserving state across multiple inputs in the mapper and reducer.
- Controlling the sort order of intermediate keys with built-in or user-defined sorters.
- Controlling the partitioning of the intermediate key space with built-in or user-defined partitioners.
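
Several of these techniques can be seen together in a small word-count simulation. The following is a minimal sketch in plain Python, not the API of Hadoop or any other MapReduce implementation: the names `InMapperCombiningMapper`, `setup`, `close`, `partition`, and `run_job` are hypothetical, and the partitioning rule (first character of the key) is an arbitrary choice made for illustration. The mapper preserves state across inputs and emits only from its termination code, a user-defined partitioner divides the key space, and each reducer processes its keys in sorted order.

```python
from collections import defaultdict


class InMapperCombiningMapper:
    """Word-count mapper that aggregates counts locally before emitting."""

    def setup(self):
        # Initialization code: create state that persists across map() calls.
        self.counts = defaultdict(int)

    def map(self, _key, line):
        # Accumulate in memory instead of emitting one pair per token.
        for word in line.split():
            self.counts[word] += 1

    def close(self):
        # Termination code: emit the locally combined pairs exactly once.
        yield from self.counts.items()


def partition(key, num_reducers):
    # Hypothetical user-defined partitioner: route by the key's first character.
    return ord(key[0]) % num_reducers


def run_job(splits, num_reducers=2):
    # One list of grouped values per reducer partition.
    partitions = [defaultdict(list) for _ in range(num_reducers)]

    # Map phase: one mapper instance per input split.
    for split in splits:
        mapper = InMapperCombiningMapper()
        mapper.setup()
        for offset, line in enumerate(split):
            mapper.map(offset, line)
        # Shuffle: assign each emitted pair to a partition and group by key.
        for key, value in mapper.close():
            partitions[partition(key, num_reducers)][key].append(value)

    # Reduce phase: each reducer visits its keys in sorted order and sums.
    return [
        [(key, sum(values)) for key, values in sorted(part.items())]
        for part in partitions
    ]


result = run_job([["a b a"], ["b c a"]])
print(result)
```

In this sketch the first mapper emits two pairs rather than three, because both occurrences of `a` are combined before emission. Replacing the plain `sorted` call with one that takes a custom key function would correspond to a user-defined sort order.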