Hadoop--MapReduce3--Custom Partitioner

In MapReduce, each maptask in the map phase reads the file split it is responsible for; the input key is the byte offset at which a line starts and the input value is the line's content, while the output key-value pair uses user-defined types. Each maptask then distributes the key-value pairs it produces across the different reducetasks, and pairs with the same key are always sent to the same reducetask so that all data for one key is aggregated together. The basic mechanism is as follows:

How a maptask distributes its data is determined by the Partitioner interface. The default implementation, HashPartitioner, assigns each record to partition hash(key) % numReduceTasks, so records with the same key are guaranteed to land in the same reducetask.

The Partitioner interface:

/** 
 * Partitions the key space.
 * 
 * <p><code>Partitioner</code> controls the partitioning of the keys of the 
 * intermediate map-outputs. The key (or a subset of the key) is used to derive
 * the partition, typically by a hash function. The total number of partitions
 * is the same as the number of reduce tasks for the job. Hence this controls
 * which of the <code>m</code> reduce tasks the intermediate key (and hence the 
 * record) is sent for reduction.</p>
 * 
 * Note: If you require your Partitioner class to obtain the Job's configuration
 * object, implement the {@link Configurable} interface.
 * 
 * @see Reducer
 */
@InterfaceAudience.Public
@InterfaceStability.Stable
public abstract class Partitioner<KEY, VALUE> {
  
  /** 
   * Get the partition number for a given key (hence record) given the total 
   * number of partitions i.e. number of reduce-tasks for the job.
   *   
 * <p>Typically a hash function on all or a subset of the key.</p>
   *
   * @param key the key to be partitioned.
   * @param value the entry value.
   * @param numPartitions the total number of partitions.
   * @return the partition number for the <code>key</code>.
   */
  public abstract int getPartition(KEY key, VALUE value, int numPartitions);
  
}

The default HashPartitioner implementation is shown below; it assigns a record to partition (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks:

/** Partition keys by their {@link Object#hashCode()}. */
@InterfaceAudience.Public
@InterfaceStability.Stable
public class HashPartitioner<K, V> extends Partitioner<K, V> {

  /** Use {@link Object#hashCode()} to partition. */
  public int getPartition(K key, V value,
                          int numReduceTasks) {
    return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
  }

}
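
A note on the bitmask: key.hashCode() can be negative, and a plain % on a negative value would yield a negative (hence illegal) partition number, so HashPartitioner first clears the sign bit by ANDing with Integer.MAX_VALUE. A tiny standalone demonstration, using a well-known string whose hashCode() happens to be Integer.MIN_VALUE:

public class HashPartitionDemo {
	public static void main(String[] args) {
		int h = "polygenelubricants".hashCode(); // == Integer.MIN_VALUE
		int numReduceTasks = 6;
		// Without the mask the result is negative and would be an illegal partition
		System.out.println(h % numReduceTasks);                       // -2
		// Clearing the sign bit first yields a valid partition number
		System.out.println((h & Integer.MAX_VALUE) % numReduceTasks); //  0
	}
}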

By implementing the Partitioner method yourself you can define custom key distribution rules, directing all records with a particular key to one specific reducetask for processing.

For example:

Count each user's total traffic and write the results to different output files according to the user's home region. In the sample lines below the fields are whitespace-separated; the second field is the phone number, and the last three fields are upstream traffic, downstream traffic, and a status code:

1363157982040 	13502468823	5C-0A-5B-6A-0B-D4:CMCC-EASY	120.196.100.99	y0.ifengimg.com	综合门户	57	102	7335	110349	200
1363157986072 	18320173382	84-25-DB-4F-10-1A:CMCC-EASY	120.196.100.99	input.shouji.sogou.com	搜索引擎	21	18	9531	2412	200
1363157990043 	13925057413	00-1F-64-E1-E6-9A:CMCC	120.196.100.55	t3.baidu.com	搜索引擎	69	63	11058	48243	200

In the map phase each line is read; the key is the phone number and the value is a custom type that wraps the traffic information.

The number of home regions is used as the number of reducetasks, so each reducetask handles the data of exactly one home region.

A custom Partitioner implementation sends records with a given key to the designated reducetask.

Each reducetask then processes the data of a single home region, accumulating the traffic to produce the final result.

The implementation is as follows:

The custom partition controller (Partitioner):

import java.util.HashMap;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

public class ProvincePartitioner extends Partitioner<Text, FlowBean> {

	// Maps a phone-number prefix to a partition (i.e. reducetask) number
	static HashMap<String, Integer> codeMap = new HashMap<>();
	static {
		codeMap.put("135", 0);
		codeMap.put("136", 1);
		codeMap.put("137", 2);
		codeMap.put("138", 3);
		codeMap.put("139", 4);
	}

	@Override
	public int getPartition(Text key, FlowBean value, int numPartitions) {
		// Look up the partition by the first three digits of the phone number;
		// any unmapped prefix falls through to the catch-all partition 5
		Integer code = codeMap.get(key.toString().substring(0, 3));
		return code == null ? 5 : code;
	}
}

This is just a simple mapping from phone-number prefix to home region, assuming six regions in total: partitions 0-4 for the known prefixes, plus partition 5 as the catch-all. As the Javadoc above notes, a Partitioner that needs the Job's configuration can implement the Configurable interface; a sketch of such a variant follows.
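
A minimal sketch of a configurable variant that reads the prefix mapping from the job configuration instead of hard-coding it. The framework calls setConf when it instantiates a Configurable class via reflection; the property names flow.partition.codes and flow.partition.default are invented for this sketch:

import java.util.HashMap;
import java.util.Map;

import org.apache.hadoop.conf.Configurable;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

public class ConfigurableProvincePartitioner extends Partitioner<Text, FlowBean>
		implements Configurable {

	private Configuration conf;
	private final Map<String, Integer> codeMap = new HashMap<>();
	private int defaultPartition;

	@Override
	public void setConf(Configuration conf) {
		this.conf = conf;
		// Expected format: "135=0,136=1,137=2,138=3,139=4" (hypothetical property)
		for (String pair : conf.get("flow.partition.codes", "").split(",")) {
			String[] kv = pair.split("=");
			if (kv.length == 2) {
				codeMap.put(kv[0], Integer.parseInt(kv[1]));
			}
		}
		// Unknown prefixes fall through to one extra partition after the mapped ones
		defaultPartition = conf.getInt("flow.partition.default", codeMap.size());
	}

	@Override
	public Configuration getConf() {
		return conf;
	}

	@Override
	public int getPartition(Text key, FlowBean value, int numPartitions) {
		Integer code = codeMap.get(key.toString().substring(0, 3));
		return code == null ? defaultPartition : code;
	}
}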

The custom data type, Mapper, and Reducer implementations are the same as in the previous post; since they are not reproduced there, below is a minimal sketch consistent with the sample data above.
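
The field positions are assumptions read off the sample lines (phone number in the second field, up/down traffic in the third- and second-to-last fields); the three classes are shown together for brevity, but in a real project each goes in its own file:

import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.io.Writable;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// FlowBean: serializable container for upstream/downstream/total traffic
public class FlowBean implements Writable {
	private long upFlow;
	private long downFlow;
	private long sumFlow;

	public FlowBean() {} // no-arg constructor required for Writable deserialization

	public void set(long upFlow, long downFlow) {
		this.upFlow = upFlow;
		this.downFlow = downFlow;
		this.sumFlow = upFlow + downFlow;
	}

	public long getUpFlow() { return upFlow; }
	public long getDownFlow() { return downFlow; }

	public void write(DataOutput out) throws IOException {
		out.writeLong(upFlow);
		out.writeLong(downFlow);
		out.writeLong(sumFlow);
	}

	public void readFields(DataInput in) throws IOException {
		upFlow = in.readLong();
		downFlow = in.readLong();
		sumFlow = in.readLong();
	}

	public String toString() { return upFlow + "\t" + downFlow + "\t" + sumFlow; }
}

// Mapper: emits (phone number, FlowBean with that line's traffic)
class FlowCountMapper extends Mapper<LongWritable, Text, Text, FlowBean> {
	private final Text phone = new Text();
	private final FlowBean bean = new FlowBean();

	protected void map(LongWritable key, Text value, Context context)
			throws IOException, InterruptedException {
		String[] fields = value.toString().split("\\s+");
		phone.set(fields[1]); // phone number is the second field
		// up/down traffic are the third- and second-to-last fields
		bean.set(Long.parseLong(fields[fields.length - 3]),
				Long.parseLong(fields[fields.length - 2]));
		context.write(phone, bean);
	}
}

// Reducer: sums all traffic records of one phone number
class FlowCountReducer extends Reducer<Text, FlowBean, Text, FlowBean> {
	private final FlowBean result = new FlowBean();

	protected void reduce(Text key, Iterable<FlowBean> values, Context context)
			throws IOException, InterruptedException {
		long upSum = 0;
		long downSum = 0;
		for (FlowBean b : values) {
			upSum += b.getUpFlow();
			downSum += b.getDownFlow();
		}
		result.set(upSum, downSum);
		context.write(key, result);
	}
}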

Submit the job to run with the local job runner:

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class JobSubmitter {

	public static void main(String[] args) throws Exception {

		Configuration conf = new Configuration();

		Job job = Job.getInstance(conf);
		job.setJarByClass(JobSubmitter.class);
		job.setMapperClass(FlowCountMapper.class);
		job.setReducerClass(FlowCountReducer.class);
		// Specify the map-side partitioning class (HashPartitioner is the default)
		job.setPartitionerClass(ProvincePartitioner.class);
		// ProvincePartitioner can emit 6 distinct partition numbers,
		// so 6 reducetasks are needed to receive them
		job.setNumReduceTasks(6);
		job.setMapOutputKeyClass(Text.class);
		job.setMapOutputValueClass(FlowBean.class);
		job.setOutputKeyClass(Text.class);
		job.setOutputValueClass(FlowBean.class);

		FileInputFormat.setInputPaths(job, new Path("F:\\hadoop-2.8.1\\data\\flow\\input"));
		FileOutputFormat.setOutputPath(job, new Path("F:\\hadoop-2.8.1\\data\\flow\\output2"));

		job.waitForCompletion(true);

	}

}

The partitioner is specified via job.setPartitionerClass(ProvincePartitioner.class), and the number of reducetasks must match the partitioner's logic: if getPartition returns a number greater than or equal to the number of reducetasks, the map side fails with an "Illegal partition" error, while extra reducetasks beyond the partitions actually produced simply emit empty output files. The partition logic can also be sanity-checked standalone, as shown below.
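
A quick hedged check of ProvincePartitioner with a plain main(), no cluster needed; since getPartition only inspects the key, passing a null value is fine here:

import org.apache.hadoop.io.Text;

public class ProvincePartitionerCheck {
	public static void main(String[] args) {
		ProvincePartitioner p = new ProvincePartitioner();
		System.out.println(p.getPartition(new Text("13502468823"), null, 6)); // 0
		System.out.println(p.getPartition(new Text("13925057413"), null, 6)); // 4
		System.out.println(p.getPartition(new Text("18320173382"), null, 6)); // 5 (catch-all)
	}
}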

Run log:

[INFO ] 2019-02-27 20:57:26,864 method:org.apache.hadoop.conf.Configuration.warnOnceIfDeprecated(Configuration.java:1181)
session.id is deprecated. Instead, use dfs.metrics.session-id
[INFO ] 2019-02-27 20:57:26,872 method:org.apache.hadoop.metrics.jvm.JvmMetrics.init(JvmMetrics.java:79)
Initializing JVM Metrics with processName=JobTracker, sessionId=
[WARN ] 2019-02-27 20:57:28,516 method:org.apache.hadoop.mapreduce.JobResourceUploader.uploadFiles(JobResourceUploader.java:64)
Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
[WARN ] 2019-02-27 20:57:28,562 method:org.apache.hadoop.mapreduce.JobResourceUploader.uploadFiles(JobResourceUploader.java:171)
No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
[INFO ] 2019-02-27 20:57:28,847 method:org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:289)
Total input files to process : 1
[INFO ] 2019-02-27 20:57:28,930 method:org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:200)
number of splits:1
[INFO ] 2019-02-27 20:57:29,167 method:org.apache.hadoop.mapreduce.JobSubmitter.printTokens(JobSubmitter.java:289)
Submitting tokens for job: job_local1894406223_0001
[INFO ] 2019-02-27 20:57:29,483 method:org.apache.hadoop.mapreduce.Job.submit(Job.java:1345)
The url to track the job: http://localhost:8080/
[INFO ] 2019-02-27 20:57:29,485 method:org.apache.hadoop.mapreduce.Job.monitorAndPrintJob(Job.java:1390)
Running job: job_local1894406223_0001
[INFO ] 2019-02-27 20:57:29,488 method:org.apache.hadoop.mapred.LocalJobRunner$Job.createOutputCommitter(LocalJobRunner.java:498)
OutputCommitter set in config null
[INFO ] 2019-02-27 20:57:29,500 method:org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.<init>(FileOutputCommitter.java:123)
File Output Committer Algorithm version is 1
[INFO ] 2019-02-27 20:57:29,500 method:org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.<init>(FileOutputCommitter.java:138)
FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
[INFO ] 2019-02-27 20:57:29,502 method:org.apache.hadoop.mapred.LocalJobRunner$Job.createOutputCommitter(LocalJobRunner.java:516)
OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
[INFO ] 2019-02-27 20:57:29,579 method:org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:475)
Waiting for map tasks
[INFO ] 2019-02-27 20:57:29,581 method:org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:251)
Starting task: attempt_local1894406223_0001_m_000000_0
[INFO ] 2019-02-27 20:57:29,638 method:org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.<init>(FileOutputCommitter.java:123)
File Output Committer Algorithm version is 1
[INFO ] 2019-02-27 20:57:29,641 method:org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.<init>(FileOutputCommitter.java:138)
FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
[INFO ] 2019-02-27 20:57:29,668 method:org.apache.hadoop.yarn.util.ProcfsBasedProcessTree.isAvailable(ProcfsBasedProcessTree.java:168)
ProcfsBasedProcessTree currently is supported only on Linux.
[INFO ] 2019-02-27 20:57:29,832 method:org.apache.hadoop.mapred.Task.initialize(Task.java:619)
 Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@393c874d
[INFO ] 2019-02-27 20:57:29,849 method:org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:756)
Processing split: file:/F:/hadoop-2.8.1/data/flow/input/flow.log:0+2226
[INFO ] 2019-02-27 20:57:30,019 method:org.apache.hadoop.mapred.MapTask$MapOutputBuffer.setEquator(MapTask.java:1205)
(EQUATOR) 0 kvi 26214396(104857584)
[INFO ] 2019-02-27 20:57:30,019 method:org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:998)
mapreduce.task.io.sort.mb: 100
[INFO ] 2019-02-27 20:57:30,019 method:org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:999)
soft limit at 83886080
[INFO ] 2019-02-27 20:57:30,020 method:org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:1000)
bufstart = 0; bufvoid = 104857600
[INFO ] 2019-02-27 20:57:30,020 method:org.apache.hadoop.mapred.MapTask$MapOutputBuffer.init(MapTask.java:1001)
kvstart = 26214396; length = 6553600
[INFO ] 2019-02-27 20:57:30,025 method:org.apache.hadoop.mapred.MapTask.createSortingCollector(MapTask.java:403)
Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
[INFO ] 2019-02-27 20:57:30,045 method:org.apache.hadoop.mapred.LocalJobRunner$Job.statusUpdate(LocalJobRunner.java:618)

[INFO ] 2019-02-27 20:57:30,046 method:org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1462)
Starting flush of map output
[INFO ] 2019-02-27 20:57:30,046 method:org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1484)
Spilling map output
[INFO ] 2019-02-27 20:57:30,046 method:org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1485)
bufstart = 0; bufend = 808; bufvoid = 104857600
[INFO ] 2019-02-27 20:57:30,047 method:org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1487)
kvstart = 26214396(104857584); kvend = 26214312(104857248); length = 85/6553600
[INFO ] 2019-02-27 20:57:30,089 method:org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1669)
Finished spill 0
[INFO ] 2019-02-27 20:57:30,104 method:org.apache.hadoop.mapred.Task.done(Task.java:1099)
Task:attempt_local1894406223_0001_m_000000_0 is done. And is in the process of committing
[INFO ] 2019-02-27 20:57:30,142 method:org.apache.hadoop.mapred.LocalJobRunner$Job.statusUpdate(LocalJobRunner.java:618)
map
[INFO ] 2019-02-27 20:57:30,142 method:org.apache.hadoop.mapred.Task.sendDone(Task.java:1219)
Task 'attempt_local1894406223_0001_m_000000_0' done.
[INFO ] 2019-02-27 20:57:30,142 method:org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:276)
Finishing task: attempt_local1894406223_0001_m_000000_0
[INFO ] 2019-02-27 20:57:30,143 method:org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:483)
map task executor complete.
[INFO ] 2019-02-27 20:57:30,155 method:org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:475)
Waiting for reduce tasks
[INFO ] 2019-02-27 20:57:30,156 method:org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:329)
Starting task: attempt_local1894406223_0001_r_000000_0
[INFO ] 2019-02-27 20:57:30,174 method:org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.<init>(FileOutputCommitter.java:123)
File Output Committer Algorithm version is 1
[INFO ] 2019-02-27 20:57:30,174 method:org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.<init>(FileOutputCommitter.java:138)
FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
[INFO ] 2019-02-27 20:57:30,175 method:org.apache.hadoop.yarn.util.ProcfsBasedProcessTree.isAvailable(ProcfsBasedProcessTree.java:168)
ProcfsBasedProcessTree currently is supported only on Linux.
[INFO ] 2019-02-27 20:57:30,330 method:org.apache.hadoop.mapred.Task.initialize(Task.java:619)
 Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@3a6e816
[INFO ] 2019-02-27 20:57:30,337 method:org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:362)
Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@3506b443
[INFO ] 2019-02-27 20:57:30,361 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.<init>(MergeManagerImpl.java:206)
MergerManager: memoryLimit=1323407744, maxSingleShuffleLimit=330851936, mergeThreshold=873449152, ioSortFactor=10, memToMemMergeOutputsThreshold=10
[INFO ] 2019-02-27 20:57:30,365 method:org.apache.hadoop.mapreduce.task.reduce.EventFetcher.run(EventFetcher.java:61)
attempt_local1894406223_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
[INFO ] 2019-02-27 20:57:30,432 method:org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.copyMapOutput(LocalFetcher.java:145)
localfetcher#1 about to shuffle output of map attempt_local1894406223_0001_m_000000_0 decomp: 158 len: 162 to MEMORY
[INFO ] 2019-02-27 20:57:30,452 method:org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.doShuffle(InMemoryMapOutput.java:93)
Read 158 bytes from map-output for attempt_local1894406223_0001_m_000000_0
[INFO ] 2019-02-27 20:57:30,455 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.closeInMemoryFile(MergeManagerImpl.java:321)
closeInMemoryFile -> map-output of size: 158, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->158
[INFO ] 2019-02-27 20:57:30,458 method:org.apache.hadoop.mapreduce.task.reduce.EventFetcher.run(EventFetcher.java:76)
EventFetcher is interrupted.. Returning
[INFO ] 2019-02-27 20:57:30,460 method:org.apache.hadoop.mapred.LocalJobRunner$Job.statusUpdate(LocalJobRunner.java:618)
1 / 1 copied.
[INFO ] 2019-02-27 20:57:30,460 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.finalMerge(MergeManagerImpl.java:693)
finalMerge called with 1 in-memory map-outputs and 0 on-disk map-outputs
[INFO ] 2019-02-27 20:57:30,489 method:org.apache.hadoop.mapreduce.Job.monitorAndPrintJob(Job.java:1411)
Job job_local1894406223_0001 running in uber mode : false
[INFO ] 2019-02-27 20:57:30,491 method:org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:606)
Merging 1 sorted segments
[INFO ] 2019-02-27 20:57:30,492 method:org.apache.hadoop.mapreduce.Job.monitorAndPrintJob(Job.java:1418)
 map 100% reduce 0%
[INFO ] 2019-02-27 20:57:30,492 method:org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:705)
Down to the last merge-pass, with 1 segments left of total size: 144 bytes
[INFO ] 2019-02-27 20:57:30,498 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.finalMerge(MergeManagerImpl.java:760)
Merged 1 segments, 158 bytes to disk to satisfy reduce memory limit
[INFO ] 2019-02-27 20:57:30,503 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.finalMerge(MergeManagerImpl.java:790)
Merging 1 files, 162 bytes from disk
[INFO ] 2019-02-27 20:57:30,505 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.finalMerge(MergeManagerImpl.java:805)
Merging 0 segments, 0 bytes from memory into reduce
[INFO ] 2019-02-27 20:57:30,505 method:org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:606)
Merging 1 sorted segments
[INFO ] 2019-02-27 20:57:30,509 method:org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:705)
Down to the last merge-pass, with 1 segments left of total size: 144 bytes
[INFO ] 2019-02-27 20:57:30,510 method:org.apache.hadoop.mapred.LocalJobRunner$Job.statusUpdate(LocalJobRunner.java:618)
1 / 1 copied.
[INFO ] 2019-02-27 20:57:30,522 method:org.apache.hadoop.conf.Configuration.warnOnceIfDeprecated(Configuration.java:1181)
mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
[INFO ] 2019-02-27 20:57:30,536 method:org.apache.hadoop.mapred.Task.done(Task.java:1099)
Task:attempt_local1894406223_0001_r_000000_0 is done. And is in the process of committing
[INFO ] 2019-02-27 20:57:30,540 method:org.apache.hadoop.mapred.LocalJobRunner$Job.statusUpdate(LocalJobRunner.java:618)
1 / 1 copied.
[INFO ] 2019-02-27 20:57:30,541 method:org.apache.hadoop.mapred.Task.commit(Task.java:1260)
Task attempt_local1894406223_0001_r_000000_0 is allowed to commit now
[INFO ] 2019-02-27 20:57:30,548 method:org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.commitTask(FileOutputCommitter.java:582)
Saved output of task 'attempt_local1894406223_0001_r_000000_0' to file:/F:/hadoop-2.8.1/data/flow/output2/_temporary/0/task_local1894406223_0001_r_000000
[INFO ] 2019-02-27 20:57:30,550 method:org.apache.hadoop.mapred.LocalJobRunner$Job.statusUpdate(LocalJobRunner.java:618)
reduce > reduce
[INFO ] 2019-02-27 20:57:30,550 method:org.apache.hadoop.mapred.Task.sendDone(Task.java:1219)
Task 'attempt_local1894406223_0001_r_000000_0' done.
[INFO ] 2019-02-27 20:57:30,551 method:org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:352)
Finishing task: attempt_local1894406223_0001_r_000000_0
[INFO ] 2019-02-27 20:57:30,551 method:org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:329)
Starting task: attempt_local1894406223_0001_r_000001_0
[INFO ] 2019-02-27 20:57:30,554 method:org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.<init>(FileOutputCommitter.java:123)
File Output Committer Algorithm version is 1
[INFO ] 2019-02-27 20:57:30,554 method:org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.<init>(FileOutputCommitter.java:138)
FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
[INFO ] 2019-02-27 20:57:30,554 method:org.apache.hadoop.yarn.util.ProcfsBasedProcessTree.isAvailable(ProcfsBasedProcessTree.java:168)
ProcfsBasedProcessTree currently is supported only on Linux.
[INFO ] 2019-02-27 20:57:30,730 method:org.apache.hadoop.mapred.Task.initialize(Task.java:619)
 Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@61c1c6cc
[INFO ] 2019-02-27 20:57:30,730 method:org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:362)
Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@2aa31a05
[INFO ] 2019-02-27 20:57:30,733 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.<init>(MergeManagerImpl.java:206)
MergerManager: memoryLimit=1323407744, maxSingleShuffleLimit=330851936, mergeThreshold=873449152, ioSortFactor=10, memToMemMergeOutputsThreshold=10
[INFO ] 2019-02-27 20:57:30,735 method:org.apache.hadoop.mapreduce.task.reduce.EventFetcher.run(EventFetcher.java:61)
attempt_local1894406223_0001_r_000001_0 Thread started: EventFetcher for fetching Map Completion Events
[INFO ] 2019-02-27 20:57:30,758 method:org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.copyMapOutput(LocalFetcher.java:145)
localfetcher#2 about to shuffle output of map attempt_local1894406223_0001_m_000000_0 decomp: 80 len: 84 to MEMORY
[INFO ] 2019-02-27 20:57:30,761 method:org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.doShuffle(InMemoryMapOutput.java:93)
Read 80 bytes from map-output for attempt_local1894406223_0001_m_000000_0
[INFO ] 2019-02-27 20:57:30,762 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.closeInMemoryFile(MergeManagerImpl.java:321)
closeInMemoryFile -> map-output of size: 80, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->80
[INFO ] 2019-02-27 20:57:30,763 method:org.apache.hadoop.mapreduce.task.reduce.EventFetcher.run(EventFetcher.java:76)
EventFetcher is interrupted.. Returning
[INFO ] 2019-02-27 20:57:30,765 method:org.apache.hadoop.mapred.LocalJobRunner$Job.statusUpdate(LocalJobRunner.java:618)
1 / 1 copied.
[INFO ] 2019-02-27 20:57:30,765 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.finalMerge(MergeManagerImpl.java:693)
finalMerge called with 1 in-memory map-outputs and 0 on-disk map-outputs
[INFO ] 2019-02-27 20:57:30,793 method:org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:606)
Merging 1 sorted segments
[INFO ] 2019-02-27 20:57:30,793 method:org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:705)
Down to the last merge-pass, with 1 segments left of total size: 66 bytes
[INFO ] 2019-02-27 20:57:30,797 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.finalMerge(MergeManagerImpl.java:760)
Merged 1 segments, 80 bytes to disk to satisfy reduce memory limit
[INFO ] 2019-02-27 20:57:30,800 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.finalMerge(MergeManagerImpl.java:790)
Merging 1 files, 84 bytes from disk
[INFO ] 2019-02-27 20:57:30,801 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.finalMerge(MergeManagerImpl.java:805)
Merging 0 segments, 0 bytes from memory into reduce
[INFO ] 2019-02-27 20:57:30,801 method:org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:606)
Merging 1 sorted segments
[INFO ] 2019-02-27 20:57:30,804 method:org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:705)
Down to the last merge-pass, with 1 segments left of total size: 66 bytes
[INFO ] 2019-02-27 20:57:30,805 method:org.apache.hadoop.mapred.LocalJobRunner$Job.statusUpdate(LocalJobRunner.java:618)
1 / 1 copied.
[INFO ] 2019-02-27 20:57:30,818 method:org.apache.hadoop.mapred.Task.done(Task.java:1099)
Task:attempt_local1894406223_0001_r_000001_0 is done. And is in the process of committing
[INFO ] 2019-02-27 20:57:30,823 method:org.apache.hadoop.mapred.LocalJobRunner$Job.statusUpdate(LocalJobRunner.java:618)
1 / 1 copied.
[INFO ] 2019-02-27 20:57:30,823 method:org.apache.hadoop.mapred.Task.commit(Task.java:1260)
Task attempt_local1894406223_0001_r_000001_0 is allowed to commit now
[INFO ] 2019-02-27 20:57:30,830 method:org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.commitTask(FileOutputCommitter.java:582)
Saved output of task 'attempt_local1894406223_0001_r_000001_0' to file:/F:/hadoop-2.8.1/data/flow/output2/_temporary/0/task_local1894406223_0001_r_000001
[INFO ] 2019-02-27 20:57:30,832 method:org.apache.hadoop.mapred.LocalJobRunner$Job.statusUpdate(LocalJobRunner.java:618)
reduce > reduce
[INFO ] 2019-02-27 20:57:30,832 method:org.apache.hadoop.mapred.Task.sendDone(Task.java:1219)
Task 'attempt_local1894406223_0001_r_000001_0' done.
[INFO ] 2019-02-27 20:57:30,833 method:org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:352)
Finishing task: attempt_local1894406223_0001_r_000001_0
[INFO ] 2019-02-27 20:57:30,833 method:org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:329)
Starting task: attempt_local1894406223_0001_r_000002_0
[INFO ] 2019-02-27 20:57:30,837 method:org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.<init>(FileOutputCommitter.java:123)
File Output Committer Algorithm version is 1
[INFO ] 2019-02-27 20:57:30,837 method:org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.<init>(FileOutputCommitter.java:138)
FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
[INFO ] 2019-02-27 20:57:30,838 method:org.apache.hadoop.yarn.util.ProcfsBasedProcessTree.isAvailable(ProcfsBasedProcessTree.java:168)
ProcfsBasedProcessTree currently is supported only on Linux.
[INFO ] 2019-02-27 20:57:30,997 method:org.apache.hadoop.mapred.Task.initialize(Task.java:619)
 Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@50b5f094
[INFO ] 2019-02-27 20:57:30,997 method:org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:362)
Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@6274bbb5
[INFO ] 2019-02-27 20:57:30,998 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.<init>(MergeManagerImpl.java:206)
MergerManager: memoryLimit=1323407744, maxSingleShuffleLimit=330851936, mergeThreshold=873449152, ioSortFactor=10, memToMemMergeOutputsThreshold=10
[INFO ] 2019-02-27 20:57:31,000 method:org.apache.hadoop.mapreduce.task.reduce.EventFetcher.run(EventFetcher.java:61)
attempt_local1894406223_0001_r_000002_0 Thread started: EventFetcher for fetching Map Completion Events
[INFO ] 2019-02-27 20:57:31,014 method:org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.copyMapOutput(LocalFetcher.java:145)
localfetcher#3 about to shuffle output of map attempt_local1894406223_0001_m_000000_0 decomp: 158 len: 162 to MEMORY
[INFO ] 2019-02-27 20:57:31,017 method:org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.doShuffle(InMemoryMapOutput.java:93)
Read 158 bytes from map-output for attempt_local1894406223_0001_m_000000_0
[INFO ] 2019-02-27 20:57:31,017 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.closeInMemoryFile(MergeManagerImpl.java:321)
closeInMemoryFile -> map-output of size: 158, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->158
[INFO ] 2019-02-27 20:57:31,018 method:org.apache.hadoop.mapreduce.task.reduce.EventFetcher.run(EventFetcher.java:76)
EventFetcher is interrupted.. Returning
[INFO ] 2019-02-27 20:57:31,020 method:org.apache.hadoop.mapred.LocalJobRunner$Job.statusUpdate(LocalJobRunner.java:618)
1 / 1 copied.
[INFO ] 2019-02-27 20:57:31,020 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.finalMerge(MergeManagerImpl.java:693)
finalMerge called with 1 in-memory map-outputs and 0 on-disk map-outputs
[INFO ] 2019-02-27 20:57:31,044 method:org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:606)
Merging 1 sorted segments
[INFO ] 2019-02-27 20:57:31,044 method:org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:705)
Down to the last merge-pass, with 1 segments left of total size: 144 bytes
[INFO ] 2019-02-27 20:57:31,048 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.finalMerge(MergeManagerImpl.java:760)
Merged 1 segments, 158 bytes to disk to satisfy reduce memory limit
[INFO ] 2019-02-27 20:57:31,050 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.finalMerge(MergeManagerImpl.java:790)
Merging 1 files, 162 bytes from disk
[INFO ] 2019-02-27 20:57:31,051 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.finalMerge(MergeManagerImpl.java:805)
Merging 0 segments, 0 bytes from memory into reduce
[INFO ] 2019-02-27 20:57:31,051 method:org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:606)
Merging 1 sorted segments
[INFO ] 2019-02-27 20:57:31,053 method:org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:705)
Down to the last merge-pass, with 1 segments left of total size: 144 bytes
[INFO ] 2019-02-27 20:57:31,054 method:org.apache.hadoop.mapred.LocalJobRunner$Job.statusUpdate(LocalJobRunner.java:618)
1 / 1 copied.
[INFO ] 2019-02-27 20:57:31,071 method:org.apache.hadoop.mapred.Task.done(Task.java:1099)
Task:attempt_local1894406223_0001_r_000002_0 is done. And is in the process of committing
[INFO ] 2019-02-27 20:57:31,076 method:org.apache.hadoop.mapred.LocalJobRunner$Job.statusUpdate(LocalJobRunner.java:618)
1 / 1 copied.
[INFO ] 2019-02-27 20:57:31,077 method:org.apache.hadoop.mapred.Task.commit(Task.java:1260)
Task attempt_local1894406223_0001_r_000002_0 is allowed to commit now
[INFO ] 2019-02-27 20:57:31,083 method:org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.commitTask(FileOutputCommitter.java:582)
Saved output of task 'attempt_local1894406223_0001_r_000002_0' to file:/F:/hadoop-2.8.1/data/flow/output2/_temporary/0/task_local1894406223_0001_r_000002
[INFO ] 2019-02-27 20:57:31,084 method:org.apache.hadoop.mapred.LocalJobRunner$Job.statusUpdate(LocalJobRunner.java:618)
reduce > reduce
[INFO ] 2019-02-27 20:57:31,085 method:org.apache.hadoop.mapred.Task.sendDone(Task.java:1219)
Task 'attempt_local1894406223_0001_r_000002_0' done.
[INFO ] 2019-02-27 20:57:31,085 method:org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:352)
Finishing task: attempt_local1894406223_0001_r_000002_0
[INFO ] 2019-02-27 20:57:31,086 method:org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:329)
Starting task: attempt_local1894406223_0001_r_000003_0
[INFO ] 2019-02-27 20:57:31,088 method:org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.<init>(FileOutputCommitter.java:123)
File Output Committer Algorithm version is 1
[INFO ] 2019-02-27 20:57:31,089 method:org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.<init>(FileOutputCommitter.java:138)
FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
[INFO ] 2019-02-27 20:57:31,090 method:org.apache.hadoop.yarn.util.ProcfsBasedProcessTree.isAvailable(ProcfsBasedProcessTree.java:168)
ProcfsBasedProcessTree currently is supported only on Linux.
[INFO ] 2019-02-27 20:57:31,246 method:org.apache.hadoop.mapred.Task.initialize(Task.java:619)
 Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@41f1c4e4
[INFO ] 2019-02-27 20:57:31,246 method:org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:362)
Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@3e3fad1e
[INFO ] 2019-02-27 20:57:31,248 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.<init>(MergeManagerImpl.java:206)
MergerManager: memoryLimit=1323407744, maxSingleShuffleLimit=330851936, mergeThreshold=873449152, ioSortFactor=10, memToMemMergeOutputsThreshold=10
[INFO ] 2019-02-27 20:57:31,250 method:org.apache.hadoop.mapreduce.task.reduce.EventFetcher.run(EventFetcher.java:61)
attempt_local1894406223_0001_r_000003_0 Thread started: EventFetcher for fetching Map Completion Events
[INFO ] 2019-02-27 20:57:31,265 method:org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.copyMapOutput(LocalFetcher.java:145)
localfetcher#4 about to shuffle output of map attempt_local1894406223_0001_m_000000_0 decomp: 41 len: 45 to MEMORY
[INFO ] 2019-02-27 20:57:31,267 method:org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.doShuffle(InMemoryMapOutput.java:93)
Read 41 bytes from map-output for attempt_local1894406223_0001_m_000000_0
[INFO ] 2019-02-27 20:57:31,267 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.closeInMemoryFile(MergeManagerImpl.java:321)
closeInMemoryFile -> map-output of size: 41, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->41
[INFO ] 2019-02-27 20:57:31,268 method:org.apache.hadoop.mapreduce.task.reduce.EventFetcher.run(EventFetcher.java:76)
EventFetcher is interrupted.. Returning
[INFO ] 2019-02-27 20:57:31,270 method:org.apache.hadoop.mapred.LocalJobRunner$Job.statusUpdate(LocalJobRunner.java:618)
1 / 1 copied.
[INFO ] 2019-02-27 20:57:31,271 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.finalMerge(MergeManagerImpl.java:693)
finalMerge called with 1 in-memory map-outputs and 0 on-disk map-outputs
[INFO ] 2019-02-27 20:57:31,293 method:org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:606)
Merging 1 sorted segments
[INFO ] 2019-02-27 20:57:31,293 method:org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:705)
Down to the last merge-pass, with 1 segments left of total size: 27 bytes
[INFO ] 2019-02-27 20:57:31,298 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.finalMerge(MergeManagerImpl.java:760)
Merged 1 segments, 41 bytes to disk to satisfy reduce memory limit
[INFO ] 2019-02-27 20:57:31,301 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.finalMerge(MergeManagerImpl.java:790)
Merging 1 files, 45 bytes from disk
[INFO ] 2019-02-27 20:57:31,301 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.finalMerge(MergeManagerImpl.java:805)
Merging 0 segments, 0 bytes from memory into reduce
[INFO ] 2019-02-27 20:57:31,301 method:org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:606)
Merging 1 sorted segments
[INFO ] 2019-02-27 20:57:31,303 method:org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:705)
Down to the last merge-pass, with 1 segments left of total size: 27 bytes
[INFO ] 2019-02-27 20:57:31,304 method:org.apache.hadoop.mapred.LocalJobRunner$Job.statusUpdate(LocalJobRunner.java:618)
1 / 1 copied.
[INFO ] 2019-02-27 20:57:31,316 method:org.apache.hadoop.mapred.Task.done(Task.java:1099)
Task:attempt_local1894406223_0001_r_000003_0 is done. And is in the process of committing
[INFO ] 2019-02-27 20:57:31,318 method:org.apache.hadoop.mapred.LocalJobRunner$Job.statusUpdate(LocalJobRunner.java:618)
1 / 1 copied.
[INFO ] 2019-02-27 20:57:31,319 method:org.apache.hadoop.mapred.Task.commit(Task.java:1260)
Task attempt_local1894406223_0001_r_000003_0 is allowed to commit now
[INFO ] 2019-02-27 20:57:31,325 method:org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.commitTask(FileOutputCommitter.java:582)
Saved output of task 'attempt_local1894406223_0001_r_000003_0' to file:/F:/hadoop-2.8.1/data/flow/output2/_temporary/0/task_local1894406223_0001_r_000003
[INFO ] 2019-02-27 20:57:31,326 method:org.apache.hadoop.mapred.LocalJobRunner$Job.statusUpdate(LocalJobRunner.java:618)
reduce > reduce
[INFO ] 2019-02-27 20:57:31,326 method:org.apache.hadoop.mapred.Task.sendDone(Task.java:1219)
Task 'attempt_local1894406223_0001_r_000003_0' done.
[INFO ] 2019-02-27 20:57:31,327 method:org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:352)
Finishing task: attempt_local1894406223_0001_r_000003_0
[INFO ] 2019-02-27 20:57:31,327 method:org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:329)
Starting task: attempt_local1894406223_0001_r_000004_0
[INFO ] 2019-02-27 20:57:31,329 method:org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.<init>(FileOutputCommitter.java:123)
File Output Committer Algorithm version is 1
[INFO ] 2019-02-27 20:57:31,330 method:org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.<init>(FileOutputCommitter.java:138)
FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
[INFO ] 2019-02-27 20:57:31,331 method:org.apache.hadoop.yarn.util.ProcfsBasedProcessTree.isAvailable(ProcfsBasedProcessTree.java:168)
ProcfsBasedProcessTree currently is supported only on Linux.
[INFO ] 2019-02-27 20:57:31,487 method:org.apache.hadoop.mapred.Task.initialize(Task.java:619)
 Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@620882f4
[INFO ] 2019-02-27 20:57:31,487 method:org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:362)
Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@49584eb8
[INFO ] 2019-02-27 20:57:31,489 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.<init>(MergeManagerImpl.java:206)
MergerManager: memoryLimit=1323407744, maxSingleShuffleLimit=330851936, mergeThreshold=873449152, ioSortFactor=10, memToMemMergeOutputsThreshold=10
[INFO ] 2019-02-27 20:57:31,491 method:org.apache.hadoop.mapreduce.task.reduce.EventFetcher.run(EventFetcher.java:61)
attempt_local1894406223_0001_r_000004_0 Thread started: EventFetcher for fetching Map Completion Events
[INFO ] 2019-02-27 20:57:31,493 method:org.apache.hadoop.mapreduce.Job.monitorAndPrintJob(Job.java:1418)
 map 100% reduce 100%
[INFO ] 2019-02-27 20:57:31,503 method:org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.copyMapOutput(LocalFetcher.java:145)
localfetcher#5 about to shuffle output of map attempt_local1894406223_0001_m_000000_0 decomp: 158 len: 162 to MEMORY
[INFO ] 2019-02-27 20:57:31,505 method:org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.doShuffle(InMemoryMapOutput.java:93)
Read 158 bytes from map-output for attempt_local1894406223_0001_m_000000_0
[INFO ] 2019-02-27 20:57:31,505 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.closeInMemoryFile(MergeManagerImpl.java:321)
closeInMemoryFile -> map-output of size: 158, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->158
[INFO ] 2019-02-27 20:57:31,506 method:org.apache.hadoop.mapreduce.task.reduce.EventFetcher.run(EventFetcher.java:76)
EventFetcher is interrupted.. Returning
[INFO ] 2019-02-27 20:57:31,508 method:org.apache.hadoop.mapred.LocalJobRunner$Job.statusUpdate(LocalJobRunner.java:618)
1 / 1 copied.
[INFO ] 2019-02-27 20:57:31,508 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.finalMerge(MergeManagerImpl.java:693)
finalMerge called with 1 in-memory map-outputs and 0 on-disk map-outputs
[INFO ] 2019-02-27 20:57:31,535 method:org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:606)
Merging 1 sorted segments
[INFO ] 2019-02-27 20:57:31,536 method:org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:705)
Down to the last merge-pass, with 1 segments left of total size: 144 bytes
[INFO ] 2019-02-27 20:57:31,540 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.finalMerge(MergeManagerImpl.java:760)
Merged 1 segments, 158 bytes to disk to satisfy reduce memory limit
[INFO ] 2019-02-27 20:57:31,543 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.finalMerge(MergeManagerImpl.java:790)
Merging 1 files, 162 bytes from disk
[INFO ] 2019-02-27 20:57:31,543 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.finalMerge(MergeManagerImpl.java:805)
Merging 0 segments, 0 bytes from memory into reduce
[INFO ] 2019-02-27 20:57:31,543 method:org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:606)
Merging 1 sorted segments
[INFO ] 2019-02-27 20:57:31,545 method:org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:705)
Down to the last merge-pass, with 1 segments left of total size: 144 bytes
[INFO ] 2019-02-27 20:57:31,545 method:org.apache.hadoop.mapred.LocalJobRunner$Job.statusUpdate(LocalJobRunner.java:618)
1 / 1 copied.
[INFO ] 2019-02-27 20:57:31,557 method:org.apache.hadoop.mapred.Task.done(Task.java:1099)
Task:attempt_local1894406223_0001_r_000004_0 is done. And is in the process of committing
[INFO ] 2019-02-27 20:57:31,560 method:org.apache.hadoop.mapred.LocalJobRunner$Job.statusUpdate(LocalJobRunner.java:618)
1 / 1 copied.
[INFO ] 2019-02-27 20:57:31,560 method:org.apache.hadoop.mapred.Task.commit(Task.java:1260)
Task attempt_local1894406223_0001_r_000004_0 is allowed to commit now
[INFO ] 2019-02-27 20:57:31,566 method:org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.commitTask(FileOutputCommitter.java:582)
Saved output of task 'attempt_local1894406223_0001_r_000004_0' to file:/F:/hadoop-2.8.1/data/flow/output2/_temporary/0/task_local1894406223_0001_r_000004
[INFO ] 2019-02-27 20:57:31,568 method:org.apache.hadoop.mapred.LocalJobRunner$Job.statusUpdate(LocalJobRunner.java:618)
reduce > reduce
[INFO ] 2019-02-27 20:57:31,568 method:org.apache.hadoop.mapred.Task.sendDone(Task.java:1219)
Task 'attempt_local1894406223_0001_r_000004_0' done.
[INFO ] 2019-02-27 20:57:31,568 method:org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:352)
Finishing task: attempt_local1894406223_0001_r_000004_0
[INFO ] 2019-02-27 20:57:31,568 method:org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:329)
Starting task: attempt_local1894406223_0001_r_000005_0
[INFO ] 2019-02-27 20:57:31,571 method:org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.<init>(FileOutputCommitter.java:123)
File Output Committer Algorithm version is 1
[INFO ] 2019-02-27 20:57:31,571 method:org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.<init>(FileOutputCommitter.java:138)
FileOutputCommitter skip cleanup _temporary folders under output directory:false, ignore cleanup failures: false
[INFO ] 2019-02-27 20:57:31,572 method:org.apache.hadoop.yarn.util.ProcfsBasedProcessTree.isAvailable(ProcfsBasedProcessTree.java:168)
ProcfsBasedProcessTree currently is supported only on Linux.
[INFO ] 2019-02-27 20:57:31,725 method:org.apache.hadoop.mapred.Task.initialize(Task.java:619)
 Using ResourceCalculatorProcessTree : org.apache.hadoop.yarn.util.WindowsBasedProcessTree@2a2d6de4
[INFO ] 2019-02-27 20:57:31,726 method:org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:362)
Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@c945577
[INFO ] 2019-02-27 20:57:31,729 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.<init>(MergeManagerImpl.java:206)
MergerManager: memoryLimit=1323407744, maxSingleShuffleLimit=330851936, mergeThreshold=873449152, ioSortFactor=10, memToMemMergeOutputsThreshold=10
[INFO ] 2019-02-27 20:57:31,730 method:org.apache.hadoop.mapreduce.task.reduce.EventFetcher.run(EventFetcher.java:61)
attempt_local1894406223_0001_r_000005_0 Thread started: EventFetcher for fetching Map Completion Events
[INFO ] 2019-02-27 20:57:31,742 method:org.apache.hadoop.mapreduce.task.reduce.LocalFetcher.copyMapOutput(LocalFetcher.java:145)
localfetcher#6 about to shuffle output of map attempt_local1894406223_0001_m_000000_0 decomp: 269 len: 273 to MEMORY
[INFO ] 2019-02-27 20:57:31,744 method:org.apache.hadoop.mapreduce.task.reduce.InMemoryMapOutput.doShuffle(InMemoryMapOutput.java:93)
Read 269 bytes from map-output for attempt_local1894406223_0001_m_000000_0
[INFO ] 2019-02-27 20:57:31,745 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.closeInMemoryFile(MergeManagerImpl.java:321)
closeInMemoryFile -> map-output of size: 269, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->269
[INFO ] 2019-02-27 20:57:31,746 method:org.apache.hadoop.mapreduce.task.reduce.EventFetcher.run(EventFetcher.java:76)
EventFetcher is interrupted.. Returning
[INFO ] 2019-02-27 20:57:31,747 method:org.apache.hadoop.mapred.LocalJobRunner$Job.statusUpdate(LocalJobRunner.java:618)
1 / 1 copied.
[INFO ] 2019-02-27 20:57:31,747 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.finalMerge(MergeManagerImpl.java:693)
finalMerge called with 1 in-memory map-outputs and 0 on-disk map-outputs
[INFO ] 2019-02-27 20:57:31,770 method:org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:606)
Merging 1 sorted segments
[INFO ] 2019-02-27 20:57:31,770 method:org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:705)
Down to the last merge-pass, with 1 segments left of total size: 255 bytes
[INFO ] 2019-02-27 20:57:31,774 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.finalMerge(MergeManagerImpl.java:760)
Merged 1 segments, 269 bytes to disk to satisfy reduce memory limit
[INFO ] 2019-02-27 20:57:31,776 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.finalMerge(MergeManagerImpl.java:790)
Merging 1 files, 273 bytes from disk
[INFO ] 2019-02-27 20:57:31,776 method:org.apache.hadoop.mapreduce.task.reduce.MergeManagerImpl.finalMerge(MergeManagerImpl.java:805)
Merging 0 segments, 0 bytes from memory into reduce
[INFO ] 2019-02-27 20:57:31,776 method:org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:606)
Merging 1 sorted segments
[INFO ] 2019-02-27 20:57:31,778 method:org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:705)
Down to the last merge-pass, with 1 segments left of total size: 255 bytes
[INFO ] 2019-02-27 20:57:31,779 method:org.apache.hadoop.mapred.LocalJobRunner$Job.statusUpdate(LocalJobRunner.java:618)
1 / 1 copied.
[INFO ] 2019-02-27 20:57:31,793 method:org.apache.hadoop.mapred.Task.done(Task.java:1099)
Task:attempt_local1894406223_0001_r_000005_0 is done. And is in the process of committing
[INFO ] 2019-02-27 20:57:31,796 method:org.apache.hadoop.mapred.LocalJobRunner$Job.statusUpdate(LocalJobRunner.java:618)
1 / 1 copied.
[INFO ] 2019-02-27 20:57:31,797 method:org.apache.hadoop.mapred.Task.commit(Task.java:1260)
Task attempt_local1894406223_0001_r_000005_0 is allowed to commit now
[INFO ] 2019-02-27 20:57:31,804 method:org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter.commitTask(FileOutputCommitter.java:582)
Saved output of task 'attempt_local1894406223_0001_r_000005_0' to file:/F:/hadoop-2.8.1/data/flow/output2/_temporary/0/task_local1894406223_0001_r_000005
[INFO ] 2019-02-27 20:57:31,806 method:org.apache.hadoop.mapred.LocalJobRunner$Job.statusUpdate(LocalJobRunner.java:618)
reduce > reduce
[INFO ] 2019-02-27 20:57:31,806 method:org.apache.hadoop.mapred.Task.sendDone(Task.java:1219)
Task 'attempt_local1894406223_0001_r_000005_0' done.
[INFO ] 2019-02-27 20:57:31,806 method:org.apache.hadoop.mapred.LocalJobRunner$Job$ReduceTaskRunnable.run(LocalJobRunner.java:352)
Finishing task: attempt_local1894406223_0001_r_000005_0
[INFO ] 2019-02-27 20:57:31,807 method:org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:483)
reduce task executor complete.
[INFO ] 2019-02-27 20:57:32,495 method:org.apache.hadoop.mapreduce.Job.monitorAndPrintJob(Job.java:1429)
Job job_local1894406223_0001 completed successfully
[INFO ] 2019-02-27 20:57:32,536 method:org.apache.hadoop.mapreduce.Job.monitorAndPrintJob(Job.java:1436)
Counters: 30
	File System Counters
		FILE: Number of bytes read=36824
		FILE: Number of bytes written=2256448
		FILE: Number of read operations=0
		FILE: Number of large read operations=0
		FILE: Number of write operations=0
	Map-Reduce Framework
		Map input records=22
		Map output records=22
		Map output bytes=808
		Map output materialized bytes=888
		Input split bytes=111
		Combine input records=0
		Combine output records=0
		Reduce input groups=21
		Reduce shuffle bytes=888
		Reduce input records=22
		Reduce output records=21
		Spilled Records=44
		Shuffled Maps =6
		Failed Shuffles=0
		Merged Map outputs=6
		GC time elapsed (ms)=11
		Total committed heap usage (bytes)=1640497152
	Shuffle Errors
		BAD_ID=0
		CONNECTION=0
		IO_ERROR=0
		WRONG_LENGTH=0
		WRONG_MAP=0
		WRONG_REDUCE=0
	File Input Format Counters 
		Bytes Read=2226
	File Output Format Counters 
		Bytes Written=872

Reposted from blog.csdn.net/u014106644/article/details/87989491