
Debugging Hadoop MapReduce in Eclipse on Windows 10 x64

DBILITY 2016. 10. 2. 15:13

WindowsLocalFileSystem.java is taken from https://bigdatanerd.wordpress.com/2013/11/14/.
There really are a lot of talented people out there.
Experts no doubt know better approaches, but from a beginner's point of view this feels like a stroke of genius.
With it, you can run MapReduce jobs directly from Eclipse while developing on Windows.

Add the following dependencies to your pom.xml:
hadoop-core 1.2.1
slf4j-api, jcl-over-slf4j 1.7.20
logback-classic 1.1.7
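
In pom.xml these would correspond to something like the following (a minimal sketch, assuming the usual Maven Central coordinates for these artifacts):

    <dependencies>
        <!-- Hadoop 1.x core (old and new MapReduce APIs) -->
        <dependency>
            <groupId>org.apache.hadoop</groupId>
            <artifactId>hadoop-core</artifactId>
            <version>1.2.1</version>
        </dependency>
        <!-- SLF4J API and the commons-logging bridge used by Hadoop -->
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-api</artifactId>
            <version>1.7.20</version>
        </dependency>
        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>jcl-over-slf4j</artifactId>
            <version>1.7.20</version>
        </dependency>
        <!-- Logback as the SLF4J backend -->
        <dependency>
            <groupId>ch.qos.logback</groupId>
            <artifactId>logback-classic</artifactId>
            <version>1.1.7</version>
        </dependency>
    </dependencies>

Since jcl-over-slf4j replaces commons-logging, you may also want to exclude commons-logging from the hadoop-core dependency.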

 

If there is no logback.xml on the classpath, logging runs at the debug level (see the configuration note in step 5).

  1. WindowsLocalFileSystem.java
    package com.dbility.hadoop;
    /*
     * @author bigdatanerd
     * @url https://bigdatanerd.wordpress.com/2013/11/14/
     *
     * */
    import java.io.IOException;
    
    import org.apache.hadoop.fs.LocalFileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.fs.permission.FsPermission;
    
    public class WindowsLocalFileSystem extends LocalFileSystem {
    
    	public WindowsLocalFileSystem() {
    		super();
    	}
    
    	public boolean mkdirs(final Path path, final FsPermission permission)
    			throws IOException {
    		final boolean result = super.mkdirs(path);
    		this.setPermission(path, permission);
    		return result;
    	}
    
    	public void setPermission(final Path path, final FsPermission permission)
    			throws IOException {
    		try {
    			super.setPermission(path, permission);
    		} catch (final IOException e) {
    			System.err.println("Cant help it, hence ignoring IOException setting persmission for path \""
    							+ path + "\": " + e.getMessage());
    		}
    	}
    }
    In testing, the job keeps running even when these file-system permission errors occur.
  2. Driver class WordCount.java
    The WindowsLocalFileSystem author's example has the driver implement Tool, extend Configured, and run everything through ToolRunner.run.
    If you instead write a plain main() as below, a warning like line 16 of the execution log in step 5 is printed, but the job still runs.
    Checking the JobClient source, the warning is suppressed when the mapred.used.genericoptionsparser option is set to true.
    According to the Hadoop 2.6.4 and 2.7.2 documentation, this option has since been renamed to mapreduce.client.genericoptionsparser.used.
    If you use GenericOptionsParser, you can override existing Hadoop configuration options with -D options in the Eclipse Run Configuration, and GenericOptionsParser then sets mapred.used.genericoptionsparser to true automatically.
    For now, for testing, we use the plain driver as-is (sketches of the mapper, reducer, and the Tool/ToolRunner variant appear after the driver listing below).
    Add the contents of lines 31-32 of the listing to the driver class you want to debug.
    For the memory settings, refer to the Hadoop documentation.
    package com.dbility.hadoop;
     
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;
     
    public class WordCount {
     
        private static final Logger LOG = LoggerFactory.getLogger(WordCount.class);
     
        public static void main(String[] args) throws Exception {
     
            if ( args.length != 2 ) {
                LOG.info("Usage : <input> <output>");
                Runtime.getRuntime().exit(2);
            }
     
            Configuration conf = new Configuration();
     
            conf.set("fs.default.name", "file:///");
            conf.set("mapred.job.tracker", "local");
            conf.set("fs.file.impl", "com.dbility.hadoop.WindowsLocalFileSystem");
            conf.set("io.serializations", "org.apache.hadoop.io.serializer.JavaSerialization,org.apache.hadoop.io.serializer.WritableSerialization");
            conf.set("mapred.map.child.java.opts", "-Xmx1024m");
            conf.set("io.sort.mb", "512");
     
            Job job = new Job(conf,"wordcount");
     
            job.setJarByClass(WordCount.class);
            job.setMapperClass(WordCountMapper.class);
            job.setReducerClass(WordCountReducer.class);
     
            job.setInputFormatClass(TextInputFormat.class);
            job.setOutputFormatClass(TextOutputFormat.class);
     
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);
     
            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));
     
            FileSystem hdfs = FileSystem.get(conf);
            Path path = new Path(args[1]);
            if ( hdfs.exists(path) ) hdfs.delete(path, true); // delete the previous output directory if it exists
     
            Runtime.getRuntime().exit( job.waitForCompletion(true) ? 0 : 1 );
     
        }
     
    }
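    The WordCountMapper and WordCountReducer classes used above are not listed in this post. Below is a minimal sketch of what they could look like, written against the new (org.apache.hadoop.mapreduce) API and producing the "token word :", "key :" and "count :" messages seen in the execution log; the class names and log messages come from that log, the rest is an assumption.
    package com.dbility.hadoop;

    import java.io.IOException;
    import java.util.StringTokenizer;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.LongWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Mapper;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

        private static final Logger LOG = LoggerFactory.getLogger(WordCountMapper.class);

        private static final IntWritable ONE = new IntWritable(1);
        private final Text word = new Text();

        @Override
        protected void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException {
            // split the input line into tokens and emit (token, 1)
            StringTokenizer tokenizer = new StringTokenizer(value.toString());
            while (tokenizer.hasMoreTokens()) {
                String token = tokenizer.nextToken();
                LOG.info("token word : {}", token); // matches the log lines in step 5
                word.set(token);
                context.write(word, ONE);
            }
        }
    }
    package com.dbility.hadoop;

    import java.io.IOException;

    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Reducer;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    public class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

        private static final Logger LOG = LoggerFactory.getLogger(WordCountReducer.class);

        @Override
        protected void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException {
            // sum the counts for each word
            int sum = 0;
            for (IntWritable value : values) {
                sum += value.get();
            }
            LOG.info("key : {}", key);
            LOG.info("count : {}", sum);
            context.write(key, new IntWritable(sum));
        }
    }
    And a sketch of the Tool / Configured / ToolRunner.run variant mentioned above, which lets GenericOptionsParser handle -D options from the Run Configuration (the class name WordCountTool and the reduced set of conf.set calls here are illustrative assumptions):
    package com.dbility.hadoop;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.conf.Configured;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.io.IntWritable;
    import org.apache.hadoop.io.Text;
    import org.apache.hadoop.mapreduce.Job;
    import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
    import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
    import org.apache.hadoop.util.Tool;
    import org.apache.hadoop.util.ToolRunner;

    public class WordCountTool extends Configured implements Tool {

        public int run(String[] args) throws Exception {
            // getConf() already reflects any -D options parsed by GenericOptionsParser;
            // the other conf.set(...) calls from the plain driver can be kept here as well
            Configuration conf = getConf();
            conf.set("fs.file.impl", "com.dbility.hadoop.WindowsLocalFileSystem");

            Job job = new Job(conf, "wordcount");
            job.setJarByClass(WordCountTool.class);
            job.setMapperClass(WordCountMapper.class);
            job.setReducerClass(WordCountReducer.class);
            job.setOutputKeyClass(Text.class);
            job.setOutputValueClass(IntWritable.class);

            FileInputFormat.addInputPath(job, new Path(args[0]));
            FileOutputFormat.setOutputPath(job, new Path(args[1]));

            // delete the previous output directory if it exists
            FileSystem fs = FileSystem.get(conf);
            Path output = new Path(args[1]);
            if (fs.exists(output)) fs.delete(output, true);

            return job.waitForCompletion(true) ? 0 : 1;
        }

        public static void main(String[] args) throws Exception {
            System.exit(ToolRunner.run(new Configuration(), new WordCountTool(), args));
        }
    }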
  3. Type Hello World! into input.txt and save it.
  4. Set the input and output paths in the Run Configuration and run.
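    Judging from the paths that appear in the execution log below, the program arguments would look something like this (the d:/hadoop_test paths are taken from that log; use your own directories):

        d:/hadoop_test/input.txt d:/hadoop_test/output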
  5. Execution log
    If you add a logger for org.apache.hadoop.mapred.JobClient to logback.xml and send it to the console at the info level,
    you can limit the console output to the Map/Reduce progress and the final result counters.
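    A minimal logback.xml sketch for that setup (the appender name, pattern, and the warn root level are assumptions, not from the original post); note that the full log below was captured at the default debug level, without this filtering:

    <?xml version="1.0" encoding="UTF-8"?>
    <configuration>
        <appender name="CONSOLE" class="ch.qos.logback.core.ConsoleAppender">
            <encoder>
                <pattern>%d{HH:mm:ss.SSS} [%thread] %-5level %logger - %msg%n</pattern>
            </encoder>
        </appender>
        <!-- print only Map/Reduce progress and the final counters -->
        <logger name="org.apache.hadoop.mapred.JobClient" level="info" />
        <root level="warn">
            <appender-ref ref="CONSOLE" />
        </root>
    </configuration>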


    16:27:31.544 [main] DEBUG org.apache.hadoop.security.Groups -  Creating new Groups object
    16:27:31.560 [main] DEBUG org.apache.hadoop.security.Groups - Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000
    16:27:31.638 [main] DEBUG org.apache.hadoop.security.UserGroupInformation - hadoop login
    16:27:31.638 [main] DEBUG org.apache.hadoop.security.UserGroupInformation - hadoop login commit
    16:27:31.653 [main] DEBUG org.apache.hadoop.security.UserGroupInformation - using local user:NTUserPrincipal: ROOKIE
    16:27:31.653 [main] DEBUG org.apache.hadoop.security.UserGroupInformation - UGI loginUser:ROOKIE
    16:27:31.669 [main] DEBUG org.apache.hadoop.fs.FileSystem - Creating filesystem for file:///
    16:27:31.685 [main] DEBUG org.apache.hadoop.security.UserGroupInformation - PriviledgedAction as:ROOKIE from:org.apache.hadoop.mapreduce.Job.connect(Job.java:561)
    16:27:31.716 [main] DEBUG org.apache.hadoop.security.UserGroupInformation - PriviledgedAction as:ROOKIE from:org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:936)
    16:27:31.716 [main] DEBUG org.apache.hadoop.util.NativeCodeLoader - Trying to load the custom-built native-hadoop library...
    16:27:31.732 [main] DEBUG org.apache.hadoop.util.NativeCodeLoader - Failed to load native-hadoop with error: java.lang.UnsatisfiedLinkError: no hadoop in java.library.path
    16:27:31.732 [main] DEBUG org.apache.hadoop.util.NativeCodeLoader - java.library.path=C:\Dev64_Kepler\tools\jdk\jdk170\bin;C:\WINDOWS\Sun\Java\bin;C:\WINDOWS\system32;C:\WINDOWS;C:\Program Files\Intel\iCLS Client\;C:\WINDOWS\system32;C:\WINDOWS;C:\WINDOWS\System32\Wbem;C:\WINDOWS\System32\WindowsPowerShell\v1.0\;C:\Program Files\Intel\WiFi\bin\;C:\Program Files\Common Files\Intel\WirelessCommon\;C:\Program Files\Intel\Intel(R) Management Engine Components\DAL;C:\Program Files\Intel\Intel(R) Management Engine Components\IPT;C:\Program Files (x86)\Intel\Intel(R) Management Engine Components\DAL;C:\Program Files (x86)\Intel\Intel(R) Management Engine Components\IPT;C:\Program Files\TortoiseSVN\bin;C:\Program Files\Intel\WiFi\bin\;C:\Program Files\Common Files\Intel\WirelessCommon\;C:\Users\ROOKIE\AppData\Local\Microsoft\WindowsApps;.
    16:27:31.732 [main] WARN org.apache.hadoop.util.NativeCodeLoader - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
    Cant help it, hence ignoring IOException setting persmission for path "file:/tmp/hadoop-ROOKIE/mapred/staging/ROOKIE2024676219/.staging": Failed to set permissions of path: \tmp\hadoop-ROOKIE\mapred\staging\ROOKIE2024676219\.staging to 0700
    16:27:31.732 [main] DEBUG org.apache.hadoop.mapred.JobClient - adding the following namenodes' delegation tokens:null
    16:27:31.732 [main] WARN org.apache.hadoop.mapred.JobClient - Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
    16:27:31.732 [main] DEBUG org.apache.hadoop.mapred.JobClient - default FileSystem: file:///
    Cant help it, hence ignoring IOException setting persmission for path "file:/tmp/hadoop-ROOKIE/mapred/staging/ROOKIE2024676219/.staging/job_local2024676219_0001": Failed to set permissions of path: \tmp\hadoop-ROOKIE\mapred\staging\ROOKIE2024676219\.staging\job_local2024676219_0001 to 0700
    16:27:31.732 [main] WARN org.apache.hadoop.mapred.JobClient - No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
    16:27:31.732 [main] DEBUG org.apache.hadoop.mapred.JobClient - Creating splits at file:/tmp/hadoop-ROOKIE/mapred/staging/ROOKIE2024676219/.staging/job_local2024676219_0001
    16:27:31.732 [main] INFO org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total input paths to process : 1
    16:27:31.747 [main] WARN org.apache.hadoop.io.compress.snappy.LoadSnappy - Snappy native library not loaded
    16:27:31.747 [main] DEBUG org.apache.hadoop.mapreduce.lib.input.FileInputFormat - Total # of splits: 1
    Cant help it, hence ignoring IOException setting persmission for path "file:/tmp/hadoop-ROOKIE/mapred/staging/ROOKIE2024676219/.staging/job_local2024676219_0001/job.split": Failed to set permissions of path: \tmp\hadoop-ROOKIE\mapred\staging\ROOKIE2024676219\.staging\job_local2024676219_0001\job.split to 0644
    Cant help it, hence ignoring IOException setting persmission for path "file:/tmp/hadoop-ROOKIE/mapred/staging/ROOKIE2024676219/.staging/job_local2024676219_0001/job.splitmetainfo": Failed to set permissions of path: \tmp\hadoop-ROOKIE\mapred\staging\ROOKIE2024676219\.staging\job_local2024676219_0001\job.splitmetainfo to 0644
    Cant help it, hence ignoring IOException setting persmission for path "file:/tmp/hadoop-ROOKIE/mapred/staging/ROOKIE2024676219/.staging/job_local2024676219_0001/job.xml": Failed to set permissions of path: \tmp\hadoop-ROOKIE\mapred\staging\ROOKIE2024676219\.staging\job_local2024676219_0001\job.xml to 0644
    16:27:31.841 [main] DEBUG org.apache.hadoop.mapred.JobClient - Printing tokens for job: job_local2024676219_0001
    16:27:31.857 [main] DEBUG org.apache.hadoop.mapred.TaskRunner - Fully deleting contents of C:\tmp\hadoop-ROOKIE\mapred\local\localRunner
    16:27:31.935 [main] INFO org.apache.hadoop.mapred.JobClient - Running job: job_local2024676219_0001
    16:27:31.966 [Thread-2] DEBUG org.apache.hadoop.mapred.LocalJobRunner - Starting thread pool executor.
    16:27:31.982 [Thread-2] DEBUG org.apache.hadoop.mapred.LocalJobRunner - Max local threads: 1
    16:27:31.982 [Thread-2] DEBUG org.apache.hadoop.mapred.LocalJobRunner - Map tasks to process: 1
    16:27:31.982 [Thread-2] INFO org.apache.hadoop.mapred.LocalJobRunner - Waiting for map tasks
    16:27:31.982 [pool-1-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner - Starting task: attempt_local2024676219_0001_m_000000_0
    16:27:31.982 [pool-1-thread-1] DEBUG org.apache.hadoop.mapred.SortedRanges - currentIndex 0   0:0
    16:27:31.997 [pool-1-thread-1] DEBUG org.apache.hadoop.mapred.Counters - Creating group org.apache.hadoop.mapred.Task$Counter with bundle
    16:27:31.997 [pool-1-thread-1] DEBUG org.apache.hadoop.mapred.Counters - Adding SPILLED_RECORDS
    16:27:32.013 [pool-1-thread-1] DEBUG org.apache.hadoop.mapred.TaskRunner - mapred.local.dir for child : /tmp/hadoop-ROOKIE/mapred/local/taskTracker/ROOKIE/jobcache/job_local2024676219_0001/attempt_local2024676219_0001_m_000000_0
    16:27:32.013 [pool-1-thread-1] DEBUG org.apache.hadoop.mapred.Task - using new api for output committer
    16:27:32.028 [pool-1-thread-1] INFO org.apache.hadoop.mapred.Task -  Using ResourceCalculatorPlugin : null
    16:27:32.028 [pool-1-thread-1] DEBUG org.apache.hadoop.mapred.Counters - Adding SPLIT_RAW_BYTES
    16:27:32.028 [pool-1-thread-1] INFO org.apache.hadoop.mapred.MapTask - Processing split: file:/d:/hadoop_test/input.txt:0+15
    16:27:32.028 [pool-1-thread-1] DEBUG org.apache.hadoop.mapred.Counters - Adding MAP_INPUT_RECORDS
    16:27:32.028 [pool-1-thread-1] DEBUG org.apache.hadoop.mapred.Counters - Creating group org.apache.hadoop.mapreduce.lib.input.FileInputFormat$Counter with bundle
    16:27:32.028 [pool-1-thread-1] DEBUG org.apache.hadoop.mapred.Counters - Adding BYTES_READ
    16:27:32.044 [pool-1-thread-1] INFO org.apache.hadoop.mapred.MapTask - io.sort.mb = 100
    16:27:32.090 [pool-1-thread-1] INFO org.apache.hadoop.mapred.MapTask - data buffer = 79691776/99614720
    16:27:32.090 [pool-1-thread-1] INFO org.apache.hadoop.mapred.MapTask - record buffer = 262144/327680
    16:27:32.090 [pool-1-thread-1] DEBUG org.apache.hadoop.mapred.Counters - Adding MAP_OUTPUT_BYTES
    16:27:32.090 [pool-1-thread-1] DEBUG org.apache.hadoop.mapred.Counters - Adding MAP_OUTPUT_RECORDS
    16:27:32.090 [pool-1-thread-1] DEBUG org.apache.hadoop.mapred.Counters - Adding COMBINE_INPUT_RECORDS
    16:27:32.090 [pool-1-thread-1] DEBUG org.apache.hadoop.mapred.Counters - Adding COMBINE_OUTPUT_RECORDS
    16:27:32.090 [pool-1-thread-1] DEBUG org.apache.hadoop.mapred.Counters - Adding MAP_OUTPUT_MATERIALIZED_BYTES
    16:27:32.105 [pool-1-thread-1] INFO com.dbility.hadoop.WordCountMapper - token word : Hello
    16:27:32.105 [pool-1-thread-1] INFO com.dbility.hadoop.WordCountMapper - token word : World!
    16:27:32.105 [pool-1-thread-1] INFO org.apache.hadoop.mapred.MapTask - Starting flush of map output
    16:27:32.121 [pool-1-thread-1] INFO org.apache.hadoop.mapred.MapTask - Finished spill 0
    16:27:32.121 [pool-1-thread-1] INFO org.apache.hadoop.mapred.Task - Task:attempt_local2024676219_0001_m_000000_0 is done. And is in the process of commiting
    16:27:32.137 [pool-1-thread-1] DEBUG org.apache.hadoop.mapred.Counters - Creating group FileSystemCounters with nothing
    16:27:32.137 [pool-1-thread-1] DEBUG org.apache.hadoop.mapred.Counters - Adding FILE_BYTES_READ
    16:27:32.137 [pool-1-thread-1] DEBUG org.apache.hadoop.mapred.Counters - Adding FILE_BYTES_WRITTEN
    16:27:32.137 [pool-1-thread-1] DEBUG org.apache.hadoop.mapred.Counters - Adding COMMITTED_HEAP_BYTES
    16:27:32.137 [pool-1-thread-1] DEBUG org.apache.hadoop.mapred.Counters - Creating group FileSystemCounters with nothing
    16:27:32.137 [pool-1-thread-1] DEBUG org.apache.hadoop.mapred.Counters - Creating group org.apache.hadoop.mapreduce.lib.input.FileInputFormat$Counter with bundle
    16:27:32.137 [pool-1-thread-1] DEBUG org.apache.hadoop.mapred.Counters - Creating group org.apache.hadoop.mapred.Task$Counter with bundle
    16:27:32.137 [pool-1-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner -
    16:27:32.137 [pool-1-thread-1] INFO org.apache.hadoop.mapred.Task - Task 'attempt_local2024676219_0001_m_000000_0' done.
    16:27:32.137 [pool-1-thread-1] INFO org.apache.hadoop.mapred.LocalJobRunner - Finishing task: attempt_local2024676219_0001_m_000000_0
    16:27:32.137 [Thread-2] INFO org.apache.hadoop.mapred.LocalJobRunner - Map task executor complete.
    16:27:32.137 [Thread-2] DEBUG org.apache.hadoop.mapred.SortedRanges - currentIndex 0   0:0
    16:27:32.137 [Thread-2] DEBUG org.apache.hadoop.mapred.Counters - Creating group org.apache.hadoop.mapred.Task$Counter with bundle
    16:27:32.137 [Thread-2] DEBUG org.apache.hadoop.mapred.Counters - Adding SPILLED_RECORDS
    16:27:32.137 [Thread-2] DEBUG org.apache.hadoop.mapred.Counters - Adding REDUCE_SHUFFLE_BYTES
    16:27:32.137 [Thread-2] DEBUG org.apache.hadoop.mapred.Counters - Adding REDUCE_INPUT_GROUPS
    16:27:32.137 [Thread-2] DEBUG org.apache.hadoop.mapred.Counters - Adding REDUCE_INPUT_RECORDS
    16:27:32.137 [Thread-2] DEBUG org.apache.hadoop.mapred.Counters - Adding REDUCE_OUTPUT_RECORDS
    16:27:32.137 [Thread-2] DEBUG org.apache.hadoop.mapred.Counters - Adding COMBINE_OUTPUT_RECORDS
    16:27:32.137 [Thread-2] DEBUG org.apache.hadoop.mapred.TaskRunner - mapred.local.dir for child : /tmp/hadoop-ROOKIE/mapred/local/taskTracker/ROOKIE/jobcache/job_local2024676219_0001/attempt_local2024676219_0001_r_000000_0
    16:27:32.152 [Thread-2] DEBUG org.apache.hadoop.mapred.Task - using new api for output committer
    16:27:32.152 [Thread-2] INFO org.apache.hadoop.mapred.Task -  Using ResourceCalculatorPlugin : null
    16:27:32.152 [Thread-2] DEBUG org.apache.hadoop.mapred.Counters - Creating group org.apache.hadoop.mapred.Task$Counter with bundle
    16:27:32.152 [Thread-2] INFO org.apache.hadoop.mapred.LocalJobRunner -
    16:27:32.152 [Thread-2] INFO org.apache.hadoop.mapred.Merger - Merging 1 sorted segments
    16:27:32.168 [Thread-2] INFO org.apache.hadoop.mapred.Merger - Down to the last merge-pass, with 1 segments left of total size: 30 bytes
    16:27:32.168 [Thread-2] DEBUG org.apache.hadoop.mapred.Counters - Creating group org.apache.hadoop.mapred.Task$Counter with bundle
    16:27:32.168 [Thread-2] INFO org.apache.hadoop.mapred.LocalJobRunner -
    16:27:32.168 [Thread-2] DEBUG org.apache.hadoop.mapred.Counters - Creating group org.apache.hadoop.mapreduce.lib.output.FileOutputFormat$Counter with bundle
    16:27:32.168 [Thread-2] DEBUG org.apache.hadoop.mapred.Counters - Adding BYTES_WRITTEN
    16:27:32.184 [Thread-2] INFO com.dbility.hadoop.WordCountReducer - key : World!
    16:27:32.184 [Thread-2] INFO com.dbility.hadoop.WordCountReducer - count : 1
    16:27:32.184 [Thread-2] INFO com.dbility.hadoop.WordCountReducer - key : Hello
    16:27:32.184 [Thread-2] INFO com.dbility.hadoop.WordCountReducer - count : 1
    16:27:32.184 [Thread-2] INFO org.apache.hadoop.mapred.Task - Task:attempt_local2024676219_0001_r_000000_0 is done. And is in the process of commiting
    16:27:32.184 [Thread-2] DEBUG org.apache.hadoop.mapred.Counters - Creating group FileSystemCounters with nothing
    16:27:32.184 [Thread-2] DEBUG org.apache.hadoop.mapred.Counters - Adding FILE_BYTES_READ
    16:27:32.184 [Thread-2] DEBUG org.apache.hadoop.mapred.Counters - Adding FILE_BYTES_WRITTEN
    16:27:32.184 [Thread-2] DEBUG org.apache.hadoop.mapred.Counters - Adding COMMITTED_HEAP_BYTES
    16:27:32.184 [Thread-2] DEBUG org.apache.hadoop.mapred.Counters - Creating group org.apache.hadoop.mapreduce.lib.output.FileOutputFormat$Counter with bundle
    16:27:32.184 [Thread-2] DEBUG org.apache.hadoop.mapred.Counters - Creating group FileSystemCounters with nothing
    16:27:32.184 [Thread-2] DEBUG org.apache.hadoop.mapred.Counters - Creating group org.apache.hadoop.mapred.Task$Counter with bundle
    16:27:32.184 [Thread-2] INFO org.apache.hadoop.mapred.LocalJobRunner -
    16:27:32.184 [Thread-2] INFO org.apache.hadoop.mapred.Task - Task attempt_local2024676219_0001_r_000000_0 is allowed to commit now
    16:27:32.184 [Thread-2] DEBUG org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - Moved file:/d:/hadoop_test/output/_temporary/_attempt_local2024676219_0001_r_000000_0/part-r-00000 to d:/hadoop_test/output/part-r-00000
    16:27:32.184 [Thread-2] INFO org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter - Saved output of task 'attempt_local2024676219_0001_r_000000_0' to d:/hadoop_test/output
    16:27:32.184 [Thread-2] DEBUG org.apache.hadoop.mapred.Counters - Creating group org.apache.hadoop.mapreduce.lib.output.FileOutputFormat$Counter with bundle
    16:27:32.184 [Thread-2] DEBUG org.apache.hadoop.mapred.Counters - Creating group FileSystemCounters with nothing
    16:27:32.184 [Thread-2] DEBUG org.apache.hadoop.mapred.Counters - Creating group org.apache.hadoop.mapred.Task$Counter with bundle
    16:27:32.184 [Thread-2] INFO org.apache.hadoop.mapred.LocalJobRunner - reduce > reduce
    16:27:32.184 [Thread-2] INFO org.apache.hadoop.mapred.Task - Task 'attempt_local2024676219_0001_r_000000_0' done.
    16:27:32.949 [main] INFO org.apache.hadoop.mapred.JobClient -  map 100% reduce 100%
    16:27:32.949 [main] INFO org.apache.hadoop.mapred.JobClient - Job complete: job_local2024676219_0001
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Creating group org.apache.hadoop.mapreduce.lib.input.FileInputFormat$Counter with bundle
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Adding BYTES_READ
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Creating group FileSystemCounters with nothing
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Adding FILE_BYTES_READ
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Adding FILE_BYTES_WRITTEN
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Creating group org.apache.hadoop.mapred.Task$Counter with bundle
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Adding MAP_OUTPUT_MATERIALIZED_BYTES
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Adding COMBINE_OUTPUT_RECORDS
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Adding MAP_INPUT_RECORDS
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Adding SPILLED_RECORDS
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Adding MAP_OUTPUT_BYTES
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Adding COMMITTED_HEAP_BYTES
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Adding SPLIT_RAW_BYTES
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Adding MAP_OUTPUT_RECORDS
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Adding COMBINE_INPUT_RECORDS
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Creating group FileSystemCounters with nothing
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Adding FILE_BYTES_READ
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Adding FILE_BYTES_WRITTEN
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Creating group org.apache.hadoop.mapreduce.lib.input.FileInputFormat$Counter with bundle
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Adding BYTES_READ
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Creating group org.apache.hadoop.mapred.Task$Counter with bundle
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Adding MAP_OUTPUT_MATERIALIZED_BYTES
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Adding COMBINE_OUTPUT_RECORDS
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Adding MAP_INPUT_RECORDS
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Adding SPILLED_RECORDS
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Adding MAP_OUTPUT_BYTES
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Adding COMMITTED_HEAP_BYTES
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Adding COMBINE_INPUT_RECORDS
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Adding MAP_OUTPUT_RECORDS
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Adding SPLIT_RAW_BYTES
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Creating group org.apache.hadoop.mapreduce.lib.output.FileOutputFormat$Counter with bundle
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Adding BYTES_WRITTEN
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Adding REDUCE_INPUT_GROUPS
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Adding REDUCE_SHUFFLE_BYTES
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Adding REDUCE_OUTPUT_RECORDS
    16:27:32.949 [main] DEBUG org.apache.hadoop.mapred.Counters - Adding REDUCE_INPUT_RECORDS
    16:27:32.949 [main] INFO org.apache.hadoop.mapred.JobClient - Counters: 17
    16:27:32.949 [main] INFO org.apache.hadoop.mapred.JobClient -   File Output Format Counters
    16:27:32.949 [main] INFO org.apache.hadoop.mapred.JobClient -     Bytes Written=32
    16:27:32.949 [main] INFO org.apache.hadoop.mapred.JobClient -   File Input Format Counters
    16:27:32.949 [main] INFO org.apache.hadoop.mapred.JobClient -     Bytes Read=15
    16:27:32.949 [main] INFO org.apache.hadoop.mapred.JobClient -   FileSystemCounters
    16:27:32.949 [main] INFO org.apache.hadoop.mapred.JobClient -     FILE_BYTES_READ=362
    16:27:32.949 [main] INFO org.apache.hadoop.mapred.JobClient -     FILE_BYTES_WRITTEN=103452
    16:27:32.949 [main] INFO org.apache.hadoop.mapred.JobClient -   Map-Reduce Framework
    16:27:32.949 [main] INFO org.apache.hadoop.mapred.JobClient -     Reduce input groups=2
    16:27:32.949 [main] INFO org.apache.hadoop.mapred.JobClient -     Map output materialized bytes=34
    16:27:32.949 [main] INFO org.apache.hadoop.mapred.JobClient -     Combine output records=0
    16:27:32.949 [main] INFO org.apache.hadoop.mapred.JobClient -     Map input records=1
    16:27:32.949 [main] INFO org.apache.hadoop.mapred.JobClient -     Reduce shuffle bytes=0
    16:27:32.949 [main] INFO org.apache.hadoop.mapred.JobClient -     Reduce output records=2
    16:27:32.949 [main] INFO org.apache.hadoop.mapred.JobClient -     Spilled Records=4
    16:27:32.949 [main] INFO org.apache.hadoop.mapred.JobClient -     Map output bytes=24
    16:27:32.949 [main] INFO org.apache.hadoop.mapred.JobClient -     Total committed heap usage (bytes)=448790528
    16:27:32.949 [main] INFO org.apache.hadoop.mapred.JobClient -     SPLIT_RAW_BYTES=95
    16:27:32.949 [main] INFO org.apache.hadoop.mapred.JobClient -     Map output records=2
    16:27:32.949 [main] INFO org.apache.hadoop.mapred.JobClient -     Combine input records=0
    16:27:32.949 [main] INFO org.apache.hadoop.mapred.JobClient -     Reduce input records=2
    16:27:32.949 [Thread-1] DEBUG org.apache.hadoop.fs.FileSystem - Starting clear of FileSystem cache with 1 elements.
    16:27:32.949 [Thread-1] DEBUG org.apache.hadoop.fs.FileSystem - Removing filesystem for file:///
    16:27:32.949 [Thread-1] DEBUG org.apache.hadoop.fs.FileSystem - Removing filesystem for file:///
    16:27:32.949 [Thread-1] DEBUG org.apache.hadoop.fs.FileSystem - Done clearing cache


  6. The JVM state while processing a log file of about 56 MB. It seems standalone execution really does work.

 
