[Java] Counting Keyword Occurrences in the Java Source Files Under a Directory

Problem

The problem can also be generalized as counting how many times a given string occurs in the body of a file.
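As a point of comparison, here is a minimal sketch of the generalized version: counting non-overlapping occurrences of a string inside a text. The class `SubstringCounter` and its method are hypothetical names introduced only for illustration.

```java
// Hypothetical helper, not part of the article's analyzer.
final class SubstringCounter {

    // Counts non-overlapping occurrences of target inside text.
    static int countOccurrences(String text, String target) {
        if (target.isEmpty()) {
            return 0; // an empty target would otherwise loop forever
        }
        int count = 0;
        int from = 0;
        while ((from = text.indexOf(target, from)) != -1) {
            count++;
            from += target.length(); // resume the search after this match
        }
        return count;
    }

    public static void main(String[] args) {
        // "print" also contains "int", so plain substring counting reports 3 here
        System.out.println(countOccurrences("int a; int b; print(a);", "int"));
    }
}
```

Plain substring counting also matches `int` inside `print`, which is why keyword counting needs to compare whole tokens rather than substrings.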

Approach

1. Java has 50 keywords (as listed in the Java 8 language specification):

    final String[] KEYWORDS = { // the 50 keywords
            "abstract", "assert", "boolean", "break", "byte",
            "case", "catch", "char", "class", "const",
            "continue", "default", "do", "double", "else",
            "enum", "extends", "final", "finally", "float",
            "for", "goto", "if", "implements", "import",
            "instanceof", "int", "interface", "long", "native",
            "new", "package", "private", "protected", "public", // no stray leading spaces, or these never match
            "return", "strictfp", "short", "static", "super",
            "switch", "synchronized", "this", "throw", "throws",
            "transient", "try", "void", "volatile", "while"
    };

2. Fields and initialization

	...
    ArrayList<File> fileList; // the Java files that were found
    File root; // the given directory
    Map<String, Integer> keywords; // HashMap from keyword to occurrence count, e.g. <key,value>=<"int",3>
    ...
    public KeywordsAnalyzer(String pathName) {
        root = new File(pathName);
        fileList = new ArrayList<>();
        keywords = new HashMap<>();
        for (String word : KEYWORDS) {
            keywords.put(word, 0); // start every keyword at 0 (a HashMap does not keep KEYWORDS order)
        }
    }
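One detail worth keeping in mind: a HashMap does not remember insertion order, so filling it in KEYWORDS order has no effect on how it is iterated later. If a predictable order is wanted, a LinkedHashMap is one option. The sketch below illustrates that idea under this assumption; the class `KeywordCounter` and its methods are hypothetical and not taken from the article's code.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper class; all names are illustrative only.
final class KeywordCounter {
    // LinkedHashMap iterates in insertion order, unlike HashMap
    private final Map<String, Integer> counts = new LinkedHashMap<>();

    KeywordCounter(String... keywords) {
        for (String word : keywords) {
            counts.put(word, 0); // every keyword starts at zero
        }
    }

    // Adds one to the count if word is a tracked keyword; other words are ignored.
    void record(String word) {
        counts.computeIfPresent(word, (key, value) -> value + 1);
    }

    int countOf(String word) {
        return counts.getOrDefault(word, 0);
    }

    // Keys come back in the order they were passed to the constructor.
    List<String> keywordsInOrder() {
        return new ArrayList<>(counts.keySet());
    }

    public static void main(String[] args) {
        KeywordCounter counter = new KeywordCounter("int", "for", "if");
        counter.record("for");
        counter.record("for");
        counter.record("foo"); // not a keyword, ignored
        System.out.println(counter.keywordsInOrder() + " for=" + counter.countOf("for"));
    }
}
```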

3. Recursively search the directory for all Java files

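    ArrayList<File> fileList;
    File root;

    public void searchFiles() {
        searchFiles(root); // start from the given directory without modifying root
    }

    // Recursively collects every .java file under dir into fileList
    private void searchFiles(File dir) {
        File[] files = dir.listFiles();
        if (files == null) return; // dir is not a directory, or it could not be read
        for (File file : files) {
            if (file.isDirectory()) {
                searchFiles(file);
            } else if (file.getName().endsWith(".java")) {
                fileList.add(file);
            }
        }
    }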
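For comparison, the same directory walk can be sketched with the standard `java.nio.file.Files.walk` API instead of hand-written recursion. This is an alternative technique, not the article's method; the class `JavaFileFinder` is a hypothetical name.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper; an alternative to the recursive File-based search.
final class JavaFileFinder {

    // Returns every regular file ending in ".java" under root, in sorted order.
    static List<Path> findJavaFiles(Path root) throws IOException {
        // Files.walk holds open directory handles, so close the stream when done
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.getFileName().toString().endsWith(".java"))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path p : findJavaFiles(root)) {
            System.out.println(p);
        }
    }
}
```

The recursive version is easier to follow step by step, while the stream version leaves traversal to the library.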

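4. Keyword matching

Read a line from the file, split it into an array of strings, and check each element against the keyword list.
Non-word characters (anything other than letters, digits, and underscores) have to be neutralized first, otherwise keywords stay glued to neighboring symbols. For example: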

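    private void fixUp(int k) {

    // Splitting on spaces alone misses one int:
    private
    void
    fixUp(int    // the keyword int is hidden inside this token
    k)
    {

    // After replacing every match of the regular expression "\\W" with a space:
    private
    void
    fixUp
    int
    k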

The code is as follows:

    public void matchKeywords(String line) {
        // Replace every non-word character with a space, then split on runs of whitespace
        String[] wordList = line.replaceAll("\\W", " ").split("\\s+");
        for (String word : wordList) {
            if (keywords.containsKey(word)) { // one map lookup instead of scanning all of KEYWORDS
                int count = (int) keywords.get(word);
                keywords.put(word, count + 1);
            }
        }
    }
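The effect of the `\W` preprocessing can be checked with a small stand-alone sketch. It compares splitting on spaces with splitting on non-word characters for the sample line used above. The class `TokenDemo` and its three-keyword set are assumptions made for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class; only three keywords are tracked to keep it short.
final class TokenDemo {
    static final Set<String> SAMPLE_KEYWORDS =
            new HashSet<>(Arrays.asList("private", "void", "int"));

    // Splits on single spaces only, so "fixUp(int" stays one token.
    static int countNaive(String line) {
        return countIn(line.split(" "));
    }

    // Splits on runs of non-word characters, so "int" becomes its own token.
    static int countWithRegex(String line) {
        return countIn(line.split("\\W+"));
    }

    private static int countIn(String[] tokens) {
        int count = 0;
        for (String token : tokens) {
            if (SAMPLE_KEYWORDS.contains(token)) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String line = "private void fixUp(int k) {";
        System.out.println("naive: " + countNaive(line));     // private, void
        System.out.println("regex: " + countWithRegex(line)); // private, void, int
    }
}
```

Because `\W` treats letters, digits, and underscores as word characters, an identifier such as `int_value` is kept whole and is not miscounted as `int`.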

5. Handling comments

There are four kinds of comments to deal with:

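    /**
     documentation comment
    */

    /*
      multi-line comment
    */

    // single-line comment

    int number;  /* the first line counts as code
                            *
                    the remaining lines count as comments */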

Read the file line by line and first decide whether the line belongs to a comment. If it does, skip it; otherwise scan it for keywords.

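    public void countKeyWords(File file) throws IOException {
        // try-with-resources closes the reader even if an exception is thrown
        try (BufferedReader input = new BufferedReader(new FileReader(file))) {
            String line;
            while ((line = input.readLine()) != null) {
                line = line.trim();
                if (line.startsWith("//")) continue; // skip single-line comments
                int start = line.indexOf("/*");
                if (start >= 0) { // multi-line, documentation, or trailing comment
                    matchKeywords(line.substring(0, start)); // only the code before "/*" is counted
                    // Skip ahead to the line that closes the comment (or to end of file)
                    String rest = line.substring(start + 2);
                    while (rest != null && !rest.contains("*/")) {
                        rest = input.readLine();
                    }
                    continue; // the closing line is treated as comment text, not code
                }
                matchKeywords(line); // count keywords on an ordinary code line
            }
        }
    }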
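The comment rules can also be exercised on in-memory lines, which makes them easier to test than a file reader. The sketch below is one possible variant, not the article's implementation: it additionally cuts trailing `//` comments, and it assumes at most one comment per line and no comment markers inside string literals. The class `CommentFilter` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper; a simplified line-based comment filter.
final class CommentFilter {

    // Returns the non-empty code fragments of the given lines, with comment text removed.
    static List<String> codeOnly(List<String> lines) {
        List<String> code = new ArrayList<>();
        boolean inBlock = false; // true while inside an unclosed /* ... */
        for (String raw : lines) {
            String line = raw.trim();
            if (inBlock) {
                int end = line.indexOf("*/");
                if (end < 0) continue;          // still inside the comment
                inBlock = false;
                line = line.substring(end + 2); // keep whatever follows the closer
            }
            int slashes = line.indexOf("//");
            int open = line.indexOf("/*");
            if (slashes >= 0 && (open < 0 || slashes < open)) {
                line = line.substring(0, slashes); // single-line or trailing comment
            } else if (open >= 0) {
                int close = line.indexOf("*/", open + 2);
                if (close >= 0) {
                    // comment opens and closes on the same line
                    line = line.substring(0, open) + " " + line.substring(close + 2);
                } else {
                    line = line.substring(0, open); // the first line still counts as code
                    inBlock = true;
                }
            }
            if (!line.trim().isEmpty()) {
                code.add(line.trim());
            }
        }
        return code;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "/**", " doc comment mentioning int", " */",
                "// for single",
                "int number; /* first line is code", " * while", " still a comment */",
                "return x; // if trailing");
        System.out.println(codeOnly(sample));
    }
}
```

Each returned fragment could then be passed to the keyword matching step.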

Overall flow and output

    public void keywordsAnalyze() {
        for (File file : fileList) {
            try {
                countKeyWords(file);
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        // Sort by count in descending order and print the results
        List<Map.Entry<String, Integer>> list = new ArrayList<Map.Entry<String, Integer>>(keywords.entrySet());
        Collections.sort(list, new Comparator<Map.Entry<String, Integer>>() {
            @Override
            public int compare(Map.Entry<String, Integer> o1, Map.Entry<String, Integer> o2) {
                return o2.getValue().compareTo(o1.getValue());
            }
        });
        int count = 0;
        for (Map.Entry<String, Integer> word : list) {
            count++;
            System.out.print(word.getKey() + ": " + word.getValue() + " ");
            if (count == 5) { // print 5 keywords per line
                count = 0;
                System.out.println();
            }
        }

    }

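The results are printed in descending order of occurrence count, which relies on sorting a HashMap by value. A separate article of mine (not yet finished) covers this in more detail. It uses the same keyword example to walk through two simple approaches: sorting a HashMap by key and sorting it by value.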
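As a rough sketch of those two orderings, the example below sorts by key with a TreeMap and by value with the comparators from `Map.Entry`. It is only one way to do it and is not taken from the referenced article. The class `SortDemo` is a hypothetical name, and breaking ties by key is an added assumption that makes the output order deterministic.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical demo class for sorting keyword counts.
final class SortDemo {

    // Sort by key: a TreeMap keeps its entries in natural key order.
    static Map<String, Integer> byKey(Map<String, Integer> counts) {
        return new TreeMap<>(counts);
    }

    // Sort by value, largest first; equal counts are ordered by key.
    static List<Map.Entry<String, Integer>> byCountDescending(Map<String, Integer> counts) {
        return counts.entrySet().stream()
                .sorted(Map.Entry.<String, Integer>comparingByValue(Comparator.reverseOrder())
                        .thenComparing(Map.Entry.comparingByKey()))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, Integer> counts = new HashMap<>();
        counts.put("int", 3);
        counts.put("for", 5);
        counts.put("if", 3);
        System.out.println(byKey(counts));
        System.out.println(byCountDescending(counts));
    }
}
```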

Test results

Running the analyzer on the test case [figure: test case] gives the following result:
[figure: keyword counts]

Source code download

Includes the complete code and the test case.

Baidu Netdisk

Reposted from blog.csdn.net/xHibiki/article/details/82937137