Reference to this article: https://blog.csdn.net/u014204432/article/details/40348839
First, the task
Output the N most frequently occurring English words in a single file (the English text of "Gone with the Wind"), and write the result to a text file.
Second, the program design
1. Open the English novel "Gone with the Wind" with a file reader, read it line by line, and append each line to a StringBuffer. Then convert the StringBuffer to a String, convert all characters to lowercase, and split the text on every run of non-alphanumeric characters (spaces, punctuation) to obtain an array of words.
2. Loop through the array and store each word (or group of digits) in a Map<String, Integer> that maps it to the number of times it occurs. Then sort the map entries by count in descending order using a comparator.
3. Add exception handling for the file operations, and finally output the N most frequent English words in the novel together with their occurrence counts.
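The three steps above can be tried out on a short in-memory string before running them on the whole novel. The following is a minimal sketch, not the program from this article: the class name WordFrequencySketch, the helper topN, and the sample sentence are made up for illustration, and it assumes Java 8 or later because it uses Map.merge and a lambda in place of the containsKey check and the anonymous comparator.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class WordFrequencySketch {

    // Hypothetical helper (not part of the original program):
    // returns the n most frequent words in text, most frequent first.
    public static List<Map.Entry<String, Integer>> topN(String text, int n) {
        // Step 1: lowercase, then split on runs of non-alphanumeric characters.
        String[] words = text.toLowerCase().split("[^a-z0-9]+");

        // Step 2: count each word. TreeMap keeps the keys in alphabetical order.
        Map<String, Integer> counts = new TreeMap<>();
        for (String word : words) {
            if (word.isEmpty()) {
                continue; // split() yields an empty first token if the text starts with a separator
            }
            counts.merge(word, 1, Integer::sum);
        }

        // Sort by count, descending. List.sort is stable, so ties stay alphabetical.
        List<Map.Entry<String, Integer>> entries = new ArrayList<>(counts.entrySet());
        entries.sort((e1, e2) -> e2.getValue().compareTo(e1.getValue()));

        // Step 3: keep only the first n entries (fewer if the text has fewer distinct words).
        int limit = Math.max(0, Math.min(n, entries.size()));
        return new ArrayList<>(entries.subList(0, limit));
    }

    public static void main(String[] args) {
        String sample = "The cat and the dog and the bird.";
        for (Map.Entry<String, Integer> e : topN(sample, 2)) {
            System.out.println(e.getKey() + ": " + e.getValue());
        }
        // prints:
        // the: 3
        // and: 2
    }
}
```

Keeping the counting and sorting in a method that takes a String makes the logic easy to check without touching the file system; the file reading and writing can then be wrapped around it.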
Third, the source code
import java.io.*;
import java.util.*;
import java.util.Map.Entry;

public class tongjidanci
{
    public static int n = 0;

    public static void main(String[] args) {
        Scanner input = new Scanner(System.in);
        String s;
        int count = 0;
        int num = 1;
        // Paths used by the FileReader and FileWriter
        String file1 = "C:\\Users\\米羊\\Desktop\\piao.txt";
        String file2 = "C:\\Users\\米羊\\Desktop\\fenxijieguo.txt";
        // try-with-resources closes both files even if an exception is thrown
        try (BufferedReader a = new BufferedReader(new FileReader(file1));
             BufferedWriter b = new BufferedWriter(new FileWriter(file2)))
        {
            StringBuffer c = new StringBuffer();
            // Read the file contents into the StringBuffer
            while ((s = a.readLine()) != null)
            {
                // Append a space after each line so that words on adjacent lines do not merge
                c.append(s).append(' ');
            }
            // Convert the StringBuffer to a String, then convert all characters to lowercase
            String m = c.toString().toLowerCase();
            // Split on runs of characters that are not letters or digits
            String[] d = m.split("[^a-zA-Z0-9]+");
            // Walk the array and store the counts in a Map<String, Integer>
            Map<String, Integer> myTreeMap = new TreeMap<String, Integer>();
            for (int i = 0; i < d.length; i++) {
                // Skip the empty token that split() produces when the text starts with a separator
                if (d[i].isEmpty()) {
                    continue;
                }
                // containsKey() checks whether the key is already mapped in the TreeMap
                if (myTreeMap.containsKey(d[i])) {
                    count = myTreeMap.get(d[i]);
                    myTreeMap.put(d[i], count + 1);
                }
                else {
                    myTreeMap.put(d[i], 1);
                }
            }
            // Sort the entries with a comparator
            List<Map.Entry<String, Integer>> list = new ArrayList<Map.Entry<String, Integer>>(myTreeMap.entrySet());
            // Descending order
            Collections.sort(list, new Comparator<Map.Entry<String, Integer>>() {
                public int compare(Entry<String, Integer> k1, Entry<String, Integer> k2) {
                    // The entry with the higher count comes first
                    return k2.getValue().compareTo(k1.getValue());
                }
            });
            System.out.println("Please enter N, the number of top words to output:");
            n = input.nextInt();
            for (Map.Entry<String, Integer> map : list) {
                if (num <= n) {
                    // Write the result to the output file
                    b.write("The word ranked " + num + " by frequency is: " + map.getKey() + ", occurring " + map.getValue() + " times.");
                    // Line break
                    b.newLine();
                    // Also print it to the console
                    System.out.println(map.getKey() + ":" + map.getValue());
                    num++;
                }
                // Output finished; leave the loop
                else break;
            }
        }
        catch (FileNotFoundException e)
        {
            System.out.println("Cannot find the specified file");
        }
        catch (IOException e)
        {
            System.out.println("Error reading file");
        }
        System.out.println("Output complete");
    }
}
Fourth, the results
1. Program output
2. Output file
Disclaimer: This is an original article by the blogger, released under the CC 4.0 BY-SA license. When reposting, please include a link to the original source and this statement.