Java: finding the distinct words and the repeated words in a text

import java.io.*;
import java.util.*;
import java.nio.file.*;
public class Duqudata {
	public static void main(String[] args){
		String token="";   
		Path path=Paths.get("D:\\out.txt");
		try(InputStream input=Files.newInputStream(path, StandardOpenOption.READ);
				Scanner sc=new Scanner(input)){
					while(sc.hasNext()){
						 token +=sc.next()+" ";				
					}
				}catch(IOException e){
					e.printStackTrace();
				}		
		String[] str=token.split("[ ,.]");	//the regular expression contains a space
		Set<String> hs=new HashSet<>();
		Set<String> se=new HashSet<>();	
		for(String s:str){
			//if(!s.equals("")){
			      if(!hs.add(s)){          //add() returns false if the set already contains s
				    se.add(s);
			      // }	
			}
		}
		System.out.println("Distinct words: "+hs);
		System.out.print("Repeated words: "+se);
	}
}
  1. When split() applies the regular expression, a string that begins with a delimiter (for example a blank) keeps an empty string in front of its first non-blank token.
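
The note above can be checked with a minimal sketch. The class name SplitDemo and the helper tokens() are hypothetical names introduced only for this illustration; the character class "[ ,.]" is the one used in the article's programs.

```java
import java.util.Arrays;

// Hypothetical demo class, not part of the original programs.
class SplitDemo {
    // Splits on a space, comma, or period, the same character class as in the article.
    static String[] tokens(String text) {
        return text.split("[ ,.]");
    }

    public static void main(String[] args) {
        // Trailing delimiter: split() drops trailing empty strings.
        System.out.println(Arrays.toString(tokens("a b "))); // [a, b]
        // Leading delimiter: an empty string is kept at index 0.
        System.out.println(Arrays.toString(tokens(" a b"))); // [, a, b]
        // Two adjacent delimiters: an empty string appears between them.
        System.out.println(Arrays.toString(tokens("a, b"))); // [a, , b]
    }
}
```

This is why appending the space after each token happens to produce no leading empty string, while prepending it does, and why the empty-string check matters once the text contains punctuation followed by a space.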

Consider this variation:

import java.io.*;
import java.util.*;
import java.nio.file.*;
public class Duqudata2 {
	public static void main(String[] args){
		String token="";
		Path path=Paths.get("D:\\out.txt");
		try(InputStream input=Files.newInputStream(path, StandardOpenOption.READ);
				Scanner sc=new Scanner(input)){
					while(sc.hasNext()){
						 token +=" "+sc.next();		//prepend a space to each token
					}
				}catch(IOException e){
					e.printStackTrace();
				}		
		String[] str=token.split("[ ,.]");	//note that the regular expression contains a space
		Set<String> hs=new HashSet<>();
		Set<String> se=new HashSet<>();	
		for(String s:str){
			if(!s.equals("")){              //note: "" contains no space, so this check filters out the empty strings; " " (with a space) would not filter them
			      if(!hs.add(s)){           
				    se.add(s);
			       }	
			}
		}
		System.out.println("Distinct words: "+hs);
		System.out.print("Repeated words: "+se);
	}
}
Here D:\\out.txt is the test text, with the following contents:

116
200
189
172
200
200
134
126
173
166
按时
ss
sss
sss
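
To try the same logic without the file, here is a minimal sketch that runs it on the sample text above held in a string. The class name WordSetsDemo and the method duplicates() are hypothetical, and TreeSet is swapped in for the article's HashSet only so that the printed order is predictable; this is an assumption of the sketch, not the original program's behavior.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical demo class; it mirrors the loop in Duqudata2 but reads from a string.
class WordSetsDemo {
    // Adds every non-empty word to 'distinct' and returns the words seen more than once.
    static Set<String> duplicates(String text, Set<String> distinct) {
        Set<String> repeated = new TreeSet<>();
        for (String s : text.split("[ ,.]")) {
            if (!s.equals("")) {          // skip empty strings produced by split()
                if (!distinct.add(s)) {   // add() returns false when s is already present
                    repeated.add(s);
                }
            }
        }
        return repeated;
    }

    public static void main(String[] args) {
        // The lines of the sample file joined by spaces, as the Scanner loop would do.
        // The two-character Chinese word is written with Unicode escapes.
        String sample = "116 200 189 172 200 200 134 126 173 166 \u6309\u65f6 ss sss sss";
        Set<String> distinct = new TreeSet<>();
        Set<String> repeated = duplicates(sample, distinct);
        System.out.println("Distinct words: " + distinct);
        System.out.println("Repeated words: " + repeated); // [200, sss]
    }
}
```

For this sample, the words seen more than once are 200 and sss; the other nine entries each appear a single time.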




Origin blog.csdn.net/qq_28929579/article/details/51340113