Individual project - WC

GitHub project address: https://github.com/ScarletF6084/WC

Project requirements

Problem description

Word Count
1. Implement a simple but complete software tool (one that reports statistics about source programs).
2. Carry out unit tests, regression tests, and performance tests, and use the relevant tools while implementing the program.
3. Practice the Personal Software Process (PSP), recording step by step the time spent on each stage of software engineering.

WC project requirements

wc.exe is a common tool that counts the characters, words, and lines of a text file. This project asks for a command-line program that mimics the existing wc.exe and extends it to report the number of characters, words, and lines in source files of a particular programming language.

Implement a statistics program that correctly counts the characters, words, and lines in a program file, provides further extended functions, and can process multiple files quickly.
Specific functional requirements:
The program handles user requests in the following form:

wc.exe [parameter] [file_name]

 

Basic features:

wc.exe -c file.c // returns the number of characters in file.c

wc.exe -w file.c // returns the number of words in file.c

wc.exe -l file.c // returns the number of lines in file.c


Extensions:
    -s recursively process the files in a directory that match the given condition.
    -a return more detailed data (code lines / blank lines / comment lines).

Blank line: the entire line consists of whitespace or control characters; if it does contain code, it has no more than one displayable character, such as "{".

Code line: the line contains more than one character of code.

Comment line: the line is not a code line and contains a comment. An interesting example is that some programmers add a comment after a single character:

    } // NOTE
In this case, the line counts as a comment line.
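
The three definitions above can be turned into a small classifier. The following is a minimal sketch under stated assumptions, not the project's actual code: the LineClassifier class, its Kind enum, and the classify method are hypothetical names. It also assumes that a whitespace-only line inside a block comment counts as blank, and it does not recognize comment markers that appear inside string literals.

```java
final class LineClassifier {
    enum Kind { BLANK, CODE, COMMENT }

    // Hypothetical state: true while we are between "/*" and "*/".
    private boolean inBlockComment = false;

    Kind classify(String line) {
        // A line made only of whitespace or control characters is blank.
        if (line.trim().isEmpty()) {
            return Kind.BLANK;
        }
        int codeChars = 0;                   // displayable characters outside comments
        boolean sawComment = inBlockComment; // the line starts inside a block comment
        int i = 0;
        while (i < line.length()) {
            if (inBlockComment) {
                int end = line.indexOf("*/", i);
                if (end < 0) {
                    i = line.length();       // the comment continues on the next line
                } else {
                    inBlockComment = false;
                    i = end + 2;
                }
            } else if (line.startsWith("//", i)) {
                sawComment = true;
                i = line.length();           // the rest of the line is a comment
            } else if (line.startsWith("/*", i)) {
                sawComment = true;
                inBlockComment = true;
                i += 2;
            } else {
                if (!Character.isWhitespace(line.charAt(i))) {
                    codeChars++;
                }
                i++;
            }
        }
        if (codeChars > 1) return Kind.CODE;    // more than one character of code
        if (sawComment) return Kind.COMMENT;    // e.g. "} // NOTE"
        return Kind.BLANK;                      // e.g. "{"
    }
}
```

With this sketch, a line such as "} // NOTE" is reported as a comment line because only one code character precedes the comment, while "{" on its own is reported as blank.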

[file_name]: a file or directory name; common wildcards are supported.

Advanced features:

 -x parameter. This parameter is used on its own. When it is given on the command line, the program displays a graphical interface in which the user can select a single file, and the program then shows all statistics for that file: number of characters, lines, and so on.

Example request: wc.exe -s -a *.c
returns the number of code lines, blank lines, and comment lines of the *.c files in the current directory and all its subdirectories.
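
Recursive processing with a wildcard, as in the example request above, could be handled roughly as follows. This is a minimal sketch rather than the project's actual code: the FileFinder class and its find method are hypothetical names, and it assumes the wildcard is matched against file names only, using the glob syntax of the standard java.nio.file API.

```java
import java.io.IOException;
import java.nio.file.FileSystems;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

final class FileFinder {
    // Returns every regular file under root (including subdirectories)
    // whose file name matches the glob, e.g. "*.c".
    static List<Path> find(Path root, String glob) throws IOException {
        PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + glob);
        try (Stream<Path> paths = Files.walk(root)) {
            return paths
                    .filter(Files::isRegularFile)
                    .filter(p -> matcher.matches(p.getFileName()))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }
}
```

Each returned path could then be passed to the counting code, so that -s only decides which files are processed and -a decides what is reported for each of them.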

Problem-solving ideas

1. To count characters, use the read() method of the FileReader class to read the file character by character, one character per call.

2. To count words, read the file character by character and compare the current character with the previous one: if the previous character is an English letter and the current one is not, the word count is incremented by 1.

3. To count lines, the readLine() method of the BufferedReader class could read the file line by line, one line per call. However, to gather all three counts in a single pass, I finally chose to read the file character by character and increment the line count whenever a line break is encountered.

Special lines

Read the file line by line with the readLine() method of the BufferedReader class, so that regular expressions can be used to match each kind of special line.

1. For comment lines, match with regular expressions: if "//" is matched, the comment line count is incremented by 1; if "/*" is matched, a flag is set, and while the flag is set every line read increments the comment line count by 1; when "*/" is matched, the flag is cleared.

2. For blank lines, also use regular expression matching: if the line contains no non-whitespace character, the blank line count is incremented by 1.

3. For code lines, still use regular expression matching: if the line contains two or more consecutive non-whitespace characters and does not begin with any amount of whitespace followed by "//", the code line count is incremented by 1.

Design and implementation process

Flow chart:

At first I wanted to decide from the input options which data to compute and then output it, but that would mean each method reads the file once more and the code becomes bloated. The final decision was to compute all the data together when the class is initialized and then output whatever the command asks for. The design therefore has a single Counter class: its constructor computes the statistics, and the methods of the class print the corresponding data.
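
The design decision above (compute everything once, then print only what was requested) can be sketched as follows. This is an illustration under assumptions, not the project's Counter class: CounterSketch and its report method are hypothetical names, and the constructor takes a Reader instead of a file path so that the example can run on an in-memory string.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;

final class CounterSketch {
    private int chars, words, lines;

    // All statistics are computed in a single pass when the object is created.
    CounterSketch(Reader in) throws IOException {
        boolean inWord = false;
        int c;
        while ((c = in.read()) != -1) {
            chars++;
            boolean letter = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
            if (inWord && !letter) words++;   // a word just ended
            inWord = letter;
            if (c == '\n') lines++;
        }
        if (inWord) words++;                  // a word that ends at end of input
    }

    // Output depends only on the requested option; nothing is re-read.
    String report(String option) {
        switch (option) {
            case "-c": return "chars: " + chars;
            case "-w": return "words: " + words;
            case "-l": return "lines: " + lines;
            default: throw new IllegalArgumentException("unknown option: " + option);
        }
    }

    public static void main(String[] args) throws IOException {
        CounterSketch counter = new CounterSketch(new StringReader("int a;\nint b;\n"));
        for (String option : new String[] {"-c", "-w", "-l"}) {
            System.out.println(counter.report(option));
        }
    }
}
```

Because the counting happens once in the constructor, adding a new option only requires a new case in report, and the input is never read a second time.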

Key Code

Counting characters, words, and lines:

    // Requires java.io.FileReader.
    // temp holds the current character; wordFlag is used to detect the end of a word
    int temp, wordFlag = -1;
    filePath = path;
    FileReader fileReader = new FileReader(path);
    // read the file character by character
    while ((temp = fileReader.read()) != -1) {
        // count characters
        chars++;
        // count words: when a non-English character follows an English letter, words + 1
        if ((temp >= 'a' && temp <= 'z') || (temp >= 'A' && temp <= 'Z')) wordFlag = temp;
        else if (temp != wordFlag && wordFlag != -1) {
            words++;
            wordFlag = -1;
        }
        // count lines: lines + 1 at every line break
        if (temp == '\n') lines++;
    }
    // count a word that runs up to the end of the file
    if (wordFlag != -1) words++;
    fileReader.close();

 

Counting special lines:

    // Requires java.io.BufferedReader, java.io.FileInputStream, java.io.InputStreamReader,
    // java.util.regex.Matcher, and java.util.regex.Pattern.
    // count comment lines, blank lines, and code lines
    BufferedReader br = new BufferedReader(new InputStreamReader(new FileInputStream(path)));
    int cflag = 0, starComments = 0;
    String lineTemp;
    while ((lineTemp = br.readLine()) != null) {
        // System.out.println(lineTemp);
        // count blank lines
        Pattern pattern = Pattern.compile("\\S"); // a blank line contains no non-whitespace character
        Matcher matcher = pattern.matcher(lineTemp);
        if (!matcher.find()) blankLines++;
        // count comment lines
        Pattern pattern0 = Pattern.compile("//"); // single-line comment: comment lines + 1
        Matcher matcher0 = pattern0.matcher(lineTemp);
        if (matcher0.find()) commentLines++;
        // multi-line comment: from "/*" until "*/" is met, comment lines + 1 for every line
        Pattern pattern1 = Pattern.compile("/\\*");
        Pattern pattern2 = Pattern.compile("\\*/");
        Matcher matcher1 = pattern1.matcher(lineTemp);
        Matcher matcher2 = pattern2.matcher(lineTemp);
        if (matcher1.find()) cflag = 1;
        if (cflag == 1) {
            commentLines++;
            starComments++;
        }
        if (matcher2.find()) cflag = 0;

        // count code lines
        pattern1 = Pattern.compile("\\S{2,}");  // two or more consecutive non-whitespace characters
        pattern2 = Pattern.compile("^\\s*//");  // must not begin with optional whitespace followed by "//"
        matcher1 = pattern1.matcher(lineTemp);
        matcher2 = pattern2.matcher(lineTemp);
        if (matcher1.find() && !matcher2.find()) codelines++;
    }
    codelines -= starComments; // subtract the lines that belong to block comments

 

Test Run

Empty file:

File containing only one character:

File containing only one word:

File containing only one line:

Typical source file:

PSP

 

PSP                                    Personal Software Process Stages                               Estimated (min)  Actual (min)
Planning                               Planning                                                       30               30
Estimate                               Estimate how much time the task requires                       10               15
Development                            Development                                                    360              360
Analysis                               Requirements analysis (including learning new technologies)    120              180
Design Spec                            Writing the design document                                    10               10
Design Review                          Design review (reviewing the design document with colleagues)  5                5
Coding Standard                        Coding standard (suitable conventions for this development)    15               10
Design                                 Detailed design                                                60               90
Coding                                 Detailed coding                                                240              300
Code Review                            Code review                                                    30               40
Test                                   Testing (self-testing, modifying code, submitting changes)     30               40
Reporting                              Reporting                                                      60               60
Test Report                            Test report                                                    30               30
Size Measurement                       Measuring the workload                                         10               10
Postmortem & Process Improvement Plan  Postmortem and process improvement plan                        30               30
Total                                  Total                                                          1040             1210

Project Summary

 Time estimation for this project needed more analysis: many parts took longer than I expected. I somewhat overestimated my abilities and need more practice with my analytical skills.

 During the implementation I also learned regular expression matching, which I had not studied in depth before; that counts as a small gain.

 

Origin www.cnblogs.com/ScarletF/p/12562308.html