"Building of the law" - the fourth time job

GitHub project address | GitHub project address
Assignment requirements | Link to the assignment requirements
Pair partner's link | Peer link
PSP2.1 | Personal Software Process Stages | Estimated time (minutes) | Actual time (minutes)
Planning | Plan | 40 | 55
· Estimate | · Estimate how much time the task requires | 90 | 90
Development | Development | 60 | 75
· Analysis | · Requirements analysis (including learning new technologies) | 30 | 35
· Design Spec | · Generate design documents | 15 | 18
· Design Review | · Design review (reviewing the design documents with colleagues) | 20 | 20
· Coding Standard | · Coding standard (setting suitable conventions for the current development) | 5 | 5
· Design | · Detailed design | 30 | 40
· Coding | · Detailed coding | 120 | 150
· Code Review | · Code review | 40 | 60
· Test | · Testing (self-test, modify the code, commit the changes) | 50 | 70
Reporting | Report | 100 | 120
· Test Report | · Test report | 50 | 55
· Size Measurement | · Workload calculation | 20 | 20
· Postmortem & Process Improvement Plan | · Postmortem and process improvement plan | 15 | 10
 | Total | 685 | 823

First, the approach to the project and a summary

  • a. Problem-solving approach
    • After getting the assignment, read it carefully to clarify what is being asked. We need to count the characters, words, word frequencies, and valid lines of a text file
    • Module division: besides the main function, only one class with seven methods is needed, each method with a different job. The work was then split according to each partner's programming strengths
    • First read in the file to be counted, use regular expressions to pick out the words that meet the assignment's requirements, put them in a dictionary, and return it
    • Convert the dictionary into an array and sort it in descending order of frequency; words with the same frequency are ordered alphabetically
    • Use the array to produce the statistics for characters, words, word frequencies, and valid lines
    • Encapsulate each function so that the code is more concise
  • b. Design and implementation
    Besides the class that holds the main function, there is one class, getFile, which has seven methods. The public method getDic builds the dictionary: it stores each word that is at least four characters long and does not start with a digit,
    together with the number of times it appears, and returns a Hashtable. The method getWordFre() orders the dictionary's words by number of occurrences and returns a dynamic array (ArrayList). The other methods can be called directly and use the array returned by getWordFre for their own work. The unit tests exercise
    these methods against their corresponding functions.
  • c. Code specification
    • Use blank lines where appropriate to make the code more readable
    • Method naming: generally a verb-object phrase, and each method completes only one task
    • Use consistent indentation and line breaks so that the levels of the code are clear
    • When iterating over a generic collection, use foreach where possible
    • Indentation and spacing: indent with tabs, not spaces
    • Comments must be aligned with the code they describe
    • Avoid writing overly long methods. A typical method has between 1 and 25 lines of code.
  • d. Code description
public Hashtable getDic(string pathName, ref Hashtable wordList)     // getDic: count word frequencies from a text file and store them in a Hashtable
        {
            Regex rg = new Regex("[0-9A-Za-z-]+");    // regular expression that matches a word
            Regex regNum = new Regex("^[0-9]");       // matches a word that starts with a digit
            using (StreamReader sr = new StreamReader(pathName))   // using closes the file even if an exception is thrown
            {
                string line = sr.ReadLine();             // read line by line
                while (line != null)
                {
                    MatchCollection mc = rg.Matches(line);
                    for (int i = 0; i < mc.Count; i++)
                    {
                        string mcTmp = mc[i].Value.ToLower();    // case-insensitive
                        if (mcTmp.Length >= 4 && !regNum.IsMatch(mcTmp))   // at least 4 characters and not starting with a digit
                        {
                            if (!wordList.ContainsKey(mcTmp))     // first occurrence: add as a key
                            {
                                wordList.Add(mcTmp, 1);
                            }
                            else                                  // later occurrences: increment the value
                            {
                                wordList[mcTmp] = (int)wordList[mcTmp] + 1;
                            }
                        }
                    }
                    line = sr.ReadLine();
                }
            }
            return wordList;
        }

getDic(string pathName, ref Hashtable wordList) extracts each word from the text and stores its frequency in a Hashtable. It opens the file with a StreamReader
and reads it line by line in a while loop. Inside the loop, a regular expression matches the words on each line, and a for loop filters the matches: words that meet the conditions are added to the dictionary and the rest are discarded. Finally the method returns the Hashtable.
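As a point of comparison, here is a minimal sketch of the same counting step, assuming a generic Dictionary<string, int> in place of the Hashtable. The names WordCounter and CountWords are hypothetical and are not part of the project. The method takes a TextReader so that it can be tried on a string as well as on a file.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original getFile class.
public static class WordCounter
{
    // Same pattern as getDic: runs of letters, digits and hyphens.
    private static readonly Regex WordPattern = new Regex("[0-9A-Za-z-]+");

    // Counts words of at least four characters that do not start with a digit.
    public static Dictionary<string, int> CountWords(TextReader reader)
    {
        var counts = new Dictionary<string, int>();
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            foreach (Match match in WordPattern.Matches(line))
            {
                string word = match.Value.ToLower();   // case-insensitive
                if (word.Length < 4 || char.IsDigit(word[0]))
                {
                    continue;                          // too short or starts with a digit
                }
                int current;
                counts.TryGetValue(word, out current); // current stays 0 for a new word
                counts[word] = current + 1;
            }
        }
        return counts;
    }
}
```

For a file, the reader could be created with `using (var reader = new StreamReader(pathName))` and passed in. The typed dictionary removes the `(int)` casts that a Hashtable requires.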

public ArrayList getWordFre(string pathName, ref Hashtable wordList)
        {
            getDic(pathName, ref wordList);                       // fill wordList with the word counts
            ArrayList keysList = new ArrayList(wordList.Keys);
            keysList.Sort();                                      // alphabetical order first, used to break ties
            // Stable insertion sort by count, descending; equal counts keep their alphabetical order
            for (int i = 1; i < keysList.Count; i++)
            {
                string tmp = keysList[i].ToString();
                int valueTmp = (int)wordList[keysList[i]];        // number of occurrences
                int j = i;
                while (j > 0 && valueTmp > (int)wordList[keysList[j - 1]])
                {
                    keysList[j] = keysList[j - 1];
                    j--;
                }
                keysList[j] = tmp;
            }
            return keysList;
        }

getWordFre(string pathName, ref Hashtable wordList) takes the wordList that is passed in, sorts its words by frequency, and returns them as a dynamic array (ArrayList).
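The same ordering rule (highest frequency first, ties in alphabetical order) can also be sketched with LINQ. This assumes the counts are held in a Dictionary<string, int>. FrequencySorter and SortByFrequency are hypothetical names, and the ordinal string comparison is an assumption rather than the project's choice.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of the original getFile class.
public static class FrequencySorter
{
    // Returns the words ordered by count (highest first); ties are broken alphabetically.
    public static List<string> SortByFrequency(Dictionary<string, int> counts)
    {
        return counts
            .OrderByDescending(pair => pair.Value)             // primary key: frequency
            .ThenBy(pair => pair.Key, StringComparer.Ordinal)  // secondary key: the word itself
            .Select(pair => pair.Key)
            .ToList();
    }
}
```

Stating both sort keys in one expression makes the tie-breaking rule visible, instead of relying on a prior alphabetical sort followed by a stable insertion sort.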

 public void write(string outputPath, ref Hashtable wordList, int lines, int words, int characters, int wordsOutNumFla, int wordsOutNum, int m, string inputPath)
        {
            ArrayList keysList1 = getPhrase(inputPath, outputPath, ref wordList, m);   // phrases of m words
            ArrayList keysList = getWordFre(outputPath, ref wordList);                 // words sorted by frequency
            using (StreamWriter sw = new StreamWriter(outputPath))                     // using flushes and closes the file
            {
                sw.WriteLine("characters:{0}", characters);
                sw.WriteLine("words:{0}", words);
                sw.WriteLine("lines:{0}", lines);
                if (wordsOutNumFla != 1)
                    wordsOutNum = 10;                                    // default: the ten most frequent words
                int count = Math.Min(wordsOutNum, keysList.Count);       // do not read past the end of the list
                for (int i = 0; i < count; i++)
                {
                    sw.WriteLine("<{0}>:{1}", keysList[i], wordList[keysList[i]]);
                }
                sw.WriteLine("Phrases of length {0}:\n", m);
                foreach (string j in keysList1)
                {
                    sw.WriteLine("<{0}>:{1}", j, 1);
                }
            }
        }

Writing to the file is fairly simple, but there is one small detail: a file that has been opened must be closed afterwards, otherwise a later attempt to write to it again fails with an error.
I wrote to the file twice and forgot to close it after opening it the first time, which caused an error, so this is worth remembering.

This method is passed the values to be written to the file: the total number of characters, the number of words, the number of lines, the word frequencies, and the flag wordsOutNumFla for the highest-frequency words.
wordsOutNumFla decides whether to output the default ten highest-frequency words or the number given with the -n parameter.
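The two points above, closing the file and choosing between the default ten words and the -n value, can be sketched as follows. ReportWriter and WriteReport are hypothetical names, and treating a topCount of zero or less as "use the default" is an assumption standing in for the wordsOutNumFla flag.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper, not part of the original getFile class.
public static class ReportWriter
{
    // Writes the summary lines and the most frequent words.
    // topCount <= 0 means "use the default of ten".
    public static void WriteReport(string outputPath, int characters, int words, int lines,
                                   IList<KeyValuePair<string, int>> sortedWords, int topCount)
    {
        int limit = topCount > 0 ? topCount : 10;
        limit = Math.Min(limit, sortedWords.Count);   // never read past the end of the list

        // The using block closes the file even if a write throws,
        // so the same path can be opened again afterwards.
        using (var writer = new StreamWriter(outputPath))
        {
            writer.WriteLine("characters:{0}", characters);
            writer.WriteLine("words:{0}", words);
            writer.WriteLine("lines:{0}", lines);
            for (int i = 0; i < limit; i++)
            {
                writer.WriteLine("<{0}>:{1}", sortedWords[i].Key, sortedWords[i].Value);
            }
        }
    }
}
```

Because the writer is disposed at the end of the using block, calling WriteReport twice on the same path does not hit the "file still open" error described above.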

  • e. Takeaways
    • To be honest, this assignment gave me a headache at first, mainly because I am very new to C#. My partner and I looked up relevant material together and learned that counting the words of a text in C# can be done with a dictionary, so we read up on dictionaries and gradually worked out how to implement the functions we wanted. Second, this assignment was done as pair programming, so the two of us could implement the code modules together, which was much easier than working alone: one person writes while the other reviews, problems are corrected promptly, and each of us understands where we went wrong. When I write code alone, I often only learn where I went wrong from the compiler's errors and have a hard time finding mistakes myself, so pairing saves time. The key code for solving the problem took a lot of time, first because I had only just learned the material and was not familiar with applying it, and second because my own skills were not yet up to the task. Still, after a lot of time it did get done, so nothing is too difficult to accomplish.

    • The first time I implemented the code there was no module division, only two classes: one was the program entry and the other implemented all the functions. As a result my code was chaotic and complex. I could not find where a given function was implemented, and when a run-time error occurred I searched for a long time without finding the cause, which wasted a lot of time. The second time I divided the code into modules, split into seven methods that are independent of each other but can call one another, and the code looked much more concise than the first version.

Second, unit testing

  • Before the code was encapsulated, we peer-reviewed our own code
  • Test for getHangNum; the code is as follows
[TestMethod]
public void getHangNum()
        {
            int lines;
            int m = 3;
            string input_path = "C:/Users/罗伟诚/Desktop/input.txt", out_put = "C:/Users/罗伟诚/Desktop/out.txt";

            Hashtable wordList = new Hashtable();
            ArrayList keysList = new ArrayList();
            getFile c = new getFile();

            keysList = c.getWordFre(input_path, ref wordList);

            lines = c.getHangNum(input_path);

           
        }



The test ran as shown above, with no problems

  • Test for getWordNum1
[TestMethod]
 public void getWordNum1()
        {
            int words;
            int m = 3;
            string input_path = "C:/Users/罗伟诚/Desktop/input.txt", out_put = "C:/Users/罗伟诚/Desktop/out.txt";

            Hashtable wordList = new Hashtable();
            Hashtable wordList1 = new Hashtable();
            ArrayList keysList = new ArrayList();
            getFile c = new getFile();

            keysList = c.getWordFre(input_path, ref wordList);
            words = c.getWordNum(input_path);

        }
  • Test for getCharactersNum1
[TestMethod]
public void getCharactersNum1()
        {
            int characters;
            string input_path = "C:/Users/罗伟诚/Desktop/input.txt";

            Hashtable wordList = new Hashtable();
            ArrayList keysList = new ArrayList();
            getFile c = new getFile();

            keysList = c.getWordFre(input_path, ref wordList);
            characters = c.getCharactersNum(input_path);   // count the characters in the input file

        }

  • The three tests are written in one test class and run together; all of them passed
  • The three tests above all used the same file; the same procedure was then repeated with ten prepared test samples
  • Code coverage: because this is the offline version, no test coverage is available
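The tests above call the methods without comparing the result with an expected value. A minimal sketch of a check that does compare is given below. It is self-contained and does not use the project's code: CountValidLines is a hypothetical stand-in for getHangNum, and treating a "valid line" as one that contains non-whitespace text is an assumption. In an MSTest method the comparison would typically be written with Assert.AreEqual.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical example, not part of the original test project.
public static class LineCountCheck
{
    // Stand-in for getHangNum: counts lines that contain non-whitespace text.
    public static int CountValidLines(string path)
    {
        return File.ReadLines(path).Count(line => line.Trim().Length > 0);
    }

    // Writes a small known input to a temporary file and compares the result
    // with the value worked out by hand. Returns true when they match.
    public static bool Run()
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllLines(path, new[] { "hello world", "", "   ", "test file" });
            return CountValidLines(path) == 2;   // two lines contain text
        }
        finally
        {
            File.Delete(path);                   // clean up even if the check throws
        }
    }
}
```

Using a temporary file with known contents also removes the dependency on a fixed path on one machine's desktop.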

Third, exception handling

  • Exception handling for the paths: because the user has to enter the paths, I could not use a built-in check function directly to validate them, so I set a flag of my own
try
            {
                if (inputPathFla == 1 || outputPathFla == 1)
                {
                    Hashtable wordList = new Hashtable();
                    ArrayList keysList = new ArrayList();
                    getFile c = new getFile();

                    keysList = c.getWordFre(input_path, ref wordList);

                    lines = c.getHangNum(input_path);

                    words = c.getWordNum(input_path);

                    characters = c.getCharactersNum(input_path);

                    c.write(out_put, ref wordList, lines, words, characters, wordsOutNumFla, wordsOutNum, m, input_path);

                    Console.WriteLine("Finished writing the file; see {0}\n", out_put);
                }
                else
                {
                    Console.WriteLine("Use the -i and -o parameters to specify the input and output paths\n");
                }
            }
            catch (Exception)
            {
                Console.WriteLine("Please check that the input path is correct");
            }

  • This is the result when the path is entered correctly

  • This is the result when a wrong path is entered
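As an alternative to catching every exception, the paths could be checked before any work starts. The sketch below assumes the standard File.Exists and Directory.Exists checks are acceptable here. PathCheck, Validate, and the message texts are hypothetical.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of the original program.
public static class PathCheck
{
    // Returns null when both paths look usable, otherwise a message describing the problem.
    public static string Validate(string inputPath, string outputPath)
    {
        if (string.IsNullOrWhiteSpace(inputPath) || string.IsNullOrWhiteSpace(outputPath))
        {
            return "Use -i and -o to specify the input and output paths";
        }
        if (!File.Exists(inputPath))
        {
            return "Input file not found: " + inputPath;
        }
        // The output file may not exist yet, but its directory has to.
        string outputDir = Path.GetDirectoryName(Path.GetFullPath(outputPath));
        if (outputDir != null && !Directory.Exists(outputDir))
        {
            return "Output directory not found: " + outputDir;
        }
        return null;
    }
}
```

A caller could print the returned message and stop, which tells the user which path is wrong instead of giving one generic hint for every failure.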

Fourth, code improvements

  • At first I did not think of using a dictionary, because I wanted to see whether we could solve the problem with what we already knew. Strings and arrays alone did not give a good solution, though, and made the code long and jumbled
  • With a dictionary I need only one class: the data extracted from the text file goes into the dictionary, and every later operation works directly on the dictionary, which is very convenient
  • The method first used to count characters was very inefficient; it was later replaced with an expression-based approach, which improved the efficiency
  • The performance results and the most time-consuming function are as follows

Fifth, code review

  • At first I did not use regular expressions to extract words. At Zhang Peng's suggestion, I learned a little about regular expressions and applied them in the program. The code from before I used regular expressions was not saved, so only the current regular-expression version is attached
  • With regular expressions, processing is faster when there is more data to handle (so many thanks to classmate Zhang Peng for the reminder)
MatchCollection mc;
                    Regex rg = new Regex("[A-Za-z]+");    // regular expression that matches a word
                    mc = rg.Matches(line);
                    for (int i = 0; i < mc.Count - m + 1; i++)   // each starting position of an m-word phrase
                    {
                        string mcTmp = "";
                        int t = i;
                        for (int q = 0; q < m; q++)              // join m consecutive words, lower-cased
                        {
                            mcTmp += mc[t].Value.ToLower() + " ";
                            t++;
                        }
                        k.Add(mcTmp);
                    }
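The fragment above collects every phrase into a list, and the write method shown earlier prints each one with a count of 1. A minimal sketch that also counts repeated phrases is given below, assuming a Dictionary<string, int> and phrases taken from a single line. PhraseCounter and CountPhrases are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original getFile class.
public static class PhraseCounter
{
    private static readonly Regex WordPattern = new Regex("[A-Za-z]+");

    // Counts every run of m consecutive words on one line.
    public static Dictionary<string, int> CountPhrases(string line, int m)
    {
        var counts = new Dictionary<string, int>();
        if (m <= 0)
        {
            return counts;
        }
        MatchCollection matches = WordPattern.Matches(line);
        var window = new string[m];
        for (int i = 0; i + m <= matches.Count; i++)   // each starting position of a phrase
        {
            for (int q = 0; q < m; q++)
            {
                window[q] = matches[i + q].Value.ToLower();
            }
            string phrase = string.Join(" ", window);  // joined without a trailing space
            int current;
            counts.TryGetValue(phrase, out current);   // current stays 0 for a new phrase
            counts[phrase] = current + 1;
        }
        return counts;
    }
}
```

For the line "the cat sat the cat ran" with m = 2, the phrase "the cat" is counted twice and the other three phrases once each.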

Sixth, summary

Through this pair-programming assignment, I summed up the benefits of pair programming:

  • Mutual supervision makes it hard to slack off: the two of us have to cooperate, and laziness would delay progress
  • We can learn from each other: two programmers have different backgrounds and different ideas; he knows more in some areas and I know more in others, so we help each other improve
  • More eyes, fewer bugs: reviewing each other's work improves code quality and reduces bugs
  • Pair programming really can achieve the effect of 1 + 1 > 2


Origin www.cnblogs.com/lwcblogs/p/11656751.html