Assignment 4 - Pair Programming

GitHub project address (partner's repository): address
Pair programming partner's blog address: address
Assignment requirements address: address

1.1 Pairing process

Here Insert Picture Description

1.2 PSP table

PSP2.1 | Personal Software Process Stages | Estimated time (minutes) | Actual time (minutes)
Planning | Planning | 20 | 20
  · Estimate | Estimate how much time the task requires | 25 | 25
Development | Development | 890 | 1290
  · Analysis | Requirements analysis (including learning new technologies) | 60 | 90
  · Design Spec | Generate design documents | 30 | 30
  · Design Review | Design review (reviewing the design documents with colleagues) | 20 | 20
  · Coding Standard | Coding standard (set appropriate conventions for the current development) | 10 | 10
  · Design | Detailed design | 30 | 60
  · Coding | Detailed coding | 600 | 900
  · Code Review | Code review | 60 | 60
  · Test | Testing (self-test, modify the code, submit modifications) | 30 | 60
Reporting | Reporting | 30 | 30
  · Test Report | Test report | 30 | 30
  · Size Measurement | Size measurement (compute the workload) | 20 | 20
  · Postmortem & Process Improvement Plan | Postmortem and process improvement plan | 60 | 60
Total | | 1025 | 1415

2.1 Approach

The assignment is divided into three stages, so we analyzed the requirements of each stage in turn before starting on it.

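  • Basic functions
    The basic functions are to count the number of characters, the number of valid lines, the total number of words, the number of distinct valid words and their frequencies, and to output the ten most frequent words in the specified order. The difficult and key part is deciding which words are valid and stripping the words out of the text. Our approach is to use a regular expression whose conditions select strings that start with English letters and are at least 4 characters long, and that must not start with a digit.
  • Extended function: command-line parsing
    The extended function requires a command-line program that takes parameter options, like a Linux shell command. The difficulty lies in parsing the command-line arguments. We originally planned to inspect the args parameter of the Main entry point in order, comparing the arguments one by one to decide which functions to run. During implementation, however, we found that the assignment has both required and optional parameters and that the parameters may appear in any order, so this approach no longer worked. After asking classmates and consulting references, we used the NuGet CommandLineParse tool to parse the command-line arguments.
  • Extended function: window program
    Before implementing the window program, we packaged the computation core of the second-version extended function as a DLL class library and referenced the DLL from the window program, which made the program easier to write.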
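The word rule from the basic functions can be sketched as a small helper. This is a minimal illustration under stated assumptions, not the project's code: the names WordMatcher and ExtractWords are hypothetical, and the pattern assumes a valid word is at least four ASCII letters followed by any number of ASCII letters or digits.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class WordMatcher
{
    // Assumed rule: a word boundary, at least four ASCII letters,
    // then any number of ASCII letters or digits.
    private static readonly Regex WordPattern =
        new Regex(@"\b[a-zA-Z]{4,}[a-zA-Z0-9]*");

    // Returns every substring of text that satisfies the rule, in order.
    public static List<string> ExtractWords(string text)
    {
        var words = new List<string>();
        foreach (Match match in WordPattern.Matches(text))
        {
            words.Add(match.Value);
        }
        return words;
    }
}
```

With this pattern, "file123" counts as a word, while "abc" (too short) and "123file" (starts with a digit) do not, because the word boundary cannot fall between a digit and a letter.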

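2.2 Design and implementation process

We designed two classes. The CalcCore class is responsible for the statistics and contains 5 functions; the Options class is responsible for parsing the command-line arguments. There are no dependencies between the functions or between the classes.
Figure: the CalcCore class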
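To illustrate what order-independent parsing with required and optional parameters involves, here is a hand-written sketch. It is not the NuGet tool we used and not our Options class: the flags -i, -n, -m and -o, the default values, and the names ParsedOptions and ArgumentParser are all assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;

public sealed class ParsedOptions
{
    public string InputPath { get; set; } = "";   // required, hypothetical -i
    public int TopCount { get; set; } = 10;       // optional, hypothetical -n
    public int PhraseLength { get; set; } = 1;    // optional, hypothetical -m
    public string OutputPath { get; set; } = "";  // optional, hypothetical -o
}

public static class ArgumentParser
{
    public static ParsedOptions Parse(string[] args)
    {
        // Collect "-flag value" pairs first so their order does not matter.
        var values = new Dictionary<string, string>();
        for (int i = 0; i < args.Length; i += 2)
        {
            bool isFlag = args[i].StartsWith("-", StringComparison.Ordinal);
            if (!isFlag || (i + 1 >= args.Length))
            {
                throw new ArgumentException("Expected a flag followed by a value at position " + i);
            }
            values[args[i]] = args[i + 1];
        }

        // The required option is checked only after everything has been read.
        if (!values.ContainsKey("-i"))
        {
            throw new ArgumentException("Missing required option -i");
        }

        var options = new ParsedOptions { InputPath = values["-i"] };
        if (values.TryGetValue("-n", out string n))
        {
            options.TopCount = int.Parse(n);
        }
        if (values.TryGetValue("-m", out string m))
        {
            options.PhraseLength = int.Parse(m);
        }
        if (values.TryGetValue("-o", out string o))
        {
            options.OutputPath = o;
        }
        return options;
    }
}
```

Calling ArgumentParser.Parse(new[] { "-n", "5", "-i", "a.txt" }) and ArgumentParser.Parse(new[] { "-i", "a.txt", "-n", "5" }) gives the same options, which is exactly what comparing args by position could not do.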

2.3 Program structure diagram and flowchart

Program structure diagram

Here Insert Picture Description

Command-line program flowchart

Here Insert Picture Description

2.4 Unit tests

In the unit tests we designed two test cases for each function.

The test code is shown in the figures:

Here Insert Picture Description
Here Insert Picture Description

The txt file used for testing is shown in the figure:

Here Insert Picture Description
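Since the test code survives only as images, here is a framework-free sketch of what "two test cases per function" can look like for the line counter. It is an assumption-laden illustration: LineCounter.CountNonEmptyLines is a hypothetical stand-in for CalcLine, and the two inputs are made up rather than taken from our test file.

```csharp
using System;
using System.IO;

public static class LineCounter
{
    // Stand-in for the line-counting function: counts lines that are not empty.
    public static int CountNonEmptyLines(string path)
    {
        int count = 0;
        foreach (string line in File.ReadLines(path))
        {
            if (line.Length > 0)
            {
                count++;
            }
        }
        return count;
    }
}

public static class LineCounterChecks
{
    // Writes the content to a temporary file, runs the function, and compares.
    public static void Check(string content, int expected)
    {
        string path = Path.GetTempFileName();
        try
        {
            File.WriteAllText(path, content);
            int actual = LineCounter.CountNonEmptyLines(path);
            if (actual != expected)
            {
                throw new Exception("Expected " + expected + " but got " + actual);
            }
        }
        finally
        {
            File.Delete(path);
        }
    }

    public static void RunAll()
    {
        Check("first line\n\nsecond line\n", 2); // the empty middle line is skipped
        Check("", 0);                            // an empty file has no valid lines
    }
}
```

In a real test project each Check call would become its own test method, so that a failure names the case that broke.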

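3. Coding conventions

Pascal: the first letter of every word is capitalized.
A common practice is: all type/class/function names use the Pascal form, and all variables use …
Classes/types/variables: nouns or compound nouns, such as Member or ProductInfo. For example, the number of words is named CountOfWord.
Functions are named with a verb or a verb-object compound. For example, the method that counts lines is named CalcLine.
Indentation: Tab is set to 4 spaces.
Use parentheses to make precedence explicit in complex conditional expressions.
Braces: { and } each go on their own line.
Always assign a default value when initializing a variable.
Underscores are used in names in the window program.
Comments: every method of the computation core states its purpose, its parameters, and why it is done this way.
Error handling: any operation that is not covered must have matching exception handling.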
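A short sketch of how several of these rules look together in code. The class TextInfo and the method CalcLength are hypothetical and are not part of the project; they only illustrate the naming, brace, parenthesis, initialization and error-handling rules.

```csharp
using System;
using System.IO;

public class TextInfo
{
    // Purpose: return the number of characters in a file.
    // Parameter: path is the file to read.
    // Why: a missing file is reported and counted as 0 so callers need no try/catch.
    public int CalcLength(string path)
    {
        int result = 0; // default value assigned at initialization
        try
        {
            string content = File.ReadAllText(path);
            if ((content != null) && (content.Length > 0))
            {
                result = content.Length;
            }
        }
        catch (IOException)
        {
            // Covers both a missing file and a missing directory.
            Console.WriteLine("Cannot read file: " + path);
        }
        return result;
    }
}
```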

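4. Mutual code review

  • Although we set conventions, I still have some habits that break them. For example, I like to name FileStream and StreamReader objects fs and sr, which does not follow the conventions. Since a function usually contains only one FileStream and one StreamReader object, my partner did not insist that I change it.
  • My partner and I are both used to commenting only what a function is for, without comments on the specific steps, so when merging code we always had to ask each other about the reasoning.
  • My partner's functions included writing to the file. I think a function should do one thing, so when integrating the code I moved the file writing into the main function.
  • My partner's sorting for the word-frequency statistics was too verbose. After talking with classmates, we found that sorting with Linq greatly reduces the amount of code and the development difficulty.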
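The last two points can be sketched together: a function that only computes the sorted result, written with Linq method syntax, leaving printing and file writing to the caller. The names FrequencySorter and TopN are hypothetical, and the ordinal tie-break is an assumption; our own code uses the default string ordering.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class FrequencySorter
{
    // Returns at most n entries: higher counts first, ties ordered by key.
    // It only computes the result; it does not print or write files.
    public static List<KeyValuePair<string, int>> TopN(Dictionary<string, int> counts, int n)
    {
        return counts
            .OrderByDescending(pair => pair.Value)
            .ThenBy(pair => pair.Key, StringComparer.Ordinal)
            .Take(n) // Take returns everything when n exceeds the number of entries
            .ToList();
    }
}
```

The caller, for example the main function, can then decide whether to print the list, write it to a file, or show it in the window program.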

5. Performance analysis

We found that the most expensive function in the program is the word-frequency function.
Here Insert Picture Description
Within it, the function that gets the number of elements in the MatchCollection accounts for the largest share.
Here Insert Picture Description
So we modified the code to reduce the number of calls to that function.
Here Insert Picture Description
Honestly, I have not figured out why the percentage for two calls is almost the same as for nearly three million calls.
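One possible explanation, offered as an assumption rather than something we measured: in .NET, Regex.Matches returns a MatchCollection that is filled in lazily, and the first read of Count makes the regex engine scan the whole input. Later reads reuse the stored matches. If so, the profiler is charging the matching work itself to Count, and that work is the same whether Count is read twice or millions of times. The sketch below shows the change of reading Count once; the names CountHoisting and CountWords are hypothetical.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CountHoisting
{
    public static Dictionary<string, int> CountWords(string text)
    {
        var counts = new Dictionary<string, int>();
        MatchCollection matches = Regex.Matches(text, @"\b[a-zA-Z]{4,}\w*");

        // The first read of Count forces the whole input to be matched;
        // storing it avoids asking the collection again on every iteration.
        int total = matches.Count;
        for (int i = 0; i < total; i++)
        {
            string word = matches[i].Value;
            counts.TryGetValue(word, out int current); // current is 0 for a new word
            counts[word] = current + 1;
        }
        return counts;
    }
}
```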

6. Code description

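  • CalcChar takes the file path, reads all characters, strips the Chinese characters, and returns the length of the remaining string
        // Requires: using System; using System.IO; using System.Text.RegularExpressions;
        /// <summary>
        /// Counts the characters in the file, excluding Chinese characters.
        /// </summary>
        /// <param name="path">Path of the file to read</param>
        /// <returns>Number of characters left after Chinese characters are removed</returns>
        public int CalcChar(string path)
        {
            int charNum = 0;
            string rest, str;
            using (FileStream fs = new FileStream(path, FileMode.Open))
            using (StreamReader sr = new StreamReader(fs))
            {
                str = sr.ReadToEnd();
            }
            // Remove characters in the range U+4E00 to U+9FA5 (Chinese characters)
            string pattern = @"[\u4e00-\u9fa5]";
            rest = Regex.Replace(str, pattern, "");
            charNum = rest.Length;
            Console.WriteLine("Total characters: " + charNum);
            return charNum;
        }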
  • CalcWords takes the file path, uses a regular expression to get all eligible words, and returns the number of words
        /// <summary>
        /// Counts the total number of words.
        /// </summary>
        /// <param name="path">Path of the file to read</param>
        /// <returns>Number of valid words in the file</returns>
        public int CalcWords(string path)
        {
            // At least four letters at the start, then any word characters.
            // The range is A-Z on purpose: A-z would also match the punctuation between Z and a.
            string tool = @"\b[a-zA-Z]{4,}\w{0,}";
            string rest;
            using (FileStream fileStream = new FileStream(path, FileMode.Open))
            using (StreamReader streamReader = new StreamReader(fileStream))
            {
                rest = streamReader.ReadToEnd();
            }
            MatchCollection mc = Regex.Matches(rest, tool);
            int res = mc.Count;
            Console.WriteLine("Total words: " + res);
            return res;
        }
  • CalcLine takes the file path and reads the file line by line; empty lines are not counted. It returns the number of valid lines
        /// <summary>
        /// Counts the lines in the file, skipping empty lines.
        /// </summary>
        /// <param name="path">Path of the file to read</param>
        /// <returns>Number of non-empty lines</returns>
        public int CalcLine(string path)
        {
            int res = 0;
            using (FileStream fileStream = new FileStream(path, FileMode.Open))
            using (StreamReader streamReader = new StreamReader(fileStream))
            {
                string line = "";
                while ((line = streamReader.ReadLine()) != null)
                {
                    // Empty lines are not counted
                    if (line.Length > 0)
                    {
                        res += 1;
                    }
                }
            }
            Console.WriteLine("Valid lines: " + res);
            return res;
        }
  • CalcWordFrequence takes the file path and a parameter n, uses a regular expression to get all eligible words and stores them in a dictionary, sorts the dictionary with Linq, creates a new dictionary, and puts the first n key-value pairs into it (if n is greater than the number of keys, all key-value pairs go into the new dictionary). It returns the new dictionary
        // Requires: using System.Collections.Generic; using System.Linq;
        /// <summary>
        /// Computes word frequencies.
        /// </summary>
        /// <param name="path">Path of the file to read</param>
        /// <param name="n">Maximum number of words to return</param>
        /// <returns>The n most frequent words with their counts</returns>
        public Dictionary<string, int> CalcWordFrequence(string path, int n)
        {
            string tool = @"\b[a-zA-Z]{4,}\w{0,}";
            Dictionary<string, int> keyValuePairWord = new Dictionary<string, int>();
            string rest;
            using (FileStream fs = new FileStream(path, FileMode.Open))
            using (StreamReader sr = new StreamReader(fs))
            {
                rest = sr.ReadToEnd();
            }
            MatchCollection mc = Regex.Matches(rest, tool);
            // Read Count once instead of on every loop iteration
            int number = mc.Count;

            for (int i = 0; i < number; i++)
            {
                string tmp = mc[i].ToString();
                if (!keyValuePairWord.ContainsKey(tmp))
                {
                    keyValuePairWord.Add(tmp, 1);
                }
                else
                {
                    keyValuePairWord[tmp]++;
                }
            }

            // Highest count first; ties are ordered by the word itself
            var res = from pair in keyValuePairWord
                      orderby pair.Value descending, pair.Key ascending
                      select pair;

            Dictionary<string, int> result = new Dictionary<string, int>();
            int j = 0;
            foreach (var item in res)
            {
                // Stop once n pairs have been copied
                if (j == n)
                {
                    break;
                }
                result.Add(item.Key, item.Value);
                j++;
                Console.WriteLine(item.Key + ":" + item.Value);
            }
            return result;
        }
  • PhraseStat takes the file path and a parameter m, reads the file line by line, checks each line for groups of m consecutive valid words, stores each group in a dictionary as a string, and returns the dictionary
        /// <summary>
        /// Counts phrases of m consecutive words.
        /// </summary>
        /// <param name="path">Path of the file to read</param>
        /// <param name="m">Number of words in a phrase</param>
        /// <returns>Each phrase with the number of times it occurs</returns>
        public Dictionary<string, int> PhraseStat(string path, int m)
        {
            Dictionary<string, int> keyValuesPairPhrase = new Dictionary<string, int>();

            string tool1 = @"\b[a-zA-Z]\w{0,}";
            using (FileStream fs = new FileStream(path, FileMode.Open))
            using (StreamReader sr = new StreamReader(fs))
            {
                string line = "";
                while ((line = sr.ReadLine()) != null)
                {
                    MatchCollection mc = Regex.Matches(line, tool1);
                    // Slide a window of m words across the line
                    for (int i = 0; i < mc.Count - m + 1; i++)
                    {
                        string tmp = "";
                        bool valid = true;
                        for (int j = i; j < i + m; j++)
                        {
                            // The window is dropped if any word is shorter than 4 characters
                            if (mc[j].Length < 4)
                            {
                                valid = false;
                                break;
                            }
                            tmp += mc[j].ToString() + " ";
                        }
                        if (!valid)
                        {
                            continue;
                        }
                        if (!keyValuesPairPhrase.ContainsKey(tmp))
                        {
                            keyValuesPairPhrase.Add(tmp, 1);
                        }
                        else
                        {
                            keyValuesPairPhrase[tmp]++;
                        }
                    }
                }
            }

            Dictionary<string, int> result = new Dictionary<string, int>();

            foreach (var item in keyValuesPairPhrase)
            {
                Console.WriteLine(item.Key + ":" + item.Value);
                result.Add(item.Key, item.Value);
            }
            return result;
        }

7. Summary

I gained a lot from this assignment. First, I consolidated my C# knowledge of file reading, regular expressions, Linq, and dictionaries. Before, I had only studied them; it took repeated use in practice to master them. Second, in pair programming two people bring different ideas, which gave me good inspiration in the design. Of course, cooperation means being strict with yourself: functions need explanatory comments, and names have to follow the conventions. Finally, mutual code review is a good way to find each other's blind spots in the design. Design errors that you take for granted are easy to overlook, and a partner is good at catching those bugs. I think this pairing was a case of 1 + 1 > 2.

Origin www.cnblogs.com/guduxuanze2014/p/11666327.html