For the assignment requirements, see [ https://edu.cnblogs.com/campus/nenu/2019fall/homework/6583 ]
Word frequency statistics SPEC
1. Key points and difficulties of the project
(1) Function 1: reading the file
I wrote this in C. I had never done this kind of file-reading exercise before, so I searched online for relevant material, wrote some statements based on it, and used the fopen function.
The main code:
in = fopen("book.txt","r");
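The fopen call above can be extended into a small helper that checks for failure and reads the whole file. This is only a minimal sketch: the function name read_whole_file and the starting buffer size are assumptions made for illustration, not part of the original program, which reads into a fixed global array instead.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read the whole file at `path` into a freshly
 * allocated, NUL-terminated buffer. Returns NULL on failure; on success
 * the number of bytes read is stored in *len. The caller frees the buffer. */
char *read_whole_file(const char *path, size_t *len)
{
    FILE *in = fopen(path, "r");
    if (in == NULL) {          /* fopen returns NULL if the path is wrong */
        perror(path);
        return NULL;
    }

    size_t cap = 4096;         /* assumed starting capacity, grown on demand */
    size_t used = 0;
    char *buf = malloc(cap);
    if (buf == NULL) {
        fclose(in);
        return NULL;
    }

    size_t got;
    /* Always leave one byte free for the terminating NUL. */
    while ((got = fread(buf + used, 1, cap - used - 1, in)) > 0) {
        used += got;
        if (cap - used <= 1) { /* buffer is full: double it */
            char *bigger = realloc(buf, cap * 2);
            if (bigger == NULL) {
                free(buf);
                fclose(in);
                return NULL;
            }
            buf = bigger;
            cap *= 2;
        }
    }

    fclose(in);
    buf[used] = '\0';
    *len = used;
    return buf;
}
```

Checking the return value of fopen makes a wrong storage path show up as an error message instead of a run with no result, and growing the buffer avoids having to guess an array size in advance.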
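(2) Function 2: main code
FILE *in, *out;
int i, l, j, z;
in = fopen("book.txt", "r");     /* input text */
out = fopen("books.txt", "w");   /* output file for the results */
l = fread(s, 1, 7000010, in);    /* read the whole text into s; l is the byte count */
for (i = 0; i < l; ++i)
    s[i] = tolower(s[i]) - 'a';  /* lowercase, then map letters to 0..25 */
tot = i = 0;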
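(3) Function 3: main code
FILE *in, *out;
int i, l, j, z;
in = fopen("book.txt", "r");
out = fopen("books.txt", "w");
l = fread(s, 1, 7000010, in);
for (i = 0; i < l; ++i)
    s[i] = tolower(s[i]) - 'a';  /* lowercase, then map letters to 0..25 */
tot = i = 0;
while (i < l) {
    p = 0;                       /* start each word at the trie root */
    while (find(s[i])) {
        if (!b[p][s[i]])
            b[p][s[i]] = (++tot);  /* create the child node on first use */
        p = b[p][s[i]];
        i++;
    }
    num[p]++;                    /* one more word ends at node p */
    i++;                         /* skip the separator */
}
n = 0;
num[0] = 0;                      /* discard counts collected at the root */
dfs(0, 0);  // depth-first search function
/* write every word to the output file, highest count first */
for (i = 60000; i >= 1; --i)
    for (j = first[i]; j != 0; j = next[j])
        fprintf(out, "%s %d\n", x[j], i);
/* print only the 100 most frequent words to the screen */
z = 0;
for (i = 60000; i >= 1; --i)
    for (j = first[i]; j != 0 && z < 100; j = next[j]) {
        printf("%s %d\n", x[j], i);
        z++;
    }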
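The counting loop above stores words in a trie: b[p][c] is the child of node p along letter c, and num[p] counts the words that end at node p. The following is a minimal self-contained sketch of that idea. The array names b, num and tot follow the code above, but their declarations, the capacity MAX_NODES, and the helpers letter_index, count_words and word_count are assumptions, because the original does not show them. Unlike the original, this sketch leaves the text unmodified and assumes an ASCII-compatible character set.

```c
#include <ctype.h>
#include <stddef.h>

#define MAX_NODES 65536          /* assumed capacity; not stated in the original */

static int b[MAX_NODES][26];     /* b[p][c]: child of node p along letter c, 0 = none */
static int num[MAX_NODES];       /* num[p]: number of words ending at node p */
static int tot;                  /* nodes allocated so far; node 0 is the root */

/* Map an ASCII letter to 0..25 (case-insensitive), anything else to -1. */
static int letter_index(char ch)
{
    int c = tolower((unsigned char)ch);
    return (c >= 'a' && c <= 'z') ? c - 'a' : -1;
}

/* Insert every maximal run of letters in text[0..len) into the trie.
 * Returns the number of words seen, or -1 if the trie runs out of nodes. */
int count_words(const char *text, size_t len)
{
    int words = 0;
    size_t i = 0;
    while (i < len) {
        int p = 0;               /* start each word at the root */
        int c;
        while (i < len && (c = letter_index(text[i])) >= 0) {
            if (b[p][c] == 0) {
                if (tot + 1 >= MAX_NODES)
                    return -1;   /* arrays too small for this text */
                b[p][c] = ++tot; /* create the child node on first use */
            }
            p = b[p][c];
            i++;
        }
        if (p != 0) {            /* at least one letter was consumed */
            num[p]++;
            words++;
        }
        i++;                     /* skip the separator */
    }
    return words;
}

/* Return how many times `word` was inserted (0 if it never appeared). */
int word_count(const char *word)
{
    int p = 0;
    for (; *word != '\0'; ++word) {
        int c = letter_index(*word);
        if (c < 0 || b[p][c] == 0)
            return 0;
        p = b[p][c];
    }
    return num[p];
}
```

Returning -1 when the node array is exhausted makes an undersized array visible, rather than silently writing out of bounds.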
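(4) Function 4: main code
int main(int argc, const char * argv[]) {
    FILE *in, *out;
    int i, l, j, z;
    in = fopen("book123.txt", "r");
    out = fopen("books.txt", "w");
    l = fread(s, 1, 7000010, in);
    for (i = 0; i < l; ++i)
        s[i] = tolower(s[i]) - 'a';  /* lowercase, then map letters to 0..25 */
    tot = i = 0;
    while (i < l) {
        p = 0;                       /* start each word at the trie root */
        while (find(s[i])) {
            if (!b[p][s[i]])
                b[p][s[i]] = (++tot);  /* create the child node on first use */
            p = b[p][s[i]];
            i++;
        }
        num[p]++;                    /* one more word ends at node p */
        i++;                         /* skip the separator */
    }
    n = 0;
    num[0] = 0;                      /* discard counts collected at the root */
    dfs(0, 0);  // depth-first search function
    /* write every word to the output file, highest count first */
    for (i = 60000; i >= 1; --i)
        for (j = first[i]; j != 0; j = next[j])
            fprintf(out, "%s %d\n", x[j], i);
    /* print only the 100 most frequent words to the screen */
    z = 0;
    for (i = 60000; i >= 1; --i)
        for (j = first[i]; j != 0 && z < 100; j = next[j]) {
            printf("%s %d\n", x[j], i);
            z++;
        }
    fclose(in);
    fclose(out);
    return 0;
}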
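The two output loops above walk buckets indexed by frequency: first[i] is the head of a linked list of words that occur i times, and next[j] links to the following word in the same bucket. A minimal sketch of that structure follows. The names first, next and x and the limit 60000 follow the code above; MAX_WORDS, add_word and print_top are hypothetical, and the original fills the buckets inside dfs, which is not shown.

```c
#include <stdio.h>

#define MAX_COUNT 60000          /* largest frequency handled, as in the loops above */
#define MAX_WORDS 1000           /* assumed capacity for this sketch */

static int first[MAX_COUNT + 1]; /* first[c]: head of the list of words with count c, 0 = empty */
static int next[MAX_WORDS + 1];  /* next[j]: following word in the same bucket, 0 = end */
static const char *x[MAX_WORDS + 1]; /* x[j]: text of word j (index 0 is unused) */
static int n;                    /* number of words stored so far */

/* Put `word` into the bucket for `count`. Returns 0 on success, -1 if the
 * count is out of range or the arrays are full. */
int add_word(const char *word, int count)
{
    if (n >= MAX_WORDS || count < 1 || count > MAX_COUNT)
        return -1;
    n++;
    x[n] = word;
    next[n] = first[count];      /* push onto the front of the bucket list */
    first[count] = n;
    return 0;
}

/* Print at most `limit` words to `out`, highest count first.
 * Returns the number of lines written. */
int print_top(FILE *out, int limit)
{
    int z = 0;
    for (int i = MAX_COUNT; i >= 1 && z < limit; --i)
        for (int j = first[i]; j != 0 && z < limit; j = next[j]) {
            fprintf(out, "%s %d\n", x[j], i);
            z++;
        }
    return z;
}
```

Because the buckets are scanned from the largest count downward, no separate sorting step is needed; the cost is an array with one slot per possible frequency.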
Summary:
I had never done a similar exercise before. I originally wanted to write it in Python, but I felt that learning a new language now would keep me from finishing the assignment, so I chose C. For reading the text, the first thing that came to mind was fopen, which lets the file be read directly. When counting how many times each word appears, the arrays that hold the words and counts must be large enough, otherwise the program easily runs without producing any result. However, for the functions that require typing the file name of an English work on the command line, typing the name of a directory that stores English works for batch statistics, or counting the number of distinct words, I looked up related material and blogs online but did not understand them; they were too hard for me. I made many mistakes while writing the program, and I can see that my programming foundation is very weak, so I could not get started on the harder parts. Next I will learn Python step by step, hoping to find the answer to this assignment in that language.
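For the command-line input mentioned above, C passes the typed arguments to the program as argc and argv, so the file name does not have to be hard-coded. The following is only a sketch under assumptions: the function name run and the program name wf in the usage message are hypothetical, and it simply counts runs of letters instead of building the full frequency table.

```c
#include <ctype.h>
#include <stdio.h>

/* Count the words (runs of letters) in the file named by the first
 * command-line argument. Returns the word count, or -1 on a usage or
 * I/O error. `run` is a hypothetical name for this sketch. */
long run(int argc, char *argv[])
{
    if (argc < 2) {
        /* "wf" is a placeholder program name, not taken from the original */
        fprintf(stderr, "usage: %s <file>\n", argc > 0 ? argv[0] : "wf");
        return -1;
    }

    FILE *in = fopen(argv[1], "r");  /* argv[1] is the first typed argument */
    if (in == NULL) {
        perror(argv[1]);
        return -1;
    }

    long words = 0;
    int in_word = 0;
    int ch;
    while ((ch = fgetc(in)) != EOF) {
        if (isalpha(ch)) {
            if (!in_word) {          /* first letter of a new word */
                words++;
                in_word = 1;
            }
        } else {
            in_word = 0;
        }
    }

    fclose(in);
    return words;
}
```

A main function could forward its own parameters with a call such as run(argc, argv), so that the program processes whatever file name is typed after the program name.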
2. PSP
PSP stage | Estimated time (min) | Actual time (min) | Reason for the time difference |
Function 1 implementation | 60 | 170 | I was not familiar with how to read text or how to count words, and did not know how to use the relevant methods. |
Function 1 test | 30 | 60 | The storage path of the text was wrong, the array for storing words was sized incorrectly, and there were syntax problems. |
Function 2 implementation | 80 | 160 | The vocabulary grew and the number of distinct words had to be counted; entering the file name of an English work on the command line was not achieved, despite trying many approaches. |
Function 2 test | 30 | 40 | Counting word occurrences works, but some features are not implemented. |
Function 3 implementation | 90 | 75 | Entering the file name of an English work on the command line was not achieved, so typing the name of a directory of stored English works for batch statistics was not implemented either; I did not understand the blogs I read. |
Function 4 implementation | 100 | 120 | The program as written can only open a file directly and count its words; it cannot take the work as input from the command line. |