3 word frequency statistics

For the requirements of this assignment, see [ https://edu.cnblogs.com/campus/nenu/2019fall/homework/6583 ]

SPEC word frequency statistics 

One, key points and difficulties of the project

(1) Function 1: reading the file

I wrote the program in C. I had never read a file from a program before this assignment, so I searched online for relevant material and wrote a few statements based on the fopen function.

The main code:

in = fopen("book.txt","r");
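A call to fopen can fail, for example when the path is wrong, so it helps to check the result before reading. The following is a minimal sketch of reading a whole text file, not the code used in this assignment; the helper name read_whole_file and the growing buffer are assumptions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read the whole file at `path` into a newly
 * allocated, NUL-terminated buffer. Returns NULL on failure; on success
 * stores the number of bytes read in *len_out. The caller frees the buffer. */
char *read_whole_file(const char *path, size_t *len_out)
{
    assert(path != NULL && len_out != NULL);

    FILE *in = fopen(path, "r");
    if (in == NULL) {               /* wrong path or missing file */
        perror(path);
        return NULL;
    }

    size_t cap = 4096, len = 0;
    char *buf = malloc(cap);
    if (buf == NULL) {
        fclose(in);
        return NULL;
    }

    size_t got;
    /* Always leave one byte free for the terminating NUL. */
    while ((got = fread(buf + len, 1, cap - len - 1, in)) > 0) {
        len += got;
        if (len == cap - 1) {       /* buffer full: double it */
            char *bigger = realloc(buf, cap * 2);
            if (bigger == NULL) {
                free(buf);
                fclose(in);
                return NULL;
            }
            buf = bigger;
            cap *= 2;
        }
    }
    fclose(in);

    buf[len] = '\0';
    *len_out = len;
    return buf;
}
```

With a helper like this, a wrong storage path shows up as an error message instead of a program that silently produces no result.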

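(2) Function 2: main code

    FILE *in,*out;
    int i,l,j,z;
    in = fopen("book.txt","r");     // input text
    out = fopen("books.txt","w");   // result file

    // read the text into the buffer s; l is the number of bytes read
    l = fread(s,1,7000010,in);
    // lowercase every byte and map 'a'..'z' to 0..25
    // (the cast avoids passing a negative char to tolower)
    for (i = 0; i < l; ++i)
        s[i] = tolower((unsigned char)s[i])-'a';
    tot = i = 0;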
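The loop above turns each letter into a number from 0 to 25. A small sketch of that mapping as stand-alone functions is given here; map_letters and is_letter_code are hypothetical names, and is_letter_code is only a guess at what a letter test on the mapped bytes might look like, not the definition used in this program.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical letter test: after map_letters, a byte belongs to a word
 * exactly when its value lies in 0..25. */
int is_letter_code(int code)
{
    return code >= 0 && code < 26;
}

/* Map each byte in place: 'a'..'z' and 'A'..'Z' become 0..25,
 * every other byte becomes (char)-1, which is outside 0..25. */
void map_letters(char *s, size_t len)
{
    assert(s != NULL || len == 0);
    for (size_t i = 0; i < len; ++i) {
        /* Cast first: passing a negative char to tolower is undefined. */
        int c = tolower((unsigned char)s[i]);
        s[i] = (c >= 'a' && c <= 'z') ? (char)(c - 'a') : (char)-1;
    }
}
```

Marking every non-letter with one fixed value is a design choice of this sketch; it keeps the letter test independent of whether char is signed on the platform.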

 

 

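(3) Function 3: main code

    FILE *in,*out;
    int i,l,j,z;
    in = fopen("book.txt","r");
    out = fopen("books.txt","w");

    l = fread(s,1,7000010,in);
    for (i = 0; i < l; ++i)
        s[i] = tolower((unsigned char)s[i])-'a';   // letters become 0..25
    tot = i = 0;
    // insert every word into the trie b; tot is the number of nodes so far
    while (i < l)
    {
        p = 0;                          // start at the root
        // find() is defined elsewhere; the bound i < l keeps the scan inside the data read
        while (i < l && find(s[i]))
        {
            if (!b[p][s[i]])
                b[p][s[i]] = (++tot);   // create a new node
            p = b[p][s[i]];
            i++;
        }
        num[p]++;                       // one more occurrence of the word ending at node p
        i++;
    }

    n = 0;
    num[0] = 0;                         // the root itself is not a word
    dfs(0,0);                           // depth-first search function

    // write every word to the file, most frequent first
    for (i = 60000; i >= 1; --i)
        for (j = first[i]; j != 0; j = next[j])
            fprintf(out,"%s %d\n",x[j],i);
    // print the first 100 words of the same order on the screen
    z = 0;
    for (i = 60000; i >= 1; --i)
        for (j = first[i]; j != 0 && z < 100; j = next[j])
        {
            printf("%s %d\n",x[j],i);
            z++;
        }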
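The counting loop stores words in a trie: b[p][c] is the node reached from node p by letter c, and num[p] counts the words that end at node p. The following self-contained sketch shows the same idea on a small scale. The names child, count, trie_count_words and trie_lookup, and the capacity MAX_NODES, are assumptions for illustration and do not appear in the program above.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

#define MAX_NODES 10000           /* assumed capacity; size it for the input */

static int child[MAX_NODES][26];  /* child[p][c]: node after letter c, 0 if none */
static int count[MAX_NODES];      /* count[p]: words that ended at node p */
static int node_total;            /* nodes allocated so far; the root is node 0 */

/* Return 0..25 for an ASCII letter (ignoring case), -1 for anything else. */
static int letter_index(char ch)
{
    int c = tolower((unsigned char)ch);
    return (c >= 'a' && c <= 'z') ? c - 'a' : -1;
}

/* Forget all words counted so far. */
void trie_reset(void)
{
    memset(child, 0, sizeof child);
    memset(count, 0, sizeof count);
    node_total = 0;
}

/* Split `text` into runs of letters and count each run as one word. */
void trie_count_words(const char *text)
{
    assert(text != NULL);
    size_t i = 0;
    while (text[i] != '\0') {
        int p = 0;                                   /* start at the root */
        int c;
        while ((c = letter_index(text[i])) >= 0) {
            if (child[p][c] == 0) {
                assert(node_total + 1 < MAX_NODES);  /* arrays must be large enough */
                child[p][c] = ++node_total;
            }
            p = child[p][c];
            ++i;
        }
        if (p != 0)
            ++count[p];                              /* a word ended at node p */
        if (text[i] != '\0')
            ++i;                                     /* skip one separator */
    }
}

/* Return how many times `word` was counted; 0 if it never appeared. */
int trie_lookup(const char *word)
{
    int p = 0;
    for (; *word != '\0'; ++word) {
        int c = letter_index(*word);
        if (c < 0)
            return 0;
        p = child[p][c];
        if (p == 0)
            return 0;
    }
    return count[p];
}
```

For example, after trie_reset() and trie_count_words("The cat and the hat"), trie_lookup("the") returns 2. The capacity check makes an undersized array fail loudly instead of silently corrupting memory.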

 

 

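(4) Function 4: main code

int main(int argc, const char * argv[])
{
    FILE *in,*out;
    int i,l,j,z;
    in = fopen("book123.txt","r");
    out = fopen("books.txt","w");

    l = fread(s,1,7000010,in);
    for (i = 0; i < l; ++i)
        s[i] = tolower((unsigned char)s[i])-'a';   // letters become 0..25
    tot = i = 0;
    // insert every word into the trie b and count it in num
    while (i < l)
    {
        p = 0;
        // the bound i < l keeps the scan inside the data read
        while (i < l && find(s[i]))
        {
            if (!b[p][s[i]])
                b[p][s[i]] = (++tot);
            p = b[p][s[i]];
            i++;
        }
        num[p]++;
        i++;
    }

    n = 0;
    num[0] = 0;
    dfs(0,0);                           // depth-first search function

    // write every word to the file, most frequent first
    for (i = 60000; i >= 1; --i)
        for (j = first[i]; j != 0; j = next[j])
            fprintf(out,"%s %d\n",x[j],i);
    // print the first 100 words of the same order on the screen
    z = 0;
    for (i = 60000; i >= 1; --i)
        for (j = first[i]; j != 0 && z < 100; j = next[j])
        {
            printf("%s %d\n",x[j],i);
            z++;
        }

    fclose(in);
    fclose(out);
    return 0;
}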

 

 

Summary:

I had never worked on a problem like this before. I originally wanted to write the program in Python, but I felt that learning a new language now would leave me unable to finish the assignment, so I chose C. For reading the text, the first thing that came to mind was fopen, which lets the program read the file directly. When counting how often each word appears, every storage array has to be large enough, otherwise the program easily runs without producing any result. The assignment also asks for typing, on the command line, the name of a directory that holds English works and counting the unique words of those files in a batch. I looked up related material and blog posts online but could not understand them; this part was too hard for me. I made many mistakes while writing the program and saw that my programming foundation is very weak, to the point that I could not get started on the harder parts. Next I will learn Python step by step, hoping to find the answer to this assignment in that language.
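For the command-line part, the file name a user types after the program name arrives in main as argv[1]. The sketch below shows only that step and does not cover reading a directory; input_path is a hypothetical helper, and the fallback file name is an assumption taken from the hard-coded name used in this post.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: return the file name given as the first
 * command-line argument, or `fallback` when none was given. */
const char *input_path(int argc, const char *argv[], const char *fallback)
{
    assert(argc >= 0 && fallback != NULL);
    return (argc > 1 && argv[1] != NULL) ? argv[1] : fallback;
}
```

Inside main, the hard-coded call could then become in = fopen(input_path(argc, argv, "book.txt"), "r"); so that the program reads the named file when one is typed on the command line and book.txt otherwise.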

Two, PSP

PSP stage                  Expected time (min)  Actual time (min)  Analysis of the time difference
Function 1 implementation  60                   170                I was not familiar with how to read a text file or how to count words, and did not know how to use the relevant functions.
Function 1 test            30                   60                 The storage path of the text was wrong, the size of the array that stores the words was wrong, and there were syntax problems.
Function 2 implementation  80                   160                The vocabulary is larger and the number of unique words has to be counted. Entering the file name of the English work on the command line was not achieved; I tried many approaches without success.
Function 2 test            30                   40                 Counting how often each word appears works, but some features are not implemented.
Function 3 implementation  90                   75                 Entering the file name on the command line was not achieved, so entering a directory name on the command line and processing the stored files in a batch was not achieved either; I did not understand the blog posts I read.
Function 4 implementation  100                  120                With my current understanding, the program can only open a file directly and count its words; it cannot take its input from the command line.


Origin www.cnblogs.com/qiwh/p/11523930.html