Python||Error: TypeError: can only join an iterable

1. Problem description

The goal is to filter the segmented words in all_seg.txt and remove the stop words. The relevant code is as follows:

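# Read the segmented text, keeping one stripped line per list element.
with open('all_seg.txt', 'r', encoding='utf-8') as f:
    pkutest = [line.strip() for line in f]
# GBK encoding: a Chinese character encoding that covers both Simplified and Traditional characters. There is also "gb2312", which can store only Simplified characters.

# UTF-8 encoding: an internationally adopted encoding. If your site involves the languages of several countries, UTF-8 is the recommended choice because it suits internationalization.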

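# stopwords is assumed to be a collection of stop words loaded earlier (not shown here).
final = []
for n in pkutest:
    # Keep only the tokens of this line that are not stop words.
    res = []
    for n2 in n.split(" "):
        if n2 not in stopwords:
            res.append(n2)
    final.append(res)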

Then check the contents of the variable final. It is a list of lists, with one inner list of words per line of the file.
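The original output is not reproduced here. As a rough illustration, the sketch below runs the same filtering logic on two made-up lines and a made-up stop-word set (both are assumptions, not the real contents of all_seg.txt), to show the shape that final ends up with:

```python
# Hypothetical sample data standing in for all_seg.txt and the stop-word list.
pkutest = ["I really like the movie", "the plot is a bit slow"]
stopwords = {"the", "a", "is"}

final = []
for n in pkutest:
    # Keep only the tokens that are not stop words.
    res = [n2 for n2 in n.split(" ") if n2 not in stopwords]
    final.append(res)

print(final)
# [['I', 'really', 'like', 'movie'], ['plot', 'bit', 'slow']]
```

Using a set for the stop words is a design choice of this sketch: membership tests on a set stay fast even when the stop-word list is long.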

A DataFrame shows the data as a table, which is easier to read, so I converted it with final=pd.DataFrame(final) and looked at final again.

Next I tried to write the variable final to the file tingci.txt, but it failed with an error: TypeError: can only join an iterable

2. Error analysis and resolution

The error means that join needs an iterable inside its parentheses. But final is a list of lists of strings, so every element should be iterable. Why doesn't it work?
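For reference, here is a small sketch of what str.join accepts, using made-up values: a list of strings works, while a non-iterable value such as an integer raises the same TypeError as above.

```python
words = ["plot", "bit", "slow"]   # hypothetical sample tokens
joined = " ".join(words)          # a list of strings is fine

try:
    " ".join(0)                   # an int is not iterable, so join rejects it
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(joined)          # plot bit slow
print(error_message)   # can only join an iterable
```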

Is it really? Check with final.dtypes.

The slap in the face came as fast as a tornado~~~ Since final had already been turned into a DataFrame, I simply forced the values to str type before the for loop. You think the problem ends here? (I thought so too, but...)

When I ran the next piece of code, which counts how often each word occurs and writes the result to an Excel sheet, I got another error (so much for thinking it was over...)

The error: TypeError: 'int' object is not iterable
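The counting code is not shown in this post. As an assumption about what such a step might look like, the sketch below counts words with collections.Counter from the standard library on made-up data (the Excel export is left out). It also shows that trying to loop over an integer produces exactly this message.

```python
from collections import Counter

# Hypothetical filtered result, shaped like final (a list of lists of strings).
final = [["plot", "slow"], ["plot", "good"]]

counts = Counter()
for words in final:
    counts.update(words)          # each element must itself be iterable

try:
    for word in 0:                # looping over an int fails
        pass
    error_message = None
except TypeError as exc:
    error_message = str(exc)

print(counts["plot"])   # 2
print(error_message)    # 'int' object is not iterable
```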

What does it mean when fixing one bug produces a new one? It means the problem was never really solved (obviously), or at least not at its source T_T

The two errors are very similar: in both cases the input has the wrong type and cannot be iterated. That is a hint. After thinking it over, the problem seems to lie in the DataFrame step, because final.dtypes shows that every column has the object type, so final is no longer the plain list the loops expect. Iterating over a DataFrame yields its column labels (here the integers 0, 1, 2, ...) rather than its rows, which is how an int ends up where a list of words should be.

So why not try commenting out the statement final=pd.DataFrame(final)? Its only job is to display the contents, and it has little to do with processing the data.

Run the previous code again, and it succeeds~~~
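Why removing that line helps can be sketched as follows, assuming pandas is installed and using made-up data. A for loop over a DataFrame walks its column labels, not its rows, so join receives an integer. Keeping the table under a separate name (table is a hypothetical name chosen for this sketch) leaves final as a plain list of lists.

```python
import pandas as pd

# Hypothetical filtered result, shaped like final (a list of lists of strings).
final = [["plot", "bit", "slow"], ["good", "movie"]]

# Build the table for viewing only, under its own name.
table = pd.DataFrame(final)

# Iterating a DataFrame yields its column labels, here 0, 1, 2.
labels = list(table)

try:
    " ".join(labels[0])           # joining an integer label fails
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# final is still a list of lists, so joining each row works.
lines = [" ".join(words) for words in final]

print(labels)
print(error_message)
print(lines)
```

If the table is not needed later, simply not reassigning final, as described above, has the same effect.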

The reason I wrote down this tedious process of fixing one bug after another is to remind myself: when the bugs keep multiplying as you fix them, the problem probably has not been solved at its source. The cause may be a single trivial line of code, but its presence affects every later step of the program. Maybe this is the butterfly effect?!

OK, that is the end of this post.

Origin blog.csdn.net/Inochigohan/article/details/121186082