What is it like to have a beautiful girlfriend? Crawling a Zhihu topic with 220 million views

The text and pictures in this article are from the Internet and are for learning and communication purposes only, and do not have any commercial use. If you have any questions, please contact us for processing.

The following article is from the rookie learn Python

Author: rookie brother

Preface

For many people, having a beautiful girlfriend is a very happy thing. There is a very popular question on Zhihu: "What is it like to have a beautiful girlfriend?" It has been viewed 220 million times and is followed by more than 100,000 people.


Today I will walk everyone through scraping and analyzing some of the answers to this question, to get a feel for what it is like to have a beautiful girlfriend.


01. How to crawl

To get the data, we need to build the right Zhihu data-interface link so that we can retrieve the respondents' answers. The interface can be found as shown in the following figure:

[Figure: the answers request in the browser's developer tools, Network panel, XHR filter]

Open the developer tools, select the XHR filter in the Network panel, and find the JSON response whose name starts with answers. The data contains information such as the content of each respondent's answer. Next, copy the link of the interface and have the program request data from it. The interface may look complicated to construct, but it is not particularly so. The editor has already built the request link in the program, which is shown in the following figure:

[Figure: screenshot of the program that builds the request links]

In the program, a for loop builds the interface links one after another. Each link contains parameters such as include, limit, and offset. To scrape the answers to a different question, you only need to change the question ID, which is the numeric part of the link below:

https://www.zhihu.com/api/v4/questions/28997505/answers?
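The loop described above can be sketched roughly as follows. This is only an illustration, not the editor's original program: the function name build_answer_urls, the page size, and the value of include are assumptions, and only the link prefix and the parameter names include, limit, and offset come from the article.

```python
from urllib.parse import urlencode

API_ROOT = "https://www.zhihu.com/api/v4/questions"


def build_answer_urls(question_id, total, limit=5,
                      include="data[*].content,voteup_count,comment_count"):
    """Yield one interface link per page of answers.

    `include` is a hypothetical field list; copy the real value from the
    request seen in the browser's Network panel.
    """
    for offset in range(0, total, limit):
        query = urlencode({"include": include, "limit": limit, "offset": offset})
        yield f"{API_ROOT}/{question_id}/answers?{query}"


# Three pages of five answers each for the question used in this article.
urls = list(build_answer_urls(28997505, total=15))
for url in urls:
    print(url)
```

Changing the first argument is all it takes to point the same loop at another question.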

The program fetches the JSON data returned by the interface with the self.get_json function, then parses that JSON and extracts the data with the self.get_comments function. The figure below shows self.get_comments.
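The fetching half of that pair is not shown in the article, so here is a minimal sketch of what a get_json step might look like. It assumes nothing about the original beyond "request a link, return the parsed JSON", and it uses the standard library's urllib in place of whatever HTTP client the original program uses. The User-Agent value and the opener parameter are assumptions added for illustration.

```python
import json
from urllib.request import Request, urlopen


def get_json(url, opener=urlopen, timeout=10):
    """Request one interface link and return the decoded JSON as a dict.

    `opener` defaults to urllib's urlopen; it is a parameter only so the
    function can be exercised without touching the network.
    """
    # A browser-like User-Agent is an assumption; sites may reject bare clients.
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with opener(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

In real use, calling get_json on each constructed link gives back one page of answers at a time. Zhihu may also require cookies or other headers, which this sketch does not attempt to cover.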

[Figure: screenshot of the self.get_comments function]

In the self.get_comments function, the HTML fragment inside the JSON data is parsed with the BeautifulSoup library to obtain the text of each answer and the pictures the respondent uploaded. Fields such as the respondent's name and gender can be read directly from the JSON data.
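To make the parsing step concrete, here is a small sketch of the same idea. Note the swap: it uses the standard library's html.parser instead of BeautifulSoup so that it runs without extra installs. The JSON field names (author, name, gender, content, voteup_count, comment_count) and the use of the img src attribute are assumptions about the interface's response, not something the article confirms.

```python
from html.parser import HTMLParser


class AnswerHTMLParser(HTMLParser):
    """Collect visible text and image links from one answer's HTML."""

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            src = dict(attrs).get("src")
            if src and src not in self.image_urls:
                self.image_urls.append(src)

    def handle_data(self, data):
        if data.strip():
            self.text_parts.append(data.strip())


def parse_answer(item):
    """Turn one answer object from the JSON data into a flat record."""
    parser = AnswerHTMLParser()
    parser.feed(item.get("content", ""))
    parser.close()
    author = item.get("author", {})
    return {
        "name": author.get("name"),       # read directly from the JSON
        "gender": author.get("gender"),   # read directly from the JSON
        "likes": item.get("voteup_count", 0),
        "comments": item.get("comment_count", 0),
        "text": " ".join(parser.text_parts),
        "images": parser.image_urls,
    }


# A made-up answer object shaped like the assumed response.
sample = {
    "author": {"name": "demo_user", "gender": 1},
    "voteup_count": 12,
    "comment_count": 3,
    "content": '<p>First line</p><figure><img src="https://example.com/a.jpg"></figure><p>Second line</p>',
}
record = parse_answer(sample)
print(record)
```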

After running the above program, content from more than 3,600 respondents was collected. The information obtained is as follows:

[Figure: sample of the collected answer data]

02. Mining data

With the data in hand, let's do a simple analysis of the answers and see what can be learned from them.

1). Gender analysis

While crawling the answers, the editor found that the respondents are not all men. A visualization gives a more intuitive view of the respondents' gender distribution.


The program first counts the respondents' genders with the Counter class from the built-in collections library, and then displays the result as a pie chart.
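The counting step can be sketched like this. The gender coding (1 for male, 0 for female, -1 for unknown) and the function name gender_shares are assumptions made for the example. The percentages it returns are what a pie chart would then plot.

```python
from collections import Counter

# Assumed coding of the gender field; check it against the real data.
GENDER_LABELS = {1: "male", 0: "female", -1: "unknown"}


def gender_shares(genders, exclude_unknown=True):
    """Count genders with Counter and return each label's share in percent."""
    labels = [GENDER_LABELS.get(g, "unknown") for g in genders]
    if exclude_unknown:
        labels = [label for label in labels if label != "unknown"]
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: round(100 * n / total, 2) for label, n in counts.items()}


# Made-up input: four men, one woman, one respondent of unknown gender.
shares = gender_shares([1, 1, 1, 0, -1, 1])
print(shares)
```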

[Figure: pie chart of the respondents' gender distribution]

As the figure above shows, once respondents of unknown gender are excluded, men make up the overwhelming majority, but women still account for 8.38% of the respondents.

2). Number of likes and comments

The more likes and comments an answer receives, the more readers agree with and support it. Next, we sort all the answers by number of likes and look at the distribution of likes and comments among the ten most-liked answers.
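The sorting step is short enough to sketch directly. The record layout (name, likes, comments) and the numbers below are made up for illustration. The two lists produced at the end are the kind of series a bar-plus-line chart would take.

```python
def top_by_likes(records, n=10):
    """Return the n records with the most likes, highest first."""
    return sorted(records, key=lambda r: r["likes"], reverse=True)[:n]


# Made-up records; real ones would come from the parsed answers.
records = [
    {"name": "a", "likes": 50, "comments": 7},
    {"name": "b", "likes": 900, "comments": 120},
    {"name": "c", "likes": 300, "comments": 150},
]

top = top_by_likes(records, n=2)
names = [r["name"] for r in top]
likes = [r["likes"] for r in top]        # series for the line (right axis)
comments = [r["comments"] for r in top]  # series for the bars (left axis)
print(names, likes, comments)
```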

[Figure: number of comments (bars, left axis) and number of likes (line, right axis) for the ten most-liked answers]

In the figure above, the bars show the number of comments, read against the left axis, and the line shows the number of likes, read against the right axis. The two respondents "Tang Jia Yunye's soup" and "Zheng Zheng" received the most comments, but while "Tang Jia Yunye's soup" also received a high number of likes, "Zheng Zheng" received fewer.

3). Distribution of word cloud

The content of the answers best reflects everyone's real experience and feelings. Let's look at the keywords in the answers through a word cloud.


The program first segments the answers into words with the jieba library, and then draws the word cloud with the stylecloud library.
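Between segmenting and drawing sits a counting step, sketched below. To keep the example runnable without extra installs, the segmenter is a parameter that defaults to splitting on whitespace, which is a stand-in and not what the article's program does. The function name keyword_counts, the stopword list, and the minimum word length are all assumptions.

```python
from collections import Counter


def keyword_counts(answers, segment=str.split, stopwords=frozenset(), min_length=2):
    """Count how often each word appears across all answers.

    `segment` turns one answer into a list of words. Whitespace splitting
    is only a placeholder for a real Chinese word segmenter.
    """
    counts = Counter()
    for answer in answers:
        for word in segment(answer):
            word = word.strip()
            if len(word) >= min_length and word not in stopwords:
                counts[word] += 1
    return counts


# Made-up English answers so the whitespace placeholder works.
counts = keyword_counts(
    ["my girlfriend is beautiful", "a beautiful girlfriend and me"],
    stopwords={"is", "and", "my", "me"},
)
print(counts.most_common(5))
```

With jieba installed, passing segment=jieba.lcut should apply the same counting to Chinese text. The filtered words can then be handed to the word cloud library for drawing.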

[Figure: word cloud of the answers]

In the word cloud, the keywords include girlfriend, beautiful, good-looking, like, and us. The editor grows more envious with every look.

4). Who is the best answerer

As for who is the best answerer, Zhihu has already given us the answer. The respondent at the top of this question, Tang Jia Yunye's soup, has the most likes and the most comments, and so should count as the best answerer. The author describes his girlfriend with great care: not only her outward beauty but, more importantly, her beauty of soul and her versatility, including calligraphy and cooking.


5). Who is the photo fanatic

Many respondents posted pictures of themselves and their girlfriends. Let's see who posted the most pictures and is the biggest photo fanatic.


The program creates a separate folder for each respondent, named after that respondent, then downloads all the pictures the respondent posted and saves them in that folder.
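A rough sketch of the folder-per-respondent idea follows. It assumes the pictures have already been downloaded as bytes, so it leaves the network part out. The function names, the numbered .jpg file names, and the character clean-up rule are assumptions made for illustration.

```python
import re
from pathlib import Path


def safe_folder_name(name):
    """Replace characters that are not allowed in folder names."""
    cleaned = re.sub(r'[\\/:*?"<>|]', "_", name).strip()
    return cleaned or "anonymous"


def save_pictures(base_dir, respondent, pictures):
    """Write each picture (as bytes) into a folder named after the respondent."""
    folder = Path(base_dir) / safe_folder_name(respondent)
    folder.mkdir(parents=True, exist_ok=True)
    for index, data in enumerate(pictures, start=1):
        # The .jpg extension is an assumption; real files may differ.
        (folder / f"{index:03d}.jpg").write_bytes(data)
    return folder


def most_pictures(base_dir):
    """Return (folder name, picture count) for the folder with the most files."""
    counts = {
        p.name: sum(1 for f in p.iterdir() if f.is_file())
        for p in Path(base_dir).iterdir()
        if p.is_dir()
    }
    return max(counts.items(), key=lambda kv: kv[1]) if counts else None
```

Once every respondent's pictures are saved this way, counting the files in each folder is enough to find who posted the most.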

The count shows that the user named Instant Orangutan posted 127 pictures in this question, making this user the undisputed photo fanatic. As for the photos themselves, you can check them out yourself. I won't comment further.


Origin blog.csdn.net/m0_48405781/article/details/114638687