Young people don't follow martial ethics: using Python to scrape more than ten thousand Ma Baoguo videos from Bilibili for data analysis

You saw the title and, with a snap, in you came!

If you are a regular on Bilibili, you surely know who the current top star of the site's guichu ("ghost animal", i.e. remix meme video) section is.

India: That's right, it's me

That must be Mr. Ma Baoguo, the master of Hunyuan Xingyi Taijiquan!

To be honest, Ma Baoguo first came to everyone's attention in May, when he was knocked down three times in a row in a match.

But the main source material in the guichu section these days is some of his earlier videos.

For example, in January 2020, Teacher Ma, with a bruised right eye, vividly told us the story of how young people at a gym sneak-attacked him without any martial ethics.

In the video, he accuses the young challengers of "not following martial ethics" and advises them to "mouse tail juice" (a pun on a Chinese phrase meaning roughly "behave yourselves"). The speech works as an entry-level meme on Bilibili, and reciting the full text is recommended.

Some clever folks at Bilibili even set up a dedicated channel for Teacher Ma, which makes the data scraping that follows much easier.

Unlike the usual Bilibili scraping, on the Ma Baoguo channel page you can easily find the API endpoint with the browser developer tools (F12).

https://api.bilibili.com/x/web-interface/web/channel/multiple/list?channel_id=3503796&sort_type=hot&page_size=30

After parsing the JSON, we can get all the data we need.

One thing to note: the offset in the URL of each following page is obtained from the JSON returned by the previous URL, as shown in the figure below.
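The paging described here can be sketched roughly as follows. This is only a minimal sketch, not the author's exact code: it assumes the cursor is passed back as an `offset` query parameter (the text only says the value comes from the previous response's JSON), and `build_page_url`, `crawl`, and the injected `fetch` callable are hypothetical names introduced for illustration.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.bilibili.com/x/web-interface/web/channel/multiple/list"


def build_page_url(offset=None, channel_id=3503796, page_size=30):
    """Build the listing URL; `offset` is assumed to be a query parameter."""
    params = {"channel_id": channel_id, "sort_type": "hot", "page_size": page_size}
    if offset:  # the first page has no offset yet
        params["offset"] = offset
    return BASE_URL + "?" + urlencode(params)


def crawl(fetch, max_pages=5):
    """Follow the offset cursor page by page.

    `fetch(url)` must return the decoded JSON dict for that URL, e.g.
    lambda url: requests.get(url, headers=headers, timeout=10).json()
    """
    items, offset = [], None
    for _ in range(max_pages):
        data = fetch(build_page_url(offset))["data"]
        items.extend(data.get("list", []))
        offset = data.get("offset")
        if not offset:  # no cursor returned: assume this was the last page
            break
    return items
```

Passing `fetch` in as a parameter keeps the paging logic separate from the HTTP call, so it can be tried out with canned responses before hitting the real site.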

With a short piece of crawler code, about 14,000 records of Mr. Ma Baoguo's videos were scraped in no time.

import pandas as pd
import requests

COLUMNS = ['id', 'name', 'view_count', 'like_count', 'duration',
           'author_name', 'author_id', 'bvid']

def get_data(url, headers):
    """Fetch one page of the channel listing and return (offset, DataFrame)."""
    resp = requests.get(url, headers=headers, timeout=10)
    resp.raise_for_status()
    data = resp.json()['data']
    offset = data['offset']  # cursor used to request the next page
    print(offset)
    # Iterate over the items actually returned instead of assuming 30 per page,
    # and build the frame in one go (DataFrame.append was removed in pandas 2.0).
    rows = [{col: item[col] for col in COLUMNS} for item in data['list']]
    data_m = pd.DataFrame(rows, columns=COLUMNS)
    return offset, data_m

Preview of the 14,000 records

After some simple data cleaning (some of the view counts are given in units of 10,000), we drew a scatter plot of the 14,000 videos by view count against number of likes.
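The cleaning step for counts given in units of 10,000 might look like the sketch below. It assumes such values arrive as strings ending in the character `万` (for example `'12.3万'`), which the text does not spell out, and `parse_count` is a hypothetical helper name. Placeholder values such as `'--'` are not handled here.

```python
def parse_count(value):
    """Convert a count such as '12.3万' or '8560' to an integer."""
    text = str(value).strip()
    if text.endswith("万"):  # assumed suffix meaning "ten thousand"
        return int(round(float(text[:-1]) * 10_000))
    return int(float(text))
```

With pandas, this could be applied to a column with something like `data_m['view_count'].map(parse_count)` before plotting.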

You can see what "top traffic" really means: there are plenty of related videos with millions of views and hundreds of thousands of likes.

Sorted by view count:

First place goes to the classic "stand-up comedy" from January this year!!!

What about the number of likes?

First place goes to "Martial Arts Master" by Elizabeth Rat, a master guichu uploader!

The crossover between Wang Wang and Teacher Ma also performs very well!

Several special-effects videos do even better!
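Rankings like these can be produced with `sort_values` once the counts are numeric. The sketch below uses a few made-up rows purely for illustration; the real frame would be the crawler's output with the same column names.

```python
import pandas as pd

# Made-up rows for illustration only; the real data comes from the crawler.
videos = pd.DataFrame({
    "name": ["video A", "video B", "video C"],
    "view_count": [1_200_000, 3_500_000, 800_000],
    "like_count": [150_000, 90_000, 200_000],
})

# Top entries by views and by likes, highest first
top_by_views = videos.sort_values("view_count", ascending=False).head(2)
top_by_likes = videos.sort_values("like_count", ascending=False).head(2)
print(top_by_views[["name", "view_count"]])
print(top_by_likes[["name", "like_count"]])
```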

Since Teacher Ma's quotes are such classics, I decided to also scrape the bullet comments (danmaku) of that video.

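import stylecloud
from IPython.display import Image

# Draw the word cloud; text1 is the list of danmaku words collected earlier
stylecloud.gen_stylecloud(text=' '.join(text1),
                          collocations=False,
                          font_path=r'C:\Windows\Fonts\msyh.ttc',
                          icon_name='fas fa-play-circle',
                          size=653,
                          output_name='马保国词云图.png')

# Display the generated image (in a Jupyter notebook)
Image(filename='马保国词云图.png')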

"Mouse tail juice" really is the star of the show!

"Tingting", "British Marble", and "Tamen said" are also mixed in.
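To check which phrases dominate without reading them off the picture, the same list of words can be tallied directly. A minimal sketch, assuming `text1` is a list of strings as in the word cloud code; the sample values here are made up for illustration.

```python
from collections import Counter

# Made-up sample; in the article, text1 holds the words fed to the word cloud.
text1 = ["mouse tail juice", "martial ethics", "mouse tail juice", "Tingting"]

counts = Counter(text1)
for phrase, n in counts.most_common(3):
    print(phrase, n)
```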

Finally, Xiao Wu would like to recommend a few videos:

1. My favorite human VOCALOID

2. The technical benchmark; the highlight is Xiaopeng teaching people how to fish rather than handing them the fish!

3. Finally, the video from Observer.com shows a more diverse, more three-dimensional "Hunyuan" Teacher Ma, and even finds some bright spots in him, rather than simply making fun of him.

 

Origin blog.csdn.net/Python_sn/article/details/110430788