Douban 9.2! 170,000 bullet comments tell you why "The Silent Truth" is so highly rated!

 Produced by CDA Data Analyst  

Author: Mika

Data: Zhenda  

[Guide] Today, I will show you how to use Python to analyze 170,000 bullet comments (danmu) from "The Silent Truth". Less than 2 months have passed since the last acclaimed domestic drama, "The Hidden Corner", aired, and its "going bald" and "mountain climbing" memes are still fresh. Now another hit domestic drama has arrived: "The Silent Truth", which has recently won widespread praise.

It also comes from iQiyi's "Mist Theater", a lineup of short suspense dramas. "The Silent Truth" is adapted from Zijin Chen's novel "Long Night Is Difficult to See" and tells the story of the prosecutor Jiang Yang, who spends many years investigating the truth behind a case.

On the day it premiered, "The Silent Truth" scored 8.8 on Douban. As the series aired, its reputation kept climbing. After six episodes the Douban score reached 9.2, surpassing its predecessor, "The Hidden Corner". A drama that opens high and keeps rising like this is very rare among domestic productions.

Many netizens did not believe they would cry when the show began, but by the finale they found it truly was a tearjerker. Watching the protagonist Jiang Yang give his life to bring the truth to light, viewers joked that their tears streamed down like Lanzhou ramen noodles...

So why is "The Silent Truth" so highly rated? Why is it seen as the year's closing hit among domestic dramas? Today we will use Python to find out.

01. Douban 9.2! Surpassing its predecessor "The Hidden Corner"

The previous show hailed as the domestic hit of the year was "The Hidden Corner", adapted from Zijin Chen's mystery novel "Bad Child". When "The Hidden Corner" aired, the song "Little White Boat" and the "mountain climbing" and "going bald" memes were everywhere all summer.

More than 780,000 people have rated it on Douban, and the final score is 8.9, a remarkable result.

Unexpectedly, less than 2 months later, another suspense drama, "The Silent Truth", has become popular on the strength of outstanding word of mouth. It is also adapted from a novel by Zijin Chen, "Long Night Is Difficult to See", and it scored 8.8 on Douban when it started airing. The score has kept rising as the series aired: more than 200,000 people have now rated it, and its score of 9.2 has surpassed that of "The Hidden Corner".

Douban overall score analysis

A closer look at the audience ratings shows that:

92.8% of viewers gave a perfect five-star rating, which is benchmark level for domestic dramas.

Douban short review word cloud

Next, let's look at the word cloud of Douban's short reviews.

The most discussed topic in the short reviews is the protagonist, "Jiang Yang". His firmness and perseverance are truly impressive. The actors' performances, the plot, and the faithfulness to the original novel have all been widely recognized and praised.

02. What do the 170,000 bullet comments on "The Silent Truth" say?

So what do viewers talk about while watching the show? Next, we used Python to analyze the bullet comments on the first 10 episodes of "The Silent Truth", 173,226 in total.

Bullet comment counts for the first ten episodes

As the chart shows, viewers love posting bullet comments while watching. Among the first ten episodes, episodes 9, 3 and 10 have the most bullet comments, in that order. The highest count for a single episode is 18,903. Episode 6 has the fewest, with 15,561.

Next, we look at the word clouds for the main characters:

Jiang Yang bullet comment word cloud

Jiang Yang, played by Bai Yu, was young and promising, but he gave his life to seek the truth and uphold justice. Words such as "justice", "excellent" and "acting" appear frequently in the word cloud.

Li Jing bullet comment word cloud

Regarding Li Jing, played by Tan Zhuo, many people will think of her role as Concubine Gao in "The Story of Yanxi Palace". From Concubine Gao to Liu Sihui in "I'm Not the God of Medicine" and now Li Jing, Tan Zhuo's acting skills are plain for all to see.

Yan Liang bullet comment word cloud

From the moment Liao Fan's casting was officially announced, many viewers said they had to watch "The Silent Truth" for him. Sure enough, as soon as the series aired, fans praised him as an "inspection-exempt product", that is, a guarantee of quality.

Zhang Chao bullet comment word cloud

Ning Li, who plays Zhang Chao, is an old friend of the Mist Theater. The gangster "Brother Toyota" (Li Fengtian) he played in "Burning Ice" was a ruthless man of few words, and his "reverse smoking" became hugely popular. From "Burning Ice" to "The Hidden Corner" and then to "The Silent Truth", Yan Liang has been played by three different actors. As viewers put it, Yan Liang comes and goes, but Li Fengtian stays.

03. How to analyze bullet comments with Python

We use Python to obtain and analyze the bullet comment data for the first ten episodes of iQiyi's "The Silent Truth". The analysis is divided into the following three parts:

  1. Bullet comment data acquisition
  2. Data reading and simple processing
  3. Data visualization analysis

1. Data acquisition

The program for acquiring iQiyi bullet comment data was explained in a previous article.

2. Data reading and preprocessing

First import the required packages: pandas is used for reading and processing data, os for file operations, jieba for Chinese word segmentation, and pyecharts and stylecloud for data visualization.

# Import libraries
import os  
import jieba
import pandas as pd 

from pyecharts.charts import Bar, Pie, Line, WordCloud, Page
from pyecharts import options as opts 
from pyecharts.globals import SymbolType, WarningType
WarningType.ShowWarning = False

import stylecloud
from IPython.display import Image

Store the crawled data in the data folder, use the os module to list the CSV files that need to be read, and read them in a loop.

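# Read the data
data_dir = '../data/'
data_list = os.listdir(data_dir)

# Collect one DataFrame per CSV file
frames = []

for i in data_list:
    if i.endswith('.csv'):
        df_one = pd.read_csv(os.path.join(data_dir, i), engine='python', encoding='utf-8', index_col=0)
        frames.append(df_one)

# DataFrame.append was removed in pandas 2.0, so combine the frames with pd.concat
df_all = pd.concat(frames, ignore_index=False)

print(df_all.shape)
# Output: (173226, 6)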
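
The read-and-combine step can be tried without the real data. The sketch below is only an illustration under stated assumptions: it writes two toy CSV files to a temporary folder (the file names, the helper `read_all_csv`, and the columns `name` and `content` are made up for the example) and stacks them with `pd.concat`, which replaces the `DataFrame.append` method that pandas removed in version 2.0.

```python
import os
import tempfile

import pandas as pd


def read_all_csv(folder):
    """Read every .csv file in `folder` and stack them into one DataFrame."""
    frames = []
    for file_name in sorted(os.listdir(folder)):
        if file_name.endswith('.csv'):
            path = os.path.join(folder, file_name)
            frames.append(pd.read_csv(path, encoding='utf-8', index_col=0))
    # pd.concat stacks the frames row-wise; ignore_index=True renumbers the rows
    return pd.concat(frames, ignore_index=True)


# Toy files standing in for the crawled data (hypothetical names and columns)
demo_dir = tempfile.mkdtemp()
pd.DataFrame({'name': ['a', 'b'], 'content': ['x', 'y']}).to_csv(
    os.path.join(demo_dir, 'ep1.csv'), encoding='utf-8')
pd.DataFrame({'name': ['c'], 'content': ['z']}).to_csv(
    os.path.join(demo_dir, 'ep2.csv'), encoding='utf-8')

df_demo = read_all_csv(demo_dir)
print(df_demo.shape)
```

Collecting the frames in a list and concatenating once is generally cheaper than growing a DataFrame inside the loop, because each intermediate concatenation copies all rows read so far.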

There are 173,226 bullet comments in total. Preview the data:

df_all['name'] = df_all.name.str.strip() 
df_all.head() 
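
The `str.strip()` call matters because the later filtering compares names with `==`, and a name carrying stray whitespace would silently fail to match. The toy frame below is made up for illustration (the real data is assumed, not known, to contain such whitespace); it shows the match count before and after stripping.

```python
import pandas as pd

# Made-up rows: two of the names carry leading or trailing spaces
df_demo = pd.DataFrame({
    'name': [' 江阳', '江阳 ', '严良'],
    'content': ['c1', 'c2', 'c3'],
})

# Before stripping, the exact comparison finds no rows
matches_before = int((df_demo['name'] == '江阳').sum())

# After stripping, both rows match
df_demo['name'] = df_demo['name'].str.strip()
matches_after = int((df_demo['name'] == '江阳').sum())

print(matches_before, matches_after)  # prints: 0 2
```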

3. Data visualization

Number of bullet comments per episode

The code is as follows:

# Map each episode label to its number.
# The keys must match the labels stored in the 'episodes' column exactly.
repl_list = {
    'The first episode': 1,
    'The second episode': 2,
    'The third episode': 3,
    'The fourth episode': 4,
    'The fifth episode': 5,
    'The sixth episode': 6,
    'Episode Seven': 7,
    'Episode Eight': 8,
    'Episode Nine': 9,
    'Episode Ten': 10,
}

df_all['episodes_num'] = df_all['episodes'].map(repl_list )  
df_all.head() 
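
One thing to watch with `Series.map` and a dictionary: any label that is not a key becomes `NaN` without raising an error, and those rows then drop out of the per-episode counts. A minimal sketch of a check, using a shortened dictionary and made-up labels (including one deliberately unexpected label, 'Episode 3'):

```python
import pandas as pd

# Shortened, hypothetical mapping for the example
episode_to_num = {'The first episode': 1, 'The second episode': 2, 'The third episode': 3}

df_demo = pd.DataFrame({
    'episodes': ['The first episode', 'The third episode', 'Episode 3', 'The second episode'],
})
df_demo['episodes_num'] = df_demo['episodes'].map(episode_to_num)

# Labels missing from the dictionary map to NaN, so list them explicitly
unmapped = df_demo.loc[df_demo['episodes_num'].isna(), 'episodes'].unique().tolist()
print(unmapped)  # prints: ['Episode 3']
```

If the list is not empty, the dictionary keys and the stored labels differ and should be reconciled before counting.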

# Generate data: count bullet comments per episode, ordered by episode number
danmu_num = df_all.episodes_num.value_counts()
danmu_num = danmu_num.sort_index()
# Axis labels read "Episode N" in Chinese (第N集)
x_data = ['第' + str(i) + '集' for i in danmu_num.index]
y_data = danmu_num.values.tolist()

# Bar chart
bar1 = Bar(init_opts=opts.InitOpts(width='1350px', height='750px'))
bar1.add_xaxis(xaxis_data=x_data)
bar1.add_yaxis('', y_axis=y_data)
bar1.set_global_opts(title_opts=opts.TitleOpts(title='Barrage chart of the first ten episodes'),
                     visualmap_opts=opts.VisualMapOpts(max_=20000, is_show=False)
                    )
bar1.render('../html/Iqiyi Barrage Trend Chart.html')
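
The busiest and quietest episodes quoted earlier can also be read directly from the counts instead of from the chart. The sketch below uses a made-up series of episode numbers, one entry per bullet comment, in place of `df_all.episodes_num`; `nlargest` and `idxmin` are standard pandas Series methods.

```python
import pandas as pd

# Toy episode numbers, one entry per bullet comment (made-up data)
episodes_num = pd.Series([1, 1, 2, 3, 3, 3, 2, 1, 3, 4])

# Same counting step as in the article
counts = episodes_num.value_counts().sort_index()

top_two = counts.nlargest(2)      # episodes with the most bullet comments
least_episode = counts.idxmin()   # episode with the fewest bullet comments

print(top_two.index.tolist(), int(counts.max()), least_episode)
```

On the real data, `danmu_num.nlargest(3)` would give the three busiest episodes in the same way.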

Bullet comment word cloud for the character Jiang Yang

 

# Define the word segmentation function
def get_cut_words(content_series):
    # Read the stop word list
    stop_words = []

    with open(r"stop_words.txt", 'r', encoding='utf-8') as f:
        lines = f.readlines()
        for line in lines:
            stop_words.append(line.strip())

    # Add keywords so that jieba keeps names and catchphrases as single tokens.
    # Entries should be in the same language as the bullet comments being segmented.
    my_words = ['Liao Fan', 'Yan Liang', 'Baiyu', 'Jiangyang', 'Tan Zhuo', 'Li Jing',
                'Ning Li', 'Zhang Chao', 'Huang Yao', 'Zhang Xiaoqian', 'Aoli Ge'
               ]
    for i in my_words:
        jieba.add_word(i)

    # Custom stop words
    my_stop_words = ['Really', 'this one', 'this is', 'one kind', 'kind', 'ahhhhh', 'hahaha',
                     'Hahahaha', 'I want']
    stop_words.extend(my_stop_words)

    # Word segmentation: join all comments into one string and cut in precise mode
    word_num = jieba.lcut(content_series.str.cat(sep='。'), cut_all=False)

    # Filter out stop words and tokens shorter than two characters
    word_num_selected = [i for i in word_num if i not in stop_words and len(i) >= 2]

    return word_num_selected
# Get the word segmentation result
text1 = get_cut_words(content_series=df_all[df_all.name=='江阳']['content'])

# Draw the word cloud
# Join the tokens with spaces so that they stay separate words in the cloud
stylecloud.gen_stylecloud(text=' '.join(text1), max_words=1000,
                          collocations=False,
                          # Windows path to the Microsoft YaHei font; adjust on other systems
                          font_path=r'C:\Windows\Fonts\msyh.ttc',
                          icon_name='fas fa-heart',
                          size=653,
                          output_name='Barrage role-Jiangyang word cloud map.png')
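
Before drawing a word cloud, it can help to check which tokens dominate the output of `get_cut_words`, since a forgotten stop word shows up immediately in a frequency list. The sketch below is a rough illustration that does not call jieba: `top_tokens` is a hypothetical helper, and the token list is hand-made to stand in for a segmentation result. It applies the same filtering rule (not a stop word, at least two characters) and counts with `collections.Counter`.

```python
from collections import Counter


def top_tokens(tokens, stop_words, n=3, min_len=2):
    """Return the n most frequent tokens after stop-word and length filtering."""
    stop_set = set(stop_words)  # set membership is faster than scanning a list
    kept = [t for t in tokens if t not in stop_set and len(t) >= min_len]
    return Counter(kept).most_common(n)


# Hand-made tokens standing in for jieba output (hypothetical)
demo_tokens = ['正义', '演技', '正义', '的', '哈哈哈', '演技', '正义', '优秀']
result = top_tokens(demo_tokens, stop_words=['哈哈哈'])
print(result)
```

With the real data, the call would be `top_tokens(text1, stop_words=[])`, because `get_cut_words` has already removed the stop words.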

 


Origin blog.csdn.net/yoggieCDA/article/details/108822354