2022 National College Student Data Analysis Contest, Problem B: Complete Solution Tutorial and Code for Sentiment Analysis of Catering Service Reviews

Problem B: Sentiment analysis of catering service reviews, with a complete solution

The catering industry is known as the "golden industry that never ends". On the one hand, demand is rigid, as the saying "food is the first necessity of the people" suggests; on the other hand, the barrier to starting a business is low, which makes competition in the industry fierce. After years of rapid growth in China's catering market, the industrial chain has gradually improved and the takeaway market has matured. Under the influence of the Internet and the epidemic, young people's tendency to consume online has grown further, which has also greatly boosted online consumption in the catering industry. For catering companies, only by combining refined online and offline operations and keeping a close eye on user reviews can they stand out from the competition. This problem provides food review data collected from different catering companies. Clean, analyze and mine the provided data, and answer the following questions.

2.1. The first question

Analyze the review text in the data, draw a word cloud, and list the top 10 words for positive sentiment and the top 10 words for negative sentiment.

Topic analysis: First, split the reviews into a negative group and a positive group, then segment the text into words; the jieba ("stutter") word segmenter can be used for this. Finally, count the ten most frequent words in the negative reviews and the ten most frequent words in the positive reviews.
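The grouping and counting step can be sketched as follows. This is a minimal illustration, not the author's code: the function name `top_words`, the label encoding (1 = positive, 0 = negative) and the whitespace tokenizer are all assumptions. For Chinese reviews, `jieba.lcut` could be passed as `tokenize`, assuming jieba is installed.

```python
from collections import Counter


def top_words(reviews, labels, target_label, tokenize=str.split,
              stopwords=frozenset(), n=10):
    """Return the n most frequent words among reviews with the given label.

    Assumptions (not from the original post): labels are 1 for positive
    and 0 for negative, and `tokenize` maps a string to a list of tokens.
    """
    counts = Counter()
    for text, label in zip(reviews, labels):
        if label != target_label:
            continue  # keep only the sentiment group we are counting
        counts.update(
            tok for tok in tokenize(text)
            if tok.strip() and tok not in stopwords
        )
    return counts.most_common(n)


# Toy data for illustration only
reviews = ["good food good service", "bad food", "good price"]
labels = [1, 0, 1]
print(top_words(reviews, labels, target_label=1))
print(top_words(reviews, labels, target_label=0))
```

The resulting word frequencies can also be used as the input for drawing the word cloud.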

2.2. The second question

Analyze whether the positive or negative sentiment of user reviews is related to the review time, and explain the reason.

Topic analysis: The time data needs to be processed first. Extract three features from each timestamp (month, day and hour), group the reviews by each feature, and count the number of negative and positive reviews at each of these three time granularities. Then apply analysis of variance (ANOVA) to test whether sentiment differs across the time groups.
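A rough sketch of this grouping and of a one-way ANOVA F statistic is shown below. The timestamp format `%Y-%m-%d %H:%M:%S`, the 1/0 label encoding and both function names are assumptions made for illustration. The F statistic is computed by hand to keep the example dependency-free; to obtain a p-value, `scipy.stats.f_oneway` could be applied to the same groups.

```python
from collections import defaultdict
from datetime import datetime


def group_labels_by_time(records, unit, fmt="%Y-%m-%d %H:%M:%S"):
    """Group sentiment labels by month, day or hour of the review time.

    `records` is an iterable of (timestamp_string, label) pairs, where
    label is assumed to be 1 (positive) or 0 (negative).
    """
    groups = defaultdict(list)
    for ts, label in records:
        dt = datetime.strptime(ts, fmt)
        key = {"month": dt.month, "day": dt.day, "hour": dt.hour}[unit]
        groups[key].append(label)
    return dict(groups)


def one_way_anova_f(groups):
    """One-way ANOVA F statistic for a dict {group_key: [values]}."""
    samples = [g for g in groups.values() if g]
    k = len(samples)
    n_total = sum(len(g) for g in samples)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more samples than groups")
    grand_mean = sum(sum(g) for g in samples) / n_total
    means = [sum(g) / len(g) for g in samples]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(samples, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(samples, means) for x in g)
    if ss_within == 0:
        return float("inf")  # no variation inside any group
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))


# Toy data for illustration only
records = [
    ("2022-03-01 09:10:00", 1), ("2022-03-02 09:30:00", 1),
    ("2022-03-03 09:45:00", 1), ("2022-03-04 09:50:00", 0),
    ("2022-03-01 21:10:00", 0), ("2022-03-02 21:30:00", 0),
    ("2022-03-03 21:45:00", 0), ("2022-03-04 21:50:00", 1),
]
by_hour = group_labels_by_time(records, "hour")
print({h: (sum(v), len(v) - sum(v)) for h, v in by_hour.items()})  # (positive, negative)
print(one_way_anova_f(by_hour))
```

A large F value relative to the critical value for the corresponding degrees of freedom suggests that the share of positive reviews differs between time groups.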

2.3. The third question

Which merchant has the most positive sentiment? Summarize this merchant's strengths.

Topic analysis: Group the reviews by merchant, count the number of positive reviews for each merchant, and sort in descending order to find the merchant with the most positive sentiment. Then extract LDA topic keywords from that merchant's reviews to uncover the strengths that users mention.
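The group, count and sort step might look like the sketch below. The function name `rank_merchants`, the merchant identifiers and the 1/0 label encoding are hypothetical; the LDA keyword step is not shown.

```python
from collections import Counter


def rank_merchants(merchant_ids, labels, target_label):
    """Count reviews with `target_label` per merchant, sorted descending.

    Assumes labels are 1 for positive and 0 for negative reviews.
    """
    counts = Counter(
        merchant for merchant, label in zip(merchant_ids, labels)
        if label == target_label
    )
    return counts.most_common()


# Toy data for illustration only
merchant_ids = ["A", "B", "A", "C", "A", "B"]
labels = [1, 1, 1, 0, 0, 0]
print(rank_merchants(merchant_ids, labels, target_label=1))
```

Calling the same function with `target_label=0` ranks merchants by their number of negative reviews instead.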

2.4. The fourth question

Which merchant has the most negative sentiment? Propose improvement strategies that would increase its customers' positive sentiment.

Topic analysis: Group the reviews by merchant, count the number of negative reviews for each merchant, and sort in descending order to find the merchant with the most negative sentiment. Then extract LDA topic keywords from that merchant's reviews to identify the shortcomings that users complain about, and propose targeted remedies for each.

2.5. The fifth question

Build a sentiment orientation model for food service reviews, and evaluate the model's performance and error. Use the model to classify the test data test.xlsx in the attachment, add the predicted results as the first column, and upload this file to the competition platform.

Topic analysis: A text classification model is sufficient here. The text must first be converted into word vectors. A classical machine learning classifier is recommended, since it works very well for this kind of binary classification of short texts.
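As one possible illustration, the sketch below trains a multinomial Naive Bayes classifier on bag-of-words counts in plain Python. This is a stand-in chosen for its simplicity, not the model used in the original solution; the class name, the 1/0 labels and the toy data are assumptions. In practice, scikit-learn's `TfidfVectorizer` combined with `LogisticRegression` or `MultinomialNB` would be a more typical choice.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesSentiment:
    """Multinomial Naive Bayes over token counts with Laplace smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                      # smoothing strength
        self.word_counts = defaultdict(Counter)  # label -> token counts
        self.doc_counts = Counter()              # label -> number of reviews
        self.vocab = set()

    def fit(self, token_lists, labels):
        for tokens, label in zip(token_lists, labels):
            self.doc_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict_one(self, tokens):
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label in self.doc_counts:
            total_words = sum(self.word_counts[label].values())
            denom = total_words + self.alpha * len(self.vocab)
            score = math.log(self.doc_counts[label] / total_docs)  # log prior
            for tok in tokens:
                if tok not in self.vocab:
                    continue  # ignore words never seen in training
                score += math.log((self.word_counts[label][tok] + self.alpha) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

    def predict(self, token_lists):
        return [self.predict_one(tokens) for tokens in token_lists]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy, already-tokenized data for illustration only (1 = positive, 0 = negative)
train_x = [["delicious", "fast", "delivery"], ["tasty", "delicious"],
           ["cold", "slow", "delivery"], ["slow", "bad"]]
train_y = [1, 1, 0, 0]
model = NaiveBayesSentiment().fit(train_x, train_y)
test_x = [["delicious", "fast"], ["cold", "slow"]]
print(model.predict(test_x))
print(accuracy([1, 0], model.predict(test_x)))
```

For a meaningful estimate of performance and error, the labeled data should be split into training and validation sets before the model is applied to test.xlsx.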

A video walkthrough of the complete solution has been released:

2022 College Student Data Analysis Contest Problem B: Step-by-Step Tutorial and Complete Code for Catering Service Review Analysis


Origin blog.csdn.net/weixin_44099072/article/details/128499517