Console game sales data analysis: PS vs Xbox vs Wii

Foreword


 

The competition between game consoles has been going on for a long time and continues quietly even now. This article analyzes game sales data for the seventh and eighth console generations to see who won each round:

Seventh-generation consoles: PS3, Xbox 360, Wii

Eighth-generation consoles: PS4, Xbox One, Wii U


 

First, import the libraries we need

import numpy as np 
import pandas as pd 
import seaborn as sns
from matplotlib import pyplot as plt
%matplotlib inline
from brewer2mpl import qualitative

Next, read the data file and take a look at the data

import os
# Switch to the folder that holds the CSV file (adjust the path to your machine)
os.chdir(r'C:\Data analysis Data\Sales-Video-Game-with-ratings')
video = pd.read_csv('Video_Games_Sales_as_at_22_Dec_2016.csv')
video.head()

Look at the overall size of the data

print(video.shape)
>>>(16719, 16)

We can see a total of 16 fields, 16,719 rows of data.

Before digging into the data, we can start with some general statistics to get to know it. Freshly obtained data often cannot be analyzed right away, so some data cleaning is needed first.

First, check whether the data contains null values

video.isnull().any().any()
>>>True

There clearly are null values, so let's drop the rows that contain them

video = video.dropna(axis=0)
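
Before dropping rows wholesale, it can help to see how many values are missing per column and how many rows would be lost. The following is a minimal sketch on a small made-up frame; the `toy` name and its values are illustrative assumptions, not rows from the real data set.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature stand-in for the real `video` dataframe
toy = pd.DataFrame({
    "Name": ["A", "B", "C", "D"],
    "Critic_Score": [76.0, np.nan, 82.0, 80.0],
    "User_Count": [322.0, np.nan, np.nan, 192.0],
})

# Number of missing values in each column
null_counts = toy.isnull().sum()

# Drop every row that has at least one missing value
toy_clean = toy.dropna(axis=0)

print(null_counts)
print(len(toy), "rows before,", len(toy_clean), "rows after")
```

Running the same two checks on `video` would show how much data the cleaning step discards.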

Then look at the data type of each field

# DataFrame.info() prints its summary and returns None, so call it directly
video.info()

 

From the returned data types we can see that the User_Score field is stored as strings. Since we may need to compute with it later, convert it to a numeric type

video['User_Score'] = pd.to_numeric(video['User_Score'])
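
If a column still holds a non-numeric placeholder, a plain `pd.to_numeric` call raises an error. The sketch below shows one way to guard against that with `errors="coerce"`. The sample values, including the "tbd" placeholder, are made up for illustration.

```python
import pandas as pd

# Hypothetical scores; "tbd" stands in for any non-numeric placeholder
scores = pd.Series(["8.0", "7.5", "tbd", "6.1"])

# errors="coerce" turns values that cannot be parsed into NaN instead of raising
numeric_scores = pd.to_numeric(scores, errors="coerce")

print(numeric_scores.dtype)
print(numeric_scores.isnull().sum(), "value(s) could not be converted")
```

Coerced values become NaN, so they would need the same null handling as any other missing value.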

Because we are comparing console platforms, let's see which platforms appear in the data

video.Platform.unique()
>>>array(['Wii', 'DS', 'X360', 'PS3', 'PS2', '3DS', 'PS4', 'PS', 'XB', 'PC',
       'PSP', 'WiiU', 'GC', 'GBA', 'XOne', 'PSV', 'DC'], dtype=object)

As the output shows, the data set covers many different platforms. Given the scope and purpose of this analysis, we only need the data for the seventh- and eighth-generation consoles, so we have to filter the data selectively.

To check the correlation between all numeric features and understand the relationships among them, I will draw a heat map. First I collect the names of the string columns, take the remaining numeric columns, and build a simple dataframe "video_num". After that, I plot "Critic_Score" against "User_Score" as a joint plot (which helps to see how two variables are distributed together) to see how they interact

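str_list = []  # empty list to hold the names of the string columns
for colname, colvalue in video.items():  # iteritems() was removed in pandas 2.0
    if isinstance(colvalue.iloc[0], str):  # inspect the first value by position
        str_list.append(colname)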
num_list = video.columns.difference(str_list) 
video_num = video[num_list]
f, ax = plt.subplots(figsize=(14, 11))
plt.title('Pearson Correlation of Video Game Numerical Features')
sns.heatmap(video_num.astype(float).corr(),linewidths=0.25,vmax=1.0, 
            square=True, cmap="cubehelix_r", linecolor='k', annot=True)
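
An alternative way to pick out the numeric columns, shown here as a minimal sketch rather than the method used above, is pandas' `select_dtypes`. The `toy` frame and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical miniature stand-in for the real `video` dataframe
toy = pd.DataFrame({
    "Platform": ["Wii", "PS3", "X360", "PS3"],
    "Critic_Score": [76.0, 82.0, 80.0, 89.0],
    "User_Score": [8.0, 8.3, 8.0, 8.5],
    "Global_Sales": [82.5, 21.0, 21.8, 16.3],
})

# Keep only the numeric columns, whatever their names are
toy_num = toy.select_dtypes(include="number")

# DataFrame.corr() computes pairwise Pearson correlation by default
corr = toy_num.corr()
print(corr.round(2))
```

The resulting matrix is what `sns.heatmap` draws: one cell per pair of numeric columns, with 1.0 on the diagonal.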

 

 

sns.jointplot(x='Critic_Score', y='User_Score', data=video,
              kind='hex', cmap='afmhot', height=11)  # `size` was renamed to `height` in seaborn 0.9

 

Analysis: As expected, there is a considerable positive Pearson correlation between these two scores. This is not surprising: if a game is good, both reviewers and users usually enjoy it and rate it highly, and vice versa. Now let's look at "Critic_Count" and "Critic_Score"

sns.jointplot(x='Critic_Score', y='Critic_Count', data=video,
              kind='hex', cmap='afmhot', height=11)  # keyword x/y; `size` was renamed to `height`

 

Analysis: In the heat map above, darker colors indicate a stronger positive correlation and lighter colors a weaker one. We can already see a clear and very plausible connection between "Global_Sales" and regional figures such as "EU_Sales". Some interesting findings so far.

Now we begin with the seventh-generation consoles: statistical analysis and visualization

First, I will create a dataframe ("video7th") that includes only the seventh-generation platforms, then aggregate the data and examine it by release year.

video7th = video[(video['Platform'] == 'Wii') | (video['Platform'] == 'PS3') | (video['Platform'] == 'X360')]
video7th.shape
>>>(2106, 16)
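
The chained `==` comparisons above can also be written with `Series.isin`, which scales better when the list of platforms grows. A minimal sketch on made-up rows; the `toy` and `seventh_gen` names are illustrative assumptions.

```python
import pandas as pd

# Hypothetical miniature stand-in for the real `video` dataframe
toy = pd.DataFrame({
    "Name": ["A", "B", "C", "D", "E"],
    "Platform": ["Wii", "DS", "PS3", "X360", "PS4"],
})

# Platforms that make up the seventh generation in this analysis
seventh_gen = ["Wii", "PS3", "X360"]

# Keep only the rows whose platform is in the list
toy7th = toy[toy["Platform"].isin(seventh_gen)]
print(toy7th)
```

Both forms select the same rows, so the choice is a matter of readability.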

After filtering, a little over two thousand rows remain

First, let us look at the global sales of these consoles' games over the years and see whether anything is worth analyzing. To do so, I aggregate the data by calling "groupby" on "Year_of_Release" and "Platform" and then summing Global_Sales. For the visualization I draw a stacked bar chart, which I hope is intuitive enough.

plt.style.use('dark_background')
yearlySales = video7th.groupby(['Year_of_Release','Platform']).Global_Sales.sum()
yearlySales.unstack().plot(kind='bar',stacked=True, colormap= 'PuBu',  
                           grid=False,  figsize=(13,11))
plt.title('Stacked Barplot of Global Yearly Sales of the 7th Gen Consoles')
plt.ylabel('Global Sales')
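
To make the groupby-then-unstack step concrete, here is a minimal sketch on made-up numbers (the `toy` frame is an illustrative assumption). Grouping by two keys yields a Series with a two-level index, and `unstack()` pivots the inner level into columns, which is the shape a stacked bar chart needs.

```python
import pandas as pd

# Hypothetical miniature stand-in for `video7th`
toy = pd.DataFrame({
    "Year_of_Release": [2006, 2006, 2007, 2007, 2007],
    "Platform": ["Wii", "PS3", "Wii", "PS3", "PS3"],
    "Global_Sales": [10.0, 2.0, 8.0, 3.0, 1.5],
})

# Sum sales per (year, platform) pair
yearly = toy.groupby(["Year_of_Release", "Platform"])["Global_Sales"].sum()

# Move the platform level into columns: one row per year, one column per platform
table = yearly.unstack()
print(table)
```

Each row of `table` becomes one bar, and each column becomes one stacked segment.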

 

Analysis: It seems that PS3 sales kept improving and Xbox 360 sales also generally rose (apart from a dip in 2009), while the Wii had a strong lead in the early years of 2006 and 2007 but saw that lead eroded by the other two consoles.

Next, look at the sales by rating for the three consoles (E: Everyone; M: Mature; T: Teen)

plt.style.use('dark_background')
ratingSales = video7th.groupby(['Rating','Platform']).Global_Sales.sum()
ratingSales.unstack().plot(kind='bar',stacked=True,  colormap= 'Greens', 
                           grid=False, figsize=(13,11))
plt.title('Stacked Barplot of Sales per Rating type of the 7th Gen Consoles')
plt.ylabel('Sales')

 

Analysis: This is not surprising. The Wii is mainly aimed at family entertainment, so its sales are highest for E-rated games, while its sales of M-rated games are almost negligible. The PS3 and Xbox 360, on the other hand, account for most of the M-rated game sales, which is also reflected in their abundance of shooters, sandbox games, and hack-and-slash titles.

Finally, we dig further into the data to see which game genres sell on these three consoles and how the consoles differ from one another.

plt.style.use('dark_background')
genreSales = video7th.groupby(['Genre','Platform']).Global_Sales.sum()
genreSales.unstack().plot(kind='bar',stacked=True,  colormap= 'Reds', 
                          grid=False, figsize=(13,11))
plt.title('Stacked Barplot of Sales per Game Genre')
plt.ylabel('Sales')

 

Analysis: For the PS3 and Xbox 360, the two main genres appear to be action games and shooters, which fits what we know: these consoles appeal more to hard-core, action-oriented players. The Wii, on the other hand, focuses on sports and platform games along with some miscellaneous titles.

Finally, let us visualize the total global sales and the total number of users of the three consoles as pie charts. The method is simply to sum Global_Sales and User_Count over all games. Note that the resulting numbers and charts come with a caveat: the output depends on how complete the original data set was in the first place.

# Plotting our pie charts
# Create a list of colors 
plt.style.use('seaborn-v0_8-white')  # this style was named 'seaborn-white' before matplotlib 3.6
colors = ['#008DB8','#00AAAA','#00C69C']
plt.figure(figsize=(15,11))
plt.subplot(121)
plt.pie(
   video7th.groupby('Platform').Global_Sales.sum(),
    # with the labels being platform
    labels=video7th.groupby('Platform').Global_Sales.sum().index,
    # with no shadows
    shadow=False,
    # stating our colors
    colors=colors,
    explode=(0.05, 0.05, 0.05),
    # with the start angle at 90 degrees
    startangle=90,
    # with each share shown as a percentage with one decimal place
    autopct='%1.1f%%'
    )
plt.axis('equal')
plt.title('Pie Chart of Global Sales')
plt.subplot(122)
plt.pie(
   video7th.groupby('Platform').User_Count.sum(),
    labels=video7th.groupby('Platform').User_Count.sum().index,
    shadow=False,
    colors=colors,
    explode=(0.05, 0.05, 0.05),
    startangle=90,
    autopct='%1.1f%%'
    )
plt.axis('equal')
plt.title('Pie Chart of User Base')
plt.tight_layout()
plt.show()
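
The percentages that `autopct` prints on the pie can also be computed directly, which is handy for quoting exact shares. A minimal sketch with made-up totals; the `toy` frame is an illustrative assumption and its numbers are not taken from the real data.

```python
import pandas as pd

# Hypothetical miniature stand-in for `video7th`
toy = pd.DataFrame({
    "Platform": ["Wii", "PS3", "X360", "PS3", "X360"],
    "Global_Sales": [20.0, 15.0, 25.0, 15.0, 25.0],
})

# Total sales per platform, the same quantity passed to plt.pie
totals = toy.groupby("Platform")["Global_Sales"].sum()

# Each platform's share of the overall total, in percent
shares = (totals / totals.sum() * 100).round(1)
print(shares)
```

The same two lines applied to `User_Count` would give the shares behind the second pie.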

 

Analysis: Judging from the pie charts above and the earlier bar plots, the PS3 and Xbox 360 seem evenly matched, with the Xbox 360 slightly ahead in global sales. On these metrics alone, the Wii evidently cannot compete with its two rivals.

In summary, it can be said that the Xbox 360 prevailed in the seventh-generation console competition.


 

Next, we analyze the competition among the eighth-generation consoles. The steps are almost the same as for the seventh generation above.

First, filter the data

video8th = video[(video['Platform'] == 'WiiU') | (video['Platform'] == 'PS4') | (video['Platform'] == 'XOne')]
video8th.shape
>>>(487, 16)
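
Since the same steps are repeated for each generation, one option is to tag every row with its generation once and then filter or group on that tag. This is a minimal sketch, not the approach used in this article; the `generation` lookup and the `toy` rows are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical miniature stand-in for the real `video` dataframe
toy = pd.DataFrame({
    "Platform": ["Wii", "PS4", "DS", "XOne", "PS3"],
    "Global_Sales": [5.0, 7.0, 3.0, 2.0, 4.0],
})

# Hypothetical lookup from platform code to console generation
generation = {"Wii": 7, "PS3": 7, "X360": 7, "WiiU": 8, "PS4": 8, "XOne": 8}

# Platforms missing from the lookup (here "DS") map to NaN
toy["Generation"] = toy["Platform"].map(generation)

# Keep only the six consoles of interest and store the tag as an integer
consoles = toy.dropna(subset=["Generation"]).astype({"Generation": int})

# Total sales per generation
per_gen = consoles.groupby("Generation")["Global_Sales"].sum()
print(per_gen)
```

With such a tag, `consoles[consoles["Generation"] == 8]` would play the role of `video8th`.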

You can see there are 487 rows

Then look at the yearly game sales of the eighth-generation consoles

plt.style.use('dark_background')
yearlySales = video8th.groupby(['Year_of_Release','Platform']).Global_Sales.sum()
yearlySales.unstack().plot(kind='bar',stacked=True, colormap= 'PuBu',  
                           grid=False,  figsize=(13,11))
plt.title('Stacked Barplot of Global Yearly Sales of the 8th Gen Consoles')
plt.ylabel('Global Sales')

 

Analysis: Clearly, global sales for the PS4 exceed those of the WiiU and XOne combined. This is a marked departure from the seventh generation, when the PS3 and Xbox 360 ran neck and neck in sales for many years.

Question: Why does this happen? Why has the PS4 gradually come to dominate the eighth generation?

Next we explore the data to see whether we can identify the PS4's advantages, and look at what kinds of audiences these consoles cater to.

plt.style.use('dark_background')
ratingSales = video8th.groupby(['Rating','Platform']).Global_Sales.sum()
ratingSales.unstack().plot(kind='bar',stacked=True,  colormap= 'Greens', 
                           grid=False, figsize=(13,11))
plt.title('Stacked Barplot of Sales per Rating type of the 8th Gen Consoles')
plt.ylabel('Sales')

 

Analysis: This time the results are interesting. In the seventh generation there was a clear division: the PS3 and Xbox 360 catered mainly to mature audiences and the Wii mainly to everyone. The PS4, by contrast, seems to cater to (or somehow attract) both M and E audiences. This may explain its dominance in global sales, since it now attracts both hardcore gamers and casual, family-friendly players.

Next, we compare the game genres

plt.style.use('dark_background')
genreSales = video8th.groupby(['Genre','Platform']).Global_Sales.sum()
genreSales.unstack().plot(kind='bar',stacked=True,  colormap= 'Reds', 
                          grid=False, figsize=(13,11))
plt.title('Stacked Barplot of Sales per Game Genre')
plt.ylabel('Sales')

 

Analysis: As this chart shows, the PS4 evidently explores a wider range of genres than its predecessor, the PS3. A quick look shows the PS4 holding the dominant share in 7 of the 12 game genres, compared with only 4 for the PS3.
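
A count such as "dominant in 7 of 12 genres" can be read off the chart, but it can also be computed. The sketch below shows one way to do so on made-up numbers; the `toy` frame and the resulting counts are illustrative assumptions, not the real figures.

```python
import pandas as pd

# Hypothetical miniature stand-in for `video8th`
toy = pd.DataFrame({
    "Genre": ["Action", "Action", "Shooter", "Shooter", "Platform", "Platform"],
    "Platform": ["PS4", "XOne", "PS4", "XOne", "PS4", "WiiU"],
    "Global_Sales": [30.0, 12.0, 25.0, 22.0, 3.0, 9.0],
})

# One row per genre, one column per platform; missing pairs count as 0
genre_table = (
    toy.groupby(["Genre", "Platform"])["Global_Sales"].sum().unstack(fill_value=0)
)

# For each genre, the platform with the highest sales
leaders = genre_table.idxmax(axis=1)

# How many genres each platform leads
lead_counts = leaders.value_counts()
print(leaders)
print(lead_counts)
```

Applied to `genreSales.unstack()`, the same two steps would give the number of genres each console leads.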

Finally, let's visualize the total global sales and the total number of users of the three main eighth-generation consoles as pie charts

# Plotting our pie charts
# Create a list of colors 
plt.style.use('seaborn-v0_8-white')  # this style was named 'seaborn-white' before matplotlib 3.6
colors = ['#008DB8','#00AAAA','#00C69C']
plt.figure(figsize=(15,11))
plt.subplot(121)
plt.pie(
   video8th.groupby('Platform').Global_Sales.sum(),
    # with the labels being platform
    labels=video8th.groupby('Platform').Global_Sales.sum().index,
    # with no shadows
    shadow=False,
    # stating our colors
    colors=colors,
    explode=(0.05, 0.05, 0.05),
    # with the start angle at 90 degrees
    startangle=90,
    # with each share shown as a percentage with one decimal place
    autopct='%1.1f%%'
    )
plt.axis('equal')
plt.title('Pie Chart of 8th Gen Global Sales')
plt.subplot(122)
plt.pie(
   video8th.groupby('Platform').User_Count.sum(),
    labels=video8th.groupby('Platform').User_Count.sum().index,
    shadow=False,
    colors=colors,
    explode=(0.05, 0.05, 0.05),
    startangle=90,
    autopct='%1.1f%%'
    )
plt.axis('equal')
plt.title('Pie Chart of 8th Gen User Base')
plt.tight_layout()
plt.show()

 

 

Analysis: Judging from the pie charts above and the earlier bar plots, the eighth generation, unlike the seventh, has a clear leader: the PS4 is far ahead of its two rivals. The competition continues, but the PS4 is clearly moving faster than the other two consoles and holds a very large lead.

In summary, it can be said that the PS4 won the eighth-generation console competition.

Origin www.cnblogs.com/lattesea/p/12596302.html