Note: The League of Legends hero attribute data set can be downloaded from the resources attached to this blog.
I. Introduction
League of Legends is a long-running MOBA (Dota-style) game that accompanied many of us through our school days.
In this post we fill in the missing values in a data set of League of Legends hero attributes, then compute and analyze the similarity matrix of the first ten heroes using the repaired data.
II. Task 1
Check the original data for missing values, then fill the missing values of the "Life Value" column with the median and those of the "Magic Value" column with the mean.
1. Check for missing values
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
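# Load the hero attribute table
pd1 = pd.read_csv("D:\\dataspace\\lolData.csv")
# Count the missing values in each column
print('Missing values per column:\n',pd1.isnull().sum())
# Column 2 is used as the life value, column 4 as the magic value
life = pd1.iloc[:,2].values
magic = pd1.iloc[:,4].values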
The printed output lists the number of missing values in each column.
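To make the missing-value check easier to try without the original CSV, here is a minimal sketch on a tiny hand-made table. The column names and values are invented for illustration and are not taken from lolData.csv.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for lolData.csv; names and numbers are invented.
df = pd.DataFrame({
    "name": ["A", "B", "C", "D"],
    "life": [520.0, np.nan, 580.0, np.nan],
    "magic": [300.0, 250.0, np.nan, 280.0],
})

# isnull() marks every missing cell as True; sum() counts the True cells per column.
missing_counts = df.isnull().sum()
print(missing_counts)
```

Each entry of `missing_counts` is the number of NaN cells in the column with that name, so a column with no gaps reports 0.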
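2. Use the median to fill the missing values of the "Life Value" attribute column
Values that are missing, non-positive, or 800 and above are treated as invalid and replaced with the median of the remaining values.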
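# Collect the valid life values (strictly between 0 and 800; NaN fails both comparisons)
lifelist=[]
for i in range(len(life)):
    if life[i]>0 and life[i]<800:
        lifelist.append(life[i])
print(lifelist)
# Median of the valid values, truncated to an integer
median1=int(np.median(lifelist))
print(median1)
# Replace missing or out-of-range entries with the median
for i in range(len(life)):
    if life[i]<=0 or life[i]>=800 or np.isnan(life[i]):
        life[i]=median1
# Write the repaired column back, since .values may have returned a copy
pd1.iloc[:,2]=life
print('Life values after processing:',life)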
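The same median fill can also be written without explicit loops. The following is only a sketch using pandas masking on invented sample values, with the same 0 to 800 validity range as above. It is an alternative formulation, not the code used in this post.

```python
import numpy as np
import pandas as pd

# Invented sample values; 0 and 9999 stand in for implausible entries.
life_sample = pd.Series([520.0, np.nan, 580.0, 0.0, 9999.0, 540.0])

# Keep values strictly between 0 and 800; everything else becomes NaN.
valid = life_sample.where((life_sample > 0) & (life_sample < 800))

# Series.median() skips NaN by default, so only valid values contribute.
median_value = int(valid.median())
filled = valid.fillna(median_value)

print(median_value)
print(filled.tolist())
```

Turning out-of-range entries into NaN first means a single `fillna` call handles both the missing and the implausible values.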
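3. Fill the missing values of the "Magic Value" attribute column with the mean
The same validity range is used here: values that are missing, non-positive, or 800 and above are replaced with the mean of the remaining values.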
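# Collect the valid magic values (strictly between 0 and 800; NaN fails both comparisons)
magiclist=[]
for i in range(len(magic)):
    if magic[i]>0 and magic[i]<800:
        magiclist.append(magic[i])
print(magiclist)
# Mean of the valid values, truncated to an integer
mean=int(np.mean(magiclist))
print(mean)
# Replace missing or out-of-range entries with the mean
for i in range(len(magic)):
    if magic[i]<=0 or magic[i]>=800 or np.isnan(magic[i]):
        magic[i]=mean
# Write the repaired column back, since .values may have returned a copy
pd1.iloc[:,4]=magic
print('Magic values after processing:',magic)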
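Because the two fill steps differ only in the statistic they use, they could be folded into one helper. The function below is a hypothetical sketch (the name `fill_out_of_range` and its parameters are not part of the original code), shown on invented values.

```python
import numpy as np

def fill_out_of_range(values, low, high, strategy="mean"):
    """Replace NaN and values outside (low, high) with the mean or median of the valid ones."""
    values = np.asarray(values, dtype=float)
    # Comparisons with NaN evaluate to False, so NaN is never counted as valid.
    valid_mask = (values > low) & (values < high)
    if strategy == "mean":
        fill = int(np.mean(values[valid_mask]))
    elif strategy == "median":
        fill = int(np.median(values[valid_mask]))
    else:
        raise ValueError("strategy must be 'mean' or 'median'")
    # Keep valid entries, substitute the fill value everywhere else.
    return np.where(valid_mask, values, fill)

# Invented sample: one missing value and one negative value.
magic_sample = [300.0, np.nan, 250.0, -5.0, 281.0]
repaired = fill_out_of_range(magic_sample, 0, 800, strategy="mean")
print(repaired)
```

The sketch assumes at least one valid value exists. With none, the mean of an empty selection is NaN and the integer conversion fails.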
III. Task 2
Using the repaired data from Task 1, calculate the similarity matrix of the first ten heroes in the data table, and find the two heroes that are closest to each other and the two that are farthest apart.
The first ten heroes in the data table are:
Rakan
Xayah
Camille
Ivern
Kled
Taliyah
Aurelion Sol
Jhin
Illaoi
Kindred
First, calculate the cosine similarity matrix of the first ten heroes in the data table.
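# First ten rows of the table
head1=pd1.head(10)
# Drop the hero name column ("英雄名字"); the remaining columns form the feature vector
head2=head1.drop(columns="英雄名字")
# print(head1)
res= np.zeros((10,10))
def cosine_similarity(x, y):
    # Cosine of the angle between the vectors x and y
    x=np.array(x)
    y=np.array(y)
    num = np.dot(x,y)
    denom = np.linalg.norm(x) * np.linalg.norm(y)
    return num / denom
# Fill the 10 x 10 matrix with the similarity of every pair of rows
for i in range(10):
    for j in range(10):
        res[i][j]= cosine_similarity(head2.iloc[i,:],head2.iloc[j,:])
print(res)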
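The double loop computes cos(x, y) = x·y / (‖x‖‖y‖) one pair at a time. As a sketch of an equivalent vectorized form, the rows can be normalized once and multiplied as a matrix. The function name and the sample rows below are invented for illustration, and the sketch assumes no row is all zeros.

```python
import numpy as np

def cosine_similarity_matrix(X):
    """Return the matrix of cosine similarities between the rows of X."""
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    unit = X / norms        # each row scaled to length 1
    return unit @ unit.T    # entry (i, j) is the dot product of unit rows i and j

# Invented attribute rows, not taken from lolData.csv.
X = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 3.0]])
S = cosine_similarity_matrix(X)
print(np.round(S, 3))
```

The result is symmetric with ones on the diagonal, since every vector has similarity 1 with itself.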
Next, compute the Euclidean distance between every pair of heroes.
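def dist(x,y):
    # Euclidean distance between the vectors x and y
    x = np.array(x)
    y = np.array(y)
    dist1 = np.linalg.norm(x-y)
    return dist1
# Distances of all pairs (i, j) with i < j
dist11=[]
for i in range(10):
    for j in range(i+1,10):
        dist2= dist(head2.iloc[i,:],head2.iloc[j,:])
        dist11.append(dist2)
        print("Euclidean distance between {0} and {1}: {2}".format(i, j,dist2))
print('Euclidean distances\n',dist11)
print('Maximum:',max(dist11))
print('Minimum:',min(dist11))
# Positions of the minimum and maximum within dist11 (not row indices)
print(np.argmin(dist11))
print(np.argmax(dist11))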
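Note that `np.argmin` and `np.argmax` on a flat list return a position in that list, not the pair of rows. One way to recover the pair, sketched here on invented data, is to keep the upper-triangle indices next to the distances.

```python
import numpy as np

# Invented attribute rows, not taken from lolData.csv.
X = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0], [0.0, 1.0]])

# Broadcasting builds all row differences at once: shape (n, n, d).
diff = X[:, None, :] - X[None, :, :]
D = np.linalg.norm(diff, axis=2)    # D[i, j] is the distance between rows i and j

# Indices of the upper triangle (i < j), in the same order as the nested loops.
rows, cols = np.triu_indices(len(X), k=1)
pair_dist = D[rows, cols]

# Map the flat positions of the extremes back to (i, j) row pairs.
k_min, k_max = np.argmin(pair_dist), np.argmax(pair_dist)
closest = (int(rows[k_min]), int(cols[k_min]))
farthest = (int(rows[k_max]), int(cols[k_max]))
print(closest, farthest)
```

Restricting the search to `i < j` skips the zero distances on the diagonal, which would otherwise always win the minimum.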
Finally, find the two heroes that are closest to each other and the two that are farthest apart.
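# Compute the extremes once instead of on every iteration
dmin=min(dist11)
dmax=max(dist11)
for i in range(10):
    for j in range(i+1,10):
        # Same computation as before, so exact equality with dmin or dmax is safe
        d=dist(head2.iloc[i, :], head2.iloc[j, :])
        if d==dmin:
            print("Smallest Euclidean distance: {0} and {1}".format(i, j))
            # head1.iloc[i,1] is read as the hero name
            a=head1.iloc[i,1]
            b=head1.iloc[j,1]
            print("The two closest heroes are:")
            print(a,'|',b)
        elif d==dmax:
            print("Largest Euclidean distance: {0} and {1}".format(i, j))
            a = head1.iloc[i, 1]
            b = head1.iloc[j, 1]
            print("The two farthest heroes are:")
            print(a, '|', b)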
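The second pass over all pairs can be avoided by storing the names together with each distance. The sketch below uses invented hero labels and attribute vectors, and is only one possible way to organize the search.

```python
from itertools import combinations

import numpy as np

# Invented labels and attribute vectors, used only for illustration.
heroes = {
    "A": np.array([500.0, 300.0]),
    "B": np.array([510.0, 310.0]),
    "C": np.array([700.0, 100.0]),
}

# One (distance, name, name) record per unordered pair of heroes.
pairs = [
    (float(np.linalg.norm(heroes[a] - heroes[b])), a, b)
    for a, b in combinations(heroes, 2)
]

# Tuples compare by their first element, so min and max pick by distance.
closest = min(pairs)
farthest = max(pairs)
print("closest:", closest[1], "|", closest[2])
print("farthest:", farthest[1], "|", farthest[2])
```

If two pairs tie on distance, the tuple comparison falls back to the names, so the result stays deterministic.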