Foreword
We have learned how to use the Python pyecharts module to visualize data, displaying processed data as line charts, maps, and bar charts. In this article, I will use an example to show how to combine object-oriented ideas with data visualization.
The data visualization process
- Collect data: Collect the data that needs to be visualized and make sure it is accurate and complete. Data can come from various sources such as databases, log files, and APIs.
- Clean and organize data: Clean and organize the collected data, including removing duplicate values and handling missing data and outliers, to ensure data quality and consistency.
- Choose the right visualization tool: Choose a tool based on the type of data and your needs. Common visualization tools include Tableau, Power BI, matplotlib, and D3.js.
- Select the visualization type: Select a visualization type that fits the characteristics of the data and what you want to express. Common types include bar charts, line charts, scatter charts, pie charts, and radar charts.
- Design the visual interface: Design an interface that suits the data and the visualization type. It should be concise and clear, focusing on displaying and comparing the key data.
- Draw the chart: Use the selected tool to draw the designed chart. Add legends, labels, and titles as needed to make the chart easier to read and understand.
- Add interaction and analysis: Add interactive functions to the interface, such as showing data details on mouse hover or filtering when a chart element is clicked, so that users can analyze and explore the data further.
- Adjust and optimize: Adjust and optimize the interface according to user feedback and changing requirements, for example by modifying chart styles, improving interactivity, or adding new data dimensions.
- Share and publish: Share and publish the finished visualization by exporting static images, generating reports, or embedding it in web pages.
- Monitor and update: Regularly monitor the visualization, update the data, and adjust the interface in time to keep the results current and accurate.
This article builds a simple visualization, so it covers only four of these steps: collecting data, cleaning and organizing it, selecting the visualization type, and drawing the chart.
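As a small illustration of the cleaning step, the sketch below removes blank lines, exact duplicates, and rows with missing fields from comma-separated lines. The function name clean_lines, the expected field count of four, and the sample rows are assumptions made for this example; they are not taken from the sales files used later.

```python
def clean_lines(lines: list[str], field_count: int = 4) -> list[str]:
    """Return the lines with blanks, duplicates, and incomplete rows removed."""
    seen = set()
    cleaned = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split(",")
        if len(fields) != field_count or any(field == "" for field in fields):
            continue  # skip rows with missing or extra fields
        if line in seen:
            continue  # skip exact duplicates
        seen.add(line)
        cleaned.append(line)
    return cleaned


# Made-up sample rows: date, order ID, amount, province
raw = [
    "2011-01-01,a001,100,Hunan\n",
    "2011-01-01,a001,100,Hunan\n",  # duplicate
    "2011-01-02,a002,,Hubei\n",     # missing amount
    "\n",                           # blank line
    "2011-01-03,a003,250,Hunan\n",
]
print(clean_lines(raw))
```

Only the first and last rows survive, in their original order.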
Implementing the data visualization
Let's take two months of sales as an example and display the daily sales totals for those two months as a chart.
Read data
The data is already stored in files, so all we need to do is read it from them. The two files use different formats, one CSV and the other JSON, so the way the data is read and processed also differs. We therefore write two reader classes, one for each format.
First we define an interface that the two reader classes will implement.
class FileReader():
    def reader(self) -> list[Record]:
        pass
What is an interface? Here, an interface is a class whose methods have no concrete body: pass stands in for the method body, and the actual implementation is provided by the subclasses that inherit from the class.
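Python does not enforce this convention by itself: a subclass that forgets to override reader would simply return None. One optional way to make the contract explicit is the standard abc module, sketched below. This is an alternative to the pass-based class, not the approach the rest of this article uses; the Record stand-in and the DemoReader subclass are hypothetical and exist only to keep the example self-contained.

```python
from abc import ABC, abstractmethod


class Record:
    """Minimal stand-in for the article's Record class."""

    def __init__(self, data, order_id, money, province):
        self.data = data
        self.order_id = order_id
        self.money = money
        self.province = province


class FileReader(ABC):
    @abstractmethod
    def reader(self) -> list[Record]:
        """Read the file and return one Record per line."""


class DemoReader(FileReader):
    # Hypothetical subclass that returns a fixed record instead of reading a file
    def reader(self) -> list[Record]:
        return [Record("2011-01-01", "a001", 100, "Hunan")]


try:
    FileReader()  # abstract classes cannot be instantiated
except TypeError as err:
    print("cannot instantiate:", type(err).__name__)

records = DemoReader().reader()
print(len(records))
```

With this variant, a subclass that does not implement reader fails when it is instantiated rather than later, when the missing data is first used.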
Opening and reading the file works basically the same way for both formats, but the subsequent data processing differs, so we still implement the readers as two classes.
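Read the CSV file
class TestFileReader(FileReader):
    def __init__(self, path):
        self.path = path  # path of the CSV-format file

    def reader(self) -> list[Record]:
        # Read all lines; converting them to Record objects is added under "Data processing"
        with open(self.path, "r", encoding="UTF-8") as f:
            data_lines = f.readlines()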
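Read the JSON file
class JsonFileReader(FileReader):
    def __init__(self, path):
        self.path = path  # path of the JSON-format file

    def reader(self) -> list[Record]:
        # Read all lines; converting them to Record objects is added under "Data processing"
        with open(self.path, "r", encoding="UTF-8") as f:
            data_lines = f.readlines()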
Create the record object
We treat each sales record as an object.
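class Record():
    def __init__(self, data, order_id, money, province):
        self.data = data          # order date (the attribute is named data throughout this article)
        self.order_id = order_id  # order ID
        self.money = money        # order amount
        self.province = province  # province where the sale was made

    def __str__(self):
        return f'{self.data},{self.order_id},{self.money},{self.province}'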
The __init__ constructor initializes the attributes.
The __str__ method returns a readable string, which makes the object easy to print.
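To see what __str__ provides, the short example below creates one record and prints it. The class is repeated so that the snippet runs on its own, and the field values are made up for illustration.

```python
class Record:
    def __init__(self, data, order_id, money, province):
        self.data = data          # order date
        self.order_id = order_id
        self.money = money
        self.province = province

    def __str__(self):
        return f"{self.data},{self.order_id},{self.money},{self.province}"


record = Record("2011-01-01", "a001", 100, "Hunan")  # made-up values
print(record)  # print() calls __str__: 2011-01-01,a001,100,Hunan
```

Without __str__, print would fall back to the default representation, which shows only the class name and a memory address.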
Data processing
We convert each line of data into an object that represents one sales record, and then store these objects in a list.
TestFileReader class
class TestFileReader(FileReader):
    def __init__(self, path):
        self.path = path

    def reader(self) -> list[Record]:
        with open(self.path, "r", encoding="UTF-8") as f:
            data_lines = f.readlines()

        record_list: list[Record] = []
        for line in data_lines:
            line = line.strip()  # strip() removes the trailing \n from each line
            data_list = line.split(",")
            record = Record(data_list[0], data_list[1], int(data_list[2]), data_list[3])
            record_list.append(record)
        return record_list
JsonFileReader class
import json

class JsonFileReader(FileReader):
    def __init__(self, path):
        self.path = path

    def reader(self) -> list[Record]:
        with open(self.path, "r", encoding="UTF-8") as f:
            data_lines = f.readlines()

        record_list: list[Record] = []
        for line in data_lines:
            line = line.strip()
            data_dict = json.loads(line)  # convert the JSON text into a Python dictionary
            record = Record(data_dict["date"], data_dict["order_id"], int(data_dict["money"]), data_dict["province"])
            record_list.append(record)
        return record_list
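The two readers imply a line format for each file. The data files themselves are not shown here, so the sample lines below are assumptions inferred from the parsing code: the CSV-style file holds date, order ID, amount, and province separated by commas, and the JSON file holds one object per line with the keys date, order_id, money, and province. The sketch parses one made-up line of each kind; the helper names parse_csv_line and parse_json_line are hypothetical.

```python
import json

# Made-up sample lines in the assumed formats
csv_line = "2011-01-01,a001,100,Hunan\n"
json_line = '{"date": "2011-02-01", "order_id": "b001", "money": 200, "province": "Hubei"}\n'


def parse_csv_line(line: str) -> tuple:
    """Split a comma-separated line into (date, order_id, money, province)."""
    date, order_id, money, province = line.strip().split(",")
    return date, order_id, int(money), province


def parse_json_line(line: str) -> tuple:
    """Decode a one-object-per-line JSON record into the same tuple shape."""
    fields = json.loads(line.strip())
    return fields["date"], fields["order_id"], int(fields["money"]), fields["province"]


print(parse_csv_line(csv_line))
print(parse_json_line(json_line))
```

Both helpers return the fields in the same order, which is why the two reader classes can produce the same kind of Record object from different file formats.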
Data analysis
We add up the sales for each day and store the totals in a dictionary. Why a dictionary? Because dictionary keys are unique, each date can serve as a key, with that day's total sales as its value.
test_file = TestFileReader("D:/桌面/2011年1月销售数据.txt")
json_file = JsonFileReader("D:/桌面/2011年2月销售数据JSON.txt")
list1 = test_file.reader()
list2 = json_file.reader()
data_list = list1 + list2  # combine the data for the two months
data_dict = {}
for record in data_list:
    if record.data in data_dict:  # date already stored: add the current amount to the stored total
        data_dict[record.data] += record.money
    else:  # first record for this date: store the amount directly
        data_dict[record.data] = record.money
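The same accumulation can also be written with dict.get, which supplies a default of 0 for dates that have not been seen yet. The sketch below shows this variant on a few made-up (date, amount) pairs; the helper name total_by_date is hypothetical.

```python
def total_by_date(pairs):
    """Sum the amounts for each date; pairs is an iterable of (date, amount)."""
    totals = {}
    for date, money in pairs:
        # get() returns 0 the first time a date appears
        totals[date] = totals.get(date, 0) + money
    return totals


sample = [("2011-01-01", 100), ("2011-01-02", 50), ("2011-01-01", 25)]
print(total_by_date(sample))  # {'2011-01-01': 125, '2011-01-02': 50}
```

Both forms produce the same dictionary; the if/else version spells out the two cases, while the get version is shorter.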
Draw the bar chart
A bar chart makes the differences between the daily totals easy to see, so we use one in this example.
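from pyecharts.charts import Bar
from pyecharts.options import TitleOpts, LabelOpts, InitOpts
from pyecharts.globals import ThemeType

bar = Bar(init_opts=InitOpts(theme=ThemeType.LIGHT))
bar.add_xaxis(list(data_dict.keys()))
# The series option label_opts hides the value labels on the bars
bar.add_yaxis("Sales", list(data_dict.values()), label_opts=LabelOpts(is_show=False))
bar.set_global_opts(
    title_opts=TitleOpts(title="Sales for January and February 2011")
)
bar.render("2011年1、2月销售情况.html")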
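A dictionary keeps its keys in insertion order, so the bars appear in the order in which the dates were first read. If the files are not already in chronological order, the dates can be sorted before they are passed to add_xaxis and add_yaxis. The sketch below assumes the dates are strings in YYYY-MM-DD form (an assumption, since the data files are not shown), for which plain string sorting matches chronological order. The helper name sorted_axes and the sample totals are made up.

```python
def sorted_axes(totals: dict) -> tuple[list, list]:
    """Return (dates, values) with the dates in ascending order."""
    dates = sorted(totals)  # sorting a dict sorts its keys
    values = [totals[date] for date in dates]
    return dates, values


# Made-up totals, deliberately out of order
sample_totals = {"2011-02-01": 300, "2011-01-02": 50, "2011-01-01": 125}
x_data, y_data = sorted_axes(sample_totals)
print(x_data)
print(y_data)
```

The two lists stay aligned, so x_data and y_data can be given to the chart in place of the dictionary's keys and values.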
Complete code and result
data_define.py file
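class Record():
    def __init__(self, data, order_id, money, province):
        self.data = data          # order date (the attribute is named data throughout this article)
        self.order_id = order_id  # order ID
        self.money = money        # order amount
        self.province = province  # province where the sale was made

    def __str__(self):
        return f'{self.data},{self.order_id},{self.money},{self.province}'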
file_define.py file
from data_define import Record
import json


class FileReader():
    def reader(self) -> list[Record]:
        pass


class TestFileReader(FileReader):
    def __init__(self, path):
        self.path = path

    def reader(self) -> list[Record]:
        with open(self.path, "r", encoding="UTF-8") as f:
            data_lines = f.readlines()

        record_list: list[Record] = []
        for line in data_lines:
            line = line.strip()
            data_list = line.split(",")
            record = Record(data_list[0], data_list[1], int(data_list[2]), data_list[3])
            record_list.append(record)
        return record_list


class JsonFileReader(FileReader):
    def __init__(self, path):
        self.path = path

    def reader(self) -> list[Record]:
        with open(self.path, "r", encoding="UTF-8") as f:
            data_lines = f.readlines()

        record_list: list[Record] = []
        for line in data_lines:
            line = line.strip()
            data_dict = json.loads(line)
            record = Record(data_dict["date"], data_dict["order_id"], int(data_dict["money"]), data_dict["province"])
            record_list.append(record)
        return record_list
main.py file
from data_define import Record
from file_define import TestFileReader, JsonFileReader
from pyecharts.charts import Bar
from pyecharts.options import TitleOpts, LabelOpts, InitOpts
from pyecharts.globals import ThemeType

# Read both months and combine the records
test_file = TestFileReader("D:/桌面/2011年1月销售数据.txt")
json_file = JsonFileReader("D:/桌面/2011年2月销售数据JSON.txt")
list1 = test_file.reader()
list2 = json_file.reader()
data_list = list1 + list2

# Sum the sales for each date
data_dict = {}
for record in data_list:
    if record.data in data_dict:
        data_dict[record.data] += record.money
    else:
        data_dict[record.data] = record.money

# Draw the bar chart and write it to an HTML file
bar = Bar(init_opts=InitOpts(theme=ThemeType.LIGHT))
bar.add_xaxis(list(data_dict.keys()))
bar.add_yaxis("Sales", list(data_dict.values()), label_opts=LabelOpts(is_show=False))
bar.set_global_opts(
    title_opts=TitleOpts(title="Sales for January and February 2011")
)
bar.render("2011年1、2月销售情况.html")