Data visualization: combining object-oriented thinking with data visualization

Foreword

We have already learned how to use Python's pyecharts module to visualize data, displaying processed data as line charts, maps and bar charts. In this article, I will use an example to show how to combine object-oriented ideas with data visualization.

The process of realizing data visualization

  1. Collect data: Collect the data that needs to be visualized and ensure the accuracy and completeness of the data. Data can come from various sources such as databases, log files, APIs, etc.

  2. Clean and organize the data: Clean and organize the collected data, including removing duplicate values and handling missing data and outliers, to ensure data quality and consistency.

  3. Choose the right visualization tool: Choose the right visualization tool based on the type and needs of the data. Common visualization tools include Tableau, Power BI, matplotlib, D3.js, etc.

  4. Select the visualization type: Select the appropriate visualization type according to the characteristics of the data and what needs to be expressed. Common visualization types include bar charts, line charts, scatter charts, pie charts, radar charts, and more.

  5. Design a visual interface: According to the characteristics of the data and the type of visualization, design a suitable visual interface. The interface should be concise and clear, focusing on the display and comparison of key data.

  6. Draw a chart: Use the selected visualization tool to draw a designed visual chart. According to requirements, add appropriate legends, labels, titles, etc. to increase the readability and understandability of the chart.

  7. Data interaction and analysis: Add interactive functions to the visual interface, such as hovering the mouse to display data details, clicking on chart elements to filter, etc. Through the interactive function, users can further analyze and explore the data.

  8. Adjustment and optimization: Adjust and optimize the visual interface according to user feedback and changing requirements. You can modify chart styles, improve interactivity, add new data dimensions, and more.

  9. Sharing and Publishing: Share and publish the completed visualization results, which can be displayed and shared by exporting static images, generating reports, and embedding them in web pages.

  10. Monitoring and updating: Regularly monitor the visualization results, update the data and adjust the visualization interface in time to maintain the timeliness and accuracy of the visualization results.

What we build here is a simple visualization, so this article covers only four of these steps: collecting data, cleaning and organizing it, selecting the visualization type, and drawing the chart.

Realize data visualization

As an example, we take the sales records for two months and display the daily sales totals as a chart.

Read data

The data has already been packed into files, so all we need to do is read it. The two files use different formats: one is in CSV format and the other is in JSON format, so the way the data is parsed also differs. We therefore use two classes, one for each format.

An interface is defined first so that the two reader classes below can be used in the same way.

class FileReader():

    def reader(self) -> list[Record]:
        pass

What is an interface? Here it means a class whose methods have no concrete body: pass stands in for the method body, and the actual implementation is supplied by the subclasses that inherit from it.
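Python has no dedicated interface keyword, so the pass-based class above works by convention only. As a minimal sketch of an alternative (not what this article's code uses), the standard-library abc module can enforce that subclasses implement reader. The names AbstractFileReader and DummyReader are hypothetical, and the return type is simplified to list[str] so the block runs on its own.

```python
from abc import ABC, abstractmethod


class AbstractFileReader(ABC):
    """Hypothetical variant of FileReader that enforces the interface."""

    @abstractmethod
    def reader(self) -> list[str]:
        """Subclasses must override this method."""


class DummyReader(AbstractFileReader):
    """Hypothetical subclass used only to demonstrate the override."""

    def reader(self) -> list[str]:
        return ["row 1", "row 2"]


# Instantiating the abstract class directly raises TypeError,
# while the concrete subclass works as usual.
try:
    AbstractFileReader()
    abstract_instantiated = True
except TypeError:
    abstract_instantiated = False

rows = DummyReader().reader()
```

With the plain pass version, forgetting to override reader only shows up later, when the method silently returns None. With abc, the mistake surfaces as soon as the object is created.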

The file-reading code is basically the same for both formats, but because the parsing that follows differs, we still implement it in two classes.

Read csv format file data

class TestFileReader(FileReader):
    def __init__(self,path):
        self.path = path

    def reader(self) -> list[Record]:
        f = open(self.path,"r",encoding="UTF8")
        data_lines = f.readlines()
        f.close()

Read JSON format file data

class JsonFileReader(FileReader):
    def __init__(self,path):
        self.path = path

    def reader(self) -> list[Record]:
        f = open(self.path,"r",encoding="UTF8")
        data_lines = f.readlines()
        f.close()
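Both skeletons open and close the file by hand. As a minimal sketch, the same read step can be written with a with statement, which closes the file even if reading raises an exception. The helper name read_lines and the temporary demo file are assumptions made for illustration.

```python
import os
import tempfile


def read_lines(path: str) -> list[str]:
    """Hypothetical helper: return all lines of a UTF-8 text file."""
    with open(path, "r", encoding="UTF8") as f:
        return f.readlines()


# Demo with a throwaway file so the sketch runs anywhere.
with tempfile.NamedTemporaryFile(
    "w", encoding="UTF8", suffix=".txt", delete=False
) as tmp:
    tmp.write("first line\nsecond line\n")
    demo_path = tmp.name

lines = read_lines(demo_path)
os.remove(demo_path)  # clean up the demo file
```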

Create object

Treat each piece of sales information as an object.

class Record():
    def __init__(self,data,order_id,money,province):
        self.data = data
        self.order_id = order_id
        self.money = money
        self.province = province

    def __str__(self):
        return f'{self.data},{self.order_id},{self.money},{self.province}'

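__init__ is the constructor; it initializes the attributes. Note that the attribute named data holds the sale date.
__str__ returns a string form of the record, which makes printing convenient.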
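As a quick usage sketch, the class can be exercised with made-up values (the date, order id, amount and province below are illustrative and not taken from the article's data files). The class is repeated so the block runs on its own.

```python
class Record:
    def __init__(self, data, order_id, money, province):
        self.data = data          # the sale date (the article names this attribute "data")
        self.order_id = order_id
        self.money = money
        self.province = province

    def __str__(self):
        return f"{self.data},{self.order_id},{self.money},{self.province}"


# Illustrative values only.
record = Record("2011-01-01", "order-001", 1980, "Hunan")
text = str(record)
print(text)  # prints: 2011-01-01,order-001,1980,Hunan
```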

Data processing

We convert the data into objects, each object represents a piece of sales information, and then store these objects in a list.

TestFileReader class

class TestFileReader(FileReader):
    def __init__(self,path):
        self.path = path

    def reader(self) -> list[Record]:
        f = open(self.path,"r",encoding="UTF8")
        data_lines = f.readlines()
        f.close()
        list1 : list[Record] = []
        for line in data_lines:
            line = line.strip()  # strip removes the trailing \n from each line
            data_list = line.split(",")
            record = Record(data_list[0],data_list[1],int(data_list[2]),data_list[3])
            list1.append(record)
        return list1
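One thing to watch with this loop is that a blank line in the file (for example a trailing empty line) would make data_list[1] raise an IndexError. The following sketch shows the same splitting logic with a guard for blank lines. The helper name parse_csv_lines and the sample lines are invented, and tuples stand in for Record objects so the block is self-contained.

```python
def parse_csv_lines(lines: list[str]) -> list[tuple[str, str, int, str]]:
    """Hypothetical helper: split each non-empty line into typed fields."""
    rows = []
    for line in lines:
        line = line.strip()          # remove the trailing newline and surrounding spaces
        if not line:                 # skip blank lines instead of failing on them
            continue
        date, order_id, money, province = line.split(",")
        rows.append((date, order_id, int(money), province))
    return rows


# Invented sample lines, including a blank one.
sample = [
    "2011-01-01,order-001,1980,Hunan\n",
    "\n",
    "2011-01-02,order-002,520,Hubei\n",
]
rows = parse_csv_lines(sample)
```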

JsonFileReader class

class JsonFileReader(FileReader):
    def __init__(self,path):
        self.path = path

    def reader(self) -> list[Record]:
        f = open(self.path,"r",encoding="UTF8")
        data_lines = f.readlines()
        f.close()
        list1 : list[Record] = []
        for line in data_lines:
            line = line.strip()
            data_dict = json.loads(line)  # convert the JSON text into a Python dict
            record = Record(data_dict["date"],data_dict["order_id"],int(data_dict["money"]),data_dict["province"])
            list1.append(record)

        return list1
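The JSON reader assumes the file holds one JSON object per line. Here is a minimal sketch of what json.loads returns for such a line, using an invented sample whose keys match those used above (date, order_id, money, province).

```python
import json

# Invented sample line; the real file is assumed to hold one object like this per line.
line = '{"date": "2011-02-01", "order_id": "order-101", "money": 1805, "province": "Jiangxi"}\n'

data_dict = json.loads(line.strip())   # a JSON object becomes a Python dict
money = int(data_dict["money"])        # int() also covers amounts stored as numeric strings
```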

Data analysis

Add up the sales for each day and store the totals in a dictionary. Why a dictionary? Because dictionary keys are unique, each date can serve as a key, with that day's total sales as its value.

test_file = TestFileReader("D:/桌面/2011年1月销售数据.txt")
json_file = JsonFileReader("D:/桌面/2011年2月销售数据JSON.txt")

list1 = test_file.reader()
list2 = json_file.reader()

data_list = list1 + list2  # combine the two months of data
data_dict = {}

for record in data_list:
    if record.data in data_dict.keys():  # the date is already stored: add the current amount to the stored total
        data_dict[record.data] += record.money
    else:  # the date has not appeared yet: store the amount directly
        data_dict[record.data] = record.money
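The same accumulation can be written more compactly. The following sketch uses collections.defaultdict from the standard library; the (date, money) pairs stand in for Record objects and are invented.

```python
from collections import defaultdict

# Invented (date, money) pairs standing in for record.data and record.money.
sales = [("2011-01-01", 100), ("2011-01-02", 50), ("2011-01-01", 25)]

totals = defaultdict(int)       # missing keys start at 0
for date, money in sales:
    totals[date] += money

result = dict(totals)
print(result)  # prints: {'2011-01-01': 125, '2011-01-02': 50}
```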

Draw a bar chart

In this example we use a bar chart, which shows the differences between the daily totals clearly. Other chart types would also work; the bar chart is simply the one chosen here.

bar = Bar(init_opts=InitOpts(theme=ThemeType.LIGHT))
bar.add_xaxis(list(data_dict.keys()))
bar.add_yaxis("销售额",list(data_dict.values()),label_opts=LabelOpts(is_show=False))  # "销售额" (sales amount) is the series name; the label option hides the value labels on the bars

bar.set_global_opts(
    title_opts=TitleOpts(title="2011年1、2月销售情况")
)

bar.render("2011年1、2月销售情况.html")
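A dictionary keeps its keys in insertion order, so the x-axis follows the order in which the dates were first seen. If the input files are not already in chronological order, the dates can be sorted before they are passed to add_xaxis and add_yaxis. This is a sketch that assumes ISO-style date strings such as 2011-01-01, which sort correctly as text; the values are invented.

```python
# Invented totals, deliberately out of chronological order.
data_dict = {"2011-02-01": 300, "2011-01-02": 50, "2011-01-01": 125}

x_data = sorted(data_dict)                     # dates in ascending order
y_data = [data_dict[day] for day in x_data]    # totals aligned with the sorted dates
```

The two lists x_data and y_data can then replace list(data_dict.keys()) and list(data_dict.values()) in the chart code.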

Overall code and effect display

data_define.py file

class Record():
    def __init__(self,data,order_id,money,province):
        self.data = data
        self.order_id = order_id
        self.money = money
        self.province = province

    def __str__(self):
        return f'{self.data},{self.order_id},{self.money},{self.province}'

file_define.py file

from data_define import Record
import json

class FileReader():

    def reader(self) -> list[Record]:
        pass

class TestFileReader(FileReader):
    def __init__(self,path):
        self.path = path

    def reader(self) -> list[Record]:
        f = open(self.path,"r",encoding="UTF8")
        data_lines = f.readlines()
        f.close()
        list1 : list[Record] = []
        for line in data_lines:
            line = line.strip()
            data_list = line.split(",")
            record = Record(data_list[0],data_list[1],int(data_list[2]),data_list[3])
            list1.append(record)
        return list1

class JsonFileReader(FileReader):
    def __init__(self,path):
        self.path = path

    def reader(self) -> list[Record]:
        f = open(self.path,"r",encoding="UTF8")
        data_lines = f.readlines()
        f.close()
        list1 : list[Record] = []
        for line in data_lines:
            line = line.strip()
            data_dict = json.loads(line)
            record = Record(data_dict["date"],data_dict["order_id"],int(data_dict["money"]),data_dict["province"])
            list1.append(record)

        return list1

main.py file

from data_define import Record
from file_define import *
from pyecharts.charts import Bar
from pyecharts.options import TitleOpts,LabelOpts,InitOpts
from pyecharts.globals import ThemeType

test_file = TestFileReader("D:/桌面/2011年1月销售数据.txt")
json_file = JsonFileReader("D:/桌面/2011年2月销售数据JSON.txt")

list1 = test_file.reader()
list2 = json_file.reader()

data_list = list1 + list2
data_dict = {}

for record in data_list:
    if record.data in data_dict.keys():
        data_dict[record.data] += record.money
    else:
        data_dict[record.data] = record.money

bar = Bar(init_opts=InitOpts(theme=ThemeType.LIGHT))
bar.add_xaxis(list(data_dict.keys()))
bar.add_yaxis("销售额",list(data_dict.values()),label_opts=LabelOpts(is_show=False))

bar.set_global_opts(
    title_opts=TitleOpts(title="2011年1、2月销售情况")
)

bar.render("2011年1、2月销售情况.html")



Origin blog.csdn.net/m0_73888323/article/details/131872214