Drawing a stacked bar chart with pandas and adding value labels

I recently needed to draw a stacked bar chart with the value of each segment printed on it. Older versions of matplotlib have no built-in way to label bars (Axes.bar_label only arrived in matplotlib 3.4), so the labels are added indirectly with plt.text(x, y, value). The approach is the same for a stacked chart as for an ordinary bar chart: loop over the rectangles and work out the x and y coordinates of each label.

The one thing to watch is the order in which the rectangles are traversed. In an ordinary bar chart the order is simple: bottom to top (my chart is horizontal). In a stacked chart, observation shows that the order is breadth first, that is, series by series rather than bar by bar. If each bar is made of four segments, the traversal follows an N shape: all segments of the first series from bottom to top, then all segments of the second series, and so on. A segment whose value is 0 still exists as a rectangle, just with length 0, so once the shape of the data is fixed, the number of rectangles is fixed too (rows × series). Whether to display the zero values is up to you. With that understood, the rest is easy, so here is the code.
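The traversal order described above can be checked directly. Below is a minimal sketch, assuming a small made-up DataFrame (the column names a and b, the index labels and the values are illustrative and not taken from the original data):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line when plotting interactively
import pandas as pd

# Hypothetical toy data: 3 rows (bars) x 2 columns (stacked segments).
df = pd.DataFrame({"a": [1, 0, 3], "b": [4, 5, 0]}, index=["r0", "r1", "r2"])
ax = df.plot.barh(stacked=True)

# ax.patches holds every segment of column "a" first, then every segment of "b".
widths = [float(rect.get_width()) for rect in ax.patches]
print(widths)

# Zero values still produce a rectangle (of width 0), so the count is rows * columns.
print(len(ax.patches))
```

The widths come out in column-major order, which is the N-shaped traversal: all of a from bottom to top, then all of b.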

Raw data

License plate  Level-1 failures  Level-2 failures  Level-3 failures  Abnormal failures  Total failures
221101-001     2                 0                 0                 3                  5
221101-002     5                 2                 0                 4                  11
221101-003     3                 3                 15                9                  30
221101-004     3                 0                 0                 2                  5
221102-005     0                 3                 0                 31                 34
221102-006     0                 0                 1                 3                  4
221103-007     2                 3                 0                 4                  9
221103-008     0                 0                 0                 2                  2
221103-009     2                 0                 0                 0                  2
221103-010     0                 3                 0                 17                 20
221104-011     2                 1                 0                 10                 13
221104-012     4                 0                 0                 0                  4
221105-013     0                 0                 0                 4                  4
221107-014     0                 0                 0                 1                  1
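To reproduce the chart, the table above can be loaded into a DataFrame. This is a sketch that assumes the four Chinese column names used by the plotting code; the index name license_plate and the column name total are hypothetical, since the original labels for those two are not shown.

```python
import pandas as pd

# Column names as referenced by the plotting code
# (level-1, level-2, level-3 and abnormal failure counts).
fault_cols = ["一级故障次数", "二级故障次数", "三级故障次数", "异常故障次数"]

rows = {
    "221101-001": [2, 0, 0, 3],
    "221101-002": [5, 2, 0, 4],
    "221101-003": [3, 3, 15, 9],
    "221101-004": [3, 0, 0, 2],
    "221102-005": [0, 3, 0, 31],
    "221102-006": [0, 0, 1, 3],
    "221103-007": [2, 3, 0, 4],
    "221103-008": [0, 0, 0, 2],
    "221103-009": [2, 0, 0, 0],
    "221103-010": [0, 3, 0, 17],
    "221104-011": [2, 1, 0, 10],
    "221104-012": [4, 0, 0, 0],
    "221105-013": [0, 0, 0, 4],
    "221107-014": [0, 0, 0, 1],
}

error_times_1000 = pd.DataFrame.from_dict(rows, orient="index", columns=fault_cols)
error_times_1000.index.name = "license_plate"  # hypothetical name
# The total is simply the row-wise sum of the four fault columns.
error_times_1000["total"] = error_times_1000[fault_cols].sum(axis=1)  # hypothetical name
print(error_times_1000)
```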

Code

        # Assumes `import numpy as np` and `import matplotlib.pyplot as plt` at module level,
        # and that self.error_times_1000 is the 14-row DataFrame shown under "Raw data".
        # The four y columns are the level-1, level-2, level-3 and abnormal failure counts;
        # the title reads "Failures per 1,000 km (stacked)".
        stack_chart = self.error_times_1000.plot.barh(
            y=["一级故障次数", "二级故障次数", "三级故障次数", "异常故障次数"],
            figsize=(14, 8), title="千公里故障次数_堆叠图", stacked=True)
        # Odd multipliers 1, 3, 5, ...: b_list[idx] * 0.5 - 0.5 equals idx for 0.5-high bars.
        b_list = list(np.arange(1, len(stack_chart.patches) * 2, 2))
        print("patch count:", len(stack_chart.patches))  # 14 rows * 4 columns = 56

        # Running total of the segment lengths already drawn on each of the 14 bars.
        width_list = [0] * 14
        for idx, rect in enumerate(stack_chart.patches):
            if rect.get_width() != 0:  # skip zero-length segments
                # x: centre of this segment; y: row position (wraps every 14 patches), nudged down 0.2.
                plt.text(width_list[idx % 14] + rect.get_width() / 2,
                         (b_list[idx] * rect.get_height() - 0.5) % (28 * 0.5) - 0.2,
                         '%s' % int(rect.get_width()))
            width_list[idx % 14] += rect.get_width()

With pandas, a stacked bar chart only needs the extra parameter stacked=True, and the y parameter selects the columns to draw, which is about as simple and direct as it gets. The fiddly part is the value labels. To centre each value inside its segment, you have to keep track of the total length already drawn on that bar: when labelling the second, third and fourth segments, add the lengths of the segments before them. For the vertical coordinate, each bar is 0.5 thick and the gap between bars is also 0.5, so each row takes up 1.0. I have 14 rows, so there are 14 × 2 = 28 half-units, and 28 × 0.5 = 14 is the value at which the y position wraps around for the next series. Only non-zero values are displayed; otherwise the chart looks odd.
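The bookkeeping described above can also be avoided, because each rectangle of a stacked chart already records where it starts. The sketch below is one way to do that, assuming the same four column names and a shortened three-row sample of the data; label_segments is a hypothetical helper, not part of pandas or matplotlib.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line when plotting interactively
import pandas as pd

cols = ["一级故障次数", "二级故障次数", "三级故障次数", "异常故障次数"]
# Shortened sample: the first three rows of the raw data.
df = pd.DataFrame(
    [[2, 0, 0, 3], [5, 2, 0, 4], [3, 3, 15, 9]],
    columns=cols,
    index=["221101-001", "221101-002", "221101-003"],
)

def label_segments(ax):
    """Write each non-zero segment's value at the centre of that segment."""
    texts = []
    for rect in ax.patches:
        width = rect.get_width()
        if width == 0:  # zero-length segments get no label
            continue
        # get_x() is the left edge, i.e. the sum of the segments drawn before this one.
        x = rect.get_x() + width / 2
        y = rect.get_y() + rect.get_height() / 2
        texts.append(ax.text(x, y, str(int(width)), ha="center", va="center"))
    return texts

ax = df.plot.barh(y=cols, stacked=True, figsize=(14, 8))
labels = label_segments(ax)
```

Because the position comes from the rectangle itself, this version does not depend on the number of rows or on the bar thickness. Call plt.show() or ax.figure.savefig(...) afterwards to see the result.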

Result

If the numbers are not positioned exactly where you want them, adjust the offsets by trial and error. Drawing directly with pandas is relatively simple. Drawing the same chart with matplotlib alone may take slightly more lines of code, so if you are already familiar with pandas, I recommend plotting with pandas.
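For comparison, a matplotlib-only version might look like the sketch below. It assumes matplotlib 3.4 or later, where Axes.bar_label is available, and uses a shortened three-row sample with hypothetical English series names.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; omit this line when plotting interactively
import matplotlib.pyplot as plt
import numpy as np

plates = ["221101-001", "221101-002", "221101-003"]
# Hypothetical English names for the four failure columns.
series = {
    "level 1": np.array([2, 5, 3]),
    "level 2": np.array([0, 2, 3]),
    "level 3": np.array([0, 0, 15]),
    "abnormal": np.array([3, 4, 9]),
}

fig, ax = plt.subplots(figsize=(14, 8))
left = np.zeros(len(plates))  # running left edge of each bar
for name, values in series.items():
    container = ax.barh(plates, values, height=0.5, left=left, label=name)
    # Empty strings hide the labels of zero-length segments.
    ax.bar_label(container,
                 labels=[str(v) if v else "" for v in values],
                 label_type="center")
    left = left + values  # the next series starts where this one ends
ax.legend()
```

Here the stacking has to be done by hand through the left argument, which is the extra work that stacked=True hides in the pandas version.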

Origin blog.csdn.net/zy1620454507/article/details/131638532