What is going wrong with this stacked bar plot?

SHV_la :

I really don't understand what's going wrong with this... I've looked through what is pretty simple data several times and have restarted the kernel (running on Jupyter Notebook) and nothing seems to be solving it.

Here's the data frame I have (sorry I know the numbers look a bit silly, this is a really sparse dataset over a long time period, original is reindexed for 20 years):

DATE        NODP            NVP             VP              VDP
03/08/2002  0.083623        0.10400659      0.81235517      1.52458E-05
14/09/2003  0.24669167      0.24806379      0.5052293       1.52458E-05
26/07/2005  0.15553726      0.13324796      0.7111538       0.000060983
20/05/2006  0               0.23            0.315           0.455
05/06/2007  0.21280034      0.29139224      0.49579217      1.52458E-05
21/02/2010  0               0.55502195      0.4449628       1.52458E-05
09/04/2011  0.09531311      0.17514162      0.72954527      0
14/02/2012  0.19213217      0.12866237      0.67920546      0
17/01/2014  0.12438848      0.10297326      0.77263826      0
24/02/2017  0.01541347      0.09897548      0.88561105      0

Note that all of the rows add up to 1! I have triple, quadruple checked this...XD

I am trying to produce a stacked bar chart of this data, with the following code, which seems to have worked perfectly for everything else I have been using it for:

NODP = df['NODP']
NVP = df['NVP']
VDP = df['VDP']
VP = df['VP']
ind = np.arange(len(df.index))
width = 5.0

p1 = plt.bar(ind, NODP, width, label = 'NODP', bottom=NVP, color= 'grey')
p2 = plt.bar(ind, NVP, width, label = 'NVP', bottom=VDP, color= 'tan')
p3 = plt.bar(ind, VDP, width, label = 'VDP', bottom=VP, color= 'darkorange')
p4 = plt.bar(ind, VP, width, label = 'VP', color= 'darkgreen')
plt.ylabel('Ratio')
plt.xlabel('Year')
plt.title('Ratio change',x=0.06,y=0.8)
plt.xticks(np.arange(min(ind), max(ind)+1, 6.0), labels=xlabels) #the xticks were cumbersome so not included in this example code
plt.legend()

Which gives me the following plot:

(image: the stacked bar plot produced by the code above)

As is evident, 1) NODP is not showing up at all, and 2) the remainder of them are being plotted with the wrong proportions...

I really don't understand what is wrong, it should be really simple, right?! I'm sorry if it is really simple, it's probably right under my nose. Any ideas greatly appreciated!

JohanC :

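If you want to create stacked bars this way (standard matplotlib, without using pandas or seaborn for plotting), the bottom of each bar needs to be the sum of all the bars below it. In the question's code each bottom is only the single series directly underneath, so the bars overlap and hide each other.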

Here is an example with the given data.

from matplotlib import pyplot as plt
import numpy as np
import pandas as pd

columns = ['DATE', 'NODP', 'NVP', 'VP', 'VDP']
data = [['03/08/2002', 0.083623, 0.10400659, 0.81235517, 1.52458E-05],
        ['14/09/2003', 0.24669167, 0.24806379, 0.5052293, 1.52458E-05],
        ['26/07/2005', 0.15553726, 0.13324796, 0.7111538, 0.000060983],
        ['20/05/2006', 0, 0.23, 0.315, 0.455],
        ['05/06/2007', 0.21280034, 0.29139224, 0.49579217, 1.52458E-05],
        ['21/02/2010', 0, 0.55502195, 0.4449628, 1.52458E-05],
        ['09/04/2011', 0.09531311, 0.17514162, 0.72954527, 0],
        ['14/02/2012', 0.19213217, 0.12866237, 0.67920546, 0],
        ['17/01/2014', 0.12438848, 0.10297326, 0.77263826, 0],
        ['24/02/2017', 0.01541347, 0.09897548, 0.88561105, 0]]
df = pd.DataFrame(data=data, columns=columns)
ind = pd.to_datetime(df.DATE)
NODP = df.NODP.to_numpy()
NVP = df.NVP.to_numpy()
VP = df.VP.to_numpy()
VDP = df.VDP.to_numpy()

width = 120
p1 = plt.bar(ind, NODP, width, label='NODP', bottom=NVP+VDP+VP, color='grey')
p2 = plt.bar(ind, NVP, width, label='NVP', bottom=VDP+VP, color='tan')
p3 = plt.bar(ind, VDP, width, label='VDP', bottom=VP, color='darkorange')
p4 = plt.bar(ind, VP, width, label='VP', color='darkgreen')
plt.ylabel('Ratio')
plt.xlabel('Year')
plt.title('Ratio change')
plt.yticks(np.arange(0, 1.001, 0.1))
plt.legend()
plt.show()

resulting plot
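The hand-written bottoms (VP, VDP+VP, NVP+VDP+VP) are just a running sum over the layers, so they can also be built in a loop. What follows is only a sketch of that idea, not part of the code above: the helper names stacked_bottoms and plot_stacked are made up for illustration, and only the first three rows of the data are used to keep it short.

```python
import numpy as np

# Layers in stacking order, bottom first (first three rows of the question's data).
layers = {
    'VP':   np.array([0.81235517, 0.5052293, 0.7111538]),
    'VDP':  np.array([1.52458e-05, 1.52458e-05, 0.000060983]),
    'NVP':  np.array([0.10400659, 0.24806379, 0.13324796]),
    'NODP': np.array([0.083623, 0.24669167, 0.15553726]),
}

def stacked_bottoms(layers):
    """Hypothetical helper: map each label to the sum of all layers below it."""
    bottoms = {}
    running = np.zeros(len(next(iter(layers.values()))))
    for label, values in layers.items():
        bottoms[label] = running.copy()   # everything stacked so far
        running = running + values        # add this layer for the next one
    return bottoms

bottoms = stacked_bottoms(layers)

def plot_stacked(layers, bottoms, x, width=0.8):
    """Hypothetical helper: one plt.bar call per layer, using the bottoms above."""
    from matplotlib import pyplot as plt  # imported here so the sums above need only numpy
    for label, values in layers.items():
        plt.bar(x, values, width, bottom=bottoms[label], label=label)
    plt.legend()
    plt.show()
```

Calling plot_stacked(layers, bottoms, range(3)) should draw the same kind of stack as the four explicit plt.bar calls; the loop mainly pays off when there are more layers or when the stacking order changes.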

Note that because ind holds datetimes, the x-axis is measured in days (so width = 120 means 120 days) and each bar is located at its date. This helps to show the relative spacing between the dates, in case this is important. If it isn't important, the x-positions could be chosen equidistant and labeled via the dates column.

To do so with standard matplotlib, the following lines would change:

ind = range(len(df))
width = 0.8
plt.xticks(ind, df.DATE, rotation=20)
plt.tight_layout() # needed to show the full labels of the x-axis

example plot
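For comparison, if plotting through pandas is acceptable, DataFrame.plot.bar with stacked=True computes the bottoms itself. This is a minimal sketch under that assumption, again using only the first three rows; the column order and colors are chosen to mimic the matplotlib version and are not taken from the question.

```python
import pandas as pd
from matplotlib import pyplot as plt

# First three rows of the question's data, for brevity.
df = pd.DataFrame({
    'DATE': ['03/08/2002', '14/09/2003', '26/07/2005'],
    'NODP': [0.083623, 0.24669167, 0.15553726],
    'NVP':  [0.10400659, 0.24806379, 0.13324796],
    'VP':   [0.81235517, 0.5052293, 0.7111538],
    'VDP':  [1.52458e-05, 1.52458e-05, 0.000060983],
})

# Column order is the stacking order, bottom first.
order = ['VP', 'VDP', 'NVP', 'NODP']
colors = ['darkgreen', 'darkorange', 'tan', 'grey']

# pandas places the bars at equidistant positions and labels them with the index.
ax = df.set_index('DATE')[order].plot.bar(stacked=True, color=colors, rot=20)
ax.set_ylabel('Ratio')
ax.set_xlabel('Year')
ax.set_title('Ratio change')
plt.tight_layout()  # keeps the rotated date labels inside the figure
```

In a script, add plt.show() at the end; in a Jupyter notebook the figure is normally displayed automatically.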
