This tutorial shows how to build bullet charts and waterfall charts with the Bokeh library.

First, let's import what we need and tell Bokeh to render its output in our notebook:

from bokeh.io import show, output_notebook
from bokeh.palettes import PuBu4
from bokeh.plotting import figure
from bokeh.models import Label

output_notebook()

Bullet chart

In this example, we will populate the data with Python lists. The code could be adapted to a pandas DataFrame, but we'll stick with simple Python data types here:

data = [("John Smith", 105, 120),
        ("Jane Jones", 99, 110),
        ("Fred Flintstone", 109, 125),
        ("Barney Rubble", 135, 123),
        ("Mr T", 45, 105)]

limits = [0, 20, 60, 100, 160]
labels = ["Poor", "OK", "Good", "Excellent"]
cats = [x[0] for x in data]

The one tricky part of the code is building the list of categories, stored in the cats variable, that will appear on the y-axis.
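
Since limits holds the band edges and labels names the bands between them, limits should be exactly one element longer than labels. As a quick sanity check that is not part of the original walkthrough, the sketch below uses the standard library's bisect module to report which band each rep's performance falls into. The helper name band_for is hypothetical, and clamping out-of-range values to the first or last band is an assumption made for illustration.

```python
from bisect import bisect_right

data = [("John Smith", 105, 120),
        ("Jane Jones", 99, 110),
        ("Fred Flintstone", 109, 125),
        ("Barney Rubble", 135, 123),
        ("Mr T", 45, 105)]
limits = [0, 20, 60, 100, 160]
labels = ["Poor", "OK", "Good", "Excellent"]

# One label per band, so there is one more edge than there are labels.
assert len(limits) == len(labels) + 1


def band_for(value):
    """Return the label of the band containing value (hypothetical helper)."""
    # bisect_right counts the edges <= value; subtracting 1 gives the band index.
    idx = bisect_right(limits, value) - 1
    # Clamp so values outside the edges still map to the first or last band.
    idx = max(0, min(idx, len(labels) - 1))
    return labels[idx]


bands = {name: band_for(perf) for name, perf, _target in data}
print(bands)
```

A check like this catches a mismatch between limits and labels before it shows up as a missing or misplaced label on the chart.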

The next step is to create a Bokeh plot and set several options that control how the x-axis and grid lines are displayed. As mentioned above, we use the cats variable to define all the categories in y_range.

# Note: Bokeh 3.0 and later use height= and width= in place of plot_height= and plot_width=.
p = figure(title="Sales Rep Performance", plot_height=350, plot_width=800, y_range=cats)
p.x_range.range_padding = 0
p.grid.grid_line_color = None
p.xaxis[0].ticker.num_minor_ticks = 0

The next section creates the colored range bars using Bokeh's hbar. To make this work, we need to define the left and right extents of each bar as well as its color. We can use Python's zip function to create the data structure we need.

list(zip(limits[:-1], limits[1:], PuBu4[::-1]))

# Result:
[(0, 20, '#f1eef6'),
 (20, 60, '#bdc9e1'),
 (60, 100, '#74a9cf'),
 (100, 160, '#0570b0')]

Here's how to use these tuples to draw the colored range bars.

for left, right, color in zip(limits[:-1], limits[1:], PuBu4[::-1]):
    p.hbar(y=cats, left=left, right=right, height=0.8, color=color)

We use a similar process to add a black bar for each rep's performance measure.

perf = [x[1] for x in data]
p.hbar(y=cats, left=0, right=perf, height=0.3, color="black")

The last marker we need to add is a segment that displays the target value.

comp = [x[2] for x in data]
p.segment(x0=comp, y0=[(x, -0.5) for x in cats], x1=comp,
          y1=[(x, 0.5) for x in cats], color="white", line_width=2)

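
The tuples passed as y0 and y1 in the segment call are categorical coordinates with an offset. As I understand Bokeh's categorical axes, a (category, offset) tuple places a point relative to the center of that category's slot, so -0.5 and 0.5 should span the full height of each row. The sketch below only builds and prints those lists in plain Python, without plotting, so the structure is easy to inspect.

```python
data = [("John Smith", 105, 120),
        ("Jane Jones", 99, 110),
        ("Fred Flintstone", 109, 125),
        ("Barney Rubble", 135, 123),
        ("Mr T", 45, 105)]

cats = [row[0] for row in data]
comp = [row[2] for row in data]   # target value for each rep

# Each marker is a vertical segment at x = target, running from the bottom
# edge (offset -0.5) to the top edge (offset +0.5) of the rep's row.
y0 = [(name, -0.5) for name in cats]
y1 = [(name, 0.5) for name in cats]

for x, start, end in zip(comp, y0, y1):
    print(x, start, end)
```
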
The final step is to add labels to each range. We can use zip to create the label structure we need and then add each label to the layout.

for start, label in zip(limits[:-1], labels):
    p.add_layout(Label(x=start, y=0, text=label, text_font_size="10pt",
                       text_color='black', y_offset=5, x_offset=15))

# Render the finished bullet chart in the notebook
show(p)


Waterfall chart

First, we construct a small data frame to use as demo data.

import pandas as pd

# Create the initial dataframe
index = ['sales', 'returns', 'credit fees', 'rebates', 'late charges', 'shipping']
data = {'amount': [350000, -30000, -7500, -25000, 95000, -7000]}
df = pd.DataFrame(data=data, index=index)

# Determine the total net value by adding the start and all additional transactions
net = df['amount'].sum()

The final waterfall code will require us to define several additional properties for each segment, including:

  • starting point;
  • bar color;
  • label position;
  • label text;

By adding this to a single data frame, we can use Bokeh's built-in functionality to simplify the final code.

For the next step, we'll add the running total, segment start position, and label position.

df['running_total'] = df['amount'].cumsum()
df['y_start'] = df['running_total'] - df['amount']

# Where do we want to place the label?
df['label_pos'] = df['running_total']
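
To see what these columns contain without pandas, here is a minimal plain-Python sketch of the same arithmetic. It uses itertools.accumulate as a stand-in for cumsum(), and the variable names are chosen for illustration only.

```python
from itertools import accumulate

index = ['sales', 'returns', 'credit fees', 'rebates', 'late charges', 'shipping']
amounts = [350000, -30000, -7500, -25000, 95000, -7000]

# Running total after each transaction (what cumsum() computes).
running_total = list(accumulate(amounts))

# Each bar starts where the previous one ended: running total minus its own amount.
y_start = [total - amount for total, amount in zip(running_total, amounts)]

net = sum(amounts)

for name, amount, start, total in zip(index, amounts, y_start, running_total):
    print(f"{name:<13} {amount:>8} {start:>8} {total:>8}")
print("net", net)
```

The last running total equals the net value, which is why the net bar drawn later starts at 0 and ends at net.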

Next, we add a row to the bottom of the data frame containing the net values.

df_net = pd.DataFrame.from_records([(net, net, 0, net)],
                                   columns=['amount', 'running_total', 'y_start', 'label_pos'],
                                   index=["net"])
# DataFrame.append was removed in pandas 2.0; pd.concat does the same job
df = pd.concat([df, df_net])

For this particular waterfall, I'd like negative values drawn in a different color, with their labels placed below the bar. Let's add columns to the dataframe holding these values.

df['color'] = 'grey'
df.loc[df.amount < 0, 'color'] = 'red'
df.loc[df.amount < 0, 'label_pos'] = df.label_pos - 10000
df["bar_label"] = df["amount"].map('{:,.0f}'.format)
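
The same three rules can be sketched in plain Python to make the resulting values explicit. This is an illustration only: it covers the six transaction rows and leaves out the net row, and the list names are assumptions rather than part of the original code.

```python
amounts = [350000, -30000, -7500, -25000, 95000, -7000]
running_total = [350000, 320000, 312500, 287500, 382500, 375500]

# Grey by default, red for negative amounts.
colors = ['red' if amount < 0 else 'grey' for amount in amounts]

# Negative bars extend downward, so shift their labels 10,000 units lower.
label_pos = [total - 10000 if amount < 0 else total
             for amount, total in zip(amounts, running_total)]

# Thousands separators, no decimal places.
bar_labels = ['{:,.0f}'.format(amount) for amount in amounts]

print(colors)
print(label_pos)
print(bar_labels)
```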

This is the final dataframe with all the data we need. Getting to this state takes some data manipulation, but it's fairly standard pandas code and easy to debug if something goes wrong.

Creating the actual plot is fairly standard Bokeh code, as the data frame has all the values we need.

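from bokeh.models import ColumnDataSource, LabelSet, NumeralTickFormatter

TOOLS = "box_zoom,reset,save"
source = ColumnDataSource(df)
# Note: Bokeh 3.0 and later use width= in place of plot_width=.
p = figure(tools=TOOLS, x_range=list(df.index), y_range=(0, net+40000),
           plot_width=800, title="Sales Waterfall")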

Because the ColumnDataSource is built from our data frame, Bokeh takes care of creating all the segments and labels without any looping.

p.segment(x0='index', y0='y_start', x1="index", y1='running_total',
          source=source, color="color", line_width=55)

Next, we apply some light formatting: softer grid lines, a currency format on the y-axis, and an x-axis label.

p.grid.grid_line_alpha = 0.3
p.yaxis[0].formatter = NumeralTickFormatter(format="($ 0 a)")
p.xaxis.axis_label = "Transactions"

The final step is to use LabelSet to add all the labels to the bar chart.

labels = LabelSet(x='index', y='label_pos', text='bar_label',
                  text_font_size="8pt", level='glyph',
                  x_offset=-20, y_offset=0, source=source)
p.add_layout(labels)

# Render the finished waterfall chart in the notebook
show(p)


Origin blog.csdn.net/weixin_41261833/article/details/120314339