Plotly: The Strongest Python Visualization Library, Bar None

I used to rely on matplotlib, which meant spending countless late nights searching StackOverflow for how to "format the date" or "add a second Y-axis". Drawing a single chart with matplotlib simply took too much time.

Today I'd like to share a visualization tool with you: Plotly, a powerful open source Python plotting library. This article shows how to draw better charts with very little code, sometimes just one line. If you like this article, remember to bookmark, like, and follow.

Note: The full code and data are linked at the end of the article.

Plotly overview

The plotly Python package is an open source library built on plotly.js, which in turn is built on d3.js. What we actually use here is cufflinks, a library that wraps plotly and makes it easier to use plotly together with Pandas dataframes.

All visualizations in this article were made in Jupyter Notebook with the plotly + cufflinks libraries in offline mode. After installing them with pip install cufflinks plotly, you can import them in Jupyter with code like the following:

picture

Univariate distributions: histogram and boxplot

Univariate charts are the standard way to start a data analysis, and the histogram is one of the must-have charts for examining the distribution of a single variable (although it has some shortcomings).

Taking the total number of likes on a blog post as an example (the original data is on Github: https://github.com/WillKoehrsen/Data-Analysis/tree/master/medium ), we can make a simple interactive histogram:

picture

(df in the code is a standard Pandas dataframe object)
picture

(interactive histogram created with plotly+cufflinks)

If you are used to matplotlib, you only need to type one more letter (change .plot to .iplot) to get a better-looking, interactive chart! You can click or hover on elements of the chart to see detailed values, zoom in and out, and (as we'll see later) highlight and filter parts of the data.

If you want to draw a stacked column chart, just do this:

picture

picture

With a little pandas processing, we can also generate a bar chart:

picture

picture

As shown above, we can combine the power of plotly + cufflinks with pandas. For example, we can first reshape the data with .pivot() and then plot the result.

Here we count the number of new fans each article brought in, broken down by publishing channel:

picture

picture

The benefit of interactive charts is that we can explore the data and break down sub-items for analysis at will. Box plots can provide a lot of information, but if you can't see the specific values, you're probably missing a lot of it!

Scatter plot

Scatter plots are at the heart of most analyses and allow us to see how a variable has changed over time, or how the relationship between two (or more) variables has changed.

Time series analysis

In the real world, a considerable part of the data has a time element. Fortunately, plotly + cufflinks comes with features to support time series visualization analysis.

Taking the article data I published on the "Towards Data Science" website as an example, let's build a dataset indexed by publication time to see how the popularity of the article changes:

picture

picture

In the image above, we accomplish several things with one line of code:

  • Automatically generate beautiful time series x-axis

  • Add a second Y-axis because the ranges of the two variables do not match

  • Put the title of the article in the label displayed on hover

To display more data, we can conveniently add text annotations:

picture

picture

(scatterplot with text annotations)

In the code below, we color a two-variable scatterplot by a third, categorical variable:

picture

picture

Next, let's try something a little more complicated: logarithmic axes. We do this by specifying plotly's layout parameter (for the available layout options, see the official documentation: https://plot.ly/python/reference/ ). At the same time, we bind the point size (the size parameter) to a numeric variable, read_ratio (the read ratio): the larger the value, the larger the bubble.

picture

picture

With a little more work (see the Github source code for details), we can even cram 4 variables into a single chart! (That said, this is not recommended.)

picture

As before, we can combine pandas with plotly+cufflinks to achieve many useful graphs:

picture

picture

I recommend checking the official documentation or the source notebook, which contain more examples and features. With just one or two lines of code, you can add text annotations, reference lines, best-fit lines, and other useful elements to your charts while keeping all of the interactivity.

Advanced Drawing Features

Next, we'll look at a few special charts in more detail. You may not use them very often, but used well, they are sure to impress. We'll use plotly's figure_factory module to generate impressive charts with just one line of code!

Scatter Plot Matrix

Scatterplot matrices (also known as SPLOMs) are a great choice if we want to explore relationships between many different variables:

picture

picture

Even such complex graphs are fully interactive, allowing us to explore the data in greater detail.

Correlation Heatmap

To illustrate the relationship between multiple numerical variables, we can calculate their correlation and visualize it in the form of an annotated heatmap:

picture

picture

Custom themes

In addition to the endless variety of charts, Cufflinks also provides many different coloring themes, so that you can easily switch between different chart styles. The following two figures are the "space" theme and the "ggplot" theme:

picture

picture

In addition, there are 3D diagrams (surfaces and bubbles):

picture

picture

And if you are so inclined, a pie chart is not difficult to make either:

picture


Edit in Plotly Chart Studio

After you generate these charts in Jupyter Notebook, you will notice a small link in the lower right corner of each chart that says "Export to plot.ly". Clicking this link takes you to Chart Studio (https://plot.ly/create/).

Here, you can further revise and polish your chart before the final presentation. You can add annotations, choose the colors of individual elements, tidy everything up, and produce a great-looking chart. Afterwards, you can also publish it on the web and generate a link for others to view.

The following two charts were made in Chart Studio:

picture

picture

That was a lot to take in, and we still have not exhausted everything these libraries can do. Space is limited, so for more charts and examples, please see the official plotly and cufflinks documentation.

picture

(Plotly interactive map showing wind farm data in the United States. Source: plot.ly)

Conclusion

The worst thing about the sunk cost fallacy is that you often only realize how much time you have wasted after you finally give up on your previous approach.

When choosing a drawing library, the features you need most are:

  • One-line charts for quickly exploring data

  • Interactive elements for slicing and studying the data

  • The option to drill down into details when needed

  • Easy customization before the final presentation

As of right now, the best choice for doing all of the above in Python is plotly. It lets us generate visualizations quickly, and its interactivity helps us understand the data better.

I'll admit that plotting is definitely the most enjoyable part of working in data science, and plotly makes these tasks much more enjoyable.

picture

2022 is the time to upgrade your Python plotting library and become faster, stronger, and better at data science and visualization!

Github source code address: https://github.com/WillKoehrsen/Data-Analysis/blob/master/plotly/Plotly%20Whirlwind%20Introduction.ipynb

Origin blog.csdn.net/weixin_38037405/article/details/124323127