Python data analysis and visualization (20): Building visuals in Power BI and analyzing data on A-share listed companies

Visualization

Charts present data visually. This section shows how to build them in Power BI.

Column chart

A column chart uses vertical columns to show the size of each category; it is the vertical counterpart of a bar chart. A stacked column chart stacks the different series on one column. A clustered column chart gives each series its own column, which makes it easy to compare values across series. A percentage (100%) stacked column chart is like the stacked column chart, except that the y-axis shows each series as a percentage of its column (relative size), so totals cannot be compared. You can customize a chart in the Format pane to improve its appearance. A good column chart has a clearly labeled y-axis, a legend, units, a title that matches the chart, and a sensible color scheme.
Take the percentage stacked column chart as an example: as in the stacked column chart, the different series are shown on a single column.
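The normalization behind a percentage stacked column chart can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not how Power BI implements it; the `sales` figures and the `to_percent_stacked` helper are made up for the example.

```python
# Hypothetical values: category -> series -> value.
sales = {
    "North": {"A": 30, "B": 70},
    "South": {"A": 150, "B": 50},
}


def to_percent_stacked(data):
    """Rescale each category so that its series values sum to 100 percent."""
    result = {}
    for category, series in data.items():
        total = sum(series.values())
        result[category] = {
            name: (100.0 * value / total if total else 0.0)
            for name, value in series.items()
        }
    return result


percent = to_percent_stacked(sales)
for category, shares in percent.items():
    print(category, shares)
```

North totals 100 and South totals 200, yet both columns end up the same height. This is why the percentage stacked chart shows relative size but cannot be used to compare totals.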
Use the Format pane to polish a chart. Taking the clustered column chart as an example, you can change display settings such as the X and Y axes, title, column colors, and background. A good column chart needs clear X and Y axes, a legend and descriptions, data units, a title that matches the chart, and so on.

Treemap

Also known as a rectangular tree diagram. Think of the whole data set as a tree in which each data item is a branch or leaf; each branch or leaf is drawn as a rectangle, and all the rectangles are packed inside one large rectangle. Applicable scenarios: you need to display a large amount of hierarchical data; a bar chart cannot handle the large number of values effectively; you need to show the ratio of each part to the whole, or how an indicator is distributed across the categories at each level of the hierarchy; or you want to use color and size to show attributes, outliers, and so on. A treemap has almost no blank area, each cluster of rectangles expresses a relationship, and space utilization is relatively high.
In the Power Query editor you can choose Transform - Run Python script and process the data with Python code. This changes the query result: expand the Value column and uncheck the option to use the original column name as a prefix, and the data is displayed. In the report view, select the treemap visual to show each brand's share of the count: put the brand field in both Group and Values to get a treemap of brand shares.
In a treemap, the size, position, and color of each rectangle show the weight and proportion of each item, so the whole data set can be taken in at a glance. The rectangles are arranged in descending order of area, from left to right and from top to bottom. At this point the treemap shows a single-level structure. Put the capacity field in Details to see the hard disk capacities within each brand; the result is a two-level treemap.
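The single-level and two-level groupings described above can be sketched in plain Python. The brand names and capacities are invented sample data, and this shows only the aggregation a treemap is based on, not its layout algorithm.

```python
from collections import Counter, defaultdict

# Hypothetical hard disk records: (brand, capacity).
disks = [
    ("BrandA", "1TB"), ("BrandA", "2TB"), ("BrandA", "1TB"),
    ("BrandB", "1TB"), ("BrandB", "4TB"),
    ("BrandC", "2TB"),
]

# Single level: one rectangle per brand, area proportional to the row count.
brand_counts = Counter(brand for brand, _ in disks)
total = sum(brand_counts.values())
single_level = [
    (brand, count, count / total)
    for brand, count in brand_counts.most_common()  # largest first
]

# Two levels: each brand rectangle is subdivided by capacity (the Details field).
two_level = defaultdict(Counter)
for brand, capacity in disks:
    two_level[brand][capacity] += 1

for brand, count, share in single_level:
    print(f"{brand}: {count} rows, {share:.0%} of total, by capacity {dict(two_level[brand])}")
```

Sorting by count in descending order mirrors the left-to-right, top-to-bottom ordering of the rectangles by area.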

Map visualization

Power BI uses Microsoft's built-in Bing Maps, which makes it easy to generate several kinds of map visuals. There are three map visuals by default; the bubble map is used most often, and the colored map needs to be displayed in the online version.
Bubble size can represent the GDP of each region, and bubble color and other properties can be changed in the Format pane. Put country, province, and city fields in Location to show data at different levels, and use the drill-up and drill-down buttons at the top of the chart to move between levels. Place names in the data should be written in full, such as Beijing. You can also put the latitude and longitude columns into the Latitude and Longitude fields to make sure that every data point is shown on the map.

Common operations

1. View data

Click the three dots in the upper right corner of a chart to show its data as a table or to export the data.

2. Chart Drilling

When the chart data has a hierarchy, the next level can be shown directly in the chart, provided the hierarchy is detailed enough; date data, for example, can go from year to quarter, month, day, and hour. Turn on the drill-down button at the top of the chart and then click a data point to expand it; click "Drill up" to return to the summarized level. Click "Go to the next level in the hierarchy" (two down arrows) to step through the levels one at a time; click "Expand all lower levels in the hierarchy" to expand all levels of data. Drilling works similarly in other charts, for example from country to province to city.
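Conceptually, drilling re-aggregates the same rows at a finer level of the hierarchy, optionally restricted to the data point that was clicked. The sketch below illustrates this for a year, quarter, month hierarchy; the records and the `aggregate` helper are hypothetical and only illustrate the idea.

```python
from collections import Counter
from datetime import date

# Hypothetical (date, amount) records.
records = [
    (date(2022, 1, 15), 10),
    (date(2022, 2, 3), 20),
    (date(2022, 5, 9), 5),
    (date(2023, 1, 20), 7),
]


def level_key(d, level):
    """Return the hierarchy key of a date at the given level."""
    quarter = (d.month - 1) // 3 + 1
    keys = {
        "year": (d.year,),
        "quarter": (d.year, quarter),
        "month": (d.year, quarter, d.month),
    }
    return keys[level]


def aggregate(rows, level, parent=()):
    """Sum amounts at `level`, keeping only rows that fall under `parent`."""
    totals = Counter()
    for d, amount in rows:
        key = level_key(d, level)
        if key[:len(parent)] == parent:
            totals[key] += amount
    return dict(totals)


by_year = aggregate(records, "year")                           # top level
quarters_2022 = aggregate(records, "quarter", parent=(2022,))  # drill down into 2022
all_quarters = aggregate(records, "quarter")                   # next level for all years
print(by_year, quarters_2022, all_quarters)
```

Passing a `parent` corresponds to clicking one data point with drill-down turned on; omitting it corresponds to moving every data point to the next level at once.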

3. Edit interaction

Power BI charts are dynamic: interactive functions on the page such as filtering, drilling, and highlighting let you quickly access, discover, and explore the patterns behind the data. When the data selected by a filter changes, all related visuals in Power BI change with it. Each visual can also act as a filter for the other charts, which respond dynamically and show the data from different angles. If filtering is set up improperly, the resulting visuals can look very different from what was intended.
Select a chart and click Format - Edit interactions; "Filter" and "None" buttons appear above each of the other charts. If you want a selection in one chart to leave another chart unchanged, choose "None" for that chart; if you want the other chart to respond, choose "Filter" above it. Donut charts and column charts also have a "Highlight" button. When another visual is filtered, these two charts change with the filter; click "Highlight" and the chart returns to its previous state, keeping all of its original data points and highlighting only the filtered part, which helps with data exploration.
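The difference between the three interaction modes can be illustrated with a small Python sketch. It models only what a target chart would display under each mode; the rows, the `respond` function, and its `mode` argument are made up for the example and are not a Power BI API.

```python
# Hypothetical rows shared by two charts on the same report page.
rows = [
    {"city": "Beijing", "industry": "Finance"},
    {"city": "Beijing", "industry": "Tech"},
    {"city": "Shanghai", "industry": "Finance"},
    {"city": "Shenzhen", "industry": "Tech"},
]


def count_by(data, field):
    """Count rows per distinct value of `field`."""
    counts = {}
    for row in data:
        counts[row[field]] = counts.get(row[field], 0) + 1
    return counts


def respond(data, field, selection, mode):
    """Return what a chart grouped by `field` shows after a selection in another chart."""
    selected = [r for r in data if all(r[k] == v for k, v in selection.items())]
    if mode == "none":        # target chart ignores the selection
        return count_by(data, field)
    if mode == "filter":      # target chart keeps only the selected rows
        return count_by(selected, field)
    if mode == "highlight":   # target chart keeps full values and marks the selected part
        part = count_by(selected, field)
        return {key: (part.get(key, 0), full) for key, full in count_by(data, field).items()}
    raise ValueError(f"unknown mode: {mode}")


selection = {"city": "Beijing"}
for mode in ("none", "filter", "highlight"):
    print(mode, respond(rows, "industry", selection, mode))
```

In the highlight result each entry is a pair (highlighted part, full value), which corresponds to the chart keeping its original data points while emphasizing the filtered portion.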

Hands-on: data analysis of A-share listed companies

1. Data preparation

Data is the prerequisite for all of the analysis that follows.
Take China Business Intelligence Network as an example.
To get the data in Power BI, choose Get data from Web and, on the Advanced tab, enter the URL of the web page in the URL parts. The data spans pages 1 to 20, so enter the base URL and the trailing page number as separate parts, then click OK to get the data for page 1. Check Table 7 in the Navigator and select Transform Data to open the Power Query editor, where the page 1 data is shown.
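Splitting the address into a fixed part and a page number is what later allows the page number to become a parameter. A minimal Python sketch of the same idea follows; `BASE_URL` is a hypothetical placeholder, not the real address of the site, and the code does not access the network.

```python
# Hypothetical placeholder; substitute the real list URL up to the page number.
BASE_URL = "https://example.com/stock/list/"

FIRST_PAGE, LAST_PAGE = 1, 20  # the source data spans pages 1 to 20


def page_url(page):
    """Join the fixed URL part and the page number, as the two URL parts do."""
    if not FIRST_PAGE <= page <= LAST_PAGE:
        raise ValueError(f"page must be between {FIRST_PAGE} and {LAST_PAGE}")
    return BASE_URL + str(page)


urls = [page_url(p) for p in range(FIRST_PAGE, LAST_PAGE + 1)]
print(urls[0])
print(urls[-1])
```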

2. Data cleaning

Cleaning out "dirty data" also affects the later analysis.
Right-click and delete unneeded items in the fetched data, such as "Prospectus" and "Company Financial Report".
You can define a custom function that takes the page number as a parameter in order to download or import data in batches.
Click Home - Query - Advanced Editor, or right-click Table 7 and select Advanced Editor. Add one line above the displayed code, (p as number) as table =>, then replace the string "1" in the first line inside let with Number.ToText(p), and click Done.
The query now shows a parameter input screen. Rename the query to Data_zs. Here p is the function's parameter: enter a value for p (a page number) and the data for that page is queried. To fetch data in batches, the function has to be called in batches.
Click Home - New Query - New Source - Blank Query and enter = {1..10} to create a list, then click Transform - To Table to convert it to a table. Click Add Column - General - Invoke Custom Function, enter the new column name "page number" in the dialog box, select the "Data_zs" function created earlier as the function query, and click OK to start fetching the web data for pages 1-10.
Click the double-headed arrow at the right of the page number column header, uncheck the option to use the original column name as a prefix, and expand the 10 fetched pages (200 rows) of data.
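The batch steps above (a list of page numbers, one function call per page, then expanding the nested tables) follow a common pattern that can be sketched in Python. `fetch_page` is a stand-in for the Data_zs function and returns fabricated placeholder rows instead of downloading anything; the 20 rows per page match the 200 rows over 10 pages mentioned in the text.

```python
ROWS_PER_PAGE = 20  # 10 pages give 200 rows, as in the text


def fetch_page(p):
    """Stand-in for the Data_zs custom function: returns placeholder rows for page p."""
    start = (p - 1) * ROWS_PER_PAGE
    return [
        {"rank": start + i, "company": f"Company {start + i}"}
        for i in range(1, ROWS_PER_PAGE + 1)
    ]


# Equivalent of the {1..10} list converted to a table.
pages = list(range(1, 11))

# Equivalent of Invoke Custom Function: one nested table per page number.
table = [{"page": p, "data": fetch_page(p)} for p in pages]

# Equivalent of expanding the nested column without a column-name prefix.
expanded = [row for entry in table for row in entry["data"]]

print(len(expanded))
print(expanded[0], expanded[-1])
```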

3. Data Modeling

Working with multiple tables depends on the logical relationships between them; the process of establishing those relationships is called data modeling.
At this point, there is only one table, and there is no need to establish a logical relationship between tables, so this step can be omitted.

4. Metric calculation

Familiarity with commonly used business metrics is required.
You can change the data format of some columns as needed, for example showing only the year of the listing date (right-click - Transform - Year). Click Home - Close and Apply to leave the Power Query editor and load the query data. Metrics such as sales, monthly growth, and active users can be added to the analysis by creating measures. For example, to count the listed companies in Beijing, Shanghai, and Shenzhen, right-click the table name in the Fields pane and create a new measure: number of listed companies in Beijing, Shanghai, and Shenzhen = CALCULATE(COUNTROWS('query 1'), 'query 1'[city] IN {"Shanghai", "Shenzhen", "Beijing"}). The text values must be written in double quotation marks. The measure then appears under the query 1 table.
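For readers more used to Python than DAX, the logic of this measure (count the rows whose city is in a given set) can be sketched as follows. The company rows are invented sample data, and the share computed at the end is an extra illustration of what a proportion would look like, not part of the original measure.

```python
from datetime import date

# Hypothetical rows standing in for the 'query 1' table.
companies = [
    {"name": "Company A", "city": "Beijing", "listed": date(1999, 5, 1)},
    {"name": "Company B", "city": "Shanghai", "listed": date(2001, 3, 12)},
    {"name": "Company C", "city": "Hangzhou", "listed": date(2010, 7, 30)},
    {"name": "Company D", "city": "Shenzhen", "listed": date(1996, 11, 8)},
    {"name": "Company E", "city": "Chengdu", "listed": date(2015, 1, 20)},
]

TARGET_CITIES = {"Shanghai", "Shenzhen", "Beijing"}

# Same filter as CALCULATE(COUNTROWS(...), [city] IN {...}).
count_in_targets = sum(1 for c in companies if c["city"] in TARGET_CITIES)

# Share of all rows, assuming a proportion is what should be displayed.
share = count_in_targets / len(companies)

# Equivalent of Transform - Year on the listing date column.
listing_years = [c["listed"].year for c in companies]

print(count_in_targets, share, listing_years)
```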

5. Visual display

Visualization makes dry data intuitive and easier to understand, and delivers key information faster.
Add a gauge to show the proportion of listed companies located in Beijing, Shanghai, and Shenzhen: put the newly created measure into "Value" and adjust the basic chart settings under Format visual.
Add a donut chart showing the industries in which the listed companies operate: put the industry classification field into Legend and Values.
Add a pie chart showing the ranking of cities by number of listed companies: put the city in Legend and the company name in Values. You can create measures to filter cities, or select the cities to display with the city filter in the Filters pane on the right.
Add a clustered column chart to show the number of companies listed each year since 1990: put the listing date on the X axis and the company name on the Y axis.
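The pie chart and the column chart above both rest on simple counts: companies per city and companies per listing year. A minimal Python sketch of these two aggregations follows, using invented rows; it shows the numbers behind the charts, not how Power BI draws them.

```python
from collections import Counter
from datetime import date

# Hypothetical rows: (company name, city, listing date).
listings = [
    ("Company A", "Shanghai", date(1990, 12, 19)),
    ("Company B", "Shenzhen", date(1991, 4, 3)),
    ("Company C", "Shanghai", date(1991, 8, 1)),
    ("Company D", "Beijing", date(1997, 6, 26)),
    ("Company E", "Shanghai", date(1997, 9, 10)),
    ("Company F", "Beijing", date(1997, 2, 14)),
]

# Pie chart: city in Legend, count of company names in Values, largest first.
city_rank = Counter(city for _, city, _ in listings).most_common()

# Column chart: listing year on the X axis, count of company names on the Y axis.
per_year = Counter(listed.year for _, _, listed in listings)
years = sorted(per_year)

print(city_rank)
print([(year, per_year[year]) for year in years])
```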
Add a map to show the geographical distribution of the listed companies: put the city in Location and Size. By default a world map is displayed. To use a map of China, choose to import a visual at the lower right of the Visualizations pane and import a China map visual; after the import succeeds, the China map visual appears at the bottom of the Visualizations pane. In the visual's format settings you can adjust how the data is displayed and change the colors of the map and of its regions.
You can select a theme on the View tab to change the style of the whole canvas. Choose Text box under Home - Insert, enter "A-share listed company quantity analysis", and adjust the font format, size, color, and so on.


Origin blog.csdn.net/hwwaizs/article/details/128633516