Python data analysis report sample: Python data analysis report + code

Hello everyone. This article is about writing a Python data analysis report, a common final assignment that many people want to understand. Before writing one, you need to know the following.

Background

In the previous article, I used Python to write a crawler script and successfully crawled the data of a real estate website, but the work assigned by my friend's boss is not finished yet: we still need to analyze the data and generate an analysis report. This article therefore continues from the previous one and explains how to use Python to produce a good-looking, easy-to-use data analysis report.

Choice of Python library

As the saying goes, to do a good job you must first sharpen your tools. We have already chosen Python for the remaining work, but we still need to consider which Python libraries will help us finish it faster and better.

This task consists of four main types of work:

  1. Reading the CSV files;

  2. Processing the data that was read and calculating the indicators we want to analyze;

  3. Generating visual charts from the analysis results;

  4. Displaying the analysis report as a web page.

Let's look at which Python libraries can help with each of these four types of work.

1. Data processing and analysis library

Reading and processing files in formats such as CSV and Excel is essentially the processing of one-dimensional and two-dimensional data. The library commonly used for this in Python is Pandas. Of the data structures it provides, Series corresponds to one-dimensional data and DataFrame to two-dimensional data. Pandas also provides a large number of efficient built-in functions and operations for processing such data in memory.
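As a minimal sketch of steps 1 and 2, the snippet below reads a small CSV into a DataFrame, pulls out one column as a Series, and computes a simple indicator. The column names (`district`, `area_m2`, `total_price`) and the values are hypothetical stand-ins, not the crawled real estate data.

```python
import io

import pandas as pd

# Hypothetical sample data; a real report would read the crawled CSV file instead.
csv_text = """district,area_m2,total_price
East,89,210
East,120,305
West,64,150
"""

# read_csv accepts a file path or any file-like object.
listings = pd.read_csv(io.StringIO(csv_text))

# A single column of a DataFrame is a one-dimensional Series.
area = listings["area_m2"]

# Derive a new column, then aggregate one indicator per district.
listings["unit_price"] = listings["total_price"] / listings["area_m2"]
avg_price = listings.groupby("district")["total_price"].mean()

print(type(area).__name__, listings.shape)
print(avg_price)
```

With a real file, `pd.read_csv("listings.csv")` (a placeholder file name) would replace the in-memory string.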

For calculations on higher-dimensional data such as matrices, Python relies on the NumPy library. NumPy is an array-based numerical computing module that provides high-performance matrix operations. Its core structure is the ndarray, a container for multi-dimensional arrays. It supports element-wise calculations on arrays and provides mathematical functions that operate on whole arrays directly.
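A short sketch of what element-wise calculation on an ndarray looks like. The two arrays hold made-up prices and areas, chosen only for illustration.

```python
import numpy as np

# Hypothetical 2x2 arrays of total prices and floor areas.
prices = np.array([[210.0, 305.0], [150.0, 98.0]])
areas = np.array([[89.0, 120.0], [64.0, 49.0]])

# Arithmetic operators work element by element, with no explicit loop.
unit_prices = prices / areas

# Mathematical functions are applied to every element of the array.
log_prices = np.log(prices)

print(prices.ndim, prices.shape)
print(unit_prices)
```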

Pandas is built on top of NumPy arrays. The biggest difference between the two is that Pandas is designed for tabular and mixed-type data, which suits the table structures used in statistical analysis, while NumPy is better suited to homogeneous numerical array data.
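The difference can be seen in a small sketch: a DataFrame keeps a separate type per column, while a NumPy array holds a single type for all elements. The table contents are hypothetical.

```python
import numpy as np
import pandas as pd

# A table with mixed column types: text, floats, and integers.
listings = pd.DataFrame(
    {
        "district": ["East", "West", "North"],
        "area_m2": [89.0, 120.5, 64.0],
        "rooms": [2, 3, 1],
    }
)

# Each column keeps its own dtype.
print(listings.dtypes)

# A numeric column converts to a homogeneous NumPy array.
areas = listings["area_m2"].to_numpy()

# Putting text and numbers into one NumPy array coerces both to strings.
mixed = np.array(["East", 89.0])

print(areas.dtype, mixed.dtype)
```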

Therefore, we can basically complete steps 1 and 2 with the Pandas library alone. However, in this data analysis report I also used the histogram function of the NumPy library, which will be discussed in detail later.
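For orientation, here is a minimal sketch of NumPy's histogram calculation using `np.histogram`. The unit prices, the number of bins, and the range are assumptions chosen for illustration, and this is not necessarily how the report itself uses the function.

```python
import numpy as np

# Hypothetical unit prices to be grouped into price bands.
unit_prices = np.array([1.2, 1.8, 2.1, 2.4, 2.9, 3.3, 3.7, 4.5])

# Split the range 1.0 to 5.0 into 4 equal-width bins and count values per bin.
counts, edges = np.histogram(unit_prices, bins=4, range=(1.0, 5.0))

print(counts)  # number of values in each bin
print(edges)   # the 5 bin edges that delimit the 4 bins
```

The returned counts and edges are plain arrays, so they can be passed to any charting library as the y and x values of a bar chart.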

2. Data visualization library

The work in step 3 is a data visualization task. Three commonly used Python libraries for data visualization are:

  • Matplotlib

  • Seaborn

  • Pyecharts

Matplotlib

Matplotlib can be called the originator of Python data visualization libraries. It is a plotting library for the Python programming language and its numerical computing package NumPy. pyplot is a Matplotlib module that provides a MATLAB-like interface. Matplotlib integrates seamlessly with NumPy and Pandas, but the default style of some charts is not attractive enough, and dynamic interactive charts are not natively supported. Interactivity can be achieved by changing the backend, but that is relatively troublesome, and there is currently no particularly good way to put a dynamic, interactive chart into a web page. Recently, Matplotlib has made progress toward web interaction, such as the new HTML5/Canvas backend, which you can learn about at the following address:

http://code.google.com/p/mplh5canvas/

But it's not quite done yet.
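To make the limitation concrete, here is a minimal sketch of the typical Matplotlib workflow, which produces a static image. The district names and prices are made up, and the non-interactive `Agg` backend is selected so the code also runs without a display.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend: render to an image only
import matplotlib.pyplot as plt

# Hypothetical indicator values.
districts = ["East", "West", "North"]
avg_prices = [257.5, 150.0, 198.0]

fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(districts, avg_prices)
ax.set_title("Average listing price by district")
ax.set_ylabel("Average price")

# Render the figure to PNG bytes; the result is a fixed picture.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
png_bytes = buffer.getvalue()

print(len(png_bytes), "bytes of PNG data")
```

The output can be embedded in a page as an image, but it does not respond to clicks, dragging, or zooming.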

Seaborn

The biggest difference between Seaborn and Matplotlib is that Seaborn's default drawing style and color schemes have a more modern look. It is a higher-level API built on top of Matplotlib, so you can call Matplotlib functionality with less code, which makes plotting easier. However, it shares Matplotlib's limitations with dynamic interactivity.

Pyecharts

Speaking of Pyecharts, we have to mention ECharts, a very well-known library in front-end data visualization. It was created by front-end engineers at my old employer, Baidu, and was first incubated there. During my time at Baidu, I also collaborated on other projects with core engineers who later took part in developing ECharts. It was donated to the Apache Software Foundation in 2018 as an incubator project, and it officially graduated in 2021 to become a top-level Apache project.

Pyecharts is a Python library built on ECharts that supports a rich set of chart types. Its biggest advantage over the first two libraries is that it can easily generate interactive charts (supporting mouse clicks, dragging, zooming, and so on) that can be displayed dynamically on a web page.

Based on the comparison above, and since this time I want to generate a dynamic, interactive web page for my friend's data analysis report, Pyecharts clearly has the advantage. We will therefore use the Pyecharts library for our data visualization.

3. Web application library

There are two main options for Python in this field:

  • Django

  • Flask

Django is a free, open source web framework written in Python. It provides many modules commonly needed in website back-end development and ships with a great many features, maintained jointly by the core team and the community, so it is large and comprehensive. Because it is a heavier framework, its components are more tightly coupled than Flask's, and it is harder to customize.

In contrast, Flask is a free, open source, lightweight web framework. Flask does not bundle common web application features such as upload handling, an ORM (object-relational mapper), a database abstraction layer, authentication, or form validation (all of which Django provides), but existing external libraries can be used to add them, which makes it a more flexible and extensible framework.

In our scenario, we only need to serve a static web page that displays the data visualization results, with no other complex web application features involved. Therefore, Flask is our best choice.
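To show how little code such a page needs, here is a minimal Flask sketch that serves one HTML page. The route, the function name `report`, and the page content are assumptions for illustration; a real report page would contain the generated charts.

```python
from flask import Flask

app = Flask(__name__)

# Placeholder page body; the real report would embed the chart HTML here.
REPORT_HTML = """<!doctype html>
<html>
  <head><title>Data analysis report</title></head>
  <body><h1>Data analysis report</h1></body>
</html>
"""


@app.route("/")
def report() -> str:
    """Return the report page for requests to the site root."""
    return REPORT_HTML


# To serve the page locally during development, call app.run(debug=True).
```

Flask's built-in test client can request the page without starting a server, for example `app.test_client().get("/")`.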



Origin blog.csdn.net/i_like_cpp/article/details/132163737