Using Python for data analysis

Data analysis is the process of managing, processing, organizing, and analyzing data. Here, "data" means structured data, such as records, multi-dimensional arrays, Excel spreadsheets, and tables in relational databases.

1. Why use Python for data analysis?

Many people choose Python as their data analysis language. Why? There are four reasons:

  1. It is open source and free to install;
  2. It has an excellent online community;
  3. It is easy to learn;
  4. It can serve as a common language for both data science and building web-based analytics products.

2. Purpose of data analysis

The main purpose is to extract useful information from large, complex data sets, so that the data generates value and gives people something to refer to when they make everyday decisions. For example, when buying something on Taobao, we first look at the item's sales volume, ranking, and customer reviews. All of these are produced by data analysis, which shows how important a role it plays.

3. Data acquisition

  1. Public data sets, obtained from open data sources
  2. Website data, collected with a web crawler
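
The parsing half of a crawler can be sketched with the standard library alone. This is only a minimal sketch: the HTML string, the `LinkCollector` class, and the `extract_links` helper are made up for illustration, and a real crawler would first download the page (and respect the site's terms of use) before parsing it.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href and visible text of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []      # list of (href, text) pairs found so far
        self._href = None    # href of the <a> tag currently open, if any
        self._text = []      # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Return (href, text) pairs for all links in an HTML string."""
    collector = LinkCollector()
    collector.feed(html)
    return collector.links


# Hypothetical page content; a real crawler would download this instead.
SAMPLE_HTML = """
<ul>
  <li><a href="/item/1">Blue kettle</a></li>
  <li><a href="/item/2">Red teapot</a></li>
</ul>
"""

links = extract_links(SAMPLE_HTML)
print(links)
```

The extracted pairs are plain Python tuples, so they can be written straight into a database table or a pandas DataFrame in the later steps.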

4. Data storage (SQL)

  1. Extracting data that meets specific conditions;
  2. Inserting, deleting, querying, and updating records in the database;
  3. Grouping and aggregating data, and joining multiple tables.
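
The three points above can be tried out with Python's built-in `sqlite3` module. This is a rough sketch only: the `customers` and `orders` tables and their rows are invented for the example, and an in-memory SQLite database stands in for whatever database you actually use.

```python
import sqlite3

# In-memory database with two hypothetical tables.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
cur.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
)

# Insert records.
cur.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
cur.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 1, 30.0), (2, 1, 20.0), (3, 2, 5.0), (4, 2, 100.0)],
)

# Update one record and delete another.
cur.execute("UPDATE orders SET amount = 10.0 WHERE id = 3")
cur.execute("DELETE FROM orders WHERE id = 4")

# Extract data that meets a specific condition.
big_orders = cur.execute(
    "SELECT id, amount FROM orders WHERE amount >= 20 ORDER BY id"
).fetchall()

# Join the two tables, then group and aggregate per customer.
totals = cur.execute(
    """
    SELECT c.name, COUNT(o.id), SUM(o.amount)
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.id
    GROUP BY c.name
    ORDER BY c.name
    """
).fetchall()

conn.close()

print(big_orders)  # [(1, 30.0), (2, 20.0)]
print(totals)      # [('Ann', 2, 50.0), ('Bob', 1, 10.0)]
```

The `?` placeholders let the driver handle quoting, which is safer than building SQL strings by hand.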

5. Data preprocessing in Python (pandas)

  1. Selection: accessing data (by label, by specific value, by boolean index, etc.)
  2. Missing value handling: dropping or filling rows with missing data
  3. Duplicate handling: detecting and removing duplicate values
  4. Outlier handling: removing unnecessary whitespace and extreme or anomalous values
  5. Related operations: descriptive statistics, apply, histograms, etc.
  6. Merging: combining tables according to various logical relationships
  7. Grouping: splitting the data, applying a function to each group, and recombining the results
  8. Reshaping: quickly generating pivot tables
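
Most of these steps fit into a few lines of pandas. The sketch below is only one possible way to do it: the sales records, the `countries` lookup table, and the price cutoff of 100 are all made up for the example, and a real outlier rule would depend on your data.

```python
import pandas as pd

# Hypothetical raw records with stray whitespace, a duplicate,
# a missing price, and one extreme price.
raw = pd.DataFrame({
    "city":  [" Paris", "Paris", "Rome", "Rome", "Rome", "Oslo"],
    "item":  ["tea", "tea", "tea", "cake", "cake", "tea"],
    "price": [3.0, 3.0, None, 4.0, 400.0, 2.5],
})
countries = pd.DataFrame({
    "city":    ["Paris", "Rome", "Oslo"],
    "country": ["France", "Italy", "Norway"],
})

clean = raw.copy()
clean["city"] = clean["city"].str.strip()     # remove unnecessary whitespace
clean = clean.drop_duplicates()               # remove duplicate rows
clean["price"] = clean["price"].fillna(clean["price"].median())  # fill missing values
clean = clean[clean["price"] < 100]           # boolean indexing; cutoff is an assumption

# Merge: attach the country of each city (left join on "city").
clean = clean.merge(countries, on="city", how="left")

# Descriptive statistics and grouping.
summary = clean["price"].describe()
mean_by_item = clean.groupby("item")["price"].mean()

# Reshaping: one row per city, one column per item.
pivot = clean.pivot_table(index="city", columns="item", values="price", aggfunc="mean")

print(clean)
print(mean_by_item)
print(pivot)
```

Note that the order matters here: stripping whitespace first is what makes the two Paris rows identical, so `drop_duplicates` can catch them.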

6. Using probability theory and statistics

  1. Basic statistics: mean, median, mode, percentiles, extreme values, etc.;
  2. Other descriptive statistics: skewness, variance, standard deviation, significance, etc.;
  3. Other statistical knowledge: population and sample, parameters and statistics, error bars;
  4. Probability distributions and hypothesis testing: the various distributions and the hypothesis testing process;
  5. Other probability theory knowledge: conditional probability, Bayes' theorem, etc.
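
A few of these quantities can be computed with nothing but the standard library. The sample values and the probabilities in the Bayes example are invented for illustration, and the sketch leaves out skewness and hypothesis testing, for which scipy is the more usual tool.

```python
import math
import statistics

# Hypothetical sample drawn from some larger population.
sample = [2, 3, 3, 5, 7, 10]

mean = statistics.mean(sample)            # 5
median = statistics.median(sample)        # 4.0
mode = statistics.mode(sample)            # 3
variance = statistics.variance(sample)    # sample variance (divides by n - 1)
stdev = statistics.stdev(sample)          # sample standard deviation
quartiles = statistics.quantiles(sample, n=4)  # 25th, 50th, 75th percentiles

# Standard error of the mean, a common choice for the length of an error bar.
standard_error = stdev / math.sqrt(len(sample))

# Conditional probability with Bayes' theorem:
# P(A | B) = P(B | A) * P(A) / P(B)
p_disease = 0.01               # P(A): prior probability (assumed)
p_pos_given_disease = 0.95     # P(B | A): test sensitivity (assumed)
p_pos_given_healthy = 0.05     # false positive rate (assumed)

p_positive = (
    p_pos_given_disease * p_disease
    + p_pos_given_healthy * (1 - p_disease)
)
p_disease_given_pos = p_pos_given_disease * p_disease / p_positive

print(mean, median, mode, variance, round(stdev, 3))
print(round(p_disease_given_pos, 3))  # 0.161
```

Even with a fairly accurate test, the posterior probability stays low because the prior is small, which is the usual lesson of this kind of example.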

7. Data analysis in Python

  1. Regression analysis: linear regression, logistic regression;
  2. Basic classification algorithms: decision trees, random forests;
  3. Basic clustering algorithm: k-means;
  4. Feature engineering basics: how to use feature selection to optimize a model;
  5. Parameter tuning: how to adjust parameters to optimize a model;
  6. Python data analysis packages: scipy, numpy, scikit-learn, etc.
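
As a rough illustration of how these pieces look in scikit-learn, here is a minimal sketch covering linear regression, k-means, and parameter tuning with a grid search. All of the data is synthetic and chosen to be trivially easy, and the candidate values for `C` are arbitrary assumptions. Decision trees and random forests are not shown, but they follow the same `fit` / `predict` pattern.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import GridSearchCV

# Linear regression on synthetic points that lie exactly on y = 2x + 1.
X = np.arange(10, dtype=float).reshape(-1, 1)
y = 2.0 * X.ravel() + 1.0
linear = LinearRegression().fit(X, y)
slope, intercept = linear.coef_[0], linear.intercept_

# k-means on two well-separated synthetic groups of points.
points = np.array([
    [0.0, 0.0], [0.2, 0.1], [0.1, 0.3],
    [5.0, 5.0], [5.2, 4.9], [4.9, 5.1],
])
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
labels = kmeans.labels_

# Parameter tuning: cross-validated grid search over the regularization
# strength C of a logistic regression classifier.
X_cls = np.array([[0.0], [0.5], [1.0], [1.5], [2.0],
                  [3.0], [3.5], [4.0], [4.5], [5.0]])
y_cls = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
search = GridSearchCV(LogisticRegression(), {"C": [0.1, 1.0, 10.0]}, cv=2)
search.fit(X_cls, y_cls)
best_c = search.best_params_["C"]
predictions = search.predict(np.array([[0.2], [4.8]]))

print(slope, intercept)
print(labels)
print(best_c, predictions)
```

On real data you would hold out a test set before tuning, so that the score you report is not measured on the same rows the search was tuned on.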

Origin blog.csdn.net/maiya_yaya/article/details/131850610