How to teach yourself data analysis?

90% of working people who teach themselves data analysis go about it the wrong way.

I expect that the friends who clicked into this answer have, like me, been in the workforce for a long time. At our stage, do you also find that you have to learn some data analysis, or your career path will keep getting narrower?

I have already been down the road of learning data analysis on the job. If you are also interested in data analysis, read on; it may not be as hard as you think.

1. What is data analysis

Put simply, data analysis means extracting valuable information from data by processing, interpreting, and drawing inferences from it, so that you can predict future trends, discover hidden opportunities, and optimize business processes.

Many friends are not familiar with the process of data analysis, so I have sorted it into a simple outline.

Whatever your position, build your study of data analysis around these 5 points (clarifying the problem, understanding the problem, data cleaning, data analysis, and data visualization) and you won't get lost.

Different positions simply emphasize different parts of data analysis. I will go through them one by one below, so don't worry.

2. The benefits of mastering data analysis

1. Optimize business processes: data analysis helps you find bottlenecks and problems in the business.

For example, you can use data analysis to optimize production processes, improve supply chain management, enhance customer experience, and refine marketing strategies, which in turn improves business performance and competitiveness.

2. Decision support: data analysis supports decisions with facts and evidence.

Through in-depth analysis and interpretation of data, you can better understand the nature, trends, and influencing factors of a problem, and reduce subjective guesswork and decision-making risk.

3. Career development: job postings in all walks of life now ask applicants for data analysis skills.

3. Different positions emphasize different things when self-studying data analysis

1. If you work in operations, product, or channels, and your business has hit a sticking point with weak growth, focus on three types of data analysis:

a. Analysis of the status quo of the project
b. Analysis of the causes of problems in the project
c. Forecast analysis of the future of the project
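
To make these three types concrete, here is a minimal sketch using only the Python standard library. The channel names and monthly order counts are made up for illustration, and the straight-line forecast is just one simple assumption about how the trend continues, not a recommended forecasting method.

```python
from statistics import mean

# Hypothetical monthly order counts per channel (made-up numbers for illustration).
orders = {
    "app": [120, 130, 125, 140],
    "web": [200, 180, 150, 120],
}

def totals_by_month(data):
    """Status quo: total orders in each month across all channels."""
    return [sum(month) for month in zip(*data.values())]

def change_by_channel(data):
    """Cause: how much each channel changed from the first month to the last."""
    return {name: series[-1] - series[0] for name, series in data.items()}

def linear_forecast(series):
    """Forecast: extend a least-squares straight line one step ahead."""
    n = len(series)
    xs = range(n)
    x_bar, y_bar = mean(xs), mean(series)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, series)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    intercept = y_bar - slope * x_bar
    return intercept + slope * n

totals = totals_by_month(orders)
changes = change_by_channel(orders)
next_month = linear_forecast(totals)
print(totals, changes, round(next_month, 1))
```

In this toy data the overall decline comes entirely from the "web" channel, which is the kind of finding a cause analysis is meant to surface before you forecast anything.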

2. If you work in human resources or finance, do not understand data, and lack real data analysis projects, you must learn these three types of data analysis:

a. Methods for obtaining comprehensive data
b. Methods for digging out the business story behind the data
c. Financial analysis methods

3. If you are in a management position, your main focus is using data as the guiding thread and evaluating people by performance.
Faced with a large amount of data, you need to spot the key points of every problem and every line of business at a glance. You must learn these 3 points:

a. Establish a data indicator system
b. Become proficient in comparative data analysis
c. Keep applying the first two over the long term
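
As a rough illustration of points a and b, the sketch below defines a tiny indicator system and compares two periods. The raw figures, the indicator names (`conversion_rate`, `average_order_value`), and the `compare` helper are all hypothetical, chosen only to show the idea.

```python
# Hypothetical raw figures for two periods (made-up numbers for illustration).
last_month = {"visits": 10000, "orders": 400, "revenue": 20000.0}
this_month = {"visits": 12000, "orders": 420, "revenue": 23100.0}

# A tiny "indicator system": each indicator is a named formula over the raw figures.
INDICATORS = {
    "conversion_rate": lambda d: d["orders"] / d["visits"],
    "average_order_value": lambda d: d["revenue"] / d["orders"],
}

def compare(previous, current):
    """Return {indicator: (previous value, current value, relative change)}."""
    report = {}
    for name, formula in INDICATORS.items():
        before, after = formula(previous), formula(current)
        report[name] = (before, after, (after - before) / before)
    return report

report = compare(last_month, this_month)
for name, (before, after, change) in report.items():
    print(f"{name}: {before:.3f} -> {after:.3f} ({change:+.1%})")
```

Keeping the indicator formulas in one place means every period is measured the same way, which is what makes a comparison across periods meaningful.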

4. If you are in another position, or are moving to a new one, build your study around the general data analysis method. It comes down to the following 5 points: clarifying the problem, understanding the problem, data cleaning, data analysis, and data visualization. Follow them and you won't go wrong.
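
The 5 points can be sketched end to end in a few lines of standard-library Python. Everything here is an assumption made for illustration: the question being asked, the inline CSV data, and the plain-text bar chart that stands in for a real plotting library.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Steps 1-2 (clarify and understand the problem): a hypothetical question,
# "which region has the highest average order amount?"
RAW = """region,amount
north,100
south,
north,140
south,80
south,abc
"""

def clean(text):
    """Step 3: keep only rows whose amount parses as a number."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        try:
            rows.append((row["region"], float(row["amount"])))
        except ValueError:
            continue  # skip missing or malformed amounts
    return rows

def analyze(rows):
    """Step 4: average amount per region."""
    groups = defaultdict(list)
    for region, amount in rows:
        groups[region].append(amount)
    return {region: mean(values) for region, values in groups.items()}

def visualize(averages, unit=10):
    """Step 5: a plain-text bar chart, one '#' per `unit` of amount."""
    return [
        f"{region:<6}{'#' * int(value // unit)} {value:.0f}"
        for region, value in sorted(averages.items())
    ]

averages = analyze(clean(RAW))
for line in visualize(averages):
    print(line)
```

Two of the five raw rows are dropped during cleaning, which changes the "south" average; that is why cleaning comes before analysis.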

Here I would like to share a top-notch introductory resource. It covers the details and fundamentals of using Python to manipulate, process, organize, and analyze data, and it teaches data analysis from scratch by introducing Python programming along with the libraries and tools used for data processing.

It starts with NumPy and then walks through each stage of data analysis, including reading and writing data, data wrangling, and visualization. The material also gives a short, clear example for each topic, and most examples come with a practical scenario (such as epidemic data analysis).

Who it is for: whether you need it for work, want to improve your skills, or are a complete beginner who wants to improve your prospects, you can give it a try.

Without further ado, let me show you:

Table of contents:

Chapter 1 Introduction to Python

  • Why choose Python

  • Getting started with Python

  • Choosing a Python version

  • Installing Python

  • Testing Python

  • Installing pip

  • Installing a code editor

  • Installing IPython (optional)


Chapter 2 Python Basics

  • Basic data types

  • Strings

  • Integers and floats

  • Data containers

  • Variables

  • Lists

  • Dictionaries

  • What the various data types can do

  • String methods: what strings can do

  • Useful tools: type, dir, and help

  • Putting it all together

  • What the code means


For reasons of space, not every chapter is shown as a screenshot. Friends who want to study it can get it at the end of the article (supporting data analysis tutorial videos are included).

Chapter 3 Machine-readable data

  • CSV data

  • How to import CSV data

  • Save the code to a file and run it on the command line

  • JSON data

  • XML data


Chapter 4 Working with Excel Files

  • Install Python packages

  • Parsing Excel files

  • Getting started with parsing


Chapter 5 Working with PDF files and solving problems with Python

  • Try not to use PDF

  • Programmatic method for parsing PDF

  • Open and read PDF with slate library

  • Convert PDF to text

  • Parsing PDFs with pdfminer

  • Learning how to solve problems

  • Exercise: Use table extraction, try a different library

  • Exercise: Clean the data manually

  • Exercise: Try another tool

  • Uncommon file types


Chapter 6 Data Acquisition and Storage

  • Not all data is created equal

  • Authenticity checks

  • Data readability, data cleanliness, and data longevity

  • Finding data

  • Case study: data investigation example

  • Data storage

  • Introduction to databases

  • Relational databases: MySQL and PostgreSQL

  • Non-relational databases: NoSQL

  • Creating a local database with Python

  • Using simple files

  • Cloud Storage and Python

  • Local Storage and Python

  • Other data storage methods


Chapter 7 Data Cleaning: Research, Matching, and Formatting

  • Why clean data

  • Data Cleansing Basics

  • Find the data that needs to be cleaned

  • Data formatting

  • Finding outliers and bad data

  • Finding duplicates

  • Fuzzy matching

  • Regular expression matching

  • How to handle duplicate records


Chapter 8 Data Cleansing: Standardization and Scripting

  • Data Normalization and Standardization

  • Data storage

  • Finding the right data cleansing method for your project

  • Scripting your data cleansing

  • Test with new data

Chapter 9 Data Exploration and Analysis

  • Exploring data

  • Importing data

  • Exploring table functions

  • Joining multiple datasets

  • Identifying correlations

  • Finding outliers

  • Creating groups

  • Exploring in depth

  • Analyzing data

  • Separating and focusing data

  • What your data is saying

  • Describing the conclusions

  • Documenting the conclusions

Chapter 10 Presenting Data

  • Avoid the Storytelling Trap

  • How to tell a story

  • Understanding the audience

  • Visualizing data

  • Charts

  • Time-related data

  • Maps

  • Interactive elements

  • Text

  • Images, videos, and illustrations

  • Presentation tools

  • Publishing data

  • Using available sites

  • Open Source Platform: Create a new website

  • Jupyter (formerly known as IPython notebook)

Chapter 11 Web Scraping: Obtaining and Storing Web Data

  • What to scrape and how to scrape it

  • Analyze web pages

  • View: Markup Structure

  • Network/Timeline: How the page loads

  • Console: interact with JavaScript

  • In-depth analysis of the page

  • Get Pages: How to Make Requests Over the Internet

  • Reading web pages with Beautiful Soup

  • Read web pages with lxml

Chapter 12 Advanced Web Scraping: Screen Scrapers and Crawlers

  • Browser-Based Parsing

  • Screen reading with Selenium

  • Screen reading with Ghost.py

  • Crawl the web

  • Create a crawler using Scrapy

  • Use Scrapy to crawl the entire website

  • Networks: how the Internet works, and why it breaks your scripts

  • The changing web (or why your script broke)

  • A few words of advice

Chapter 13 Application Programming Interface

  • API features

  • REST API vs Streaming API

  • Rate limits

  • Tiered data volumes

  • API keys and tokens

  • A simple Twitter REST API data pull

  • Advanced data collection using the Twitter REST API

  • Advanced data collection using the Twitter Streaming API

Chapter 14 Automation and Scale

  • Why automate

  • Steps to automate

  • What can go wrong

  • Where to automate

  • Special tools for automation

  • Using local files, parameters, and configuration files

  • Using the cloud in data processing

  • Simple automation

  • Large-scale automation

  • Monitoring automated programs

  • No system is foolproof

Chapter 15 Conclusion

  • Responsibilities of Data Processors

  • Beyond data processing

  • What to do next

The full version of this data analysis PDF has been uploaded to CSDN, together with supporting data analysis tutorial videos. If you need it, scan the official CSDN-certified QR code below with WeChat to get it for free (guaranteed 100% free).


Origin blog.csdn.net/JAVAmonster12/article/details/130767103