Industry Big Data Review

These notes follow what the teacher said and are for reference only.
The first part covers big data concepts; the second part covers the concepts and use of crawlers.
The actual textbook is an introduction to big data...

Chapter 1 Overview

1. Data

The general term for all symbols that can be entered into a computer and processed by a computer program

2. Classification of data
  • Structured data: data with predefined data types, formats, and structures, such as relational database tables and CSV files
  • Semi-structured data: text data files with a recognizable schema that can be parsed, such as JSON and XML
  • Unstructured data: data without a fixed structure, usually saved as files in various formats, e.g. articles, audio, video
  • Semi-structured and unstructured data must first be converted into structured data before they can be used for machine learning.
3. Big data 4V characteristics
  • Large scale (Volume): the amounts of data collected, computed, and stored are very large.
  • High speed (Velocity): data grows fast, is processed fast, and is acquired fast.
  • Variety: many types and many sources. Types include structured, semi-structured, and unstructured data; common
    sources include web logs, audio, video, pictures, etc.
  • Low value density (Value): the value density of data is low, like panning for gold in sand. A large amount of
    low-value-density data must be analyzed and processed to extract the small amount of valuable information it contains.
4. Computational characteristics of big data

Approximate, incremental, inductive
Shift in mindset: sampling vs. full data, exactness vs. approximation, causation vs. correlation

5. General process of data processing
  • Data acquisition:
    After acquisition, the data must be preprocessed (transformed, cleaned, etc.) to produce data that meets the requirements of the data application
  • Data management:
    classifying, encoding, storing, indexing, and querying data
  • Data Analytics:
    Descriptive, Diagnostic, Predictive, and Prescriptive
  • Data visualization and interactive analysis:
    help business people rather than data processing experts better understand the results of data analysis
6. Which industries are big data applied to? Try to cite an example.
  • Social network: a large amount of unstructured data such as audio, text information, video, and pictures appears
  • E-commerce: users' real shopping interests can be captured more comprehensively and richly
  • Mobile Internet: user information, such as location and lifestyle data, can be collected more accurately and quickly

Chapter 2 Data Collection and Governance

1. Sources of big data:
  • Measurements of the real world: data obtained through sensing devices
  • Human Records: Data formed by humans entering into computers
  • Computer-generated data: Data generated by a computer through programs such as real-world simulations
2. Data collection

Refers to the process of obtaining raw data from real-world objects; the reliability and timeliness of the collected data must be ensured
Common methods: sensors, logs, crawlers, crowdsourcing

3. Evaluation criteria for data quality
  • Completeness: whether any data or information is missing
  • Consistency: whether the data follows a unified specification and whether its logical relationships are correct and complete
  • Accuracy: whether there are anomalies or errors in the data
  • Timeliness: the time interval from when data is generated to when it can be viewed
4. Factors affecting data quality
  • Information factors: data source specifications are not uniform
  • Technical factors: anomalies or errors in technical processing
  • Process factors: improperly designed processes
  • Management factors: personnel quality and management mechanism issues
5. Working with continuous data

Data discretization: equal-width, equal-frequency, and optimized discretization

6. Data Integration
  • Traditional Data Integration: Federated Database, Data Warehouse, Mediator
  • Cross-Boundary Data Integration: Phase-Based Approaches, Feature-Based Approaches, Semantic-Based Approaches
7. Binning method
  • Equal-depth binning: each bin holds the same number of records; the number of records in a bin is called the depth of the bin.
  • Equal-width binning: the whole range of data values is split evenly, so each bin covers an equal interval; this interval is called the width of the bin.
  • User-defined binning: binning is performed according to user-defined rules.

The following is the value of the customer income attribute, please perform binning according to the above three schemes
800 1000 1200 1500 1500 1800 2000 2300 2500 2800 3000 3500 4000 4500 4800 5000

(figure with the binning results omitted)
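
Below is a minimal sketch of equal-depth and equal-width binning on the income values above, assuming 4 bins (the number of bins is my assumption, not given in the exercise):

values = [800, 1000, 1200, 1500, 1500, 1800, 2000, 2300, 2500,
          2800, 3000, 3500, 4000, 4500, 4800, 5000]
n_bins = 4

# Equal-depth binning: each bin holds the same number of records.
depth = len(values) // n_bins
equal_depth = [values[i * depth:(i + 1) * depth] for i in range(n_bins)]

# Equal-width binning: split the value range [min, max] into equal intervals.
width = (max(values) - min(values)) / n_bins
equal_width = [[] for _ in range(n_bins)]
for v in values:
    idx = min(int((v - min(values)) // width), n_bins - 1)
    equal_width[idx].append(v)

print("equal depth:", equal_depth)
print("equal width:", equal_width)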

8. Smoothing

Noisy data can be smoothed using methods such as binning, clustering, and regression (see items 7, 9, and 10).

9. Clustering:

The data set is grouped into several clusters: similar or adjacent data are aggregated together to form cluster sets, and values that fall outside all clusters are isolated points, i.e. noise data, which are deleted or replaced.

10. Regression:

By finding the correlation between two related variables, a regression function is constructed that fits the relationship between the two variables as closely as possible, and this function is used to smooth the data.

11. Dealing with redundant data

Redundant data is usually handled by filtering:

  • Duplicate filtering: keep one record from each group of duplicates and delete the remaining duplicates
  • Conditional filtering: filter the data based on one or more conditions
12. Inconsistency detection and repair

Based on integrity constraints over the data, including functional dependencies, conditional functional dependencies, etc.

13. Missing value filling method
  • Delete: directly delete the corresponding attribute or sample
  • Statistical filling: fill with a statistic of the samples in that dimension, such as the mean, median, mode, maximum, or minimum
  • Uniform filling: fill all missing values with a single custom value, such as "empty", "0", "positive infinity", or "negative infinity"
  • Predictive filling: use the existing attribute values to predict the missing values with a predictive model (see the sketch below).
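
A minimal sketch of these filling strategies with pandas, on a made-up "income" column (the data and column name are assumptions):

import pandas as pd

df = pd.DataFrame({"income": [800, 1000, None, 1500, None, 3000]})

dropped   = df.dropna()                                # delete: drop samples with missing values
stat_fill = df["income"].fillna(df["income"].mean())   # statistical filling with the mean
uniform   = df["income"].fillna(0)                     # uniform filling with a custom value
# Predictive filling would train a model (e.g. a regression) on the
# rows without missing values and predict the missing ones.
print(stat_fill.tolist())
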
14. Entity Recognition

Each object cluster obtained after entity recognition refers to the same entity in the real world.
Problems solved: redundancy problem, duplicate name problem
Two types of technologies:

  • Redundancy discovery: calculate the similarity between objects and decide, by comparison against a threshold, whether the objects belong to the same entity class.
  • Duplicate-name detection: use clustering to determine whether objects with the same name belong to the
    same entity class, by examining the degree of association between entity attributes.
15. Some people think that it is not safe for their data to be in the hands of others, and their privacy may be violated. What is your opinion on this?

The Internet permeates our daily life, and our data cannot be entirely in our own hands. Any information we transmit over the network can be obtained by the transmission platform or even illegally intercepted. What we can do is avoid entering our data on phishing platforms, so that criminals have nothing to exploit. Even if a legitimate platform obtains our data, it should not violate our privacy.

In your opinion, how to restrain information "holders" can ensure the security of their customers' information?

Information holders should make a formal commitment not to use customer information in ways that violate privacy. For example, mobile apps obtain geographic location, text messages, and so on; they need to guarantee that this data is used only to make the software run normally, rather than sharing the obtained private information illegally.

Chapter 3 Big Data Management

1. Data management technology

Data management technology refers to the classification, encoding, storage, indexing, and querying of data. It is a key technology in big data processing: the core system responsible for everything from data storage (writing) to query and retrieval (reading).

2. Database

A database is a warehouse built on a computer storage device that organizes, stores and manages data according to the data structure

3. Relational database

The core is to save data in a simple table composed of rows and columns, rather than saving data in a hierarchical structure.
Features:
1. Centralized data control;
2. High data independence;
3. Good data sharing;
4. Low data redundancy;
5. Structured data;
6. Unified data protection function.

4. Relational data model
  • Data are represented as relations (tables)
  • Data integrity is ensured through entity integrity, referential integrity, and user-defined integrity
  • A row is a tuple; the columns are called attributes of the relation
  • An attribute set (possibly a single attribute) that uniquely identifies a tuple in a relation is called a key (or code) of the relation.
  • The minimal attribute set used to uniquely identify a tuple is called the primary key.
5. Data manipulation of relational data model
  • Query
    selection (Select), projection (Project), union (Union), difference (Except), join (Join), etc.
  • Update
    insert (Insert), modify (Update), delete (Delete)
6. Database transaction characteristics
  • Atomicity: All operations included in the transaction are either all correctly reflected in the database, or not reflected at all;
  • Consistency: The execution of the transaction will cause the database to reach another consistent state from one consistent state, that is, the execution of the transaction will not make the database inconsistent;
  • Isolation: transactions are isolated from each other; each transaction is unaware that other transactions are executing concurrently in the system;
  • Durability: After a transaction completes successfully, its changes to the database are permanent, even if the system fails.
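
A minimal sketch of transaction atomicity using Python's built-in sqlite3 module (the table and values are made up; a simulated failure forces a rollback):

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (name TEXT, balance INTEGER)")
conn.execute("INSERT INTO account VALUES ('A', 100), ('B', 100)")
conn.commit()

try:
    # Transfer 50 from A to B: both updates must succeed, or neither does.
    conn.execute("UPDATE account SET balance = balance - 50 WHERE name = 'A'")
    raise RuntimeError("simulated failure before crediting B")
    # conn.execute("UPDATE account SET balance = balance + 50 WHERE name = 'B'")
    # conn.commit()
except Exception:
    conn.rollback()   # atomicity: the partial debit is undone

print(conn.execute("SELECT * FROM account").fetchall())   # [('A', 100), ('B', 100)]
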
7. Distributed file system

A distributed file system is built on multiple relatively cheap servers connected through a network; the files to be stored are divided into multiple fragments according to a specific strategy and placed on multiple servers in the system.

8. Features of HDFS
  • Suitable for large file storage and processing
  • The cluster size can be dynamically expanded
  • Can effectively ensure data consistency
  • Large data throughput, good cross-platform portability

Chapter 4 Overview of Python Data Analysis

1. Data Analysis
  • Data analysis in the narrow sense refers to processing and analyzing collected data according to the analysis purpose, using methods such as comparative analysis, group analysis, cross analysis, and regression analysis, in order to extract valuable information, give full play to the role of the data, and obtain a statistical characterization of it.
  • Data mining is the process of mining potential value from large amounts of incomplete, noisy, fuzzy, and random real-world application data by applying techniques such as clustering, classification, regression, and association rules.
2. Social Media Analysis
  • User analysis builds a user's personal profile and behavioral characteristics mainly from user data such as registration information, login times, and the
    content the user usually publishes.
  • Access analysis infers a user's interests and hobbies from the content they usually visit, and from that analyzes their potential commercial value.
  • Interaction analysis predicts certain future behaviors of an object based on the behavior of the objects it interacts with.
3. Advantages of Python
  • Simple and easy to learn
  • Free and open source
  • High-level language
  • Powerful third-party libraries
  • Extensibility, embeddability, cross-platform support
4. Common libraries for data analysis
  • IPython - part of the standard toolset for scientific computing
  • NumPy (Numerical Python) - the basic package for Python scientific computing
  • SciPy - a collection of modules dedicated to solving various standard problem domains in scientific computing
  • Pandas - data analysis core library
  • Matplotlib - Python library for drawing data charts
  • scikit-learn - tools for data mining and data analysis
  • Spyder - an interactive Python language development environment

Chapter 5 Introduction to Python Crawlers

1. Crawlers

A web crawler, also known as a web spider or web robot, is a computer program or automated script that automatically downloads web pages.

2. Features of web pages
  • Web pages have their own unique URL (Uniform Resource Locator, e.g. https://www.baidu.com/) for addressing

  • Web pages use HTML (Hypertext Markup Language) to describe page information

  • Web pages use the HTTP/HTTPS (Hypertext Transfer Protocol) protocol to transmit HTML data

3. Strategy

(figure omitted: example web page hierarchy graph with nodes A–G)

  • Depth-first strategy: visit next-level links in order from shallow to deep until no deeper page can be reached. For the hierarchy graph above, the crawling order could be: A → D → E → B → C → F → G.
  • Breadth-first strategy: crawl pages in order of directory depth, shallower pages first; only after all pages in one level have been crawled does the crawler move to the next level. The crawling order could be: A → B → C → D → E → F → G.
4. Classification of crawlers
  • General web crawler
    Also known as a whole-Web crawler, its crawling scope expands from a batch of seed URLs to the entire Web. This type of crawler suits searching over a wide range of topics and is mainly used by search engines or large Web service providers.
  • Focused web crawler
    Also known as a topic web crawler, its defining feature is that it selectively crawls only pages related to preset topics. Its crawling strategies include:

Crawling strategy based on content evaluation
Crawling strategy based on link structure evaluation
Crawling strategy based on reinforcement learning
Crawling strategy based on context graphs

  • Incremental web crawler
    An incremental web crawler only incrementally updates already-downloaded web pages or only crawls newly generated and changed pages; it needs to revisit pages to update its local copies, so that the locally stored pages stay up to date. Its update methods include:

Uniform update method: visit all web pages at the same frequency, regardless of how often each page itself changes.
Individual update method: set the revisit frequency of each page according to how often that individual page changes.
Classification-based update method: divide web pages into fast-changing and slow-changing categories according to their change frequency, and visit the two categories at different frequencies.

  • Deep Web crawler
    Deep Web pages are pages most of whose content cannot be reached through static links; it is hidden behind search forms and can only be obtained after a user submits keywords. Form filling takes two approaches:

Form filling based on domain knowledge: this approach generally maintains an ontology library and selects appropriate keywords to fill in the form through semantic analysis.
Form filling based on web page structure analysis: this approach has no domain knowledge, or only limited domain knowledge; the HTML page is represented as a DOM tree, and forms are divided into single-attribute and multi-attribute forms.

5. Data that must not be crawled
  • Personal privacy data: such as name, mobile phone number, age, blood type, marital status, etc. Crawling such data violates the Personal Information Protection Law.
  • Data that others have clearly prohibited access to: for example, content protected by account passwords or other permission controls and encryption.
    Copyright also needs attention: copyright-protected content signed by the author may not be reproduced or used for commercial purposes after crawling.
  • Content disallowed by the robots protocol
6. robots.txt
User-agent: * 
Disallow: /
Allow: /public/

  • The User-agent line above names the crawler the rules apply to; setting it to * means the rules apply to any crawler. We could instead set, for example, User-agent: Baiduspider, meaning the rules apply to Baidu's crawler. If there are multiple User-agent records, multiple crawlers are restricted, and at least one record must be specified.
  • Disallow specifies directories that are not allowed to be crawled. Setting it to / as in the example above means that no page may be crawled.
  • Allow is generally used together with Disallow rather than alone, to carve out exceptions to a restriction. Setting it to /public/ here means that, while all other pages may not be crawled, the public directory may be (see the sketch below).
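
A minimal sketch of checking these rules with the standard-library urllib.robotparser (the URLs are placeholders; the rules are parsed from a string instead of being fetched, and the Allow line is placed first because Python's parser honours the first matching rule):

import urllib.robotparser

rules = """User-agent: *
Allow: /public/
Disallow: /
"""
rp = urllib.robotparser.RobotFileParser()
rp.parse(rules.splitlines())

print(rp.can_fetch("Baiduspider", "https://www.example.com/private/page"))  # False
print(rp.can_fetch("Baiduspider", "https://www.example.com/public/page"))   # True
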
7. Purposes and means of website anti-crawling ***
  • Anti-crawling through User-Agent verification
  • Anti-crawling through access frequency limits
  • Anti-crawling through CAPTCHA (verification code) checks
  • Anti-crawling by changing the structure of the web page
  • Anti-crawling through account permissions
8. For every policy above, there is a countermeasure below (◑▽◐)
  • Send a spoofed User-Agent
  • Adjust the visit frequency
  • Pass the CAPTCHA check
  • Adapt to changes in the site structure
  • Deal with account-permission restrictions
  • Evade blocking by using proxy IPs
9. Crawler-related libraries

(table of crawler-related libraries omitted)

Chapter 6 Web front-end basics

1. socket library

Provides a variety of protocol types and functions that can be used to establish TCP and UDP connections
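
A minimal sketch of sending a raw HTTP request over a TCP socket with the socket library (example.com is just a reachable placeholder host):

import socket

host = "example.com"
with socket.create_connection((host, 80), timeout=5) as s:      # TCP connection to port 80
    request = f"GET / HTTP/1.1\r\nHost: {host}\r\nConnection: close\r\n\r\n"
    s.sendall(request.encode("ascii"))                          # send the HTTP request text
    response = b""
    while chunk := s.recv(4096):                                # read until the server closes
        response += chunk

print(response.split(b"\r\n", 1)[0].decode())                   # e.g. HTTP/1.1 200 OK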

2. HTTP request method and process
  • The HTTP client initiates a request by creating a TCP connection to the specified port of the server (port 80 by default).
  • The HTTP server listens for client requests on this port.
  • Once the request is received, the server returns a status line, such as "HTTP/1.1 200 OK", followed by the response content, such as the requested file, an error message, or other information.
3. Common request methods

(table omitted; common request methods include GET, POST, HEAD, PUT, DELETE, and OPTIONS)

4. Request (request) and response (response)

The HTTP protocol uses a request/response model.

  • The client sends a request message to the server, and the request message includes the request method, URL, protocol version, request header and request data.
  • The server responds with a response message whose status line includes the protocol version and response status, followed by server information, response headers, and response data
5. Specific steps
  • connect to web server
  • send HTTP request
  • The server accepts the request and returns an HTTP response
  • Release the TCP connection
  • The client parses the HTML content
6. Types of HTTP status codes***

An HTTP status code is a 3-digit code indicating the response status of the web server. Status codes are divided into five categories according to the first digit: 1xx (informational), 2xx (success), 3xx (redirection), 4xx (client error), and 5xx (server error).
(status code tables omitted)

7. HTTP header type ***
  • General headers: apply to both client request headers and server response headers. They relate only to the message being sent and have nothing to do with the data eventually transmitted in the HTTP message body.
  • Request headers: provide more precise description of the requested resource or of the request itself. Request headers added in newer HTTP versions cannot be used with older HTTP versions, but if both the server and the client can process them, they can be used in a request.
  • Response headers: provide additional information about the response message. For example, the Location field describes the location of a resource and the Server field describes the server itself. As with request headers, response headers added in newer versions cannot be used with older HTTP versions.
  • Entity headers: describe the message body. For example, Content-Length gives the length of the message body and Content-Type gives its MIME type. Newer entity headers can be used with earlier HTTP versions.
8. Cookies
  • The client requests the server: the client requests a page from the website
  • The server responds: a cookie is a key=value string; to record the state of the client's request, the server adds a Set-Cookie field to the response headers.
  • The client requests the server again: the client stores the Set-Cookie information from the server's response and, when requesting again, includes that cookie information in the request headers.

The cookie mechanism can record the user status, and the server can record and identify the user status based on the cookie.
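
A minimal sketch of this mechanism with a requests Session, which stores Set-Cookie values and sends them back on later requests (httpbin.org is used here purely as a public demo service):

import requests

s = requests.Session()
s.get("https://httpbin.org/cookies/set/sessionid/abc123")   # server replies with Set-Cookie
print(s.cookies.get("sessionid"))                           # 'abc123' is now stored client-side
r = s.get("https://httpbin.org/cookies")                    # the cookie is sent back automatically
print(r.json())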

Chapter 7 Simple static web crawling

1. Static web pages

Static web pages, as opposed to dynamic web pages, are web pages that have no backend database, contain no programs, and are not interactive.

2. Basic process of a crawler ***
  • Initiate a request: Initiate a request to the target site through the HTTP library, that is, send a Request, which can include additional headers and other information, and wait for the server to respond.
  • Get the response content: if the server responds normally, a Response is returned whose content is the requested page; it may be HTML, a JSON string, binary data (such as pictures or videos), etc.
  • Parse the content: HTML can be parsed with regular expressions or web page parsing libraries; JSON can be converted directly into a JSON object for analysis; binary data can be saved or processed further.
  • Saving data: There are various ways of saving, which can be saved as text, saved to a database, or saved in a specific format.
3. urllib3 library ***
  • (1) Generate a request
urllib3.request(method,url,fields=None,headers=None,**urlopen_kw)

(parameter table omitted)

  • (2) Request header processing
    If a headers parameter needs to be passed to the request method, it can be supplied as a dictionary. For example, define a dictionary containing User-Agent information that identifies the browser as Firefox or Chrome and the operating system as "Windows NT 6.1; Win64; x64", then send a GET request carrying this headers parameter to the website "https://www.jd.com/index.html".

  • (3) Timeout setting
    To prevent packet loss and hanging requests when the network or the server is unstable, a timeout parameter, usually a floating-point number, can be added to the request.
    Depending on the need, the timeout can be set in several ways: directly after the URL to apply to the whole request, separately for the connect and read phases, or on the PoolManager instance so that it applies to all requests made through that instance.

  • (4) Request retry settings
    The urllib3 library controls retries through the retries parameter. By default, 3 request retries and 3 redirects are performed. Assigning an integer to retries sets a custom retry count; defining a Retry instance allows the retry count and the redirect count to be customized separately.

  • (5) Generate a complete HTTP request
    A complete request generated with the urllib3 library should combine the URL, request headers, timeout, and retry settings, as in the sketch below.
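
A minimal sketch of such a complete urllib3 GET request (the URL and User-Agent string are just examples):

import urllib3

http = urllib3.PoolManager()
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 6.1; Win64; x64)"}
timeout = urllib3.Timeout(connect=1.0, read=3.0)          # separate connect / read timeouts
retries = urllib3.Retry(total=5, redirect=3)              # 5 retries, at most 3 redirects

resp = http.request("GET", "https://www.jd.com/index.html",
                    headers=headers, timeout=timeout, retries=retries)
print(resp.status)                                        # HTTP status code
print(resp.data[:200])                                    # first bytes of the response body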

4. requests library

The requests library is a native HTTP library, which is easier to use than the urllib3 library

  • (1) Generate a request
requests.request(method, url, **kwargs)

(parameter tables omitted)

import requests

url = 'https://www.jd.com/index.html'   # example URL (as in the text above)
hd = {'User-Agent': 'Chrome/10'}        # custom request header
r = requests.get(url, headers=hd)       # send a GET request carrying the header

(figure omitted)

  • (2) Check the status code and encoding
    When the requests library guesses the encoding wrong, the encoding must be specified manually to avoid garbled characters when parsing the returned page content.
    The chardet library's detect function can be used to detect the encoding, as sketched below.
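
A minimal sketch of detecting and applying the encoding (chardet must be installed separately; the URL is just an example):

import chardet
import requests

r = requests.get("https://www.example.com/")
guess = chardet.detect(r.content)      # e.g. {'encoding': 'utf-8', 'confidence': 0.99, ...}
r.encoding = guess["encoding"]         # tell requests which encoding to use
print(r.text[:200])                    # text decoded with the detected encoding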

  • (3) Request header and response header processing
    Request header handling in the requests library is similar to that in urllib3: the headers parameter is used to pass request headers in a GET request, in dictionary form. The headers attribute of the response is used to view the response headers returned by the server, which usually correspond to the parameters uploaded with the request.

  • (4) Timeout setting
    In the requests library this is done with the timeout parameter: once the set number of seconds is exceeded, the program stops waiting.

  • (5) Generate a complete HTTP request
    A complete GET request includes the URL, request headers, response headers, timeout, and status code, with the encoding set correctly, as in the sketch below.
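
A minimal sketch of such a complete requests GET request (the URL and User-Agent string are just examples):

import requests

url = "https://www.jd.com/index.html"
headers = {"User-Agent": "Mozilla/5.0 (Windows NT 6.1; Win64; x64)"}
r = requests.get(url, headers=headers, timeout=3.0)

print(r.status_code)        # e.g. 200
print(r.encoding)           # encoding guessed by requests
print(r.headers)            # response headers returned by the server
print(r.text[:200])         # beginning of the page content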

Use the decode function on requests.content to fix garbled Chinese characters:

import requests

url = "http://xxx.com"                 # a page whose content is GBK-encoded
r = requests.get(url)
print(r.content.decode("gbk"))         # decode the raw bytes with the correct encoding
5. Chrome Developer Tools

(screenshot of the Chrome Developer Tools panels omitted)

6. Regular expressions

A regular expression is a tool for pattern matching and replacement. The user builds a matching pattern out of a series of special characters, compares the pattern against a target string or file, and executes the corresponding program depending on whether the target contains the pattern.

import re

example_obj = "1. A small sentence. - 2. Another tiny sentence. "
re.findall('sentence', example_obj)          # search the whole string; returns ALL matches as a list
re.search('sentence', example_obj)           # scan the whole string; returns the FIRST successful match
re.sub('sentence', 'SENTENCE', example_obj)  # replace matches
re.match('.*sentence', example_obj)          # must match from the START of the string

Commonly used wildcard symbols:
1. Period ".": matches any character except the newline "\n";
2. Character class "[]": matches any single character inside the square brackets;
3. Pipe "|": acts as an OR operation;

(tables of additional regular expression symbols omitted)
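
A minimal sketch of the three symbols above, using re on a made-up string:

import re

text = "cat cut cot bat"
print(re.findall("c.t", text))       # '.'  matches any single character: ['cat', 'cut', 'cot']
print(re.findall("[cb]at", text))    # '[]' matches any listed character: ['cat', 'bat']
print(re.findall("cat|bat", text))   # '|'  means OR:                     ['cat', 'bat']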

7. Get the title content in the web page

Regular expressions are awkward for locating a specific node and extracting the links and text inside it; XPath and Beautiful Soup make this much more convenient.

8. XPath
  • XPath works on an XML-based tree structure, searching for nodes in the document tree and locating specific parts of an XML document.
  • (1) Basic syntax
    Initialization:
lxml.etree.HTML(text, parser=None, *, base_url=None)

(parameter table omitted)

  • (2) Common matching expressions
    (table of expressions omitted)
  • (3) Functions
    (table of functions omitted; a usage sketch follows this list)
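
A minimal sketch of building an lxml document tree and matching nodes with XPath (the HTML snippet is made up):

from lxml import etree

text = """
<html><body>
  <ol id="items">
    <li><a href="/a">first</a></li>
    <li><a href="/b">second</a></li>
  </ol>
</body></html>
"""
html = etree.HTML(text)                               # build the document tree
links = html.xpath('//ol[@id="items"]/li/a/@href')    # attribute values of matching nodes
texts = html.xpath('//ol[@id="items"]/li/a/text()')   # text content of matching nodes
print(links)   # ['/a', '/b']
print(texts)   # ['first', 'second']
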
9. Beautiful Soup

(overview figure omitted)

  • (1) Create
BeautifulSoup("<html>data</html>")     # create from a string
BeautifulSoup(open("index.html"))      # create from an HTML file
  • (2) Formatted output
BeautifulSoup.prettify(self, encoding=None, formatter='minimal')

(parameter table omitted)

Object types
1. Tag object type
A Tag has two very important attributes: name and attributes. The name attribute can be read and modified through the name property, and a modified name is reflected in the HTML document generated by the BeautifulSoup object.
2. NavigableString object type
A NavigableString is the text string contained inside a Tag, such as "The Dormouse's story" inside <b>The Dormouse's story</b>; it can be obtained with the string property. A NavigableString cannot be edited in place, but it can be replaced using the replace_with method.
3. BeautifulSoup object type
The BeautifulSoup object represents the entire content of a document and can mostly be treated as a Tag object. It is not a real HTML or XML tag, so it has no name or attributes of its own, but it does carry a special name attribute whose value is "[document]".
4. Comment object type
Tag, NavigableString, and BeautifulSoup cover almost everything in HTML and XML, but there are some special objects. The comment part of a document is the part most easily confused with the text string inside a Tag. Beautiful Soup recognizes document comments as the Comment type, a special kind of NavigableString; when it appears in an HTML document a Comment is output in a special format, which requires calling the prettify method.
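
A minimal sketch of the four object types above, using a tiny made-up HTML fragment:

from bs4 import BeautifulSoup

soup = BeautifulSoup("<p class='story'>The Dormouse's story<!--a comment--></p>",
                     "html.parser")
tag = soup.p
print(type(tag).__name__, tag.name, tag.attrs)   # Tag p {'class': ['story']}
print(type(tag.contents[0]).__name__)            # NavigableString
print(type(tag.contents[1]).__name__)            # Comment
print(soup.name)                                 # [document]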

  • (3) Search
    The commonly used methods are find and find_all, which take the same parameters. The difference is that find_all returns all matching results as a list, while find returns the first match directly (see the sketch below).
BeautifulSoup.find_all(name, attrs, recursive, string, **kwargs)

(parameter table omitted)
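
A minimal sketch of find versus find_all on a made-up fragment:

from bs4 import BeautifulSoup

soup = BeautifulSoup("<ul><li>a</li><li>b</li></ul>", "html.parser")
print(soup.find("li"))                # <li>a</li>               (first match, returned directly)
print(soup.find_all("li"))            # [<li>a</li>, <li>b</li>] (all matches, as a list)
print(soup.find_all("li", limit=1))   # like find, but still wrapped in a list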

10. Data storage

Store the data as a JSON file

json.dump(obj, fp, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, default=None, sort_keys=False, **kw)

(parameter table omitted)

Code example analysis

1. Request to get the title of the web page
import re
import urllib.request

page = urllib.request.urlopen('https://www.cnki.net/')   # fetch the page
html = page.read().decode('utf-8')                       # decode as UTF-8

title = re.findall('<title>(.+)</title>', html)          # capture everything between the <title> tags
print(title)                                             # print the result
2. Hot words on the website

https://weixin.sogou.com/
(screenshot of the page structure omitted)
The target path is an a tag under li under the ol with id topwords.
I don't search for the i tags directly here, because not all of them have distinctive class names (what a pain, it wasted a long time).

  • Fetching with bs4
import json
import urllib.request
from bs4 import BeautifulSoup

html_doc = "https://weixin.sogou.com/"                 # page URL
req = urllib.request.Request(html_doc)                 # build the request
webpage = urllib.request.urlopen(req)                  # fetch the page
html = webpage.read().decode('utf-8')                  # decode as UTF-8
soup = BeautifulSoup(html, 'html.parser')              # parse with the html.parser parser

words = []
top = soup.find(name="ol", attrs={"id": "topwords"})   # find the ol tag whose id is topwords
for child in top.find_all('a'):                        # find all a tags among its descendants
    words.append(child.string)                         # save the tag's text content

filename = 'sougou.json'

with open(filename, 'w') as file_obj:
    json.dump(words, file_obj)                         # write to a JSON file

with open(filename) as file_obj:
    words = json.load(file_obj)                        # read back from the JSON file

print(words)
  • Fetching with XPath
import json
import requests
from lxml import etree

url = "https://weixin.sogou.com/"                     # request URL
response = requests.get(url=url)                      # send the request
wb_data = response.text                               # response body as text
html = etree.HTML(wb_data)                            # build the document tree from the page

words2 = []
# text of the a tags under the li children of the ol tag whose id is topwords
content = html.xpath('//ol[@id="topwords"]/li/a/text()')
for item in content:
    words2.append(item.encode('ISO-8859-1').decode('UTF-8'))   # fix the encoding and save

filename2 = 'sougou2.json'

with open(filename2, 'w') as file_obj:
    json.dump(words2, file_obj)                       # write to a JSON file

with open(filename2) as file_obj:
    words2 = json.load(file_obj)                      # read back from the JSON file

print(words2)

(screenshot of the output omitted)

Dynamic web pages were only mentioned in class, so they are not written up here.
✿✿ヽ(゚▽゚)ノ✿ The end, scattering flowers!

Origin blog.csdn.net/qq_44616044/article/details/118434965