Data Analysis Overview and Theoretical Basis

What is data analysis?


Data analysis is the process of applying appropriate statistical methods to large amounts of collected data in order to extract useful information, draw conclusions, and study and summarize the data in detail.


Why data analysis?


With the rapid development of computer, Internet, database, and related technologies, it has become ever easier to generate, obtain, and store data, and this data contains patterns in how people work and live.


Data analysis aims to discover these patterns in data, to help enterprises and individuals predict future trends and behaviors, and to support targeted decisions, making business and production activities forward-looking.

Consider the twenty-four solar terms, or the proverb "Morning glow, stay at home; evening glow, travel a thousand miles." For simple natural phenomena like these, our ancestors gained a great deal of empirical knowledge through induction. The modern world, however, poses far more complex problems, and the volume of data involved far exceeds what the human brain can handle. What can we do?


Data analysis is a product of combining mathematics and computer science. In practice, people process data with computer tools and mathematical knowledge, then use the results to make judgments and take appropriate action.

Further Reading: Real Stories of Data Analysis


Beer and diapers


Most people have heard of Sam Walton. If not, they have certainly heard of Walmart. It was Sam Walton who turned a single department store into the world's largest retail chain. As early as October 1985, Forbes magazine listed him as the richest person in the United States. US President Bush even praised him as an authentic American who showed the spirit of corporate innovation and embodied the American dream.


In 1983, when most retailers were still just beginning to computerize, Walmart had already started working with Hughes Corporation, spending US$24 million to launch a satellite and then investing more than US$600 million to build its computer and satellite systems. It was also an early adopter of technologies such as barcodes, wireless scanners, and computerized inventory tracking. With this complete high-tech information network, Walmart's departments communicated and operated quickly and accurately, and its database system rapidly accumulated a large amount of business data, including a large number of customer purchase records.


One year, with Christmas just around the corner, Walmart's staff were preparing their holiday marketing strategy as usual. This time they used new 'shopping basket analysis' software to analyze the purchasing behavior of a large number of customers. The result stunned them: the item most often bought together with diapers turned out to be beer!


Walmart immediately sent market researchers and analysts to dig into the result. They confirmed that 'beer and diapers' revealed a hidden pattern of American behavior: some young fathers between the ages of 25 and 35 often go to the supermarket after work to buy diapers for their babies, and 30% to 40% of them also buy a few bottles of beer for themselves.


Walmart then took immediate action, moving the mother-and-baby products and the alcoholic beverages, which used to be far apart, closer together to make shopping more convenient for these customers. It also investigated the spending power of families with newborns in each area, adjusted the prices of the two products, and gave baby pacifiers and other small gifts to customers who spent a certain amount in a single purchase. As a result, sales of both diapers and beer increased.
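The kind of shopping basket analysis described in this story can be illustrated with a minimal Python sketch. It is not the software Walmart used. The baskets below are invented toy data, and the function names `pair_counts` and `confidence` are hypothetical. The sketch counts how often each pair of items appears in the same basket, then computes the share of diaper buyers who also bought beer, which is the kind of figure behind the "30% to 40%" above.

```python
from collections import Counter
from itertools import combinations

# Toy baskets invented for illustration only; not real sales data.
baskets = [
    {"diapers", "beer", "milk"},
    {"diapers", "beer"},
    {"diapers", "bread"},
    {"beer", "chips"},
    {"diapers", "beer", "chips"},
]


def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts


def confidence(baskets, antecedent, consequent):
    """Share of baskets containing `antecedent` that also contain `consequent`."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    both = sum(consequent in b for b in with_antecedent)
    return both / len(with_antecedent)


counts = pair_counts(baskets)
top_pair, top_count = counts.most_common(1)[0]
print("Most frequent pair:", top_pair, "appears in", top_count, "baskets")
print("Share of diaper buyers who also buy beer:",
      confidence(baskets, "diapers", "beer"))
```

On real data the same idea is usually applied with minimum support thresholds so that rare pairs are ignored; that refinement is left out here to keep the sketch short.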


Surprise win


In the quarter-finals of the 2006 World Cup, Argentina and Germany were still level after 120 minutes. Before the penalty shootout began, the veteran goalkeeper Kahn handed a note to Lehmann. Lehmann looked at the note before each penalty he faced. As a result, he judged the direction of every Argentine penalty correctly. Except for two penalties that were placed too well to stop, he saved all the rest, and Argentina was knocked out.


The question is, what exactly was written on that note?


It recorded the penalty-taking habits of Argentina's Cruz, Ayala, Rodriguez and Cambiasso. Germany's goalkeeping coach Kopke predicted the direction of the Argentine players' penalties so accurately not because he had some extraordinary gift for divination. That hastily scribbled penalty note came from the day-and-night efforts of the data analysis team at the Cologne Institute of Physical Education, Germany.


The analysis team collected video footage of 13,000 penalty kicks taken by the Argentine team and entered all of the data into a database. Based on the data from Argentina's shooting practice, they found behavioral characteristics that describe each shooting action, such as "Ayala, short approach, bottom right; Riquelme, diagonal approach, bottom right; Maxi, long approach, top left; Cambiasso, long approach, right; Sorin, short approach, bottom right; Tevez, short approach, middle..."


These behavioral characteristics describe who among the Argentine players takes penalties and how. A few more specific features were eventually extracted from them and written on the note. It was this note that sent the German team into the semi-finals. The patterns summarized on that small note are the result of data mining and analysis.
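Reducing many recorded kicks to one line per player can be sketched in Python as follows. This only illustrates the idea and is not the method the analysis team actually used. The kick records are invented, and `build_note` is a hypothetical helper that keeps each player's most frequent combination of approach and direction.

```python
from collections import Counter, defaultdict

# Kick records invented for illustration: (player, approach, direction).
kicks = [
    ("Ayala", "short", "bottom right"),
    ("Ayala", "short", "bottom right"),
    ("Ayala", "long", "top left"),
    ("Cambiasso", "long", "right"),
    ("Cambiasso", "long", "right"),
    ("Cambiasso", "short", "left"),
]


def build_note(kicks):
    """Return each player's most frequent (approach, direction) combination."""
    tally = defaultdict(Counter)
    for player, approach, direction in kicks:
        tally[player][(approach, direction)] += 1
    # most_common(1) returns a list holding one ((approach, direction), count) pair.
    return {player: counter.most_common(1)[0][0]
            for player, counter in tally.items()}


note = build_note(kicks)
for player, (approach, direction) in note.items():
    print(f"{player}: {approach} approach, {direction}")
```

A real analysis would also need enough kicks per player for the most frequent pattern to be meaningful; with only a handful of records the result is little more than a guess.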

