Understanding and summary of statistics and data analysis

In October last year, I started teaching myself data analysis. I read some books, filled in some theory, and worked through a few practice projects. Along the way I have formed some impressions, and this post is an interim summary of them.

My learning habit is to start from the underlying fundamentals. Statistics is the foundation of data analysis, and it is also where I first began building up knowledge. So I summarize statistics first, and then data analysis.


Statistics

The rise of artificial intelligence, machine learning, and data mining has made statistics more prominent than ever before. Yet those who had grasped the truth, or at least part of it, seem to have foreseen this long ago. The evidence is plentiful.

(1) Mr. Chen Xiru (1934-2005), a famous mathematical statistician and educator in China, once said:

An understanding of chance is a component that a modern person's knowledge structure should include, and it is part of a person's humanistic literacy.

(2) A little earlier, the British scholar Wells (1866-1946) said:

Statistical thinking, like the ability to read and write, will one day be a necessity for every citizen.

(3) Most admirable of all is what the modern physicist Schrödinger (1887-1961) said in 1944:

In the last 60 or 80 years, statistical methods and probability calculations have entered branch after branch... This new weapon always starts out with an excuse: it is a remedy for our shortcomings, for our ignorance of details, or for our inability to cope with a large amount of data... but, seemingly inadvertently, attitudes change, and we realize that the individual case is completely uninteresting, regardless of whether detailed knowledge about it is available, and regardless of whether the mathematical problems it raises can be handled. We understand: even if it could be done, tracking thousands of individual cases would not give us a better result than a statistical figure, and what we are actually interested in is the workings of the statistical mechanism.

At that time, electronic computers were not yet generally available, and people's ability to process large amounts of data was still very limited. Yet Schrödinger had already reached this conclusion, and one cannot help but admire his quick thinking and keen insight.

Data analysis

(1)
The most important things in data analysis are establishing an analytical approach and understanding the business background, rather than producing and showing off lots of charts, although charts matter too.

For example, I felt this deeply when solving mathematics problems in my student days. After reading a problem, I usually had a rough sense of what the answer would be, then worked in that direction, and usually got the result quickly. Conversely, if you have no idea in mind after reading the question, and you answer blindly and try things at random, you usually get nowhere. Therefore, mathematical intuition is more important than problem-solving methods and, going further, imagination is more important than knowledge. Similarly, in data analysis, the analytical approach is more important than the analytical tools.

(2)
Data analysis ultimately exists to solve practical problems and to serve production and daily life; it is not analysis for its own sake. There are many data analysis tools to choose from, such as Excel, SQL, Python, R, and Tableau. While learning these tools, you must keep reminding yourself not to become so absorbed in technical details that you forget the original purpose of the analysis. As the saying goes: when the skin is gone, what can the hair attach to?

Detection of learning outcomes

In fact, the quickest and most accurate way to check what you have learned is to see whether you can explain a problem clearly to someone else. This is the most effective method, bar none. After all, in this world there are only teachers who cannot teach, not students who cannot be taught. If others do not understand what I am saying, it only means that I have not fully understood it myself.

Another advantage of this kind of active output is that it deepens your own understanding of the material, which is what the saying "teaching and learning reinforce each other" means.
