The connection between probability statistics and computer technology and its application in daily life


title: The connection between probability statistics and computer technology and its application in daily life
date: 2023-05-16 11:42:26
tags:



Abstract

Probability theory and mathematical statistics is the discipline that studies the statistical regularity of random phenomena, and it is an important foundational course for science and engineering majors. It has its own research questions, concepts, and methods, with rich content and profound results. Its theory is distinctive and abstract: it rests on a rigorous mathematical foundation and is closely connected with many other disciplines. With the rapid development of science and technology, especially computers, it has come to be widely used in economic management, engineering, finance, biology, environmental science, national defense, and other fields.

**Keywords:** probability and statistics; daily life; application; statistical law; mathematics; computer technology


1. Introduction

Probability and statistics take random phenomena in nature as their object of study and are closely tied to everyday life. Analyzing their applications against real-life situations helps people act more deliberately and avoid being deceived. This article works through specific examples, drawing conclusions by calculation and analysis, to better guide everyday decisions.

2. Probability and Statistics

Probability and statistics is the branch of mathematics that studies the statistical laws of random events in nature. It comprises probability theory and statistics. Probability is the basic concept of probability theory; in everyday language it is also called chance, odds, or likelihood. It is a measure of how likely a random event is to occur, usually expressed as a real number between 0 and 1. The closer the probability is to 1, the more likely the event is to happen; the closer it is to 0, the less likely. For example, how likely a poorly prepared student is to pass an exam, or how a coin toss will land, are both questions of probability.

Statistics is a science of data built on probability theory. It explores the regularities in data by describing the data's characteristics. A school's enrollment and employment figures, students' physical fitness test results, and a company's operating costs and profits are all matters of statistics.

Life and work are full of probabilistic data. Probability and statistics are closely tied to real life and play an increasingly important role in daily life, production, and scientific research. The answers to probability and statistics problems in everyday life are sometimes surprising, but by understanding how probability and statistics apply in practice, and using them to see through appearances to the essence of things, we can solve some everyday problems quite simply.

3. Generation of random numbers and pseudo-random numbers

3.1 In everyday life, one recurring problem is how to make an impartial, random choice, which comes down to obtaining a random number. The most important property of a random number is that each number generated has nothing to do with the numbers before it, so the next value is unpredictable. For this reason random numbers are widely used in lotteries and cryptography. True random numbers are generated from physical phenomena: coin flips, dice, spinning wheels, noise in electronic components, nuclear fission, and so on. In truly critical applications, such as cryptography, true random numbers are generally preferred. Generators of this kind are called physical random number generators, but they are technically demanding. In practice, therefore, pseudo-random numbers are often good enough. These sequences only appear random; they are actually produced by a fixed, repeatable calculation. They are not truly random, since they can be computed in advance, but they have statistical properties similar to those of random numbers. A pseudo-random number generator produces such a sequence from a given algorithm and a seed value.
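To make "a fixed algorithm plus a seed" concrete, here is a minimal sketch of a linear congruential generator, one of the simplest pseudo-random schemes. The article does not name this algorithm; both it and the constants (the commonly quoted Numerical Recipes parameters) are assumptions chosen purely for illustration, and a generator this simple is not suitable for cryptography.

```python
def lcg(seed, count, a=1664525, c=1013904223, m=2**32):
    """Return `count` pseudo-random floats in [0, 1) from a linear congruential generator.

    The recurrence is state = (a * state + c) mod m. The constants are
    illustrative defaults, not something specified by the article.
    """
    state = seed % m
    values = []
    for _ in range(count):
        state = (a * state + c) % m
        values.append(state / m)  # scale the integer state into [0, 1)
    return values


first = lcg(seed=42, count=5)
second = lcg(seed=42, count=5)
print(first == second)  # the same seed reproduces the same sequence
```

Because the whole sequence is determined by the seed, anyone who knows the algorithm and the seed can reproduce every "random" value, which is exactly why such numbers are called pseudo-random.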

3.2 On a computer, the Mersenne Twister algorithm can be used to generate high-quality pseudo-random numbers quickly. Two closely related variants are in common use, both based on the Mersenne prime $2^{19937}-1$: MT19937, which uses a 32-bit word size, and MT19937-64, the 64-bit version. For a word length of k bits, the Mersenne Twister generates discrete, uniformly distributed random numbers in the interval $[0, 2^k-1]$. A major advantage of pseudo-random numbers is that computing them requires no special external hardware, which is why they remain standard in computer science. True random numbers, by contrast, must be produced either with specialized equipment that exploits thermal noise, quantum-mechanical effects, or the decay radiation of radioactive elements, or by harvesting unpredictable phenomena such as the position and speed of a user's keystrokes or the path of the mouse pointer.
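As a small illustration, Python's standard `random` module is documented as using the Mersenne Twister as its core generator, so it can be used to check the two properties described above: repeatability from a seed and the output range for k-bit draws. The seed value 2023 is an arbitrary choice for this sketch.

```python
import random

k = 32  # word size in bits, matching the 32-bit MT19937 variant

gen_a = random.Random(2023)  # two generators seeded identically
gen_b = random.Random(2023)

draws_a = [gen_a.getrandbits(k) for _ in range(5)]
draws_b = [gen_b.getrandbits(k) for _ in range(5)]

print(draws_a == draws_b)                        # identical seeds give identical sequences
print(all(0 <= x <= 2**k - 1 for x in draws_a))  # every draw lies in [0, 2^k - 1]
```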

3.3 Computer Implementation of Monte Carlo Method

The Monte Carlo method, also known as the statistical simulation method, uses random numbers (or, more commonly, pseudo-random numbers) to solve computational problems. It is an important numerical method guided by the theory of probability and statistics, proposed in the mid-1940s following advances in science and technology and the invention of the electronic computer. Monte Carlo methods are widely used in financial engineering, macroeconomics, biomedicine, and computational physics (for example, particle transport calculations, quantum thermodynamic calculations, and aerodynamic calculations).

The method is applied in two typical ways. When the problem itself is inherently random, the random process can be simulated directly with the computing power of a computer; in nuclear physics research, for example, this is how neutron transport in a reactor is analyzed. When the problem can instead be recast as a characteristic of some random distribution, such as the probability of a random event or the expected value of a random variable, random sampling is used: the probability of the event is estimated by its observed frequency, or the numerical characteristics of the random variable are estimated by those of the sample, and the estimate is taken as the solution. This second approach is mostly used for complex multidimensional integrals.

Suppose we want to calculate the area of an irregular figure. The more irregular the figure, the harder an analytical calculation (such as integration) becomes. The Monte Carlo method rests on a simple idea: imagine you have a bag of beans, scatter them evenly over a region that contains the figure, and count how many land inside it. The fraction of beans inside the figure measures its area. The smaller the beans and the more of them you scatter, the more precise the result. With a computer program, a large number of uniformly distributed coordinate points can be generated and the points falling inside the figure counted; the area of the figure is then their proportion of the total number of points multiplied by the area of the region in which the points were generated. Using known mathematical relationships and formulas, the values of certain constants and parameters can be computed in the same way.
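The bean-scattering idea can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `estimate_area`, the fixed seed, and the choice of the unit disc as the "irregular" figure are assumptions made here so that the answer can be compared with a known value.

```python
import random


def estimate_area(inside, x_range, y_range, n_points, seed=0):
    """Estimate the area of a region by uniform sampling in a bounding box."""
    rng = random.Random(seed)
    (x0, x1), (y0, y1) = x_range, y_range
    hits = 0
    for _ in range(n_points):
        x = rng.uniform(x0, x1)
        y = rng.uniform(y0, y1)
        if inside(x, y):  # the "bean" landed inside the figure
            hits += 1
    box_area = (x1 - x0) * (y1 - y0)
    return box_area * hits / n_points


# Unit disc inside the square [-1, 1] x [-1, 1]; its true area is pi.
area = estimate_area(lambda x, y: x * x + y * y <= 1.0, (-1, 1), (-1, 1), 200_000)
print(area)
```

Since the disc's area is known to be π, the same program doubles as an estimate of the constant π, which is the kind of "special constant" the paragraph above alludes to.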

In addition, the method can estimate integrals that are difficult to compute directly. The domain of integration is sampled uniformly at random, the function values at the sampled points are averaged, and the average multiplied by the size of the domain gives an approximate value of the integral. The convergence of this estimate follows from the law of large numbers, and the size of its error from the central limit theorem of probability theory. With m sampling points, the statistical error of the approximation shrinks in proportion to $1/\sqrt{m}$; its order depends only on m and does not change with the dimension of the integral. Therefore, when the dimension is high, the Monte Carlo method is often preferable to other numerical methods.
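A minimal sketch of this sample-and-average estimate is given below. The helper name `mc_integrate`, the fixed seed, and the test integrand sin(x) on [0, π] (whose exact integral is 2) are assumptions chosen for illustration.

```python
import math
import random


def mc_integrate(f, a, b, m, seed=0):
    """Approximate the integral of f over [a, b] as (b - a) times the mean of f at m uniform points."""
    rng = random.Random(seed)
    total = sum(f(rng.uniform(a, b)) for _ in range(m))
    return (b - a) * total / m


# The exact value of the integral of sin(x) over [0, pi] is 2.
for m in (100, 10_000, 1_000_000):
    estimate = mc_integrate(math.sin, 0.0, math.pi, m)
    print(m, estimate, abs(estimate - 2.0))
```

The printed errors typically shrink as m grows, though any single run can fluctuate because the estimate is itself random.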

4. Computer Mathematical Software and Mathematical Statistics

Regression analysis deals with two random variables X and Y that are correlated. In physics experiments in particular, the measured points are often plotted to obtain a straight line whose slope is an expression containing the physical constant being measured, so the constant can be recovered from the slope. The most common case is a linear relationship, which can be fitted approximately by the least squares method.

However, because of the limits of the human eye, it is hard to draw the least-squares line accurately by hand, and carrying out the least squares calculation manually is tedious. Mathematical software such as MATLAB can easily produce the parameters of the fitted curve, which greatly reduces the workload and gives high accuracy.
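The article mentions MATLAB; as a language-neutral sketch of what such software computes, the closed-form least-squares line can be written in plain Python (the language used for the code later in this article). The function name `fit_line` and the sample measurements are hypothetical.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-deviations about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical measurements that lie roughly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 6.8, 9.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

In a physics experiment, the fitted slope would then be plugged into the relevant formula to recover the measured constant.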

5. Concrete examples of computer technology's contribution to probability theory and mathematical statistics

The development of computer science and technology has greatly advanced probability theory and mathematical statistics, improving the efficiency and accuracy with which such problems are solved and reducing the effort of computation. People can also use computer programs to simulate experiments that are hard to carry out in real life or that must be repeated a great many times (for example, simulating and demonstrating a coin-tossing experiment), and use the results as experimental data from which to draw more reliable conclusions.
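The coin-tossing experiment mentioned above is easy to simulate. The following is a small sketch; the function name `heads_frequency`, the fixed seed, and the assumption of a fair coin are choices made here for illustration.

```python
import random


def heads_frequency(n_tosses, seed=0):
    """Simulate n_tosses fair coin flips and return the fraction that came up heads."""
    rng = random.Random(seed)
    heads = sum(1 for _ in range(n_tosses) if rng.random() < 0.5)
    return heads / n_tosses


for n in (10, 1_000, 100_000):
    print(n, heads_frequency(n))
```

With more tosses the observed frequency tends to settle near 0.5, an experiment that would take hours by hand but runs in a fraction of a second on a computer.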

Computer simulation can demonstrate the De Moivre-Laplace central limit theorem. By choosing different values of n, one can watch the distribution of the sum of the first n terms of a random sequence gradually approach the normal distribution. If the random variable X ~ B(n, p) with 0 < p < 1 follows a binomial distribution, then as n increases X is approximately normally distributed with mean np and variance np(1-p). The simulation can also illustrate the conclusion of the local limit theorem: the probability that X takes a particular value is approximately equal to the value of the normal density function at that point.
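The local limit statement can be checked numerically without any simulation by comparing the exact binomial probability with the normal density at the same point. This is a sketch under assumed values: p = 0.3 and the three values of n are arbitrary, and the comparison is made only at the integer nearest the mean np.

```python
import math


def binomial_pmf(k, n, p):
    """Exact probability P(X = k) for X ~ B(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def normal_pdf(x, mean, variance):
    """Density of the normal distribution N(mean, variance) at x."""
    return math.exp(-((x - mean) ** 2) / (2 * variance)) / math.sqrt(2 * math.pi * variance)


p = 0.3
for n in (10, 100, 1000):
    k = round(n * p)  # compare at the integer nearest the mean np
    exact = binomial_pmf(k, n, p)
    approx = normal_pdf(k, n * p, n * p * (1 - p))
    print(n, exact, approx, abs(exact - approx) / exact)
```

The last printed column is the relative difference between the two values, which should become smaller as n grows.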

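In the central limit theorem, let $X_1, X_2, \dots, X_n$ be independent and identically distributed random variables with $E(X_i)=\mu$ and $D(X_i)=\sigma^2>0$ for $i=1,2,\dots$. Then

$$\lim_{n\to\infty} P\left(\frac{\sum_{i=1}^{n} X_i - n\mu}{\sigma\sqrt{n}} \le x\right) = \Phi(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-t^2/2}\,dt$$

That is, the mean of the random samples stays close to the population mean, and its distribution is approximately normal.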

Python can then be used for a simple check of the central limit theorem. A specific example is as follows:

```python
import math
import random as rd

import matplotlib.pyplot as plt

n = 30        # number of samples in a single trial
epoch = 1000  # number of trials
res = [0] * (n + 1)  # res[k] counts the trials whose rounded sum equals k

for _ in range(epoch):
    total = 0.0
    for _ in range(n):
        # each sample is sin(U * pi / 2) with U uniform on [0, 1)
        total += math.sin(rd.random() * math.pi / 2)
    res[round(total)] += 1


def autolabel(rects):
    """Write the height of each bar just above it."""
    for rect in rects:
        height = rect.get_height()
        plt.text(rect.get_x() + rect.get_width() / 2.0 - 0.2,
                 1.03 * height, "%s" % int(height))


name_list = list(range(0, n + 1))
num_list = res
# A list of colors is used because recent matplotlib versions reject color='rgb'.
autolabel(plt.bar(range(len(num_list)), num_list,
                  color=["r", "g", "b"], tick_label=name_list))
plt.show()
```

Running the program produces a bar chart showing how many trials gave each rounded sum (figure omitted).

Judging from the results, the observed distribution is very close to the theoretical one, so the central limit theorem is easy to check in this way.

It can be seen that computers make probability experiments easier to carry out, can demonstrate relatively abstract concepts and theories, deepen our understanding, and provide powerful tools and methods in support of theoretical proof.

6. Concrete application examples of probability and statistics in computer technology

Pattern recognition in computing studies how to use mathematical techniques to process and interpret patterns automatically by computer.

The Bayesian decision-making applied here uses the posterior probability computed by the Bayesian formula to make a classification decision: given an observation, it decides which class the sample should be assigned to.

The idea and theory behind the Bayesian formula used here are as follows.

$$P(X \mid \omega_i)$$

denotes the probability that sample X appears in a given class $\omega_i$. The sample may also appear in other classes, so this is written as a conditional probability: the probability of observing X given a specific class.

When a subjective probability has been inferred from experience and related material but its accuracy is uncertain, the Bayesian formula from probability theory can be used to revise it. The probability before revision is called the prior probability, and the probability after revision is called the posterior probability.

· Posterior probability:

$$P(\omega_i \mid X) = \frac{P(X \mid \omega_i)\,P(\omega_i)}{P(X)}$$

· where

$$P(X) = \sum_{j} P(X \mid \omega_j)\,P(\omega_j)$$

Here P(X) is the sum, over all classes, of the probability that sample X appears in that class.
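As a worked illustration of the Bayesian formula, the sketch below turns prior probabilities and class-conditional probabilities into posterior probabilities. The function name `posterior` and the two-class numbers are hypothetical, chosen only so the arithmetic is easy to follow.

```python
def posterior(priors, likelihoods):
    """Return P(w_i | X) for every class from P(w_i) and P(X | w_i) via Bayes' formula."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(joint)  # P(X): total probability of observing X over all classes
    return [j / evidence for j in joint]


# Hypothetical two-class numbers, chosen only for illustration.
priors = [0.9, 0.1]       # P(w_1), P(w_2)
likelihoods = [0.2, 0.6]  # P(X | w_1), P(X | w_2)
post = posterior(priors, likelihoods)
print(post)
```

Here the observation is three times as likely under the second class, yet the first class keeps the larger posterior (0.18 / 0.24 against 0.06 / 0.24) because its prior is so much higher.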

Bayesian decision-making mainly includes two methods: minimum error rate Bayesian decision-making and minimum risk Bayesian decision-making. The former applies under ideal conditions, or when all classes are of equal importance; the latter also takes into account the cost of each decision and the unequal importance of the classes.

Minimum error rate Bayesian decision:

In general pattern recognition problems, people usually want to minimize classification errors; that is, the goal is the minimum error rate. Starting from this requirement and applying the Bayesian formula from probability theory, we obtain the classification rule that minimizes the error rate, which is called the minimum error rate Bayesian decision.

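Define the loss as the 0-1 loss:

![img](https://farsblog.oss-cn-beijing.aliyuncs.com/PicGo/202305161136655.gif)

The conditional risk then simplifies to

$$R(\alpha_i \mid x) = \sum_{j \ne i} P(\omega_j \mid x) = 1 - P(\omega_i \mid x)$$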

Selecting the class with the largest posterior probability maximizes the chance of a correct judgment and minimizes the chance of a mistake; that is, the minimum error rate Bayesian decision is the same as the maximum a posteriori Bayesian decision. Because probabilities are non-negative, if the error rate is minimized for every individual decision, the total error rate is minimized as well.
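The minimum error rate rule can be sketched as "compute the posteriors, take the largest". The function name `min_error_decision` and the input numbers below are hypothetical and serve only to illustrate the rule.

```python
def min_error_decision(priors, likelihoods):
    """Pick the class with the largest posterior; return (class index, conditional error)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(joint)
    posteriors = [j / evidence for j in joint]
    best = max(range(len(posteriors)), key=lambda i: posteriors[i])
    # Under the 0-1 loss, the conditional risk of choosing class i is 1 - P(w_i | x).
    return best, 1.0 - posteriors[best]


# Hypothetical two-class example.
decision, error = min_error_decision(priors=[0.9, 0.1], likelihoods=[0.2, 0.4])
print(decision, error)
```

The second returned value is the probability of being wrong for this particular observation, which is the quantity the rule minimizes.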

Minimum risk Bayesian decision:

Depending on the situation, what matters may be not only the error rate but also the loss that an error causes. Take cancer cell identification as an example: we should care not only whether a decision is wrong, but also about the loss or risk the wrong decision brings. If a normal cell is misjudged as a cancer cell, the patient suffers psychological stress and unnecessary further examination, which is a loss. Conversely, if a cancer cell is misjudged as a normal cell, the loss is far greater, because the patient may miss the precious opportunity for early detection, with potentially life-threatening consequences. Treating these two types of error equally is in many cases inappropriate.

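Let the loss incurred by taking decision $\alpha_i$ for a vector x whose actual state is $\omega_j$ be

$$\lambda(\alpha_i, \omega_j)$$

Calculate the conditional risk:

$$R(\alpha_i \mid x) = \sum_{j} \lambda(\alpha_i, \omega_j)\,P(\omega_j \mid x)$$

Choose the decision with the least risk:

$$\alpha = \arg\min_{i} R(\alpha_i \mid x)$$

That is, select the class whose decision carries the least risk. When different decisions have different costs, each decision is weighted accordingly.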
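The minimum risk rule can be sketched as follows: weight each posterior by the loss of the corresponding mistake, sum to get the conditional risk of every decision, and pick the smallest. The function name `min_risk_decision`, the posterior values, and the loss table are all hypothetical numbers chosen to mirror the cancer-cell discussion above.

```python
def min_risk_decision(posteriors, loss):
    """Return (best decision index, list of conditional risks).

    loss[i][j] is the cost of taking decision i when the true class is j.
    """
    risks = [sum(l_ij * p_j for l_ij, p_j in zip(row, posteriors)) for row in loss]
    best = min(range(len(risks)), key=lambda i: risks[i])
    return best, risks


# Hypothetical numbers: class/decision 0 = normal cell, 1 = cancer cell.
posteriors = [0.818, 0.182]
loss = [
    [0.0, 6.0],  # deciding "normal": no cost if correct, high cost if the cell is cancerous
    [1.0, 0.0],  # deciding "cancer": a false alarm costs 1
]
decision, risks = min_risk_decision(posteriors, loss)
print(decision, risks)
```

With these assumed losses the rule chooses "cancer" even though "normal" has the larger posterior, because missing a cancer cell is priced six times higher than a false alarm. With a 0-1 loss table the same function reduces to the minimum error rate decision.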

It can be seen that probability and statistics play a crucial role in computer technology. In pattern recognition, Bayesian decision-making yields the optimal decision: it minimizes the error rate, or gives the best result under another chosen criterion such as minimum risk. Probability and statistics are also applied in many other areas of computer technology.

7. The significance of studying the application of probability and statistics in real life

Probability and statistics are closely tied to people's everyday behavior, and people encounter them constantly in daily life. They appear, for example, in the insurance industry, in lotteries, and in games of chance. Consumers who lack the relevant knowledge often make irrational choices that harm their own interests. In these activities, merchants often exploit probability and statistics together with some consumers' wishful thinking: the consumers believe they will be the lucky ones, while the merchants reap large profits. Discussing and analyzing probability and statistics problems in the context of real life therefore helps people understand the true nature of such activities, act more deliberately, and make more rational choices as consumers, so as to protect their legitimate interests and avoid unnecessary losses.

8. Summary

Probability and statistics are everywhere in our daily production and work, and their theory and methods also pervade computer science and technology. Their range of application is very wide: they can help us assess lottery odds, analyze exam scores and interview pass rates, judge the likely outcomes of sports competitions, support product promotion, and more. Although we cannot predict exactly how things will develop, probability and statistics let us deal better with uncertainty and bring convenience to our life, production, and work. We should therefore learn probability and statistics well and analyze the chance events in life rationally, so as to make full use of their guiding role and make our own contribution to human development.

Origin blog.csdn.net/qq_35798433/article/details/130702358