[Statistics Notes 1] Basic concepts of statistics

Statistics is a methodological discipline concerned with effectively collecting, processing, analyzing, and interpreting data in order to discover the regularities in the data and so make better decisions. The regularities discovered through statistics are usually required to be objective, applicable, accurate, and timely.

Methods of analyzing data are divided into descriptive statistics and inferential statistics.

  • Descriptive statistics

① Descriptive statistics processes the collected data into useful information and presents it as numerical values, tables, or graphs.

② Descriptive statistics is the foundation: it provides the necessary information for statistical inference, statistical consulting, and statistical decision-making.

  • Inferential statistics uses the characteristics of sample data to estimate or test the characteristics of the population.
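The difference between the two can be sketched in a few lines of Python. The survey ratings and incomes below are hypothetical values invented for illustration; the descriptive part tabulates and summarizes the sample, and the inferential part uses a sample proportion as a simple point estimate of the unknown population proportion.

```python
from collections import Counter
from statistics import mean, median

# Hypothetical sample: satisfaction ratings from 10 surveyed residents.
ratings = ["satisfied", "neutral", "satisfied", "dissatisfied", "satisfied",
           "neutral", "satisfied", "satisfied", "neutral", "satisfied"]
# Hypothetical sample: monthly incomes (yuan) of the same 10 residents.
incomes = [4200, 5100, 4800, 3900, 6000, 5300, 4700, 5200, 4500, 5300]

# Descriptive statistics: summarize the collected data as a table and as values.
frequency_table = Counter(ratings)
for category, count in frequency_table.most_common():
    print(f"{category:<14}{count}")
print("mean income:", mean(incomes))
print("median income:", median(incomes))

# Inferential statistics: use the sample to estimate a population characteristic.
# The sample proportion serves as a point estimate of the (unknown) proportion
# of satisfied residents in the whole population.
estimated_share_satisfied = frequency_table["satisfied"] / len(ratings)
print("estimated share satisfied:", estimated_share_satisfied)
```

A point estimate is only the simplest form of inference; interval estimates and hypothesis tests build on the same sample summaries.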

How statistics solves practical problems

The basic statistical approach to solving a practical problem is:
       ① translate the practical problem into a statistical problem;
       ② establish an effective indicator system;
       ③ collect data;
       ④ select or create effective statistical methods to process the collected data and display their characteristics;
       ⑤ based on the characteristics of the collected data, combine qualitative and quantitative knowledge to make reasonable inferences about the characteristics of the population;
       ⑥ give advice for better decisions based on the inferences;
       while solving the problem, steps ②-⑥ are repeated.


    
Several basic concepts in statistics

Population, unit, and sample

  • Population: a statistical population is a whole defined for a particular purpose, made up of a large number of objectively existing individual things that share some common nature.

(1) Homogeneity is the basic criterion for defining a statistical population, and it is set by the purpose of the statistical study. When the research purpose differs, the population defined generally differs too, and the meaning of homogeneity changes accordingly.

(2) A statistical population should also be large: it should consist of a sufficient number of homogeneous units.

  • Population unit (unit for short): each of the individuals that make up the population.
  • Sample: a collection made up of part of the units of the population is called a sample (also known as a subsample). Each unit in the sample is called a sample unit, and the number of units in the sample is called the sample size.

The purpose of solving a problem statistically is to understand the characteristics of the population. However, when the investigation is destructive, or because of cost, time, and other considerations, it is often unnecessary or impossible to investigate every unit of the population. In such cases a sample is investigated instead.
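As a minimal sketch of these terms, the following Python snippet draws a sample from a population. The population of 1,000 numbered units, the sample size of 50, and the use of simple random sampling without replacement are all assumptions made for illustration; the text above does not prescribe a sampling method.

```python
import random

# Hypothetical population: 1,000 homogeneous units (e.g. products in one batch),
# identified here by unit number.
population = list(range(1, 1001))

# Draw a simple random sample without replacement (an assumed sampling method).
rng = random.Random(42)   # fixed seed so the sketch is repeatable
sample_size = 50          # the sample size n (assumed)
sample = rng.sample(population, sample_size)

print("population size:", len(population))
print("sample size:", len(sample))
print("first five sampled units:", sample[:5])
```

Each element of `sample` is one sample unit, and `len(sample)` is the sample size.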

Characteristics, indicators (parameters), and statistics

  • Characteristic: an attribute or feature shared by the units of the population is called a characteristic. By how they are expressed, characteristics are divided into qualitative characteristics and quantitative characteristics.

       ① A qualitative characteristic describes an attribute of the unit and can only be expressed in non-numerical terms. For example: the category of a product; the gender of a resident.
       ② A quantitative characteristic describes a quantity associated with the unit and is expressed as a numerical value. For example: the price and sales volume of a product; the income of a resident.

  • Parameter (indicator): the concept and value that describe a quantitative feature of the population are called a statistical indicator, also known as a parameter. A statistical indicator consists of two basic elements: the indicator concept and the indicator value. The indicator concept is an abstraction of the nature of the phenomenon under study and specifies which quantitative feature of the population is meant. For example: a resident population of 10 million; total revenue of 60 billion yuan.

By what they express, statistical indicators are divided into quantity indicators and quality indicators.

① Statistical indicators that reflect the total scale or overall level of a phenomenon are called quantity indicators and are expressed as absolute numbers. For example: a total population of 10 million; a total income of 60 billion yuan.
② Statistical indicators that reflect the relative level or quality of a phenomenon are called quality indicators and are expressed as averages or relative numbers. For example: an average wage of 5,000 yuan among an enterprise's workers; a worker attendance rate of 93%. Quality indicators are derived from aggregate indicators and reflect the internal relationships and ratios within a phenomenon.
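The way quality indicators are derived from aggregates can be illustrated with a short Python sketch. The totals below are hypothetical and were chosen only so that the results match the 5,000 yuan average wage and 93% attendance rate used as examples above.

```python
# Hypothetical aggregates (quantity indicators, expressed as absolute numbers).
total_wages_yuan = 5_000_000      # total monthly wage bill
number_of_workers = 1_000         # total number of workers
scheduled_worker_days = 22_000    # worker-days scheduled
attended_worker_days = 20_460     # worker-days actually attended

# Quality indicators are derived from aggregates as averages or relative numbers.
average_wage_yuan = total_wages_yuan / number_of_workers
attendance_rate = attended_worker_days / scheduled_worker_days

print(f"average wage: {average_wage_yuan:.0f} yuan")
print(f"attendance rate: {attendance_rate:.0%}")
```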

A single indicator cannot reflect the overall picture, so an indicator system is needed. A statistical indicator system is an organic whole made up of a series of interrelated statistical indicators; it reflects the interdependence and mutual constraints among the various aspects of the phenomenon under study.

 

  • Statistic

       A statistic is a known function of the sample observations, used to describe the characteristics of the sample.

       Different samples yield different observed values of a statistic. The sample mean, sample variance, and sample proportion are all statistics. After a sample is drawn, the observed value of a statistic is usually used as an estimate of the corresponding population parameter. (For example, a car manufacturer draws 16 cars from a production batch and uses the average mileage and pass rate of these cars to estimate the average mileage and pass rate of the whole batch.)
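The car example can be sketched in Python with the standard `statistics` module. The 16 mileage readings and pass/fail results are hypothetical numbers invented for illustration, with no particular unit implied.

```python
from statistics import mean, variance

# Hypothetical observations for a sample of 16 cars drawn from one batch.
mileage = [15.2, 14.8, 15.5, 15.0, 14.6, 15.3, 15.1, 14.9,
           15.4, 14.7, 15.0, 15.2, 14.8, 15.1, 15.3, 14.9]
passed = [True, True, True, False, True, True, True, True,
          True, True, True, True, False, True, True, True]

# Each statistic is a known function of the sample observations.
sample_mean = mean(mileage)                     # estimates the batch's average mileage
sample_variance = variance(mileage)             # sample variance (divides by n - 1)
sample_proportion = sum(passed) / len(passed)   # estimates the batch's pass rate

print("sample mean:", round(sample_mean, 2))
print("sample variance:", round(sample_variance, 4))
print("sample proportion passed:", sample_proportion)
```

A different sample of 16 cars would generally give different values for all three statistics, while the population parameters they estimate stay fixed.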


Data

Variables and variable values
1. A concept that describes some feature or quantity of a phenomenon is called a variable. The characteristics, indicators, and statistics described above are all variables.
2. The specific manifestation of a variable is its variable value. Data consist of variables and their values; they are the facts or quantitative evidence that reflect objective things.
    For example: income is a variable, and a particular amount of income is a variable value.
3. All the data collected for a particular study, taken together, are called a data set.
4. Depending on whether their values are determinate, variables are divided into deterministic variables (affected by determinate factors that are clear, interpretable, and controllable) and random variables (affected by many uncertain factors, such as the time at which an employee gets up).

Measurement scales of data

Data are collected using one of four measurement scales, from low to high: the nominal scale, the ordinal scale, the interval scale, and the ratio scale. The scale of measurement determines which methods of data analysis and processing are appropriate.
1. The nominal scale measures an objective phenomenon in terms of unordered categories. Its main mathematical characteristic is "=" or "≠". For example, the gender of a resident is measured as the categories male and female, and aircraft are measured as fighters, bombers, reconnaissance aircraft, and so on. When numbers are used here, they serve only as codes for unordered categories.
2. The ordinal scale is a non-numerical measurement of an objective phenomenon in terms of ordered categories. Its main mathematical characteristic is "<" or ">". For example, resident satisfaction can be measured in five categories: very satisfied, satisfied, neutral, dissatisfied, and very dissatisfied. When numbers are used here, they serve only as codes for ordered categories.
3. The interval scale is a numerical measurement of an objective phenomenon in which the distance between values is meaningful. It expresses the difference between quantities as a precise value; its main mathematical characteristics are "+" and "-". For example, aggregate indicators are measured on an interval scale. (A value of 0 does not mean absence.)
4. The ratio scale is a measurement of an objective phenomenon in which the ratio of two values is meaningful. Its main mathematical characteristics are "×" and "÷". For example, the relative numbers and averages among quality indicators are measured on a ratio scale. (A value of 0 means absence.)
5. Classification of data
(1) Data measured on the nominal or ordinal scale are collectively referred to as qualitative data. Qualitative variables are variables that take qualitative data.
(2) Data measured on the interval or ratio scale are collectively referred to as quantitative data. Quantitative variables are variables that take quantitative data.
         Depending on whether their values are continuous, quantitative variables are divided into continuous variables and discrete variables.
         ① A continuous variable takes values continuously within a given interval, so its values cannot be listed one by one. For example: the position of a bullet hole; the service life of a military aircraft.
         ② A discrete variable takes values that are separated from one another, so they can be enumerated. For example: the number of products.
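One way to keep the four scales straight is to encode them in order and look up which operations are meaningful at each level. The sketch below is illustrative only: the names `Scale`, `meaningful_operations`, and `data_kind` are hypothetical, and it assumes the usual convention that a higher scale keeps every operation of the scales below it.

```python
from enum import IntEnum

class Scale(IntEnum):
    """The four measurement scales, ordered from low to high."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4

# Operations that first become meaningful at each scale.
NEW_OPERATIONS = {
    Scale.NOMINAL: ["=", "≠"],
    Scale.ORDINAL: ["<", ">"],
    Scale.INTERVAL: ["+", "-"],
    Scale.RATIO: ["×", "÷"],
}

def meaningful_operations(scale: Scale) -> list:
    """Collect the operations of this scale and of every lower scale."""
    ops = []
    for s in Scale:
        if s <= scale:
            ops.extend(NEW_OPERATIONS[s])
    return ops

def data_kind(scale: Scale) -> str:
    """Nominal and ordinal data are qualitative; interval and ratio data are quantitative."""
    return "qualitative" if scale <= Scale.ORDINAL else "quantitative"

# Gender is nominal and satisfaction is ordinal, as in the examples above.
print(data_kind(Scale.NOMINAL), meaningful_operations(Scale.NOMINAL))
print(data_kind(Scale.ORDINAL), meaningful_operations(Scale.ORDINAL))
print(data_kind(Scale.RATIO), meaningful_operations(Scale.RATIO))
```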

Types of data


Depending on the angle from which an objective phenomenon is observed, statistical data can be divided into cross-sectional data, time-series data, and panel data.
1. Cross-sectional data, also known as static data, are obtained by observing different units of the same population at the same point in time. For example, the 2014 gross income of each province, municipality, and autonomous region is cross-sectional data.
2. Time-series data, also known as dynamic data, are obtained by observing the same population in chronological order over a certain period. For example, China's gross income for each year of the Twelfth Five-Year Plan period, arranged in order, is time-series data.
3. Panel data are two-dimensional data collected across both time and cross-section. For example, the total output value of 30 enterprises for 2005-2014: the panel consists of 10 years of data for 30 enterprises, 300 observations in total. Viewed for a single year, it is a cross-section of the total output values of the 30 enterprises.
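The shape of the panel in this example can be checked with a short Python sketch. The enterprise identifiers `E01` to `E30` are hypothetical, and the observed values are left as `None` placeholders because no actual figures are given.

```python
# Hypothetical identifiers; the values are placeholders, not real data.
enterprises = [f"E{i:02d}" for i in range(1, 31)]   # 30 enterprises
years = list(range(2005, 2015))                     # 2005-2014, 10 years

# Panel data: one observation per (enterprise, year) pair.
panel = {(e, y): None for e in enterprises for y in years}

# Cross-section: all enterprises observed in one year.
cross_section_2014 = {e: panel[(e, 2014)] for e in enterprises}

# Time series: one enterprise observed in chronological order.
time_series_e01 = [(y, panel[("E01", y)]) for y in years]

print("panel observations:", len(panel))
print("cross-section size:", len(cross_section_2014))
print("time-series length:", len(time_series_e01))
```

Fixing the year gives a cross-section, and fixing the enterprise gives a time series, which is why panel data are described as two-dimensional.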

 

Origin blog.csdn.net/seagal890/article/details/104889978