A missing-value rate below 10% is generally considered acceptable.
A missing value occurs when the value of one or more attributes in a record of the dataset is absent or incomplete.
Missing values arise for various reasons, which fall into two main groups: mechanical causes and human causes.
Mechanical causes are failures in data collection or storage, such as a storage failure, memory corruption, or an equipment fault that prevents data from being collected for a period of time.
Human causes are subjective errors, historical limitations, or intentional concealment. For example, in market research a respondent may refuse to answer a question or give an invalid answer, or data entry staff may make a mistake and skip an entry.
In a data table, missing values most commonly appear as empty (null) cells or as error indicators.
How to quickly find all missing values:
1: Go To Special: choose Home > Editing > Find & Select > Go To Special, or press Ctrl+G to open the Go To dialog and click Special. Then select Blanks and click OK.
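Outside Excel, the same check can be sketched in a few lines of Python. This is only an illustration and not part of the original notes: the sample table, the set of markers treated as missing (MISSING_MARKERS), and the helper name find_missing are all assumptions made for the example. It lists the position of every missing cell and computes the share of missing cells, which can then be compared with the 10% guideline.

```python
# Hypothetical sample table: each inner list is one record (row).
rows = [
    ["Alice", 34, "Beijing"],
    ["Bob", None, "Shanghai"],
    ["Carol", 29, ""],
    ["Dave", "#N/A", "Shenzhen"],
]

# Assumed markers for "missing": blank cells and a few Excel error indicators.
MISSING_MARKERS = {None, "", "#N/A", "#VALUE!", "#DIV/0!"}


def find_missing(table):
    """Return (row_index, column_index) for every cell treated as missing."""
    return [
        (r, c)
        for r, row in enumerate(table)
        for c, cell in enumerate(row)
        if cell in MISSING_MARKERS
    ]


missing_cells = find_missing(rows)
total_cells = sum(len(row) for row in rows)
missing_ratio = len(missing_cells) / total_cells

print(missing_cells)
print(f"{missing_ratio:.0%} of cells are missing")
```

In this made-up table 3 of the 12 cells are missing, so the ratio is well above 10%.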
Four ways to handle missing values:
Method 1: Replace the missing value with a sample statistic. The most typical choice is the sample mean of the variable. This is one of the more common approaches in practice.
Method 2: Replace missing values with values predicted by a statistical model. Commonly used models include regression models and discriminant models. However, this requires specialized data analysis software.
Method 3: Delete records that contain missing values. Note that this reduces the sample size.
Method 4: Keep records with missing values and exclude them only from the analyses that need the missing fields. This is relatively feasible when the sample size is large, the number of missing values is small, and the variables are not highly correlated.
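For readers who work outside Excel, Methods 1, 3 and 4 can be sketched in plain Python. This is a rough illustration under stated assumptions, not the procedure from the book: the sample records and the helper names observed, impute_mean and drop_incomplete are invented for the example, and None stands for a missing value. Method 2 is left out because it depends on which statistical model is chosen.

```python
from statistics import mean

# Hypothetical survey records; None marks a missing value.
records = [
    {"age": 25, "income": 4000},
    {"age": 30, "income": None},
    {"age": None, "income": 6000},
    {"age": 35, "income": 8000},
]


def observed(rows, column):
    """Values of one column, skipping records where it is missing."""
    return [row[column] for row in rows if row[column] is not None]


def impute_mean(rows, column):
    """Method 1: replace missing values in a column with the column mean."""
    fill = mean(observed(rows, column))
    return [
        {**row, column: fill if row[column] is None else row[column]}
        for row in rows
    ]


def drop_incomplete(rows):
    """Method 3: delete every record that has at least one missing value."""
    return [row for row in rows if None not in row.values()]


# Method 1: the missing age becomes the mean of 25, 30 and 35.
filled = impute_mean(records, "age")

# Method 3: only the two fully observed records remain.
complete = drop_incomplete(records)

# Method 4: keep all records; each analysis skips only what it cannot use.
mean_age = mean(observed(records, "age"))        # uses 3 of 4 records
mean_income = mean(observed(records, "income"))  # uses 3 of 4 records

print(filled[2]["age"], len(complete), mean_age, mean_income)
```

Note how Method 3 halves this tiny sample, while Method 4 still uses three of the four records for each average.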
2: Ctrl+Enter
Ctrl+Enter is useful for entering the same data or formula into several non-adjacent areas at once.
e.g.: select several non-adjacent cells, type "white", and press Ctrl+Enter; all of the selected cells then contain "white".
Ctrl+Enter works well together with Go To and Find. After selecting the blank cells with F5 or Ctrl+G, type the value you want and press Ctrl+Enter; all of the blank cells are filled with that value.
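The effect of selecting the blanks and pressing Ctrl+Enter can be mimicked in Python roughly as follows. The sample column and the helper name fill_blanks are assumptions for illustration only; None and the empty string both stand for blank cells.

```python
# Hypothetical column of cells; None and "" stand for blank cells.
column = ["red", None, "blue", "", "green"]


def fill_blanks(cells, value):
    """Return a copy of cells with every blank replaced by value."""
    return [value if cell in (None, "") else cell for cell in cells]


filled_column = fill_blanks(column, "white")
print(filled_column)  # ['red', 'white', 'blue', 'white', 'green']
```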
3: Find and replace
When missing values appear as error indicators, Find and Replace can be used to locate and replace them.
Shortcuts: Ctrl+F (Find), Ctrl+H (Replace), Ctrl+G (Go To).
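A comparable find-and-replace pass can be sketched in Python. The notes do not say which error indicators they have in mind, so the set below (standard Excel error values) is an assumption, as are the sample table and the helper name replace_errors.

```python
# Assumed set of Excel error values; adjust it to the data at hand.
ERROR_INDICATORS = {
    "#N/A", "#VALUE!", "#DIV/0!", "#REF!", "#NAME?", "#NUM!", "#NULL!",
}


def replace_errors(table, replacement=None):
    """Return a copy of table with every error indicator replaced."""
    return [
        [replacement if cell in ERROR_INDICATORS else cell for cell in row]
        for row in table
    ]


# Hypothetical table containing two error indicators.
sales = [
    ["Q1", 120, "#DIV/0!"],
    ["Q2", "#N/A", 0.15],
]

cleaned = replace_errors(sales, replacement=0)
print(cleaned)  # [['Q1', 120, 0], ['Q2', 0, 0.15]]
```

Replacing with None instead of 0 keeps the cells recognizable as missing, which suits the deletion or mean-replacement methods described earlier.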
"Who Said Rookies Can't Analyze Data" Study Notes 2 Missing Data Processing