I. Handling missing values
pandas uses the floating-point value NaN (Not a Number) to represent missing values; missing values are referred to as NA (not available).
Common methods for handling NA:
dropna: filters axis labels according to whether the values for each label contain missing data, with an optional threshold for how much missing data is tolerated.
fillna: fills missing data with a specified value or with an interpolation method (e.g., 'ffill' or 'bfill').
isnull: returns Boolean values indicating which values are missing.
notnull: the inverse of isnull.
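As a minimal sketch of the detection methods listed above, the snippet below builds a small Series and inspects it with isnull and notnull. The sample values and the variable names (s, missing_mask, present_mask) are hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: two of the four entries are missing.
# pandas stores both np.nan and None as NaN in a float Series.
s = pd.Series([1.0, np.nan, 3.5, None])

missing_mask = s.isnull()    # True where the value is missing
present_mask = s.notnull()   # element-wise negation of isnull

print(missing_mask.tolist())    # [False, True, False, True]
print(int(missing_mask.sum()))  # 2 missing values
```

Summing the Boolean mask is a quick way to count the missing entries before deciding whether to drop or fill them.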
1. Filtering (data.dropna())
Drop rows containing missing values (the default): data.dropna() removes every row that contains at least one missing value. For a Series, this is equivalent to data[data.notnull()].
By passing parameters, you can:
Drop only the rows in which all values are NA: data.dropna(how='all')
Drop only the columns in which all values are NA: data.dropna(axis=1, how='all')
Keep only the rows that contain at least a given number of non-NA observations: data.dropna(thresh=2)
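The dropna options above can be sketched on a small DataFrame. The frame df and its column names are hypothetical; the comments state which labels survive each call for this particular data.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: row 1 is entirely NA, column "c" is entirely NA.
df = pd.DataFrame({
    "a": [1.0, np.nan, np.nan, 4.0],
    "b": [2.0, np.nan, 5.0, 6.0],
    "c": [np.nan, np.nan, np.nan, np.nan],
})

# Default: drop any row with at least one NA. Every row has NA in "c",
# so nothing is left here.
any_na = df.dropna()

# Drop only rows where every value is NA (row 1).
all_na_rows = df.dropna(how="all")

# Drop only columns where every value is NA (column "c").
all_na_cols = df.dropna(axis=1, how="all")

# Keep rows with at least 2 non-NA values (rows 0 and 3).
at_least_two = df.dropna(thresh=2)

# For a Series, dropna matches Boolean indexing with notnull.
b = df["b"]
same = b.dropna().equals(b[b.notnull()])

print(len(any_na), list(all_na_rows.index), list(all_na_cols.columns),
      list(at_least_two.index), same)
```

Note that each call returns a new object and leaves df unchanged.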
2. Filling (data.fillna())
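A minimal sketch of filling, assuming a small hypothetical Series s. It shows filling with a constant, with a computed statistic, and by propagating neighbouring values. The forward and backward fills are written with the ffill() and bfill() methods, which correspond to the 'ffill' and 'bfill' options; recent pandas releases deprecate passing these as fillna(method=...).

```python
import numpy as np
import pandas as pd

# Hypothetical sample data with gaps at positions 0, 2 and 3.
s = pd.Series([np.nan, 2.0, np.nan, np.nan, 5.0])

filled_const = s.fillna(0)        # replace every NA with a constant
filled_mean = s.fillna(s.mean())  # mean() skips NA, so this is 3.5
filled_forward = s.ffill()        # carry the last valid value forward
filled_backward = s.bfill()       # pull the next valid value backward

print(filled_const.tolist())     # [0.0, 2.0, 0.0, 0.0, 5.0]
print(filled_mean.tolist())      # [3.5, 2.0, 3.5, 3.5, 5.0]
print(filled_backward.tolist())  # [2.0, 2.0, 5.0, 5.0, 5.0]
```

A forward fill leaves the first entry as NaN here, because there is no earlier valid value to carry forward.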
II. Data transformation
1. Deletion
2. Transformation
3. Replacement
III. String manipulation
1. String object methods
2. Regular expressions
3. Vectorized string functions