An index is the row label of a Series or DataFrame, and there may be one or more of them. A Series or DataFrame with one index has a single-level index; with several indexes, it has a multi-level index. Like a column of data in a Series or DataFrame, an index can hold a variety of data types. The index types are: the numeric index (for example, an integer index), the categorical index (CategoricalIndex), the date and time indexes (DatetimeIndex, TimedeltaIndex), the period index (PeriodIndex), the range index (RangeIndex), the interval index (IntervalIndex), and the multi-level index (MultiIndex).
A multi-level index (MultiIndex) means that a Series or DataFrame has several index levels, much like a composite key in a two-dimensional relational table; that is, the index of the Series or DataFrame itself has a structure similar to a DataFrame.
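To make the idea of a multi-level index concrete, here is a minimal sketch. The level names "city" and "year" and the data values are made up for illustration.

```python
import pandas as pd

# Two index levels: a hypothetical "city" level and a "year" level.
mi = pd.MultiIndex.from_tuples(
    [("Paris", 2020), ("Paris", 2021), ("Rome", 2020)],
    names=["city", "year"],
)
s = pd.Series([10, 12, 7], index=mi)

n_levels = s.index.nlevels   # number of index levels
paris = s.loc["Paris"]       # select on the first level only
```

Selecting on the first level drops it from the result, so `paris` is indexed by year alone.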
The most commonly used indexes are the integer index, the categorical index, and the date index.
First, the basic constructor
The most basic constructor for creating an index is:
pandas.Index(data=None, dtype=None, copy=False, name=None, tupleize_cols=True)
Parameter Notes:
- data: a one-dimensional array-like object used to create the index; the order of its elements is preserved.
- dtype: the type of the index elements. If it is not given, pandas infers the type from the data and falls back to object.
- copy: whether to copy the input data; the default is False.
- name: the name of the index; the default is None.
- tupleize_cols: if True (the default), try to create a multi-level index (MultiIndex) when possible, for example from a list of tuples.
For example, to create an integer index (pandas 2.0 and later print the result as Index([1, 2, 3], dtype='int64')):
>>> pd.Index([1, 2, 3])
Int64Index([1, 2, 3], dtype='int64')
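The name and tupleize_cols parameters are easier to follow with a small example. The sketch below uses made-up labels and shows how the same list of tuples yields either a MultiIndex or a flat index of tuples.

```python
import pandas as pd

# A named index; the name is None unless one is given.
idx = pd.Index(["a", "b", "c"], name="letters")
unnamed = pd.Index([1, 2, 3])

# A list of tuples becomes a MultiIndex by default (tupleize_cols=True) ...
multi = pd.Index([("x", 1), ("y", 2)])

# ... and stays a flat index of tuple objects when tupleize_cols=False.
flat = pd.Index([("x", 1), ("y", 2)], tupleize_cols=False)
```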
Second, index properties
Like a column in a relational table, an index has specific properties:
- values: the values of the index, as an array
- is_monotonic_increasing, is_monotonic_decreasing: whether the values are monotonic (is_monotonic is an older alias of is_monotonic_increasing)
- is_unique, has_duplicates: whether all values are unique, and whether there are duplicate values
- hasnans: whether the index contains NA values
- dtype: the data type of the index elements
- name: the name of the index
- names: if the index is multi-level (MultiIndex), the name of each level
- size: the number of index elements
- T: the transpose of the index (for an index, this is the index itself)
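The properties above can be read directly from any index. A minimal sketch with an illustrative integer index that contains one duplicate:

```python
import pandas as pd

idx = pd.Index([1, 2, 2, 5], name="id")

values = idx.values                     # underlying NumPy array
increasing = idx.is_monotonic_increasing  # True: values never decrease
unique = idx.is_unique                  # False: 2 appears twice
dupes = idx.has_duplicates              # True
has_na = idx.hasnans                    # False: no missing values
size = idx.size                         # 4 elements
```

Note that is_monotonic_increasing is not strict: repeated values still count as increasing.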
Third, missing values in an index
To check for missing values, isna() tests each value of the index and returns True where the value is NA and False where it is not. notna() does the opposite: it returns True where the value is not NA and False where it is NA.
Index.isna(self)
Index.notna(self)
To fill missing values, fillna() replaces NA with a scalar value; the downcast parameter requests that the result be downcast to a narrower type where possible:
Index.fillna(self, value=None, downcast=None)
To delete missing values, dropna() takes a how parameter that specifies how missing values are removed; the valid values are 'any' and 'all':
Index.dropna(self, how='any')
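The four missing-value functions can be compared side by side. A minimal sketch on an illustrative float index with one NaN (the downcast parameter is left at its default here):

```python
import numpy as np
import pandas as pd

idx = pd.Index([1.0, np.nan, 3.0])

mask = idx.isna()          # True where the value is NA
present = idx.notna()      # True where the value is not NA
filled = idx.fillna(0.0)   # NaN replaced by the scalar 0.0
dropped = idx.dropna()     # NaN entry removed
```

isna() and notna() return boolean NumPy arrays, while fillna() and dropna() return new Index objects and leave idx unchanged.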
Fourth, sorting an index
argsort() sorts by index value, but returns the integer positions that would sort the index rather than the sorted index itself; the *args and **kwargs parameters are passed on to numpy.ndarray.argsort.
Index.argsort(self, *args, **kwargs)
sort_values() sorts by index value and returns a sorted copy; the return_indexer parameter indicates whether the sorting positions are returned as well:
Index.sort_values(self, return_indexer=False, ascending=True)
For example, take the following index:
>>> idx = pd.Index(['b', 'a', 'd', 'c'])
>>> idx
Index(['b', 'a', 'd', 'c'], dtype='object')
Sorting by index value with argsort() returns the positions of the sorted elements:
>>> order = idx.argsort()
>>> order
array([1, 0, 3, 2])
To see the sorted index values, index with those positions:
>>> idx[order]
Index(['a', 'b', 'c', 'd'], dtype='object')
Of course, you can also return the sorted index directly:
>>> idx.sort_values()
Index(['a', 'b', 'c', 'd'], dtype='object')
To return both the sorted index and the corresponding positions, set the parameter return_indexer=True:
>>> idx.sort_values(return_indexer=True)
(Index(['a', 'b', 'c', 'd'], dtype='object'), array([1, 0, 3, 2], dtype=int64))
Fifth, converting an index
An index can be converted to a list, a DataFrame, a Series, an array (ndarray), and so on; ravel() returns the index in flattened form.
Index.to_list(self)
Index.to_frame(self, index=True, name=None)
Index.to_series(self, index=None, name=None)
Index.ravel(self, order='C')
astype() converts the index values to the specified type:
Index.astype(self, dtype, copy=True)
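As a rough illustration of these conversions, the sketch below converts one small named index into each target type. The index name "id" is an assumption chosen for the example.

```python
import pandas as pd

idx = pd.Index([1, 2, 3], name="id")

as_list = idx.to_list()                # plain Python list
as_frame = idx.to_frame(index=False)   # one column named "id", default row index
as_series = idx.to_series()            # values and index are both taken from idx
as_float = idx.astype("float64")       # same values, float64 dtype
```

With index=False, to_frame() does not reuse the original index as the row index of the new DataFrame.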
Sixth, operating on index values
A range of operations can be applied to index values; the most commonly used functions are listed below:
1, return the position of the minimum or maximum value
Index.argmin(self, axis=None, skipna=True, *args, **kwargs)
Index.argmax(self, axis=None, skipna=True, *args, **kwargs)
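Both functions return a position, not a value. A short sketch with illustrative numbers:

```python
import pandas as pd

idx = pd.Index([30, 10, 40, 20])

pos_min = idx.argmin()    # position of the smallest value (10)
pos_max = idx.argmax()    # position of the largest value (40)
smallest = idx[pos_min]   # look the value up again by position
```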
2, delete index values
delete() removes the element at the given position, while drop() removes the given labels:
Index.delete(self, loc)
Index.drop(self, labels, errors='raise')
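The difference between deleting by position and dropping by label can be sketched as follows; the labels are illustrative only.

```python
import pandas as pd

idx = pd.Index(["a", "b", "c", "d"])

by_position = idx.delete(0)                  # remove the element at position 0
by_label = idx.drop(["b", "c"])              # remove the labels "b" and "c"
ignored = idx.drop(["z"], errors="ignore")   # a missing label is silently skipped

# With the default errors='raise', a missing label raises KeyError.
try:
    idx.drop(["z"])
    raised = False
except KeyError:
    raised = True
```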
3, duplicate values
drop_duplicates() deletes duplicate values. The valid values of the keep parameter are 'first', 'last', and False: 'first' keeps the first occurrence, 'last' keeps the last occurrence, and False deletes every value that is duplicated.
Index.drop_duplicates(self, keep='first')
duplicated() checks whether index values are repeated; each position that holds a repeated value is marked True.
Index.duplicated(self, keep='first')
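A small sketch of how the keep parameter changes the result, using an illustrative index in which "a" and "b" each appear twice:

```python
import pandas as pd

idx = pd.Index(["a", "b", "a", "c", "b"])

keep_first = idx.drop_duplicates(keep="first")   # keep the first occurrence of each value
keep_last = idx.drop_duplicates(keep="last")     # keep the last occurrence of each value
keep_none = idx.drop_duplicates(keep=False)      # drop every value that is repeated
flags = idx.duplicated(keep="first")             # True at the later repeats
```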
4, insert a new value
Index.insert(self, loc, item)
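Because an index is immutable, insert() returns a new index rather than modifying the original. A minimal sketch:

```python
import pandas as pd

idx = pd.Index(["a", "c"])

longer = idx.insert(1, "b")          # new item placed at position 1
at_end = idx.insert(len(idx), "d")   # position len(idx) appends at the end
```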
5, rename the index (set its name attribute)
Index.rename(self, name, inplace=False)
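With the default inplace=False, rename() leaves the original index untouched. A short sketch with made-up names:

```python
import pandas as pd

idx = pd.Index([1, 2, 3], name="old")

renamed = idx.rename("new")   # a new index with the same values and a new name
```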
6, unique values of the index
Index.unique(self, level=None)
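The level parameter only matters for a multi-level index. The sketch below, using hypothetical level names, shows unique() on a flat index and on one level of a MultiIndex.

```python
import pandas as pd

flat = pd.Index(["a", "b", "a"])
flat_unique = flat.unique()   # unique values in order of first appearance

mi = pd.MultiIndex.from_tuples(
    [("Paris", 2020), ("Paris", 2021), ("Rome", 2020)],
    names=["city", "year"],
)
cities = mi.unique(level="city")   # unique values of the "city" level only
```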
7, get positions in the index
The first way is to pass a list of index values; the result holds the position of each value, with -1 for values that are not found:
Index.get_indexer(self, target, method=None, limit=None, tolerance=None)
Parameter Notes:
- target: the list of index values to look up
- method: None, 'pad'/'ffill', 'backfill'/'bfill', 'nearest'
  - None: exact matches only
  - pad / ffill: if there is no exact match, use the previous index value
  - backfill / bfill: if there is no exact match, use the next index value
  - nearest: if there is no exact match, use the nearest index value
- limit: the maximum number of consecutive labels in target that may be matched inexactly
- tolerance: the maximum distance between the original index and the target values for inexact matches; the index values at the matching positions must satisfy abs(index[indexer] - target) <= tolerance.
The second way is to pass a single index value (a scalar), which returns the position of that value in the index:
Index.get_loc(self, key, method=None, tolerance=None)
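The matching methods are easiest to compare on one small sorted index. The sketch below uses illustrative values; note that in recent pandas releases get_loc() accepts only the key, so it is called without method or tolerance here.

```python
import pandas as pd

idx = pd.Index([10, 20, 30])

exact = idx.get_indexer([20, 25])                         # 25 has no exact match
padded = idx.get_indexer([20, 25], method="pad")          # 25 falls back to 20
backfilled = idx.get_indexer([20, 25], method="backfill")  # 25 moves forward to 30
nearest = idx.get_indexer([24, 26], method="nearest")     # closest index value wins
limited = idx.get_indexer([21, 25], method="nearest", tolerance=2)  # 25 is too far away

position = idx.get_loc(20)   # scalar lookup returns a single position
```

The inexact methods require a monotonic index, which is why the example index is sorted.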
Seventh, other types of indexes
- 1, integer index
- 2, categorical index
- 3, date index
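As a brief illustration of these three index types, the following sketch creates one of each. The category labels and dates are made up for the example.

```python
import pandas as pd

# Integer index: inferred from integer data.
ints = pd.Index([1, 2, 3])

# Categorical index: values drawn from a fixed set of categories.
cat = pd.CategoricalIndex(["low", "high", "low"], categories=["low", "high"])

# Date index: three consecutive days starting at an arbitrary date.
dates = pd.date_range("2024-01-01", periods=3, freq="D")
```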