Machine learning -- pandas is easy to use

Pandas

Data Processing Tools in Machine Learning

In [1]:
# import pandas
import pandas as pd
print(pd.__version__)
0.22.0

Main data structures

  • DataFrame: a data frame, which can be thought of as a two-dimensional data table
  • Series: a data sequence, which can be thought of as a one-dimensional array
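A minimal sketch of how the two structures relate: each column of a DataFrame is itself a Series. The column names and sample values here are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data, used only for this illustration
names = pd.Series(["a", "b", "c"])
scores = pd.Series([2, 6, 2])

# A DataFrame built from two Series that share the same index
df = pd.DataFrame({"name": names, "score": scores})

# Selecting a single column gives back a one-dimensional Series
col = df["score"]
print(type(col).__name__)  # Series
print(df.shape)            # (3, 2): two-dimensional
print(col.shape)           # (3,): one-dimensional
```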
In [2]:
# Series
s = pd.Series(['a', 'b', 'c'])
print(s)
0    a
1    b
2    c
dtype: object
In [7]:
# DataFrame
score = pd.Series([2,6,2])
df = pd.DataFrame({"s_name": s, "s_score": score})
print(df)
# df.describe()
df.head(2)
  s_name  s_score
0      a        2
1      b        6
2      c        2
Out[7]:
  s_name  s_score
0      a        2
1      b        6
In [10]:
# pandas plot
import matplotlib.pyplot as plt
cc = pd.read_csv("california_housing_train.csv", sep=',')
cc.hist("housing_median_age")
Out[10]:
array([[<matplotlib.axes._subplots.AxesSubplot object at 0x0000000009617B38>]],
      dtype=object)
In [14]:
score/1.5
Out[14]:
0    1.333333
1    4.000000
2    1.333333
dtype: float64
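The division above is applied to every element of the Series. As a rough sketch of the same idea, the example below also shows that arithmetic between two Series matches values by index label; the `bonus` Series is a hypothetical addition that is not part of the original notebook.

```python
import pandas as pd

score = pd.Series([2, 6, 2])

# Scalar arithmetic is applied to every element
scaled = score / 2
print(scaled.tolist())  # [1.0, 3.0, 1.0]

# Arithmetic between two Series aligns on index labels.
# bonus is hypothetical and has no entry for label 2.
bonus = pd.Series([1, 1], index=[0, 1])
total = score + bonus
print(total.tolist())  # [3.0, 7.0, nan]
```

Labels present in only one of the two Series produce NaN in the result.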
In [25]:
import numpy as np
s.index
s.reindex(np.random.permutation(s.index))
Out[25]:
2    c
0    a
1    b
dtype: object
In [24]:
s.reindex(np.random.permutation(4))
Out[24]:
0      a
2      c
1      b
3    NaN
dtype: object
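The two cells above show that reindex reorders a Series by label, and that a label missing from the original index (3 here) gets NaN. A small sketch of the same behaviour, plus the `fill_value` parameter of `reindex` for supplying a default; the `"?"` placeholder is an arbitrary choice for illustration.

```python
import numpy as np
import pandas as pd

s = pd.Series(["a", "b", "c"])

# Shuffle the existing labels: same values, different order
shuffled = s.reindex(np.random.permutation(s.index))

# Ask for labels 0..3: label 3 does not exist in s, so it becomes NaN
extended = s.reindex(range(4))
print(extended.isna().tolist())  # [False, False, False, True]

# fill_value supplies a default for missing labels instead of NaN
filled = s.reindex(range(4), fill_value="?")
print(filled.tolist())  # ['a', 'b', 'c', '?']
```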
