1. The MPG Dataset

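Using the classic Auto MPG dataset, we build a model that predicts the fuel efficiency of cars from the late 1970s and early 1980s. To do this, we give the model descriptions of many cars from that period. Each description includes attributes such as the number of cylinders, displacement, horsepower, and weight.
Import the dependencies: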

from __future__ import absolute_import, division, print_function, unicode_literals

# seaborn is used to draw the pair plot (pairplot)
# !pip install -q seaborn

import pathlib

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

import tensorflow as tf

from tensorflow import keras
from tensorflow.keras import layers

print(tf.__version__)

1.1 Downloading the dataset



# Download the dataset (get_file returns the local path of the cached file)
dataset_path = keras.utils.get_file('auto-mpg.data',
                                    'http://archive.ics.uci.edu/ml/machine-learning-databases/auto-mpg/auto-mpg.data')

print(dataset_path)
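
Before parsing, it can help to look at a few raw lines of the downloaded file to see which separators and missing-value markers it uses. The following is a minimal sketch: `peek_lines` is a hypothetical helper (not part of Keras or pandas), and the demo writes a small stand-in file with placeholder contents instead of downloading anything.

```python
import itertools
import pathlib
import tempfile


def peek_lines(path, n=3):
    """Return the first n lines of a text file with line endings stripped."""
    with pathlib.Path(path).open("r", encoding="utf-8") as f:
        return [line.rstrip("\n") for line in itertools.islice(f, n)]


# Stand-in file with placeholder contents, for demonstration only.
# With the real data you would call peek_lines(dataset_path) instead.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = pathlib.Path(tmp_dir) / "demo.data"
    demo_path.write_text("row one\nrow two\nrow three\nrow four\n",
                         encoding="utf-8")
    preview = peek_lines(demo_path, n=2)

print(preview)
```

Reading only the first few lines keeps the check cheap even for large files, since `itertools.islice` stops as soon as `n` lines have been read.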

1.2 Importing the dataset


# Load the dataset with pandas
column_names = ['MPG','Cylinders','Displacement','Horsepower','Weight',
                'Acceleration', 'Model Year', 'Origin']
raw_dataset = pd.read_csv(dataset_path, names=column_names,
                          na_values = "?", comment='\t',
                          sep=" ", skipinitialspace=True)

dataset = raw_dataset.copy()
# Show the last few rows
dataset.tail()

      MPG  Cylinders  Displacement  ...  Acceleration  Model Year  Origin
393  27.0          4         140.0  ...          15.6          82       1
394  44.0          4          97.0  ...          24.6          82       2
395  32.0          4         135.0  ...          11.6          82       1
396  28.0          4         120.0  ...          18.6          82       1
397  31.0          4         119.0  ...          19.4          82       1
[5 rows x 8 columns]
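
To see what each `read_csv` argument contributes, the same call can be tried on a couple of rows held in memory. This is a sketch under assumptions: the two rows in `sample` are illustrative lines in the layout the parser above expects (space-separated numeric columns, then a tab and a quoted car name), and `demo` is a name introduced only for this example.

```python
import io

import pandas as pd

column_names = ['MPG', 'Cylinders', 'Displacement', 'Horsepower', 'Weight',
                'Acceleration', 'Model Year', 'Origin']

# Illustrative rows: the second one uses "?" for an unknown horsepower.
sample = (
    '18.0   8   307.0      130.0      3504.      12.0   70  1\t"chevrolet chevelle malibu"\n'
    '25.0   4   98.00      ?          2046.      19.0   71  1\t"ford pinto"\n'
)

demo = pd.read_csv(io.StringIO(sample), names=column_names,
                   na_values="?",          # treat "?" as a missing value (NaN)
                   comment="\t",           # ignore the rest of the line from the first tab
                   sep=" ",                # columns are separated by spaces
                   skipinitialspace=True)  # skip the extra spaces after each delimiter

print(demo.shape)
print(demo["Horsepower"].isna().sum())
```

Because `comment="\t"` discards everything from the tab onward, the car name never reaches the DataFrame, which is why only eight columns are named.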

1.3 Cleaning the data

print(dataset.isna().sum())
## Output:
## MPG             0
## Cylinders       0
## Displacement    0
## Horsepower      6		# rows with missing values
## Weight          0
## Acceleration    0
## Model Year      0
## Origin          0
## dtype: int64

# Drop the rows that contain missing values
dataset = dataset.dropna()
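
The effect of `dropna()` is easy to check on a small frame. The sketch below uses made-up values and names (`demo`, `cleaned`, `rows_dropped`) purely for illustration; it is not the Auto MPG data.

```python
import pandas as pd

# Toy frame with one missing Horsepower value (illustrative numbers only).
demo = pd.DataFrame({
    "MPG": [18.0, 25.0, 30.0],
    "Horsepower": [130.0, float("nan"), 90.0],
})

missing_per_column = demo.isna().sum()  # count of NaN values in each column
cleaned = demo.dropna()                 # new frame; demo itself is unchanged

rows_dropped = len(demo) - len(cleaned)
print(missing_per_column)
print(rows_dropped)
print(cleaned.index.tolist())
```

Note that `dropna()` keeps the original index labels, so the index can have gaps afterwards; `reset_index(drop=True)` renumbers the rows if a contiguous index is needed.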

1.4 Extracting features

# The "Origin" column is really categorical, not just a number, so convert it to a one-hot encoding

origin = dataset.pop('Origin')

dataset['USA'] = (origin == 1)*1.0
dataset['Europe'] = (origin == 2)*1.0
dataset['Japan'] = (origin == 3)*1.0
print(dataset.tail())

#       MPG  Cylinders  Displacement  Horsepower  ...  Model Year  USA  Europe  Japan
# 393  27.0          4         140.0        86.0  ...          82  1.0     0.0    0.0
# 394  44.0          4          97.0        52.0  ...          82  0.0     1.0    0.0
# 395  32.0          4         135.0        84.0  ...          82  1.0     0.0    0.0
# 396  28.0          4         120.0        79.0  ...          82  1.0     0.0    0.0
# 397  31.0          4         119.0        82.0  ...          82  1.0     0.0    0.0
#
# [5 rows x 10 columns]
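
The same one-hot encoding can also be produced with `pd.get_dummies`, which scales better when there are many categories. This is a sketch of an alternative, not the approach used above: the frame `demo` holds made-up values, and the code-to-region mapping in `origin_names` is taken from the manual encoding above.

```python
import pandas as pd

# Toy frame with one row per origin code (illustrative values only).
demo = pd.DataFrame({
    "MPG": [27.0, 44.0, 38.0],
    "Origin": [1, 2, 3],
})

# Replace the numeric codes with readable labels first.
origin_names = {1: "USA", 2: "Europe", 3: "Japan"}
demo["Origin"] = demo["Origin"].map(origin_names)

# An empty prefix makes the new columns carry just the label;
# dtype=float matches the 1.0 / 0.0 values of the manual version.
encoded = pd.get_dummies(demo, columns=["Origin"],
                         prefix="", prefix_sep="", dtype=float)
print(encoded)
```

One difference from the manual version is column order: `get_dummies` appends the new columns sorted by label (Europe, Japan, USA) rather than in the order they were assigned.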

Author: chbxw
