Udacity--1--notMNIST: Reshape your data either using array.reshape(-1, 1)

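Environment: Python 3.6 + PyCharm
Everyone says the dataset is too large and has to be downloaded first; unpacking it really did take me a long time.
The download links and everything else are in this blogger's post: https://blog.csdn.net/zcf1784266476/article/details/70821417
I don't know whether the long stretch of data preprocessing code at the start was provided by the course or written by the blogger, but it is impressive. (It also looks like a lot of work. I know next to nothing myself, can barely follow it, and would have no idea how to modify it.) The blogger's post does seem to be missing a part at the beginning, though,
which I filled in by following https://blog.csdn.net/u013698770/article/details/54645326:
the data extraction step.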
 
 
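import os

num_classes = 10  # notMNIST has ten classes, the letters A through J

def maybe_extract(filename, force=False):
    # Strip up to two extensions (e.g. ".tar.gz") to get the data folder name.
    root = os.path.splitext(os.path.splitext(filename)[0])[0]
    # The archive was unpacked by hand, so `force` is unused; just list the class subfolders.
    data_folders = [os.path.join(root, d) for d in sorted(os.listdir(root))
                    if os.path.isdir(os.path.join(root, d))]
    if len(data_folders) != num_classes:
        raise Exception('Expected %d folders, one per class. Found %d instead.'
                        % (num_classes, len(data_folders)))
    return data_folders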


train_filename = 'E:\\pycharm\\notMNIST_large'
train_folders = maybe_extract(train_filename)
test_filename = 'E:\\pycharm\\notMNIST_small'
test_folders = maybe_extract(test_filename)
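
The maybe_extract function above only lists the class folders of an archive that has already been unpacked by hand. The unpacking step itself could look roughly like the sketch below. It assumes the download is a .tar.gz archive containing a single top-level folder named after the archive. The name extract_archive and the dest_dir parameter are made up for illustration and do not come from either referenced post.

```python
import os
import tarfile


def extract_archive(archive_path, dest_dir=None, force=False):
    """Unpack a .tar.gz archive unless its folder already exists.

    Hypothetical helper: assumes the archive holds one top-level folder
    named after the archive (notMNIST_small.tar.gz -> notMNIST_small/).
    Only use it on archives from a source you trust.
    """
    if dest_dir is None:
        dest_dir = os.path.dirname(os.path.abspath(archive_path))
    # "notMNIST_small.tar.gz" -> "notMNIST_small"
    base = os.path.basename(archive_path)
    root_name = os.path.splitext(os.path.splitext(base)[0])[0]
    root = os.path.join(dest_dir, root_name)
    if os.path.isdir(root) and not force:
        print('%s already present, skipping extraction.' % root)
    else:
        print('Extracting %s, this may take a while.' % archive_path)
        with tarfile.open(archive_path) as tar:
            tar.extractall(dest_dir)
    return root
```

The returned folder path can then be passed to maybe_extract to collect the per-class subfolders.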

The main part is really just the call to logistic regression, and the call itself is very simple.
import pickle

import numpy as np
from sklearn.linear_model import LogisticRegression

image_size = 28  # notMNIST images are 28x28 pixels
size = 100       # number of training samples to fit on

with open('E:\\pycharm.pickle', 'rb') as f:
    data = pickle.load(f)

# Flatten each image into one row so the data is 2-D: (n_samples, n_features).
train_dt = data['train_dataset']
train_dt = train_dt.reshape(train_dt.shape[0], image_size * image_size)
train_lb = data['train_label']
test_dt = data['test_dataset']
test_dt = test_dt.reshape(test_dt.shape[0], image_size * image_size)
test_lb = data['test_label']
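
To see what the two reshape calls do, here is a minimal sketch on a made-up array standing in for the pickled dataset. The random data and the name fake_images are assumptions for illustration only. Passing -1 lets NumPy infer the 784 columns instead of spelling out image_size * image_size.

```python
import numpy as np

image_size = 28
# Stand-in for data['train_dataset']: 5 fake grayscale images of 28x28 pixels.
fake_images = np.random.rand(5, image_size, image_size).astype(np.float32)

# Explicit form, as in the script.
flat_explicit = fake_images.reshape(fake_images.shape[0], image_size * image_size)
# Equivalent form: -1 tells NumPy to infer the remaining dimension.
flat_inferred = fake_images.reshape(len(fake_images), -1)

print(fake_images.shape)    # (5, 28, 28)
print(flat_explicit.shape)  # (5, 784)
# Row i of the flat array is image i read row by row.
print(np.array_equal(flat_inferred[0], fake_images[0].ravel()))  # True
```

Either form gives scikit-learn the 2-D layout it expects, with one row per image.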

def train_linear_logistic(tdata, tlabel):
    # The 'l1' penalty needs a solver that supports it, such as 'liblinear'.
    model = LogisticRegression(C=1.0, penalty='l1', solver='liblinear')
    print('training on {} samples'.format(size))
    model.fit(tdata[:size, :], tlabel[:size])
    print('testing model')
    y_out = model.predict(test_dt)
    # Accuracy is the fraction of test predictions that match the labels.
    print('accuracy with {} samples is {}'.format(size, np.mean(y_out == test_lb)))
    return None


train_linear_logistic(train_dt, train_lb)
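When I first ran this, I got the following error:

Reshape your data either using array.reshape(-1, 1)

scikit-learn raises this message when it is given a 1-D array where it expects a 2-D array of shape (n_samples, n_features). In my case it only worked after I added one more ')' after the print; supposedly the cause was the array. I'll leave it at that for now...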
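
The full scikit-learn message suggests array.reshape(-1, 1) when the data has a single feature and array.reshape(1, -1) when it contains a single sample. The sketch below tries both on a small array. It uses NumPy only, with made-up values, and does not reproduce the error from this post; it only shows how a 1-D array becomes a 2-D one.

```python
import numpy as np

values = np.array([0.1, 0.5, 0.9])  # 1-D, shape (3,)

# Three samples with a single feature each.
as_column = values.reshape(-1, 1)
# A single sample with three features.
as_row = values.reshape(1, -1)

print(values.shape)     # (3,)
print(as_column.shape)  # (3, 1)
print(as_row.shape)     # (1, 3)
```

For instance, a single flattened image such as test_dt[0] has shape (784,) and would need reshape(1, -1) before being passed to model.predict.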

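Also, my first plt.show() displays its figure, but the second one never shows anything.

Another problem is that displaying images through IPython did not work; I need to look into that properly another day.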


Reposted from blog.csdn.net/qq_40242410/article/details/80558528