Loading Google Drive files in Google Colab

https://mikulskibartosz.name/how-to-load-data-from-google-drive-to-pandas-running-in-google-colaboratory-a7f6a033c997

1. Use the code below to install the required package and authenticate

# Install PyDrive (the leading "!" runs a shell command in a Colab cell)
!pip install -U -q PyDrive
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials
import os
import pandas as pd

# Authenticate the Colab user, then hand the credentials to PyDrive
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)

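2. Get the IDs. Every folder on Google Drive has a unique ID, which is the string of characters at the end of its URL. Use this folder ID to print the IDs of the files inside that folder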

# ID of the parent folder, taken from the end of the folder's URL
folder_id = '1cBRHd3JDA7sbApZ2C912xYoA-z1YM__f'
for name in ['test_set.csv', 'train_set.csv']:
  # Note the spaces around "and" in the query string
  query = "title contains '{}' and '{}' in parents".format(name, folder_id)
  listed = drive.ListFile({'q': query}).GetList()
  for file in listed:
    print('title {}, id {}'.format(file['title'], file['id']))
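
The folder ID in step 2 can also be pulled out of the folder URL in code rather than copied by hand. The following is only a minimal sketch: the helper names `folder_id_from_url` and `build_query` are hypothetical, and it assumes the folder URL has the shape `https://drive.google.com/drive/folders/<ID>`, so that the ID is the last path segment.

```python
from urllib.parse import urlparse


def folder_id_from_url(url):
    """Return the last path segment of a Drive folder URL.

    Assumption: the URL looks like .../drive/folders/<ID>, possibly
    followed by a query string such as ?usp=sharing.
    """
    path = urlparse(url).path.rstrip('/')
    return path.split('/')[-1]


def build_query(file_name, folder_id):
    """Build the search string used with drive.ListFile({'q': ...})."""
    return "title contains '{}' and '{}' in parents".format(file_name, folder_id)


# The folder ID below is the one used in step 2; the URL prefix is assumed.
folder_url = 'https://drive.google.com/drive/folders/1cBRHd3JDA7sbApZ2C912xYoA-z1YM__f'
folder_id = folder_id_from_url(folder_url)
query = build_query('test_set.csv', folder_id)
print(query)
```

The resulting `query` string can then be passed as the `'q'` value to `drive.ListFile`, as in the listing code above.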

3. Copy the files from Google Drive to Google Colab

download_path = os.path.expanduser('~/data')
# Create the download directory; do nothing if it already exists
os.makedirs(download_path, exist_ok=True)


# Map each file name to the file ID printed in step 2
file_dict = {'train_set.csv': '1uAYar1TBFNnuQWu618wXvaDC0djKn4Z0', 'test_set.csv': '1ZB389qN0pNAxfZe1UQS3rLn_HBWegmqn'}
file_path_list = []
for file_name, file_id in file_dict.items():
    output_file = os.path.join(download_path, file_name)
    # Fetch the Drive file by ID and save it to the local path
    temp_file = drive.CreateFile({'id': file_id})
    temp_file.GetContentFile(output_file)
    print(output_file)
    file_path_list.append(output_file)
print(file_path_list)
# Paths follow the insertion order of file_dict: train first, then test
df_train = pd.read_csv(file_path_list[0])
df_test = pd.read_csv(file_path_list[1])
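
The loop in step 3 tells the training file from the test file by its position in a list. As a possible variation, the same download can be wrapped in a function that returns paths keyed by file name. This is a sketch, not part of the original post: `download_files` is a hypothetical helper, and it assumes `drive` is the authenticated `GoogleDrive` object from step 1.

```python
import os


def download_files(drive, file_dict, download_path):
    """Download each Drive file ID in file_dict into download_path.

    Returns a dict mapping file name to local path, so callers look
    files up by name instead of by list position.
    """
    os.makedirs(download_path, exist_ok=True)
    paths = {}
    for file_name, file_id in file_dict.items():
        output_file = os.path.join(download_path, file_name)
        # CreateFile with an 'id' refers to an existing Drive file;
        # GetContentFile saves its content to the given local path.
        drive.CreateFile({'id': file_id}).GetContentFile(output_file)
        paths[file_name] = output_file
    return paths


# Usage in Colab (assumes drive, file_dict and download_path from above):
# paths = download_files(drive, file_dict, download_path)
# df_train = pd.read_csv(paths['train_set.csv'])
# df_test = pd.read_csv(paths['test_set.csv'])
```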

Reposted from blog.csdn.net/weixin_42007359/article/details/82456156