1 format()
It takes the form str.format(): the arguments passed to format() replace the {} placeholders in str, which may also carry a format specification after a colon (:).
This function is useful when a large number of files are stored separately and each of them has to be read.
>>> "{} {}".format("Hello", "World")  # no positions specified; default order
'Hello World'

>>> "{0} {1}".format("Hello", "World")  # positions specified
'Hello World'

>>> "{1} {0} {1}".format("Hello", "World")  # positions specified
'World Hello World'
Reference: https://blog.csdn.net/zhchs2012/article/details/84328742
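As a small illustration of the use case above, format() can generate the paths of many separately stored files. This is only a sketch: the folder name data and the part_NNN.csv naming pattern are hypothetical and do not come from the original notes. The {:03d} placeholder uses the part after the colon as a format specification (zero-padded to three digits).

```python
# Hypothetical layout: files named ./data/part_000.csv, ./data/part_001.csv, ...
template = "./{}/part_{:03d}.csv"


def build_paths(folder, count):
    """Return the paths of `count` numbered CSV files inside `folder`."""
    return [template.format(folder, i) for i in range(count)]


paths = build_paths("data", 3)
for path in paths:
    print(path)
```

Each path could then be passed to a reader such as pd.read_csv() inside the loop.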
2 By default, a file is saved relative to the current working directory, which is normally the directory the program is run from. A path such as ./newfilewalk can also be used to save the file in the newfilewalk folder under the current directory.
If the newfilewalk folder does not exist, an error is raised, so create the folder in advance.
data_xls.to_csv('.\\chengji\\化学.csv', encoding='utf-8')
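One way to avoid the missing-folder error is to create the folder from the program itself before saving. The following is a minimal sketch, assuming a small hypothetical DataFrame in place of data_xls, an ASCII file name chemistry.csv, and a temporary base directory so that no real files are touched.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for data_xls from the notes above.
data_xls = pd.DataFrame({"name": ["a", "b"], "score": [90, 85]})

# A temporary base directory keeps this sketch from touching real files.
base_dir = tempfile.mkdtemp()
out_dir = os.path.join(base_dir, "chengji")

# Create the folder first; exist_ok=True makes this a no-op if it already exists.
os.makedirs(out_dir, exist_ok=True)

out_path = os.path.join(out_dir, "chemistry.csv")
data_xls.to_csv(out_path, encoding="utf-8", index=False)
print(os.path.exists(out_path))
```

Building the path with os.path.join also avoids hard-coding the '\\' separator, which only works on Windows.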
3 If the file will be read back by a later program, pass index=False to to_csv() when saving it. The index column is then not stored; otherwise the number of columns is likely not to match what the later program expects, which leads to errors.
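The column mismatch can be seen on a small round trip. This sketch uses a hypothetical two-column DataFrame and writes to an in-memory string instead of a file (to_csv() returns the CSV text when no path is given).

```python
import io

import pandas as pd

# Hypothetical toy data; the column names are only for illustration.
df = pd.DataFrame({"uid": [1, 2], "appid": ["a", "b"]})

# Default: the index is written as an extra, unnamed first column.
with_index = pd.read_csv(io.StringIO(df.to_csv()))

# index=False: only the real data columns are written.
without_index = pd.read_csv(io.StringIO(df.to_csv(index=False)))

print(list(with_index.columns))
print(list(without_index.columns))
```

The first frame comes back with three columns, the second with the original two.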
4 to_csv() function
Parameters: mode: to append later data to an existing file, use mode='a+'. In that case be sure to set header=None (header=False has the same effect); otherwise the column names are written into the file again with every append. This easily causes errors in later processing, and such errors can only be tracked down by bisecting the data.
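import pandas as pd

# Open the large CSV lazily so that it can be processed chunk by chunk.
app_actived_train = pd.read_csv('./labelencoder_file/app_actived_train.csv', iterator=True)
pieceID = 1
loop = True
while loop:
    try:
        df = app_actived_train.get_chunk(100000)  # 100,000 rows per chunk
        df.columns = ['index', 'uid', 'appid']
        df1 = pd.DataFrame(df['uid'])
        # Split the '#'-separated appid string into one row per appid.
        df = df['appid'].str.split('#', expand=True).stack().reset_index(level=1, drop=True).rename('appid')
        df = pd.DataFrame({'index': df.index, 'appid': df.values})
        # Attach the uid of the original row to each appid.
        df = pd.merge(df1, df, left_index=True, right_on='index', how='left')
        # Append without index or header so that column names are not repeated.
        # Note: this path is the same file that is being read, as in the original;
        # use a separate output file to avoid appending to the input.
        df.to_csv('./labelencoder_file/app_actived_train.csv', mode='a+', index=False, header=None)
        print(pieceID * 100000)
        pieceID += 1
        del df, df1
    except StopIteration:
        # get_chunk() raises StopIteration once the file is exhausted.
        loop = False
        print('imps_log process finish!')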
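The effect of the header setting in append mode can be checked on a small example. This is a sketch under assumptions: the toy data, the chunk size of 2, and the temporary file names are hypothetical, and it writes to a separate output file rather than to the file being read. As a variation on the approach above, it writes the column names once, for the first chunk only, and passes header as a boolean.

```python
import os
import tempfile

import pandas as pd

tmp_dir = tempfile.mkdtemp()
src_path = os.path.join(tmp_dir, "source.csv")  # hypothetical input file
out_path = os.path.join(tmp_dir, "result.csv")  # hypothetical output file

# Toy input: five rows, read back two rows at a time.
pd.DataFrame({"uid": range(5), "appid": list("abcde")}).to_csv(
    src_path, index=False
)

for i, chunk in enumerate(pd.read_csv(src_path, chunksize=2)):
    # Write the header only for the first chunk; later chunks append rows only.
    chunk.to_csv(out_path, mode="a", index=False, header=(i == 0))

result = pd.read_csv(out_path)
print(len(result))
```

If header were left at its default of True for every chunk, the column names would reappear as data rows between chunks, which is the kind of error described above.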