Python data extraction and splitting

K-line data extraction

Starting from the original data set, generate a new table that meets these requirements:

1. For each minute, take the first, last, maximum and minimum values of the close data.

2. For each minute, compute the growth of the vol data (the last vol value in that minute minus the first).

3. Combine these values into a new table

(field names: ['time','open','close','high','low','vol'])
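The original data set is not shown here. Judging from the script, it has at least the columns id, time, close and vol, with several rows sharing the same minute in time. A minimal sketch of such an input, where the column set is inferred and every value is made up purely for illustration:

```python
import pandas as pd

# Hypothetical sample input: column names are inferred from the script,
# the values are invented for illustration only.
sample = pd.DataFrame({
    'id': [1, 2, 3, 4, 5],
    'time': ['09:30', '09:30', '09:30', '09:31', '09:31'],
    'close': [10.0, 10.5, 10.2, 10.3, 10.1],
    'vol': [100, 150, 220, 260, 300],
})
print(sample)
```

For this sample, the 09:30 row of the new table would be open 10.0, close 10.2, high 10.5, low 10.0 and vol 120 (220 minus 100). The 09:31 row would be open 10.3, close 10.1, high 10.3, low 10.1 and vol 40. Calling sample.to_csv('data.csv', index=False) would write it out in a form the script can read.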

import time

import pandas as pd

start = time.time()

df = pd.read_csv('data.csv')
df = df.drop('id', axis=1)  # delete the id column

rows = []  # one summary row per minute
for minute, group in df.groupby('time'):  # group by time
    rows.append({
        'time': minute,  # the group's time becomes the new table's time
        'open': group['close'].iloc[0],  # first close of the group -> open
        'close': group['close'].iloc[-1],  # last close of the group -> close
        'high': group['close'].max(),  # maximum close of the group -> high
        'low': group['close'].min(),  # minimum close of the group -> low
        'vol': group['vol'].iloc[-1] - group['vol'].iloc[0],  # last vol minus first vol
    })

# Build the target table from the collected rows
df1 = pd.DataFrame(rows, columns=['time', 'open', 'close', 'high', 'low', 'vol'])

df2 = df1.sort_values('time')  # sort by the time column
df2.reset_index(inplace=True, drop=True)  # reset the row index
print(df2)  # print the target table

stop = time.time()  # check the elapsed time
print('Total time: {} seconds'.format(stop - start))
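The same per-minute summary can probably be produced without an explicit loop, using pandas named aggregation. The sketch below is an alternative, not the method used in the script above. The helper name minute_bars is hypothetical, and it assumes the columns time, close and vol exist. One caveat: the 'first' and 'last' aggregations skip missing values, so results can differ from .iloc[0] and .iloc[-1] if close contains NaN.

```python
import pandas as pd

def minute_bars(df):
    """Aggregate rows into one row per distinct value of the 'time' column."""
    bars = df.groupby('time').agg(
        open=('close', 'first'),   # first non-null close in the minute
        close=('close', 'last'),   # last non-null close in the minute
        high=('close', 'max'),
        low=('close', 'min'),
        # growth of vol: last value in the minute minus the first
        vol=('vol', lambda s: s.iloc[-1] - s.iloc[0]),
    )
    # groupby sorts by key, so the rows are already ordered by time;
    # reset_index turns 'time' back into an ordinary column.
    return bars.reset_index()[['time', 'open', 'close', 'high', 'low', 'vol']]
```

With the data loaded as in the script, minute_bars(df) would take the place of the loop, the sort and the index reset.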
