Fill in missing date values and populate second column based on previous row

jeffry :

I have a CSV with two columns: the first holds a date and the second holds a rate value. Some rows are missing, which shows up as gaps in the date column.

I would like some Python code that can fill in the missing dates between the first row and the last row (between 01/01/2019 and 14/01/2019). The second task is to then fill in each missing rate with the previous day's rate.

For example, 04 and 05 Jan are missing, so these rows need to be created. The previous day's rate is 1.12 (from 03 Jan), so that rate needs to be populated for 04 and 05 Jan.

The code needs to be dynamic, so the first and last row will not always be the same for each file. For example, a second file can have first row and last row values of 03/02/2019 and 25/02/2019. The same code needs to be able to run on each file if possible.

The input will be a csv and the output also needs to be a csv file.


Input -

Date,Rate
01/01/2019,1.12
02/01/2019,1.13
03/01/2019,1.12
06/01/2019,1.11
07/01/2019,1.13
08/01/2019,1.14
09/01/2019,1.13
10/01/2019,1.11
12/01/2019,1.12
13/01/2019,1.13
14/01/2019,1.14

Please let me know if you have any questions.

Quang Hoang :

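First make sure the Date column is a datetime type (the dates are day-first, so pass dayfirst=True). Then you can use resample to get one row per day and ffill to carry the previous rate forward: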

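import pandas as pd

df = pd.read_csv('input.csv')  # 'input.csv' is a placeholder file name
# parse the dd/mm/yyyy strings into datetimes
df['Date'] = pd.to_datetime(df['Date'], dayfirst=True)
# resample to daily frequency; ffill copies the previous row's rate into new rows
new_df = df.set_index('Date').resample('D').ffill().reset_index()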

Output:

         Date  Rate
0  2019-01-01  1.12
1  2019-01-02  1.13
2  2019-01-03  1.12
3  2019-01-04  1.12
4  2019-01-05  1.12
5  2019-01-06  1.11
6  2019-01-07  1.13
7  2019-01-08  1.14
8  2019-01-09  1.13
9  2019-01-10  1.11
10 2019-01-11  1.11
11 2019-01-12  1.12
12 2019-01-13  1.13
13 2019-01-14  1.14
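
Since the question asks for CSV in and CSV out, the resample approach can be wrapped into a small function. This is only a sketch: the function name fill_missing_dates is made up for illustration, and it assumes the input has no duplicate dates and always uses the dd/mm/yyyy format shown in the sample. It uses in-memory buffers so it can be run as is; the same function also accepts file paths.

```python
import io

import pandas as pd


def fill_missing_dates(src, dst):
    """Read a Date,Rate CSV, add a row for every missing day, and
    carry the previous day's rate forward. Hypothetical helper name."""
    df = pd.read_csv(src)
    # Assumption: dates are always dd/mm/yyyy, as in the sample input.
    df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%Y")
    # Sort first so the forward fill follows calendar order.
    df = df.sort_values("Date")
    filled = df.set_index("Date").resample("D").ffill().reset_index()
    # Write the dates back in the same dd/mm/yyyy form as the input.
    out_df = filled.copy()
    out_df["Date"] = out_df["Date"].dt.strftime("%d/%m/%Y")
    out_df.to_csv(dst, index=False)
    return filled


# Small in-memory sample; the first and last dates come from the data itself.
sample = io.StringIO(
    "Date,Rate\n"
    "01/01/2019,1.12\n"
    "03/01/2019,1.12\n"
    "06/01/2019,1.11\n"
)
out = io.StringIO()
result = fill_missing_dates(sample, out)
print(out.getvalue())
```

For real files, pass paths instead of buffers, for example fill_missing_dates('input.csv', 'output.csv') (both names are placeholders). Because resample works from the first to the last date present in the file, nothing needs to change between files with different date ranges.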
