Utkarsh :
Date,hrs,Count,Status
2018-01-02,4,15,SFZ
2018-01-03,5,16,ACZ
2018-01-04,3,14,SFZ
2018-01-05,5,15,SFZ
2018-01-06,5,18,ACZ
This is a small sample of the data I've been working with. The actual data is in the same format, with around 1000 entries for each date. I am taking start_date and end_date as inputs from the user:
import datetime as dt
start_date = dt.date(2018, 1, 2)
end_date = dt.date(2018, 1, 23)
Now I have to display the totals of hrs and Count within the selected date range. I am able to do so by entering the dates directly into the between clause, using this snippet:
df = df.loc[df['Date'].between('2018-01-02','2018-01-06'), ['hrs','Count']].sum()
print (df)
Output:
hrs      22
Count    78
dtype: int64
I am using the pandas and datetime libraries. However, I want to pass the dates through the variables start_date and end_date, since they may change every time. I tried substituting them in; it doesn't give me an error, but the totals show 0.
df = df.loc[df['Date'].between('start_date','end_date'), ['hrs','Count']].sum()
print (df)
Output:
Duration_hrs    0
Reject_Count    0
dtype: int64
Serge Ballesta :
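Quoting the names passes the literal strings 'start_date' and 'end_date' rather than the variables, so no row falls between them. You only need to convert all the values to a compatible type, pd.Timestamp: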
df = df.loc[pd.to_datetime(df['Date']).between(pd.Timestamp(start_date),
                                               pd.Timestamp(end_date)),
            ['hrs','Count']].sum()
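As a rough end-to-end check, here is a minimal sketch that rebuilds the five sample rows from the question and applies the conversion above. Loading the data through io.StringIO and the names csv_text, quoted_mask, dates, mask and totals are assumptions made for illustration; the real data would presumably come from a file.

```python
import datetime as dt
import io

import pandas as pd

# Sample rows copied from the question (the real file is assumed to be larger).
csv_text = """Date,hrs,Count,Status
2018-01-02,4,15,SFZ
2018-01-03,5,16,ACZ
2018-01-04,3,14,SFZ
2018-01-05,5,15,SFZ
2018-01-06,5,18,ACZ
"""
df = pd.read_csv(io.StringIO(csv_text))

start_date = dt.date(2018, 1, 2)
end_date = dt.date(2018, 1, 23)

# Quoted names are literal strings, not the variables: no date string
# sorts between 'start_date' and 'end_date', so the mask is all False.
quoted_mask = df['Date'].between('start_date', 'end_date')

# Convert both the column and the bounds to datetime-like values first.
dates = pd.to_datetime(df['Date'])
mask = dates.between(pd.Timestamp(start_date), pd.Timestamp(end_date))

# Keep the totals in a separate name so df itself is not overwritten.
totals = df.loc[mask, ['hrs', 'Count']].sum()
print(totals)
```

On these five rows the totals should match what the question reports for the hard-coded range (hrs 22, Count 78), since every sample date lies inside the selected window.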