Problem Description
Given a DataFrame of date ranges like the one below, the start date and end date of each row need to be concatenated into a single string column.
Original DataFrame
Start date | End date |
---|---|
2020-08-03 | 2020-08-09 |
2020-08-10 | 2020-08-16 |
2020-08-17 | 2020-08-23 |
2020-08-24 | 2020-08-30 |
2020-08-31 | 2020-09-06 |
DataFrame after concatenation
Start date | End date | Insert date |
---|---|---|
2020-08-03 | 2020-08-09 | 2020-08-03 ~ 2020-08-09 |
2020-08-10 | 2020-08-16 | 2020-08-10 ~ 2020-08-16 |
2020-08-17 | 2020-08-23 | 2020-08-17 ~ 2020-08-23 |
2020-08-24 | 2020-08-30 | 2020-08-24 ~ 2020-08-30 |
2020-08-31 | 2020-09-06 | 2020-08-31 ~ 2020-09-06 |
Solution
Option 1: row-wise apply
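# Variant 1: concatenate the two columns explicitly
# ('开始日期' = start date, '结束日期' = end date, '插入日期' = insert date)
date_xl['插入日期'] = date_xl.apply(lambda x: x['开始日期'] + " ~ " + x['结束日期'], axis=1)
# Variant 2: join every value in the row
date_xl['插入日期'] = date_xl.apply(lambda x: " ~ ".join(x.values), axis=1)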
Both variants work on the same principle: the function is applied to each row and joins that row's string values.
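To try both variants, here is a minimal sketch that rebuilds the sample table. It assumes the dates are stored as plain strings (a datetime column would need converting to strings first), and the names `by_columns` and `by_join` are only illustrative.

```python
import pandas as pd

# Sample data copied from the table above; column names match the snippets
# ('开始日期' = start date, '结束日期' = end date, '插入日期' = insert date).
date_xl = pd.DataFrame({
    "开始日期": ["2020-08-03", "2020-08-10", "2020-08-17", "2020-08-24", "2020-08-31"],
    "结束日期": ["2020-08-09", "2020-08-16", "2020-08-23", "2020-08-30", "2020-09-06"],
})

# Variant 1: explicit column access
by_columns = date_xl.apply(lambda x: x["开始日期"] + " ~ " + x["结束日期"], axis=1)
# Variant 2: join every value in the row
by_join = date_xl.apply(lambda x: " ~ ".join(x.values), axis=1)

# Add the new column only after both variants have run
date_xl["插入日期"] = by_columns
print(date_xl)
```

Variant 2 joins every column in the row, so it has to run before the new column is added, or on a selection of just the two date columns.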
If a row contains a null value, both variants raise an error, because None cannot be concatenated with a str. The fix is to add an if check:
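import numpy as np
import pandas as pd

# Column Y is one element shorter, so its last row is missing after the transpose
df = pd.DataFrame([list("ABCDEF"), list("ABCDE")]).T
df.columns = list('XY')
# Join only when both values are present; otherwise return NaN
df.apply(lambda x: " ~ ".join(x.values) if x.notna().all() else np.nan, axis=1)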
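The sketch below shows the failure that the guard avoids, next to the guarded version. The helper name `join_row` and the flag `unguarded_failed` are hypothetical and used only for illustration. The missing value is written explicitly as None here instead of coming from the transpose.

```python
import numpy as np
import pandas as pd

# Column Y has a missing value in its last row
df = pd.DataFrame({"X": list("ABCDEF"), "Y": list("ABCDE") + [None]})

# Without a guard, str.join rejects the non-string missing value
unguarded_failed = False
try:
    df.apply(lambda x: " ~ ".join(x.values), axis=1)
except TypeError as exc:
    unguarded_failed = True
    print("unguarded join failed:", exc)

def join_row(row, sep=" ~ "):
    """Join a row's values, or return NaN if any value is missing."""
    return sep.join(row.values) if row.notna().all() else np.nan

joined = df.apply(join_row, axis=1)
print(joined)
```

Checking with `notna()` covers both None and NaN, so the guard does not depend on how the missing value happens to be stored.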
Option 2: Convert to a nested array/list
# Convert to a nested array
df.values
np.array(df)
# Convert to a nested list
df.values.tolist()
np.array(df).tolist()
# Join each row, returning NaN for rows with a missing value
pd.DataFrame([" ~ ".join(i) if all(pd.notna(v) for v in i) else np.nan for i in np.array(df).tolist()])
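Put together, Option 2 might look like the following sketch. It assumes the same two-column X/Y frame as above; the result column name `XY` is made up for the example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"X": list("ABCDEF"), "Y": list("ABCDE") + [None]})

# One inner list per row of the DataFrame
rows = df.values.tolist()

# Join each row, falling back to NaN when any value is missing
joined = [
    " ~ ".join(row) if all(pd.notna(v) for v in row) else np.nan
    for row in rows
]

# 'XY' is a hypothetical name for the new column
df["XY"] = joined
print(df)
```

Because the list comprehension walks the rows in order, the resulting list lines up with the DataFrame's rows and can be assigned back as a column directly.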