Why do my data frames create new rows when concatenated?

Marcus :

I am trying to merge two data frames. One has a shape of 1725 rows x 3 columns and the other has 1725 rows x 8 columns.

I merge them with res = pd.concat([dataSet, onehotDataFrame], axis=1) and get a data frame of shape 1810 rows x 11 columns.

The columns seem ok but why are there 85 extra rows in the result?

It is important to note that the original data has shape (1810, 7) and I use

extractedCols = remove_columns(originalDF, remove_from_all)
noDuplacates = extractedCols.drop_duplicates() 

to get a (1725, 4) data frame. I then remove another column before the merge.

jezrael :

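The problem is that the two frames have different index values. With axis=1, pd.concat aligns rows on index labels and keeps the union of them, so a label that exists in only one frame becomes an extra row filled with NaN. Most likely drop_duplicates left the original labels on the rows it kept, while the other frame has a fresh index. Give both frames the same index with DataFrame.reset_index and drop=True: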

df = pd.concat([dataSet.reset_index(drop=True),
                onehotDataFrame.reset_index(drop=True)], axis=1)
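
To see the effect on a small scale, here is a minimal sketch. The frames original, data_set and onehot and their values are made up for illustration and are not the asker's data; the assumption is that the one-hot frame was built with a fresh 0..n-1 index.

```python
import pandas as pd

# Hypothetical stand-in for the original data: 6 rows, 2 of them duplicates.
original = pd.DataFrame({"a": [1, 1, 2, 3, 3, 4],
                         "b": ["x", "x", "y", "z", "z", "w"]})

# drop_duplicates keeps the surviving rows' original labels: 0, 2, 3, 5.
data_set = original.drop_duplicates()

# A frame built separately gets a fresh RangeIndex: 0, 1, 2, 3.
onehot = pd.DataFrame({"c": [10, 20, 30, 40]})

# axis=1 aligns on labels, so the result has the union {0, 1, 2, 3, 5}.
misaligned = pd.concat([data_set, onehot], axis=1)
print(misaligned.shape)  # (5, 3)

# After resetting both indexes the rows line up by position.
aligned = pd.concat([data_set.reset_index(drop=True),
                     onehot.reset_index(drop=True)], axis=1)
print(aligned.shape)  # (4, 3)
```

The extra row in misaligned has NaN in some columns, which is the same thing that shows up as the 85 extra rows in the question.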

Another idea is to set the index of one frame to the index of the other before the concat:

dataSet.index = onehotDataFrame.index
df = pd.concat([dataSet, onehotDataFrame], axis=1)
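
A small sketch of this second approach, again with made-up frames (data_set and onehot are illustrative names and values). It assumes both frames have the same number of rows and that row i of one corresponds to row i of the other, since the index is copied purely by position.

```python
import pandas as pd

# Hypothetical frames: data_set has gaps in its labels, onehot does not.
data_set = pd.DataFrame({"a": [1, 2, 3]}, index=[0, 2, 5])
onehot = pd.DataFrame({"c": [10, 20, 30]})

# Assigning an index of a different length raises ValueError,
# so checking the lengths first gives a clearer failure.
assert len(data_set) == len(onehot)

# Overwrite the labels in place; the row order is unchanged.
data_set.index = onehot.index

result = pd.concat([data_set, onehot], axis=1)
print(result.shape)  # (3, 2)
```

Note that the assignment modifies data_set itself, whereas reset_index(drop=True) returns a new frame and leaves the original untouched.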
