Converting $1B into number

Hotone :

I retrieve a CSV file from the Nasdaq website with a few columns (Ticker, MarketCap...). I use read_csv from pandas to get a dataframe. My problem is that I can't convert the values in the MarketCap column into numbers. This is what the MarketCap column looks like:

MarketCap
$5.54B
$526.85M
$28.41M
nan
nan

Ideally I would like to drop the $ sign and convert B into 1'000'000'000 and M into 1'000'000. The replace/to_replace functions in pandas don't seem to work here. I would like to update my dataframe as follows:

MarketCap
5'540'000'000
526'850'000
28'410'000
nan
nan

(I used ' as the thousands separator just for clarity.) I don't care about the nan values, so they can be dropped/ignored for now.

I tried to use the replace method from pandas as follows:

df['MarketCap'].replace(to_replace=['B', 'M'], value=['*1000000000', '*1000000'], inplace=True)

Unfortunately, since the column holds strings, the above doesn't apply the multiplication.
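To illustrate why this cannot work, here is a minimal sketch on a small hand-made Series (the sample values are copied from the column above; `regex=True` is an addition, not part of the original attempt, so that the letters are matched inside each string). Even when the substitution succeeds, the result is still text that merely contains a `*`, not a computed number:

```python
import pandas as pd

# Sample values copied from the question's MarketCap column
s = pd.Series(['$5.54B', '$526.85M'])

# regex=True (an assumption added here) makes replace substitute inside
# each string instead of only matching whole cell values
replaced = s.replace(to_replace=['B', 'M'],
                     value=['*1000000000', '*1000000'],
                     regex=True)

# Each element is still a str; nothing has been multiplied
print(replaced.tolist())
print(type(replaced[0]))
```

The multiplication has to be done numerically, which means splitting each value into its number and its suffix first.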

jezrael :

Use Series.str.strip with Series.str.extract to split each value into its numeric part and its suffix, then multiply the first column, converted to floats, by the second column, mapped to multipliers with Series.map:

df1 = df['MarketCap'].str.strip('$').str.extract(r'(\d+\.\d+)([BM]+)')
df['MarketCap'] = df1[0].astype(float) * df1[1].map({'B': 1000000000, 'M':1000000})

print(df)
      MarketCap
0  5.540000e+09
1  5.268500e+08
2  2.841000e+07
3           NaN
4           NaN
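
Note that the pattern above requires a decimal point, so a value such as $7B would not match and would end up as NaN. The following is a sketch of a slightly more tolerant variant, not part of the original answer; the sample frame is rebuilt by hand, and the `$7B` row and the `multipliers` name are hypothetical additions for illustration:

```python
import pandas as pd

# Hand-built sample; the last row ('$7B') is a hypothetical value
# without a decimal part, added only to exercise the pattern
df = pd.DataFrame({'MarketCap': ['$5.54B', '$526.85M', '$28.41M', None, '$7B']})

# Only B and M appear in the question; extend this dict if other
# suffixes turn up in the real file
multipliers = {'B': 1_000_000_000, 'M': 1_000_000}

# Group 0: digits with an optional decimal part; group 1: one suffix letter.
# Rows that do not match (including missing values) become NaN in both columns.
parts = df['MarketCap'].str.extract(r'^\$(\d+(?:\.\d+)?)([BM])$')

df['MarketCap'] = parts[0].astype(float) * parts[1].map(multipliers)
print(df)
```

Anchoring the pattern with ^ and $ means malformed values yield NaN rather than a partial match, which makes unexpected formats easier to spot.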
