Having searched around for a bit (e.g. here, here, and here), I'm at a loss. How do I get Python 3.7 to use more than 2 GB of memory?
Info about my setup:
I'm running 64-bit PyCharm (2019.2.6) with 64-bit Python 3.7.5, and I've set -Xms=8g and -Xmx=16g in pycharm.vmoptions (as this suggests setting Xms to half of Xmx). This is running on macOS Catalina 10.15.3, on a machine with 40 GB of RAM (2*4 + 2*16).
What I'm trying to do, and why I want to increase memory use: I'm reading relatively large time series (200-400 columns, around 70,000 rows) into Pandas (v. 0.25.3) DataFrames from .txt files (file sizes range from 0.5 GB to 1.5 GB), and working with 10-15 of these files at a time. As I'm reading in the files, I see the python3.7 process grow to around 2 GB of memory (sometimes 2.05 GB) before usage drops back to a few hundred MB and climbs towards 2 GB again (and repeat).
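For context, the read pattern looks roughly like the sketch below (the directory, file pattern, and tab delimiter are my own placeholders, not the actual code):

```python
import glob

import pandas as pd

# Read each large .txt file into its own DataFrame.
# Path pattern and delimiter are assumptions; adjust to the real data.
frames = {}
for path in sorted(glob.glob("data/*.txt")):
    frames[path] = pd.read_csv(path, sep="\t")
    print(f"read {path}: {frames[path].shape}")
```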
When I'm working with these time series [slicing, plotting, etc.], everything takes a relatively long time (a few minutes). I'm hoping this can be improved by increasing memory usage. However, if I am wrong in my assumption that increased RAM usage by the python process would improve performance, then please let me know.
Thanks to many helpful comments (geckos, jammin0921, Óscar López, and Heap Overflow), it looks like what I was observing was not a limitation of Python, but rather the result of some apparently clever data management by Python/Pandas: once the 12 GB of .txt files had been read into DataFrames, their total size was actually below 2 GB. I checked this by looking at the memory usage of the dataframe (df):

df.memory_usage(True, True).sum()

which gave 1.9 GB.
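For anyone wanting to make the same comparison, this is roughly the check involved; psutil is my own addition here for reading the process's resident memory, and df is assumed to be one of the DataFrames read in above:

```python
import psutil

# Resident set size of the running Python process, in GB.
rss_gb = psutil.Process().memory_info().rss / 1024**3

# In-memory size of the DataFrame, including the index and the
# true size of any object (string) columns (deep=True).
df_gb = df.memory_usage(index=True, deep=True).sum() / 1024**3

print(f"process RSS:    {rss_gb:.2f} GB")
print(f"DataFrame size: {df_gb:.2f} GB")
```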
After testing this further by increasing the amount of data I read in, I do see RAM usage above 2 GB from the python3.7 process.