How does pyplot know what was plotted by pandas.DataFrame(...).cumprod().plot()?

user2109254 :

I am coming from a C# background to learn Python, and I am trying to wrap my head around how things work. There seems to be a lot of "magic" involved in getting the result you want in Python.

For example, take the following code:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

returns = pd.DataFrame(np.random.normal(1.0, 0.03, (100, 10)))
prices = returns.cumprod()
prices.plot()
plt.title('Randomly-generated Prices')
plt.xlabel('Time')
plt.ylabel('Price')
plt.legend(loc=0);

This produces a nifty line plot, very nice. So we pass the output of numpy.random.normal(...) into pandas.DataFrame(...) and assign the result to the variable returns. According to the docstring, that result is a "Two-dimensional size-mutable, potentially heterogeneous tabular data structure with labeled axes (rows and columns)".

Then somehow that data structure has a method, returns.cumprod(), which is called and its result assigned to the variable prices. According to the docstring, it will "Return cumulative cumprod over requested axis."

The next four lines of code (how good would it be if the notebook had line numbers) call functions of the matplotlib.pyplot library.

So my question is:

As these are completely separate objects in separate namespaces, how does matplotlib.pyplot know anything about what pandas.DataFrame(...).cumprod(...).plot() just did?

In C#, an object in one namespace knows nothing about an object in a separate namespace unless it has been told explicitly, via method inputs.

I am just trying to get my head around how this all works, so I can build things with a good design instead of copying and pasting whatever I find on Google that looks like it produces the output I want. For instance, how does scope work with the above code? Let's say I do two plots: how does it know which one to assign the title to?

Thanks for your time/patience,

Kind Regards.

Bruno Mello :

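pandas draws its plots with matplotlib, and pyplot keeps track of a "current" figure and Axes as global state. When you run, for example, plt.legend(), it calls plt.gca() ("get current Axes") to find out where to draw, and since the pandas plot was the last Axes created, that is where the legend goes. If you do, for example: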

pd.DataFrame({'x': [1,2], 'y': [2,3]}).plot.line()

pd.DataFrame({'x': [1,2], 'y': [2,43]}).plot.line()

plt.legend([1])

The legend will be drawn on the second plot, since that was the current Axes when plt.legend() was called.
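To make the "current Axes" idea concrete, here is a minimal sketch (not taken from the original post) that checks that the object returned by DataFrame.plot() is the same one plt.gca() hands back, and shows how to target a specific plot explicitly rather than relying on the global state. The variable names and titles are made up for illustration, and the Agg backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

df_a = pd.DataFrame({'x': [1, 2], 'y': [2, 3]})
df_b = pd.DataFrame({'x': [1, 2], 'y': [2, 43]})

# DataFrame.plot() draws with matplotlib and returns the Axes it drew on.
ax_a = df_a.plot()
same_as_current = ax_a is plt.gca()  # pyplot's current Axes is that same object

# With no ax= argument, a second call opens a new figure, which becomes current.
ax_b = df_b.plot()
plt.title('implicit title')          # pyplot functions act on plt.gca(), i.e. ax_b
title_a_before = ax_a.get_title()    # the first plot was not touched

# Explicit style: keep the Axes objects and call methods on them directly,
# so nothing depends on which plot happens to be current.
ax_a.set_title('explicit title')
title_a_after = ax_a.get_title()
title_b = ax_b.get_title()
```

Coming from C#, the explicit style will probably feel more familiar: the Axes is an ordinary object that you hold a reference to, and you can also pass an existing one in with df.plot(ax=some_axes) to control where pandas draws.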
