[Python Fennel Bean Series] Getting the Number of Rows in a pandas DataFrame

In Python there is often more than one way to accomplish the same goal, and comparing them can be a lot of fun. It reminds me of Kong Yiji, the character in Lu Xun's story who took pride in knowing the four ways to write the character hui in "fennel beans". I would not dare compare myself to Kong Yiji, but here are a few Python "fennel beans" for all coders.

How many rows of data are there in total? This is probably the most basic question in data analysis work.
Here, let's talk about how to get the number of rows of a DataFrame in pandas.
First prepare a DataFrame for testing. This DataFrame has 3 columns named a, b and c:

>>> import numpy as np
>>> import pandas as pd
>>> df = pd.DataFrame({
...     'a': [None, 2, 3], 'b': [4, 5, 6], 'c': [7, 8, 9]})
>>> df
     a  b  c
0  NaN  4  7
1  2.0  5  8
2  3.0  6  9

Fennel bean one: count

SQL has the statement SELECT count(*) FROM some_table, and
pandas objects (both DataFrame and Series) have a count method that can also be used for counting. For example:

>>> df['a'].count()
2

Wait, how can it be 2? The result should be 3! It turns out that count skips NaN values, and column a contains a NaN, so the result is not the number of rows. Let's try column b, which gives the right answer:

>>> df['b'].count()
3
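
The difference between counting non-null values and counting rows can be seen in a small sketch. It rebuilds the test DataFrame from above; the names non_null_counts and missing_in_a are made up here for illustration only.

```python
import pandas as pd

# Same test data as above: column a contains one missing value.
df = pd.DataFrame({'a': [None, 2, 3], 'b': [4, 5, 6], 'c': [7, 8, 9]})

# DataFrame.count() returns a Series with the number of non-null values per column.
non_null_counts = df.count()

# isna().sum() counts the missing values in a single column.
missing_in_a = df['a'].isna().sum()

print(non_null_counts)
print(missing_in_a)
```

Only a column without missing values gives a count equal to the number of rows, which is why column a falls one short.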

However, we cannot guarantee that a column like b, with no null values, will always be available, so we create one ourselves:

>>> df['aa'] = 1
>>> df
     a  b  c  aa
0  NaN  4  7   1
1  2.0  5  8   1
2  3.0  6  9   1
>>> df['aa'].count()
3

Well, mission accomplished so far, but... it is a little ugly.
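
If the helper column is only needed for the count, one possible variation is to add it to a temporary copy with assign, so that the original DataFrame stays unchanged. This is only a sketch, and the name n_rows is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [None, 2, 3], 'b': [4, 5, 6], 'c': [7, 8, 9]})

# assign() returns a new DataFrame with the extra column; df itself is not modified.
n_rows = df.assign(aa=1)['aa'].count()

print(n_rows)            # the row count, since aa has no missing values
print(list(df.columns))  # still only a, b and c
```

It is still a workaround, just one that does not leave an extra column behind.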

Fennel bean two: shape

After painstaking study, I discovered that DataFrame has a shape attribute. It is a property rather than a function, so it is accessed without parentheses. It is fantastic; here is an example:

>>> df.shape
(3, 4)

So, you can get the result like this:

>>> df.shape[0]
3
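
Since shape is a plain (rows, columns) tuple, it can also be unpacked when both numbers are useful. A minimal sketch, with the DataFrame rebuilt here with only its original three columns and with n_rows and n_cols as illustrative names:

```python
import pandas as pd

df = pd.DataFrame({'a': [None, 2, 3], 'b': [4, 5, 6], 'c': [7, 8, 9]})

# shape is a tuple: first the number of rows, then the number of columns.
n_rows, n_cols = df.shape

print(n_rows, n_cols)
```

Unlike count, shape is not affected by the NaN in column a.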

It's awesome, it's amazing.
But shape returns two numbers, and we only need one. Isn’t that a bit wasteful?

Fennel bean three: len

Python has a built-in len function, and generally speaking, built-in things tend to be the more refined choice. Let’s try it:

>>> len(df)
3

So what is behind this len? Check it out in IPython:

In [1]: df.__len__??
Signature: df.__len__() -> int
Source:
    def __len__(self) -> int:
        """
        Returns length of info axis, but here we use the index.
        """
        return len(self.index)

And what about the shape we used above?

In [2]: df.shape??
Type:        property
Source:
# df.shape.fget
@property
def shape(self) -> Tuple[int, int]:
    """
    Return a tuple representing the dimensionality of the DataFrame.
    ......
    """
    return len(self.index), len(self.columns)

Fennel bean four: index

As the two source listings above show, both len(df) and df.shape end up calling len(self.index), so we can call it directly:

>>> len(df.index)
3
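
Because all three expressions rest on the length of the index, they should agree no matter what the columns contain. The sketch below checks this on a frame with a NaN and on an empty frame; the frames and results dictionaries are hypothetical helpers, not part of pandas.

```python
import pandas as pd

frames = {
    'with_nan': pd.DataFrame({'a': [None, 2, 3], 'b': [4, 5, 6]}),
    'empty': pd.DataFrame({'a': [], 'b': []}),
}

results = {}
for name, frame in frames.items():
    # All three expressions are based on the length of the index.
    results[name] = (len(frame), frame.shape[0], len(frame.index))
    print(name, results[name])
```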

Fennel bean five: three more

Just as there are always more mountains beyond the mountains, there are always more fennel beans in Python. Three more:

df.index.size
len(df.axes[0])
df.pipe(len)
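
As a quick check, the sketch below evaluates these three expressions next to the earlier ones on the test DataFrame. The row_counts dictionary is just an illustrative way to line them up.

```python
import pandas as pd

df = pd.DataFrame({'a': [None, 2, 3], 'b': [4, 5, 6], 'c': [7, 8, 9]})

row_counts = {
    'df.shape[0]': df.shape[0],
    'len(df)': len(df),
    'len(df.index)': len(df.index),
    'df.index.size': df.index.size,       # number of elements in the index
    'len(df.axes[0])': len(df.axes[0]),   # axes[0] is the row index
    'df.pipe(len)': df.pipe(len),         # pipe(len) simply calls len(df)
}

for expression, value in row_counts.items():
    print(f'{expression:16} -> {value}')
```

All six give the same number, so the choice between them is mostly a matter of taste and readability.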

Origin blog.csdn.net/mouse2018/article/details/113619187