What are the differences between list(df['column']) and df['column'].to_list()?

igorkf :

When I want a list from a DataFrame column (pandas 1.0.1), I can do:

 df['column'].to_list()

or I can use:

list(df['column'])

Both alternatives work well, but what are the differences between them?
Is one method better than the other?

rafaelc :

list takes an iterable and returns a plain Python list. It is Python's built-in way to convert any iterable into a list.

to_list is a method of the core pandas object classes that converts those objects to plain Python lists. The difference is that it is implemented by the pandas core developers, who may optimize the conversion as they see fit and/or add extra functionality that a plain list(...) call would not provide.

For example, the source code for this method is:

def tolist(self):
    '''(...)
    '''
    if self.dtype.kind in ["m", "M"]:
        return [com.maybe_box_datetimelike(x) for x in self._values]
    elif is_extension_array_dtype(self._values):
        return list(self._values)
    else:
        return self._values.tolist()

This basically means to_list will end up taking one of three paths: for datetime-like dtypes, a normal list comprehension (analogous to list(...), but ensuring the final objects are pandas' datetime types rather than Python's datetime); for extension arrays, a plain list(...) conversion; and for everything else, numpy's tolist() implementation.
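As a rough illustration of the paths described above, here is a minimal sketch that runs both conversions on a small integer column and a small datetime column. The sample data and variable names are made up for the example, and the behaviour may vary between pandas versions:

```python
import pandas as pd

# Hypothetical sample data, not taken from the question.
df = pd.DataFrame({
    "column": [1, 2, 3],
    "when": pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03"]),
})

via_method = df["column"].to_list()   # numeric dtype: goes through numpy's tolist()
via_builtin = list(df["column"])      # built-in list() iterating over the Series

# Both calls produce a plain Python list holding the same values.
print(via_method == via_builtin)
print(type(via_method[0]))

# Datetime-like dtype: the elements come back boxed as pandas Timestamp objects.
dates = df["when"].to_list()
print(type(dates[0]))
```

For everyday use on a column like this, the two calls are interchangeable in terms of the values they return.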

The differences between the latter and python's list(...) can be found in this thread.
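A short sketch of that numpy-level difference, using a made-up array: tolist() converts each element to the nearest built-in Python type, while list(...) simply iterates the array and keeps the numpy scalar objects.

```python
import numpy as np

# Hypothetical sample array; the dtype is set explicitly so it is the same on every platform.
arr = np.array([1, 2, 3], dtype=np.int64)

converted = arr.tolist()  # elements become built-in Python int
iterated = list(arr)      # elements stay numpy.int64 scalars

print(type(converted[0]))
print(type(iterated[0]))
```

The values compare equal either way; only the element types differ, which can matter for things like JSON serialization.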
