Python | Get unique values from a list

https://www.geeksforgeeks.org/python-get-unique-values-list/

If we need to keep the order of the elements, how about this:

used = set()
mylist = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
unique = [x for x in mylist if x not in used and (used.add(x) or True)]
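
The comprehension above packs the membership test and the set update into a single expression. A minimal sketch of the same logic as an explicit loop, assuming Python 3 and hashable elements (the function name unique_in_order is hypothetical, not from the original answer):

```python
def unique_in_order(items):
    """Return the first occurrence of each item, preserving input order."""
    seen = set()      # hashable items that were already emitted
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


mylist = ['nowplaying', 'PBS', 'PBS', 'nowplaying', 'job', 'debate', 'thenandnow']
print(unique_in_order(mylist))
# ['nowplaying', 'PBS', 'job', 'debate', 'thenandnow']
```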

And one more solution using reduce and without the temporary used var.

mylist = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
unique = reduce(lambda l, x: l.append(x) or l if x not in l else l, mylist, [])
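
On Python 3, reduce is no longer a builtin and has to be imported from functools. A sketch of the same one-liner under that assumption:

```python
from functools import reduce  # builtin in Python 2, lives in functools on Python 3

mylist = ['nowplaying', 'PBS', 'PBS', 'nowplaying', 'job', 'debate', 'thenandnow']

# l.append(x) returns None, so "l.append(x) or l" appends and then yields l.
unique = reduce(lambda l, x: l.append(x) or l if x not in l else l, mylist, [])
print(unique)
# ['nowplaying', 'PBS', 'job', 'debate', 'thenandnow']
```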

UPDATE - March, 2019

And a 3rd solution, which is a neat one, but kind of slow since .index is O(n), which makes the whole comprehension O(n^2).

mylist = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
unique = [x for i, x in enumerate(mylist) if i == mylist.index(x)]
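
list.index returns the position of the first occurrence of a value, so i == mylist.index(x) only holds the first time that value appears. A small sketch that makes this visible (Python 3 assumed, with a shortened example list):

```python
mylist = ['PBS', 'job', 'PBS', 'job', 'debate']

# For each position i, compare it with the index of the first occurrence of x.
for i, x in enumerate(mylist):
    print(i, x, mylist.index(x), i == mylist.index(x))

unique = [x for i, x in enumerate(mylist) if i == mylist.index(x)]
print(unique)  # ['PBS', 'job', 'debate']
```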

UPDATE - Oct, 2016

Another solution with reduce, but this time without .append, which makes it more readable and easier to understand.

mylist = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
unique = reduce(lambda l, x: l+[x] if x not in l else l, mylist, [])
# which can also be written as:
unique = reduce(lambda l, x: l if x in l else l+[x], mylist, [])

NOTE: Keep in mind that the more human-readable we get, the less performant the script is.

import timeit

setup = "mylist = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']"

# Thanks to Michael for pointing out that we can get faster with set()
timeit.timeit('[x for x in mylist if x not in used and (used.add(x) or True)]', setup='used = set();'+setup)
0.4188511371612549

timeit.timeit('[x for x in mylist if x not in used and (used.append(x) or True)]', setup='used = [];'+setup)
0.6157128810882568

timeit.timeit('reduce(lambda l, x: l.append(x) or l if x not in l else l, mylist, [])', setup=setup)
1.8778090476989746

timeit.timeit('reduce(lambda l, x: l+[x] if x not in l else l, mylist, [])', setup=setup)
2.13108491897583

timeit.timeit('reduce(lambda l, x: l if x in l else l+[x], mylist, [])', setup=setup)
2.207760810852051

timeit.timeit('[x for i, x in enumerate(mylist) if i == mylist.index(x)]', setup=setup)
2.3621110916137695
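
The figures above come from the author's machine and interpreter, so absolute numbers will differ elsewhere. Note also that in the first two snippets used is created once in setup, so only the first iteration actually adds elements. Below is a sketch for re-running the comparison on Python 3, assuming a reduced iteration count (number=100000, chosen here only to keep the run short) and a fresh used container on every call so that each call does the full work. The function names are hypothetical.

```python
import timeit
from functools import reduce

MYLIST = ['nowplaying', 'PBS', 'PBS', 'nowplaying', 'job', 'debate', 'thenandnow']


def with_set():
    used = set()  # fresh container on every call
    return [x for x in MYLIST if x not in used and (used.add(x) or True)]


def with_list():
    used = []
    return [x for x in MYLIST if x not in used and (used.append(x) or True)]


def with_reduce_append():
    return reduce(lambda l, x: l.append(x) or l if x not in l else l, MYLIST, [])


def with_reduce_concat():
    return reduce(lambda l, x: l + [x] if x not in l else l, MYLIST, [])


def with_index():
    return [x for i, x in enumerate(MYLIST) if i == MYLIST.index(x)]


CANDIDATES = [with_set, with_list, with_reduce_append, with_reduce_concat, with_index]

for func in CANDIDATES:
    # timeit accepts a zero-argument callable as the statement to time.
    seconds = timeit.timeit(func, number=100000)
    print('{:<20} {:.4f}s'.format(func.__name__, seconds))
```

With a list of only seven elements the differences are dominated by call overhead, so a longer input would be needed to see how the approaches scale.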

ANSWERING COMMENTS

@monica asked a good question: "how is this working?" For everyone having trouble figuring it out, I will try to give a deeper explanation of how this works and what sorcery is happening here ;)

So she first asked:

I try to understand why unique = [used.append(x) for x in mylist if x not in used] is not working.

Well, it's actually working:

>>> used = []
>>> mylist = [u'nowplaying', u'PBS', u'PBS', u'nowplaying', u'job', u'debate', u'thenandnow']
>>> unique = [used.append(x) for x in mylist if x not in used]
>>> print used
[u'nowplaying', u'PBS', u'job', u'debate', u'thenandnow']
>>> print unique
[None, None, None, None, None]

The problem is that we are just not getting the desired results inside the unique variable, but only inside the used variable. This is because during the list comprehension .append modifies the used variable and returns None.

So in order to get the results into the unique variable, and still use the same logic with .append(x) if x not in used, we need to move this .append call to the right side of the list comprehension and just return x on the left side.

But if we are too naive and just go with:

>>> unique = [x for x in mylist if x not in used and used.append(x)]
>>> print unique
[]

We will get nothing in return.

Again, this is because the .append method returns None, which gives our logical expression the following form:

x not in used and None

This will always:

  1. evaluate to False when x is in used,
  2. evaluate to None when x is not in used.

And in both cases (False/None), the result is treated as a falsy value, so we get an empty list.

But why does this evaluate to None when x is not in used, someone may ask.

Well, it's because this is how Python's short-circuit operators work.

The expression x and y first evaluates x; if x is false, its value is returned; otherwise, y is evaluated and the resulting value is returned.

So when x not in used is True, the next part of the expression (used.append(x)) is evaluated and its value (None) is returned.
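
A small sketch of that rule, assuming Python 3: and returns one of its operands rather than a plain boolean, which is why the naive filter condition ends up as None.

```python
used = []

# Left operand is falsy: it is returned and the right-hand side never runs.
skipped = False and used.append('a')
print(skipped, used)     # False []

# Left operand is truthy: the right-hand side runs and its value is returned.
appended = True and used.append('a')
print(appended, used)    # None ['a']
```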

But that's what we want in order to get the unique elements from a list with duplicates: we want to .append them to a new list only when we come across them for the first time.

So we really want to evaluate used.append(x) only when x is not in used. If there is a way to turn this None value into a truthy one, we will be fine, right?

Well, yes, and this is where the 2nd type of short-circuit operator comes into play.

The expression x or y first evaluates x; if x is true, its value is returned; otherwise, y is evaluated and the resulting value is returned.

We know that .append(x) will always be falsy, so if we just add an or next to it, we will always get the next part. That's why we write:

x not in used and (used.append(x) or True)

so we can evaluate used.append(x) and get True as a result, only when the first part of the expression (x not in used) is True.
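
Putting both operators together, here is a sketch (Python 3 assumed) that evaluates the condition once for a new element and once for an element that was already seen:

```python
used = set()

# First time we see 'PBS': the membership test is True, add() runs and
# returns None, and "None or True" turns that into True.
first = 'PBS' not in used and (used.add('PBS') or True)

# Second time: the membership test is False, so add() is skipped entirely.
second = 'PBS' not in used and (used.add('PBS') or True)

print(first, second, used)  # True False {'PBS'}
```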

A similar pattern can be seen in the 2nd approach, with reduce.

(l.append(x) or l) if x not in l else l
# similar to the above, but maybe more readable
# we return l unchanged when x is in l
# we append x to l and return l when x is not in l
l if x in l else (l.append(x) or l)

where we:

  1. Append x to l and return that l when x is not in l. Thanks to the or operator, .append is evaluated and l is returned after that.
  2. Return l untouched when x is in l
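
The two cases above can also be sketched with a named function instead of a lambda, which makes each branch explicit (Python 3 assumed; append_if_new is a hypothetical helper name, not part of the original answer):

```python
from functools import reduce


def append_if_new(acc, x):
    """Reducer step: append x to acc unless it is already there, then return acc."""
    if x not in acc:
        acc.append(x)
    return acc


mylist = ['nowplaying', 'PBS', 'PBS', 'nowplaying', 'job', 'debate', 'thenandnow']
unique = reduce(append_if_new, mylist, [])
print(unique)
# ['nowplaying', 'PBS', 'job', 'debate', 'thenandnow']
```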


Given a list, print all the unique numbers in any order.

Examples:

Input : 10 20 10 30 40 40
Output : 10 20 30 40 

Input : 1 2 1 1 3 4 3 3 5 
Output : 1 2 3 4 5 

Method 1 : Traversal of list

Using traversal, we can visit every element in the list and check whether it is already in unique_list; if it is not, we append it. This is done with one for loop and an if statement that checks whether the value is already in the unique list. That membership check is itself a linear scan, so it is equivalent to a second, nested for loop.

# Python program to get unique values from a list
# using traversal

# function to get unique values
def unique(list1):

    # initialize an empty list
    unique_list = []

    # traverse all elements
    for x in list1:
        # check if it exists in unique_list or not
        if x not in unique_list:
            unique_list.append(x)

    # print the list on one line, separated by spaces
    for x in unique_list:
        print(x, end=" ")


# driver code
list1 = [10, 20, 10, 30, 40, 40]
print("the unique values from 1st list is")
unique(list1)

list2 = [1, 2, 1, 1, 3, 4, 3, 3, 5]
print("\nthe unique values from 2nd list is")
unique(list2)

Output:

the unique values from 1st list is
10 20 30 40 
the unique values from 2nd list is
1 2 3 4 5

Method 2 : Using Set

Using Python's set(), we can easily get the unique values. Insert the values of the list into a set: a set stores each value only once, even if it is inserted more than once. After inserting all the values with list_set = set(list1), convert the set back to a list to print it.

# Python program to get unique values from a list
# using set

# function to get unique values
def unique(list1):

    # insert the list into the set
    list_set = set(list1)

    # convert the set to a list
    unique_list = list(list_set)

    # print the list on one line, separated by spaces
    for x in unique_list:
        print(x, end=" ")


# driver code
list1 = [10, 20, 10, 30, 40, 40]
print("the unique values from 1st list is")
unique(list1)

list2 = [1, 2, 1, 1, 3, 4, 3, 3, 5]
print("\nthe unique values from 2nd list is")
unique(list2)

Output:

the unique values from 1st list is
40 10 20 30 
the unique values from 2nd list is
1 2 3 4 5
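
The first output line is in a different order from the input because a set does not keep insertion order. If order matters, one possible sketch is dict.fromkeys. This is a swapped-in technique, not part of the original article, and it assumes Python 3.7+ (where dict preserves insertion order) and hashable values. The name unique_ordered is hypothetical.

```python
def unique_ordered(list1):
    """Deduplicate hashable values while keeping first-seen order."""
    # dict keys are unique and, from Python 3.7 on, keep insertion order.
    return list(dict.fromkeys(list1))


print(unique_ordered([10, 20, 10, 30, 40, 40]))
# [10, 20, 30, 40]
print(unique_ordered([1, 2, 1, 1, 3, 4, 3, 3, 5]))
# [1, 2, 3, 4, 5]
```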

Method 3 : Using numpy.unique

With numpy, the unique elements of an array can also be obtained. First convert the list to an array with x = numpy.array(list1), then call numpy.unique(x), which returns the sorted unique values of the array.

# Python program to get unique values from a list
# using numpy.unique
import numpy as np

# function to get unique values
def unique(list1):
    # convert the list to a numpy array
    x = np.array(list1)
    # np.unique returns the sorted unique values
    print(np.unique(x))


# driver code
list1 = [10, 20, 10, 30, 40, 40]
print("the unique values from 1st list is")
unique(list1)

list2 = [1, 2, 1, 1, 3, 4, 3, 3, 5]
print("\nthe unique values from 2nd list is")
unique(list2)

Output:

the unique values from 1st list is
[10 20 30 40]

the unique values from 2nd list is
[1 2 3 4 5]

Reposted from blog.csdn.net/weixin_39833509/article/details/105174941