set objects

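Today I'll go through the key points of Python's set type. Please note that all of the information below is taken from the slides of a company training course (Nagiza F. Samatova, NC State Univ. All rights). The main points are: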

  • Mutable unordered collection of objects
  • Items: immutable (hashable), heterogeneous type & unique
  • Operations and methods: create, update, access, query, remove
  • Traversal: by item with in-operator

A set is a mutable collection of immutable objects: the set itself can change, but each item in it cannot.

set_obj = {1,2,3}
tuple_set_obj = {(1,2,3), (4,5,6)}
print('set_obj: {}'.format(set_obj))
print('tuple_set_obj: {}'.format(tuple_set_obj))
list_set_obj = {[1,2,3]}
print('list_set_obj: {}'.format(list_set_obj))

A set cannot contain list objects, because a list is mutable and therefore not hashable.
Items must be hashable (e.g. int, float, str, or a tuple of hashable items)

# output:
set_obj: {1, 2, 3}
tuple_set_obj: {(1, 2, 3), (4, 5, 6)}

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-563-4e03cceaee62> in <module>
      3 print('set_obj: {}'.format(set_obj))
      4 print('tuple_set_obj: {}'.format(tuple_set_obj))
----> 5 list_set_obj = {[1,2,3]}
      6 print('list_set_obj: {}'.format(list_set_obj))

TypeError: unhashable type: 'list'
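The hashability rule above can be checked directly with the built-in hash(). The following is a minimal sketch; is_hashable is a hypothetical helper name introduced here for illustration, not something from the training slides. It also shows that frozenset, the immutable counterpart of set, can be used when a set of sets is needed.

```python
def is_hashable(obj):
    """Return True if obj could be stored in a set (hypothetical helper)."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


# A mix of hashable and unhashable objects.
candidates = [1, 2.5, 'text', (1, 2), [1, 2], {'a': 1}, {1, 2},
              frozenset({1, 2}), (1, [2])]
results = {repr(c): is_hashable(c) for c in candidates}
for name, ok in results.items():
    print('{:<20} hashable: {}'.format(name, ok))

# A plain set cannot be an item of another set, but a frozenset can.
nested = {frozenset({1, 2}), frozenset({3})}
print('nested: {}'.format(nested))
```

Note that a tuple is only hashable when everything inside it is hashable, which is why (1, [2]) is rejected.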

Set: No Duplicate Objects
For background, see the article "set去重原理" (how set deduplication works).

Each set item is being hashed:
• i.e., mapped to a hash value
• for fast membership checks

Duplicates (equal objects) get the same hash value and compare equal
• i.e., only one copy is kept

set_obj = {1,1,1}
tuple_set_obj = {(1,2,3), (1,2,3)}
print('set_obj: {}'.format(set_obj))
print('tuple_set_obj: {}'.format(tuple_set_obj))

#output:
set_obj: {1}
tuple_set_obj: {(1, 2, 3)}
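Deduplication is based on equality together with the hash value, not on type. As a small sketch of that point (the variable names are only illustrative): 1, 1.0 and True compare equal and share a hash value in Python, so a set treats them as the same item, while 'a' and 'A' are different strings and both survive.

```python
# Equal objects with equal hashes collapse into a single item.
mixed = {1, 1.0, True}
print('mixed has {} item(s)'.format(len(mixed)))
print('same hash: {}'.format(hash(1) == hash(1.0) == hash(True)))

# Objects that are not equal are kept separately.
letters = {'a', 'A'}
print('letters has {} item(s)'.format(len(letters)))
```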

Set is an unordered collection
Therefore a set does not support random access by index.
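A short sketch of what this means in practice: indexing a set raises a TypeError. If a positional view is needed, one option is to build a sorted list first; if any single item will do, next(iter(s)) returns one without promising which.

```python
s = {'a', 'b', 'c'}

# Indexing is not defined for sets.
try:
    s[0]
except TypeError as exc:
    error_message = str(exc)
    print('TypeError: {}'.format(error_message))

# Option 1: impose an order explicitly, then index the resulting list.
first_sorted = sorted(s)[0]
print('first in sorted order: {}'.format(first_sorted))

# Option 2: take an arbitrary item (which one is not guaranteed).
some_item = next(iter(s))
print('arbitrary item is in s: {}'.format(some_item in s))
```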

CREATE A SET OBJECT

empty

s = set()    # empty set
s = {}       # careful: this creates an empty dict, not a set
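The difference between the two spellings is easy to verify with type(). This is only a quick check; the variable names are illustrative.

```python
empty_braces = {}
empty_set = set()
non_empty = {1}

print(type(empty_braces).__name__)  # dict
print(type(empty_set).__name__)     # set
print(type(non_empty).__name__)     # set: braces with items and no colons
```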

via coercion of another object (type conversion)

s = {any_hashable_object}                  # literal: the object itself becomes the single item
s = set(any_iterable_of_hashable_objects)  # constructor: each element of the iterable becomes an item

str to set

str_obj = 'python'
set_via_symbol = {str_obj}
set_via_constructor = set(str_obj)
print('set_via_symbol: {}'.format(set_via_symbol))
print('set_via_constructor: {}'.format(set_via_constructor))

# output:
set_via_symbol: {'python'}
set_via_constructor: {'h', 'o', 'n', 't', 'y', 'p'}

tuple to set

tuple_obj = ('one', 'two')
set_via_symbol = {tuple_obj}
set_via_constructor = set(tuple_obj)
print('set_via_symbol: {}'.format(set_via_symbol))
print('set_via_constructor: {}'.format(set_via_constructor))

# output:
set_via_symbol: {('one', 'two')}
set_via_constructor: {'two', 'one'}

list to set
Note that a list cannot be wrapped in braces ({list_obj}) to make a set, because a set cannot contain mutable objects.

list_obj = ['one', 'two']
# set_via_symbol = {list_obj}  # would raise TypeError: unhashable type: 'list'
set_via_constructor = set(list_obj)
# print('set_via_symbol: {}'.format(set_via_symbol))
print('set_via_constructor: {}'.format(set_via_constructor))

# output:
set_via_constructor: {'two', 'one'}

dict to set
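Note that a dict cannot be wrapped in braces ({dict_obj}) to make a set, because a set cannot contain mutable objects. set(dict_obj) iterates over the dict, so the resulting set contains only its keys.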

dict_obj = {'script': 'python', 'version': '3.8'}
# set_via_symbol = {dict_obj}  # would raise TypeError: unhashable type: 'dict'
set_via_constructor = set(dict_obj)
# print('set_via_symbol: {}'.format(set_via_symbol))
print('set_via_constructor: {}'.format(set_via_constructor))

# output:
set_via_constructor: {'version', 'script'}
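Since the constructor only sees the keys, the values or the key/value pairs have to be requested explicitly. A minimal sketch using the same dict_obj as above (values() and items() are standard dict methods; the values here happen to be hashable strings, which is an assumption this example relies on):

```python
dict_obj = {'script': 'python', 'version': '3.8'}

keys = set(dict_obj)             # same as set(dict_obj.keys())
values = set(dict_obj.values())  # works only if every value is hashable
items = set(dict_obj.items())    # set of (key, value) tuples

print('keys: {}'.format(keys))
print('values: {}'.format(values))
print('items: {}'.format(items))
```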

Set Operations

UNION, INTERSECTION, DIFFERENCE, SYMMETRIC DIFFERENCE


set_obj = {1,2,3,4,5,6,7,8}
set_obj2 = {1,3,5,7,9}
set_difference = set_obj - set_obj2
set_union = set_obj | set_obj2
set_intersection = set_obj & set_obj2
set_symmetric_difference = set_obj ^ set_obj2
print('set_difference: {}'.format(set_difference))
print('set_union: {}'.format(set_union))
print('set_intersection: {}'.format(set_intersection))
print('set_symmetric_difference: {}'.format(set_symmetric_difference))


# output:
set_difference: {8, 2, 4, 6}
set_union: {1, 2, 3, 4, 5, 6, 7, 8, 9}
set_intersection: {1, 3, 5, 7}
set_symmetric_difference: {2, 4, 6, 8, 9}
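Each of the four operators above also has a method form. As a sketch of how the two relate: the methods accept any iterable as their argument, while the operators require both operands to be sets, and the augmented operators (such as |=) update a set in place instead of building a new one. The names a, b and odd_list are only illustrative.

```python
a = {1, 2, 3, 4, 5, 6, 7, 8}
b = {1, 3, 5, 7, 9}

# Operator and method forms give the same results.
same_results = (
    a.difference(b) == a - b
    and a.union(b) == a | b
    and a.intersection(b) == a & b
    and a.symmetric_difference(b) == a ^ b
)
print('operator and method forms agree: {}'.format(same_results))

# Methods accept any iterable; operators do not.
odd_list = [1, 3, 5, 7, 9]
from_list = a.intersection(odd_list)
try:
    a & odd_list
    operator_accepts_list = True
except TypeError:
    operator_accepts_list = False
print('a.intersection(list): {}'.format(from_list))
print('a & list allowed: {}'.format(operator_accepts_list))

# In-place union keeps the same object and changes its content.
c = a.copy()
c_id_before = id(c)
c |= b
print('in-place union: {} (same object: {})'.format(c, id(c) == c_id_before))
```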

Sets: Other Methods

s = {1, 3, 7, 9}
s2 = {3, 7, 8, 12}

Each operation below is applied to the original s. Output is the returned value or, for remove and add, the resulting set:

Operation            Description                                            Output
s.remove(7)          Remove element                                         {1, 3, 9}
s.copy()             Return copy of the set                                 {1, 3, 7, 9}
s.add(10)            Add given element to the set                           {1, 3, 7, 9, 10}
s.isdisjoint(s2)     Return True if sets have NO common elements            False
s.intersection(s2)   Return new set with the elements common to both sets   {3, 7}

s = {1,2,3,4,5,7}
print('original s:\n id: {}\t set content:{}'.format(id(s), s))
print('copy s:\n id: {}\t set content:{}'.format(id(s.copy()), s.copy()))
s.remove(2)
print('after remove 2:\nid: {}\t set content:{}'.format(id(s), s))
s.add(8)
print('after add 8:\nid: {}\t set content:{}'.format(id(s), s))

Sets are mutable: their value can change
After remove and add, the id of the set stays the same
copy returns a different set object

# output:
original s:
 id: 2239126441536	 set content:{1, 2, 3, 4, 5, 7}
copy s:
 id: 2239126982720	 set content:{1, 2, 3, 4, 5, 7}
after remove 2:
id: 2239126441536	 set content:{1, 3, 4, 5, 7}
after add 8:
id: 2239126441536	 set content:{1, 3, 4, 5, 7, 8}
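One detail worth adding to the remove example: remove raises a KeyError when the element is not in the set, whereas discard silently does nothing, and pop removes and returns an arbitrary element. A small sketch of the three, using made-up values:

```python
s = {1, 3, 7, 9}

# remove: raises KeyError if the element is missing.
try:
    s.remove(100)
except KeyError as exc:
    missing = exc.args[0]
    print('remove failed, no such element: {}'.format(missing))

# discard: no error, whether or not the element is present.
s.discard(100)
s.discard(7)
print('after discard: {}'.format(s))

# pop: removes and returns some element (which one is not guaranteed).
popped = s.pop()
print('popped {}, remaining {} item(s)'.format(popped, len(s)))
```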

QUERY AND ITERATE
in operator

● to check if an item is in a set
● Does NOT require traversal of the entire set (fast operation)

s = {
    
    'python'} 
print('python is in set: {}'.format('python' in s))
print('java is in set: {}'.format('java' in s))

# output:
python is in set: True
java is in set: False
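To get a feel for the "no traversal" claim, a rough timing comparison against a list can help. This is only a sketch using the standard timeit module; the collection size and repeat count are arbitrary choices made here, and the absolute numbers will vary by machine.

```python
import timeit

n = 100000
as_list = list(range(n))
as_set = set(as_list)
target = n - 1  # worst case for the list: the last element

# The list is scanned item by item; the set is looked up by hash.
list_time = timeit.timeit(lambda: target in as_list, number=100)
set_time = timeit.timeit(lambda: target in as_set, number=100)

print('list membership: {:.6f} s'.format(list_time))
print('set membership:  {:.6f} s'.format(set_time))
```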

Unsupported Operations

Concatenation with (+)

s1 = {1,2,3}
s2 = {4,5}
s = s1 + s2
# output:
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-590-a84a0ef792a5> in <module>
      1 s1 = {1,2,3}
      2 s2 = {4,5}
----> 3 s = s1 + s2

TypeError: unsupported operand type(s) for +: 'set' and 'set'

Replication with (*)

s1 = {1,2,3}
s = s1 * 2
# output:
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-591-75b9dae5da0b> in <module>
      1 s1 = {1,2,3}
----> 2 s = s1 * 2

TypeError: unsupported operand type(s) for *: 'set' and 'int'
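Where + would be used to join two lists, the set equivalent is the union operator | (or the union method). Replication has no real counterpart, since repeated items collapse into one anyway. A brief sketch with the same s1 and s2:

```python
s1 = {1, 2, 3}
s2 = {4, 5}

# Use union instead of +.
combined = s1 | s2
also_combined = s1.union(s2)
print('combined: {}'.format(combined))

# Repeating the items and rebuilding a set changes nothing.
repeated = set(list(s1) * 2)
print('repeated equals s1: {}'.format(repeated == s1))
```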

Set Traversal

Approaches: by item

s1 = {1,2,3}
for item in s1:
    print(item)
# output:
1
2
3
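The output above happens to come out in ascending order, but because a set is unordered, the iteration order should not be relied on. When a predictable order matters, one common approach is to iterate over sorted(...) instead. A minimal sketch with illustrative data:

```python
words = {'pear', 'apple', 'fig'}

# Alphabetical traversal.
ordered = sorted(words)
for word in ordered:
    print(word)

# Traversal by a custom key, here the word length.
by_length = sorted(words, key=len)
for index, word in enumerate(by_length):
    print('{}: {}'.format(index, word))
```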

PERFORMANCE

Optimized for

● Store a unique collection of objects
● Add new items
● Remove / discard any item: fast for sets compared to lists / tuples
● Remove duplicate items: fast for sets compared to lists
● Check membership using in operator: faster compared to lists / tuples
● Data wrangling operations fast for sets compared to lists / tuples: Unions, Intersections, Symmetric difference
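As an illustration of the duplicate-removal point, converting a list to a set and back drops repeated items in one step, at the cost of losing the original order. If the order matters, dict.fromkeys is a commonly used alternative (dicts keep insertion order in Python 3.7 and later). The sample data is made up.

```python
items = [3, 1, 3, 2, 1]

# Fast deduplication; the resulting order is not guaranteed.
unique_unordered = list(set(items))
print('via set: {}'.format(sorted(unique_unordered)))

# Deduplication that keeps the first occurrence of each item in order.
unique_ordered = list(dict.fromkeys(items))
print('via dict.fromkeys: {}'.format(unique_ordered))
```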


Reprinted from blog.csdn.net/wumingxiaoyao/article/details/109003456