Sets: what you need to know for machine learning

Creating sets in Python

A set is made up of elements. Its two defining properties are that the elements are unordered and that each element is unique.

Use {} to create a set

>>> lang={'Python','C','Java'}
>>> lang
{'C', 'Python', 'Java'}
>>> A={1,2,3,4,5}
>>> A
{1, 2, 3, 4, 5}
>>>

Set elements are unique

Because set elements are unique, even if there are duplicate elements when creating the set, only one copy will be retained.

>>> A={1,1,2,2,3,3,3}
>>> A
{1, 2, 3}
>>>

Use set() to create a set

The set() function accepts at most one argument, which must be an iterable such as a string, a list, a tuple, or a dictionary. Each item of the iterable becomes an element of the set (for a dictionary, the keys).

>>> A=set('Deepmind')
>>> A
{'e', 'n', 'm', 'd', 'D', 'p', 'i'}
>>> A=set(['Python','Java','C'])
>>> A
{'C', 'Python', 'Java'}
>>>
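As a small sketch of the point above, set() can be given other kinds of iterables as well. The variable names here are illustrative and do not come from the original examples; note that a dictionary contributes only its keys.

```python
# set() takes a single iterable argument; the names below are illustrative.
from_tuple = set(('Python', 'C', 'Java', 'C'))    # the duplicate 'C' collapses
from_dict = set({'Python': 1991, 'Java': 1995})   # only the keys are kept
from_range = set(range(3))

print(from_tuple == {'Python', 'C', 'Java'})  # True
print(from_dict == {'Python', 'Java'})        # True
print(from_range == {0, 1, 2})                # True
```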

Cardinality of a set

The cardinality of a set is the number of elements it contains. In Python it can be obtained with the len() function.

>>> A={1,3,5,7,9}
>>> len(A)
5
>>>

To create an empty set, use set()

Writing {} creates an empty dictionary, not an empty set. To create an empty set you must call set().

>>> empty_dict={}
>>> print("Type =",type(empty_dict))
Type = <class 'dict'>
>>> empty_set=set()
>>> print("Type =",type(empty_set))
Type = <class 'set'>
>>>

The code is as follows:

empty_dict = {}                      # this creates an empty dictionary
print("Type = ", type(empty_dict))
empty_set = set()                    # this creates an empty set
print("Type = ", type(empty_set))

The running results are as follows:

[Running] python -u "c:\Users\a-xiaobodou\OneDrive - Microsoft\Projects\tempCodeRunnerFile.py"
Type =  <class 'dict'>
Type =  <class 'set'>

[Done] exited with code=0 in 0.286 seconds

Big data and set applications

A common use of sets is removing duplicate data from a list.

fruits1 = ['apple', 'orange', 'apple', 'banana', 'orange']
x = set(fruits1)                # convert the list to a set
fruits2 = list(x)               # convert the set back to a list
print("Original list fruits1 = ", fruits1)
print("New list fruits2 = ", fruits2)

The running results are as follows:

[Running] python -u "c:\Users\a-xiaobodou\OneDrive - Microsoft\Projects\ch11_2.py"
Original list fruits1 =  ['apple', 'orange', 'apple', 'banana', 'orange']
New list fruits2 =  ['apple', 'orange', 'banana']

[Done] exited with code=0 in 0.207 seconds
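Because a set is unordered, list(set(...)) is not guaranteed to return the items in their original order. If the order matters, one common alternative (not part of the original example) is dict.fromkeys(), which relies on dictionaries preserving insertion order in Python 3.7 and later. A minimal sketch:

```python
fruits1 = ['apple', 'orange', 'apple', 'banana', 'orange']

# dict keys are unique and keep first-seen order (Python 3.7+),
# so this removes duplicates without reordering the list.
fruits2 = list(dict.fromkeys(fruits1))

print(fruits2)  # ['apple', 'orange', 'banana']
```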

Set operations

Python operator   Description            Method
&                 Intersection           intersection()
|                 Union                  union()
-                 Difference             difference()
^                 Symmetric difference   symmetric_difference()

Intersection

Given two sets A and B, the intersection is the set of elements that belong to both A and B.

The mathematical symbol for intersection is \cap. In Python the intersection operator is &, and the intersection() method does the same job.

There are two summer camps, one for mathematics and one for physics. This program lists the members who attend both camps.

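math = {'Kevin', 'Peter', 'Eric'}       # members of the math summer camp
physics = {'Peter', 'Nelson', 'Tom'}    # members of the physics summer camp
both1 = math & physics
print("Members attending both the math and physics summer camps ",both1)
both2 = math.intersection(physics)
print("Members attending both the math and physics summer camps ",both2)

The running results are as follows:

[Running] python -u "c:\Users\a-xiaobodou\OneDrive - Microsoft\Projects\tempCodeRunnerFile.py"
Members attending both the math and physics summer camps  {'Peter'}
Members attending both the math and physics summer camps  {'Peter'}

[Done] exited with code=0 in 0.663 seconds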

Union

Given two sets A and B, the union is the set of all elements that belong to A, to B, or to both.

The mathematical symbol for union is \cup. In Python the union operator is |, and the union() method does the same job.

There are two summer camps, one for mathematics and one for physics. This program lists the members who attend the mathematics camp or the physics camp.

math = {'Kevin', 'Peter', 'Eric'}       # members of the math summer camp
physics = {'Peter', 'Nelson', 'Tom'}    # members of the physics summer camp
allmember1 = math | physics
print("Members attending the math or physics summer camp ",allmember1)
allmember2 = math.union(physics)
print("Members attending the math or physics summer camp ",allmember2)

The running results are as follows:

[Running] python -u "c:\Users\a-xiaobodou\OneDrive - Microsoft\Projects\tempCodeRunnerFile.py"
Members attending the math or physics summer camp  {'Kevin', 'Eric', 'Peter', 'Nelson', 'Tom'}
Members attending the math or physics summer camp  {'Kevin', 'Eric', 'Peter', 'Nelson', 'Tom'}

[Done] exited with code=0 in 1.882 seconds

Difference

Given two sets A and B, the difference A-B is the set of elements that belong to A but not to B, and the difference B-A is the set of elements that belong to B but not to A.

In Python the difference operator is -, and the difference() method does the same job.

There are two summer camps, one for mathematics and one for physics. This program lists the members who attend the mathematics camp but not the physics camp, and then the members who attend the physics camp but not the mathematics camp.

math = {'Kevin', 'Peter', 'Eric'}       # members of the math summer camp
physics = {'Peter', 'Nelson', 'Tom'}    # members of the physics summer camp
math_only1 = math - physics
print("Members attending the math camp but not the physics camp ",math_only1)
math_only2 = math.difference(physics)
print("Members attending the math camp but not the physics camp ",math_only2)
physics_only1 = physics - math
print("Members attending the physics camp but not the math camp ",physics_only1)
physics_only2 = physics.difference(math)
print("Members attending the physics camp but not the math camp ",physics_only2)

The running results are as follows:

[Running] python -u "c:\Users\a-xiaobodou\OneDrive - Microsoft\Projects\tempCodeRunnerFile.py"
Members attending the math camp but not the physics camp  {'Kevin', 'Eric'}
Members attending the math camp but not the physics camp  {'Kevin', 'Eric'}
Members attending the physics camp but not the math camp  {'Nelson', 'Tom'}
Members attending the physics camp but not the math camp  {'Nelson', 'Tom'}

[Done] exited with code=0 in 0.603 seconds

Symmetric difference

Given two sets A and B, the symmetric difference is the set of elements that belong to A or to B but not to both.

In Python the symmetric difference operator is ^, and the symmetric_difference() method does the same job.

There are two summer camps, one for mathematics and one for physics. This program lists the members who attend exactly one of the two camps.

math = {'Kevin', 'Peter', 'Eric'}       # members of the math summer camp
physics = {'Peter', 'Nelson', 'Tom'}    # members of the physics summer camp
math_sydi_physics1 = math ^ physics
print("Members attending only one of the math and physics summer camps ",math_sydi_physics1)
math_sydi_physics2 = math.symmetric_difference(physics)
print("Members attending only one of the math and physics summer camps ",math_sydi_physics2)

The running results are as follows:

[Running] python -u "c:\Users\a-xiaobodou\OneDrive - Microsoft\Projects\tempCodeRunnerFile.py"
Members attending only one of the math and physics summer camps  {'Nelson', 'Tom', 'Eric', 'Kevin'}
Members attending only one of the math and physics summer camps  {'Nelson', 'Tom', 'Eric', 'Kevin'}

[Done] exited with code=0 in 0.37 seconds
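The four operations are related to each other. As a quick check, using the same camp data under the illustrative names math_camp and physics_camp, the symmetric difference equals the union minus the intersection, and also the union of the two one-sided differences:

```python
math_camp = {'Kevin', 'Peter', 'Eric'}
physics_camp = {'Peter', 'Nelson', 'Tom'}

sym_diff = math_camp ^ physics_camp
# Union minus intersection
via_union = (math_camp | physics_camp) - (math_camp & physics_camp)
# Union of the two one-sided differences
via_diffs = (math_camp - physics_camp) | (physics_camp - math_camp)

print(sym_diff == via_union == via_diffs)  # True
print(sorted(sym_diff))  # ['Eric', 'Kevin', 'Nelson', 'Tom']
```

Sorting the result before printing gives a stable display order, since the set itself has none.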

Subsets, supersets and complements

The content of set A is {1, 2, 3, 4, 5, 6}, and the content of set B is {1, 3, 5}.

Subset

All elements of set B are in set A. We call set B a subset of set A. The mathematical representation is as follows:

A \supset B or B \subset A        # B is contained in A; in Python, use A > B or B < A

A \supseteq B or B \subseteq A    # B is contained in or equal to A; in Python, use A >= B or B <= A

The empty set is a subset of any set, and a set is a subset of itself.

>>> A={1,2,3}
>>> B=set()
>>> B<=A
True
>>> A<=A
True
>>>

Use the <= operator or the issubset() method to test whether B is a subset of A. The result is True if it is, and False otherwise.

>>> A={1,2,3,4,5,6}
>>> B={1,3,5}
>>> B<=A
True
>>> B.issubset(A)
True
>>>
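Python also has a strict version of this test: < checks for a proper subset, meaning a subset that is not equal to the other set. A short sketch of the difference between <= and <, using the same A and B:

```python
A = {1, 2, 3, 4, 5, 6}
B = {1, 3, 5}

print(B <= A)  # True:  B is a subset of A
print(B < A)   # True:  B is a proper subset of A (B != A)
print(A <= A)  # True:  every set is a subset of itself
print(A < A)   # False: a set is never a proper subset of itself
```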

Superset

If all elements of set B are in set A, then set A is a superset of set B.

Use the >= operator or the issuperset() method to test whether A is a superset of B. The result is True if it is, and False otherwise.

>>> A={1,2,3,4,5,6}
>>> B={1,3,5}
>>> A>=B
True
>>> A.issuperset(B)
True
>>>

Complement

The elements that are in set A but not in set B are called the complement of B in A.

Use A-B to get the result.

>>> A={1,2,3,4,5,6}
>>> B={1,3,5}
>>> A-B
{2, 4, 6}
>>>

Adding and removing set elements

Method     Description                              Example
add()      Add an element                           A.add('element')
remove()   Remove an element                        A.remove('element')
pop()      Remove and return an arbitrary element   A.pop()
clear()    Remove all elements                      A.clear()

Example of adding elements:

>>> A={1,2,5}
>>> A.add(3)
>>> A
{1, 2, 3, 5}
>>>

Example of removing elements:

>>> A={1,2,3}
>>> A.remove(2)
>>> A
{1, 3}
>>>
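One detail worth knowing: remove() raises a KeyError when the element is not in the set. The discard() method, which is not listed in the table above, removes the element if it is present and does nothing otherwise. A minimal sketch:

```python
A = {1, 2, 3}

try:
    A.remove(10)          # 10 is not in A, so this raises KeyError
except KeyError as err:
    print("remove raised KeyError:", err)

A.discard(10)             # no error; A is left unchanged
print(A == {1, 2, 3})     # True
```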

Example of removing an arbitrary element and returning it:

>>> A={1,2,3}
>>> ret=A.pop()
>>> A
{2, 3}
>>> ret
1
>>>

Remove all elements:

>>> A={1,2,3}
>>> A.clear()
>>> A
set()
>>>

Power set and sympy module

A power set refers to a set composed of all subsets of a set.

The sympy module and sets

The sympy module can also create sets. Before using it, import the set class from the module:

from sympy import FiniteSet

FiniteSet() creates a set. The following example creates the set {1, 2, 3}.

>>> from sympy import FiniteSet
>>> A=FiniteSet(1,2,3)
>>> A
{1, 2, 3}
>>>

Build a power set

The powerset() method builds the power set of a set. Continuing the previous example:

>>> a=A.powerset()
>>> a
FiniteSet(EmptySet, {1}, {2}, {3}, {1, 2}, {1, 3}, {2, 3}, {1, 2, 3})
>>>

The number of elements in a power set

If a set has n elements, the number of elements in the power set of this set is 2^n.
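The 2^n count can be checked without sympy. The sketch below builds the power set with the standard itertools module; the helper name power_set is made up for this illustration and is not a library function.

```python
from itertools import chain, combinations

def power_set(items):
    """Return all subsets of items as a list of frozensets (illustrative helper)."""
    elems = list(items)
    # Take the combinations of every size, from 0 up to len(elems).
    subsets = chain.from_iterable(
        combinations(elems, r) for r in range(len(elems) + 1)
    )
    return [frozenset(s) for s in subsets]

A = {1, 2, 3}
P = power_set(A)
print(len(P) == 2 ** len(A))  # True: 2^3 = 8 subsets
```

The subsets are stored as frozenset objects because ordinary sets are mutable and cannot be elements of another set.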

Cartesian product

Set multiplication

The Cartesian product of several sets is the set of all possible combinations formed by taking one element from each set. In sympy, a Cartesian product can be built with the multiplication operator *. Each element of the result is a tuple.

The following program takes two sets with 2 elements each and builds their Cartesian product.

from sympy import FiniteSet
A = FiniteSet('a', 'b')
B = FiniteSet('c', 'd')
AB = A * B
for ab in AB:
    print(type(ab), ab)

The running results are as follows:

[Running] python -u "c:\Users\a-xiaobodou\OneDrive - Microsoft\Projects\tempCodeRunnerFile.py"
<class 'tuple'> (a, c)
<class 'tuple'> (b, c)
<class 'tuple'> (a, d)
<class 'tuple'> (b, d)

[Done] exited with code=0 in 5.545 seconds

The next program takes two sets with 5 elements and 2 elements respectively and builds their Cartesian product.

from sympy import FiniteSet
A = FiniteSet('a', 'b', 'c', 'd', 'e')
B = FiniteSet('f', 'g')
AB = A * B
print('The length of Cartesian product', len(AB))
for ab in AB:
    print(ab)

The running results are as follows:

[Running] python -u "c:\Users\a-xiaobodou\OneDrive - Microsoft\Projects\tempCodeRunnerFile.py"
The length of Cartesian product 10
(a, f)
(b, f)
(a, g)
(c, f)
(b, g)
(d, f)
(c, g)
(e, f)
(d, g)
(e, g)

[Done] exited with code=0 in 12.589 seconds
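The length 10 is simply 5 * 2: the size of a Cartesian product is the product of the sizes of its factors. As a cross-check that does not rely on sympy, the sketch below uses itertools.product from the standard library on built-in sets. This is an alternative technique and not the approach used in the programs above.

```python
from itertools import product

A = {'a', 'b', 'c', 'd', 'e'}
B = {'f', 'g'}
AB = set(product(A, B))            # every (x, y) pair with x in A and y in B

print(len(AB) == len(A) * len(B))  # True: 5 * 2 = 10
print(('a', 'f') in AB)            # True
print(('f', 'a') in AB)            # False: the order within a pair matters
```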

Set raised to the nth power

Suppose set A has 2 elements. Its third Cartesian power, A * A * A, contains 2^3 = 8 elements. In general, each element of the nth power is a tuple of n elements, and for a set with 2 elements the nth power contains 2^n tuples.

The following program builds the third Cartesian power of A.

from sympy import FiniteSet
A = FiniteSet('a', 'b')
AAA = A**3
print('The length of Cartesian product', len(AAA))
for a in AAA:
    print(a)

The running results are as follows:

[Running] python -u "c:\Users\a-xiaobodou\OneDrive - Microsoft\Projects\tempCodeRunnerFile.py"
The length of Cartesian product 8
(a, a, a)
(b, a, a)
(a, b, a)
(b, b, a)
(a, a, b)
(b, a, b)
(a, b, b)
(b, b, b)

[Done] exited with code=0 in 6.545 seconds
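The same counting rule holds for other powers. The sketch below, which uses the standard library's itertools.product with its repeat argument instead of sympy, prints the number of tuples for n = 1 to 4 and compares each count with 2^n:

```python
from itertools import product

A = ['a', 'b']
for n in range(1, 5):
    tuples = list(product(A, repeat=n))   # all n-tuples over A
    # Each line shows n, the tuple count, and whether it equals len(A) ** n.
    print(n, len(tuples), len(tuples) == len(A) ** n)
```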

Origin blog.csdn.net/DXB2021/article/details/127162698