Python: zip(), map(), enumerate(), filter(), and the apply function


1. zip(): returns a zip object (an iterator of tuples)

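Description:
zip() takes one or more iterables, packs the elements at matching positions into tuples, and returns a zip object that yields those tuples.
Because the tuples are produced lazily, one at a time, this saves memory compared with building the whole list up front.
Use list() to turn the result into a list.
If the iterables have different lengths, the result is as long as the shortest one. The * operator reverses the operation: zip(*pairs) unpacks a sequence of tuples back into separate tuples.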

Syntax:
zip(*iterables)

Parameters:
iterables -- one or more iterables

Return value:
A zip object (an iterator of tuples)

Example

>>> a = [1, 2, 3]
>>> b = [4, 5, 6]
>>> c = [4, 5, 6, 7, 8]
>>> zipped = zip(a, b)     # returns a zip object
>>> zipped
<zip object at 0x103abc288>
>>> list(zipped)           # list() converts it to a list
[(1, 4), (2, 5), (3, 6)]
>>> list(zip(a, c))        # length matches the shortest input
[(1, 4), (2, 5), (3, 6)]

>>> a1, a2 = zip(*zip(a, b))   # the reverse of zip: zip(*...) unpacks the pairs into separate tuples
>>> list(a1)
[1, 2, 3]
>>> list(a2)
[4, 5, 6]
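
Two details of the session above are easy to miss, so here is a small sketch of both. A zip object can be consumed only once, and truncation to the shortest input can be avoided with itertools.zip_longest from the standard library. The fill value 0 is an arbitrary choice made for this illustration.

```python
from itertools import zip_longest

a = [1, 2, 3]
c = [4, 5, 6, 7, 8]

# A zip object is a one-shot iterator: a second pass finds it exhausted.
zipped = zip(a, c)
first_pass = list(zipped)
second_pass = list(zipped)
print(first_pass)   # [(1, 4), (2, 5), (3, 6)]
print(second_pass)  # []

# zip_longest pads the shorter input instead of truncating the longer one.
padded = list(zip_longest(a, c, fillvalue=0))
print(padded)       # [(1, 4), (2, 5), (3, 6), (0, 7), (0, 8)]
```

If the pairs are needed more than once, store list(zip(...)) in a variable rather than keeping the zip object itself.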

Using zip for data processing

import random

X = [1, 2, 3, 4, 5, 6]
y = [0, 1, 0, 0, 1, 1]

# Pair each sample with its label, and convert to a list so it can be shuffled
zipped_data = list(zip(X, y))

# random.shuffle shuffles the list in place and returns None
random.shuffle(zipped_data)

print('zipped_data:', zipped_data)

# zip(*...) unzips the pairs, map() converts each tuple to a list, list() collects the result
new_zipped_data = list(map(list, zip(*zipped_data)))

# The shuffled samples and labels, still aligned with each other
new_X, new_y = new_zipped_data[0], new_zipped_data[1]

print('X:', X, '\n', 'y:', y)
print('new_X:', new_X, '\n', 'new_y:', new_y)

Sample output (the shuffle is random, so the order differs between runs):
zipped_data: [(4, 0), (2, 1), (5, 1), (1, 0), (3, 0), (6, 1)]
X: [1, 2, 3, 4, 5, 6] 
 y: [0, 1, 0, 0, 1, 1]
new_X: [4, 2, 5, 1, 3, 6] 
 new_y: [0, 1, 1, 0, 0, 1]
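
The same zip, shuffle, unzip steps can be wrapped in a helper so that a run is repeatable. This is only a sketch: the function name shuffle_together and its seed parameter are made up for this example and are not part of any library.

```python
import random

def shuffle_together(X, y, seed=None):
    """Return shuffled copies of X and y, keeping each sample with its label.

    Hypothetical helper for illustration; `seed` makes the order repeatable.
    """
    if len(X) != len(y):
        raise ValueError("X and y must have the same length")
    pairs = list(zip(X, y))
    # A private Random instance leaves the global random state untouched.
    random.Random(seed).shuffle(pairs)
    if not pairs:
        return [], []
    new_X, new_y = zip(*pairs)
    return list(new_X), list(new_y)

X = [1, 2, 3, 4, 5, 6]
y = [0, 1, 0, 0, 1, 1]
new_X, new_y = shuffle_together(X, y, seed=42)
print(new_X, new_y)
```

Because the helper works on copies, X and y keep their original order, and calling it again with the same seed returns the same permutation.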

2. map(): returns an iterator (a map object)

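Description:
map() applies a function to every element of the given iterable. The first argument, function, is called once per element, with that element as its argument.
In other words, function is applied across the iterable being traversed, and a map object that yields the results is returned.

Syntax:
map(function, iterable, ...)

Parameters:
function -- the function to apply
iterable -- one or more iterables; with several, function receives one argument from each

Return value:
An iterator (a map object)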

Example

def square(x):
    return x ** 2

map(square, [1, 2, 3, 4, 5])

In an interactive session this displays something like:
<map at 0x1acd1c94be0>

Converting the map object with list()

def square(x):             # compute the square
    return x ** 2

A = list(map(square, [1, 2, 3, 4, 5]))   # square every element of the list
print(A)

Output:
[1, 4, 9, 16, 25]

# Using a lambda (anonymous function)
print(map(lambda x: x ** 2, [1, 2, 3, 4, 5]))
print(list(map(lambda x: x ** 2, [1, 2, 3, 4, 5])))
# With two lists, add the elements at matching positions
print(map(lambda x, y: x + y, [1, 3, 5, 7, 9], [2, 4, 6, 8, 10]))
print(list(map(lambda x, y: x + y, [1, 3, 5, 7, 9], [2, 4, 6, 8, 10])))

Output:
<map object at 0x000001ACD1CA7668>
[1, 4, 9, 16, 25]
<map object at 0x000001ACD1CA7710>
[3, 7, 11, 15, 19]
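
The two-list call above uses inputs of equal length. As a quick sketch of what happens otherwise, the example below passes lists of different lengths and compares map with a list comprehension. The variable names are illustrative only.

```python
xs = [1, 3, 5, 7, 9]
ys = [2, 4, 6]

# With several iterables, map stops as soon as the shortest one runs out.
sums = list(map(lambda x, y: x + y, xs, ys))
print(sums)       # [3, 7, 11]

# The same result written as a list comprehension over zip.
sums_comp = [x + y for x, y in zip(xs, ys)]
print(sums_comp)  # [3, 7, 11]

# An existing function can be passed directly, with no lambda needed.
lengths = list(map(len, ["zip", "map", "filter"]))
print(lengths)    # [3, 3, 6]
```

Which form to use is mostly a matter of taste. map reads well when a named function already exists, while a comprehension is often clearer when the expression would otherwise need a lambda.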

3. enumerate(): returns an enumerate object

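Description:
enumerate() wraps an iterable (such as a list, tuple, or string) and yields (index, item) pairs, so each item comes with its index. It is most often used in for loops.

Syntax:
enumerate(iterable, start=0)

Parameters:
iterable -- a sequence, an iterator, or any other object that supports iteration
start -- the value the index starts from (default 0)

Return value:
An enumerate object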

Example

>>> seasons = ['Spring', 'Summer', 'Fall', 'Winter']
>>> list(enumerate(seasons))
[(0, 'Spring'), (1, 'Summer'), (2, 'Fall'), (3, 'Winter')]
>>> list(enumerate(seasons, start=1))       # index starts at 1
[(1, 'Spring'), (2, 'Summer'), (3, 'Fall'), (4, 'Winter')]

Using enumerate in a for loop

seq = ['one', 'two', 'three']
for i, element in enumerate(seq):
    print(i, element)

0 one
1 two
2 three
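
enumerate also combines naturally with zip. The sketch below, with made-up sample data, reports the 1-based positions at which two lists disagree. Note the parentheses around (want, got): each item produced by enumerate is itself a pair coming from zip.

```python
expected = ['one', 'two', 'three', 'four']
actual = ['one', 'tow', 'three', 'for']

# start=1 gives human-friendly positions; zip pairs up the two lists.
mismatches = [
    (pos, want, got)
    for pos, (want, got) in enumerate(zip(expected, actual), start=1)
    if want != got
]
print(mismatches)  # [(2, 'two', 'tow'), (4, 'four', 'for')]
```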

4. filter(): returns an iterator (a filter object)

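Calling filter() does not run the filtering. It only returns an iterator, which is evaluated lazily: the predicate is called only when items are requested, for example through next() or list().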
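
One way to observe this laziness is to let the predicate record every value it is asked about. The sketch below does that with a plain list; the names calls and is_even are invented for the illustration.

```python
calls = []

def is_even(n):
    calls.append(n)      # record every value the predicate is asked about
    return n % 2 == 0

evens = filter(is_even, [1, 2, 3, 4])
calls_after_create = list(calls)   # nothing has been evaluated yet

first = next(evens)                # tests 1 (rejected), then 2 (accepted)
calls_after_next = list(calls)

rest = list(evens)                 # tests the remaining values 3 and 4

print(calls_after_create)  # []
print(first)               # 2
print(calls_after_next)    # [1, 2]
print(rest)                # [4]
print(calls)               # [1, 2, 3, 4]
```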

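Description:
filter() filters an iterable: it drops the elements that fail a test and returns an iterator over the rest. Use list() to convert the result to a list.
It takes two arguments, a function and an iterable. Each element is passed to the function, which returns True or False, and only the elements for which it returns True are kept.
If None is passed as function, every element that is itself truthy is kept.

Syntax:
filter(function, iterable)

Parameters:
function -- the predicate function, or None
iterable -- the iterable to filter

Return value:
An iterator (a filter object)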

Example
Keep only the odd numbers in a list

def is_odd(n):
    return n % 2 == 1
 
tmplist = filter(is_odd, [1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
newlist = list(tmplist)
print(newlist)
[1, 3, 5, 7, 9]

Passing None as the predicate

numbers = list(range(10))
print(list(filter(None, numbers)))   # 0 is falsy, so it is dropped
[1, 2, 3, 4, 5, 6, 7, 8, 9]
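
With integers, only 0 is dropped. To show what "truthy" means more broadly, here is a small sketch with mixed values chosen purely for illustration. Empty containers, None, False, and numeric zeros are all removed.

```python
values = [0, 1, '', 'a', None, [], [0], False, True, 0.0, 2.5]

# filter(None, ...) keeps exactly the elements for which bool(element) is True.
truthy = list(filter(None, values))
print(truthy)  # [1, 'a', [0], True, 2.5]

# The equivalent list comprehension.
same = [v for v in values if v]
print(same == truthy)  # True
```

Note that [0] survives: a non-empty list is truthy even when its only element is falsy.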

Advanced use of filter

def _odd_iter():
    # Yields the odd numbers 3, 5, 7, ... without end
    n = 1
    while True:
        n = n + 2
        yield n

def _not_divisible(n):
    # Returns a predicate that is True when x is not divisible by n
    return lambda x: x % n > 0

def primes():
    yield 2
    it = _odd_iter()
    ftr = filter(_not_divisible(2), it)       #1
    while True:
        n = next(ftr)                         #2
        yield n
        ftr = filter(_not_divisible(n), ftr)  #3

for n in primes():
    if n < 100:
        print('now:', n)
    else:
        break

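filter() returns a filter object. Line #3 rebinds ftr to a new filter that wraps the previous one, so the filters stack up. Each time line #3 runs, the prime just found is added as a new divisor condition, and every later candidate must pass all the conditions accumulated so far.
Reassigning in this way chains any number of predicates onto one lazy stream, so a candidate is yielded only if it passes every one of them.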
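
Since primes() never ends, the caller has to decide when to stop. As an alternative to the if/break loop, the sketch below uses itertools.islice and itertools.takewhile. It restates the generator in a more compact form so the block runs on its own. This variant replaces _odd_iter with itertools.count and binds the divisor through a default argument; it is a rewrite made for this illustration, not the code above.

```python
from itertools import count, islice, takewhile

def primes():
    """Lazy prime generator built by stacking one filter per prime found."""
    yield 2
    candidates = count(3, 2)          # 3, 5, 7, ...
    while True:
        p = next(candidates)
        yield p
        # p=p freezes the current prime; a plain closure over p would see
        # later values of p, because the lambda is created inside the loop.
        candidates = filter(lambda x, p=p: x % p > 0, candidates)

first_ten = list(islice(primes(), 10))
print(first_ten)  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

below_50 = list(takewhile(lambda n: n < 50, primes()))
print(below_50)   # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47]
```

Every prime adds one more filter layer that each later candidate must pass through, so this style is best suited to small ranges.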

5. The apply function

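Description:
apply is a pandas method. It is called on a pandas DataFrame or Series.

Uses:
First, applying a function directly to a DataFrame or Series.
Second, applying a function to the grouped object returned by groupby.

Syntax:
DataFrame.apply(func, axis=0)

Parameters:
func -- the function to apply
axis -- the direction of the operation: 0 (the default) passes each column to func, 1 passes each row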

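Example

import numpy as np
import pandas as pd

# A 2x4 array of random integers from 0 to 3
a = np.random.randint(low=0, high=4, size=(2, 4))
data = pd.DataFrame(a)
# func receives each column as a Series; here every value is multiplied by 10
data.apply(lambda x: x * 10)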
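
The random data above makes the effect of axis hard to see, so here is a sketch with fixed values that also covers the groupby case. The column names and numbers are invented for the example, and it assumes pandas is installed.

```python
import pandas as pd

df = pd.DataFrame({
    "group": ["a", "a", "b", "b"],
    "x": [1, 2, 3, 4],
    "y": [10, 20, 30, 40],
})
nums = df[["x", "y"]]

# axis=0 (the default): the function is called once per column.
col_range = nums.apply(lambda col: col.max() - col.min())

# axis=1: the function is called once per row.
row_sum = nums.apply(lambda row: row["x"] + row["y"], axis=1)

# After groupby: the function is called once per group.
group_range = df.groupby("group")["x"].apply(lambda s: s.max() - s.min())

print(col_range.tolist())    # [3, 30]
print(row_sum.tolist())      # [11, 22, 33, 44]
print(group_range.tolist())  # [1, 1]
```

For simple arithmetic like this, vectorized expressions such as nums["x"] + nums["y"] give the same result and are usually faster. apply is most useful when the logic does not fit a built-in operation.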
