Fluent Python, Chapter 10 notes

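Sequence hacking, hashing, and slicing.

The book covers uses of __getitem__ and __getattr__, among other things. I took some notes on these earlier, so this is another pass to reinforce them.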

from array import array
import math
import reprlib


class Vector:
    typecode = 'd'

    def __init__(self, components):
        self._components = array(self.typecode, components)

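    def __iter__(self):  # returns an iterator, i.e. an object with a __next__ method
        '''With __iter__, the object supports tuple unpacking and can be used in a for loop.'''
        return iter(self._components)

    def __repr__(self):
        components = reprlib.repr(self._components)    # long output is abbreviated with ...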
        # print(components)
        components = components[components.find('['): -1]
        return f'{self.__class__.__name__}({components})'

    def __str__(self):
        return str(tuple(self))

    def __bytes__(self):
        return (bytes([ord(self.typecode)]) +
                bytes(self._components))

    def __eq__(self, other):
        return tuple(self) == tuple(other)

    def __abs__(self):    # Euclidean norm; in 2-D, the hypotenuse of a right triangle
        return math.sqrt(sum(x * x for x in self))

    def __bool__(self):    # take abs(self), then convert it with bool
        return bool(abs(self))

    @classmethod
    def frombytes(cls, octets):
        typecode = chr(octets[0])  # read the array typecode first
        menv = memoryview(octets[1:]).cast(typecode)
        print(menv)
        return cls(menv)



v = Vector([1, 2])

In [456]: v                                                                                        
Out[456]: Vector([1.0, 2.0])

In [457]: v = Vector(range(100))                                                                   

In [458]: v                                                                                        
Out[458]: Vector([0.0, 1.0, 2.0, 3.0, 4.0, ...])

In [459]: str(v)                                                                                   
Out[459]: '(0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0, 18.0, 19.0, 20.0, 21.0, 22.0, 23.0, 24.0, 25.0, 26.0, 27.0, 28.0, 29.0, 30.0, 31.0, 32.0, 33.0, 34.0, 35.0, 36.0, 37.0, 38.0, 39.0, 40.0, 41.0, 42.0, 43.0, 44.0, 45.0, 46.0, 47.0, 48.0, 49.0, 50.0, 51.0, 52.0, 53.0, 54.0, 55.0, 56.0, 57.0, 58.0, 59.0, 60.0, 61.0, 62.0, 63.0, 64.0, 65.0, 66.0, 67.0, 68.0, 69.0, 70.0, 71.0, 72.0, 73.0, 74.0, 75.0, 76.0, 77.0, 78.0, 79.0, 80.0, 81.0, 82.0, 83.0, 84.0, 85.0, 86.0, 87.0, 88.0, 89.0, 90.0, 91.0, 92.0, 93.0, 94.0, 95.0, 96.0, 97.0, 98.0, 99.0)'

In [468]: v.frombytes(bytes(v))                                                                    
<memory at 0x108578940>
Out[468]: Vector([0.0, 1.0, 2.0, 3.0, 4.0, ...])

In [469]:

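10.3 Protocols and duck typing

In Python, creating a fully functional sequence type does not require inheritance; you only need to implement the methods that satisfy the sequence protocol.

In object-oriented programming, a protocol is an informal interface, defined only in documentation and not in code.

For example, the Python sequence protocol requires just two methods, __len__ and __getitem__.

Any class that implements these two methods with the standard signatures and semantics can be used anywhere a sequence is expected.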
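As a rough illustration of the protocol idea, here is a minimal sketch. The Countdown class is hypothetical (it is not from the book); it inherits from nothing special and implements only __len__ and __getitem__, yet iteration, the in operator, reversed() and random.choice() all accept it.

```python
import random


class Countdown:
    """Hypothetical sequence-like class: no special base class, just two methods."""

    def __init__(self, start):
        self._items = list(range(start, 0, -1))

    def __len__(self):
        return len(self._items)

    def __getitem__(self, index):
        return self._items[index]


c = Countdown(5)
print(len(c))              # number of items
print(c[0], c[-1])         # plain indexing
print(list(c))             # iteration falls back to __getitem__ with 0, 1, 2, ...
print(3 in c)              # membership also falls back to iteration
print(list(reversed(c)))   # reversed() uses __len__ and __getitem__
print(random.choice(c))    # choice() only needs len() and indexing
```

Nothing here declares "I am a sequence"; the class simply behaves like one, which is all duck typing asks for.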

10.4 Vector, take 2: a sliceable sequence, and how slicing works

    def __getitem__(self, index):
        return self._components[index]

    def __len__(self):
        return len(self._components)

Adding these two methods implements the sequence protocol.

In [470]: v=Vector(range(10))                                                                      

In [471]: v                                                                                        
Out[471]: Vector([0.0, 1.0, 2.0, 3.0, 4.0, ...])

In [472]: v[3:5]                                                                                   
Out[472]: array('d', [3.0, 4.0])

In [473]: len(v)                                                                                   
Out[473]: 10

Next, let's look at how slicing works.
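The session that follows uses a MySeq class that these notes never define. A minimal sketch, assuming it is the usual demo class whose __getitem__ simply returns whatever index it was given:

```python
class MySeq:
    def __getitem__(self, index):
        # Hand back exactly what Python passed in, so we can inspect it.
        return index


s = MySeq()
print(s[1])         # a plain int
print(s[1:5:2])     # a slice object
print(s[1:2, 3])    # a tuple containing a slice and an int
```

With this class, the bracket syntax becomes visible as data: the interpreter builds the slice objects, and __getitem__ just receives them.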

In [475]: s= MySeq()                                                                               

In [476]: s[1]                                                                                     
Out[476]: 1

In [477]: s[1:2]                                                                                   
Out[477]: slice(1, 2, None)

In [478]: s[1:5:2]                                                                                 
Out[478]: slice(1, 5, 2)

In [479]: s[1:2:2]                                                                                 
Out[479]: slice(1, 2, 2)

In [480]: s[1:1:2]                                                                                 
Out[480]: slice(1, 1, 2)

In [481]: s[1:2,3]                                                                                 
Out[481]: (slice(1, 2, None), 3)

In [482]: s[1:2,3:8]                                                                               
Out[482]: (slice(1, 2, None), slice(3, 8, None))

In [483]:

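With a single integer, the integer itself comes back. In every other case the index arrives as a slice instance, or as a tuple containing slices when commas are used.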

In [483]: dir(slice)                                                                               
Out[483]: 
['__class__',
 '__delattr__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 'indices',
 'start',
 'step',
 'stop']

indices gracefully handles missing indices, negative indices, and slices that run past the end of the target sequence.

In [484]: slice(None, 10, 2).indices(3)                                                            
Out[484]: (0, 3, 2)

In [485]: slice(-3,None).indices(10)                                                               
Out[485]: (7, 10, 1)

In [486]: slice(-3,None).indices(5)                                                                
Out[486]: (2, 5, 1)

In [1]: 'abcde'[:10:2]                                                                    
Out[1]: 'ace'

In [2]: 'abcde'[-3:]                                                                      
Out[2]: 'cde'

In [3]: 'abcde'[2:5:1]                                                                    
Out[3]: 'cde'

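Seen this way, the arguments we leave out of a slice are filled in by the same normalization that indices performs (built-in sequences implement this logic internally). The argument passed to indices is the length of the sequence, len(obj).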
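To check that reading, here is a small sketch that normalizes each slice with indices and then indexes by hand; the variable names are only for illustration. If the interpretation is right, the hand-built result matches ordinary slicing.

```python
text = 'abcde'

for s in (slice(None, 10, 2), slice(-3, None), slice(2, 5, 1)):
    # Normalize the slice against the real length of the sequence.
    start, stop, step = s.indices(len(text))
    by_hand = ''.join(text[i] for i in range(start, stop, step))
    print(s, (start, stop, step), by_hand, text[s])
    assert by_hand == text[s]
```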

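10.4.2 A slice-aware __getitem__

Getting a plain array back from a slice, as in the previous version, is not good enough. Now a slice returns a new Vector, while a single integer index still returns a number.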

    def __getitem__(self, index):
        cls = type(self)
        if isinstance(index, slice):
            return cls(self._components[index])
        elif isinstance(index, numbers.Integral):
            return self._components[index]
        else:
            msg = '{cls.__name__} indices must be integers'
            raise TypeError(msg.format(cls = cls))

In [496]: v = Vector(range(10))                                                                    

In [497]: v[1:3]                                                                                   
Out[497]: Vector([1.0, 2.0])

In [498]: v[4]                                                                                     
Out[498]: 4.0

In [499]: v[4,5]                                                                                   
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-499-0f2464134463> in <module>
----> 1 v[4,5]

<ipython-input-495-3d153cb2c4de> in __getitem__(self, index)
     45         else:
     46             msg = '{cls.__name__} indices must be integers'
---> 47             raise TypeError(msg.format(cls = cls))
     48 
     49     def __len__(self):

TypeError: Vector indices must be integers

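10.5 Vector, take 3: dynamic attribute access

Here the author wants v.x to return v[0], v.y to return v[1], and so on.

Neat idea.

How Python looks up an attribute on an object:

First, __getattribute__ checks whether the instance itself has the attribute.

If not, it looks in the object's class.

If it is still not found, the search continues up the inheritance tree.

Only when all of that fails does Python fall back to __getattr__.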
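The lookup order can be observed with a tiny sketch. Probe is a hypothetical class made up for this note; its __getattr__ returns a marker string so that it is obvious when the fallback was used.

```python
class Probe:
    kind = 'class attribute'

    def __init__(self):
        self.own = 'instance attribute'

    def __getattr__(self, name):
        # Only reached when the normal lookup has already failed.
        return f'__getattr__ fallback for {name!r}'


p = Probe()
print(p.own)       # found on the instance
print(p.kind)      # found on the class
print(p.missing)   # not found anywhere, so __getattr__ answers
```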

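    shortcut_name = 'xyzt'            # could also live inside __getattr__; as a class attribute it can be overridden

    def __getattr__(self, index):
        if len(index) == 1:
            pos = self.shortcut_name.find(index)
            if 0 <= pos < len(self._components):
                return self._components[pos]
        # raise for every failed lookup, including multi-character names
        msg = '{.__name__!r} object has no attribute {!r}'
        raise AttributeError(msg.format(self.__class__, index))

The code differs slightly from the book's: as a small flourish, shortcut_name is a class attribute, so an instance can override it.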

In [517]: v = Vector(range(3))                                                                     

In [518]: v.x                                                                                      
Out[518]: 0.0

In [519]: v.y                                                                                      
Out[519]: 1.0

In [520]: v.t                                                                                      
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-520-ebe607c98c18> in <module>
----> 1 v.t

<ipython-input-508-4a33821354a0> in __getattr__(self, index)
     54                 return self._components[pos]
     55             msg = '{.__name__!r} object has no attribute {!r}'
---> 56             raise AttributeError(msg.format(self.__class__, index))
     57 
     58     def __len__(self):

AttributeError: 'Vector' object has no attribute 't'

In [521]: v.shortcut_name='abc'                                                                    

In [522]: v.b                                                                                      
Out[522]: 1.0

Now for something interesting.

In [523]: v.a                                                                                      
Out[523]: 0.0

In [524]: v.a = 10                                                                                 

In [525]: v.a                                                                                      
Out[525]: 10

In [526]: v                                                                                        
Out[526]: Vector([0.0, 1.0, 2.0])

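Assigning to v.a changed what v.a returns, but the vector's components did not change.

The explanation is simple: v.a = 10 merely adds an instance attribute named a to v, so later reads of v.a find it directly and never reach __getattr__.

__setattr__ can restrict this behavior. At first I wondered why the assignment doesn't just update the component; the reason is that Vector is meant to be read-only, so the only sensible option is to forbid assignment to single-character attribute names and avoid the confusion.

Also, if you implement __getattr__, you should define __setattr__ as well, so that the object does not behave inconsistently.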
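The pitfall and the guard can be reproduced in isolation. The sketch below uses two hypothetical classes, Loose and Guarded, rather than the full Vector: Loose shows an assignment shadowing __getattr__, and Guarded shows a __setattr__ that refuses single lowercase letters.

```python
class Loose:
    def __getattr__(self, name):
        if len(name) == 1:
            return 0.0               # pretend single letters are components
        raise AttributeError(name)


class Guarded(Loose):
    def __setattr__(self, name, value):
        if len(name) == 1 and name.islower():
            raise AttributeError(f"can't set attribute {name!r}")
        super().__setattr__(name, value)


a = Loose()
a.x = 10                             # silently creates an instance attribute
print(a.x, a.__dict__)               # __getattr__ is no longer consulted for x

b = Guarded()
try:
    b.x = 10
except AttributeError as exc:
    print('refused:', exc)
print(b.x)                           # still served by __getattr__
b.label = 'ok'                       # longer names are unaffected
```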

from array import array
import math
import reprlib
import numbers


class Vector:
    typecode = 'd'

    def __init__(self, components):
        self._components = array(self.typecode, components)

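    def __iter__(self):  # returns an iterator, i.e. an object with a __next__ method
        '''With __iter__, the object supports tuple unpacking and can be used in a for loop.'''
        return iter(self._components)

    def __repr__(self):
        components = reprlib.repr(self._components)    # long output is abbreviated with ...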
        # print(components)
        components = components[components.find('['): -1]
        return f'{self.__class__.__name__}({components})'

    def __str__(self):
        return str(tuple(self))

    def __bytes__(self):
        return (bytes([ord(self.typecode)]) +
                bytes(self._components))

    def __eq__(self, other):
        return tuple(self) == tuple(other)

    def __abs__(self):    # Euclidean norm; in 2-D, the hypotenuse of a right triangle
        return math.sqrt(sum(x * x for x in self))

    def __bool__(self):    # take abs(self), then convert it with bool
        return bool(abs(self))

    def __getitem__(self, index):
        cls = type(self)
        if isinstance(index, slice):
            return cls(self._components[index])
        elif isinstance(index, numbers.Integral):
            return self._components[index]
        else:
            msg = '{cls.__name__} indices must be integers'
            raise TypeError(msg.format(cls = cls))

    shortcut_name = 'xyzt'            # could also live inside __getattr__; as a class attribute it can be overridden

    def __getattr__(self, index):
        if len(index) == 1:
            pos = self.shortcut_name.find(index)
            if 0 <= pos < len(self._components):
                return self._components[pos]
        # raise for every failed lookup, including multi-character names
        msg = '{.__name__!r} object has no attribute {!r}'
        raise AttributeError(msg.format(self.__class__, index))

    def __setattr__(self, key, value):
        cls = type(self)       # get the class
        if len(key) == 1:    # only single-character names are restricted
            if key in cls.shortcut_name:   # one of the shortcut names
                error = 'readonly attribute {attr_name!r}'
            elif key.islower():      # other lowercase letters are not allowed either
                error = "can't set attribute 'a' to 'z' in {cls_name!r}"
            else:        # anything else, e.g. an uppercase letter, is fine
                error = ''
            if error:
                msg = error.format(cls_name=cls.__name__, attr_name=key)
                raise AttributeError(msg)
        super(Vector, self).__setattr__(key, value)

    def __len__(self):
        return len(self._components)

    @classmethod
    def frombytes(cls, octets):
        typecode = chr(octets[0])  # read the array typecode first
        menv = memoryview(octets[1:]).cast(typecode)
        print(menv)
        return cls(menv)



v = Vector([1, 2])

I won't paste the test session for this one; I ran it and it works.

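10.6 Vector, take 4: hashing and a faster ==

In the earlier two-dimensional version, the hash was computed by XORing the hashes of the two components:

hash(self._x) ^ hash(self._y)

This time we need to hash every element of the multidimensional vector and XOR the results together, one after another.

This calls for reduce.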

reduce(func, list)

first computes res = func(list[0], list[1]),

then res = func(res, list[2]),

and carries on like this until the elements are used up, returning the last result.
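The folding described above can be written out by hand. This is only a sketch of the idea; my_reduce and the _MISSING sentinel are names invented here, and the real functools.reduce is implemented in C.

```python
from functools import reduce
from operator import xor

_MISSING = object()   # sentinel: lets None be a legal initializer


def my_reduce(func, iterable, initializer=_MISSING):
    it = iter(iterable)
    if initializer is _MISSING:
        try:
            res = next(it)            # the first element seeds the result
        except StopIteration:
            raise TypeError('my_reduce() of empty iterable '
                            'with no initial value') from None
    else:
        res = initializer
    for item in it:
        res = func(res, item)         # res = func(res, next element)
    return res


print(my_reduce(xor, range(10)))
assert my_reduce(xor, range(10)) == reduce(xor, range(10))
```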

In [535]: from functools import reduce                                                             

In [536]: reduce(lambda a,b:a^b,range(10))                                                         
Out[536]: 1

In [537]: from operator import xor                                                                 

In [538]: reduce(xor,range(10))                                                                    
Out[538]: 1

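The operator module provides all of Python's infix operators as functions, which reduces the need for lambda expressions. (I don't know why the author dislikes lambda so much.)

It is best to pass reduce an initial value, initializer, as its last argument, because if the iterable is empty and no initializer is given, reduce raises TypeError. (With exactly one element, reduce just returns that element.)

So for +, | and ^ (addition, OR, XOR) the initializer should be 0,

and for * and & (multiplication, AND) it should be 1 (for &, 1 is the identity only for boolean or 0/1 values).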
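A quick check of when the initializer actually matters. This sketch uses only functools.reduce and functions from operator:

```python
from functools import reduce
from operator import add, mul, xor

print(reduce(xor, [7]))          # a single element is returned as is

try:
    reduce(xor, [])              # empty iterable, no initializer
except TypeError as exc:
    print('TypeError:', exc)

print(reduce(add, [], 0))        # identity value for +
print(reduce(xor, [], 0))        # identity value for ^
print(reduce(mul, [], 1))        # identity value for *
```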

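__eq__ is changed along the way, because the old version builds two complete tuples just to compare them, which is too slow for vectors with many elements.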

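    # requires: import functools; from operator import xor
    def __hash__(self):
        return functools.reduce(xor, (hash(i) for i in self._components), 0)

    def __eq__(self, other):
        return len(self) == len(other) and all(a == b for a, b in zip(self, other))
        # nicely written: check len first, then compare the elements pair by pair; both `and` and all() short-circuit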

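I also tested reduce with operator.xor without an initializer, and it works as long as the vector has at least one component. An empty Vector would raise TypeError, which is why the 0 is kept.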
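To try the two methods without the whole class, here is a cut-down sketch. MiniVector is a hypothetical stand-in that keeps only what __hash__ and __eq__ need; it is not the full Vector from these notes.

```python
import functools
from array import array
from operator import xor


class MiniVector:
    def __init__(self, components):
        self._components = array('d', components)

    def __iter__(self):
        return iter(self._components)

    def __len__(self):
        return len(self._components)

    def __eq__(self, other):
        # Cheap length check first; all() stops at the first mismatch.
        return (len(self) == len(other) and
                all(a == b for a, b in zip(self, other)))

    def __hash__(self):
        hashes = (hash(x) for x in self._components)
        return functools.reduce(xor, hashes, 0)


v1 = MiniVector([1, 2])
v2 = MiniVector([1, 2])
print(hash(v1) == hash(1.0) ^ hash(2.0))    # XOR of the component hashes
print(v1 == v2, v1 == MiniVector([1, 2, 3]))
print(len({v1, v2}))                        # equal and hashable: one set entry
print(hash(MiniVector([])))                 # empty vector works thanks to the 0
```

The length check matters for correctness as well as speed, because zip stops at the shorter input and would otherwise treat a vector as equal to any longer one that starts with the same elements.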

Related to zip, itertools has a zip_longest function that is quite interesting; let's try it.

In [5]: from itertools import zip_longest                                                 

In [6]: list(zip_longest(range(1,4),'abc',[5,4,3,2,1],))                                  
Out[6]: [(1, 'a', 5), (2, 'b', 4), (3, 'c', 3), (None, None, 2), (None, None, 1)]

In [7]: list(zip_longest(range(1,4),'abc',[5,4,3,2,1],fillvalue=-1))                      
Out[7]: [(1, 'a', 5), (2, 'b', 4), (3, 'c', 3), (-1, -1, 2), (-1, -1, 1)]

In [8]: list(zip(range(1,4),'abc',[5,4,3,2,1],))                                          
Out[8]: [(1, 'a', 5), (2, 'b', 4), (3, 'c', 3)]

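zip_longest packs up to the length of the longest iterable. Missing values default to None, or you can supply a different filler through fillvalue.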

10.7 Formatted output

This part uses some math formulas, and I have forgotten too much math, so I'm skipping it.
