Reading the source code of the third-party Python library Faker

Background

Faker is a third-party Python library, open-sourced on GitHub, that is mainly used to generate fake data. The data it can create covers geographic information, basic personal information, account information, network information, browser information, file information, numbers, text, encryption-related values, time information, and other categories.

Source code: https://github.com/joke2k/faker
Quick function reference: https://blog.csdn.net/qq_41545431/article/details/105006681

Reading the source

Reading the main source

Starting from the instantiation below, jump directly into the class's initialization code:

fake = Faker(locale='zh_CN')

The core source code is as follows:

proxy.py file
from __future__ import absolute_import, unicode_literals
from collections import OrderedDict
import random
import re
import six
from faker.config import DEFAULT_LOCALE
from faker.factory import Factory
from faker.generator import Generator
from faker.utils.distribution import choices_distribution

class Faker(object):
    """Proxy class capable of supporting multiple locales"""

    cache_pattern = re.compile(r'^_cached_\w*_mapping$')
    generator_attrs = [
        attr for attr in dir(Generator)
        if not attr.startswith('__')
        and attr not in ['seed', 'seed_instance', 'random']
    ]
    def __init__(self, locale=None, providers=None,
                 generator=None, includes=None, **config):
        self._factory_map = OrderedDict()
        self._weights = None

        if isinstance(locale, six.string_types):
            locales = [locale.replace('-', '_')]

        # This guarantees a FIFO ordering of elements in `locales` based on the final
        # locale string while discarding duplicates after processing
        elif isinstance(locale, (list, tuple, set)):
            assert all(isinstance(l, six.string_types) for l in locale)
            locales = []
            for l in locale:
                final_locale = l.replace('-', '_')
                if final_locale not in locales:
                    locales.append(final_locale)

        elif isinstance(locale, OrderedDict):
            assert all(isinstance(v, (int, float)) for v in locale.values())
            odict = OrderedDict()
            for k, v in locale.items():
                key = k.replace('-', '_')
                odict[key] = v
            locales = list(odict.keys())
            self._weights = list(odict.values())

        else:
            locales = [DEFAULT_LOCALE]

        for locale in locales:
            self._factory_map[locale] = Factory.create(locale, providers, generator, includes, **config)

        self._locales = locales
        self._factories = list(self._factory_map.values())

This is the main class file. It imports external packages such as collections.OrderedDict, random, and six.
The key external packages are described below:

1. collections.OrderedDict is a dictionary that keeps its elements in insertion order. Python dictionaries are stored by hash value, which historically made them unordered (since Python 3.7 the built-in dict also preserves insertion order); OrderedDict guarantees the ordering explicitly.
2. The name six comes from 6 = 2 x 3. The library exists mainly to keep code compatible with both Python 2 and Python 3.
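
To make the point about ordering concrete, here is a small sketch using only the standard library. The locale names and weights are made-up sample values; the sketch shows that an OrderedDict keeps its keys in insertion order and, unlike a plain dict, also takes order into account when compared.

```python
from collections import OrderedDict

# Made-up sample data: locale names mapped to relative weights.
first = OrderedDict([("en_US", 2), ("zh_CN", 1)])
second = OrderedDict([("zh_CN", 1), ("en_US", 2)])

keys_in_order = list(first.keys())
order_sensitive_equal = first == second     # OrderedDict comparison checks order too
plain_equal = dict(first) == dict(second)   # plain dict comparison ignores order

print(keys_in_order)          # ['en_US', 'zh_CN']
print(order_sensitive_equal)  # False
print(plain_equal)            # True
```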

When the class body is executed, the attributes of Generator are filtered by a simple rule (dunder names and seed, seed_instance, and random are excluded) and stored in generator_attrs for later use. What does __init__ set up? It defines two nominally private variables so that they are not used from outside the class. They are only nominally private because Python has no real privacy for methods or attributes. By convention, names with a leading underscore _ are not part of the API and should not be accessed from outside the class; at module level, such names are also skipped by from M import *. If you really want to access them, though, you still can. self._factory_map holds an OrderedDict() instance, and self._weights is given an initial value of None so that it always exists when the class is used. Next comes the branching on the locale parameter, which is roughly as follows:

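proxy.py file
if isinstance(locale, six.string_types):
    # locale is a string: replace '-' with '_' and store it in locales
    locales = [locale.replace('-', '_')]

elif isinstance(locale, (list, tuple, set)):
    # locale is a list, tuple, or set: check that every element is a string,
    # replace '-' with '_' in each one, and store the de-duplicated results in locales
    locales = []
    for l in locale:
        final_locale = l.replace('-', '_')
        if final_locale not in locales:
            locales.append(final_locale)
elif isinstance(locale, OrderedDict):
    # locale is an ordered dict: check that every value is a number, replace '-'
    # with '_' in each key, store the keys in locales and the values in the
    # previously defined self._weights
    locales = list(odict.keys())
    self._weights = list(odict.values())
else:
    # none of the above: store the DEFAULT_LOCALE from the config module in a list
    locales = [DEFAULT_LOCALE]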
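
The branching above can be tried out in isolation. The following is a minimal sketch, not the library's code: normalize_locales is a hypothetical helper name, str stands in for six.string_types (Python 3 only), and the DEFAULT_LOCALE value is assumed for the sake of the example.

```python
from collections import OrderedDict

DEFAULT_LOCALE = "en_US"  # assumed default, for this sketch only


def normalize_locales(locale=None):
    """Hypothetical helper: return (locales, weights) for the given input."""
    weights = None
    if isinstance(locale, str):
        locales = [locale.replace("-", "_")]
    elif isinstance(locale, (list, tuple, set)):
        locales = []
        for item in locale:
            final_locale = item.replace("-", "_")
            if final_locale not in locales:  # drop duplicates, keep first-seen order
                locales.append(final_locale)
    elif isinstance(locale, OrderedDict):
        normalised = OrderedDict((k.replace("-", "_"), v) for k, v in locale.items())
        locales = list(normalised.keys())
        weights = list(normalised.values())
    else:
        locales = [DEFAULT_LOCALE]
    return locales, weights


print(normalize_locales("zh-CN"))                      # (['zh_CN'], None)
print(normalize_locales(["en-US", "en_US", "ja-JP"]))  # (['en_US', 'ja_JP'], None)
print(normalize_locales(OrderedDict([("en-US", 2), ("zh-CN", 1)])))  # (['en_US', 'zh_CN'], [2, 1])
print(normalize_locales())                             # (['en_US'], None)
```

Whatever form the input takes, the result is always a list of underscore-style locale names, which is what the later loop over locales relies on.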

Why does the locale parameter get so many checks? Because initialization does something very important with locale, and that step places strict demands on its value. Specifically, look at the source:

proxy.py file
for locale in locales:
    self._factory_map[locale] = Factory.create(locale, providers, generator, includes, **config)

This code creates one mapping entry per locale. It uses the create method of the Factory class (the factory pattern) and passes through the parameters that the current class received. So apart from the checks on the locale parameter, does Faker do any other validation? Yes: the key attributes self._weights, self._factories, and self._factory_map.items() are made read-only to outside code by exposing them through the @property decorator.
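
As an illustration of the read-only idea, here is a minimal sketch. The class name Proxy and its single attribute are hypothetical and are not taken from the Faker source; the sketch only shows how a property with no setter rejects assignment from outside.

```python
class Proxy:
    """Hypothetical class exposing a nominally private value read-only."""

    def __init__(self, weights):
        self._weights = weights  # nominally private storage

    @property
    def weights(self):
        # Only a getter is defined, so assigning to .weights fails.
        return self._weights


proxy = Proxy([2, 1])
print(proxy.weights)  # [2, 1]

try:
    proxy.weights = [1, 1]
except AttributeError:
    assignment_blocked = True
else:
    assignment_blocked = False

print(assignment_blocked)  # True
```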

Reading the magic methods

An instance of a class does not support subscripting by default, so to make the class more flexible the __getitem__ magic method was added, which lets us write Faker()['pujen']. What does Faker()['pujen'] return? In the source this operation raises a KeyError, because Faker of course has no language pack named pujen.

proxy.py file
def __getitem__(self, locale):
    return self._factory_map[locale.replace('-', '_')]

fake = Faker(locale='zh_CN')
print(fake['zh_CN'])

>>>    <faker.generator.Generator object at 0x0000021AEE18FDD8>

Next, an example to better understand how the __getitem__ magic method works:

class Fake(object):
    def __init__(self):
        self.name = 'jack'

    def __getitem__(self, item):
        if item in self.__dict__:       # item is the key; check whether it exists in the object's __dict__
            return self.__dict__[item]  # return the value stored under that key in __dict__

    def __setitem__(self, key, value):
        self.__dict__[key] = value      # set the value for the given key in the object's __dict__

    def __delitem__(self, key):
        del self.__dict__[key]          # delete the given key from the object's __dict__

f1 = Fake()
print(f1['name'])   # jack
f1['age'] = 10
print(f1['age'])    # 10
del f1['name']
print(f1.__dict__)  # {'age': 10}
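
The Fake example returns None for a missing key, whereas a __getitem__ that indexes a dictionary directly raises KeyError. The sketch below shows that second behaviour. LocaleProxy is a hypothetical class, and the placeholder objects stand in for whatever the real mapping would hold.

```python
class LocaleProxy:
    """Hypothetical class with a dict-backed __getitem__."""

    def __init__(self, locales):
        # Map each normalised locale name to a placeholder object.
        self._factory_map = {name.replace("-", "_"): object() for name in locales}

    def __getitem__(self, locale):
        # Plain dict lookup: a missing key raises KeyError.
        return self._factory_map[locale.replace("-", "_")]


proxy = LocaleProxy(["zh-CN"])
same_object = proxy["zh-CN"] is proxy["zh_CN"]  # both spellings hit the same entry
print(same_object)  # True

missing = None
try:
    proxy["pujen"]
except KeyError as exc:
    missing = exc.args[0]
print(missing)  # pujen
```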

Next, look at the __getattribute__ method. It appears in this class mainly to prevent seed() from being called directly on an instance; it has to be called in the form Faker.seed() instead. In the Faker source, seed() is actually Generator.seed(), which seeds the random number generator. If the attribute being accessed is not seed() and normal lookup does not find it, the __getattr__ method runs. What it mainly does in Faker is this:

proxy.py file
def __getattr__(self, attr):
    """
    Handles cache access and proxying behavior
    :param attr: attribute name
    :return: the appropriate attribute
    """
    # Conditional branches handle the special cases; the normal path ends with this code
    factory = self._select_factory(attr)
    return getattr(factory, attr)
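
The interplay of the two hooks can be sketched in a few lines. This is an illustration under stated assumptions, not the Faker implementation: Engine and Proxy are hypothetical classes, and a shared random.Random instance stands in for the real generator state.

```python
import random

_shared_random = random.Random()  # stand-in for the shared generator state


class Engine:
    """Hypothetical object that actually owns the data-producing methods."""

    def random_digit(self):
        return _shared_random.randint(0, 9)


class Proxy:
    def __init__(self):
        self._engine = Engine()

    def __getattribute__(self, attr):
        # Runs for every attribute access on an instance.
        if attr == "seed":
            raise TypeError("call Proxy.seed() on the class, not on an instance")
        return object.__getattribute__(self, attr)

    def __getattr__(self, attr):
        # Runs only when normal lookup failed with AttributeError.
        return getattr(self._engine, attr)

    @classmethod
    def seed(cls, value=None):
        _shared_random.seed(value)


proxy = Proxy()
digit = proxy.random_digit()  # not defined on Proxy, so __getattr__ delegates
print(0 <= digit <= 9)        # True

instance_seed_blocked = False
try:
    proxy.seed(1)
except TypeError:
    instance_seed_blocked = True
print(instance_seed_blocked)  # True

Proxy.seed(1)                 # the class-level call is allowed
```

Accessing seed on the class itself does not go through the instance-level __getattribute__, which is why Proxy.seed(1) still works.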

Factory pattern

During initialization we saw that the core content ultimately comes from Factory.create(), the factory-pattern creation method, so let's look at this factory function next. In Factory, create() is declared as a class method:

factory.py file
    @classmethod
    def create(
            cls,
            locale=None,
            providers=None,
            generator=None,
            includes=None,
            **config):
        if includes is None:
            includes = []

        # fix locale to package name
        locale = locale.replace('-', '_') if locale else DEFAULT_LOCALE
        locale = pylocale.normalize(locale).split('.')[0]  # returns the normalized locale code
        if locale not in AVAILABLE_LOCALES:
            msg = 'Invalid configuration for faker locale `{0}`'.format(locale)
            raise AttributeError(msg)

        config['locale'] = locale
        providers = providers or PROVIDERS  # PROVIDERS is a sorted collection

        providers += includes

        faker = generator or Generator(**config)

        for prov_name in providers:
            if prov_name == 'faker.providers':
                continue

            # prov_cls is the provider class from faker.providers; lang_found is the language pack name
            prov_cls, lang_found = cls._get_provider_class(prov_name, locale)
            provider = prov_cls(faker)  # instantiate the provider, passing in the Generator instance
            provider.__provider__ = prov_name
            provider.__lang__ = lang_found
            faker.add_provider(provider)  # add the provider's methods to the generator
        return faker

From the code above we can see that the method basically normalizes the locale and adds the providers of the matching language pack to the generator. Let's go through a few details of the code:

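factory.py file
1.
providers += includes

providers is a list (it may be empty) and includes is a collection of extra provider names.

Suppose providers=[] and includes={1,2,3,4}.
After providers += includes, providers holds the elements 1, 2, 3, 4. In other words, this statement appends the items of the collection to the list.
2.
faker = generator or Generator(**config)
provider = prov_cls(faker)

Here faker is a Generator instance and prov_cls is a class, so prov_cls(faker) creates a provider instance that is handed the Generator instance. The provider keeps a reference to the generator; it does not inherit from Generator.
3.
provider.__provider__ = prov_name
provider.__lang__ = lang_found
faker.add_provider(provider)  # add the provider's methods to the generator

This records the provider name and the language pack on each provider instance, then dynamically adds the provider's methods to the generator through the Generator.add_provider() method.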
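
Point 1 is easy to check in isolation. In the sketch below the provider names are placeholders (my_pkg.providers.extra is hypothetical); it shows that += on a list accepts any iterable, including a set, and that it extends the existing list object in place rather than creating a new one.

```python
providers = ["faker.providers.address"]  # an existing list of provider names
alias = providers                        # second reference to the same list
includes = {"my_pkg.providers.extra"}    # hypothetical extra provider, given as a set

providers += includes                    # extends the list in place from any iterable

print(providers)           # ['faker.providers.address', 'my_pkg.providers.extra']
print(alias is providers)  # True: the original list object was mutated
```

With more than one element in the set, the order in which the items are appended follows the set's iteration order, so it should not be relied on.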

The main hidden method in Faker's classes

That largely covers the main functionality of the factory's create() method; the other methods inside the class will not be studied further for now. Next, look at the Generator.add_provider() method that create() relies on. The method is as follows:

generator.py file
def add_provider(self, provider):

    if isinstance(provider, type):
        provider = provider(self)

    self.providers.insert(0, provider)  # insert the provider at index 0

    for method_name in dir(provider):
        # skip 'private' method
        if method_name.startswith('_'):
            continue

        faker_function = getattr(provider, method_name)  # look the attribute up dynamically by name

        if callable(faker_function):  # callable() checks whether an object can be called
            # add all faker method to generator
            self.set_formatter(method_name, faker_function)

Below are some basic notes on the constructs used here, which can serve as a reference when we write our own code:

if isinstance(provider, type):

Description: isinstance(object, classinfo) returns True if object is an instance of classinfo, or of a (direct, indirect, or virtual) subclass of it. If object is not an object of the given type, the function always returns False. If classinfo is a tuple of type objects (or, recursively, other such tuples), it returns True if object is an instance of any of those types. If classinfo is not a type or a tuple of types and such tuples, a TypeError exception is raised. In add_provider, the check isinstance(provider, type) asks whether provider is itself a class rather than an instance, in which case it is instantiated first.
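
A quick check of what isinstance(x, type) distinguishes, using a hypothetical empty class:

```python
class BaseProvider:
    """Hypothetical provider class used only for this check."""


is_class = isinstance(BaseProvider, type)       # the class object itself
is_instance = isinstance(BaseProvider(), type)  # an instance of the class

print(is_class)     # True
print(is_instance)  # False
```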

for method_name in dir(provider):

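About dir(): if the provider object does not define a __dir__ method, dir() does its best to gather the attribute names from the object's __dict__ and from its class.

Next come these two built-in functions, which are used to fetch an attribute dynamically by name and to check whether the returned object can be called:

faker_function = getattr(provider, method_name)  # look the attribute up dynamically by name

if callable(faker_function):  # callable() checks whether an object can be called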
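
Putting dir(), getattr(), and callable() together, the registration loop can be sketched as follows. MiniGenerator and ColorProvider are hypothetical classes, and the sketch assumes that registering a method amounts to a setattr on the generator; it is a simplified illustration rather than the library's code.

```python
class MiniGenerator:
    """Hypothetical, simplified stand-in for a generator class."""

    def __init__(self):
        self.providers = []

    def add_provider(self, provider):
        if isinstance(provider, type):   # a class was passed: instantiate it
            provider = provider(self)
        self.providers.insert(0, provider)
        for method_name in dir(provider):
            if method_name.startswith("_"):  # skip private and dunder names
                continue
            candidate = getattr(provider, method_name)
            if callable(candidate):
                # Assumed registration step: bind the method onto the generator.
                setattr(self, method_name, candidate)


class ColorProvider:
    colors = ("red", "green", "blue")  # data attribute: not callable, so not copied

    def __init__(self, generator):
        self.generator = generator

    def first_color(self):
        return self.colors[0]


gen = MiniGenerator()
gen.add_provider(ColorProvider)
print(gen.first_color())       # red
print(hasattr(gen, "colors"))  # False
```

The generator class defines no first_color method of its own; the name only exists on the instance after the provider has been added, which is also why static tools have trouble finding it.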

That completes the introduction to the core method of the Generator class!

How Faker's internal logic runs

When writing code, we may want to look inside a method in PyCharm with Ctrl + left-click. Something strange happens: PyCharm does not jump into the corresponding method, and at the same time it shows us a hint:

fake = Faker(locale='zh_CN')
fake.random_digit_not_null()


The source analysis above makes this clear as well: Faker's methods and attributes are not written directly under the class, as we would usually do. Throughout the whole analysis we never saw a method or attribute that directly creates fake data. So let's look at the internal logic by which these methods actually run.

As shown, the internal logic actually calls the Generator.add_provider method in generator.py. As mentioned above, one statement in add_provider deserves particular attention:

for method_name in dir(provider):

This loop basically goes through all the methods and attributes of the provider for the corresponding language pack and loads them. In other words, Faker's attributes and methods are actually stored somewhere else and fetched when needed, which keeps the Faker class itself looking simple. So in what form are they stored externally?

We can see that there is a separate providers package. It contains a number of subpackages, one per category of methods, and one level below each of those are the corresponding language packs. Let's look at how a specific method is represented internally.

We can see that the raw data is basically stored in tuples, and the results we get ultimately come from there. So how does the method finally get run? This was in fact already mentioned in the source analysis above: it uses the following line in __getattr__():

return getattr(factory, attr)

The concrete implementation of each method or function is not covered here because there are too many of them; most are implemented on top of the standard random library.
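
To give a feel for that last step, here is a minimal sketch of a provider that keeps its raw data in a tuple and picks from it with the random module. JobProvider and its job titles are made up for illustration and are not taken from any Faker language pack.

```python
import random


class JobProvider:
    """Hypothetical provider; the job titles are made-up sample data."""

    jobs = ("Engineer", "Teacher", "Designer")  # raw data kept in a tuple

    def __init__(self, rng):
        self.rng = rng

    def job(self):
        # Pick one element from the tuple at random.
        return self.rng.choice(self.jobs)


provider = JobProvider(random.Random(0))
method = getattr(provider, "job")  # dynamic lookup by name
value = method()
print(value in JobProvider.jobs)   # True
```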

This article first appeared on the WeChat official account "Software Testing Micro Classroom".

Origin www.cnblogs.com/pujenyuan/p/12615835.html