A simple tutorial for the Python library pydantic

1. Introduction

pydantic is a Python library for defining and validating data interfaces and for managing settings.

pydantic enforces type hints at runtime and provides user-friendly errors when data is invalid.

It has the following advantages:

  1. Works well with IDEs and linters: there are no new patterns to learn, you just define models with type annotations
  2. Dual purpose: with BaseSettings, pydantic can be used both to validate request data and to read system settings from environment variables
  3. Fast
  4. Can validate complex structures
  5. Extensible: validation can be extended with methods on a model that are decorated with the validator decorator
  6. Dataclass integration: in addition to BaseModel, pydantic provides a dataclass decorator which creates plain Python dataclasses with input data parsing and validation.

2. Installation

pip install pydantic

To test that pydantic has been compiled, run:

import pydantic
print('compiled:', pydantic.compiled)

To load configuration from a dotenv file, python-dotenv must also be installed:

pip install pydantic[dotenv]

3. Common Models

In pydantic, objects are defined through models; you can think of a model as a type in a strictly typed language.

1. BaseModel (basic model)

from pydantic import BaseModel

class User(BaseModel):
    id: int
    name = 'Jane Doe'

The example above defines a User model that inherits from BaseModel and has two fields: id is an integer and is required; name is a string with a default value, so it is not required.

Instantiate using:

user = User(id='123')

Instantiation performs all parsing and validation (here the string '123' is converted to the integer 123); if the data is invalid, a ValidationError is raised.
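
To make the parsing and validation step concrete, here is a minimal sketch that reuses the User model from above. The name field is annotated explicitly here, and the invalid input 'not a number' is just an illustrative value.

```python
from pydantic import BaseModel, ValidationError


class User(BaseModel):
    id: int
    name: str = 'Jane Doe'


# The string '123' is coerced to the integer 123 during validation.
user = User(id='123')
print(user.id, user.name)

# A value that cannot be parsed as an integer raises ValidationError.
try:
    User(id='not a number')
except ValidationError as exc:
    # exc.errors() returns one dict per failed field; 'loc' names the field.
    failed_fields = [err['loc'] for err in exc.errors()]
    print(failed_fields)
```

Collecting the failing locations from exc.errors() is often handier than printing the exception when the errors need to be returned to a caller.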

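Models have the following methods and attributes:

  • dict() returns a dictionary of the model's fields and values

  • json() returns a JSON string representation of dict()

  • copy() returns a copy of the model (a shallow copy by default)

  • parse_obj() parses a dict into a model

  • parse_raw() takes a str or bytes, parses it as JSON, then passes the result to parse_obj()

  • parse_file() takes a file path, reads the file and passes the contents to parse_raw(). If content_type is omitted, it is inferred from the file's extension

  • from_orm() creates a model from an ORM object

  • schema() returns a dictionary describing the model's schema

  • schema_json() returns a JSON string representation of that dictionary

  • construct() creates a model without running validation

  • __fields_set__ the set of field names that were set when the model instance was initialised

  • __fields__ a dictionary of the model's fields

  • __config__ the model's configuration class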
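
A few of these methods can be tried out with the short sketch below. It is written against the pydantic 1.x API used throughout this tutorial and reuses the User model from above; the sample values (7, 'Ann', 8) are arbitrary.

```python
import json

from pydantic import BaseModel


class User(BaseModel):
    id: int
    name: str = 'Jane Doe'


user = User(id=123)

as_dict = user.dict()              # plain dict of field names and values
as_json = json.loads(user.json())  # json() returns a string; decode it again
clone = user.copy()                # a new object with the same field values

# parse_obj() validates a dict; parse_raw() validates a JSON string.
parsed = User.parse_obj({'id': 7, 'name': 'Ann'})
raw = User.parse_raw('{"id": 8}')

print(as_dict)
print(parsed.name, raw.id, raw.name)
```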

2. Recursive models

More complex data structures can be defined by using models themselves as types in annotations.

from typing import List
from pydantic import BaseModel


class Foo(BaseModel):
    count: int
    size: float = None


class Bar(BaseModel):
    apple = 'x'
    banana = 'y'


class Spam(BaseModel):
    foo: Foo
    bars: List[Bar]
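
As a rough illustration of how such nested models are populated, the sketch below repeats the three models (with explicit annotations added) and builds a Spam instance from plain dicts. The input values are made up for the example.

```python
from typing import List, Optional

from pydantic import BaseModel


class Foo(BaseModel):
    count: int
    size: Optional[float] = None


class Bar(BaseModel):
    apple: str = 'x'
    banana: str = 'y'


class Spam(BaseModel):
    foo: Foo
    bars: List[Bar]


# Nested dicts are validated and converted into Foo and Bar instances.
spam = Spam(foo={'count': 4}, bars=[{'apple': 'x1'}, {'apple': 'x2'}])
print(type(spam.foo).__name__)
print(spam.dict())
```

Calling dict() on the outer model converts the nested models back into dictionaries as well.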

3. GenericModel (generic models):

Create an instance of typing.TypeVar, pass it as a parameter to typing.Generic, and then use it in a model that inherits from pydantic.generics.GenericModel:

from typing import Generic, TypeVar, Optional, List

from pydantic import BaseModel, validator, ValidationError
from pydantic.generics import GenericModel

DataT = TypeVar('DataT')


class Error(BaseModel):
    code: int
    message: str


class DataModel(BaseModel):
    numbers: List[int]
    people: List[str]


class Response(GenericModel, Generic[DataT]):
    data: Optional[DataT]
    error: Optional[Error]

    @validator('error', always=True)
    def check_consistency(cls, v, values):
        if v is not None and values.get('data') is not None:
            raise ValueError('must not provide both data and error')
        if v is None and values.get('data') is None:
            raise ValueError('must provide data or error')
        return v


data = DataModel(numbers=[1, 2, 3], people=[])
error = Error(code=404, message='Not found')

print(Response[int](data=1))
#> data=1 error=None
print(Response[str](data='value'))
#> data='value' error=None
print(Response[str](data='value').dict())
#> {'data': 'value', 'error': None}
print(Response[DataModel](data=data).dict())
"""
{
    'data': {'numbers': [1, 2, 3], 'people': []},
    'error': None,
}
"""
print(Response[DataModel](error=error).dict())
"""
{
    'data': None,
    'error': {'code': 404, 'message': 'Not found'},
}
"""
try:
    Response[int](data='value')
except ValidationError as e:
    print(e)
    """
    2 validation errors for Response[int]
    data
      value is not a valid integer (type=type_error.integer)
    error
      must provide data or error (type=value_error)
    """

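4. create_model (dynamic models)

In some cases the structure of a model is not known until runtime. For this, pydantic provides the create_model function, which allows models to be created dynamically.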

from pydantic import BaseModel, create_model

DynamicFoobarModel = create_model('DynamicFoobarModel', foo=(str, ...), bar=123)
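
A dynamically created model behaves like one declared with a class statement. The sketch below assumes the same two fields as above, with bar spelled out as an explicit (type, default) tuple, and shows that foo is required while bar falls back to its default.

```python
from pydantic import ValidationError, create_model

# foo is required (the ... marker); bar is an int defaulting to 123.
DynamicFoobarModel = create_model(
    'DynamicFoobarModel',
    foo=(str, ...),
    bar=(int, 123),
)

instance = DynamicFoobarModel(foo='hello')
print(instance.foo, instance.bar)

# Omitting the required field fails validation just as for a normal model.
try:
    DynamicFoobarModel()
except ValidationError as exc:
    missing = [err['loc'] for err in exc.errors()]
    print(missing)
```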

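4. Common types

  • None, type(None) or Literal[None] only allow the value None

  • bool boolean type

  • int integer type

  • float floating-point type

  • str string type

  • bytes bytes type

  • list accepts list, tuple, set, frozenset, deque, or generators and converts to a list

  • tuple accepts list, tuple, set, frozenset, deque, or generators and converts to a tuple

  • dict dictionary type

  • set accepts list, tuple, set, frozenset, deque, or generators and converts to a set

  • frozenset accepts list, tuple, set, frozenset, deque, or generators and converts to a frozenset

  • deque accepts list, tuple, set, frozenset, deque, or generators and converts to a deque

  • date and time types from datetime, such as date, datetime, time and timedelta

  • types from typing, such as Deque, Dict, FrozenSet, List, Optional, Sequence, Set, Tuple, Union, Callable and Pattern

  • FilePath a path to a file

  • DirectoryPath a path to a directory

  • EmailStr an email address

  • NameEmail a valid email address, either on its own or in the format Fred Bloggs <fred.bloggs@example.com>

  • PyObject expects a string and loads the Python object importable at that dotted path

  • Color colour type

  • AnyUrl any URL

  • SecretStr, SecretBytes sensitive values, which are formatted as '**********' or ''

  • Json JSON type

  • PaymentCardNumber payment card number type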

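  • Constrained types: the values of many common types can be restricted using the con* type functions

    • conlist
      1. item_type: Type[T]: type of the list items
      2. min_items: int = None: minimum number of items in the list
      3. max_items: int = None: maximum number of items in the list
    • conset
      1. item_type: Type[T]: type of the set items
      2. min_items: int = None: minimum number of items in the set
      3. max_items: int = None: maximum number of items in the set
    • conint
      1. strict: bool = False: controls type coercion
      2. gt: int = None: enforces the integer to be greater than the set value
      3. ge: int = None: enforces the integer to be greater than or equal to the set value
      4. lt: int = None: enforces the integer to be less than the set value
      5. le: int = None: enforces the integer to be less than or equal to the set value
      6. multiple_of: int = None: enforces the integer to be a multiple of the set value
    • confloat
      1. strict: bool = False: controls type coercion
      2. gt: float = None: enforces the float to be greater than the set value
      3. ge: float = None: enforces the float to be greater than or equal to the set value
      4. lt: float = None: enforces the float to be less than the set value
      5. le: float = None: enforces the float to be less than or equal to the set value
      6. multiple_of: float = None: enforces the float to be a multiple of the set value
    • condecimal
      1. gt: Decimal = None: enforces the decimal to be greater than the set value
      2. ge: Decimal = None: enforces the decimal to be greater than or equal to the set value
      3. lt: Decimal = None: enforces the decimal to be less than the set value
      4. le: Decimal = None: enforces the decimal to be less than or equal to the set value
      5. max_digits: int = None: maximum number of digits within the decimal. It does not include a zero before the decimal point or trailing decimal zeroes
      6. decimal_places: int = None: maximum number of decimal places allowed. It does not include trailing decimal zeroes
      7. multiple_of: Decimal = None: enforces the decimal to be a multiple of the set value
    • constr
      1. strip_whitespace: bool = False: removes leading and trailing whitespace
      2. to_lower: bool = False: turns all characters to lowercase
      3. strict: bool = False: controls type coercion
      4. min_length: int = None: minimum length of the string
      5. max_length: int = None: maximum length of the string
      6. curtail_length: int = None: shrinks the string to the set length when it is longer than the set value
      7. regex: str = None: regular expression to validate the string against
    • conbytes
      1. strip_whitespace: bool = False: removes leading and trailing whitespace
      2. to_lower: bool = False: turns all characters to lowercase
      3. min_length: int = None: minimum length of the byte string
      4. max_length: int = None: maximum length of the byte string

  • Strict types: you can use the StrictStr, StrictBytes, StrictInt, StrictFloat, and StrictBool types to prevent coercion from compatible types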

5. Validators

Custom validation and complex relationships between objects can be implemented with the validator decorator.

from pydantic import BaseModel, ValidationError, validator


class UserModel(BaseModel):
    name: str
    username: str
    password1: str
    password2: str

    @validator('name')
    def name_must_contain_space(cls, v):
        if ' ' not in v:
            raise ValueError('must contain a space')
        return v.title()

    @validator('password2')
    def passwords_match(cls, v, values, **kwargs):
        if 'password1' in values and v != values['password1']:
            raise ValueError('passwords do not match')
        return v

    @validator('username')
    def username_alphanumeric(cls, v):
        assert v.isalnum(), 'must be alphanumeric'
        return v


user = UserModel(
    name='samuel colvin',
    username='scolvin',
    password1='zxcvbn',
    password2='zxcvbn',
)
print(user)
#> name='Samuel Colvin' username='scolvin' password1='zxcvbn' password2='zxcvbn'

try:
    UserModel(
        name='samuel',
        username='scolvin',
        password1='zxcvbn',
        password2='zxcvbn2',
    )
except ValidationError as e:
    print(e)
    """
    2 validation errors for UserModel
    name
      must contain a space (type=value_error)
    password2
      passwords do not match (type=value_error)
    """

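A few things to note about validators:

  • validators are "class methods", so the first argument value they receive is the UserModel class, not an instance of UserModel
  • the second argument is always the field value to validate; it can be named as you please
  • a single validator can be applied to multiple fields by passing it multiple field names, and it can be called on all fields by passing the special value '*'
  • the keyword argument pre will cause the validator to be called prior to other validation
  • passing each_item=True will result in the validator being applied to individual values (e.g. of List, Dict, Set, etc.), rather than to the whole object. Note that when a validator on a subclass refers to a List field defined on the parent class, each_item=True causes the validator not to run, so the list has to be iterated over explicitly instead: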
from typing import List
from pydantic import BaseModel, ValidationError, validator


class ParentModel(BaseModel):
    names: List[str]


class ChildModel(ParentModel):
    @validator('names', each_item=True)
    def check_names_not_empty(cls, v):
        assert v != '', 'Empty strings are not allowed.'
        return v


# This will NOT raise a ValidationError because the validator was not called
try:
    child = ChildModel(names=['Alice', 'Bob', 'Eve', ''])
except ValidationError as e:
    print(e)
else:
    print('No ValidationError caught.')
    #> No ValidationError caught.


class ChildModel2(ParentModel):
    @validator('names')
    def check_names_not_empty(cls, v):
        for name in v:
            assert name != '', 'Empty strings are not allowed.'
        return v


try:
    child = ChildModel2(names=['Alice', 'Bob', 'Eve', ''])
except ValidationError as e:
    print(e)
    """
    1 validation error for ChildModel2
    names
      Empty strings are not allowed. (type=assertion_error)
    """
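  • the keyword argument always will cause the validator to always be called. For performance reasons, by default validators are not called for fields when a value is not supplied. However, there are situations where it may be useful or required to always call the validator, e.g. to set a dynamic default value.
  • allow_reuse allows the same validator to be used on multiple fields/models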
from pydantic import BaseModel, validator


def normalize(name: str) -> str:
    return ' '.join((word.capitalize()) for word in name.split(' '))


class Producer(BaseModel):
    name: str

    # validators
    _normalize_name = validator('name', allow_reuse=True)(normalize)


class Consumer(BaseModel):
    name: str

    # validators
    _normalize_name = validator('name', allow_reuse=True)(normalize)
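
To see the always and allow_reuse arguments in action, here is a short sketch against the pydantic 1.x validator API. The Producer model is the one from above; DemoModel and its ts field are hypothetical names used only for this illustration.

```python
from datetime import datetime

from pydantic import BaseModel, validator


def normalize(name: str) -> str:
    # Capitalise each space-separated word, e.g. 'john doe' -> 'John Doe'.
    return ' '.join(word.capitalize() for word in name.split(' '))


class Producer(BaseModel):
    name: str

    # The plain function is attached as a validator via allow_reuse=True.
    _normalize_name = validator('name', allow_reuse=True)(normalize)


class DemoModel(BaseModel):
    ts: datetime = None

    # pre=True runs before type validation; always=True runs even when no
    # value was supplied, which makes a dynamic default possible.
    @validator('ts', pre=True, always=True)
    def set_ts_now(cls, v):
        return v or datetime.now()


producer = Producer(name='john doe')
print(producer.name)

default_ts = DemoModel().ts                       # filled in by the validator
explicit_ts = DemoModel(ts='2017-11-08T14:00').ts  # supplied value is kept
print(explicit_ts)
```

Without always=True the validator would be skipped when ts is omitted, and the field would simply keep its None default.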

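6. Settings

If you create a model that inherits from BaseSettings, the model initialiser will attempt to determine the values of any fields not passed as keyword arguments by reading from the environment. (Default values will still be used if the matching environment variable is not set.)

This makes it easy to:

  • Create a clearly defined, type-hinted application configuration class
  • Automatically read modifications to the configuration from environment variables
  • Manually override specific settings in the initialiser where needed (e.g. in unit tests)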
from typing import Set

from pydantic import (
    BaseModel,
    BaseSettings,
    PyObject,
    RedisDsn,
    PostgresDsn,
    Field,
)


class SubModel(BaseModel):
    foo = 'bar'
    apple = 1


class Settings(BaseSettings):
    auth_key: str
    api_key: str = Field(..., env='my_api_key')

    redis_dsn: RedisDsn = 'redis://user:pass@localhost:6379/1'
    pg_dsn: PostgresDsn = 'postgres://user:pass@localhost:5432/foobar'

    special_function: PyObject = 'math.cos'

    # to override domains:
    # export my_prefix_domains='["foo.com", "bar.com"]'
    domains: Set[str] = set()

    # to override more_settings:
    # export my_prefix_more_settings='{"foo": "x", "apple": 1}'
    more_settings: SubModel = SubModel()

    class Config:
        env_prefix = 'my_prefix_'  # defaults to no prefix, i.e. ""
        fields = {
            'auth_key': {
                'env': 'my_auth_key',
            },
            'redis_dsn': {
                'env': ['service_redis_dsn', 'redis_url']
            }
        }


print(Settings().dict())
"""
{
    'auth_key': 'xxx',
    'api_key': 'xxx',
    'redis_dsn': RedisDsn('redis://user:pass@localhost:6379/1',
scheme='redis', user='user', password='pass', host='localhost',
host_type='int_domain', port='6379', path='/1'),
    'pg_dsn': PostgresDsn('postgres://user:pass@localhost:5432/foobar',
scheme='postgres', user='user', password='pass', host='localhost',
host_type='int_domain', port='5432', path='/foobar'),
    'special_function': <built-in function cos>,
    'domains': set(),
    'more_settings': {'foo': 'bar', 'apple': 1},
}
"""

Dotenv files can also be used to set variables; pydantic can load them in two ways:

class Settings(BaseSettings):
    ...

    class Config:
        env_file = '.env'
        env_file_encoding = 'utf-8'

or

settings = Settings(_env_file='prod.env', _env_file_encoding='utf-8')

Even when a dotenv file is used, pydantic will still read environment variables, and these always take precedence over values loaded from the dotenv file.

pydantic also supports reading sensitive values from secrets files, which can likewise be loaded in two ways:

class Settings(BaseSettings):
    ...
    database_password: str

    class Config:
        secrets_dir = '/var/run'

or:

settings = Settings(_secrets_dir='/var/run')

Even when a secrets directory is used, pydantic will still read variables from a dotenv file or the environment, and both the dotenv file and environment variables always take precedence over values loaded from the secrets directory.

7. Use with mypy

Pydantic ships with a mypy plugin that adds a number of important pydantic-specific features to mypy to improve its ability to type-check code.

For example the following script:

from datetime import datetime
from typing import List, Optional
from pydantic import BaseModel, NoneStr


class Model(BaseModel):
    age: int
    first_name = 'John'
    last_name: NoneStr = None
    signup_ts: Optional[datetime] = None
    list_of_ints: List[int]


m = Model(age=42, list_of_ints=[1, '2', b'3'])
print(m.middle_name)  # not a model field!
Model()  # will raise a validation error for age and list_of_ints

Without any special configuration, mypy catches one of these errors:

13: error: "Model" has no attribute "middle_name"

When the plugin is enabled, it catches both problems:

13: error: "Model" has no attribute "middle_name"
16: error: Missing named argument "age" for "Model"
16: error: Missing named argument "list_of_ints" for "Model"

To enable the plugin, just add pydantic.mypy to the list of plugins in the mypy configuration file:

[mypy]
plugins = pydantic.mypy

To change the value of a plugin setting, create a section called [pydantic-mypy] in the mypy configuration file and add a key-value pair for each setting you want to override:

[mypy]
plugins = pydantic.mypy

follow_imports = silent
warn_redundant_casts = True
warn_unused_ignores = True
disallow_any_generics = True
check_untyped_defs = True
no_implicit_reexport = True

# for strict mypy: (this is the tricky one :-))
disallow_untyped_defs = True

[pydantic-mypy]
init_forbid_extra = True
init_typed = True
warn_required_dynamic_aliases = True
warn_untyped_fields = True


Origin juejin.im/post/7079027549896081421