1. Introduction
pydantic is a Python library for data validation and settings management based on type annotations.
It enforces type hints at runtime and provides user-friendly errors when data is invalid.
It has the following advantages:
- Works well with IDEs and linters: there is no new schema language to learn, you just define classes with type annotations
- Dual use: BaseSettings can both validate request data and read system settings from environment variables
- Fast
- Can validate complex structures
- Extensible: validation can be extended with methods on a model decorated with the validator decorator
- Dataclass integration: in addition to BaseModel, pydantic provides a dataclass decorator which creates plain Python dataclasses with input data parsing and validation
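As a rough illustration of the dataclass decorator mentioned above, the sketch below uses pydantic.dataclasses.dataclass. The Point class and its fields are made up for this example and are not part of the original article.

```python
from pydantic import ValidationError
from pydantic.dataclasses import dataclass


@dataclass
class Point:  # hypothetical example class
    x: int
    y: int = 0


# Input is parsed on construction, so the string '3' becomes the int 3.
point = Point(x='3')
print(point.x, point.y)

# Invalid input raises a ValidationError, just as it would with BaseModel.
try:
    Point(x='not a number')
    rejected = False
except ValidationError:
    rejected = True
print(rejected)
```

Apart from the validation step, Point is used like an ordinary dataclass.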
2. Installation
pip install pydantic
To check whether pydantic was installed as a compiled build, run:
import pydantic
print('compiled:', pydantic.compiled)
Loading configuration from a dotenv file is also supported, but requires python-dotenv to be installed:
pip install pydantic[dotenv]
3. Common Models
In pydantic, objects are defined through models. You can think of a model as a type in a strictly typed language.
1. BaseModel (basic model)
from pydantic import BaseModel

class User(BaseModel):
    id: int
    name = 'Jane Doe'
The example above defines a User model that inherits from BaseModel and has two fields: id is an integer and is required; name is a string with a default value, so it is optional.
Instantiate using:
user = User(id='123')
The instantiation will perform all parsing and validation, and if there is an error, a ValidationError will be raised.
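The following minimal sketch shows both outcomes of instantiation: a value that can be parsed into the declared type, and one that cannot. It reuses the User model from above, except that name is annotated explicitly here (an assumption made so the snippet does not depend on type inference from the default).

```python
from pydantic import BaseModel, ValidationError


class User(BaseModel):
    id: int
    name: str = 'Jane Doe'  # annotated explicitly in this sketch


# The string '123' can be parsed as an integer, so validation succeeds.
user = User(id='123')
print(user.id, user.name)

# A value that cannot be parsed as an integer is rejected.
failed_fields = []
try:
    User(id='not an int')
except ValidationError as exc:
    # errors() returns one entry per failing field; 'loc' holds the field path.
    failed_fields = [err['loc'][0] for err in exc.errors()]
print(failed_fields)
```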
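Models have the following methods and attributes:
- dict(): returns a dictionary of the model's fields and values
- json(): returns a JSON string representation of dict()
- copy(): returns a copy of the model (a shallow copy by default)
- parse_obj(): parses data from a dict
- parse_raw(): takes a str or bytes, parses it as JSON, then passes the result to parse_obj
- parse_file(): takes a file path, reads the file and passes the contents to parse_raw; if content_type is omitted, it is inferred from the file's extension
- from_orm(): creates a model from an ORM object
- schema(): returns a dictionary describing the model's schema
- schema_json(): returns the JSON string representation of that dictionary
- construct(): creates a model without running validation
- __fields_set__: the set of field names that were set when the model instance was initialized
- __fields__: a dictionary of the model's fields
- __config__: the configuration class of the model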
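A short sketch of several of these methods in use is given below. It assumes the pydantic 1.x API described in this article. The User model is the one from the previous section, with name annotated explicitly for the purposes of the sketch.

```python
import json

from pydantic import BaseModel


class User(BaseModel):
    id: int
    name: str = 'Jane Doe'


user = User(id=123)

as_dict = user.dict()                  # fields and values as a plain dict
as_json = user.json()                  # the same content as a JSON string
clone = user.copy()                    # shallow copy by default
from_dict = User.parse_obj({'id': 7})  # validate a plain dict
from_json = User.parse_raw('{"id": 8, "name": "Raw"}')  # parse JSON, then validate
unchecked = User.construct(id='oops')  # no validation: the bad id is kept as-is

print(as_dict)
print(json.loads(as_json) == as_dict)
print(user.__fields_set__)       # only 'id' was passed explicitly
print(from_json.__fields_set__)  # both fields were present in the input
```

Because construct() skips validation entirely, it is best reserved for data that is already known to be valid.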
2. Recursive models
More complex data structures can be defined by using models themselves as types in annotations.
from typing import List
from pydantic import BaseModel

class Foo(BaseModel):
    count: int
    size: float = None

class Bar(BaseModel):
    apple = 'x'
    banana = 'y'

class Spam(BaseModel):
    foo: Foo
    bars: List[Bar]
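To show what the nesting buys you, the sketch below builds a Spam instance from plain dicts and lets pydantic turn them into Foo and Bar instances. The models mirror the ones above, with the fields annotated explicitly; the input values are made up.

```python
from typing import List, Optional

from pydantic import BaseModel


class Foo(BaseModel):
    count: int
    size: Optional[float] = None


class Bar(BaseModel):
    apple: str = 'x'
    banana: str = 'y'


class Spam(BaseModel):
    foo: Foo
    bars: List[Bar]


# Nested dicts are validated and converted into the nested model types.
spam = Spam(foo={'count': 4}, bars=[{'apple': 'x1'}, {'apple': 'x2'}])

print(type(spam.foo).__name__, spam.foo.count)
print([bar.apple for bar in spam.bars])
print(spam.dict())
```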
3. GenericModel (generic models)
Create an instance of typing.TypeVar, pass it as a parameter to typing.Generic, and then use it in a model that inherits from pydantic.generics.GenericModel:
from typing import Generic, TypeVar, Optional, List
from pydantic import BaseModel, validator, ValidationError
from pydantic.generics import GenericModel

DataT = TypeVar('DataT')

class Error(BaseModel):
    code: int
    message: str

class DataModel(BaseModel):
    numbers: List[int]
    people: List[str]

class Response(GenericModel, Generic[DataT]):
    data: Optional[DataT]
    error: Optional[Error]

    @validator('error', always=True)
    def check_consistency(cls, v, values):
        if v is not None and values['data'] is not None:
            raise ValueError('must not provide both data and error')
        if v is None and values.get('data') is None:
            raise ValueError('must provide data or error')
        return v

data = DataModel(numbers=[1, 2, 3], people=[])
error = Error(code=404, message='Not found')

print(Response[int](data=1))
#> data=1 error=None
print(Response[str](data='value'))
#> data='value' error=None
print(Response[str](data='value').dict())
#> {'data': 'value', 'error': None}
print(Response[DataModel](data=data).dict())
"""
{
    'data': {'numbers': [1, 2, 3], 'people': []},
    'error': None,
}
"""
print(Response[DataModel](error=error).dict())
"""
{
    'data': None,
    'error': {'code': 404, 'message': 'Not found'},
}
"""
try:
    Response[int](data='value')
except ValidationError as e:
    print(e)
    """
    2 validation errors for Response[int]
    data
      value is not a valid integer (type=type_error.integer)
    error
      must provide data or error (type=value_error)
    """
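4. create_model (dynamic models)
In some cases the shape of a model is not known until runtime. For this, pydantic provides the create_model method, which allows models to be created dynamically.
from pydantic import BaseModel, create_model

DynamicFoobarModel = create_model('DynamicFoobarModel', foo=(str, ...), bar=123)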
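A dynamically created model behaves like one declared with a class statement. The sketch below illustrates this under one assumption: bar is written as a (type, default) tuple rather than as the bare default used above, purely to make its type explicit.

```python
from pydantic import ValidationError, create_model

# foo is required (the ... marker); bar is an int that defaults to 123.
DynamicFoobarModel = create_model(
    'DynamicFoobarModel',
    foo=(str, ...),
    bar=(int, 123),
)

instance = DynamicFoobarModel(foo='hello')
print(instance.dict())

# Leaving out the required field fails validation like any other model.
try:
    DynamicFoobarModel()
    missing_detected = False
except ValidationError:
    missing_detected = True
print(missing_detected)
```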
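4. Common Types
- None, type(None) or Literal[None]: only the value None is allowed
- bool: boolean
- int: integer
- float: floating-point number
- str: string
- bytes: bytes
- list: accepts a list, tuple, set, frozenset, deque or generator and converts it to a list
- tuple: accepts a list, tuple, set, frozenset, deque or generator and converts it to a tuple
- dict: dictionary
- set: accepts a list, tuple, set, frozenset, deque or generator and converts it to a set
- frozenset: accepts a list, tuple, set, frozenset, deque or generator and converts it to a frozenset
- deque: accepts a list, tuple, set, frozenset, deque or generator and converts it to a deque
- date, datetime, time, timedelta: the date and time types from the datetime module
- Deque, Dict, FrozenSet, List, Optional, Sequence, Set, Tuple, Union, Callable, Pattern and other types from typing
- FilePath: a file path
- DirectoryPath: a directory path
- EmailStr: an email address
- NameEmail: a valid email address, or a name followed by an address in angle brackets
- PyObject: expects a string and loads the Python object importable at that dotted path
- Color: a color
- AnyUrl: any URL
- SecretStr, SecretBytes: sensitive values, which are formatted as '**********' or '' when displayed
- Json: JSON
- PaymentCardNumber: a payment card number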
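- Constrained types: the values of many common types can be restricted using the con* type functions
  - conlist
    - item_type: Type[T]: type of the list items
    - min_items: int = None: minimum number of items in the list
    - max_items: int = None: maximum number of items in the list
  - conset
    - item_type: Type[T]: type of the set items
    - min_items: int = None: minimum number of items in the set
    - max_items: int = None: maximum number of items in the set
  - conint
    - strict: bool = False: controls type coercion
    - gt: int = None: enforces the integer to be greater than the set value
    - ge: int = None: enforces the integer to be greater than or equal to the set value
    - lt: int = None: enforces the integer to be less than the set value
    - le: int = None: enforces the integer to be less than or equal to the set value
    - multiple_of: int = None: enforces the integer to be a multiple of the set value
  - confloat
    - strict: bool = False: controls type coercion
    - gt: float = None: enforces the float to be greater than the set value
    - ge: float = None: enforces the float to be greater than or equal to the set value
    - lt: float = None: enforces the float to be less than the set value
    - le: float = None: enforces the float to be less than or equal to the set value
    - multiple_of: float = None: enforces the float to be a multiple of the set value
  - condecimal
    - gt: Decimal = None: enforces the decimal to be greater than the set value
    - ge: Decimal = None: enforces the decimal to be greater than or equal to the set value
    - lt: Decimal = None: enforces the decimal to be less than the set value
    - le: Decimal = None: enforces the decimal to be less than or equal to the set value
    - max_digits: int = None: maximum number of digits within the decimal, not counting a zero before the decimal point or trailing decimal zeroes
    - decimal_places: int = None: maximum number of decimal places allowed, not counting trailing decimal zeroes
    - multiple_of: Decimal = None: enforces the decimal to be a multiple of the set value
  - constr
    - strip_whitespace: bool = False: removes leading and trailing whitespace
    - to_lower: bool = False: converts all characters to lowercase
    - strict: bool = False: controls type coercion
    - min_length: int = None: minimum length of the string
    - max_length: int = None: maximum length of the string
    - curtail_length: int = None: truncates the string to the set length when it is longer than that
    - regex: str = None: regular expression to validate the string against
  - conbytes
    - strip_whitespace: bool = False: removes leading and trailing whitespace
    - to_lower: bool = False: converts all characters to lowercase
    - min_length: int = None: minimum length of the byte string
    - max_length: int = None: maximum length of the byte string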
- Strict types: the StrictStr, StrictBytes, StrictInt, StrictFloat and StrictBool types can be used to prevent coercion from compatible types
5. Validators
Custom validation and complex relationships between objects can be implemented with the validator decorator.
from pydantic import BaseModel, ValidationError, validator

class UserModel(BaseModel):
    name: str
    username: str
    password1: str
    password2: str

    @validator('name')
    def name_must_contain_space(cls, v):
        if ' ' not in v:
            raise ValueError('must contain a space')
        return v.title()

    @validator('password2')
    def passwords_match(cls, v, values, **kwargs):
        if 'password1' in values and v != values['password1']:
            raise ValueError('passwords do not match')
        return v

    @validator('username')
    def username_alphanumeric(cls, v):
        assert v.isalnum(), 'must be alphanumeric'
        return v

user = UserModel(
    name='samuel colvin',
    username='scolvin',
    password1='zxcvbn',
    password2='zxcvbn',
)
print(user)
#> name='Samuel Colvin' username='scolvin' password1='zxcvbn' password2='zxcvbn'

try:
    UserModel(
        name='samuel',
        username='scolvin',
        password1='zxcvbn',
        password2='zxcvbn2',
    )
except ValidationError as e:
    print(e)
    """
    2 validation errors for UserModel
    name
      must contain a space (type=value_error)
    password2
      passwords do not match (type=value_error)
    """
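A few things to note about validators:
- Validators are "class methods", so the first argument they receive is the UserModel class, not an instance of UserModel
- The second argument is always the field value to validate; it can be named as you please
- A single validator can be applied to multiple fields by passing it multiple field names, and it can be called on all fields by passing the special value '*'
- The keyword argument pre causes the validator to be called before other validation
- Passing each_item=True causes the validator to be applied to the individual values (e.g. of a List, Dict, Set, etc.) rather than to the whole object. Note that when a validator in a subclass refers to a List field defined on a parent class, each_item=True causes the validator not to run; the list has to be iterated over inside the validator instead: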
from typing import List
from pydantic import BaseModel, ValidationError, validator

class ParentModel(BaseModel):
    names: List[str]

class ChildModel(ParentModel):
    @validator('names', each_item=True)
    def check_names_not_empty(cls, v):
        assert v != '', 'Empty strings are not allowed.'
        return v

# This will NOT raise a ValidationError because the validator was not called
try:
    child = ChildModel(names=['Alice', 'Bob', 'Eve', ''])
except ValidationError as e:
    print(e)
else:
    print('No ValidationError caught.')
    #> No ValidationError caught.

class ChildModel2(ParentModel):
    @validator('names')
    def check_names_not_empty(cls, v):
        for name in v:
            assert name != '', 'Empty strings are not allowed.'
        return v

try:
    child = ChildModel2(names=['Alice', 'Bob', 'Eve', ''])
except ValidationError as e:
    print(e)
    """
    1 validation error for ChildModel2
    names
      Empty strings are not allowed. (type=assertion_error)
    """
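- The keyword argument always causes the validator to be called every time. For performance reasons, validators are by default not called for a field when no value is supplied. In some situations, however, always calling the validator is useful or required, for example to set a dynamic default value.
- allow_reuse makes it possible to use the same validator on multiple fields or models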
from pydantic import BaseModel, validator

def normalize(name: str) -> str:
    return ' '.join((word.capitalize()) for word in name.split(' '))

class Producer(BaseModel):
    name: str

    # validators
    _normalize_name = validator('name', allow_reuse=True)(normalize)

class Consumer(BaseModel):
    name: str

    # validators
    _normalize_name = validator('name', allow_reuse=True)(normalize)
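The pre and always keyword arguments described in the notes above are not covered by the examples so far. The sketch below shows one possible use of each. The Event model and its fields are hypothetical, and the decorator is the pydantic 1.x validator used throughout this article.

```python
from datetime import datetime
from typing import List, Optional

from pydantic import BaseModel, validator


class Event(BaseModel):  # hypothetical example model
    tags: List[str] = []
    created: Optional[datetime] = None

    @validator('tags', pre=True)
    def split_tags(cls, v):
        # pre=True: runs before the List[str] validation,
        # so a comma-separated string can be accepted as input.
        if isinstance(v, str):
            return [part.strip() for part in v.split(',')]
        return v

    @validator('created', pre=True, always=True)
    def default_created(cls, v):
        # always=True: called even when no value was supplied,
        # which allows a dynamic default to be filled in.
        return v or datetime.now()


event = Event(tags='a, b, c')
print(event.tags)
print(isinstance(event.created, datetime))
```

Without always=True the second validator would be skipped for the call above, and created would stay None.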
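6. Configuration
If you create a model that inherits from BaseSettings, the model initializer will try to determine the value of any field not passed as a keyword argument by reading it from the environment. (Default values are still used if no matching environment variable is set.)
This makes it easy to:
- Create a clearly defined, type-hinted application configuration class
- Automatically read modifications to the configuration from environment variables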
- Manually override specific settings in initializers where needed (e.g. in unit tests)
from typing import Set
from pydantic import (
    BaseModel,
    BaseSettings,
    PyObject,
    RedisDsn,
    PostgresDsn,
    Field,
)

class SubModel(BaseModel):
    foo = 'bar'
    apple = 1

class Settings(BaseSettings):
    auth_key: str
    api_key: str = Field(..., env='my_api_key')

    redis_dsn: RedisDsn = 'redis://user:pass@localhost:6379/1'
    pg_dsn: PostgresDsn = 'postgres://user:pass@localhost:5432/foobar'

    special_function: PyObject = 'math.cos'

    # to override domains:
    # export my_prefix_domains='["foo.com", "bar.com"]'
    domains: Set[str] = set()

    # to override more_settings:
    # export my_prefix_more_settings='{"foo": "x", "apple": 1}'
    more_settings: SubModel = SubModel()

    class Config:
        env_prefix = 'my_prefix_'  # defaults to no prefix, i.e. ""
        fields = {
            'auth_key': {
                'env': 'my_auth_key',
            },
            'redis_dsn': {
                'env': ['service_redis_dsn', 'redis_url']
            }
        }

print(Settings().dict())
"""
{
    'auth_key': 'xxx',
    'api_key': 'xxx',
    'redis_dsn': RedisDsn('redis://user:pass@localhost:6379/1',
scheme='redis', user='user', password='pass', host='localhost',
host_type='int_domain', port='6379', path='/1'),
    'pg_dsn': PostgresDsn('postgres://user:pass@localhost:5432/foobar',
scheme='postgres', user='user', password='pass', host='localhost',
host_type='int_domain', port='5432', path='/foobar'),
    'special_function': <built-in function cos>,
    'domains': set(),
    'more_settings': {'foo': 'bar', 'apple': 1},
}
"""
Variables can also be set in a dotenv file, which pydantic can load in two ways:
class Settings(BaseSettings):
    ...

    class Config:
        env_file = '.env'
        env_file_encoding = 'utf-8'
or
settings = Settings(_env_file='prod.env', _env_file_encoding='utf-8')
Even when a dotenv file is used, pydantic still reads environment variables, and they always take precedence over values loaded from the dotenv file.
pydantic also supports loading sensitive values from secret files; the secrets directory can likewise be specified in two ways:
class Settings(BaseSettings):
    ...
    database_password: str

    class Config:
        secrets_dir = '/var/run'
or:
settings = Settings(_secrets_dir='/var/run')
Even when a secrets directory is used, pydantic still reads variables from the dotenv file and the environment, and both always take precedence over values loaded from the secrets directory.
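The lookup order for plain environment variables can be checked with a small experiment. This is only a sketch under stated assumptions: AppSettings, its fields and the demo_app_ prefix are hypothetical, and the import fallback is there solely so the snippet also runs where pydantic 2 is installed, which bundles the 1.x API described here under pydantic.v1.

```python
import os

try:
    # pydantic 2.x ships the 1.x API under pydantic.v1
    from pydantic.v1 import BaseSettings
except ImportError:
    # pydantic 1.x, the version this article describes
    from pydantic import BaseSettings


class AppSettings(BaseSettings):  # hypothetical settings class
    debug: bool = False
    workers: int = 1

    class Config:
        env_prefix = 'demo_app_'  # hypothetical prefix


# Simulate a variable exported by the shell; names are matched case-insensitively.
os.environ['DEMO_APP_WORKERS'] = '4'

from_env = AppSettings()             # workers comes from the environment
overridden = AppSettings(workers=8)  # keyword arguments win over the environment

print(from_env.workers, from_env.debug)
print(overridden.workers)
```

The field without a matching variable (debug) keeps its default, and passing a keyword argument to the initializer is the manual override mentioned at the start of this section.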
7. Use with mypy
Pydantic ships with a mypy plugin that adds a number of important pydantic-specific features to mypy to improve its ability to type-check code.
For example, consider the following script:
from datetime import datetime
from typing import List, Optional
from pydantic import BaseModel, NoneStr

class Model(BaseModel):
    age: int
    first_name = 'John'
    last_name: NoneStr = None
    signup_ts: Optional[datetime] = None
    list_of_ints: List[int]

m = Model(age=42, list_of_ints=[1, '2', b'3'])
print(m.middle_name)  # not a model field!
Model()  # will raise a validation error for age and list_of_ints
Without any special configuration, mypy catches one of these errors:
13: error: "Model" has no attribute "middle_name"
With the plugin enabled, it catches both:
13: error: "Model" has no attribute "middle_name"
16: error: Missing named argument "age" for "Model"
16: error: Missing named argument "list_of_ints" for "Model"
To enable the plugin, just add pydantic.mypy to the list of plugins in the mypy configuration file:
[mypy]
plugins = pydantic.mypy
To change the value of a plugin setting, create a section called [pydantic-mypy] in the mypy configuration file and add a key-value pair for each setting you want to override:
[mypy]
plugins = pydantic.mypy
follow_imports = silent
warn_redundant_casts = True
warn_unused_ignores = True
disallow_any_generics = True
check_untyped_defs = True
no_implicit_reexport = True
# for strict mypy: (this is the tricky one :-))
disallow_untyped_defs = True
[pydantic-mypy]
init_forbid_extra = True
init_typed = True
warn_required_dynamic_aliases = True
warn_untyped_fields = True