Foreword
- First, use OpenCV to check whether the picture is blurred; if it is, it needs to be retaken.
- Then use face_recognition to detect whether the picture contains a human face; only if it does, proceed to the next step.
- Then use face_recognition to compute the 128-dimensional face encoding of each face in the picture.
- Finally, use face_recognition to compare the face encodings computed this time with those computed earlier, and take the result with the highest similarity.
- The face encodings computed earlier can be stored in a vector database such as Milvus or Proxima, so that they can be queried directly from the database when comparing. I use Milvus; its official website documents the CRUD operations. The SDK I use is PyMilvus. When installing PyMilvus, be sure to choose the version that corresponds to your Milvus version, as stated in the documentation and on GitHub. Milvus also has a GUI management tool, Attu, which is introduced in the Milvus documentation; it lets you view the data in the database from a web page.
- OpenCV alone can also be used for face recognition. I haven't tried it, but the reference links cover it.
Before writing this article I had no experience in this area. I looked up the basics, all of which are in the reference links; thanks to the authors for sharing.
Install
Linux
Only the installation of the dlib library differs from Windows; the other steps are the same. I only tested on CentOS 7.9.2009, with miniconda on the server.
Run conda install -c conda-forge dlib to install the dlib library.
Record the process of solving the failure of installing the dlib library, and know conda-forge
Install dlib_Batman under linux system. Blog-CSDN blog_linux installation dlib
Windows
Compared with Linux, installation on Windows is a bit more complicated:
- pip install cmake
- pip install boost
- Download the whl file of the dlib library from PyPI, GitHub or elsewhere and install it with pip install. The wheel must match your Python version; for details on choosing the version, see my previous article.
- pip install face_recognition
- pip install opencv-python
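After the installation steps, it can be handy to confirm that the packages are visible to the interpreter before running the longer scripts. The following is only a minimal sketch: the helper name check_packages is hypothetical, and it merely looks the modules up without importing them. Note that opencv-python is imported under the name cv2.

```python
import importlib.util

# Import names (not pip names) of the packages installed above.
REQUIRED_MODULES = ["dlib", "face_recognition", "cv2"]


def check_packages(module_names):
    """Return {module_name: True/False} depending on whether it can be found."""
    status = {}
    for name in module_names:
        # find_spec returns None when a top-level module is not installed.
        status[name] = importlib.util.find_spec(name) is not None
    return status


if __name__ == "__main__":
    for name, ok in check_packages(REQUIRED_MODULES).items():
        print(f"{name}: {'installed' if ok else 'MISSING'}")
```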
Use
You need to prepare some pictures containing faces, preferably frontal, and put them in a folder; then prepare one more face picture and put it in the same directory as the folder. The person in this picture must be the same person as in one of the pictures in the folder.
For example, given two pictures of Jay Chou, put one in the folder and the other in the directory at the same level as the folder, then add some pictures of other people to the folder.
The picture outside the folder is then compared with every face picture in the folder to see whether it matches the same person. The matching principle is explained in the code comments and in the reference links at the end of the article.
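The matching principle can be illustrated without any face library: each face is a vector, the Euclidean distance between two vectors measures how different they are, and the closest gallery entry counts as a match only if its distance is below a threshold (0.6 is the typical value for face_recognition). The sketch below uses toy 3-dimensional vectors instead of real 128-dimensional encodings, and the function name best_match is hypothetical.

```python
import math


def best_match(known, probe, threshold=0.6):
    """Return (name, distance) of the closest known encoding, or None.

    known: dict mapping a label to an encoding (sequence of floats)
    probe: encoding of the face to identify
    """
    best = None
    for name, encoding in known.items():
        distance = math.dist(encoding, probe)  # Euclidean distance
        if best is None or distance < best[1]:
            best = (name, distance)
    # Only accept the closest entry if it is within the threshold.
    if best is not None and best[1] < threshold:
        return best
    return None


# Toy 3-dimensional vectors stand in for real 128-dimensional encodings.
gallery = {"person_a": [0.1, 0.2, 0.3], "person_b": [0.9, 0.8, 0.7]}
print(best_match(gallery, [0.12, 0.18, 0.33]))  # closest entry is person_a
print(best_match(gallery, [5.0, 5.0, 5.0]))     # nothing within 0.6, so None
```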
Local test
import face_recognition
import os
import time
from numpy import array  # Needed so that eval() can parse the saved 128-dimensional face encodings; without it eval() raises NameError: name 'array' is not defined
import cv2

t1 = time.time()
# The paths must not contain Chinese characters
source_img_file_path = r"C:\Users\PC-1\Desktop\ZhouJieLun1.png"
img_folder_path = r"C:\Users\PC-1\Desktop\face_img"

# Return the second element of a sequence (used as the sort key below)
def takeSecond(elem):
    return elem[1]
# Check whether the image is sharp; the argument is the absolute path of the image
def getImageVar(imgPath: str) -> bool:
    image = cv2.imread(imgPath)
    img2gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Variance of the Laplacian: a value above 100 is usually considered sharp; tune the threshold for your own images
    imageVar = cv2.Laplacian(img2gray, cv2.CV_64F).var()
    return imageVar > 100
# Check whether the image contains a face. Partial faces are accepted: a face cut off below the nose is roughly the limit of what is still detected; other missing parts were not tested
def check_face_img(img_path: str) -> bool:
    img = face_recognition.load_image_file(img_path)  # Load the image into a numpy array
    result = face_recognition.face_locations(img)  # List of face locations, one per face; empty when no face is found
    return len(result) > 0
# Compute the 128-dimensional face encoding of every image in the gallery folder; the result is passed as the face_encodings argument of face_recognition.face_distance()
def get_face_encodings(img_folder_path):
    img_face_encoding_list = []  # One 128-dimensional face encoding per accepted image
    img_path_list = []  # Absolute path of each accepted image, in the same order
    img_list = os.listdir(img_folder_path)
    for img in img_list:
        img_path = os.path.join(img_folder_path, img)
        if getImageVar(img_path) and check_face_img(img_path):
            img_path_list.append(img_path)
            img = face_recognition.load_image_file(img_path)  # Load the image
            # face_encodings() returns one 128-dimensional encoding per face in the image.
            # The list passed to face_distance() must consist of face_encodings(img)[0] items, not of the full lists returned by face_encodings(img).
            img_face_encoding = face_recognition.face_encodings(img)[0]
            img_face_encoding_list.append(img_face_encoding)
        elif not getImageVar(img_path):
            print(f'{img_path}: the image is too blurry, please retake it; it was skipped when building the gallery')
        elif not check_face_img(img_path):
            print(f'{img_path}: no complete face was detected, please retake it; it was skipped when building the gallery')
    return img_face_encoding_list, img_path_list

img_face_encoding_list, img_path_list = get_face_encodings(img_folder_path)
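# Read the previously saved 128-dimensional face encodings of the gallery images from a local file to save computation time
def get_face_encodings_from_file():
    with open(r"C:\Users\PC-1\Desktop\img_face_encoding_list.txt", 'r', encoding='utf-8') as f:
        # Requires "from numpy import array", otherwise eval() raises NameError: name 'array' is not defined.
        # eval() executes the file content, so only read files you wrote yourself.
        img_face_encoding_list = eval(f.read())
    img_path_list = []
    # Note: this assumes every image in the folder was accepted when the file was written; otherwise paths and encodings no longer line up
    img_list = os.listdir(img_folder_path)
    for img in img_list:
        img_path = os.path.join(img_folder_path, img)
        img_path_list.append(img_path)
    return img_face_encoding_list, img_path_list

# img_face_encoding_list, img_path_list = get_face_encodings_from_file()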
if getImageVar(source_img_file_path) and check_face_img(source_img_file_path):
    img = face_recognition.load_image_file(source_img_file_path)
    # face_locations = face_recognition.face_locations(img)  # List of face locations as (top, right, bottom, left) tuples
    source_img_face_encoding = face_recognition.face_encodings(img)[0]
    # Compare a list of face encodings with one known encoding and get the Euclidean distance for each.
    # The smaller the distance, the more similar the faces. The typical threshold is 0.6: below 0.6 counts as a match.
    # face_encodings is the list of encodings to compare; face_to_compare is the encoding to compare them against.
    result = face_recognition.face_distance(face_encodings=img_face_encoding_list, face_to_compare=source_img_face_encoding)
    temp_list = list(zip(img_path_list, result))
    temp_list.sort(key=takeSecond)  # Sort ascending by the second element of each pair (the distance)
    print(temp_list)
    if temp_list[0][-1] < 0.6:  # After sorting, the first entry is the most similar; a distance below 0.6 counts as a match
        print(temp_list[0])
    else:
        print('Match failed')  # It may be a different person, or this photo may not be clear enough
elif not getImageVar(source_img_file_path):
    print('The image is too blurry, please retake it')
elif not check_face_img(source_img_file_path):
    print('No complete face was detected, please retake it')
t2 = time.time()
print(t2 - t1)  # Computing the encodings of all images on every run: 4.7 s to compare against 6 images; reading the saved encodings from the local file: 0.6 s.
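To make the blur check less of a black box, here is a minimal pure-Python sketch of what the variance of the Laplacian measures. It applies the 4-neighbour Laplacian kernel to a small grayscale grid and returns the variance of the responses. This is an illustration only: the function name laplacian_variance is hypothetical, and it skips the border pixels, whereas OpenCV also processes them, so the numbers will not match cv2.Laplacian(...).var() exactly.

```python
def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over the interior pixels.

    gray: 2D list of numbers (rows of pixel intensities), at least 3x3.
    """
    height, width = len(gray), len(gray[0])
    responses = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            # Kernel [[0, 1, 0], [1, -4, 1], [0, 1, 0]]
            value = (gray[y - 1][x] + gray[y + 1][x]
                     + gray[y][x - 1] + gray[y][x + 1]
                     - 4 * gray[y][x])
            responses.append(value)
    mean = sum(responses) / len(responses)
    return sum((r - mean) ** 2 for r in responses) / len(responses)


# A uniform image has no edges; a checkerboard is made of nothing but edges.
flat = [[128] * 5 for _ in range(5)]
checker = [[255 * ((x + y) % 2) for x in range(5)] for y in range(5)]
print(laplacian_variance(flat))     # 0.0
print(laplacian_variance(checker))  # a very large value
```

A blurred picture behaves like the flat grid (few strong edges, low variance), a sharp one like the checkerboard, which is why a threshold such as 100 can separate them.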
Example content of img_face_encoding_list.txt
[array([-0.01729634, 0.10430054, 0.07625537, 0.03276102, -0.07866749,
-0.02818274, -0.06889073, -0.04842021, 0.13796961, -0.05026057,
0.23302871, 0.00723806, -0.2370982 , -0.01487464, -0.06505933,
0.10084826, -0.1636425 , -0.07501083, -0.12756079, -0.10016631,
0.04007092, 0.0581928 , 0.03240498, -0.01134465, -0.13176998,
-0.2787565 , -0.07050405, -0.09639949, 0.09951212, -0.14790948,
0.03040591, 0.0431371 , -0.15323421, 0.01086914, 0.0220076 ,
0.01904444, -0.00923841, -0.06280316, 0.18326622, 0.03018811,
-0.14447245, -0.01474072, -0.01069712, 0.24571334, 0.24511024,
0.02628675, -0.04335158, -0.0862942 , 0.06775976, -0.25436637,
0.011755 , 0.20667061, 0.05011301, 0.13582194, 0.05677999,
-0.16563055, 0.02566983, 0.11052104, -0.15707225, -0.00112594,
0.02829274, -0.05549442, -0.03616317, -0.05381152, 0.13676476,
0.04478186, -0.09165385, -0.12494379, 0.12897444, -0.18318115,
-0.00553679, 0.09620941, -0.12292095, -0.19553109, -0.20274746,
0.08238635, 0.38079879, 0.18709588, -0.14765534, 0.01780803,
-0.05188759, -0.02483106, 0.03612664, -0.0149317 , -0.10475352,
-0.05023222, -0.05713973, 0.07221895, 0.14290087, -0.05685572,
0.01866397, 0.22389664, -0.04868836, 0.01686323, 0.02283146,
-0.01206459, -0.05818488, 0.05559994, -0.09816868, 0.00713632,
0.06758018, -0.11922558, 0.04631898, 0.06196419, -0.13793124,
0.0963118 , 0.01178237, -0.03925588, 0.04579747, 0.01644197,
-0.13718042, -0.04710923, 0.24413618, -0.24439959, 0.27725762,
0.2127548 , 0.04380967, 0.12280341, 0.07158501, 0.13633233,
-0.03227835, -0.03378378, -0.09766266, -0.03692475, 0.01623037,
-0.04219364, -0.03954222, -0.04019544]), array([-3.03432867e-02, 5.43443598e-02, 6.78744391e-02, 2.58869212e-02,
-1.23072207e-01, -7.02521503e-02, -4.98226285e-02, -1.72496010e-02,
7.67212510e-02, -4.26685344e-03, 1.95951223e-01, -1.67683400e-02,
-2.15903431e-01, -4.35754284e-03, -3.12260389e-02, 7.05019981e-02,
-1.18235752e-01, -8.07965323e-02, -1.49761930e-01, -1.44610867e-01,
3.96064669e-02, 4.81094792e-02, 8.37654807e-03, -3.67331831e-03,
-1.19957335e-01, -2.90346920e-01, -8.59770030e-02, -1.07959040e-01,
1.21038556e-01, -1.51315838e-01, 3.27938870e-02, 4.70837858e-03,
-1.73844323e-01, -3.17394398e-02, 2.35435199e-02, 1.66472271e-02,
-1.98804624e-02, -9.39380080e-02, 1.69165611e-01, 6.68221340e-03,
-1.77107140e-01, 1.08059309e-02, 2.33938862e-02, 2.76533812e-01,
2.21089691e-01, 2.53173988e-02, -1.82928685e-02, -5.52510098e-02,
6.09008260e-02, -2.64982283e-01, 3.63944210e-02, 1.93908125e-01,
1.00877725e-01, 1.14827916e-01, 7.65965059e-02, -1.46504432e-01,
3.48312259e-02, 1.25334620e-01, -1.56430572e-01, 2.46226899e-02,
2.86777914e-02, -3.82297374e-02, -1.68778505e-02, -1.19121701e-01,
1.69600606e-01, 4.99524400e-02, -8.64719599e-02, -1.21018678e-01,
1.09727934e-01, -1.94434166e-01, -3.34550738e-02, 8.30303952e-02,
-1.11592978e-01, -1.87120140e-01, -2.38885000e-01, 1.09276593e-01,
3.92007828e-01, 2.01715931e-01, -1.94086283e-01, 2.30474807e-02,
-8.15587863e-02, -1.26086362e-02, 2.73021199e-02, -1.08986711e-02,
-1.09373406e-01, -5.31885028e-03, -8.07625800e-02, 6.63535371e-02,
1.74903929e-01, -9.10659656e-02, 3.04515511e-02, 1.86322704e-01,
-2.50804201e-02, 9.74564068e-03, 2.13033091e-02, -3.06630391e-04,
-1.11615628e-01, 3.51248235e-02, -9.08284336e-02, 1.10836141e-03,
1.00318097e-01, -1.36240765e-01, 5.00137471e-02, 8.65165964e-02,
-1.79704651e-01, 1.37958780e-01, 5.58579108e-04, -7.75038823e-02,
1.49468854e-02, 1.46387927e-02, -8.91202837e-02, -3.77928428e-02,
2.32269630e-01, -2.34496534e-01, 2.73653299e-01, 2.40941241e-01,
3.94841023e-02, 1.35746434e-01, 5.84627874e-02, 1.05018303e-01,
-5.28650098e-02, -1.84285827e-02, -1.11144938e-01, -4.56666909e-02,
1.28744273e-02, -4.79294769e-02, -4.71757315e-02, -6.85334997e-03]), array([-1.19727597e-01, 2.37059481e-02, 5.41203022e-02, -5.31067979e-03,
-1.02095716e-01, -6.70940429e-02, 5.28478026e-02, -9.28077772e-02,
1.14727125e-01, -5.81904836e-02, 1.65611401e-01, -9.44045261e-02,
-2.55105197e-01, -4.14437950e-02, -2.89703198e-02, 1.06697828e-01,
-9.61818695e-02, -1.10643283e-01, -1.84341758e-01, -9.77335051e-02,
8.45019799e-03, 6.26069605e-02, -1.26636475e-02, 5.73121430e-03,
-1.16348065e-01, -2.58341700e-01, -7.38507137e-02, -1.35549113e-01,
6.04273453e-02, -6.86645284e-02, 3.62175442e-02, 1.11231357e-01,
-1.17278963e-01, -5.90739734e-02, 2.98288725e-02, 3.80353555e-02,
8.69030412e-03, -8.14783499e-02, 2.00276792e-01, -3.30614746e-02,
-1.73424274e-01, -8.51858705e-02, 8.01797211e-02, 1.89619228e-01,
1.80681810e-01, -3.04257311e-02, 8.26369599e-03, -1.84000656e-02,
3.17999758e-02, -2.45052502e-01, -3.82037610e-02, 1.40862808e-01,
1.22625537e-01, 9.89214256e-02, 5.23140244e-02, -1.46283448e-01,
-8.13427381e-03, 1.29438370e-01, -1.73568085e-01, 7.16047511e-02,
1.99302640e-02, -1.64908677e-01, -6.60278201e-02, -2.77188811e-02,
1.60548747e-01, 8.64261240e-02, -8.74619484e-02, -1.47788346e-01,
1.73828602e-01, -1.89269871e-01, -4.95751537e-02, 8.37096721e-02,
-1.09814622e-01, -1.14491023e-01, -2.01664209e-01, 2.60536000e-03,
3.81230652e-01, 1.08411252e-01, -1.66443884e-01, -6.42048474e-03,
-1.52495116e-01, -6.41984940e-02, -2.20064986e-02, 8.72708671e-03,
-3.22415009e-02, -1.00544520e-01, -9.76137221e-02, -1.15213916e-04,
2.00150207e-01, -7.71829262e-02, 1.90558862e-02, 1.59315109e-01,
-2.54158806e-02, -5.27779348e-02, -2.89773215e-02, -4.60110046e-03,
-1.19977474e-01, -8.91544484e-03, -2.61534844e-03, -5.38323261e-02,
9.46650878e-02, -1.22638777e-01, 4.54899520e-02, 9.24251229e-02,
-1.47666842e-01, 1.31898567e-01, -3.10513265e-02, -2.91747078e-02,
3.84223610e-02, -3.18786129e-02, -2.67100539e-02, -5.41412318e-03,
2.24448472e-01, -1.44214451e-01, 2.11022705e-01, 1.61338180e-01,
-4.03464250e-02, 1.21684045e-01, 2.53224019e-02, 1.12374000e-01,
-1.62403062e-02, 1.75351724e-02, -1.21122740e-01, -1.05977543e-01,
1.91601515e-02, -2.42309403e-02, -3.90784908e-03, 9.26381629e-03])]
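The file above is the printed form of a Python list of arrays, which is why it has to be read back with eval(). One possible alternative, sketched below under my own assumptions, is to store the encodings as JSON keyed by image path: this avoids eval() and keeps each path attached to its encoding. The function names and the file name are hypothetical, and toy 3-dimensional vectors stand in for real encodings.

```python
import json
import os
import tempfile


def save_encodings(encodings_by_path, file_path):
    """Write {image_path: encoding} to a JSON file.

    Each encoding is converted to a plain list of floats, so a NumPy array
    or an ordinary list both work as input.
    """
    serializable = {path: [float(v) for v in enc]
                    for path, enc in encodings_by_path.items()}
    with open(file_path, "w", encoding="utf-8") as f:
        json.dump(serializable, f)


def load_encodings(file_path):
    """Read the file back and return (encoding_list, path_list) in matching order."""
    with open(file_path, "r", encoding="utf-8") as f:
        data = json.load(f)
    paths = list(data.keys())
    return [data[p] for p in paths], paths


# Round trip with toy 3-dimensional vectors in a temporary directory.
demo_file = os.path.join(tempfile.mkdtemp(), "img_face_encodings.json")
save_encodings({"a.png": [0.1, 0.2, 0.3], "b.png": [0.4, 0.5, 0.6]}, demo_file)
encodings, paths = load_encodings(demo_file)
print(paths, encodings)
```

The loaded encodings are plain lists; converting them with numpy.array(encodings) before passing them to face_recognition.face_distance() keeps the input in the same form as in the script above.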
Save the face encodings into Milvus
Sample code from the docs
First, here is the basic code from the documentation. I copied it step by step, following the documentation; I only tested insert and query here. Each page of the documentation ends with a pointer to the next step (the Chinese in the screenshot is the browser's translation). When using the code, change host to your own server's IP.
The links to the Chinese documentation in the code comments now redirect to the English documentation: I reported a problem with the Chinese documentation, after which it was either deleted or made inaccessible.
from pymilvus import connections, CollectionSchema, FieldSchema, DataType, Collection
import random

# The code was copied from the 2.0.0 Chinese documentation
# Chinese docs, reached via 'Install Milvus Standalone' in the gitee README (https://gitee.com/milvus-io/milvus#%E5%90%AF%E5%8A%A8-milvus): https://milvus.io/cn/docs/v2.0.0/install_standalone-docker.md
# English docs, reached via 'Standalone Quick Start Guide' in the GitHub README (https://github.com/milvus-io/milvus#install-milvus): https://milvus.io/docs/v2.0.x/install_standalone-docker.md
# The Chinese docs may contain letter-case errors (which cause exceptions), so prefer the English docs
connections.connect(alias="default", host='YOUR HOST', port='19530')  # Open a connection to Milvus; alias is the name given to this connection
# Prepare the schema: field schemas, collection schema and collection name.
book_id = FieldSchema(
    name="book_id",
    dtype=DataType.INT64,
    is_primary=True,
)
word_count = FieldSchema(
    name="word_count",
    dtype=DataType.INT64,
)
book_intro = FieldSchema(
    name="book_intro",
    # A collection must have one field of type DataType.FLOAT_VECTOR or DataType.BINARY_VECTOR. The docs say: "The collection to create must contain a primary key field and a vector field. INT64 is the only supported data type for the primary key field in current release of Milvus." From https://milvus.io/cn/docs/v2.0.0/create_collection.md#Prepare-Schema
    dtype=DataType.FLOAT_VECTOR,
    dim=2  # Vector dimension: the number of elements in each vector
)
schema = CollectionSchema(
    fields=[book_id, word_count, book_intro],
    description="Test book search"
)
collection_name = "book"
# Create the collection with the schema
collection = Collection(
    name=collection_name,
    schema=schema,
    using='default',  # Connection alias: which server to create the collection on
    shards_num=2,
    consistency_level="Strong"
)
# Prepare the data to insert. Its data types must match the collection schema, otherwise Milvus raises an exception.
data = [
    [i for i in range(2000)],
    [i for i in range(10000, 12000)],
    [[random.random() for _ in range(2)] for _ in range(2000)],
]
collection = Collection("book")  # Get an existing collection.
mr = collection.insert(data)  # Insert the data
# Build an index on the vectors. A vector index is an organizational unit of metadata used to accelerate vector similarity search. Without an index on the vectors, Milvus performs a brute-force search by default.
index_params = {  # Index parameters
    "metric_type": "L2",
    "index_type": "IVF_FLAT",
    "params": {"nlist": 1024}
}
# Build the index by specifying the vector field name and the index parameters
collection = Collection("book")  # Get an existing collection.
collection.create_index(
    field_name="book_intro",
    index_params=index_params
)
# All search and query operations in Milvus run in memory, so load the collection into memory before a vector similarity search.
collection = Collection("book")  # Get an existing collection.
collection.load()
# Search parameters for your scenario. Here the search uses Euclidean (L2) distance and retrieves vectors from the ten closest clusters built by the IVF_FLAT index.
search_params = {"metric_type": "L2", "params": {"nprobe": 10}}
results = collection.search(  # Run the vector search
    data=[[0.1, 0.2]],
    anns_field="book_intro",
    param=search_params,
    limit=10,  # Number of most similar vectors to return
    expr=None,
    consistency_level="Strong"
    # The Chinese docs reached from the gitee README (https://milvus.io/cn/docs/v2.0.0/search.md#%E8%BF%9B%E8%A1%8C%E5%90%91%E9%87%8F%E6%90%9C%E7%B4%A2) wrongly write this value as strong, which produces the two error lines below.
    # The source code says this parameter is described in https://github.com/milvus-io/milvus/blob/master/docs/developer_guides/how-guarantee-ts-works.md, which states 'if no consistency level was specified, search will use the consistency level when you create the collection.'
    # Comparing with the Collection() creation code (https://milvus.io/cn/docs/v2.0.0/create_collection.md#Create-a-collection-with-the-schema) shows that it is a letter-case error. The English docs reached from the GitHub README are correct.
    # raise InvalidConsistencyLevel(0, f"invalid consistency level: {consistency_level}") from e
    # pymilvus.client.exceptions.InvalidConsistencyLevel: <InvalidConsistencyLevel: (code=0, message=invalid consistency level: strong)>
)
# Show the primary keys of the most similar vectors and their distances
print(results[0].ids)
print(results[0].distances)
collection.release()  # Release the loaded collection when the search is finished to reduce memory consumption
connections.disconnect("default")  # Close the Milvus connection
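The data passed to collection.insert() in the sample is column-oriented: one list per field, in schema order. When records arrive one at a time (as rows), they have to be transposed first. The helper below is a small sketch of that step; rows_to_columns is a hypothetical name, not part of PyMilvus.

```python
def rows_to_columns(rows, field_count):
    """Turn [(f1, f2, ...), ...] into [[f1, ...], [f2, ...], ...]."""
    columns = [[] for _ in range(field_count)]
    for row in rows:
        if len(row) != field_count:
            raise ValueError(f"expected {field_count} fields, got {len(row)}")
        for column, value in zip(columns, row):
            column.append(value)
    return columns


# Rows in the order of the "book" schema: book_id, word_count, book_intro
rows = [
    (0, 10000, [0.1, 0.2]),
    (1, 10001, [0.3, 0.4]),
]
data = rows_to_columns(rows, field_count=3)
print(data)  # [[0, 1], [10000, 10001], [[0.1, 0.2], [0.3, 0.4]]]
```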
My actual code
When using the code, change host to your own server's IP.
import face_recognition
import os
import time
import random
from pymilvus import connections, CollectionSchema, FieldSchema, DataType, Collection
import cv2

# The path must not contain Chinese characters
img_folder_path = r"C:\Users\PC-1\Desktop\face_img"
# Check whether the image is sharp; the argument is the absolute path of the image
def getImageVar(imgPath: str) -> bool:
    image = cv2.imread(imgPath)
    img2gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Variance of the Laplacian: a value above 100 is usually considered sharp; tune the threshold for your own images
    imageVar = cv2.Laplacian(img2gray, cv2.CV_64F).var()
    return imageVar > 100

# Check whether the image contains a face. Partial faces are accepted: a face cut off below the nose is roughly the limit of what is still detected; other missing parts were not tested
def check_face_img(img_path: str) -> bool:
    img = face_recognition.load_image_file(img_path)  # Load the image into a numpy array
    result = face_recognition.face_locations(img)  # List of face locations, one per face; empty when no face is found
    return len(result) > 0
connections.connect(alias="default", host='YOUR HOST', port='19530')  # Open a connection to Milvus; alias is the name given to this connection
# Prepare the schema: field schemas, collection schema and collection name.
face_id = FieldSchema(
    name="face_id",
    dtype=DataType.INT64,
    is_primary=True,
    auto_id=False  # Disable auto-generated ids; the real person id is supplied by the front end
)
face_feature_vector = FieldSchema(
    name="face_feature_vector",
    # The 128-dimensional face encoding consists of floats. A collection must have one field of type DataType.FLOAT_VECTOR or DataType.BINARY_VECTOR. The docs say: "The collection to create must contain a primary key field and a vector field. INT64 is the only supported data type for the primary key field in current release of Milvus." From https://milvus.io/cn/docs/v2.0.0/create_collection.md#Prepare-Schema
    dtype=DataType.FLOAT_VECTOR,
    dim=128  # Vector dimension. The first element of the list returned by face_recognition.face_encodings() is a 128-dimensional face encoding.
)
schema = CollectionSchema(
    fields=[face_id, face_feature_vector],
    description="face feature vector test test test"
)
collection_name = "face_feature_vector_test_test"
# Create the collection with the schema
collection = Collection(
    name=collection_name,
    schema=schema,
    using='default',  # Connection alias: which server to create the collection on
    shards_num=2,
    consistency_level="Strong"
)
# Prepare the data to insert. Its data types must match the collection schema, otherwise Milvus raises an exception.
img_list = os.listdir(img_folder_path)
for img in img_list:
    img_path = os.path.join(img_folder_path, img)
    if getImageVar(img_path) and check_face_img(img_path):
        img = face_recognition.load_image_file(img_path)  # Load the image
        # face_encodings() returns one 128-dimensional encoding per face in the image; take the first one
        img_face_encoding = face_recognition.face_encodings(img)[0]
        temp_id = random.randint(0, 100000)  # Value of the face_id field, which must uniquely identify a person. Random for now; in practice it should be supplied by the front end
        collection = Collection("face_feature_vector_test_test")  # Get an existing collection.
        mr = collection.insert([[temp_id], [img_face_encoding]])  # Insert the data. From https://github.com/milvus-io/milvus/discussions/10713
        # print(mr)  # Result of the insert operation
    elif not getImageVar(img_path):
        print(f'{img_path}: the image is too blurry, please retake it; it was skipped when building the gallery')
    elif not check_face_img(img_path):
        print(f'{img_path}: no complete face was detected, please retake it; it was skipped when building the gallery')
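# Build an index on the vectors. A vector index is an organizational unit of metadata used to accelerate vector similarity search. Without an index on the vectors, Milvus performs a brute-force search by default.
index_params = {  # Index parameters
    "metric_type": "L2",
    "index_type": "IVF_FLAT",
    "params": {"nlist": 1024}
}
# Build the index by specifying the vector field name and the index parameters
collection = Collection("face_feature_vector_test_test")  # Get an existing collection.
collection.create_index(
    field_name="face_feature_vector",
    index_params=index_params
)
# All search and query operations in Milvus run in memory, so load the collection into memory before a vector similarity search.
collection = Collection("face_feature_vector_test_test")  # Get an existing collection.
collection.load()
connections.disconnect("default")  # Close the Milvus connection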
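The temporary ids in the script come from random.randint(0, 100000). Two pictures receiving the same face_id would make the people behind them indistinguishable, so it is worth knowing how likely that is. The sketch below does the birthday-problem arithmetic exactly; the function name is hypothetical, and the calculation assumes ids are drawn uniformly and independently.

```python
def collision_probability(n_ids, id_space):
    """Probability that at least two of n_ids uniform random ids coincide."""
    if n_ids > id_space:
        return 1.0
    p_all_distinct = 1.0
    for k in range(n_ids):
        # The (k+1)-th id must avoid the k ids already drawn.
        p_all_distinct *= (id_space - k) / id_space
    return 1.0 - p_all_distinct


ID_SPACE = 100001  # random.randint(0, 100000) has 100001 possible values
for n in (10, 100, 400, 1000):
    print(n, round(collision_probability(n, ID_SPACE), 4))
```

With a few hundred pictures a collision is already more likely than not, which supports the note in the code that real ids should come from the front end rather than from a random number.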
Query from Milvus
A distance threshold is applied to the search results, and only results that meet it are treated as matches. The thresholds 0.2 and 0.6 used here both come from the articles in the reference links. When using the code, change host to your own server's IP.
import face_recognition
import os
import time
import random
from pymilvus import connections, CollectionSchema, FieldSchema, DataType, Collection
import cv2
t1 = time.time()
# The path must not contain Chinese characters
# source_img_file_path = r"C:\Users\PC-1\Desktop\ZhouJieLun1.png"
source_img_file_path = r"C:\Users\PC-1\Desktop\LinJunJie2.png"

# Check whether the image is sharp; the argument is the absolute path of the image
def getImageVar(imgPath: str) -> bool:
    image = cv2.imread(imgPath)
    img2gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Variance of the Laplacian: a value above 100 is usually considered sharp; tune the threshold for your own images
    imageVar = cv2.Laplacian(img2gray, cv2.CV_64F).var()
    return imageVar > 100

# Check whether the image contains a face. Partial faces are accepted: a face cut off below the nose is roughly the limit of what is still detected; other missing parts were not tested
def check_face_img(img_path: str) -> bool:
    img = face_recognition.load_image_file(img_path)  # Load the image into a numpy array
    result = face_recognition.face_locations(img)  # List of face locations, one per face; empty when no face is found
    return len(result) > 0
if getImageVar(source_img_file_path) and check_face_img(source_img_file_path):
    connections.connect(alias="default", host='YOUR HOST', port='19530')  # Open a connection to Milvus; alias is the name given to this connection
    # All search and query operations in Milvus run in memory, so load the collection into memory before a vector similarity search.
    collection = Collection("face_feature_vector_test_test")  # Get an existing collection.
    collection.load()
    # Search parameters for your scenario. Here the search uses Euclidean (L2) distance and retrieves vectors from the ten closest clusters built by the IVF_FLAT index.
    search_params = {"metric_type": "L2", "params": {"nprobe": 10}}
    img = face_recognition.load_image_file(source_img_file_path)  # Load the image
    img_face_encoding = face_recognition.face_encodings(img)[0]
    results = collection.search(  # Run the vector search
        data=[img_face_encoding],
        anns_field="face_feature_vector",
        param=search_params,
        limit=2,  # Number of most similar vectors to return, i.e. how many of the closest face encodings to output
        expr=None,
        consistency_level="Strong"
        # The Chinese docs reached from the gitee README (https://milvus.io/cn/docs/v2.0.0/search.md#%E8%BF%9B%E8%A1%8C%E5%90%91%E9%87%8F%E6%90%9C%E7%B4%A2) wrongly write this value as strong, which produces the two error lines below.
        # The source code says this parameter is described in https://github.com/milvus-io/milvus/blob/master/docs/developer_guides/how-guarantee-ts-works.md, which states 'if no consistency level was specified, search will use the consistency level when you create the collection.'
        # Comparing with the Collection() creation code (https://milvus.io/cn/docs/v2.0.0/create_collection.md#Create-a-collection-with-the-schema) shows that it is a letter-case error. The English docs reached from the GitHub README are correct.
        # raise InvalidConsistencyLevel(0, f"invalid consistency level: {consistency_level}") from e
        # pymilvus.client.exceptions.InvalidConsistencyLevel: <InvalidConsistencyLevel: (code=0, message=invalid consistency level: strong)>
    )
    # Get the primary keys of the most similar vectors and their distances
    id_list = results[0].ids
    distance_list = results[0].distances
    # print(id_list)
    # print(distance_list)
    if (len(list(id_list)) > 0) and (list(distance_list)[0] <= 0.2):
        print(list(id_list)[0])
        print(list(distance_list)[0])
    else:
        print('else branch: every distance is greater than 0.2, so this is not necessarily a real match. The 0.2 threshold is provisional and should be determined by testing')
        print(id_list)
        print(distance_list)
    # collection.release()  # Release the loaded collection when the search is finished to reduce memory consumption
    connections.disconnect("default")  # Close the Milvus connection
elif not getImageVar(source_img_file_path):
    print('The image is too blurry, please retake it')
elif not check_face_img(source_img_file_path):
    print('No complete face was detected, please retake it')
t2 = time.time()
print(t2 - t1)
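The thresholding step at the end of the query script can be separated from Milvus and tested on its own. The sketch below pairs ids with distances and keeps only the hits within a threshold; the function names and the sample numbers are made up for illustration. One assumption to verify against the Milvus documentation for your version: my understanding is that with the L2 metric Milvus reports the squared Euclidean distance (the value before the square root), in which case the 0.6 threshold used with face_recognition.face_distance() would correspond to 0.36 here.

```python
def filter_hits(ids, distances, max_distance):
    """Return [(id, distance), ...] for hits within max_distance, closest first."""
    hits = [(i, d) for i, d in zip(ids, distances) if d <= max_distance]
    return sorted(hits, key=lambda hit: hit[1])


def euclidean_to_squared(threshold):
    """Convert a Euclidean-distance threshold to a squared-distance threshold."""
    return threshold ** 2


# Made-up ids and distances standing in for results[0].ids / results[0].distances
ids = [835193322, 12345, 777]
distances = [0.08, 0.31, 0.19]
print(filter_hits(ids, distances, max_distance=0.2))
print(euclidean_to_squared(0.6))
```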
Delete face information from Milvus
Delete an entity according to the value of a field. At present (Milvus 2.0.1, PyMilvus 2.0.2) entities can only be deleted by the value of the primary key; other fields cannot be used. When using the code, change host to your own server's IP.
from pymilvus import connections, CollectionSchema, FieldSchema, DataType, Collection
connections.connect(alias="default", host='YOUR HOST', port='19530')  # Open a connection to Milvus; alias is the name given to this connection
expr = "face_id in [835193322]"  # Deletion only supports the in operator; other operators such as == cannot be used, as the docs explain
collection = Collection("face_feature_vector_test_test")  # Get an existing collection.
result = collection.delete(expr)
print(result)
connections.disconnect("default")  # Close the Milvus connection
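Since the delete expression is a plain string of the form "face_id in [...]", it can be built from a list of ids when several entities need to be removed at once. The helper below is a sketch with a hypothetical name; the field name face_id comes from the schema above.

```python
def build_delete_expr(field_name, ids):
    """Build an expression such as 'face_id in [1, 2, 3]' from integer ids."""
    if not ids:
        raise ValueError("at least one id is required")
    # int() raises on non-numeric text, which keeps arbitrary strings
    # out of the expression.
    id_list = ", ".join(str(int(i)) for i in ids)
    return f"{field_name} in [{id_list}]"


print(build_delete_expr("face_id", [835193322]))  # face_id in [835193322]
print(build_delete_expr("face_id", [1, 2, 3]))    # face_id in [1, 2, 3]
```

The resulting string can be passed to collection.delete() in place of the hard-coded expr above.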
Reference links
6 ways to read pictures in Python - Programmer Sought
Face recognition and face comparison
opencv training face comparison_Python face detection method summary
Windows-Install the dlib library (pro-test is definitely possible, super detailed)
Python face recognition face_recognition - Programmer Sought
Python uses face_recognition face recognition - I am ed - 博客园
Teach you to use python to realize face recognition, the recognition rate is as high as 99.38% bzdww
[Deep Learning] Python face recognition library face_recognition tutorial
Python OpenCV Computer Vision Learning (Face Recognition) 01_ou.cs Blog - CSDN Blog
Use Python to implement a simple - face similarity comparison - Programmer Sought
Check if the image is blurry
How to perform fuzzy detection - Zhihu
OpenCV combat series image blur detection - Programmer Sought
Image Blur Detection Using Laplacian Transform - Arkenstone - 博客园
Python+Opencv detects blurry pictures - Programmer Sought
Practical Tips | Blur detection with OpenCV - Tencent Cloud Developer Community - Tencent Cloud
Use OpenCV to perform blur detection on photos in mobile phones - Programmer Sought
Save face data to database
Database Access for Face Recognition
Attu, the Milvus graphical management tool, is here! - Zhihu
Powerful vector database: Milvus bzdww
Milvus2.0 | Simple Introduction to Vector Database Python - Programmer Sought
How Milvus Realizes Dynamic Data Update and Query - Nuggets
Unprecedented analysis of Milvus source code architecture Develop Paper
Milvus Analysis | Unprecedented Milvus Source Code Architecture Analysis-Technology Circle , as of the time I posted this article, the content is the same as the previous link, but there is an extra QR code at the end of the article
Milvus Pit Avoidance Raiders-Knowledge
Milvus data backup migration
This is supplementary material that is not used in this article
0325 Live Registration|Milvus Data Migration Tool and Milvus v1.0 bazyd
Milvus Data Migration Tool-Milvusdm_Tencent News
GitHub - milvus-io/milvus-tools: A data migration tool for Milvus.
Milvus Data Migration Tool-Milvusdm Introduction & Milvus Community Group Established_哔哩哔哩_bilibili