Real-Time Face Recognition and Identity Verification with OpenCV & face-recognition
Project name: Face recognition from camera with OpenCV
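I. System Requirements Design
1. System Overview
This AI face recognition system aims to accurately recognize, verify, and analyze faces in images or video streams. It can be applied to security surveillance, access control, attendance tracking, identity authentication, and similar fields to improve security and management efficiency.
2. Feature Overview
1). Face enrollment and image encoding
2). Face recognition and identity verification
3). A web version of the features above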
II. System Functional Modules
Core environment configuration:
| Module | Version |
| --- | --- |
| python | 3.12.4 |
| opencv-python | 4.10.0.84 |
| face-recognition | 1.3.0 |
| Pillow | 10.0.0 |
| numpy | 2.1.2 |
| Flask (needed for the web version) | 3.1.0 |
The project layout is as follows:
```
Face_recognition_local
├─ camera.py
├─ fps.py
├─ haarcascade_frontalface_default.xml
├─ trained_model.pkl
├─ training.py
├─ main.py
└─ person
```

```
Face_recognition_web
├─ camera.py
├─ fps.py
├─ haarcascade_frontalface_default.xml
├─ trained_model.pkl
├─ training.py
├─ person
├─ app.py
└─ templates
   └─ index.html
```
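camera.py: optimized OpenCV video capture
fps.py: displays the FPS of the current video stream
haarcascade_frontalface_default.xml: pretrained Haar cascade classifier shipped with OpenCV, used to detect faces
trained_model.pkl: file holding the encodings of the enrolled face images
training.py: encodes the jpg images into the .pkl file
person: folder containing the face images
main.py: main program
app.py: web back end
templates: web front-end folder
index.html: web front-end page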
III. System Implementation
Core system flow chart
```python
# training.py
import os
import face_recognition
import pickle

person_encodings = []
person_names = []

# Walk the person folder and encode every .jpg image
for filename in os.listdir('person'):
    if filename.endswith('.jpg'):
        image = face_recognition.load_image_file(os.path.join('person', filename))
        encodings = face_recognition.face_encodings(image)
        if encodings:
            # Use the first face found in the image
            encoding = encodings[0]
            person_encodings.append(encoding)
            # Person name = file name without extension and without digits
            person_name = ''.join([i for i in os.path.splitext(filename)[0] if not i.isdigit()])
            person_names.append(person_name)

# Save the encodings and the names together
with open('trained_model.pkl', 'wb') as f:
    pickle.dump((person_encodings, person_names), f)
```
Step-by-step walkthrough:
Read the .jpg files in the person folder. File names follow the format name+number.jpg, for example 张三1.jpg.
```python
person_encodings = []
person_names = []

for filename in os.listdir('person'):
    if filename.endswith('.jpg'):
```
Encode each image with face-recognition:
```python
        image = face_recognition.load_image_file(os.path.join('person', filename))
        encodings = face_recognition.face_encodings(image)
        if encodings:
            encoding = encodings[0]
            person_encodings.append(encoding)

            person_name = ''.join([i for i in os.path.splitext(filename)[0] if not i.isdigit()])
            person_names.append(person_name)
```
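After placing jpg images of the target faces in the person folder, run the script to produce the model file trained_model.pkl.
In testing, with 9 images per person, recognition accuracy was 65%-90%.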
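The name+number.jpg convention can be checked without any images. The following is a minimal sketch that reuses the digit-stripping expression from training.py; the helper names `name_from_filename` and `group_by_person` are hypothetical and only serve to illustrate how file names map to person names.

```python
import os


def name_from_filename(filename: str) -> str:
    """Drop the extension and every digit, as training.py does."""
    stem = os.path.splitext(filename)[0]
    return "".join(ch for ch in stem if not ch.isdigit())


def group_by_person(filenames):
    """Count the .jpg images available per person (hypothetical helper)."""
    counts = {}
    for filename in filenames:
        if filename.endswith(".jpg"):
            name = name_from_filename(filename)
            counts[name] = counts.get(name, 0) + 1
    return counts


files = ["张三1.jpg", "张三2.jpg", "李四1.jpg", "notes.txt"]
print(group_by_person(files))
```

One consequence of this rule is that every digit is removed, not only the trailing number, so a name that itself contains digits would be altered.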
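Main flow
Source: 基于cv2.VideoCapture 和 OpenCV 得到更快的 FPS之文件篇 (Faster FPS with cv2.VideoCapture and OpenCV: the file edition) [3]
Main optimization: the .read method of cv2.VideoCapture is a blocking operation. According to the source article, moving these blocking I/O operations to a separate thread and maintaining a queue of decoded frames can raise the FPS processing rate by more than 52%. The camera.py listed next applies the same idea, but keeps only the most recent frame behind a lock rather than a queue.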
```python
# camera.py
from threading import Thread, Lock
from datetime import datetime
import time
import cv2

time_cycle = 80  # target duration of one capture cycle, in milliseconds


class CameraThread(Thread):
    def __init__(self, kill_event, src=0, width=320, height=240):
        self.kill_event = kill_event
        self.stream = cv2.VideoCapture(src)
        self.stream.set(cv2.CAP_PROP_FRAME_WIDTH, width)
        self.stream.set(cv2.CAP_PROP_FRAME_HEIGHT, height)
        # Read one frame up front so read() has data before the thread starts
        (self.grabbed, self.frame) = self.stream.read()
        self.read_lock = Lock()
        # args has to be a tuple, hence (kill_event,)
        Thread.__init__(self, args=(kill_event,))

    def update(self):
        # Blocking read happens outside the lock; only the swap is locked
        (grabbed, frame) = self.stream.read()
        self.read_lock.acquire()
        self.grabbed, self.frame = grabbed, frame
        self.read_lock.release()

    def read(self):
        # Return a copy so the caller never sees a frame being replaced
        self.read_lock.acquire()
        frame = self.frame.copy()
        self.read_lock.release()
        return frame

    def run(self):
        while not self.kill_event.is_set():
            start_time = datetime.now()
            self.update()
            finish_time = datetime.now()

            # Elapsed time of this cycle in milliseconds
            dt = finish_time - start_time
            ms = (dt.days * 24 * 60 * 60 + dt.seconds) * 1000 + dt.microseconds / 1000.0
            # Sleep for the rest of the cycle to cap the capture rate
            if ms < time_cycle:
                time.sleep((time_cycle - ms) / 1000.0)
```
The CameraThread class inherits from Thread, so camera frames can be captured in a separate thread.
Initialization method __init__:
```python
def __init__(self, kill_event, src=0, width=320, height=240):
    self.kill_event = kill_event
    self.stream = cv2.VideoCapture(src)
    self.stream.set(cv2.CAP_PROP_FRAME_WIDTH, width)
    self.stream.set(cv2.CAP_PROP_FRAME_HEIGHT, height)

    (self.grabbed, self.frame) = self.stream.read()

    self.read_lock = Lock()

    Thread.__init__(self, args=(kill_event,))
```
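kill_event: the event object used to stop the thread.
src: the camera index, 0 by default.
width and height: the width and height of the video frame.
self.stream: creates the video capture object.
self.stream.set: sets the width and height of the video frame.
self.grabbed and self.frame: hold the result of reading the first frame.
self.read_lock: creates a lock object for thread synchronization.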
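The kill_event mechanism can be tried without a camera. The sketch below is only an illustration under that assumption: `Worker` is a hypothetical stand-in for CameraThread that counts loop iterations instead of reading frames, while the loop condition and the set/join shutdown sequence mirror the ones used in this project.

```python
import time
from threading import Event, Thread


class Worker(Thread):
    """Hypothetical stand-in for CameraThread; it only counts iterations."""

    def __init__(self, kill_event):
        super().__init__()
        self.kill_event = kill_event
        self.iterations = 0

    def run(self):
        # Same loop condition as CameraThread.run
        while not self.kill_event.is_set():
            self.iterations += 1
            time.sleep(0.01)


kill_event = Event()
worker = Worker(kill_event)
worker.start()
time.sleep(0.1)    # let the worker run briefly
kill_event.set()   # ask the loop to stop
worker.join()      # wait until run() has returned
print(worker.is_alive(), worker.iterations > 0)
```

Because the loop checks the event once per iteration, shutdown takes at most one cycle, which is why join() returns quickly after set().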
Update method update:
```python
def update(self):
    (grabbed, frame) = self.stream.read()
    self.read_lock.acquire()
    self.grabbed, self.frame = grabbed, frame
    self.read_lock.release()
```
self.stream.read(): reads one frame.
self.read_lock.acquire() and self.read_lock.release(): lock and unlock around the update of self.grabbed and self.frame to keep it thread-safe.
Read method read:
```python
def read(self):
    self.read_lock.acquire()
    frame = self.frame.copy()
    self.read_lock.release()
    return frame
```
self.read_lock.acquire() and self.read_lock.release(): lock and unlock while self.frame is read to keep it thread-safe.
self.frame.copy(): returns a copy of the current frame.
Run method run:
```python
def run(self):
    while not self.kill_event.is_set():
        start_time = datetime.now()
        self.update()
        finish_time = datetime.now()

        dt = finish_time - start_time
        ms = (dt.days * 24 * 60 * 60 + dt.seconds) * 1000 + dt.microseconds / 1000.0
        if ms < time_cycle:
            time.sleep((time_cycle - ms) / 1000.0)
```
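while not self.kill_event.is_set(): loops until kill_event is set.
start_time and finish_time: record the start and end time of each iteration.
dt: the time difference of each iteration.
ms: the time difference converted to milliseconds.
time.sleep((time_cycle - ms) / 1000.0): if the iteration took less than time_cycle, sleeps for the remainder to control the frame rate.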
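The pacing arithmetic in run can be worked through in isolation. This is a small sketch, not part of camera.py: the function names are hypothetical, and the 80 ms default is taken from time_cycle. It also shows that the manual millisecond formula gives the same value as `timedelta.total_seconds() * 1000`.

```python
from datetime import timedelta


def sleep_seconds(elapsed_ms: float, cycle_ms: float = 80.0) -> float:
    """Time left in the current cycle, in seconds (0 if the cycle overran)."""
    return max(cycle_ms - elapsed_ms, 0.0) / 1000.0


def max_updates_per_second(cycle_ms: float = 80.0) -> float:
    """Upper bound on frame updates per second imposed by the cycle length."""
    return 1000.0 / cycle_ms


def manual_ms(dt: timedelta) -> float:
    """The millisecond conversion used in CameraThread.run."""
    return (dt.days * 24 * 60 * 60 + dt.seconds) * 1000 + dt.microseconds / 1000.0


dt = timedelta(seconds=1, microseconds=500000)
print(sleep_seconds(30.0))        # remaining time after a 30 ms read
print(sleep_seconds(95.0))        # an overrun cycle does not sleep
print(max_updates_per_second())   # cap implied by time_cycle = 80
print(manual_ms(dt), dt.total_seconds() * 1000)
```

With time_cycle = 80 the capture thread refreshes the frame at most 12.5 times per second, regardless of how fast the camera can deliver frames.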
```python
# fps.py
import time


class FPS:
    def __init__(self):
        self.prev_time = time.time()  # timestamp of the previous update
        self.fps = 0

    def update(self):
        current_time = time.time()
        # Instantaneous frame rate: one frame per elapsed interval
        self.fps = 1 / (current_time - self.prev_time)
        self.prev_time = current_time
        return self.fps
```
1. Get the current timestamp current_time.
2. Compute the current frame rate self.fps:
$$ fps = \frac{1}{current\_time - self.prev\_time} $$
3. Update self.prev_time to the current timestamp current_time.
4. Return the computed frame rate self.fps.
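The formula above yields an instantaneous value, so the displayed number can jump from frame to frame, and two updates with the same timestamp would divide by zero. As a possible variant (not the author's fps.py), the sketch below smooths the reading with an exponential moving average; the class name `SmoothedFPS`, the `alpha` weight, and the injectable `clock` parameter are assumptions introduced for illustration.

```python
import time


class SmoothedFPS:
    """Hypothetical variant of FPS using an exponential moving average."""

    def __init__(self, alpha=0.1, clock=time.perf_counter):
        self.alpha = alpha        # weight of the newest sample
        self.clock = clock        # injectable so the demo is deterministic
        self.prev_time = clock()
        self.fps = 0.0

    def update(self):
        now = self.clock()
        dt = now - self.prev_time
        self.prev_time = now
        if dt <= 0:               # guard against division by zero
            return self.fps
        instant = 1.0 / dt
        if self.fps == 0.0:       # first sample initialises the average
            self.fps = instant
        else:                     # later samples are blended in
            self.fps = self.alpha * instant + (1 - self.alpha) * self.fps
        return self.fps


# Deterministic demo with a fake clock instead of real time
ticks = iter([0.0, 0.5, 1.0, 1.0, 1.25])
meter = SmoothedFPS(alpha=0.5, clock=lambda: next(ticks))
readings = [meter.update() for _ in range(4)]
print(readings)
```

A smaller alpha gives a steadier but slower-reacting readout; alpha = 1 reduces to the original instantaneous formula.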
```python
# main.py
import cv2
import tkinter as tk
from PIL import Image, ImageTk, ImageDraw
import numpy as np
from PIL import ImageFont
from threading import Event
from camera import CameraThread
import face_recognition
import pickle
from fps import FPS

choose_camera = 0            # camera index
min_matching_degree = 0.65   # minimum matching degree (1 - face distance)


def cv2AddChineseText(img, text, position, textColor=(0, 255, 0), textSize=30):
    """Draw text (including Chinese characters) on an OpenCV image via Pillow."""
    if (isinstance(img, np.ndarray)):
        img = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    draw = ImageDraw.Draw(img)
    fontStyle = ImageFont.truetype("simsun.ttc", textSize, encoding="utf-8")
    draw.text(position, text, textColor, font=fontStyle)
    # Convert back to OpenCV's BGR layout
    return cv2.cvtColor(np.asarray(img), cv2.COLOR_RGB2BGR)


# Haar cascade shipped with OpenCV (loaded here; detection below uses face_recognition)
face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')

# Main window
root = tk.Tk()
root.geometry('640x480')
root.title('人脸识别')  # window title: "Face Recognition"

# Label that displays the video frames
image_label = tk.Label(root)
image_label.pack()

photo = None

# Load the encoded faces and their names
with open('trained_model.pkl', 'rb') as f:
    person_encodings, person_names = pickle.load(f)

# Start the capture thread
kill_event = Event()
camera_thread = CameraThread(kill_event, src=choose_camera, width=640, height=480)
camera_thread.start()

fps_calculator = FPS()


def update_frame():
    global photo
    frame = camera_thread.read()
    frame = cv2.flip(frame, 1)  # mirror the image horizontally

    # face_recognition expects RGB images
    rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

    # Detect and encode the faces in the frame
    face_locations = face_recognition.face_locations(rgb_frame)
    face_encodings = face_recognition.face_encodings(rgb_frame, face_locations)

    for (top, right, bottom, left), face_encoding in zip(face_locations, face_encodings):
        # Compare the face against every enrolled encoding
        matches = face_recognition.compare_faces(person_encodings, face_encoding)
        face_distances = face_recognition.face_distance(person_encodings, face_encoding)
        best_match_index = np.argmin(face_distances)
        name = "Unknown"
        matching_degree = 1 - face_distances[best_match_index]
        if matches[best_match_index] and matching_degree > min_matching_degree:
            name = person_names[best_match_index]

        # Draw the box and the name above it
        cv2.rectangle(frame, (left, top), (right, bottom), (0, 255, 255), 2)
        frame = cv2AddChineseText(frame, name, (left + (right-left)//2 - 10, top - 30), (0, 255, 255), 30)

    # Overlay the frame rate
    fps = fps_calculator.update()
    cv2.putText(frame, f"FPS: {fps:.2f}", (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)

    # Show the frame in the tkinter label
    image = Image.fromarray(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    photo = ImageTk.PhotoImage(image)
    image_label.configure(image=photo)
    image_label.image = photo  # keep a reference so it is not garbage-collected

    # Schedule the next update in 10 ms
    root.after(10, update_frame)


update_frame()


def on_closing():
    # Stop the capture thread, then release the camera and close the window
    kill_event.set()
    camera_thread.join()
    camera_thread.stream.release()
    cv2.destroyAllWindows()
    root.destroy()


root.protocol("WM_DELETE_WINDOW", on_closing)
root.mainloop()
```
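The matching decision inside update_frame can be isolated from the camera and the GUI. The sketch below assumes, as I understand the library, that face_recognition.face_distance is the Euclidean distance between encodings and that compare_faces uses a default tolerance of 0.6; under those assumptions, requiring a matching degree above 0.65 is the same as requiring a distance below 0.35, which already implies the compare_faces check. The function `best_match` and the 2-D toy vectors are hypothetical; real encodings have 128 dimensions.

```python
import math


def best_match(known_encodings, known_names, candidate,
               min_matching_degree=0.65):
    """Return (name, matching_degree) for the closest known encoding."""
    if not known_encodings:
        return "Unknown", 0.0
    # Euclidean distance to every enrolled encoding
    distances = [math.dist(enc, candidate) for enc in known_encodings]
    # Index of the smallest distance (the role np.argmin plays in main.py)
    best = min(range(len(distances)), key=distances.__getitem__)
    degree = 1 - distances[best]
    name = known_names[best] if degree > min_matching_degree else "Unknown"
    return name, degree


known = [(0.0, 0.0), (1.0, 0.0)]   # toy 2-D "encodings"
names = ["A", "B"]

print(best_match(known, names, (0.1, 0.0)))   # close to A
print(best_match(known, names, (0.5, 0.0)))   # too far from both
print(best_match([], [], (0.5, 0.0)))         # nothing enrolled
```

The empty-list guard matters in practice: calling np.argmin on an empty distance array raises an error, which would happen if trained_model.pkl contained no encodings.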
Web implementation: app.py & index.html
app.py is the back-end service obtained by porting main.py to the Flask framework.
```python
# app.py
from flask import Flask, render_template, Response
import cv2
import face_recognition
import pickle
import numpy as np
from fps import FPS
from PIL import Image, ImageDraw, ImageFont

app = Flask(__name__)

choose_camera = 1            # camera index
min_matching_degree = 0.65   # minimum matching degree (1 - face distance)

# Load the encoded faces and their names
with open('trained_model.pkl', 'rb') as f:
    person_encodings, person_names = pickle.load(f)

# Open the camera
cap = cv2.VideoCapture(choose_camera)

fps_calculator = FPS()


def cv2AddChineseText(img, text, position, textColor=(0, 255, 0), textSize=30):
    """Draw text (including Chinese characters) on an OpenCV image via Pillow."""
    if (isinstance(img, np.ndarray)):
        img = Image.fromarray(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    draw = ImageDraw.Draw(img)
    fontStyle = ImageFont.truetype("simsun.ttc", textSize, encoding="utf-8")
    draw.text(position, text, textColor, font=fontStyle)
    return cv2.cvtColor(np.asarray(img), cv2.COLOR_RGB2BGR)


def generate_frames():
    while True:
        success, frame = cap.read()
        if not success:
            break
        else:
            frame = cv2.flip(frame, 1)  # mirror the image horizontally

            # face_recognition expects RGB images
            rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)

            # Detect and encode the faces in the frame
            face_locations = face_recognition.face_locations(rgb_frame)
            face_encodings = face_recognition.face_encodings(rgb_frame, face_locations)

            for (top, right, bottom, left), face_encoding in zip(face_locations, face_encodings):
                # Compare the face against every enrolled encoding
                matches = face_recognition.compare_faces(person_encodings, face_encoding)
                face_distances = face_recognition.face_distance(person_encodings, face_encoding)
                best_match_index = np.argmin(face_distances)
                name = "Unknown"
                matching_degree = 1 - face_distances[best_match_index]
                if matches[best_match_index] and matching_degree > min_matching_degree:
                    name = person_names[best_match_index]

                # Draw the box and the name above it
                cv2.rectangle(frame, (left, top), (right, bottom), (0, 255, 255), 2)
                frame = cv2AddChineseText(frame, name, (left + (right-left)//2 - 10, top - 30), (0, 255, 255), 30)

            # Overlay the frame rate
            fps = fps_calculator.update()
            cv2.putText(frame, f"FPS: {fps:.2f}", (10, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (0, 255, 0), 2)

            # Encode the frame as JPEG and emit it as one multipart chunk
            ret, buffer = cv2.imencode('.jpg', frame)
            frame = buffer.tobytes()
            yield (b'--frame\r\n'
                   b'Content-Type: image/jpeg\r\n\r\n' + frame + b'\r\n')


@app.route('/')
def index():
    return render_template('index.html')


@app.route('/video_feed')
def video_feed():
    return Response(generate_frames(), mimetype='multipart/x-mixed-replace; boundary=frame')


if __name__ == '__main__':
    app.run(debug=True)
```
Front-end page index.html
```html
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>人脸识别</title>
</head>
<body>
    <h1>人脸识别</h1>
    <img src="{{ url_for('video_feed') }}" width="640" height="480">
</body>
</html>
```
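The img tag works because the /video_feed route streams a multipart/x-mixed-replace response in which each part is one JPEG, and the browser replaces the displayed image as every new part arrives. The byte layout produced by generate_frames can be sketched without Flask or a camera; `mjpeg_part` and `stream` are hypothetical helper names, and the frames are placeholder byte strings rather than real JPEG data.

```python
def mjpeg_part(jpeg_bytes: bytes, boundary: bytes = b"frame") -> bytes:
    """Wrap one JPEG image as a single multipart/x-mixed-replace part."""
    return (b"--" + boundary + b"\r\n"
            b"Content-Type: image/jpeg\r\n\r\n" + jpeg_bytes + b"\r\n")


def stream(frames, boundary: bytes = b"frame"):
    """Yield one multipart chunk per frame, as generate_frames does."""
    for jpeg in frames:
        yield mjpeg_part(jpeg, boundary)


# Placeholder payloads standing in for cv2.imencode output
fake_frames = [b"\xff\xd8one\xff\xd9", b"\xff\xd8two\xff\xd9"]
body = b"".join(stream(fake_frames))
print(body.count(b"--frame\r\n"))
```

The boundary string has to match the `boundary=frame` value declared in the response's mimetype, otherwise the browser cannot tell where one image ends and the next begins.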
IV. System Results
The model was trained beforehand on 9 photos of the author.
| Component | Specification |
| --- | --- |
| CPU | i9-14900HX |
| Memory | 32GB |
| GPU | RTX4060 |
Average frame rate: 2.6 FPS
Average accuracy: 70%
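V. Improvement History and Shortcomings
At the start, the project did not use a model trained with face-recognition for verification. Instead, it read every image in the folder and looked for the one that best matched the captured face. Because each verification had to traverse all images in the folder, performance suffered badly, to the point of freezing. The code was therefore refactored to train a model with face-recognition and compare the captured face against that model, which greatly eased the performance problem. Performance was still poor, however, because the read method of cv2.VideoCapture blocks. After some research, a blogger's approach [3] was adopted, moving these blocking I/O operations to a separate thread.
Although the system now runs normally, the frame rate is still very low and needs further optimization. In addition, the classifier currently provided by OpenCV, haarcascade_frontalface_default.xml, performs well on frontal faces but very poorly on faces in other orientations and on complex expressions. (In the main.py listed above, the cascade is loaded, while the detection call itself is face_recognition.face_locations.) A new classifier would need to be trained to cope with more complex environments.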
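VI. Further Optimization
In practice, faces do not need to be detected on every frame, so smoothness can be improved by increasing the interval between detections.
Rough idea: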
```python
detection_interval = 5  # number of frames between two detections
frame_count = 0         # frame counter

def detect_faces(self):
    while not self.kill_event.is_set():
        if self.frame_count % self.detection_interval == 0:
            frame = self.camera_thread.read()
            ...
        self.frame_count += 1
```
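The effect of the interval can be demonstrated with a stand-in detector. In this sketch, `IntervalDetector` and `fake_detect` are hypothetical names; the point is only that the expensive call runs once every `interval` frames and the previous result is reused in between.

```python
class IntervalDetector:
    """Hypothetical wrapper: run `detect` every `interval` frames, reuse otherwise."""

    def __init__(self, detect, interval=5):
        self.detect = detect
        self.interval = interval
        self.frame_count = 0
        self.last_result = []
        self.calls = 0            # how often the expensive detector ran

    def process(self, frame):
        if self.frame_count % self.interval == 0:
            self.last_result = self.detect(frame)
            self.calls += 1
        self.frame_count += 1
        return self.last_result


def fake_detect(frame):
    """Stand-in for face detection; tags the result with the frame it saw."""
    return [("box", frame)]


detector = IntervalDetector(fake_detect, interval=5)
results = [detector.process(i) for i in range(12)]
print(detector.calls)
print(results[4], results[5])
```

The trade-off is staleness: between detections the boxes stay where the last detected face was, so a larger interval gives a smoother video but boxes that lag behind fast movement.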
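Face detection can also be moved to its own thread so that it does not block the main thread.
Combining these two points, create a new Python file, FaceDetector.py. The main program then eases the stutter by running face detection in a new thread.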
```python
# FaceDetector.py
import threading
import cv2
import face_recognition


class FaceDetector:
    def __init__(self, camera_thread, detection_interval):
        self.camera_thread = camera_thread
        self.detection_interval = detection_interval
        self.frame_count = 0
        self.face_locations = []
        self.face_encodings = []
        self.kill_event = threading.Event()
        self.lock = threading.Lock()

        # Run detection in its own thread
        self.detection_thread = threading.Thread(target=self.detect_faces)
        self.detection_thread.start()

    def detect_faces(self):
        while not self.kill_event.is_set():
            # frame_count counts loop iterations, not camera frames
            if self.frame_count % self.detection_interval == 0:
                frame = self.camera_thread.read()
                rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
                face_locations = face_recognition.face_locations(rgb_frame)
                # Encode the faces just detected (the local list, not the stale self.face_locations)
                face_encodings = face_recognition.face_encodings(rgb_frame, face_locations)

                # Publish both results together under the lock
                with self.lock:
                    self.face_locations = face_locations
                    self.face_encodings = face_encodings

            self.frame_count += 1

    def stop(self):
        self.kill_event.set()
        self.detection_thread.join()

    def get_faces(self):
        with self.lock:
            return self.face_locations, self.face_encodings
```
Modify main.py:
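```python
from FaceDetector import FaceDetector

detection_interval = 20  # detection interval passed to FaceDetector
face_detector = FaceDetector(camera_thread, detection_interval)

def update_frame():
    # The following lines are replaced:
    #   # convert the image format
    #   rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    #   # detect faces
    #   face_locations = face_recognition.face_locations(rgb_frame)
    #   face_encodings = face_recognition.face_encodings(rgb_frame, face_locations)
    # by a single call that fetches the latest results of the detection thread:
    face_locations, face_encodings = face_detector.get_faces()

# on_closing should also call face_detector.stop() so the detection thread exits
```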
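This change raises the frame rate considerably, but the face boxes are now drawn in the wrong place. In the original screenshot, the face position and the box position are exactly mirror images of each other.
The cause is that main.py mirrors the image with frame = cv2.flip(frame, 1), whereas FaceDetector.py reads the frame from camera.py without mirroring it, so the face locations it returns refer to the unmirrored frame. The fix is either to add frame = cv2.flip(frame, 1) in FaceDetector.py or to remove the mirroring from main.py.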
```python
# FaceDetector.py (with mirroring)
import threading
import cv2
import face_recognition


class FaceDetector:
    def __init__(self, camera_thread, detection_interval):
        self.camera_thread = camera_thread
        self.detection_interval = detection_interval
        self.frame_count = 0
        self.face_locations = []
        self.face_encodings = []
        self.kill_event = threading.Event()
        self.lock = threading.Lock()

        # Run detection in its own thread
        self.detection_thread = threading.Thread(target=self.detect_faces)
        self.detection_thread.start()

    def detect_faces(self):
        while not self.kill_event.is_set():
            if self.frame_count % self.detection_interval == 0:
                frame = self.camera_thread.read()
                # Mirror the frame so the locations match what main.py displays
                frame = cv2.flip(frame, 1)
                rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
                face_locations = face_recognition.face_locations(rgb_frame)
                # Encode the faces just detected (the local list, not the stale self.face_locations)
                face_encodings = face_recognition.face_encodings(rgb_frame, face_locations)

                with self.lock:
                    self.face_locations = face_locations
                    self.face_encodings = face_encodings

            self.frame_count += 1

    def stop(self):
        self.kill_event.set()
        self.detection_thread.join()

    def get_faces(self):
        with self.lock:
            return self.face_locations, self.face_encodings
```
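The mirror symmetry also suggests a third option, not used in this project: keep detecting on the unmirrored frame and convert the box coordinates instead. The sketch below assumes boxes in the (top, right, bottom, left) order used throughout main.py and treats the values as edge positions, so the result may differ by one pixel from an exact pixel-index flip; `mirror_box` is a hypothetical helper.

```python
def mirror_box(box, frame_width):
    """Map a (top, right, bottom, left) box onto the horizontally flipped frame."""
    top, right, bottom, left = box
    # The old left edge becomes the new right edge, and vice versa
    return (top, frame_width - left, bottom, frame_width - right)


box = (100, 300, 200, 220)           # face found in the unmirrored 640-wide frame
mirrored = mirror_box(box, 640)
print(mirrored)
print(mirror_box(mirrored, 640))     # mirroring twice restores the original
```

Converting coordinates avoids flipping the whole image a second time in the detection thread, at the cost of one more place where the frame width has to be known.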
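Comparison before and after the improvement (figure).
Adding a new thread and increasing the detection interval clearly yields a very significant improvement.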
References:
[1] OpenCV Tutorials: https://docs.opencv.org/4.x/d9/df8/tutorial_root.html
[2] face-recognition: https://github.com/ageitgey/face_recognition
[3] 基于cv2.VideoCapture 和 OpenCV 得到更快的 FPS之文件篇 (Faster FPS with cv2.VideoCapture and OpenCV: the file edition): https://blog.csdn.net/weixin_43229348/article/details/122688684
[4] OpenCV 中文文档 (OpenCV documentation in Chinese): https://apachecn.github.io/opencv-doc-zh/#/
[5] Flask框架入门教程 (Introductory Flask tutorial): https://blog.csdn.net/wly55690/article/details/131683846