How To Build A Facial Recognition System

Overview of OpenCV

OpenCV (Open Source Computer Vision Library) is an open-source computer vision and machine learning software library. It was designed for computational efficiency and is widely used for real-time computer vision applications. OpenCV supports various programming languages, including Python, C++, and Java, making it versatile for developers.

The library provides a vast array of functions that allow users to process images and videos to identify objects, faces, and even read and understand gestures and expressions. It includes built-in algorithms for image filtering, feature detection, object tracking, and more. OpenCV's extensive community and robust set of tools make it ideal for creating applications ranging from basic image manipulations to complex deep learning integrations.

With its ease of use and capability to handle large-scale data, OpenCV has become a popular choice in industries like security, healthcare, and entertainment for projects involving computer vision.

Step 1: Installing The Necessary Libraries

To get started, install the OpenCV library for image processing. Copy the command below and paste it into your terminal:

pip install opencv-python

Later steps in this guide also use the face_recognition, numpy, and imutils libraries, which can be installed the same way (face_recognition depends on dlib, so this install can take a while to build):

pip install face-recognition numpy imutils

Step 2: Accessing The Webcam

import cv2
  • This line imports the cv2 module, which is part of OpenCV. This module contains functions for image and video processing.
# Open the default webcam (0 indicates the first camera)
cap = cv2.VideoCapture(0)
  • This line starts the process of accessing your computer’s default webcam. The 0 parameter tells OpenCV to use the first connected camera. If you had multiple cameras and wanted to use a different one, you could use 1, 2, etc.
# Check if the webcam is opened successfully
if not cap.isOpened():
    print("Error: Could not open webcam.")
    exit()
  • This checks if the webcam opened successfully. If it didn’t, the program prints an error message and stops running using exit().
# Capture video frame-by-frame
while True:
    ret, frame = cap.read()
  • This line starts a loop to read frames continuously from the webcam. The cap.read() function captures each frame. ret is a boolean that shows if the frame was read correctly (True or False), and frame is the actual image/frame captured.
    if not ret:
        print("Error: Frame not read correctly.")
        break
  • This checks if a frame was read properly. If ret is False, the program prints an error message and breaks out of the loop to stop reading frames.
    # Display the frame
    cv2.imshow('Webcam', frame)
  • This line displays the current frame in a window named 'Webcam'. The cv2.imshow() function creates the window and shows the frame inside it.
    # Press 'q' to quit the webcam window
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
  • This waits up to 1 millisecond for a key press. If the 'q' key is pressed, the program breaks out of the loop and stops capturing frames. The & 0xFF masks the return value down to its lowest 8 bits, because on some platforms cv2.waitKey() returns an integer with extra high bits set.
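The masking step can be illustrated without a camera. Below is a minimal sketch of why & 0xFF is applied; the raw values are made up to stand in for what cv2.waitKey() might return on different systems:

```python
# Hypothetical return values from cv2.waitKey():
# some platforms report the bare ASCII code, others set extra high bits.
raw_plain = 113                  # 'q' reported directly
raw_with_flags = 0x10000 | 113   # 'q' with extra modifier bits set

# ord('q') gives the ASCII code we compare against
print(ord('q'))                            # 113

# Without masking, the flagged value would not match...
print(raw_with_flags == ord('q'))          # False
# ...but keeping only the lowest 8 bits recovers the key code
print(raw_with_flags & 0xFF == ord('q'))   # True
print(raw_plain & 0xFF == ord('q'))        # True
```

Because Python's comparison operators bind more loosely than &, the expression `cv2.waitKey(1) & 0xFF == ord('q')` masks first and compares second, which is exactly the intent.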
# Release the webcam and close the window
cap.release()
cv2.destroyAllWindows()
  • cap.release() stops access to the webcam, freeing it up for other applications.

  • cv2.destroyAllWindows() closes all windows created by OpenCV to ensure no program windows remain open after the code finishes running.

Code Summary

import cv2

# Open the default webcam (0 indicates the first camera)
cap = cv2.VideoCapture(0)

# Check if the webcam is opened successfully
if not cap.isOpened():
    print("Error: Could not open webcam.")
    exit()

# Capture video frame-by-frame
while True:
    ret, frame = cap.read()
    if not ret:
        print("Error: Frame not read correctly.")
        break

    # Display the frame
    cv2.imshow('Webcam', frame)

    # Press 'q' to quit the webcam window
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the webcam and close the window
cap.release()
cv2.destroyAllWindows()

Step 3: Loading Our Model

Here's a breakdown of each part of the code to help you understand how the face recognition project is built:

Imports

import cv2
import face_recognition
import numpy as np
import os
import imutils
  • cv2 (OpenCV): Used for image processing and handling camera input/output.

  • face_recognition: A library built on dlib that simplifies face recognition tasks.

  • numpy: Used for numerical operations, particularly for handling arrays and distances.

  • os: Used for interacting with the operating system, such as reading files from directories.

  • imutils: Simplifies common OpenCV tasks like resizing images.

Initializing Path and Preparing Image Data

path = "Models"
images = []
classNames = []
mylist = os.listdir(path)
  • path = "Models": Specifies the directory where images for training are stored.

  • images = []: A list to store images read from the directory.

  • classNames = []: A list to store the names of the people (or classes) based on image filenames.

  • mylist = os.listdir(path): Lists all files in the specified directory.

Reading Images from the File Path

for cls in mylist:
    curntimg = cv2.imread(f'{path}/{cls}')
    images.append(curntimg)
    classNames.append(os.path.splitext(cls)[0])
  • for cls in mylist:: Iterates over each file in the path.

  • curntimg = cv2.imread(f'{path}/{cls}'): Reads the image file using OpenCV.

  • images.append(curntimg): Adds the image to the images list.

  • classNames.append(os.path.splitext(cls)[0]): Adds the file name (without extension) to classNames.
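The filename-to-name mapping can be tried on its own. Here is a minimal sketch; the filenames below are made up for illustration:

```python
import os

# Hypothetical contents of the Models directory
mylist = ["Alice.jpg", "Bob.png", "Carol.jpeg"]

# os.path.splitext() splits each filename into (name, extension);
# taking index [0] keeps just the name
classNames = [os.path.splitext(cls)[0] for cls in mylist]
print(classNames)  # ['Alice', 'Bob', 'Carol']
```

Note that os.listdir() returns files in arbitrary order; the images and names stay aligned only because both lists are appended inside the same loop.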

Encoding Images

def Encodings(images):
    encodeList = []
    for img in images:
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        encode = face_recognition.face_encodings(img)[0]
        encodeList.append(encode)
    return encodeList
  • Function Encodings: Takes a list of images and returns a list of their encodings.

  • cv2.cvtColor(img, cv2.COLOR_BGR2RGB): Converts the image from BGR to RGB format for compatibility with face_recognition.

  • face_recognition.face_encodings(img)[0]: Computes the face encoding for the image. The [0] takes the first face found; if a training image contains no detectable face, this line raises an IndexError, so make sure each image in the Models folder shows exactly one clear face.

  • encodeList.append(encode): Adds the encoding to encodeList.

  • return encodeList: Returns the list of all encodings.
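The colour conversion can be pictured with a small NumPy sketch: cv2.imread() returns pixels in BGR channel order, while face_recognition expects RGB, and for an 8-bit image the conversion simply reverses the channel axis (the same result cv2.cvtColor(img, cv2.COLOR_BGR2RGB) produces). The pixel value below is made up for illustration:

```python
import numpy as np

# A single hypothetical BGR pixel: blue=255, green=10, red=20
bgr = np.array([[[255, 10, 20]]], dtype=np.uint8)

# Reversing the last (channel) axis yields RGB order
rgb = bgr[:, :, ::-1]
print(rgb[0, 0].tolist())  # [20, 10, 255]
```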

Initialize Known Encodings

encodeListKnown = Encodings(images)
  • encodeListKnown: Stores the encoded representations of all training images.

Initializing Camera

cam = cv2.VideoCapture(0)
  • cv2.VideoCapture(0): Initializes the camera for capturing video frames (0 indicates the default camera).

Main Loop for Real-Time Recognition

while True:
    _, img = cam.read()
    imgs = cv2.resize(img, (0, 0), None, 0.33, 0.33)
    imgs = cv2.cvtColor(imgs, cv2.COLOR_BGR2RGB)
    img = imutils.resize(img, height=450, width=700)
  • while True:: Keeps the loop running for real-time video feed.

  • _, img = cam.read(): Captures a frame from the camera.

  • imgs = cv2.resize(img, (0, 0), None, 0.33, 0.33): Resizes the frame to one third of its original size (scale factors 0.33 in both directions) so that face detection runs faster.

  • imgs = cv2.cvtColor(imgs, cv2.COLOR_BGR2RGB): Converts the frame to RGB.

  • img = imutils.resize(img, height=450, width=700): Resizes the displayed image for better viewing. Note that imutils.resize preserves the aspect ratio, so only one of the two dimensions (the width, in this call) actually determines the output size.

Detect and Encode Faces in the Frame

facesCurFrame = face_recognition.face_locations(imgs)
encode = face_recognition.face_encodings(imgs, facesCurFrame)
  • facesCurFrame = face_recognition.face_locations(imgs): Detects the locations of faces in the frame.

  • encode = face_recognition.face_encodings(imgs, facesCurFrame): Encodes the detected faces.

Matching Encodings and Displaying Results

for encodeFace, faceloc in zip(encode, facesCurFrame):
    matches = face_recognition.compare_faces(encodeListKnown, encodeFace)
    faceDis = face_recognition.face_distance(encodeListKnown, encodeFace)
    matchIndex = np.argmin(faceDis)
  • for encodeFace, faceloc in zip(encode, facesCurFrame):: Loops through each detected face and its encoding.

  • matches = face_recognition.compare_faces(encodeListKnown, encodeFace): Checks for matches between the known encodings and the current face.

  • faceDis = face_recognition.face_distance(encodeListKnown, encodeFace): Calculates the distance between encodings (lower means better match).

  • matchIndex = np.argmin(faceDis): Finds the index of the closest match.
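face_distance computes the Euclidean distance between 128-dimensional encodings, and compare_faces is essentially a threshold test on that distance (0.6 by default). The matching logic can be sketched with toy vectors; the 3-dimensional encodings below are stand-ins for real 128-dimensional ones:

```python
import numpy as np

# Toy stand-ins for stored face encodings
encodeListKnown = np.array([
    [0.0, 0.0, 0.0],   # known face A
    [1.0, 1.0, 1.0],   # known face B
])
encodeFace = np.array([0.9, 1.0, 1.1])  # face seen in the current frame

# The same computation face_recognition.face_distance performs
faceDis = np.linalg.norm(encodeListKnown - encodeFace, axis=1)
matchIndex = int(np.argmin(faceDis))  # index of the closest known face
print(matchIndex)           # 1  (closest to known face B)

# compare_faces is roughly a threshold test on those distances
tolerance = 0.6
matches = list(faceDis <= tolerance)
print(bool(matches[matchIndex]))  # True
```

A lower distance means a better match, which is why the code combines argmin (closest candidate) with the boolean check (close enough to count as a match at all).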

Drawing and Labeling Faces

if matches[matchIndex]:
    name = classNames[matchIndex].upper()
    y1, x2, y2, x1 = faceloc
    y1, x2, y2, x1 = y1*3, x2*3, y2*3, x1*3
    cv2.rectangle(img, (x1, y1), (x2, y2+9), (0, 255, 0), 2)
    cv2.rectangle(img, (x1, y2-32), (x2, y2+9), (0, 255, 0), cv2.FILLED)
    cv2.putText(img, name, (x1+5, y2+5), cv2.FONT_HERSHEY_COMPLEX, 1, (255, 255, 255), 2)
  • if matches[matchIndex]:: Checks if the closest match is valid.

  • name = classNames[matchIndex].upper(): Retrieves and formats the name of the matched person.

  • cv2.rectangle and cv2.putText: Draws a rectangle around the face and labels it with the person's name.
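face_locations returns each box as a (top, right, bottom, left) tuple measured on the downscaled detection frame, so the coordinates must be multiplied back up before drawing on the larger display image. A minimal arithmetic sketch (the box values are made up):

```python
# Hypothetical face box from the frame that was resized to 1/3 size
faceloc = (50, 120, 110, 60)  # (top, right, bottom, left)

# Scale each coordinate back up by 3, the inverse of the 0.33 resize
y1, x2, y2, x1 = faceloc
y1, x2, y2, x1 = y1 * 3, x2 * 3, y2 * 3, x1 * 3
print((y1, x2, y2, x1))  # (150, 360, 330, 180)
```

The scale factor has to match the resize factor used before detection; mixing factors (for example, scaling by 4 after a 0.33 resize) would misplace the boxes.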

Handling Unknown Faces

if not matches[matchIndex]:
    y1, x2, y2, x1 = faceloc
    y1, x2, y2, x1 = y1*3, x2*3, y2*3, x1*3
    cv2.rectangle(img, (x1, y1), (x2, y2), (0, 0, 255), 2)
    cv2.rectangle(img, (x1, y2-32), (x2, y2+9), (0, 0, 255), cv2.FILLED)
    cv2.putText(img, "UNKNOWN", (x1+6, y2+6), cv2.FONT_HERSHEY_COMPLEX, 1, (255, 255, 255), 2)
  • Draws a red rectangle: If the face is not recognized, it is labeled as "UNKNOWN".

Display the Output

cv2.imshow('FRAME', img)
cv2.waitKey(1)
  • cv2.imshow('FRAME', img): Displays the video feed with labeled faces.

  • cv2.waitKey(1): Waits 1 millisecond so the window can refresh between frames. Unlike the Step 2 loop, this version has no quit key or cleanup; you can add the same 'q' check, followed by cam.release() and cv2.destroyAllWindows(), to exit cleanly.

This code essentially captures video, detects and encodes faces, matches them with known images, and displays the result with labeled rectangles around recognized or unknown faces.