* feat/yoloface (#334)

* added yolov8 to face_detector (#323)

* added yolov8 to face_detector

* added yolov8 to face_detector

* Initial cleanup and renaming

* Update README

* refactored detect_with_yoloface (#329)

* refactored detect_with_yoloface

* apply review

* Change order again

* Restore working code

* modified code (#330)

* refactored detect_with_yoloface

* apply review

* use temp_frame in detect_with_yoloface

* reorder

* modified

* reorder models

* Tiny cleanup

---------

Co-authored-by: tamoharu <133945583+tamoharu@users.noreply.github.com>

* include audio file functions (#336)

* Add testing for audio handlers

* Change order

* Fix naming

* Use correct typing in choices

* Update help message for arguments, Notation based wording approach (#347)

* Update help message for arguments, Notation based wording approach

* Fix installer

* Audio functions (#345)

* Update ffmpeg.py

* Create audio.py

* Update ffmpeg.py

* Update audio.py

* Update audio.py

* Update typing.py

* Update ffmpeg.py

* Update audio.py

* Rename Frame to VisionFrame (#346)

* Minor tidy up

* Introduce audio testing

* Add more todo for testing

* Add more todo for testing

* Fix indent

* Enable venv on the fly

* Enable venv on the fly

* Revert venv on the fly

* Revert venv on the fly

* Force Gradio to shut up

* Force Gradio to shut up

* Clear temp before processing

* Reduce terminal output

* include audio file functions

* Enforce output resolution on merge video

* Minor cleanups

* Add age and gender to face debugger items (#353)

* Add age and gender to face debugger items

* Rename as suggested in the code review

* Fix the output framerate vs. time

* Lip Sync (#356)

* Cli implementation of wav2lip

* - create get_first_item()
- remove non gan wav2lip model
- implement video memory strategy
- implement get_reference_frame()
- implement process_image()
- rearrange crop_mask_list
- implement test_cli

* Simplify testing

* Rename to lip syncer

* Fix testing

* Fix testing

* Minor cleanup

* Cuda 12 installer (#362)

* Make cuda nightly (12) the default

* Better keep legacy cuda just in case

* Use CUDA and ROCM versions

* Remove macOS options from installer (CoreML included in default package)

* Add lip-syncer support to source component

* Add lip-syncer support to source component

* Fix the check in the source component

* Add target image check

* Introduce more helpers to suit the lip-syncer needs

* Downgrade onnxruntime as of buggy 1.17.0 release

* Revert "Downgrade onnxruntime as of buggy 1.17.0 release"

This reverts commit f4a7ae6824fed87f0be50906bbc7e2d61d00617b.

* More testing and add todos

* Fix the frame processor API to at least not throw errors

* Introduce dict based frame processor inputs (#364)

* Introduce dict based frame processor inputs

* Forgot to adjust webcam

* create path payloads (#365)

* create index payload to paths for process_frames

* rename to payload_paths

* This code now is poetry

* Fix the terminal output

* Make lip-syncer work in the preview

* Remove face debugger test for now

* Reorder reference_faces, Fix testing

* Use inswapper_128 on buggy onnxruntime 1.17.0

* Undo inswapper_128_fp16 due to broken onnxruntime 1.17.0

* Undo inswapper_128_fp16 due to broken onnxruntime 1.17.0

* Fix lip_syncer occluder & region mask issue

* Fix preview once in case there was no output video fps

* fix lip_syncer custom fps

* remove unused import

* Add 68 landmark functions (#367)

* Add 68 landmark model

* Add landmark to face object

* Re-arrange and modify typing

* Rename function

* Rearrange

* Rearrange

* ignore type

* ignore type

* change type

* ignore

* name

* Some cleanup

* Some cleanup

* Oops, I broke something

* Feat/face analyser refactoring (#369)

* Restructure face analyser and start TDD

* YoloFace and Yunet testing are passing

* Remove offset from yoloface detection

* Cleanup code

* Tiny fix

* Fix get_many_faces()

* Tiny fix (again)

* Use 320x320 fallback for retinaface

* Fix merging mashup

* Upload wav2lip model

* Upload 2dfan2 model and rename internal to face_predictor

* Downgrade onnxruntime for most cases

* Update for the face debugger to render landmark 68

* Try to make detect_face_landmark_68() and detect_gender_age() more uniform

* Enable retinaface testing for 320x320

* Make detect_face_landmark_68() and detect_gender_age() as uniform as … (#370)

* Make detect_face_landmark_68() and detect_gender_age() as uniform as possible

* Revert landmark scale and translation

* Make box-mask for lip-syncer adjustable

* Add create_bbox_from_landmark()

* Remove currently unused code

* Feat/uniface (#375)

* add uniface (#373)

* Finalize UniFace implementation

---------

Co-authored-by: Harisreedhar <46858047+harisreedhar@users.noreply.github.com>

* My approach on how to do it

* edit

* edit

* replace vertical blur with gaussian

* remove region mask

* Rebase against next and restore method

* Minor improvements

* Minor improvements

* rename & add forehead padding

* Adjust and host uniface model

* Use 2dfan4 model

* Rename to face landmarker

* Feat/replace bbox with bounding box (#380)

* Add landmark 68 to 5 conversion

* Add landmark 68 to 5 conversion

* Keep 5, 5/68 and 68 landmarks

* Replace kps with landmark

* Replace bbox with bounding box

* Reshape face_landmark5_list different

* Make yoloface the default

* Move convert_face_landmark_68_to_5 to face_helper

* Minor spacing issue

* Dynamic detector sizes according to model (#382)

* Dynamic detector sizes according to model

* Dynamic detector sizes according to model

* Undo falsely committed files

* Add lip syncer model to the UI

* fix halo (#383)

* Bump to 2.3.0

* Update README and wording

* Update README and wording

* Fix spacing

* Apply _vision suffix

* Apply _vision suffix

* Apply _vision suffix

* Apply _vision suffix

* Apply _vision suffix

* Apply _vision suffix

* Apply _vision suffix, Move mouth mask to face_masker.py

* Apply _vision suffix

* Apply _vision suffix

* increase forehead padding

---------

Co-authored-by: tamoharu <133945583+tamoharu@users.noreply.github.com>
Co-authored-by: Harisreedhar <46858047+harisreedhar@users.noreply.github.com>
This commit is contained in:
Henry Ruhs
2024-02-14 14:08:29 +01:00
committed by GitHub
parent 122da0545b
commit c77493ff9a
66 changed files with 1893 additions and 884 deletions


@@ -5,12 +5,13 @@ import numpy
import onnxruntime
import facefusion.globals
from facefusion.download import conditional_download
from facefusion.common_helper import get_first
from facefusion.face_helper import warp_face_by_face_landmark_5, warp_face_by_translation, create_static_anchors, distance_to_face_landmark_5, distance_to_bounding_box, convert_face_landmark_68_to_5, apply_nms, categorize_age, categorize_gender
from facefusion.face_store import get_static_faces, set_static_faces
from facefusion.execution_helper import apply_execution_provider_options
from facefusion.face_helper import warp_face_by_kps, create_static_anchors, distance_to_kps, distance_to_bbox, apply_nms
from facefusion.download import conditional_download
from facefusion.filesystem import resolve_relative_path
from facefusion.typing import Frame, Face, FaceSet, FaceAnalyserOrder, FaceAnalyserAge, FaceAnalyserGender, ModelSet, Bbox, Kps, Score, Embedding
from facefusion.typing import VisionFrame, Face, FaceSet, FaceAnalyserOrder, FaceAnalyserAge, FaceAnalyserGender, ModelSet, BoundingBox, FaceLandmarkSet, FaceLandmark5, FaceLandmark68, Score, Embedding
from facefusion.vision import resize_frame_resolution, unpack_resolution
FACE_ANALYSER = None
@@ -23,6 +24,11 @@ MODELS : ModelSet =\
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/retinaface_10g.onnx',
'path': resolve_relative_path('../.assets/models/retinaface_10g.onnx')
},
'face_detector_yoloface':
{
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/yoloface_8n.onnx',
'path': resolve_relative_path('../.assets/models/yoloface_8n.onnx')
},
'face_detector_yunet':
{
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/yunet_2023mar.onnx',
@@ -43,6 +49,16 @@ MODELS : ModelSet =\
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/arcface_simswap.onnx',
'path': resolve_relative_path('../.assets/models/arcface_simswap.onnx')
},
'face_recognizer_arcface_uniface':
{
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/arcface_w600k_r50.onnx',
'path': resolve_relative_path('../.assets/models/arcface_w600k_r50.onnx')
},
'face_landmarker':
{
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/2dfan4.onnx',
'path': resolve_relative_path('../.assets/models/2dfan4.onnx')
},
'gender_age':
{
'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/gender_age.onnx',
@@ -58,6 +74,8 @@ def get_face_analyser() -> Any:
if FACE_ANALYSER is None:
if facefusion.globals.face_detector_model == 'retinaface':
face_detector = onnxruntime.InferenceSession(MODELS.get('face_detector_retinaface').get('path'), providers = apply_execution_provider_options(facefusion.globals.execution_providers))
if facefusion.globals.face_detector_model == 'yoloface':
face_detector = onnxruntime.InferenceSession(MODELS.get('face_detector_yoloface').get('path'), providers = apply_execution_provider_options(facefusion.globals.execution_providers))
if facefusion.globals.face_detector_model == 'yunet':
face_detector = cv2.FaceDetectorYN.create(MODELS.get('face_detector_yunet').get('path'), '', (0, 0))
if facefusion.globals.face_recognizer_model == 'arcface_blendswap':
@@ -66,11 +84,15 @@ def get_face_analyser() -> Any:
face_recognizer = onnxruntime.InferenceSession(MODELS.get('face_recognizer_arcface_inswapper').get('path'), providers = apply_execution_provider_options(facefusion.globals.execution_providers))
if facefusion.globals.face_recognizer_model == 'arcface_simswap':
face_recognizer = onnxruntime.InferenceSession(MODELS.get('face_recognizer_arcface_simswap').get('path'), providers = apply_execution_provider_options(facefusion.globals.execution_providers))
if facefusion.globals.face_recognizer_model == 'arcface_uniface':
face_recognizer = onnxruntime.InferenceSession(MODELS.get('face_recognizer_arcface_uniface').get('path'), providers = apply_execution_provider_options(facefusion.globals.execution_providers))
face_landmarker = onnxruntime.InferenceSession(MODELS.get('face_landmarker').get('path'), providers = apply_execution_provider_options(facefusion.globals.execution_providers))
gender_age = onnxruntime.InferenceSession(MODELS.get('gender_age').get('path'), providers = apply_execution_provider_options(facefusion.globals.execution_providers))
FACE_ANALYSER =\
{
'face_detector': face_detector,
'face_recognizer': face_recognizer,
'face_landmarker': face_landmarker,
'gender_age': gender_age
}
return FACE_ANALYSER
@@ -88,47 +110,36 @@ def pre_check() -> bool:
model_urls =\
[
MODELS.get('face_detector_retinaface').get('url'),
MODELS.get('face_detector_yoloface').get('url'),
MODELS.get('face_detector_yunet').get('url'),
MODELS.get('face_recognizer_arcface_blendswap').get('url'),
MODELS.get('face_recognizer_arcface_inswapper').get('url'),
MODELS.get('face_recognizer_arcface_simswap').get('url'),
MODELS.get('gender_age').get('url')
MODELS.get('face_recognizer_arcface_uniface').get('url'),
MODELS.get('face_landmarker').get('url'),
MODELS.get('gender_age').get('url'),
]
conditional_download(download_directory_path, model_urls)
return True
def extract_faces(frame : Frame) -> List[Face]:
face_detector_width, face_detector_height = unpack_resolution(facefusion.globals.face_detector_size)
frame_height, frame_width, _ = frame.shape
temp_frame = resize_frame_resolution(frame, face_detector_width, face_detector_height)
temp_frame_height, temp_frame_width, _ = temp_frame.shape
ratio_height = frame_height / temp_frame_height
ratio_width = frame_width / temp_frame_width
if facefusion.globals.face_detector_model == 'retinaface':
bbox_list, kps_list, score_list = detect_with_retinaface(temp_frame, temp_frame_height, temp_frame_width, face_detector_height, face_detector_width, ratio_height, ratio_width)
return create_faces(frame, bbox_list, kps_list, score_list)
elif facefusion.globals.face_detector_model == 'yunet':
bbox_list, kps_list, score_list = detect_with_yunet(temp_frame, temp_frame_height, temp_frame_width, ratio_height, ratio_width)
return create_faces(frame, bbox_list, kps_list, score_list)
return []
def detect_with_retinaface(temp_frame : Frame, temp_frame_height : int, temp_frame_width : int, face_detector_height : int, face_detector_width : int, ratio_height : float, ratio_width : float) -> Tuple[List[Bbox], List[Kps], List[Score]]:
def detect_with_retinaface(vision_frame : VisionFrame, face_detector_size : str) -> Tuple[List[BoundingBox], List[FaceLandmark5], List[Score]]:
face_detector = get_face_analyser().get('face_detector')
bbox_list = []
kps_list = []
score_list = []
face_detector_width, face_detector_height = unpack_resolution(face_detector_size)
temp_vision_frame = resize_frame_resolution(vision_frame, face_detector_width, face_detector_height)
ratio_height = vision_frame.shape[0] / temp_vision_frame.shape[0]
ratio_width = vision_frame.shape[1] / temp_vision_frame.shape[1]
feature_strides = [ 8, 16, 32 ]
feature_map_channel = 3
anchor_total = 2
prepare_frame = numpy.zeros((face_detector_height, face_detector_width, 3))
prepare_frame[:temp_frame_height, :temp_frame_width, :] = temp_frame
temp_frame = (prepare_frame - 127.5) / 128.0
temp_frame = numpy.expand_dims(temp_frame.transpose(2, 0, 1), axis = 0).astype(numpy.float32)
bounding_box_list = []
face_landmark5_list = []
score_list = []
with THREAD_SEMAPHORE:
detections = face_detector.run(None,
{
face_detector.get_inputs()[0].name: temp_frame
face_detector.get_inputs()[0].name: prepare_detect_frame(temp_vision_frame, face_detector_size)
})
for index, feature_stride in enumerate(feature_strides):
keep_indices = numpy.where(detections[index] >= facefusion.globals.face_detector_score)[0]
@@ -136,63 +147,119 @@ def detect_with_retinaface(temp_frame : Frame, temp_frame_height : int, temp_fra
stride_height = face_detector_height // feature_stride
stride_width = face_detector_width // feature_stride
anchors = create_static_anchors(feature_stride, anchor_total, stride_height, stride_width)
bbox_raw = detections[index + feature_map_channel] * feature_stride
kps_raw = detections[index + feature_map_channel * 2] * feature_stride
for bbox in distance_to_bbox(anchors, bbox_raw)[keep_indices]:
bbox_list.append(numpy.array(
bounding_box_raw = detections[index + feature_map_channel] * feature_stride
face_landmark_5_raw = detections[index + feature_map_channel * 2] * feature_stride
for bounding_box in distance_to_bounding_box(anchors, bounding_box_raw)[keep_indices]:
bounding_box_list.append(numpy.array(
[
bbox[0] * ratio_width,
bbox[1] * ratio_height,
bbox[2] * ratio_width,
bbox[3] * ratio_height
bounding_box[0] * ratio_width,
bounding_box[1] * ratio_height,
bounding_box[2] * ratio_width,
bounding_box[3] * ratio_height
]))
for kps in distance_to_kps(anchors, kps_raw)[keep_indices]:
kps_list.append(kps * [ ratio_width, ratio_height ])
for face_landmark5 in distance_to_face_landmark_5(anchors, face_landmark_5_raw)[keep_indices]:
face_landmark5_list.append(face_landmark5 * [ ratio_width, ratio_height ])
for score in detections[index][keep_indices]:
score_list.append(score[0])
return bbox_list, kps_list, score_list
return bounding_box_list, face_landmark5_list, score_list
def detect_with_yunet(temp_frame : Frame, temp_frame_height : int, temp_frame_width : int, ratio_height : float, ratio_width : float) -> Tuple[List[Bbox], List[Kps], List[Score]]:
def detect_with_yoloface(vision_frame : VisionFrame, face_detector_size : str) -> Tuple[List[BoundingBox], List[FaceLandmark5], List[Score]]:
face_detector = get_face_analyser().get('face_detector')
face_detector.setInputSize((temp_frame_width, temp_frame_height))
face_detector.setScoreThreshold(facefusion.globals.face_detector_score)
bbox_list = []
kps_list = []
face_detector_width, face_detector_height = unpack_resolution(face_detector_size)
temp_vision_frame = resize_frame_resolution(vision_frame, face_detector_width, face_detector_height)
ratio_height = vision_frame.shape[0] / temp_vision_frame.shape[0]
ratio_width = vision_frame.shape[1] / temp_vision_frame.shape[1]
bounding_box_list = []
face_landmark5_list = []
score_list = []
with THREAD_SEMAPHORE:
_, detections = face_detector.detect(temp_frame)
detections = face_detector.run(None,
{
face_detector.get_inputs()[0].name: prepare_detect_frame(temp_vision_frame, face_detector_size)
})
detections = numpy.squeeze(detections).T
bounding_box_raw, score_raw, face_landmark_5_raw = numpy.split(detections, [ 4, 5 ], axis = 1)
keep_indices = numpy.where(score_raw > facefusion.globals.face_detector_score)[0]
if keep_indices.any():
bounding_box_raw, face_landmark_5_raw, score_raw = bounding_box_raw[keep_indices], face_landmark_5_raw[keep_indices], score_raw[keep_indices]
for bounding_box in bounding_box_raw:
bounding_box_list.append(numpy.array(
[
(bounding_box[0] - bounding_box[2] / 2) * ratio_width,
(bounding_box[1] - bounding_box[3] / 2) * ratio_height,
(bounding_box[0] + bounding_box[2] / 2) * ratio_width,
(bounding_box[1] + bounding_box[3] / 2) * ratio_height
]))
face_landmark_5_raw[:, 0::3] = (face_landmark_5_raw[:, 0::3]) * ratio_width
face_landmark_5_raw[:, 1::3] = (face_landmark_5_raw[:, 1::3]) * ratio_height
for face_landmark_5 in face_landmark_5_raw:
face_landmark5_list.append(numpy.array(face_landmark_5.reshape(-1, 3)[:, :2]))
score_list = score_raw.ravel().tolist()
return bounding_box_list, face_landmark5_list, score_list
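Note on detect_with_yoloface(): the function reads each raw box as (center_x, center_y, width, height) in detector-input pixels and converts it to corner form, scaled back to the source frame. The landmark rows are read in strides of three, keeping x and y and dropping the third value. A minimal pure-Python sketch of that arithmetic follows. The helper names are hypothetical, and treating the dropped third value as a per-point confidence is an assumption not confirmed by this diff.

```python
from typing import List, Sequence, Tuple


def center_box_to_corners(box: Sequence[float], ratio_width: float, ratio_height: float) -> List[float]:
    # box is assumed to be (center_x, center_y, width, height) in detector pixels
    center_x, center_y, width, height = box
    return [
        (center_x - width / 2) * ratio_width,    # left
        (center_y - height / 2) * ratio_height,  # top
        (center_x + width / 2) * ratio_width,    # right
        (center_y + height / 2) * ratio_height,  # bottom
    ]


def split_landmarks(flat: Sequence[float], ratio_width: float, ratio_height: float) -> List[Tuple[float, float]]:
    # flat is assumed to hold (x, y, extra) triplets; the extra value is discarded
    points = []
    for offset in range(0, len(flat), 3):
        points.append((flat[offset] * ratio_width, flat[offset + 1] * ratio_height))
    return points


corners = center_box_to_corners((320.0, 240.0, 100.0, 80.0), 2.0, 1.5)
points = split_landmarks([10.0, 20.0, 0.9, 30.0, 40.0, 0.8], 2.0, 1.5)
```

One detail that may be worth double-checking in the function above: `keep_indices.any()` tests the index values rather than how many there are, so a single surviving detection at row 0 would evaluate as falsy. A length check would avoid that edge case.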
def detect_with_yunet(vision_frame : VisionFrame, face_detector_size : str) -> Tuple[List[BoundingBox], List[FaceLandmark5], List[Score]]:
face_detector = get_face_analyser().get('face_detector')
face_detector_width, face_detector_height = unpack_resolution(face_detector_size)
temp_vision_frame = resize_frame_resolution(vision_frame, face_detector_width, face_detector_height)
ratio_height = vision_frame.shape[0] / temp_vision_frame.shape[0]
ratio_width = vision_frame.shape[1] / temp_vision_frame.shape[1]
bounding_box_list = []
face_landmark5_list = []
score_list = []
face_detector.setInputSize((temp_vision_frame.shape[1], temp_vision_frame.shape[0]))
face_detector.setScoreThreshold(facefusion.globals.face_detector_score)
with THREAD_SEMAPHORE:
_, detections = face_detector.detect(temp_vision_frame)
if detections.any():
for detection in detections:
bbox_list.append(numpy.array(
bounding_box_list.append(numpy.array(
[
detection[0] * ratio_width,
detection[1] * ratio_height,
(detection[0] + detection[2]) * ratio_width,
(detection[1] + detection[3]) * ratio_height
]))
kps_list.append(detection[4:14].reshape((5, 2)) * [ ratio_width, ratio_height])
face_landmark5_list.append(detection[4:14].reshape((5, 2)) * [ ratio_width, ratio_height ])
score_list.append(detection[14])
return bbox_list, kps_list, score_list
return bounding_box_list, face_landmark5_list, score_list
def create_faces(frame : Frame, bbox_list : List[Bbox], kps_list : List[Kps], score_list : List[Score]) -> List[Face]:
def prepare_detect_frame(temp_vision_frame : VisionFrame, face_detector_size : str) -> VisionFrame:
face_detector_width, face_detector_height = unpack_resolution(face_detector_size)
detect_vision_frame = numpy.zeros((face_detector_height, face_detector_width, 3))
detect_vision_frame[:temp_vision_frame.shape[0], :temp_vision_frame.shape[1], :] = temp_vision_frame
detect_vision_frame = (detect_vision_frame - 127.5) / 128.0
detect_vision_frame = numpy.expand_dims(detect_vision_frame.transpose(2, 0, 1), axis = 0).astype(numpy.float32)
return detect_vision_frame
def create_faces(vision_frame : VisionFrame, bounding_box_list : List[BoundingBox], face_landmark5_list : List[FaceLandmark5], score_list : List[Score]) -> List[Face]:
faces = []
if facefusion.globals.face_detector_score > 0:
sort_indices = numpy.argsort(-numpy.array(score_list))
bbox_list = [ bbox_list[index] for index in sort_indices ]
kps_list = [ kps_list[index] for index in sort_indices ]
bounding_box_list = [ bounding_box_list[index] for index in sort_indices ]
face_landmark5_list = [ face_landmark5_list[index] for index in sort_indices ]
score_list = [ score_list[index] for index in sort_indices ]
keep_indices = apply_nms(bbox_list, 0.4)
keep_indices = apply_nms(bounding_box_list, 0.4)
for index in keep_indices:
bbox = bbox_list[index]
kps = kps_list[index]
bounding_box = bounding_box_list[index]
face_landmark_68 = detect_face_landmark_68(vision_frame, bounding_box)
landmark : FaceLandmarkSet =\
{
'5': face_landmark5_list[index],
'5/68': convert_face_landmark_68_to_5(face_landmark_68),
'68': face_landmark_68
}
score = score_list[index]
embedding, normed_embedding = calc_embedding(frame, kps)
gender, age = detect_gender_age(frame, bbox)
embedding, normed_embedding = calc_embedding(vision_frame, landmark['5/68'])
gender, age = detect_gender_age(vision_frame, bounding_box)
faces.append(Face(
bbox = bbox,
kps = kps,
bounding_box = bounding_box,
landmark = landmark,
score = score,
embedding = embedding,
normed_embedding = normed_embedding,
@@ -202,41 +269,57 @@ def create_faces(frame : Frame, bbox_list : List[Bbox], kps_list : List[Kps], sc
return faces
def calc_embedding(temp_frame : Frame, kps : Kps) -> Tuple[Embedding, Embedding]:
def calc_embedding(temp_vision_frame : VisionFrame, face_landmark_5 : FaceLandmark5) -> Tuple[Embedding, Embedding]:
face_recognizer = get_face_analyser().get('face_recognizer')
crop_frame, matrix = warp_face_by_kps(temp_frame, kps, 'arcface_112_v2', (112, 112))
crop_frame = crop_frame.astype(numpy.float32) / 127.5 - 1
crop_frame = crop_frame[:, :, ::-1].transpose(2, 0, 1)
crop_frame = numpy.expand_dims(crop_frame, axis = 0)
crop_vision_frame, matrix = warp_face_by_face_landmark_5(temp_vision_frame, face_landmark_5, 'arcface_112_v2', (112, 112))
crop_vision_frame = crop_vision_frame / 127.5 - 1
crop_vision_frame = crop_vision_frame[:, :, ::-1].transpose(2, 0, 1).astype(numpy.float32)
crop_vision_frame = numpy.expand_dims(crop_vision_frame, axis = 0)
embedding = face_recognizer.run(None,
{
face_recognizer.get_inputs()[0].name: crop_frame
face_recognizer.get_inputs()[0].name: crop_vision_frame
})[0]
embedding = embedding.ravel()
normed_embedding = embedding / numpy.linalg.norm(embedding)
return embedding, normed_embedding
def detect_gender_age(frame : Frame, bbox : Bbox) -> Tuple[int, int]:
def detect_face_landmark_68(temp_vision_frame : VisionFrame, bounding_box : BoundingBox) -> FaceLandmark68:
face_landmarker = get_face_analyser().get('face_landmarker')
scale = 195 / numpy.subtract(bounding_box[2:], bounding_box[:2]).max()
translation = (256 - numpy.add(bounding_box[2:], bounding_box[:2]) * scale) * 0.5
crop_vision_frame, affine_matrix = warp_face_by_translation(temp_vision_frame, translation, scale, (256, 256))
crop_vision_frame = crop_vision_frame.transpose(2, 0, 1).astype(numpy.float32) / 255.0
face_landmark_68 = face_landmarker.run(None,
{
face_landmarker.get_inputs()[0].name: [ crop_vision_frame ]
})[0]
face_landmark_68 = face_landmark_68[:, :, :2][0] / 64
face_landmark_68 = face_landmark_68.reshape(1, -1, 2) * 256
face_landmark_68 = cv2.transform(face_landmark_68, cv2.invertAffineTransform(affine_matrix))
face_landmark_68 = face_landmark_68.reshape(-1, 2)
return face_landmark_68
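Note on detect_face_landmark_68(): the scale and translation computed above appear to fit the longer side of the bounding box to 195 pixels and center the box in the 256x256 crop. The sketch below checks that arithmetic in plain Python. It assumes warp_face_by_translation() builds the matrix [[scale, 0, tx], [0, scale, ty]], as the removed lines of detect_gender_age() suggest. The helper names here are hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def crop_transform(bounding_box: Sequence[float], crop_size: float = 256, face_size: float = 195) -> Tuple[float, Point]:
    # bounding_box is (left, top, right, bottom) in frame pixels
    left, top, right, bottom = bounding_box
    scale = face_size / max(right - left, bottom - top)
    translation = (
        (crop_size - (left + right) * scale) * 0.5,
        (crop_size - (top + bottom) * scale) * 0.5,
    )
    return scale, translation


def to_crop(point: Point, scale: float, translation: Point) -> Point:
    # forward affine: frame coordinates -> crop coordinates
    return point[0] * scale + translation[0], point[1] * scale + translation[1]


def to_frame(point: Point, scale: float, translation: Point) -> Point:
    # inverse affine: crop coordinates -> frame coordinates
    return (point[0] - translation[0]) / scale, (point[1] - translation[1]) / scale


scale, translation = crop_transform((100.0, 50.0, 300.0, 350.0))
center_in_crop = to_crop((200.0, 200.0), scale, translation)
round_trip = to_frame(center_in_crop, scale, translation)
```

Under these assumptions the box center lands on the crop center, and the inverse mapping plays the role that `cv2.invertAffineTransform` has in the function when the predicted landmarks are projected back onto the frame.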
def detect_gender_age(temp_vision_frame : VisionFrame, bounding_box : BoundingBox) -> Tuple[int, int]:
gender_age = get_face_analyser().get('gender_age')
bbox = bbox.reshape(2, -1)
scale = 64 / numpy.subtract(*bbox[::-1]).max()
translation = 48 - bbox.sum(axis = 0) * 0.5 * scale
affine_matrix = numpy.array([[ scale, 0, translation[0] ], [ 0, scale, translation[1] ]])
crop_frame = cv2.warpAffine(frame, affine_matrix, (96, 96))
crop_frame = crop_frame.astype(numpy.float32)[:, :, ::-1].transpose(2, 0, 1)
crop_frame = numpy.expand_dims(crop_frame, axis = 0)
bounding_box = bounding_box.reshape(2, -1)
scale = 64 / numpy.subtract(*bounding_box[::-1]).max()
translation = 48 - bounding_box.sum(axis = 0) * scale * 0.5
crop_vision_frame, affine_matrix = warp_face_by_translation(temp_vision_frame, translation, scale, (96, 96))
crop_vision_frame = crop_vision_frame[:, :, ::-1].transpose(2, 0, 1).astype(numpy.float32)
crop_vision_frame = numpy.expand_dims(crop_vision_frame, axis = 0)
prediction = gender_age.run(None,
{
gender_age.get_inputs()[0].name: crop_frame
gender_age.get_inputs()[0].name: crop_vision_frame
})[0][0]
gender = int(numpy.argmax(prediction[:2]))
age = int(numpy.round(prediction[2] * 100))
return gender, age
def get_one_face(frame : Frame, position : int = 0) -> Optional[Face]:
many_faces = get_many_faces(frame)
def get_one_face(vision_frame : VisionFrame, position : int = 0) -> Optional[Face]:
many_faces = get_many_faces(vision_frame)
if many_faces:
try:
return many_faces[position]
@@ -245,52 +328,64 @@ def get_one_face(frame : Frame, position : int = 0) -> Optional[Face]:
return None
def get_average_face(frames : List[Frame], position : int = 0) -> Optional[Face]:
def get_average_face(vision_frames : List[VisionFrame], position : int = 0) -> Optional[Face]:
average_face = None
faces = []
embedding_list = []
normed_embedding_list = []
for frame in frames:
face = get_one_face(frame, position)
for vision_frame in vision_frames:
face = get_one_face(vision_frame, position)
if face:
faces.append(face)
embedding_list.append(face.embedding)
normed_embedding_list.append(face.normed_embedding)
if faces:
first_face = get_first(faces)
average_face = Face(
bbox = faces[0].bbox,
kps = faces[0].kps,
score = faces[0].score,
bounding_box = first_face.bounding_box,
landmark = first_face.landmark,
score = first_face.score,
embedding = numpy.mean(embedding_list, axis = 0),
normed_embedding = numpy.mean(normed_embedding_list, axis = 0),
gender = faces[0].gender,
age = faces[0].age
gender = first_face.gender,
age = first_face.age
)
return average_face
def get_many_faces(frame : Frame) -> List[Face]:
def get_many_faces(vision_frame : VisionFrame) -> List[Face]:
faces = []
try:
faces_cache = get_static_faces(frame)
faces_cache = get_static_faces(vision_frame)
if faces_cache:
faces = faces_cache
else:
faces = extract_faces(frame)
set_static_faces(frame, faces)
if facefusion.globals.face_detector_model == 'retinaface':
bounding_box_list, face_landmark5_list, score_list = detect_with_retinaface(vision_frame, facefusion.globals.face_detector_size)
faces = create_faces(vision_frame, bounding_box_list, face_landmark5_list, score_list)
if facefusion.globals.face_detector_model == 'yoloface':
bounding_box_list, face_landmark5_list, score_list = detect_with_yoloface(vision_frame, facefusion.globals.face_detector_size)
faces = create_faces(vision_frame, bounding_box_list, face_landmark5_list, score_list)
if facefusion.globals.face_detector_model == 'yunet':
bounding_box_list, face_landmark5_list, score_list = detect_with_yunet(vision_frame, facefusion.globals.face_detector_size)
faces = create_faces(vision_frame, bounding_box_list, face_landmark5_list, score_list)
if faces:
set_static_faces(vision_frame, faces)
if facefusion.globals.face_analyser_order:
faces = sort_by_order(faces, facefusion.globals.face_analyser_order)
if facefusion.globals.face_analyser_age:
faces = filter_by_age(faces, facefusion.globals.face_analyser_age)
if facefusion.globals.face_analyser_gender:
faces = filter_by_gender(faces, facefusion.globals.face_analyser_gender)
return faces
except (AttributeError, ValueError):
return []
pass
return faces
def find_similar_faces(frame : Frame, reference_faces : FaceSet, face_distance : float) -> List[Face]:
def find_similar_faces(reference_faces : FaceSet, vision_frame : VisionFrame, face_distance : float) -> List[Face]:
similar_faces : List[Face] = []
many_faces = get_many_faces(frame)
many_faces = get_many_faces(vision_frame)
if reference_faces:
for reference_set in reference_faces:
@@ -315,17 +410,17 @@ def calc_face_distance(face : Face, reference_face : Face) -> float:
def sort_by_order(faces : List[Face], order : FaceAnalyserOrder) -> List[Face]:
if order == 'left-right':
return sorted(faces, key = lambda face: face.bbox[0])
return sorted(faces, key = lambda face: face.bounding_box[0])
if order == 'right-left':
return sorted(faces, key = lambda face: face.bbox[0], reverse = True)
return sorted(faces, key = lambda face: face.bounding_box[0], reverse = True)
if order == 'top-bottom':
return sorted(faces, key = lambda face: face.bbox[1])
return sorted(faces, key = lambda face: face.bounding_box[1])
if order == 'bottom-top':
return sorted(faces, key = lambda face: face.bbox[1], reverse = True)
return sorted(faces, key = lambda face: face.bounding_box[1], reverse = True)
if order == 'small-large':
return sorted(faces, key = lambda face: (face.bbox[2] - face.bbox[0]) * (face.bbox[3] - face.bbox[1]))
return sorted(faces, key = lambda face: (face.bounding_box[2] - face.bounding_box[0]) * (face.bounding_box[3] - face.bounding_box[1]))
if order == 'large-small':
return sorted(faces, key = lambda face: (face.bbox[2] - face.bbox[0]) * (face.bbox[3] - face.bbox[1]), reverse = True)
return sorted(faces, key = lambda face: (face.bounding_box[2] - face.bounding_box[0]) * (face.bounding_box[3] - face.bounding_box[1]), reverse = True)
if order == 'best-worst':
return sorted(faces, key = lambda face: face.score, reverse = True)
if order == 'worst-best':
@@ -336,13 +431,7 @@ def sort_by_order(faces : List[Face], order : FaceAnalyserOrder) -> List[Face]:
def filter_by_age(faces : List[Face], age : FaceAnalyserAge) -> List[Face]:
filter_faces = []
for face in faces:
if face.age < 13 and age == 'child':
filter_faces.append(face)
elif face.age < 19 and age == 'teen':
filter_faces.append(face)
elif face.age < 60 and age == 'adult':
filter_faces.append(face)
elif face.age > 59 and age == 'senior':
if categorize_age(face.age) == age:
filter_faces.append(face)
return filter_faces
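Note on filter_by_age(): the body of categorize_age() lives in facefusion.face_helper and is not part of this hunk. The sketch below only assumes it keeps the thresholds from the removed lines (under 13, under 19, under 60, otherwise senior). It also reproduces the removed elif chain as a hypothetical `legacy_matches()` to show why a single categorizing function is the safer design: the old chain let a 10-year-old pass the 'teen' and 'adult' filters.

```python
def categorize_age(age: int) -> str:
    # thresholds assumed from the removed lines; the real helper may differ
    if age < 13:
        return 'child'
    if age < 19:
        return 'teen'
    if age < 60:
        return 'adult'
    return 'senior'


def legacy_matches(face_age: int, wanted: str) -> bool:
    # mirrors the removed elif chain: each branch has only an upper bound
    if face_age < 13 and wanted == 'child':
        return True
    elif face_age < 19 and wanted == 'teen':
        return True
    elif face_age < 60 and wanted == 'adult':
        return True
    elif face_age > 59 and wanted == 'senior':
        return True
    return False


ages = [ 10, 16, 35, 72 ]
teens_new = [ age for age in ages if categorize_age(age) == 'teen' ]
teens_old = [ age for age in ages if legacy_matches(age, 'teen') ]
```

With the exclusive categories, each face falls into exactly one bucket, so the filter result no longer depends on the order of the comparisons.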
@@ -350,8 +439,6 @@ def filter_by_age(faces : List[Face], age : FaceAnalyserAge) -> List[Face]:
def filter_by_gender(faces : List[Face], gender : FaceAnalyserGender) -> List[Face]:
filter_faces = []
for face in faces:
if face.gender == 0 and gender == 'female':
filter_faces.append(face)
if face.gender == 1 and gender == 'male':
if categorize_gender(face.gender) == gender:
filter_faces.append(face)
return filter_faces
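Note on get_average_face() earlier in this diff: it averages the `normed_embedding` vectors with `numpy.mean`. The mean of unit vectors is generally shorter than unit length, which matters if later code treats the result as already normalized. Whether calc_face_distance() or any other consumer re-normalizes is not visible in this diff, so the following is only a plain-Python illustration of the effect, with hypothetical helper names.

```python
import math
from typing import List, Sequence


def vector_norm(vector: Sequence[float]) -> float:
    return math.sqrt(sum(value * value for value in vector))


def normalize(vector: Sequence[float]) -> List[float]:
    norm = vector_norm(vector)
    return [ value / norm for value in vector ]


def mean_vector(vectors: Sequence[Sequence[float]]) -> List[float]:
    # element-wise mean, the same operation as numpy.mean(..., axis = 0)
    return [ sum(column) / len(vectors) for column in zip(*vectors) ]


normed_embedding_list = [ normalize([ 1.0, 0.0 ]), normalize([ 0.0, 1.0 ]) ]
average = mean_vector(normed_embedding_list)
average_norm = vector_norm(average)
renormalized_norm = vector_norm(normalize(average))
```

In this two-vector example the averaged vector has length of about 0.707, and normalizing it once more restores unit length.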