Next (#384)
* feat/yoloface (#334)
* added yolov8 to face_detector (#323)
* added yolov8 to face_detector
* added yolov8 to face_detector
* Initial cleanup and renaming
* Update README
* refactored detect_with_yoloface (#329)
* refactored detect_with_yoloface
* apply review
* Change order again
* Restore working code
* modified code (#330)
* refactored detect_with_yoloface
* apply review
* use temp_frame in detect_with_yoloface
* reorder
* modified
* reorder models
* Tiny cleanup

---------
Co-authored-by: tamoharu <133945583+tamoharu@users.noreply.github.com>

* include audio file functions (#336)
* Add testing for audio handlers
* Change order
* Fix naming
* Use correct typing in choices
* Update help message for arguments, Notation based wording approach (#347)
* Update help message for arguments, Notation based wording approach
* Fix installer
* Audio functions (#345)
* Update ffmpeg.py
* Create audio.py
* Update ffmpeg.py
* Update audio.py
* Update audio.py
* Update typing.py
* Update ffmpeg.py
* Update audio.py
* Rename Frame to VisionFrame (#346)
* Minor tidy up
* Introduce audio testing
* Add more todo for testing
* Add more todo for testing
* Fix indent
* Enable venv on the fly
* Enable venv on the fly
* Revert venv on the fly
* Revert venv on the fly
* Force Gradio to shut up
* Force Gradio to shut up
* Clear temp before processing
* Reduce terminal output
* include audio file functions
* Enforce output resolution on merge video
* Minor cleanups
* Add age and gender to face debugger items (#353)
* Add age and gender to face debugger items
* Rename like suggested in the code review
* Fix the output framerate vs. time
* Lip Sync (#356)
* CLI implementation of wav2lip
  - create get_first_item()
  - remove non-GAN wav2lip model
  - implement video memory strategy
  - implement get_reference_frame()
  - implement process_image()
  - rearrange crop_mask_list
  - implement test_cli
* Simplify testing
* Rename to lip syncer
* Fix testing
* Fix testing
* Minor cleanup
* Cuda 12 installer (#362)
* Make cuda nightly (12) the default
* Better keep legacy cuda just in case
* Use CUDA and ROCM versions
* Remove MacOS options from installer (CoreML included in default package)
* Add lip-syncer support to source component
* Add lip-syncer support to source component
* Fix the check in the source component
* Add target image check
* Introduce more helpers to suit the lip-syncer needs
* Downgrade onnxruntime as of buggy 1.17.0 release
* Revert "Downgrade onnxruntime as of buggy 1.17.0 release"
  This reverts commit f4a7ae6824fed87f0be50906bbc7e2d61d00617b.
* More testing and add todos
* Fix the frame processor API to at least not throw errors
* Introduce dict based frame processor inputs (#364)
* Introduce dict based frame processor inputs
* Forgot to adjust webcam
* create path payloads (#365)
* create index payload to paths for process_frames
* rename to payload_paths
* This code now is poetry
* Fix the terminal output
* Make lip-syncer work in the preview
* Remove face debugger test for now
* Reorder reference_faces, Fix testing
* Use inswapper_128 on buggy onnxruntime 1.17.0
* Undo inswapper_128_fp16 due to broken onnxruntime 1.17.0
* Undo inswapper_128_fp16 due to broken onnxruntime 1.17.0
* Fix lip_syncer occluder & region mask issue
* Fix preview once in case there was no output video fps
* fix lip_syncer custom fps
* remove unused import
* Add 68 landmark functions (#367)
* Add 68 landmark model
* Add landmark to face object
* Re-arrange and modify typing
* Rename function
* Rearrange
* Rearrange
* ignore type
* ignore type
* change type
* ignore
* name
* Some cleanup
* Some cleanup
* Oops, I broke something
* Feat/face analyser refactoring (#369)
* Restructure face analyser and start TDD
* YoloFace and Yunet testing are passing
* Remove offset from yoloface detection
* Cleanup code
* Tiny fix
* Fix get_many_faces()
* Tiny fix (again)
* Use 320x320 fallback for retinaface
* Fix merging mashup
* Upload wav2lip model
* Upload 2dfan2 model and rename internal to face_predictor
* Downgrade onnxruntime for most cases
* Update for the face debugger to render landmark 68
* Try to make detect_face_landmark_68() and detect_gender_age() more uniform
* Enable retinaface testing for 320x320
* Make detect_face_landmark_68() and detect_gender_age() as uniform as … (#370)
* Make detect_face_landmark_68() and detect_gender_age() as uniform as possible
* Revert landmark scale and translation
* Make box-mask for lip-syncer adjustable
* Add create_bbox_from_landmark()
* Remove currently unused code
* Feat/uniface (#375)
* add uniface (#373)
* Finalize UniFace implementation

---------
Co-authored-by: Harisreedhar <46858047+harisreedhar@users.noreply.github.com>

* My approach how to do it
* edit
* edit
* replace vertical blur with gaussian
* remove region mask
* Rebase against next and restore method
* Minor improvements
* Minor improvements
* rename & add forehead padding
* Adjust and host uniface model
* Use 2dfan4 model
* Rename to face landmarker
* Feat/replace bbox with bounding box (#380)
* Add landmark 68 to 5 conversion
* Add landmark 68 to 5 conversion
* Keep 5, 5/68 and 68 landmarks
* Replace kps with landmark
* Replace bbox with bounding box
* Reshape face_landmark5_list different
* Make yoloface the default
* Move convert_face_landmark_68_to_5 to face_helper
* Minor spacing issue
* Dynamic detector sizes according to model (#382)
* Dynamic detector sizes according to model
* Dynamic detector sizes according to model
* Undo false committed files
* Add lip syncer model to the UI
* fix halo (#383)
* Bump to 2.3.0
* Update README and wording
* Update README and wording
* Fix spacing
* Apply _vision suffix
* Apply _vision suffix
* Apply _vision suffix
* Apply _vision suffix
* Apply _vision suffix
* Apply _vision suffix
* Apply _vision suffix, Move mouth mask to face_masker.py
* Apply _vision suffix
* Apply _vision suffix
* increase forehead padding

---------
Co-authored-by: tamoharu <133945583+tamoharu@users.noreply.github.com>
Co-authored-by: Harisreedhar <46858047+harisreedhar@users.noreply.github.com>
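The "Add landmark 68 to 5 conversion" and "Move convert_face_landmark_68_to_5 to face_helper" commits introduce the 5/68 landmark set that the diff below consumes through convert_face_landmark_68_to_5(). A minimal sketch of such a conversion, assuming the 68 points follow the common iBUG layout (eye contours at indices 36-41 and 42-47, nose tip at 30, mouth corners at 48 and 54); the actual face_helper implementation may differ:

import numpy

def convert_face_landmark_68_to_5(landmark_68 : numpy.ndarray) -> numpy.ndarray:
	# Collapse the 68-point layout to the 5-point layout used by the
	# warp templates: both eye centers, nose tip and both mouth corners.
	face_landmark_5 = numpy.array(
	[
		numpy.mean(landmark_68[36:42], axis = 0), # left eye center
		numpy.mean(landmark_68[42:48], axis = 0), # right eye center
		landmark_68[30],                          # nose tip
		landmark_68[48],                          # left mouth corner
		landmark_68[54]                           # right mouth corner
	])
	return face_landmark_5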
@@ -5,12 +5,13 @@ import numpy
 import onnxruntime
 
 import facefusion.globals
-from facefusion.download import conditional_download
+from facefusion.common_helper import get_first
+from facefusion.face_helper import warp_face_by_face_landmark_5, warp_face_by_translation, create_static_anchors, distance_to_face_landmark_5, distance_to_bounding_box, convert_face_landmark_68_to_5, apply_nms, categorize_age, categorize_gender
 from facefusion.face_store import get_static_faces, set_static_faces
 from facefusion.execution_helper import apply_execution_provider_options
-from facefusion.face_helper import warp_face_by_kps, create_static_anchors, distance_to_kps, distance_to_bbox, apply_nms
+from facefusion.download import conditional_download
 from facefusion.filesystem import resolve_relative_path
-from facefusion.typing import Frame, Face, FaceSet, FaceAnalyserOrder, FaceAnalyserAge, FaceAnalyserGender, ModelSet, Bbox, Kps, Score, Embedding
+from facefusion.typing import VisionFrame, Face, FaceSet, FaceAnalyserOrder, FaceAnalyserAge, FaceAnalyserGender, ModelSet, BoundingBox, FaceLandmarkSet, FaceLandmark5, FaceLandmark68, Score, Embedding
 from facefusion.vision import resize_frame_resolution, unpack_resolution
 
 FACE_ANALYSER = None
@@ -23,6 +24,11 @@ MODELS : ModelSet =\
 		'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/retinaface_10g.onnx',
 		'path': resolve_relative_path('../.assets/models/retinaface_10g.onnx')
 	},
+	'face_detector_yoloface':
+	{
+		'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/yoloface_8n.onnx',
+		'path': resolve_relative_path('../.assets/models/yoloface_8n.onnx')
+	},
 	'face_detector_yunet':
 	{
 		'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/yunet_2023mar.onnx',
@@ -43,6 +49,16 @@ MODELS : ModelSet =\
 		'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/arcface_simswap.onnx',
 		'path': resolve_relative_path('../.assets/models/arcface_simswap.onnx')
 	},
+	'face_recognizer_arcface_uniface':
+	{
+		'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/arcface_w600k_r50.onnx',
+		'path': resolve_relative_path('../.assets/models/arcface_w600k_r50.onnx')
+	},
+	'face_landmarker':
+	{
+		'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/2dfan4.onnx',
+		'path': resolve_relative_path('../.assets/models/2dfan4.onnx')
+	},
 	'gender_age':
 	{
 		'url': 'https://github.com/facefusion/facefusion-assets/releases/download/models/gender_age.onnx',
@@ -58,6 +74,8 @@ def get_face_analyser() -> Any:
 	if FACE_ANALYSER is None:
 		if facefusion.globals.face_detector_model == 'retinaface':
 			face_detector = onnxruntime.InferenceSession(MODELS.get('face_detector_retinaface').get('path'), providers = apply_execution_provider_options(facefusion.globals.execution_providers))
+		if facefusion.globals.face_detector_model == 'yoloface':
+			face_detector = onnxruntime.InferenceSession(MODELS.get('face_detector_yoloface').get('path'), providers = apply_execution_provider_options(facefusion.globals.execution_providers))
 		if facefusion.globals.face_detector_model == 'yunet':
 			face_detector = cv2.FaceDetectorYN.create(MODELS.get('face_detector_yunet').get('path'), '', (0, 0))
 		if facefusion.globals.face_recognizer_model == 'arcface_blendswap':
@@ -66,11 +84,15 @@ def get_face_analyser() -> Any:
 			face_recognizer = onnxruntime.InferenceSession(MODELS.get('face_recognizer_arcface_inswapper').get('path'), providers = apply_execution_provider_options(facefusion.globals.execution_providers))
 		if facefusion.globals.face_recognizer_model == 'arcface_simswap':
 			face_recognizer = onnxruntime.InferenceSession(MODELS.get('face_recognizer_arcface_simswap').get('path'), providers = apply_execution_provider_options(facefusion.globals.execution_providers))
+		if facefusion.globals.face_recognizer_model == 'arcface_uniface':
+			face_recognizer = onnxruntime.InferenceSession(MODELS.get('face_recognizer_arcface_uniface').get('path'), providers = apply_execution_provider_options(facefusion.globals.execution_providers))
+		face_landmarker = onnxruntime.InferenceSession(MODELS.get('face_landmarker').get('path'), providers = apply_execution_provider_options(facefusion.globals.execution_providers))
 		gender_age = onnxruntime.InferenceSession(MODELS.get('gender_age').get('path'), providers = apply_execution_provider_options(facefusion.globals.execution_providers))
 		FACE_ANALYSER =\
 		{
 			'face_detector': face_detector,
 			'face_recognizer': face_recognizer,
+			'face_landmarker': face_landmarker,
 			'gender_age': gender_age
 		}
 	return FACE_ANALYSER
@@ -88,47 +110,36 @@ def pre_check() -> bool:
 	model_urls =\
 	[
 		MODELS.get('face_detector_retinaface').get('url'),
+		MODELS.get('face_detector_yoloface').get('url'),
 		MODELS.get('face_detector_yunet').get('url'),
 		MODELS.get('face_recognizer_arcface_blendswap').get('url'),
 		MODELS.get('face_recognizer_arcface_inswapper').get('url'),
 		MODELS.get('face_recognizer_arcface_simswap').get('url'),
-		MODELS.get('gender_age').get('url')
+		MODELS.get('face_recognizer_arcface_uniface').get('url'),
+		MODELS.get('face_landmarker').get('url'),
+		MODELS.get('gender_age').get('url'),
 	]
 	conditional_download(download_directory_path, model_urls)
 	return True
 
 
-def extract_faces(frame : Frame) -> List[Face]:
-	face_detector_width, face_detector_height = unpack_resolution(facefusion.globals.face_detector_size)
-	frame_height, frame_width, _ = frame.shape
-	temp_frame = resize_frame_resolution(frame, face_detector_width, face_detector_height)
-	temp_frame_height, temp_frame_width, _ = temp_frame.shape
-	ratio_height = frame_height / temp_frame_height
-	ratio_width = frame_width / temp_frame_width
-	if facefusion.globals.face_detector_model == 'retinaface':
-		bbox_list, kps_list, score_list = detect_with_retinaface(temp_frame, temp_frame_height, temp_frame_width, face_detector_height, face_detector_width, ratio_height, ratio_width)
-		return create_faces(frame, bbox_list, kps_list, score_list)
-	elif facefusion.globals.face_detector_model == 'yunet':
-		bbox_list, kps_list, score_list = detect_with_yunet(temp_frame, temp_frame_height, temp_frame_width, ratio_height, ratio_width)
-		return create_faces(frame, bbox_list, kps_list, score_list)
-	return []
-
-
-def detect_with_retinaface(temp_frame : Frame, temp_frame_height : int, temp_frame_width : int, face_detector_height : int, face_detector_width : int, ratio_height : float, ratio_width : float) -> Tuple[List[Bbox], List[Kps], List[Score]]:
+def detect_with_retinaface(vision_frame : VisionFrame, face_detector_size : str) -> Tuple[List[BoundingBox], List[FaceLandmark5], List[Score]]:
 	face_detector = get_face_analyser().get('face_detector')
-	bbox_list = []
-	kps_list = []
-	score_list = []
+	face_detector_width, face_detector_height = unpack_resolution(face_detector_size)
+	temp_vision_frame = resize_frame_resolution(vision_frame, face_detector_width, face_detector_height)
+	ratio_height = vision_frame.shape[0] / temp_vision_frame.shape[0]
+	ratio_width = vision_frame.shape[1] / temp_vision_frame.shape[1]
 	feature_strides = [ 8, 16, 32 ]
 	feature_map_channel = 3
 	anchor_total = 2
-	prepare_frame = numpy.zeros((face_detector_height, face_detector_width, 3))
-	prepare_frame[:temp_frame_height, :temp_frame_width, :] = temp_frame
-	temp_frame = (prepare_frame - 127.5) / 128.0
-	temp_frame = numpy.expand_dims(temp_frame.transpose(2, 0, 1), axis = 0).astype(numpy.float32)
+	bounding_box_list = []
+	face_landmark5_list = []
+	score_list = []
+
 	with THREAD_SEMAPHORE:
 		detections = face_detector.run(None,
 		{
-			face_detector.get_inputs()[0].name: temp_frame
+			face_detector.get_inputs()[0].name: prepare_detect_frame(temp_vision_frame, face_detector_size)
 		})
 	for index, feature_stride in enumerate(feature_strides):
 		keep_indices = numpy.where(detections[index] >= facefusion.globals.face_detector_score)[0]
@@ -136,63 +147,119 @@ def detect_with_retinaface(temp_frame : Frame, temp_fra
 			stride_height = face_detector_height // feature_stride
 			stride_width = face_detector_width // feature_stride
 			anchors = create_static_anchors(feature_stride, anchor_total, stride_height, stride_width)
-			bbox_raw = detections[index + feature_map_channel] * feature_stride
-			kps_raw = detections[index + feature_map_channel * 2] * feature_stride
-			for bbox in distance_to_bbox(anchors, bbox_raw)[keep_indices]:
-				bbox_list.append(numpy.array(
+			bounding_box_raw = detections[index + feature_map_channel] * feature_stride
+			face_landmark_5_raw = detections[index + feature_map_channel * 2] * feature_stride
+			for bounding_box in distance_to_bounding_box(anchors, bounding_box_raw)[keep_indices]:
+				bounding_box_list.append(numpy.array(
 				[
-					bbox[0] * ratio_width,
-					bbox[1] * ratio_height,
-					bbox[2] * ratio_width,
-					bbox[3] * ratio_height
+					bounding_box[0] * ratio_width,
+					bounding_box[1] * ratio_height,
+					bounding_box[2] * ratio_width,
+					bounding_box[3] * ratio_height
 				]))
-			for kps in distance_to_kps(anchors, kps_raw)[keep_indices]:
-				kps_list.append(kps * [ ratio_width, ratio_height ])
+			for face_landmark5 in distance_to_face_landmark_5(anchors, face_landmark_5_raw)[keep_indices]:
+				face_landmark5_list.append(face_landmark5 * [ ratio_width, ratio_height ])
 			for score in detections[index][keep_indices]:
 				score_list.append(score[0])
-	return bbox_list, kps_list, score_list
+	return bounding_box_list, face_landmark5_list, score_list
 
 
-def detect_with_yunet(temp_frame : Frame, temp_frame_height : int, temp_frame_width : int, ratio_height : float, ratio_width : float) -> Tuple[List[Bbox], List[Kps], List[Score]]:
+def detect_with_yoloface(vision_frame : VisionFrame, face_detector_size : str) -> Tuple[List[BoundingBox], List[FaceLandmark5], List[Score]]:
 	face_detector = get_face_analyser().get('face_detector')
-	face_detector.setInputSize((temp_frame_width, temp_frame_height))
-	face_detector.setScoreThreshold(facefusion.globals.face_detector_score)
-	bbox_list = []
-	kps_list = []
+	face_detector_width, face_detector_height = unpack_resolution(face_detector_size)
+	temp_vision_frame = resize_frame_resolution(vision_frame, face_detector_width, face_detector_height)
+	ratio_height = vision_frame.shape[0] / temp_vision_frame.shape[0]
+	ratio_width = vision_frame.shape[1] / temp_vision_frame.shape[1]
+	bounding_box_list = []
+	face_landmark5_list = []
 	score_list = []
+
 	with THREAD_SEMAPHORE:
-		_, detections = face_detector.detect(temp_frame)
+		detections = face_detector.run(None,
+		{
+			face_detector.get_inputs()[0].name: prepare_detect_frame(temp_vision_frame, face_detector_size)
+		})
+	detections = numpy.squeeze(detections).T
+	bounding_box_raw, score_raw, face_landmark_5_raw = numpy.split(detections, [ 4, 5 ], axis = 1)
+	keep_indices = numpy.where(score_raw > facefusion.globals.face_detector_score)[0]
+	if keep_indices.any():
+		bounding_box_raw, face_landmark_5_raw, score_raw = bounding_box_raw[keep_indices], face_landmark_5_raw[keep_indices], score_raw[keep_indices]
+		for bounding_box in bounding_box_raw:
+			bounding_box_list.append(numpy.array(
+			[
+				(bounding_box[0] - bounding_box[2] / 2) * ratio_width,
+				(bounding_box[1] - bounding_box[3] / 2) * ratio_height,
+				(bounding_box[0] + bounding_box[2] / 2) * ratio_width,
+				(bounding_box[1] + bounding_box[3] / 2) * ratio_height
+			]))
+		face_landmark_5_raw[:, 0::3] = (face_landmark_5_raw[:, 0::3]) * ratio_width
+		face_landmark_5_raw[:, 1::3] = (face_landmark_5_raw[:, 1::3]) * ratio_height
+		for face_landmark_5 in face_landmark_5_raw:
+			face_landmark5_list.append(numpy.array(face_landmark_5.reshape(-1, 3)[:, :2]))
+		score_list = score_raw.ravel().tolist()
+	return bounding_box_list, face_landmark5_list, score_list
+
+
+def detect_with_yunet(vision_frame : VisionFrame, face_detector_size : str) -> Tuple[List[BoundingBox], List[FaceLandmark5], List[Score]]:
+	face_detector = get_face_analyser().get('face_detector')
+	face_detector_width, face_detector_height = unpack_resolution(face_detector_size)
+	temp_vision_frame = resize_frame_resolution(vision_frame, face_detector_width, face_detector_height)
+	ratio_height = vision_frame.shape[0] / temp_vision_frame.shape[0]
+	ratio_width = vision_frame.shape[1] / temp_vision_frame.shape[1]
+	bounding_box_list = []
+	face_landmark5_list = []
+	score_list = []
+
+	face_detector.setInputSize((temp_vision_frame.shape[1], temp_vision_frame.shape[0]))
+	face_detector.setScoreThreshold(facefusion.globals.face_detector_score)
+	with THREAD_SEMAPHORE:
+		_, detections = face_detector.detect(temp_vision_frame)
 	if detections.any():
 		for detection in detections:
-			bbox_list.append(numpy.array(
+			bounding_box_list.append(numpy.array(
 			[
 				detection[0] * ratio_width,
 				detection[1] * ratio_height,
 				(detection[0] + detection[2]) * ratio_width,
 				(detection[1] + detection[3]) * ratio_height
 			]))
-			kps_list.append(detection[4:14].reshape((5, 2)) * [ ratio_width, ratio_height])
+			face_landmark5_list.append(detection[4:14].reshape((5, 2)) * [ ratio_width, ratio_height ])
 			score_list.append(detection[14])
-	return bbox_list, kps_list, score_list
+	return bounding_box_list, face_landmark5_list, score_list
 
 
-def create_faces(frame : Frame, bbox_list : List[Bbox], kps_list : List[Kps], score_list : List[Score]) -> List[Face]:
+def prepare_detect_frame(temp_vision_frame : VisionFrame, face_detector_size : str) -> VisionFrame:
+	face_detector_width, face_detector_height = unpack_resolution(face_detector_size)
+	detect_vision_frame = numpy.zeros((face_detector_height, face_detector_width, 3))
+	detect_vision_frame[:temp_vision_frame.shape[0], :temp_vision_frame.shape[1], :] = temp_vision_frame
+	detect_vision_frame = (detect_vision_frame - 127.5) / 128.0
+	detect_vision_frame = numpy.expand_dims(detect_vision_frame.transpose(2, 0, 1), axis = 0).astype(numpy.float32)
+	return detect_vision_frame
+
+
+def create_faces(vision_frame : VisionFrame, bounding_box_list : List[BoundingBox], face_landmark5_list : List[FaceLandmark5], score_list : List[Score]) -> List[Face]:
 	faces = []
 	if facefusion.globals.face_detector_score > 0:
 		sort_indices = numpy.argsort(-numpy.array(score_list))
-		bbox_list = [ bbox_list[index] for index in sort_indices ]
-		kps_list = [ kps_list[index] for index in sort_indices ]
+		bounding_box_list = [ bounding_box_list[index] for index in sort_indices ]
+		face_landmark5_list = [ face_landmark5_list[index] for index in sort_indices ]
 		score_list = [ score_list[index] for index in sort_indices ]
-		keep_indices = apply_nms(bbox_list, 0.4)
+		keep_indices = apply_nms(bounding_box_list, 0.4)
 		for index in keep_indices:
-			bbox = bbox_list[index]
-			kps = kps_list[index]
+			bounding_box = bounding_box_list[index]
+			face_landmark_68 = detect_face_landmark_68(vision_frame, bounding_box)
+			landmark : FaceLandmarkSet =\
+			{
+				'5': face_landmark5_list[index],
+				'5/68': convert_face_landmark_68_to_5(face_landmark_68),
+				'68': face_landmark_68
+			}
 			score = score_list[index]
-			embedding, normed_embedding = calc_embedding(frame, kps)
-			gender, age = detect_gender_age(frame, bbox)
+			embedding, normed_embedding = calc_embedding(vision_frame, landmark['5/68'])
+			gender, age = detect_gender_age(vision_frame, bounding_box)
 			faces.append(Face(
-				bbox = bbox,
-				kps = kps,
+				bounding_box = bounding_box,
+				landmark = landmark,
 				score = score,
 				embedding = embedding,
 				normed_embedding = normed_embedding,
@@ -202,41 +269,57 @@ def create_faces(frame : Frame, bbox_list : List[Bbox], kps_list : List[Kps], sc
 	return faces
 
 
-def calc_embedding(temp_frame : Frame, kps : Kps) -> Tuple[Embedding, Embedding]:
+def calc_embedding(temp_vision_frame : VisionFrame, face_landmark_5 : FaceLandmark5) -> Tuple[Embedding, Embedding]:
 	face_recognizer = get_face_analyser().get('face_recognizer')
-	crop_frame, matrix = warp_face_by_kps(temp_frame, kps, 'arcface_112_v2', (112, 112))
-	crop_frame = crop_frame.astype(numpy.float32) / 127.5 - 1
-	crop_frame = crop_frame[:, :, ::-1].transpose(2, 0, 1)
-	crop_frame = numpy.expand_dims(crop_frame, axis = 0)
+	crop_vision_frame, matrix = warp_face_by_face_landmark_5(temp_vision_frame, face_landmark_5, 'arcface_112_v2', (112, 112))
+	crop_vision_frame = crop_vision_frame / 127.5 - 1
+	crop_vision_frame = crop_vision_frame[:, :, ::-1].transpose(2, 0, 1).astype(numpy.float32)
+	crop_vision_frame = numpy.expand_dims(crop_vision_frame, axis = 0)
 	embedding = face_recognizer.run(None,
 	{
-		face_recognizer.get_inputs()[0].name: crop_frame
+		face_recognizer.get_inputs()[0].name: crop_vision_frame
 	})[0]
 	embedding = embedding.ravel()
 	normed_embedding = embedding / numpy.linalg.norm(embedding)
 	return embedding, normed_embedding
 
 
-def detect_gender_age(frame : Frame, bbox : Bbox) -> Tuple[int, int]:
+def detect_face_landmark_68(temp_vision_frame : VisionFrame, bounding_box : BoundingBox) -> FaceLandmark68:
+	face_landmarker = get_face_analyser().get('face_landmarker')
+	scale = 195 / numpy.subtract(bounding_box[2:], bounding_box[:2]).max()
+	translation = (256 - numpy.add(bounding_box[2:], bounding_box[:2]) * scale) * 0.5
+	crop_vision_frame, affine_matrix = warp_face_by_translation(temp_vision_frame, translation, scale, (256, 256))
+	crop_vision_frame = crop_vision_frame.transpose(2, 0, 1).astype(numpy.float32) / 255.0
+	face_landmark_68 = face_landmarker.run(None,
+	{
+		face_landmarker.get_inputs()[0].name: [ crop_vision_frame ]
+	})[0]
+	face_landmark_68 = face_landmark_68[:, :, :2][0] / 64
+	face_landmark_68 = face_landmark_68.reshape(1, -1, 2) * 256
+	face_landmark_68 = cv2.transform(face_landmark_68, cv2.invertAffineTransform(affine_matrix))
+	face_landmark_68 = face_landmark_68.reshape(-1, 2)
+	return face_landmark_68
+
+
+def detect_gender_age(temp_vision_frame : VisionFrame, bounding_box : BoundingBox) -> Tuple[int, int]:
 	gender_age = get_face_analyser().get('gender_age')
-	bbox = bbox.reshape(2, -1)
-	scale = 64 / numpy.subtract(*bbox[::-1]).max()
-	translation = 48 - bbox.sum(axis = 0) * 0.5 * scale
-	affine_matrix = numpy.array([[ scale, 0, translation[0] ], [ 0, scale, translation[1] ]])
-	crop_frame = cv2.warpAffine(frame, affine_matrix, (96, 96))
-	crop_frame = crop_frame.astype(numpy.float32)[:, :, ::-1].transpose(2, 0, 1)
-	crop_frame = numpy.expand_dims(crop_frame, axis = 0)
+	bounding_box = bounding_box.reshape(2, -1)
+	scale = 64 / numpy.subtract(*bounding_box[::-1]).max()
+	translation = 48 - bounding_box.sum(axis = 0) * scale * 0.5
+	crop_vision_frame, affine_matrix = warp_face_by_translation(temp_vision_frame, translation, scale, (96, 96))
+	crop_vision_frame = crop_vision_frame[:, :, ::-1].transpose(2, 0, 1).astype(numpy.float32)
+	crop_vision_frame = numpy.expand_dims(crop_vision_frame, axis = 0)
 	prediction = gender_age.run(None,
 	{
-		gender_age.get_inputs()[0].name: crop_frame
+		gender_age.get_inputs()[0].name: crop_vision_frame
 	})[0][0]
 	gender = int(numpy.argmax(prediction[:2]))
 	age = int(numpy.round(prediction[2] * 100))
 	return gender, age
 
 
-def get_one_face(frame : Frame, position : int = 0) -> Optional[Face]:
-	many_faces = get_many_faces(frame)
+def get_one_face(vision_frame : VisionFrame, position : int = 0) -> Optional[Face]:
+	many_faces = get_many_faces(vision_frame)
 	if many_faces:
 		try:
 			return many_faces[position]
@@ -245,52 +328,64 @@ def get_one_face(frame : Frame, position : int = 0) -> Optional[Face]:
 	return None
 
 
-def get_average_face(frames : List[Frame], position : int = 0) -> Optional[Face]:
+def get_average_face(vision_frames : List[VisionFrame], position : int = 0) -> Optional[Face]:
 	average_face = None
 	faces = []
 	embedding_list = []
 	normed_embedding_list = []
-	for frame in frames:
-		face = get_one_face(frame, position)
+
+	for vision_frame in vision_frames:
+		face = get_one_face(vision_frame, position)
 		if face:
 			faces.append(face)
 			embedding_list.append(face.embedding)
 			normed_embedding_list.append(face.normed_embedding)
 	if faces:
+		first_face = get_first(faces)
 		average_face = Face(
-			bbox = faces[0].bbox,
-			kps = faces[0].kps,
-			score = faces[0].score,
+			bounding_box = first_face.bounding_box,
+			landmark = first_face.landmark,
+			score = first_face.score,
 			embedding = numpy.mean(embedding_list, axis = 0),
 			normed_embedding = numpy.mean(normed_embedding_list, axis = 0),
-			gender = faces[0].gender,
-			age = faces[0].age
+			gender = first_face.gender,
+			age = first_face.age
 		)
 	return average_face
 
 
-def get_many_faces(frame : Frame) -> List[Face]:
+def get_many_faces(vision_frame : VisionFrame) -> List[Face]:
+	faces = []
 	try:
-		faces_cache = get_static_faces(frame)
+		faces_cache = get_static_faces(vision_frame)
 		if faces_cache:
 			faces = faces_cache
 		else:
-			faces = extract_faces(frame)
-			set_static_faces(frame, faces)
+			if facefusion.globals.face_detector_model == 'retinaface':
+				bounding_box_list, face_landmark5_list, score_list = detect_with_retinaface(vision_frame, facefusion.globals.face_detector_size)
+				faces = create_faces(vision_frame, bounding_box_list, face_landmark5_list, score_list)
+			if facefusion.globals.face_detector_model == 'yoloface':
+				bounding_box_list, face_landmark5_list, score_list = detect_with_yoloface(vision_frame, facefusion.globals.face_detector_size)
+				faces = create_faces(vision_frame, bounding_box_list, face_landmark5_list, score_list)
+			if facefusion.globals.face_detector_model == 'yunet':
+				bounding_box_list, face_landmark5_list, score_list = detect_with_yunet(vision_frame, facefusion.globals.face_detector_size)
+				faces = create_faces(vision_frame, bounding_box_list, face_landmark5_list, score_list)
+			if faces:
+				set_static_faces(vision_frame, faces)
 		if facefusion.globals.face_analyser_order:
 			faces = sort_by_order(faces, facefusion.globals.face_analyser_order)
 		if facefusion.globals.face_analyser_age:
 			faces = filter_by_age(faces, facefusion.globals.face_analyser_age)
 		if facefusion.globals.face_analyser_gender:
 			faces = filter_by_gender(faces, facefusion.globals.face_analyser_gender)
-		return faces
 	except (AttributeError, ValueError):
-		return []
+		pass
+	return faces
 
 
-def find_similar_faces(frame : Frame, reference_faces : FaceSet, face_distance : float) -> List[Face]:
+def find_similar_faces(reference_faces : FaceSet, vision_frame : VisionFrame, face_distance : float) -> List[Face]:
 	similar_faces : List[Face] = []
-	many_faces = get_many_faces(frame)
+	many_faces = get_many_faces(vision_frame)
+
 	if reference_faces:
 		for reference_set in reference_faces:
@@ -315,17 +410,17 @@ def calc_face_distance(face : Face, reference_face : Face) -> float:
 
 def sort_by_order(faces : List[Face], order : FaceAnalyserOrder) -> List[Face]:
 	if order == 'left-right':
-		return sorted(faces, key = lambda face: face.bbox[0])
+		return sorted(faces, key = lambda face: face.bounding_box[0])
 	if order == 'right-left':
-		return sorted(faces, key = lambda face: face.bbox[0], reverse = True)
+		return sorted(faces, key = lambda face: face.bounding_box[0], reverse = True)
 	if order == 'top-bottom':
-		return sorted(faces, key = lambda face: face.bbox[1])
+		return sorted(faces, key = lambda face: face.bounding_box[1])
 	if order == 'bottom-top':
-		return sorted(faces, key = lambda face: face.bbox[1], reverse = True)
+		return sorted(faces, key = lambda face: face.bounding_box[1], reverse = True)
 	if order == 'small-large':
-		return sorted(faces, key = lambda face: (face.bbox[2] - face.bbox[0]) * (face.bbox[3] - face.bbox[1]))
+		return sorted(faces, key = lambda face: (face.bounding_box[2] - face.bounding_box[0]) * (face.bounding_box[3] - face.bounding_box[1]))
 	if order == 'large-small':
-		return sorted(faces, key = lambda face: (face.bbox[2] - face.bbox[0]) * (face.bbox[3] - face.bbox[1]), reverse = True)
+		return sorted(faces, key = lambda face: (face.bounding_box[2] - face.bounding_box[0]) * (face.bounding_box[3] - face.bounding_box[1]), reverse = True)
 	if order == 'best-worst':
 		return sorted(faces, key = lambda face: face.score, reverse = True)
 	if order == 'worst-best':
@@ -336,13 +431,7 @@ def sort_by_order(faces : List[Face], order : FaceAnalyserOrder) -> List[Face]:
 def filter_by_age(faces : List[Face], age : FaceAnalyserAge) -> List[Face]:
 	filter_faces = []
 	for face in faces:
-		if face.age < 13 and age == 'child':
-			filter_faces.append(face)
-		elif face.age < 19 and age == 'teen':
-			filter_faces.append(face)
-		elif face.age < 60 and age == 'adult':
-			filter_faces.append(face)
-		elif face.age > 59 and age == 'senior':
+		if categorize_age(face.age) == age:
 			filter_faces.append(face)
 	return filter_faces
 
@@ -350,8 +439,6 @@ def filter_by_age(faces : List[Face], age : FaceAnalyserAge) -> List[Face]:
 def filter_by_gender(faces : List[Face], gender : FaceAnalyserGender) -> List[Face]:
 	filter_faces = []
 	for face in faces:
-		if face.gender == 0 and gender == 'female':
-			filter_faces.append(face)
-		if face.gender == 1 and gender == 'male':
+		if categorize_gender(face.gender) == gender:
 			filter_faces.append(face)
 	return filter_faces
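The last two hunks fold the inline age and gender branches into categorize_age() and categorize_gender() from face_helper. Judging only by the thresholds removed above (child under 13, teen under 19, adult under 60, senior otherwise; gender 0 treated as female, 1 as male), the helpers plausibly reduce to this sketch; the actual face_helper implementation may differ:

def categorize_age(age : int) -> str:
	# Thresholds mirror the branches removed from filter_by_age().
	if age < 13:
		return 'child'
	if age < 19:
		return 'teen'
	if age < 60:
		return 'adult'
	return 'senior'


def categorize_gender(gender : int) -> str:
	# The removed filter_by_gender() branches treated 0 as female and 1 as male.
	if gender == 0:
		return 'female'
	return 'male'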
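For orientation, note that find_similar_faces() now takes reference_faces first and every frame argument is named vision_frame. A hypothetical call site under the new signatures; the 'origin' key, the image paths and the 0.6 distance are illustrative only, and FaceSet is assumed to map names to face lists, as the loop over reference_faces above suggests:

import cv2

from facefusion.face_analyser import get_one_face, find_similar_faces

# Hypothetical usage; inputs are placeholders.
source_frame = cv2.imread('source.jpg')
target_frame = cv2.imread('target.jpg')
source_face = get_one_face(source_frame)
if source_face:
	reference_faces = { 'origin': [ source_face ] }
	similar_faces = find_similar_faces(reference_faces, target_frame, 0.6)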