System for Image Processing, a Method of Facial Expression Detection and a Method of Reconstruction Image Generation


Opportunity  

Facial micro-expressions (FMEs) are brief, involuntary facial movements that reveal a person's true emotions, often when they are trying to conceal them. This makes FMEs a crucial cue for deception detection, with significant applications in national security, political psychology, and medical care. However, analyzing FMEs is extremely challenging for both humans and automated systems. Unlike macro-expressions, FMEs occur within a very short time window (0.065-0.5 seconds), involve only slight variations, and are confined to limited areas of the face. These characteristics make them nearly impossible to observe without dedicated training.
Furthermore, the development of robust automatic FME analysis is severely hampered by the lack of large, high-quality, ecologically valid databases. While some spontaneous FME databases exist, they are typically created using artificial laboratory paradigms that differ from real-life interpersonal interactions, and building sufficiently large databases through manual annotation is exceedingly time-consuming and labor-intensive. Consequently, data-driven deep learning methods suffer from poor performance due to this data scarcity. Generating new FME samples from existing ones using image animation techniques is a promising solution, but current motion-based generative models often fail to capture the extremely subtle changes characteristic of FMEs, because they do not prioritize the edges and fine-grained details where these micro-movements primarily occur.

Technology  

This patent introduces an Edge-Aware Motion generation method for Facial Micro-Expression Generation (EAM-FMEG), a system and method specifically designed to overcome the challenge of generating realistic facial micro-expression sequences. At its core, the technology innovatively enhances a deep motion retargeting network by integrating edge-awareness mechanisms to capture and amplify subtle facial movements.

The method begins with a facial expression extraction module that receives a source video of a subject performing a specific FME. To detect the minute motions of key facial features (such as the eyebrows or lips), the system performs an auxiliary task: predicting the edges of moving regions on the face. It generates precise edge maps using an extended Difference of Gaussians (XDoG) operator, which serve as a guide for motion estimation, forcing the neural network to focus on the boundaries where subtle changes are most apparent.
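For illustration, a minimal XDoG edge-map sketch is shown below. The parameter values (sigma, k, tau, epsilon, phi) are assumptions chosen for readability, not the settings used in the patented system.

```python
# Minimal XDoG edge-map sketch (NumPy/SciPy). Parameter values are illustrative
# assumptions, not those of the patented method.
import numpy as np
from scipy.ndimage import gaussian_filter

def xdog_edge_map(gray, sigma=0.8, k=1.6, tau=0.98, epsilon=0.1, phi=10.0):
    """Extended Difference of Gaussians: a weighted DoG followed by a
    soft (tanh) threshold, producing a continuous edge map in [0, 1]."""
    img = gray.astype(np.float64)
    g1 = gaussian_filter(img, sigma)        # fine-scale blur
    g2 = gaussian_filter(img, sigma * k)    # coarse-scale blur
    dog = g1 - tau * g2                     # weighted difference of Gaussians
    # Soft threshold: flat response above epsilon, smooth roll-off below it.
    edges = np.where(dog >= epsilon, 1.0, 1.0 + np.tanh(phi * (dog - epsilon)))
    return np.clip(edges, 0.0, 1.0)

# Usage: edge_map = xdog_edge_map(frame_gray / 255.0)
```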

The technology employs an unsupervised deep motion retargeting module that first estimates sparse motion using a set of learned keypoints and their local affine transformations (Jacobians). A dense motion estimator then combines these sparse motions with learned masks to create a comprehensive motion field. The system then generates a warped image of the target face.
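The sketch below illustrates the general idea of composing keypoint-driven flows into a dense motion field and warping the image, in the spirit of first-order motion models. Tensor names, shapes, and the masking scheme are illustrative assumptions rather than the exact parameterisation claimed in the patent.

```python
# Hedged sketch of sparse-to-dense motion composition and warping (PyTorch).
import torch
import torch.nn.functional as F

def make_identity_grid(h, w, device):
    """Normalised sampling grid in [-1, 1], shape (1, H, W, 2)."""
    ys, xs = torch.meshgrid(
        torch.linspace(-1, 1, h, device=device),
        torch.linspace(-1, 1, w, device=device),
        indexing="ij",
    )
    return torch.stack([xs, ys], dim=-1).unsqueeze(0)

def warp_with_sparse_motion(source, kp_flows, masks):
    """
    source:   (B, C, H, W) frame of the target identity.
    kp_flows: (B, K, H, W, 2) per-keypoint flow fields, each derived from a
              learned keypoint and its local affine (Jacobian) transformation.
    masks:    (B, K+1, H, W) softmax masks; index 0 weights the static background.
    Returns the warped image produced by the composed dense motion field.
    """
    b, c, h, w = source.shape
    identity = make_identity_grid(h, w, source.device)              # no-motion grid
    flows = torch.cat([identity.unsqueeze(1).expand(b, 1, h, w, 2),
                       kp_flows], dim=1)                             # (B, K+1, H, W, 2)
    # Dense motion = mask-weighted sum of the background and keypoint flows.
    dense_flow = (masks.unsqueeze(-1) * flows).sum(dim=1)            # (B, H, W, 2)
    return F.grid_sample(source, dense_flow, align_corners=True)
```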

The key innovation lies in an Edge-Intensified Multi-Head Self-Attention (EIMHSA) module. This module takes the warped image and the warped predicted edge map as inputs. It uses the edge information as a "query" signal to search for and assign higher attention weights to edge-associated regions. By focusing on these important facial regions, the module effectively enhances the warped image representation. Finally, an image generator uses this enhanced representation to reconstruct a new, realistic video sequence where a target individual performs the exact same FME, complete with its subtle dynamics, without the target ever having to perform the expression.
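As a rough illustration only, the sketch below shows one plausible way to use an edge map as the attention query over warped image features. The actual EIMHSA module may differ in its projections, normalisation, and head configuration.

```python
# Hedged sketch of an edge-guided multi-head attention block (PyTorch).
# Hyperparameters (channels, heads) are assumptions for illustration.
import torch
import torch.nn as nn

class EdgeGuidedAttention(nn.Module):
    def __init__(self, channels=256, heads=8):
        super().__init__()
        self.edge_proj = nn.Conv2d(1, channels, kernel_size=1)  # lift edge map to feature dim
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, warped_feat, warped_edge):
        """
        warped_feat: (B, C, H, W) features of the warped target image.
        warped_edge: (B, 1, H, W) warped edge map.
        Edge features act as the query, so positions on or near moving
        edges receive higher attention weights in the enhanced output.
        """
        b, c, h, w = warped_feat.shape
        q = self.edge_proj(warped_edge).flatten(2).transpose(1, 2)   # (B, HW, C)
        kv = warped_feat.flatten(2).transpose(1, 2)                  # (B, HW, C)
        enhanced, _ = self.attn(query=q, key=kv, value=kv)
        enhanced = self.norm(enhanced + kv)                          # residual on image features
        return enhanced.transpose(1, 2).reshape(b, c, h, w)
```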

Advantages  

  • Superior Subtlety Capture: The integration of explicit edge prediction and the novel EIMHSA module allows the method to capture and generate significantly finer, more realistic micro-movements compared to baseline models like FOMM.
  • Strong Generalization Ability: The method demonstrates robust cross-database generalization, successfully generating FMEs even when trained on one dataset (e.g., CASME II) and tested on source sequences from a completely different one (e.g., SMIC or SAMM), including across grayscale and RGB domains.
  • Improved Reconstruction Quality: Quantitative results show that the edge-aware components lead to a large decrease in reconstruction loss, indicating higher fidelity in the generated images.
  • Reduced Artifacts: The edge-aware mechanism helps eliminate visual defects and unnatural artifacts common in other FME generation methods, as evidenced by higher expert evaluation scores.
  • Data Augmentation Capability: Provides a powerful tool to synthetically expand small, hard-to-collect FME databases, enabling the training of more robust deep learning models for FME recognition and analysis.

Applications  

  • Security & Deception Detection: Generating synthetic training data for systems designed to detect deceit in airports, border control, and legal proceedings.
  • Mental Healthcare: Creating tools to help clinicians analyze patient micro-expressions for conditions like depression or anxiety, or using the technology to generate expressions for therapeutic exposure.
  • Human-Computer Interaction (HCI): Developing emotionally responsive AI agents, avatars, and robots that can recognize and display authentic, nuanced facial expressions.
  • Market Research & Advertising: Analyzing consumer micro-expressions in response to advertisements or products to gain deeper insights into genuine, subconscious preferences.
  • Automated Driver Assistance: Detecting driver fatigue, distraction, or aggression by analyzing their micro-expressions to improve road safety.
Remarks
CIMDA: P00001
IP Status: Patent filed
Technology Readiness Level (TRL): 4