System and a Method for Detecting Computer-Generated Images

Opportunity

Advances in generative AI models (GANs, diffusion models) and rendering techniques have made computer-generated (CG) images increasingly realistic and difficult to distinguish from natural photographic (PG) images. This poses a serious threat to information authenticity: fake CG images can spread misinformation, win photography and documentary awards under false pretences, and undermine trust in digital media. Traditional detection methods rely on hand-crafted features (texture, color, local binary patterns) that are tedious to design and lack robustness across diverse content and generation methods. Deep learning approaches often overfit to specific datasets and fail to generalize across CG generation modalities (traditional rendering, GANs, diffusion models). Moreover, existing methods neither explicitly model the fundamental acquisition differences between PG images (hardware + software traces) and CG images (software-only traces), nor amplify these subtle discriminative traces for effective learning. A robust, generalizable, trace-aware detection system is therefore urgently needed.

Technology

This patented technology presents a multi-scale deep texture learning network (MDTL-NET) that detects CG images by analyzing intrinsic image traces: texture perturbations, high-frequency residuals, and global spatial traces. The system comprises three synergistic modules. First, a Global Texture Representation Module (GTRM) uses Gram matrix-based activations to capture relationships and differences among multi-scale texture patterns, mimicking gray-level co-occurrence matrix analysis. Second, a Deep Texture Enhancement Module (DTEM) amplifies discriminative trace differences via affine transformations guided by semantic segmentation maps (generated by a deep parsing network) and CNN-based texture recovery, making subtle CG/PG disparities more pronounced: after enhancement, high-frequency components increase for PG images and decrease for CG images. Third, an Attention-Based Feature Perception Module (AFPM) applies sequential channel and spatial attention to explore discriminative traces in both dimensions. Features from the three modules are concatenated and passed through fully connected and softmax layers to output the probability that the input is CG or PG. The system is trained on a large, diverse new dataset (DSGCG) containing traditional CG, GAN-generated, and diffusion-model images alongside PG images from various cameras, resolutions, and scenes.
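Two of the building blocks above have simple, well-known cores that can be sketched in a few lines: the Gram matrix that summarizes pairwise channel correlations of a feature map (the statistic the GTRM builds on), and sequential channel-then-spatial attention (the pattern the AFPM follows). The following numpy sketch is illustrative only; the function names, pooling choices, and sigmoid gating are assumptions, not the patented architecture.

```python
import numpy as np

def gram_matrix(features: np.ndarray) -> np.ndarray:
    """Gram matrix of a (C, H, W) feature map: pairwise channel
    correlations that summarize texture, the statistic behind the GTRM."""
    c, h, w = features.shape
    f = features.reshape(c, h * w)
    return f @ f.T / (c * h * w)  # normalized C x C texture descriptor

def channel_spatial_attention(features: np.ndarray) -> np.ndarray:
    """Sequential channel-then-spatial attention, AFPM-style sketch.
    Channel weights come from global average pooling; spatial weights
    from the channel-wise mean; both are squashed with a sigmoid."""
    sigmoid = lambda x: 1.0 / (1.0 + np.exp(-x))
    # channel attention: one weight per channel, broadcast over H x W
    chan = sigmoid(features.mean(axis=(1, 2)))[:, None, None]
    x = features * chan
    # spatial attention: one weight per pixel, broadcast over channels
    spat = sigmoid(x.mean(axis=0))[None, :, :]
    return x * spat
```

In a full network these descriptors would be computed on learned CNN feature maps at several scales and concatenated before the classification head; here plain arrays stand in for those features.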

Advantages

  • Trace-Aware Design: Explicitly models fundamental acquisition differences between PG (hardware+software traces) and CG (software-only), leading to better generalization.
  • Multi-Modal Coverage: Detects traditional CG, GAN-generated, and diffusion model-generated images within a single framework.
  • Texture Difference Amplification: DTEM amplifies subtle discriminative traces, making CG/PG differences more detectable and suppressing content interference.
  • Superior Performance: Achieves 95.38% accuracy on the diverse DSGCG dataset, outperforming variants without enhancement or attention modules.
  • Robust to Post-Processing: Maintains high detection accuracy even when images are compressed (JPEG) or have added noise.
  • Cross-Modal Generalization: When trained on one CG type (e.g., traditional CG) and tested on another (e.g., GAN- or diffusion-generated images), the system still yields strong results, demonstrating its generalizability.
  • Balanced Detection: Achieves comparable True Positive and True Negative rates, avoiding bias toward either image class.
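The high-frequency trace the enhancement module amplifies can be illustrated with a generic high-pass residual measure: after texture enhancement, this energy tends to rise for PG images and fall for CG images. The Laplacian kernel below is a standard stand-in, not the patented DTEM filter.

```python
import numpy as np

def high_frequency_energy(img: np.ndarray) -> float:
    """Mean energy of the high-frequency residual of a grayscale image,
    computed with a simple 3x3 Laplacian high-pass filter. Illustrates
    the kind of trace the DTEM amplifies (hypothetical filter choice)."""
    k = np.array([[0, -1, 0], [-1, 4, -1], [0, -1, 0]], dtype=float)
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):          # valid convolution, no padding
        for j in range(w - 2):
            out[i, j] = np.sum(img[i:i + 3, j:j + 3] * k)
    return float(np.mean(out ** 2))
```

A flat image yields zero residual energy, while sensor-like noise or fine texture yields a positive value; comparing this statistic before and after enhancement is one way to visualize the PG/CG separation the module exploits.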

Applications

  • Digital Media Forensics: Authenticating news images, documentary photos, and user-generated content to detect AI-generated fakes.
  • Social Media Content Moderation: Flagging synthetic images that could spread misinformation or deepfakes.
  • Legal & Evidence Verification: Validating the authenticity of photographic evidence submitted in court or investigations.
  • Cybersecurity & Fraud Prevention: Detecting fake profile pictures, synthetic identity documents, or AI-generated phishing images.
  • Content Authenticity Certification: Providing an automated tool for image authenticity verification services and digital rights management.

Remarks

CMDA: P00096

IP Status

Patent filed

Technology Readiness Level (TRL)

4

Questions about this Technology?

Contact Our Tech Manager