Opportunity
Low-light image enhancement is a persistent challenge in image processing. Traditional methods such as histogram equalization and Retinex models often produce unnatural results or are computationally expensive. Deep learning approaches improve quality but typically rely on manually pre-set hyperparameters to map low-light inputs to reference images, a relationship that is not one-to-one and therefore limits practical application. Vision transformers, while powerful, demand high computational resources for pixel-level image outputs. Moreover, existing methods cannot adaptively adjust enhanced results using sample-specific properties learned from training data. There is a need for an efficient, plug-and-play enhancement system that can recall dataset-wide normal-light properties at test time to refine low-light image enhancements without extra training or manual parameter tuning.
Technology
This patent presents an external memory-augmented network for low-light image enhancement. The system receives a low-light input image and processes it through a pre-trained transformer-based image enhancer (symmetric encoder-decoder with 4-level pyramid features and skip connections) to generate an initial enhanced image. Meanwhile, a pre-trained ResNet-18 feature generator extracts a query feature from the same input.
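The symmetric encoder-decoder with a 4-level feature pyramid and skip connections can be sketched at the shape level. This is an illustrative toy, not the patented network: the transformer blocks are replaced by identity operations, downsampling by 2x2 average pooling, and upsampling by nearest-neighbor repetition, so only the pyramid-with-skips data flow is shown.

```python
import numpy as np

def downsample(x):
    """Halve spatial resolution via 2x2 average pooling; x has shape (C, H, W)."""
    c, h, w = x.shape
    return x.reshape(c, h // 2, 2, w // 2, 2).mean(axis=(2, 4))

def upsample(x):
    """Double spatial resolution via nearest-neighbor repetition."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def pyramid_enhancer(img, levels=4):
    """Shape-level sketch of a symmetric 4-level encoder-decoder.

    The real model applies transformer blocks at each level; here they
    are omitted so only the pyramid and skip connections are visible.
    """
    skips = []
    x = img
    for _ in range(levels - 1):        # encoder: save skip, go one level down
        skips.append(x)
        x = downsample(x)
    for skip in reversed(skips):       # decoder: go one level up, fuse skip
        x = upsample(x) + skip         # skip connection (element-wise add)
    return x

img = np.random.rand(3, 32, 32)        # toy low-light input, 3 channels
out = pyramid_enhancer(img)
print(out.shape)                        # (3, 32, 32): output matches input size
```

The enhancer produces a full-resolution image, which the memory modules described next then refine.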
An image memory (a plug-and-play external memory) stores a dictionary of memory keys and response values: sample-specific properties extracted from the training data that represent desired normal-light image values. A memory reading module computes the cosine similarity between the query feature and all memory keys and retrieves the most relevant response value.

An adaptive fusion module then combines the retrieved response value with global-average-pooled data from the initial enhanced image: it computes a ratio through element-wise division, concatenates the two inputs, applies a softmax function to derive weight vectors, and generates an adjustment factor. The final enhanced image is produced by applying this factor to the initial output. During training, a memory writing process updates the dictionary whenever the difference between the retrieved value and the desired ground-truth value exceeds a threshold.
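The read, fuse, and write steps above can be sketched as follows. This is a minimal illustration under assumed shapes (per-channel response values, a 1.0 baseline for the identity term in the adjustment factor, and a Euclidean write threshold); the patent's exact weighting and update rules may differ.

```python
import numpy as np

def memory_read(query, keys, values):
    """Return the value whose key is most cosine-similar to the query."""
    q = query / np.linalg.norm(query)
    k = keys / np.linalg.norm(keys, axis=1, keepdims=True)
    idx = int(np.argmax(k @ q))            # best-matching memory slot
    return values[idx], idx

def adaptive_fusion(initial, retrieved):
    """Fuse a retrieved memory value with the pooled initial enhancement.

    Mirrors the described steps: element-wise ratio, concatenation,
    softmax weights, then a per-channel adjustment factor.
    """
    gap = initial.mean(axis=(1, 2))                # global average pooling -> (C,)
    ratio = retrieved / (gap + 1e-8)               # element-wise division
    concat = np.concatenate([gap, retrieved])      # combine the two inputs
    e = np.exp(concat - concat.max())
    w = e / e.sum()                                # softmax weight vector
    w_gap, w_mem = w[:gap.size], w[gap.size:]
    factor = w_gap * 1.0 + w_mem * ratio           # adjustment factor (assumed form)
    return initial * factor[:, None, None]         # apply per channel

def memory_write(query, gt_value, keys, values, threshold=0.5):
    """Overwrite the retrieved slot when it strays too far from ground truth."""
    retrieved, idx = memory_read(query, keys, values)
    if np.linalg.norm(retrieved - gt_value) > threshold:
        keys[idx] = query
        values[idx] = gt_value

rng = np.random.default_rng(0)
keys = rng.standard_normal((8, 4))      # 8 memory slots, 4-dim keys
values = rng.random((8, 3))             # per-channel normal-light values
query = rng.standard_normal(4)
initial = rng.random((3, 16, 16))       # initial enhanced image
final = adaptive_fusion(initial, memory_read(query, keys, values)[0])
print(final.shape)                       # (3, 16, 16)
```

Because the memory is read with a similarity search and written only on large mismatches, it can sit alongside any frozen enhancer as the "plug-and-play" refinement stage described above.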
Advantages
- Plug-and-Play Memory: External memory acts as a drop-in module for any existing enhancement system, improving quality without retraining.
- Adaptive Adjustment: Generates sample-specific adjustment factors using retrieved normal-light properties, avoiding manual hyperparameter tuning.
- Efficient Architecture: Transformer applies self-attention on channel dimensions (complexity O(C²)) rather than spatial dimensions (O(H²W²)), reducing computational cost.
- Recalls Dataset Properties: Memory stores sample-specific relationships across the entire training dataset, preventing "forgetting" during testing.
- Robust Enhancement: Combines initial transformer-based enhancement with memory-guided adaptive fusion for superior results.
- Automatic Update: The memory writing process automatically updates keys and slot ages when the distance between the retrieved value and the ground truth exceeds a threshold.
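The efficiency claim in the list above can be made concrete: applying self-attention across channels produces a C x C attention map whose size is independent of spatial resolution, versus the (HW) x (HW) map of spatial attention. A minimal numpy sketch (with identity query/key/value projections, an assumption made for brevity):

```python
import numpy as np

def channel_attention(x):
    """Self-attention over channels for x of shape (C, H, W).

    Each channel becomes one token of length H*W, so the attention
    matrix is (C, C): quadratic in channels, not in spatial size.
    """
    c, h, w = x.shape
    tokens = x.reshape(c, h * w)              # one token per channel
    q = k = v = tokens                        # identity projections (sketch)
    scores = q @ k.T / np.sqrt(h * w)         # (C, C) similarity scores
    e = np.exp(scores - scores.max(axis=1, keepdims=True))
    attn = e / e.sum(axis=1, keepdims=True)   # row-wise softmax
    out = attn @ v                            # (C, H*W)
    return out.reshape(c, h, w), attn.shape

x = np.random.rand(8, 32, 32)
y, attn_shape = channel_attention(x)
print(attn_shape)                              # (8, 8), regardless of H and W
```

Doubling the image resolution leaves the attention map at 8 x 8, which is the O(C^2) vs. O(H^2 W^2) saving cited above.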
Applications
- Night Photography: Enhancing smartphone or camera images captured in low-light conditions (night scenes, backlit subjects).
- Surveillance Systems: Improving visibility of road surveillance, security camera, or drone footage in poor lighting.
- Autonomous Vehicles: Enhancing low-light images from vehicle cameras for better object detection and navigation.
- Medical Imaging: Improving poorly illuminated endoscopic or microscopic images.
- Terrain & Aerial Survey: Enhancing drone or satellite images captured during dusk, dawn, or overcast conditions.
