Opportunity
Continual learning aims to train machine learning models on a sequence of tasks without forgetting previously acquired knowledge. Existing methods face significant challenges. Regularization-based methods add penalties that preserve important weights but suffer from a stability-plasticity dilemma: as the number of tasks grows, forgetting worsens. Structure-based methods dynamically add neurons or parameters for each new task, which becomes computationally expensive as tasks accumulate. Replay-based methods store samples from previous tasks in a buffer and rehearse them while training on new tasks, but they lack mechanisms to explicitly model relationships between tasks. Moreover, most approaches operate at the instance level without capturing cross-task information, leaving general and task-specific knowledge entangled in the learned features. There is a need for a method that decouples task-invariant features (general knowledge shared across tasks) from task-variant features (task-specific knowledge), enabling better knowledge transfer and reducing forgetting.
Technology
This patent presents a system that models task features by decoupling them into task-invariant and task-variant components using multi-level contrastive learning. A task input gateway receives a stream of task samples. A feature extraction engine computes features for each sample, while a replay buffer stores representative samples from previous tasks. The core innovation is the Knowledge Decomposition Constraint (KDC) module, which constructs inner-task relations (within the current task) and cross-task relations (between the current task and previous tasks drawn from the replay buffer) using three types of contrastive learning pairs.
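A minimal sketch of these components in PyTorch may help fix ideas; it assumes a shared backbone with two projection heads and a reservoir-style buffer, and every class name, method name, and dimension here is illustrative rather than taken from the patent:

```python
import random
import torch.nn as nn

class DecoupledExtractor(nn.Module):
    """Shared backbone with two heads: one producing task-invariant
    features, one producing task-variant features (illustrative)."""
    def __init__(self, in_dim=784, hid=256, feat_dim=128):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, hid), nn.ReLU())
        self.invariant_head = nn.Linear(hid, feat_dim)  # general knowledge
        self.variant_head = nn.Linear(hid, feat_dim)    # task-specific knowledge

    def forward(self, x):
        h = self.backbone(x)
        return self.invariant_head(h), self.variant_head(h)

class ReplayBuffer:
    """Fixed-size reservoir of (sample, label, task_id) triples,
    refreshed after each task as described above."""
    def __init__(self, capacity=500):
        self.capacity, self.data, self.seen = capacity, [], 0

    def add(self, x, y, task_id):
        self.seen += 1
        if len(self.data) < self.capacity:
            self.data.append((x, y, task_id))
        else:
            j = random.randrange(self.seen)  # reservoir sampling
            if j < self.capacity:
                self.data[j] = (x, y, task_id)

    def sample(self, k):
        return random.sample(self.data, min(k, len(self.data)))
```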
First, negative pairs push task-invariant and task-variant features apart to establish a clear boundary between the two. Second, positive pairs pull all task-invariant features together, regardless of class or task, making them indistinguishable across tasks and therefore generalizable. Third, variant-positive pairs gather task-variant features of the same class while pushing apart those of different classes. An orthogonal distance loss further enlarges the separation between the invariant and variant feature spaces. Cross-task relations extend these pairs with features from replayed samples, explicitly modeling connections between current and past tasks. The decoupled features are concatenated and fed to task-specific classifiers, and a discriminator ensures the invariant features remain task-agnostic. The system updates the replay buffer after each task.
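The three pair types and the orthogonality term can be sketched as follows, again in illustrative PyTorch; the cosine-similarity formulation, the equal loss weights, and the temperature are assumptions, not the patent's exact formulas:

```python
import torch
import torch.nn.functional as F

def kdc_loss(z_inv, z_var, labels, tau=0.1):
    """Sketch of the Knowledge Decomposition Constraint losses.
    z_inv / z_var: (N, D) invariant / variant features for a batch
    mixing current-task and replayed samples (N >= 2 assumed);
    labels: (N,) class ids. All weightings are illustrative."""
    z_inv = F.normalize(z_inv, dim=1)
    z_var = F.normalize(z_var, dim=1)

    # 1) Negative pairs: push each sample's invariant and variant
    #    features apart to draw a boundary between the two spaces.
    l_neg = (z_inv * z_var).sum(dim=1).mean()

    # 2) Positive pairs: pull all invariant features together,
    #    regardless of class or task, so they stay task-agnostic.
    l_pos = (1.0 - z_inv @ z_inv.t()).mean()

    # 3) Variant-positive pairs: supervised contrastive term on the
    #    variant features -- same class attracts, different repels.
    sim = z_var @ z_var.t() / tau
    n = labels.size(0)
    eye = torch.eye(n, dtype=torch.bool, device=labels.device)
    pos_mask = (labels.unsqueeze(0) == labels.unsqueeze(1)) & ~eye
    log_prob = sim - torch.logsumexp(
        sim.masked_fill(eye, float('-inf')), dim=1, keepdim=True)
    denom = pos_mask.sum(dim=1).clamp(min=1)
    l_var = -((log_prob * pos_mask).sum(dim=1) / denom).mean()

    # 4) Orthogonal distance loss: penalize residual overlap between
    #    the invariant and variant feature sets to enlarge separation.
    l_orth = (z_inv @ z_var.t()).pow(2).mean()

    return l_neg + l_pos + l_var + l_orth
```

Because each batch would mix current-task samples with replayed ones, the same pair constructions realize both the inner-task and cross-task relations; the discriminator and classification losses would then be added on top of this term.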
Advantages
- Effective Feature Decoupling: Three types of contrastive pairs explicitly separate invariant (general) from variant (task-specific) features, improving knowledge transfer.
- Cross-Task Relation Modeling: Replay buffer samples are used to construct cross-task contrastive pairs, directly alleviating catastrophic forgetting.
- Stability-Plasticity Balance: Clearly separating general from task-specific features lets the model adapt to new tasks while preserving shared knowledge, mitigating the trade-off that limits regularization methods.
- Computationally Efficient: No dynamic network expansion; a fixed architecture with a replay buffer keeps resource consumption low.
- Improved Classification Performance: Concatenating decoupled invariant and variant features provides richer representations for task-specific classifiers.
- Demonstrated Generalization: Applicable across continual learning settings (supervised and semi-supervised) and domains such as computer vision.
Applications
- Continual Learning in Computer Vision: Learning new object classes or scenes sequentially without forgetting previous ones (e.g., robotic vision, autonomous driving).
- Natural Language Processing: Task-invariant language understanding with task-specific adapters for sentiment analysis, translation, or question answering.
- Healthcare AI: Learning from sequential patient data or new disease categories without retraining from scratch.
- Recommendation Systems: Adapting to new user behaviors or product categories while retaining knowledge of past preferences.
- Industrial Automation: Incrementally learning new assembly tasks or quality inspection criteria on a single robotic system.
