
Recent Projects

Dr. Hau San WONG

Unsupervised Source-free Domain Adaptation

Deep learning has made impressive progress, driven by large-scale, well-annotated datasets. In addition, real-world applications often assume that the test data follow a distribution similar to that of the training data. However, large-scale, well-annotated datasets are time-consuming and expensive to build, and the distributional assumption is easily violated, so model performance may degrade severely when well-annotated training data are lacking or when the training and test data follow different distributions. To alleviate this degradation, we focus on unsupervised domain adaptation, which leverages the knowledge learned from a labeled training dataset (the source domain) to adapt a deep model to a completely unlabeled test dataset (the target domain). Mainstream domain adaptation methods minimize the discrepancy between the source and target domains within a discriminative modeling framework, but it is difficult to take class-level distributions into account during discrepancy minimization. Unlike these methods, we propose unsupervised source-free domain adaptation methods based on generative models. Generative models are well suited to learning the distribution of the target domain and, based on this knowledge, they generate training data that improves performance on the target domain.
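
The sketch below illustrates one way such a generative adaptation loop could be organized, assuming a source-trained classifier and a class-conditional generator that has already been fitted to the unlabeled target domain. All names, signatures, and hyperparameters (batch size, confidence threshold, learning rate) are hypothetical illustrations, not the published method.

```python
# A minimal sketch, assuming a source-trained `classifier` and a
# class-conditional `generator` already fitted to the target domain.
# `target_loader` is assumed to yield (images, _) batches of unlabeled
# target data; all names and hyperparameters here are hypothetical.
import torch
import torch.nn.functional as F

def adapt_source_free(classifier, generator, target_loader,
                      num_classes=10, z_dim=64, steps=1000, lr=1e-4):
    """Fine-tune the classifier on the target domain without source data."""
    opt = torch.optim.Adam(classifier.parameters(), lr=lr)
    target_iter = iter(target_loader)
    for _ in range(steps):
        # 1) Synthesize labeled target-style samples: because the generator
        #    models the target distribution class-conditionally, its outputs
        #    come with known labels "for free".
        y_fake = torch.randint(0, num_classes, (32,))
        z = torch.randn(32, z_dim)
        x_fake = generator(z, y_fake).detach()

        # 2) Pseudo-label a batch of real unlabeled target data with the
        #    current classifier, keeping only confident predictions.
        try:
            x_real, _ = next(target_iter)
        except StopIteration:
            target_iter = iter(target_loader)
            x_real, _ = next(target_iter)
        with torch.no_grad():
            probs = F.softmax(classifier(x_real), dim=1)
        conf, y_pseudo = probs.max(dim=1)
        mask = conf > 0.9  # hypothetical confidence threshold

        # 3) Update the classifier on generated and pseudo-labeled data.
        loss = F.cross_entropy(classifier(x_fake), y_fake)
        if mask.any():
            loss = loss + F.cross_entropy(classifier(x_real[mask]),
                                          y_pseudo[mask])
        opt.zero_grad()
        loss.backward()
        opt.step()
    return classifier
```

The key property of such a loop is that no source-domain sample appears in it: supervision comes entirely from the generator and from confident pseudo-labels on target data.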

Semi-supervised Pedestrian Detection

Pedestrian detection is a fundamental and critical step in many real-world applications, such as surveillance and autonomous driving, where precise detection of pedestrians is crucial for understanding the target environment. The performance of existing pedestrian detection methods depends heavily on sufficient labeled training data. To reduce this dependence, we propose a semi-supervised pedestrian detection approach for the case where only a small portion of the training data is annotated. Our aim is to improve the quality of the pedestrian instances collected from unannotated data. Toward this end, we design a Generative Adversarial Network (GAN)-based model that refines the collected instances in terms of alignment and completeness of the human body structure. The resulting instances increase the diversity of the training data and ultimately lead to better detection performance in the target scene.
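
As a rough illustration of this refinement idea, the sketch below trains a hypothetical refiner network adversarially against well-aligned crops from the small labeled set, with an identity term that keeps the refined crop close to its input. The architectures, loss weights, and names are placeholder assumptions rather than the published model.

```python
# A minimal sketch, not the published model: a refiner G cleans up
# pedestrian crops harvested from unannotated frames, while a
# discriminator D compares them against crops from the labeled set.
import torch
import torch.nn as nn

class Refiner(nn.Module):
    """Image-to-image generator: noisy crop -> refined crop."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 3, 3, padding=1), nn.Tanh(),
        )

    def forward(self, x):
        return self.net(x)

class Discriminator(nn.Module):
    """Scores whether a crop looks like a well-aligned labeled instance."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 64, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Conv2d(64, 128, 4, stride=2, padding=1), nn.LeakyReLU(0.2),
            nn.Flatten(), nn.LazyLinear(1),
        )

    def forward(self, x):
        return self.net(x)

def refine_step(G, D, opt_g, opt_d, noisy_crops, labeled_crops):
    """One adversarial update; returns the refined instances."""
    bce = nn.BCEWithLogitsLoss()
    refined = G(noisy_crops)

    # Discriminator: labeled crops are "real", refined crops are "fake".
    d_loss = (bce(D(labeled_crops), torch.ones(labeled_crops.size(0), 1)) +
              bce(D(refined.detach()), torch.zeros(noisy_crops.size(0), 1)))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Refiner: fool D while staying close to the input crop, so the body
    # structure is completed and aligned rather than replaced wholesale.
    g_loss = (bce(D(refined), torch.ones(noisy_crops.size(0), 1)) +
              10.0 * (refined - noisy_crops).abs().mean())
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    return refined.detach()
```

The refined crops returned by each step would then be added back to the detector's training pool, which is how the collected instances end up enlarging the labeled training set.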

Fine-grained Image Synthesis with Limited Labeled Data

Generic conditional generative modeling assumes that an adequate amount of labeled training data is available. This assumption does not hold for fine-grained data, since both data acquisition and annotation can be expensive and require extensive expertise. Moreover, the distinctions among categories can be subtle. To capture class semantics precisely while keeping both model design and training simple, we propose a single-stage controllable generative adversarial network. Our model accounts for three factors of variation in generative modeling: class-independent content, cross-class attributes, and class semantics, each associated with a separate variable. The proposed approach delivers a semantically controllable image generation process and high-fidelity synthesized images of fine-grained categories.
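
A minimal sketch of the three-factor conditioning idea is given below: the generator consumes a class-independent content code z, a cross-class attribute code a, and a class label y. Layer sizes, the embedding scheme, and all names are hypothetical illustrations, not the proposed architecture.

```python
# A minimal sketch of three-factor conditioning: content code z,
# attribute code a, and class label y are mapped into one generator.
# All sizes and names here are hypothetical, for illustration only.
import torch
import torch.nn as nn

class ThreeFactorGenerator(nn.Module):
    def __init__(self, num_classes, z_dim=64, a_dim=16, emb_dim=32,
                 img_channels=3):
        super().__init__()
        # Class semantics enter through a learned embedding, so fine-grained
        # categories that differ only subtly still get distinct codes.
        self.class_emb = nn.Embedding(num_classes, emb_dim)
        self.fc = nn.Linear(z_dim + a_dim + emb_dim, 128 * 4 * 4)
        self.deconv = nn.Sequential(
            nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, img_channels, 4, stride=2, padding=1),
            nn.Tanh(),
        )

    def forward(self, z, a, y):
        # Concatenate the three factors into a single conditioning vector.
        h = torch.cat([z, a, self.class_emb(y)], dim=1)
        h = self.fc(h).view(-1, 128, 4, 4)
        return self.deconv(h)  # 32x32 output at these layer sizes

# Controllability in this setup: holding z fixed while varying a (or y)
# changes only the attribute (or class) factor of the synthesized image.
G = ThreeFactorGenerator(num_classes=200)
z = torch.randn(4, 64)
a = torch.randn(4, 16)
y = torch.randint(0, 200, (4,))
imgs = G(z, a, y)  # -> (4, 3, 32, 32)
```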