Dimensionality Reduction and Variable Selection in Multivariate Varying-Coefficient Models with a Large Number of Covariates

Dr Heng Lian
Date & Time
13 Jul 2016 (Wed) | 10:30 AM - 11:30 AM
Venue
B6605, AC1

ABSTRACT

Motivated by the study of gene and environment interactions, we consider a multivariate response varying-coefficient model with a large number of covariates. The need to estimate a large number of coefficient functions nonparametrically from relatively limited data poses a major challenge for fitting such a model. To overcome this challenge, we develop a method that incorporates three ideas: (i) approximate the unknown functions by polynomial splines; (ii) reduce the number of unknown functions to be estimated by using (non-centered) principal components; (iii) apply sparsity-inducing penalization to select relevant covariates. The three ideas are integrated into a penalized least squares framework. Our asymptotic theory shows that the proposed method can consistently identify relevant covariates and can estimate the nonzero functions with the same convergence rate as when only the relevant variables are included in the model. We also develop a novel computational algorithm to solve the penalized least squares problem by combining proximal algorithms and optimization over Stiefel manifolds. Our method is illustrated using data from the Framingham Heart Study.
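
For readers unfamiliar with the two computational ingredients named in the abstract, the following is a minimal sketch of generic building blocks only, not the speaker's algorithm. It assumes a group-lasso style penalty on the rows of a coefficient matrix (one row per covariate) and uses a QR-based retraction to return a matrix to the Stiefel manifold, that is, the set of matrices with orthonormal columns. The function names group_soft_threshold and qr_retraction are hypothetical and chosen here for illustration.

```python
import numpy as np


def group_soft_threshold(B, lam):
    """Row-wise proximal operator of lam * sum_j ||B[j, :]||_2.

    Assumption for illustration: each row of B collects the coefficients
    of one covariate, so a row shrunk to zero drops that covariate.
    """
    norms = np.linalg.norm(B, axis=1, keepdims=True)
    # Shrink each row toward zero; rows with norm <= lam become exactly zero.
    scale = np.maximum(0.0, 1.0 - lam / np.maximum(norms, 1e-12))
    return scale * B


def qr_retraction(A):
    """Map a full-column-rank matrix to one with orthonormal columns.

    This is one common retraction onto the Stiefel manifold; the sign fix
    makes the result unique by forcing a positive diagonal in R.
    """
    Q, R = np.linalg.qr(A)
    signs = np.sign(np.diag(R))
    signs[signs == 0] = 1.0
    return Q * signs
```

In a proximal scheme of this general kind, a gradient step on the least squares loss would typically be followed by group_soft_threshold for the penalized block and by qr_retraction for the orthonormality-constrained block. How the talk's method actually combines these steps is not stated in the abstract.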