Genomic Data Search and Analytics with Applications to Colorectal Cancer Subtype Classification

Principal Investigator:


Dr Ka Chun WONG
Assistant Professor, Department of Computer Science

Co-Principal Investigators:

  • Dr Xin WANG (Associate Head and Associate Professor, Department of Biomedical Sciences)
  • Dr Jun LI (Assistant Professor, Department of Infectious Diseases and Public Health; Affiliate Assistant Professor, School of Data Science)
  • Prof Zhaolei ZHANG (Assistant Professor, Associate Professor, Full Professor, Donnelly Centre for Cellular and Biomolecular Research, Department of Molecular Genetics, University of Toronto Faculty of Medicine)
  • Prof Jun YU (Professor, Department of Medicine and Therapeutics; Director, Research Laboratory of Digestive Disease; Associate Director, State Key Laboratory of Digestive Disease, The Chinese University of Hong Kong)

Project Period: 1 January 2020 – 31 December 2021

According to the Hong Kong Cancer Registry, colorectal cancer is the most common cancer in Hong Kong. Fortunately, high-throughput sequencing technologies have enabled the generation of a deluge of colorectal cancer genome data. The volume of such data is unprecedented and should be exploited through in-depth search and analysis. Although existing colorectal cancer studies have already revealed candidate genes of interest, current analyses are usually limited to classic methods, which may underestimate the complex combinatorics of colorectal cancer prognosis. Efficient search and analytics algorithms that exploit genomics domain knowledge need to be developed to harness those genome data and recover any missing genomic insights before we can confidently move into the era of personalized medicine while avoiding side effects.

This collaborative research project extends existing strong collaborations and builds upon the project team’s well-published preliminary work in bioinformatics to form an interdisciplinary team that develops efficient algorithms for searching and analysing sequencing data, with applications to colorectal cancer subtype classification.

The project aims to achieve the following research objectives:

  1. To develop computationally efficient search algorithms to search a genomic signature against the exponentially growing genome data for retrieving similar genomic signatures within minutes on affordable computing machines.
  2. To develop computationally efficient data analytics algorithms to discover novel genomic signatures from the exponentially growing genome data within minutes on affordable computing machines.
  3. To apply the efficient search and data analytics algorithms to reveal novel insights into colorectal cancer genome regulation patterns which can improve colorectal cancer subtype detection and classification.
  4. To release the algorithms and related datasets as open-source software and public databases to the scientific community, fostering long-term sustainability and promoting scientific reproducibility in CityU data science research.

Publications

  1. Reactions’ descriptors selection and yield estimation using metaheuristic algorithms and voting ensemble
    Petinrin, O. O., Saeed, F., Li, X., Ghabban, F. & Wong, K. C., Oct 2022, In: Computers, Materials and Continua. 70, 3, p. 4745-4762
  2. Colorectal Cancer Subtype Identification from Differential Gene Expression Levels using Minimalist Deep Learning
    Li, S., Yang, Y., Wang, X., Li, J., Yu, J., Li, X. & Wong, K., Apr 2022, In: BioData Mining. 15, 12
  3. Particle Swarm Optimized Gaussian Process Classifier for Treatment Discontinuation Prediction in Multicohort Metastatic Castration-Resistant Prostate Cancer Patients
    Petinrin, O. O., Li, X. & Wong, K., Mar 2022, In: IEEE Journal of Biomedical and Health Informatics. 26, 3, p.
  4. EGFI : drug-drug interaction extraction and generation with fusion of enriched entity and sentence information
    Huang, L., Lin, J., Li, X., Song, L., Zheng, Z. & Wong, K. C., Jan 2022, In: Briefings in Bioinformatics. 23, 1, bbab451.
  5. Metric Learning Based Vision Transformer for Product Matching
    Huang, L., Shao, W., Wang, F., Xie, W. & Wong, K., Dec 2021, In: Neural Information Processing: 28th International Conference, ICONIP 2021, Sanur, Bali, Indonesia, December 8–12, 2021, Proceedings, Part I 28 (pp. 3-13). Springer International Publishing.
  6. Multiobjective Genome-Wide RNA-Binding Event Identification from CLIP-seq Data
    Li, X., Zhang, S. & Wong, K., Dec 2021, In: IEEE Transactions on Cybernetics. 51, 12, p. 5811-5824
  7. CancerEMC : frontline non-invasive cancer screening from circulating protein biomarkers and mutations in cell-free DNA
    Rahaman, S., Li, X., Yu, J. & Wong, K., Oct 2021, In: Bioinformatics. 37, 19, p. 3319–3327
  8. Early cancer detection from genome-wide cell-free DNA fragmentation via shuffled frog leaping algorithm and support vector machine
    Liu, L., Chen, X. & Wong, K., Oct 2021, In: Bioinformatics. 37, 19, p. 3099–3105
  9. Nature-Inspired Compressed Sensing for Transcriptomic Profiling From Random Composite Measurements
    Zhang, S., Li, X., Lin, Q. & Wong, K., Sep 2021, In: IEEE Transactions on Cybernetics. 51, 9, p. 4476-4487
  10. Identification of pan-cancer Ras pathway activation with deep learning
    Li, X., Li, S., Wang, Y., Zhang, S. & Wong, K., Jul 2021, In: Briefings in Bioinformatics. 22, 4, bbaa258.
  11. Machine Learning Protocols in Early Cancer Detection Based on Liquid Biopsy : A Survey
    Liu, L., Chen, X., Petinrin, O. O., Zhang, W., Rahaman, S., Tang, Z. & Wong, K., Jul 2021, In: Life. 11, 7, 638
  12. RNCE : network integration with reciprocal neighbors contextual encoding for multi-modal drug community study on cancer targets
    Chen, J. & Wong, K., May 2021, In: Briefings in Bioinformatics. 22, 3, bbaa118.
  13. Noninvasive early diagnosis of intestinal diseases based on artificial intelligence in genomics and microbiome
    Zhang, W., Chen, X. & Wong, K., Apr 2021, In: Journal of Gastroenterology and Hepatology (Australia). 36, 4, p. 823-831
  14. CRISPR-Net : A Recurrent Convolutional Network Quantifies CRISPR Off‐Target Activities with Mismatches and Indels
    Lin, J., Zhang, Z., Zhang, S., Chen, J. & Wong, K., Jul 2020, In: Advanced Science. 7, 13, p. 1903562