Towards Linguistically-motivated Text Readability Assessment for Chinese Learning in Hong Kong

Principal Investigators:

Prof Ding-Xuan ZHOU
Associate Dean and Chair Professor, School of Data Science

Prof Meichun LIU
Head and Professor, Department of Linguistics and Translation

Dr John LEE
Associate Professor, Department of Linguistics and Translation

Dr Zijun ZHANG
Associate Professor, School of Data Science


 Project Period: 1 September 2019 - 31 August 2022

Text readability assessment analyses the difficulty of reading material and identifies the school grades for which it is most suitable.  Current approaches rely largely on surface cues, such as word frequency and sentence length.  Syntactic and semantic complexity, however, also have a significant impact on text difficulty.  This project aims to develop a linguistically-motivated readability assessment system that predicts text readability more accurately.  We are also interested in developing new deep neural network architectures and related deep learning algorithms suitable for Chinese.
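
As a rough illustration of what "surface cues" means in practice, the Python sketch below computes two of them for a Chinese text: the mean sentence length in characters, and the share of characters that fall outside a list of common characters. It is not the project's system. The function name `surface_features` and the tiny `COMMON_CHARS` set are hypothetical, and the character-list check is only a stand-in for a real frequency measure.

```python
import re

# Hypothetical stand-in for a graded or frequency-based character list.
# A real system would load a much larger, empirically derived resource.
COMMON_CHARS = set("我你他的是在有一不了人大小上下")


def surface_features(text, common_chars=COMMON_CHARS):
    """Return two surface cues: mean sentence length and rare-character ratio."""
    # Split on Chinese sentence-final punctuation and drop empty pieces.
    sentences = [s for s in re.split(r"[。！？]", text) if s.strip()]
    # Count only characters in the basic CJK Unified Ideographs block.
    chars = re.findall(r"[\u4e00-\u9fff]", text)
    if not sentences or not chars:
        return {"mean_sentence_length": 0.0, "rare_char_ratio": 0.0}
    rare = sum(1 for c in chars if c not in common_chars)
    return {
        "mean_sentence_length": len(chars) / len(sentences),
        "rare_char_ratio": rare / len(chars),
    }


if __name__ == "__main__":
    print(surface_features("我是人。你在下。"))
```

Features of this kind are cheap to compute but ignore sentence structure and meaning, which is the gap the syntactic and semantic annotation in this project is meant to address.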

In order to promote self-directed reading and learning among the Hong Kong public, the system will be web-based and made freely accessible.  For Chinese teachers, it will assist with selection of pedagogical material.  For students and parents, the system can identify suitable extra-curricular reading material.  To the best of our knowledge, this represents the first online text readability assessment system that specifically aims at improving Chinese teaching and learning in primary schools of HKSAR.

The project aims to achieve the following research objectives:

  1. To construct a 100K-word corpus of Chinese reading materials of Hong Kong primary schools.
  2. To annotate each text in this corpus with its lexical, syntactic and semantic difficulty levels, as well as the linguistic items and structures underlying its complexity.
  3. To design a text readability assessment model that takes the lexical, syntactic and semantic complexity of a text into account, to train the model on the corpus, and to evaluate its accuracy in predicting the most suitable primary grade level for a text.
  4. To develop new deep learning methods suitable for Chinese.

 Publications

  1. Metaphorical Polysemy of the Chinese Color Term hēi 黑 “black” : A corpus-based cognitive semantic analysis with Behavioral Profiles
    Liu, M. & Dou, J., Jun 2023, In: International Journal of Corpus Linguistics
  2. Correction to : Universal Dependencies for Mandarin Chinese
    Poiret, R., Wong, T., Lee, J., Gerdes, K. & Leung, H., Nov 2021, In: Language Resources and Evaluation.
  3. Exploring the Role of Verb Frames in Assessing Semantic Difficulty
    Liu, M. & Zhang, Z., Oct 2021, In: ExLing 2021: Proceedings of 12th International Conference of Experimental Linguistics. Botinis, A. (ed.). Athens, Greece: International Society of Experimental Linguistics, p. 153-156 (ExLing Proceedings).
  4. Text Retrieval for Language Learners : Graded Vocabulary vs. Open Learner Model
    Lee, J. S. Y. & Yeung, C. Y., Sep 2021, In: International Conference Recent Advances in Natural Language Processing, RANLP 2021: Deep Learning for Natural Language Processing Methods and Applications - Proceedings. Angelova, G., Kunilovskaya, M., Mitkov, R. & Nikolova-Koleva, I. (eds.). INCOMA Ltd., p. 798-804 (International Conference Recent Advances in Natural Language Processing, RANLP).
  5. A Cluster Analysis on Associative Meanings of hei (黑) Using Behavioral Profiles Approach
    Liu, M. & Dou, J., May 2021, In: 2021 Chinese Lexical Semantics Workshop
  6. A Scene-oriented Frame-based Conversation Agent to facilitate Learning Chinese as a Second language
    Zhang, Z. & Liu, M., May 2021, In: the 11th International Conference and Workshops on Technology and Chinese Language Teaching
  7. Character Set Construction for Chinese Language Learning
    Yeung, C. Y. & Lee, J., Apr 2021, In: Proceedings of the 16th Workshop on Innovative Use of NLP for Building Educational Applications. Association for Computational Linguistics (ACL), p. 59-63
  8. Research Design: Profiling the Chinese Modal Verbs Indicating Possibilities and Abilities
    Zhang, Z. & Liu, M., Apr 2021, In: the 7th Young Linguists’ Meeting in Poznan
  9. Using Verb Frames for Text Difficulty Assessment
    Lee, J., Liu, M. & Cai, T., May 2020, In: Proceedings of the LREC International FrameNet Workshop 2020: Towards a Global, Multilingual FrameNet. ELDA/ELRA, p. 56-62