Issue 48 - June 2006

Standardization of Chinese Language Environment
By Raymond Poon

Why standardize on a common Chinese language environment

Many coding standards exist today that define how a letter or character of a language is represented in a computer. Each standard typically assigns a unique numerical value to every letter or character of a language (including the special symbols that accompany it) and specifies the number of bytes needed to represent all the values used by the entire character set. Earlier Chinese coding standards such as BIG-5 and GB were each designed to support a single language, whereas more recent ones such as Unicode support multiple languages. To complicate matters further, even under the same coding standard the same character can be represented by byte sequences of different lengths, depending on whether a fixed-length or variable-length encoding is used.
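As a small illustration of the point above, the following Python sketch encodes one Chinese character under several schemes and prints the resulting bytes. The function name `encoded_forms` and the choice of codecs are assumptions made for this example only; they are not part of the University's set-up.

```python
# Illustrative sketch: one character, several byte representations.
# The helper name and the codec list are assumptions for this example.

def encoded_forms(ch, codecs=("big5", "gb2312", "utf-8", "utf-16-be")):
    """Return a mapping of codec name to the bytes that codec produces for ch."""
    return {name: ch.encode(name) for name in codecs}

forms = encoded_forms("中")
for name, raw in forms.items():
    # Show the byte values in hexadecimal together with the byte count.
    print(f"{name:10} {raw.hex()} ({len(raw)} bytes)")
```

BIG-5 and GB assign different values to the same character, while UTF-8 and UTF-16 are two encodings of the same Unicode value with different byte lengths.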

Moreover, these coding standards do not, in general, define any "lead-in" characters that tell every software component (namely the operating system, database, application software, etc. running on both the server and client sides) which coding scheme the subsequent character stream employs, i.e. the name of the coding standard, the number of bytes that make up the numerical value of each character, and so on. The coding scheme therefore has to be explicitly agreed or made known beforehand so that each piece of software can interpret and handle every character correctly during input, processing and output. Any mishandling of a single character by any one of the software components (e.g. improperly inserting a "line feed" character after a "carriage return" character, or improperly removing a "null" character or a parity bit) or any incorrect translation of a character (e.g. using an incompatible version or the wrong coding standard) can render the subsequent characters unreadable or incomprehensible.
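The effect described above can be reproduced with a minimal Python sketch: the same BIG-5 bytes are decoded once with the right standard, once with the wrong one, and once after a single byte has been lost. The helper `decode_as` and the sample text are hypothetical and chosen only for illustration.

```python
# Illustrative sketch: what happens when the coding scheme is not agreed,
# or when one byte goes missing in transit. Names here are hypothetical.

def decode_as(raw: bytes, codec: str) -> str:
    """Decode raw bytes, replacing undecodable bytes with U+FFFD instead of failing."""
    return raw.decode(codec, errors="replace")

text = "中文環境"
raw = text.encode("big5")               # the sender writes BIG-5 bytes

right = decode_as(raw, "big5")          # receiver knows the coding standard
wrong_codec = decode_as(raw, "gb2312")  # receiver assumes a different standard
lost_byte = decode_as(raw[1:], "big5")  # first byte dropped by some component

print(right)
print(wrong_codec)
print(lost_byte)
```

Only the first result matches the original text. In the last case every later two-byte pair is misaligned, which is why one mishandled byte can spoil everything that follows it.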

It is therefore easy to see that the complexity of ensuring language coding compatibility among software components is proportional to the number of different coding standards multiplied by the number of operating systems supported on the client and server sides. It is thus desirable, where possible, to standardize at least on a common coding standard (e.g. Unicode) in order to avoid confusion, incompatibilities, and encoding/decoding errors during information exchange, and to eliminate unnecessary coding conversions among software on the same or different machines. Moreover, software development time is shortened and support and maintenance effort is reduced when only a single coding standard has to be tested, implemented and supported across all software on both the client and server sides.

Users' experience without standardization of a common Chinese environment

Users encounter Chinese language problems mainly because (1) the software they are using employs different or incompatible Chinese coding standards without their knowledge, or (2) one or more characters are improperly inserted or removed during processing by the operating system, database software, or other application software, again without their knowledge. When users cannot read a Chinese document or an email, and do not know which coding standard is in use or which characters have been changed, added or removed, they often have to make the viewer or reader software try each coding standard in turn until either the content becomes readable or the list of coding standards is exhausted. Where the unreadable content has been improperly changed or translated without any hint, it will be difficult, if not impossible, to recover the original. It is thus always desirable for the author and the reader of a Chinese document to agree on (1) a common coding standard, and (2) a common set of system software, installed on the client and/or server, of identical or compatible versions and running on a common operating system of the same version, for creating and reading the document.
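The trial-and-error approach described above can be sketched in Python as follows. The function `try_codecs` and the candidate list are assumptions for this example; a real viewer may use a different list and different heuristics.

```python
# Illustrative sketch: try each candidate coding standard in turn.
# The function name and the candidate list are assumptions for this example.

def try_codecs(raw, candidates=("utf-8", "big5", "gbk", "utf-16")):
    """Return (codec, text) pairs for every candidate that decodes without error."""
    results = []
    for codec in candidates:
        try:
            results.append((codec, raw.decode(codec)))
        except UnicodeError:
            # This candidate cannot be the right one; move on to the next.
            continue
    return results

sample = "標準化".encode("big5")   # bytes of unknown origin, as seen by the reader
found = try_codecs(sample)
for codec, text in found:
    print(codec, text)
```

A candidate that decodes without error is not necessarily the right one: several standards may accept the same bytes and produce different text, so a person still has to judge which result is readable.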

The need for standardizing on Chinese environment for office PCs

However, standardizing on a common coding standard, system software and database software is not sufficient to support a language environment. We also have to define a standard set-up under which (1) a "selected" set of enterprise software, including system software and application software (especially legacy software) on both the server and client sides, is capable of processing and exchanging characters correctly according to the chosen coding standard, (2) every client machine has the character fonts needed to output the characters properly to the screen and printer, and (3) a standard set of input methods that correctly encode input characters using the chosen coding standard is in place.

Given the complexity arising from the number of operating systems and application software packages involved, and the compatibility problems among different versions of the same coding standard, we need to define a standard computer environment under which information can be exchanged systematically, quickly, and effectively. However, current technology still cannot ensure that every Chinese language application that runs perfectly on a native Chinese system runs equally well on a native English system, and most of our staff must work, or prefer to work, on a native English system. It is therefore necessary to define a single common standard Chinese environment for client computers within the University, namely native English Windows XP SP2 running centralized University software such as Blackboard, AIMS and Email, which can meet the Chinese language requirements of most users. Users who have special Chinese language needs, or whose application software cannot run under this common Chinese environment, will have to support themselves by running the software on dedicated workstations that may have their own unique character sets and fonts. In that case, users will not need to follow any of the standards imposed by the common Chinese environment on either the client or the server side.

Summary of standards chosen under the University Chinese language environment

We are happy to report that the University has already adopted Unicode as the language coding standard, Sun Solaris and Microsoft (MS) Windows Server 2003 (or the latest version) as the operating system standard for central servers, Oracle and MS SQL Server as the database standard, and UTF-8 (or, later, UTF-16) as the encoding method for the content of the University's web pages, e-learning, and databases. By adopting the standards of the common Chinese environment for client PCs described below, we hope to eliminate most, if not all, of the language problems currently experienced by our users. Even when language problems do arise, they will be easier to troubleshoot because only a single language environment is implemented at the University.

The following summarizes the set-up of the general Chinese environment for client PCs. The set-up should be reviewed regularly to ensure that it keeps pace with advances in technology and the University's needs.

- Microsoft English Windows XP Professional SP2 (or the latest upgrade) will be adopted as the operating system standard for the client PCs to run the University's administrative systems and communication software such as Email.

- The Hong Kong Supplementary Character Set (HKSCS) of the Hong Kong Special Administrative Region will be adopted as the add-on support for Chinese characters specific to the Hong Kong environment.

- In general, the University will not create any local add-on character set.

- 細明體 (MingLiU), which comes with MS Windows XP, will be the standard font supported for general output of Chinese content. Users may need to acquire other fonts for specialized output requirements.

- MS Office and Adobe Acrobat Reader, with Chinese Language Packs, will be the standard office tools for preparing and manipulating Chinese text content.

- The input methods that come with MS Windows XP, 九方 (Q9), and 縱橫輸入法 will be the Chinese input methods used on all client PCs.

- A standard procedure will be devised for converting required data and content that are not compatible with the current coding standard.
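The article does not describe the conversion procedure itself. Purely as an illustration of what one conversion step might look like, the Python sketch below re-encodes legacy bytes as UTF-8. The function name `convert_to_utf8`, the sample text, and the assumption that the legacy data is in BIG-5 with HKSCS extensions are all hypothetical.

```python
# Illustrative sketch only; this is not the University's procedure.
# The function name and the assumed source encoding are hypothetical.

def convert_to_utf8(raw: bytes, source_codec: str) -> bytes:
    """Decode legacy bytes with the stated codec, then re-encode them as UTF-8."""
    # Strict decoding raises UnicodeDecodeError on bad input, so data that
    # does not match the stated coding standard is not silently corrupted.
    text = raw.decode(source_codec)
    return text.encode("utf-8")

legacy = "香港城市大學".encode("big5hkscs")   # stand-in for legacy content
converted = convert_to_utf8(legacy, "big5hkscs")
print(converted.decode("utf-8"))
```

Decoding strictly rather than replacing bad bytes is a deliberate choice here: a failed conversion can be investigated, whereas silently substituted characters are hard to detect afterwards.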
