Simplified and Traditional Chinese Web Pages Conversion

by Matthew Hung

Introduction

Our native language, Chinese, is commonly encoded in two different character sets, namely BIG5 (Traditional Chinese) and GB2312 (Simplified Chinese). Whichever Windows operating system you are using, viewing either character set is not a problem as long as your Web browser has the Chinese support components installed and has been configured to display the desired character set. Life is even easier if the Chinese Web pages you view use the same character set as the one supported by your Chinese Windows. In other words, without any configuration or additional language components, you will have no problem browsing Traditional Chinese Web pages under a Chinese Windows that supports the Traditional Chinese character set (the default Chinese Windows operating system used in Hong Kong and Taiwan). The same applies to Simplified Chinese Windows (commonly used in China).
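As a rough illustration of why the two character sets are not interchangeable, the Python sketch below uses the standard "gb2312" and "big5" codecs to check whether a character can be encoded in each set. The helper name can_encode and the sample characters are assumptions chosen for illustration; they are not part of the CSC service.

```python
def can_encode(text: str, charset: str) -> bool:
    """Return True if every character in text exists in the given charset."""
    try:
        text.encode(charset)
        return True
    except UnicodeEncodeError:
        return False


simplified = "\u6c49"   # Simplified form of the character "Han"
traditional = "\u6f22"  # Traditional form of the same character

# Each form is expected to be encodable only in its own character set.
print("simplified :", can_encode(simplified, "gb2312"), can_encode(simplified, "big5"))
print("traditional:", can_encode(traditional, "gb2312"), can_encode(traditional, "big5"))
```

Because each form exists in only one of the two sets, a page cannot simply be relabelled from one encoding to the other; the characters themselves have to be converted.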

However, what options are available when it comes to Web pages written in an unfamiliar Chinese character set? It would be painful for someone who has been accustomed to Traditional Chinese characters all his life to suddenly have to read Simplified Chinese characters, or vice versa. To alleviate this problem, the Computing Services Centre (CSC) now provides a server for converting Simplified/Traditional Chinese Web pages into the preferred character set. The service provides two modes of Chinese character conversion: text-to-text conversion and text-to-image conversion. The former converts the plain text of Traditional Chinese characters in the original Web page into Simplified Chinese characters or vice versa, while the latter converts the plain text into an image file (GIF) showing the desired Chinese character set. Currently, the text-to-image conversion can only be used with Internet Explorer (IE) 5.0 or above in MS Windows environments.

How does it work?

The conversion is powered by HANWEB Publishing Server running on MS Windows 2000 Server. The diagram below illustrates the working principle of the service.

 

This service is offered to two types of users:

  1. Designers/authors may add icon or text hyperlinks for text-to-text conversion and/or text-to-image conversion to their Chinese Web pages, so that external users can simply click on the desired conversion link to view the Web content in their native character set (Simplified or Traditional Chinese). However, this service is available only for Chinese Web pages published within the CityU domain. For more details on how to make a hyperlink in a Chinese Web page, please select "Read Chinese Page using HanWeb" on the "Utilities & Tools" menu of the CityU Intranet, followed by "Make HanWeb Link" in the left-hand column of the Web page.

  2. General CityU users may browse any Chinese Web page using the text-to-text or text-to-image conversion. However, this service is restricted to networked PCs located on campus only. The example below demonstrates how a general CityU user can use this service to browse a Web page originally composed of Simplified Chinese.

Suppose a CityU user needs to access a Web page in Simplified Chinese (GB2312 character set). If the browser has been set up and configured properly to display Simplified Chinese characters, the user will be able to see Simplified Chinese characters on the screen as shown in Figure 2 below.

 

Figure 2: Displaying a Simplified Chinese Web page in Simplified Chinese characters

However, the user may feel uncomfortable reading Simplified Chinese characters. Besides, the title bar of the browser window displays a meaningless name because the GB2312-encoded title is misinterpreted as BIG5 in the Traditional Chinese environment. This is where the HanWeb server comes in. With this new service, it is now possible to convert the Simplified Chinese Web page into one with Traditional Chinese characters. In addition, the character coding of the title bar of the browser window will also be converted to BIG5 (Traditional Chinese character coding).
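The garbled title bar can be reproduced in a few lines of Python. This is only a sketch of the general effect, assuming a hypothetical page title and the standard "gb2312" and "big5" codecs; it does not show how the browser or the HanWeb server works internally.

```python
# Hypothetical Simplified Chinese page title ("Chinese Web page").
title = "\u4e2d\u6587\u7f51\u9875"

# The bytes as a GB2312-encoded page would send them.
raw = title.encode("gb2312")

# A Traditional Chinese environment that assumes BIG5 reads the same bytes
# differently; undecodable byte pairs become U+FFFD replacement characters.
garbled = raw.decode("big5", errors="replace")

# Decoding with the encoding the page was actually written in restores the title.
restored = raw.decode("gb2312")

print("garbled equals title: ", garbled == title)
print("restored equals title:", restored == title)
```

The bytes themselves are unchanged in both cases; only the assumed encoding differs, which is why the title looks meaningless until it is converted to BIG5.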

To achieve this, first select "Read Chinese Page using HanWeb" on the "Utilities & Tools" menu of the CityU Intranet. On the Web page shown in Figure 3, click the "View any page (For Campus networked PCs)" hyperlink, then fill in the URL and select the required conversion.

 



Figure 3: Converting any Chinese Web page on the Internet

Finally, click the "Convert now" button and the requested Chinese Web page will be displayed in a new browser window within seconds. In the above example, the conversion is from "Simplified Chinese (GB) to Traditional Chinese (BIG5)" and "Text" is selected for a text-to-text conversion. The newly converted Web page is shown in Figure 4 below.

 

Figure 4: The Simplified Chinese Web page is converted into Traditional Chinese characters
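The idea behind a GB-to-BIG5 text-to-text conversion can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the HANWEB Publishing Server's actual method: the mapping table S2T and the function gb_to_big5 are hypothetical, and the table covers only a handful of characters.

```python
# Hypothetical, deliberately tiny Simplified -> Traditional mapping table.
# A real converter needs thousands of entries and context-aware rules,
# because some Simplified characters map to more than one Traditional form.
S2T = {
    "\u6c49": "\u6f22",  # Han
    "\u7b80": "\u7c21",  # simple
    "\u4f53": "\u9ad4",  # body / form
    "\u7f51": "\u7db2",  # net
    "\u9875": "\u9801",  # page
}


def gb_to_big5(gb_bytes: bytes) -> bytes:
    """Convert GB2312-encoded text to BIG5-encoded Traditional Chinese."""
    text = gb_bytes.decode("gb2312")                       # bytes -> characters
    converted = "".join(S2T.get(ch, ch) for ch in text)    # map character by character
    return converted.encode("big5")                        # characters -> BIG5 bytes


# "Simplified Han characters Web page", as a GB2312 page would store it.
page = "\u7b80\u4f53\u6c49\u5b57\u7f51\u9875".encode("gb2312")
result = gb_to_big5(page)
print(result.hex())
```

Characters missing from the table are passed through unchanged, so any Simplified character without an entry would raise an encoding error at the final step. A production converter would also need to update the page's declared character set so that the browser interprets the output as BIG5.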

Conclusion

The conversion is a platform-independent solution for viewing Chinese Web pages. The text-to-text conversion allows users to convert Chinese (GB2312 or BIG5) Web pages into Simplified or Traditional Chinese characters. It is fast, but the browser must be installed and configured with the Simplified or Traditional Chinese support features. Using IE 5.0 or above, the text-to-image conversion allows users to convert Chinese (GB2312 or BIG5) Web pages into images of Simplified or Traditional Chinese characters. Users do not need the Chinese support features in the browser, because the Chinese content has already become images instead of Chinese text. However, the conversion will take longer if there are many Chinese characters on the Web page. By making use of the HanWeb server, users can overcome the limitations of the Chinese support features in the browser. Furthermore, this service not only allows CityU users to browse Chinese Web pages in non-Chinese browser environments, but also allows external or overseas users to browse CityU's Chinese Web pages in their preferred character set.