Simplified and Traditional Chinese Web Pages Conversion
By Matthew Hung
Introduction
Our native language, Chinese, is written
in two different character sets, namely BIG5 (Traditional Chinese)
and GB2312 (Simplified Chinese). Whichever Windows operating
system you use, viewing these Chinese character sets is
not a problem as long as your Web browser has the
Chinese support components installed and has been configured to
display the desired character set. Life is even easier if the
Chinese Web pages you view use the same character
set as the one supported by your Chinese Windows. In other words, without
any configuration or additional language components,
you will have no problem browsing Traditional Chinese Web pages
under a Chinese Windows that supports the Traditional Chinese character
set (the default Chinese Windows operating system used in Hong
Kong and Taiwan). The same applies to Simplified Chinese Windows
(commonly used in China).
However, one might ask what options
are available for Web pages written in an unfamiliar Chinese character
set. It would be painful for someone
who has been accustomed to Traditional Chinese characters all his life
to suddenly have to read Simplified Chinese characters, or vice versa.
To alleviate this problem, the Computing Services Centre (CSC)
now provides a server for converting Simplified/Traditional Chinese
Web pages into the preferred character set. The service provides
two modes of Chinese character conversion: text-to-text conversion
and text-to-image conversion. The former converts the plain text of
Traditional Chinese characters in the original Web page into Simplified
Chinese characters or vice versa, while the latter
converts the plain text into an image file (GIF) of the desired Chinese
character set. Currently, the text-to-image conversion can only
be used with Internet Explorer (IE) 5.0 or above in MS Windows
environments.
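The text-to-text mode can be pictured with a minimal Python sketch. It is
only an illustration of the general idea, not the HanWeb server's actual
method: the function name gb2312_to_big5 and the four-entry mapping table
are hypothetical, and a real converter needs thousands of entries and
phrase-level rules, because one Simplified character can correspond to
more than one Traditional character.

```python
# Hypothetical, deliberately tiny Simplified-to-Traditional table.
# A real converter needs a far larger table and context-aware rules.
SIMPLIFIED_TO_TRADITIONAL = str.maketrans({
    "汉": "漢",
    "语": "語",
    "网": "網",
    "页": "頁",
})


def gb2312_to_big5(data: bytes) -> bytes:
    """Convert GB2312 (Simplified) bytes into BIG5 (Traditional) bytes."""
    # Step 1: decode the GB2312 bytes into Unicode text.
    text = data.decode("gb2312")
    # Step 2: replace each Simplified character that has a table entry.
    converted = text.translate(SIMPLIFIED_TO_TRADITIONAL)
    # Step 3: encode as BIG5. This raises UnicodeEncodeError if a
    # character missing from the table has no BIG5 representation.
    return converted.encode("big5")


if __name__ == "__main__":
    page_text = "汉语网页".encode("gb2312")
    print(gb2312_to_big5(page_text).decode("big5"))
```

ASCII text such as HTML tags passes through unchanged, since both
encodings represent ASCII characters with the same single bytes.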
How does it work?
The conversion is powered by HANWEB
Publishing Server running on MS Windows 2000 Server. The diagram
below illustrates the working principle of the service.
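This service is offered to two types
of users: CityU users who browse Chinese Web pages on the Internet,
and external or overseas users who browse CityU's Chinese Web pages.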
Suppose a CityU user needs to access
a Web page in Simplified Chinese (GB2312 character set). If the
browser has been set up and configured properly to display Simplified
Chinese characters, the user will be able to see Simplified Chinese
characters on the screen as shown in Figure 2 below.
Figure 2: Displaying a Simplified
Chinese Web page in Simplified Chinese characters
However, the user may feel uncomfortable
reading Simplified Chinese characters. Besides, the title bar
of the browser window displays a meaningless name because the
character encoding is misinterpreted in the Traditional Chinese environment.
This is where the HanWeb server comes in. With this new service,
it is now possible to convert the Simplified Chinese Web page
into one with Traditional Chinese characters. In addition, the
character coding of the title bar of the browser window will
also be converted to BIG5 (Traditional Chinese character coding).
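The meaningless title bar can be reproduced with a short Python sketch.
The helper name misread and the sample title are assumptions made for
illustration. The sketch shows what happens when bytes written in one
encoding are interpreted as another, which is the kind of misinterpretation
described above.

```python
def misread(text: str, actual: str, assumed: str) -> str:
    """Encode text as `actual`, then decode the bytes as `assumed`."""
    raw = text.encode(actual)
    # errors="replace" substitutes a marker for byte sequences that are
    # invalid in the assumed encoding instead of raising an exception.
    return raw.decode(assumed, errors="replace")


if __name__ == "__main__":
    title = "中文"  # hypothetical page title stored as GB2312
    # Interpreted as BIG5, the same bytes map to unrelated characters.
    print(misread(title, "gb2312", "big5"))
    # Interpreted with the correct encoding, the title is intact.
    print(misread(title, "gb2312", "gb2312"))
```

The bytes themselves are undamaged in both cases. Only the choice of
encoding used to interpret them differs.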
To achieve this, first select "Read
Chinese Page using HanWeb" on the "Utilities & Tools"
menu of the CityU Intranet. On the Web page shown in Figure
3, click the hyperlink "View any page (For Campus
networked PCs)", then fill in the URL and select the required
conversion.
Figure 3: Converting any Chinese Web page on the Internet
Finally, click on the button "Convert
now" and the requested Chinese Web page will be displayed
in a new browser window in a matter of seconds. In the above example,
the conversion is from "Simplified Chinese (GB) to Traditional
Chinese (BIG5)" and "Text" is selected for a text-to-text
conversion. The newly converted Web page is shown in Figure
4 below.
Figure 4: The Simplified Chinese
Web page is converted into Traditional Chinese characters
Conclusion
The conversion is a platform-independent
solution for viewing Chinese Web pages. The text-to-text conversion
allows users to convert Chinese (GB2312 or BIG5) Web pages into
Simplified or Traditional Chinese characters. It is fast,
but the browser must have the Simplified or Traditional Chinese
support feature installed and configured. With IE 5.0 or above,
the text-to-image conversion allows users to convert Chinese (GB2312
or BIG5) Web pages into images of Simplified or Traditional Chinese
characters. Users do not need the Chinese support features in
the browser, because the Chinese content has already been turned into
images instead of Chinese text. However, the conversion takes longer
when there are many Chinese characters on the Web page. By making
use of the HanWeb server, users can overcome the limitation of
Chinese support features in the browser. Furthermore, this service
not only allows CityU users to browse Chinese Web pages in non-Chinese
browser environments, but also allows external or overseas users
to browse CityU's Chinese Web pages in their preferred character
set.