Upgrade Strategy for CityU’s Central Infrastructure Servers

by John Chan
 

The Computer Room in the Computing Services Centre (CSC) is the core location where most of the University's critical IT equipment resides. The room was designed and built over 20 years ago together with the rest of the campus, and its space was believed to be more than enough when it first came into use. However, with the growing importance of and emphasis on the use of IT over the years (as stated clearly in the University's strategic plans), the need for more space gradually became evident. Demands from both existing and new IT services brought an ever-increasing number of building devices, cables, cooling devices, storage devices and servers to accommodate. Since the room has no space for expansion, optimizing its space utilization and the housing of equipment has become a critical problem to tackle.

Back in the late 1990s, the CSC identified several areas that needed to be standardized in order to tackle this problem, which would otherwise have hindered the implementation and improvement of IT services. At that stage, the CSC decided that a unified operating system and hardware platform should be chosen to provide most of the critical services. The aims were to reduce the manpower and expertise required for system management support, to simplify space management, and to minimize application compatibility problems. The Solaris on Sparc platform by Sun Microsystems Inc. was judged to be the best choice at that time.

At about the same time, IT consolidation was becoming a trend, helped by emerging technology for server and storage virtualization. The CSC quickly adopted this trend as a major factor in formalizing its server consolidation policy. In Sep 1997, the first Sun Enterprise E10k, a top-of-the-line server, was bought. The beauty of this machine was that, within a single footprint, a maximum of 16 virtual domains (or servers) could be configured, all sharing the same high-speed backplane and providing high throughput and performance. Its new dynamic reconfiguration technology, the first of its kind offered in the market at the time, allowed the processors and memory assigned to the virtual domains to be re-shuffled according to different and volatile needs.

The introduction of the UltraSparc III series processors in 2002 was a major leap in processing power over the processors used in the E10k servers. Thus, in Jun 2002, a new Sun Fire E15k server was acquired. Two years later, a similar E25k server was installed, replacing the entire suite of E10k servers. Almost all of the critical services, including the email servers, the Banner servers, and the e-Learning servers, were configured on various virtual domains on these two machines. Each service was set up using one virtual domain from each machine. The two domains were either clustered together to form a reliable and redundant clustered service, or joined together to form a load-balancing set with maximized availability.

Five years later, an even more powerful UltraSparc chipset was introduced with the release of the UltraSparc VI processors, which support more cores per processor. In line with the technology refresh strategy, the Sun Enterprise M9000 (M9k) server was bought in late 2007 to replace the ageing E15k machine, which was due for retirement. The M9k offers a better design, more advanced features, a much faster backplane, and a nearly threefold performance gain over the E15k. Migration from the E15k to this new machine is quite straightforward. The E25k machine is retained so that clustering can still be set up to provide high availability for each critical service. Apart from the performance gain of the M9k over the E15k, the move to the new machine is completely transparent to users, and no modifications to user applications are required. Furthermore, this hardware upgrade will not incur additional management effort in terms of system and application support. It is anticipated that a completely new class of enterprise machine will be introduced in the next two to three years, which would be a natural replacement for the E25k server. In view of all these advantages, this replacement strategy will likely continue in the future. It makes use of new technology with a reduced total cost of ownership and minimal management effort, while greatly boosting the performance and improving the user experience of each critical service.