I. Background of Data Centre Management
by JUCC ISTF
/* The following article is extracted from the "Information Security Newsletter" published by the JUCC IS Task Force. */
Data centre management is a holistic process for overseeing the operational and technical issues within a data centre or server room. It covers environmental control, physical security, server hardware operations and the management of the services and applications used for data processing.
By implementing a series of operational procedures and deploying specialised hardware and software, data centre management gives IT operational staff a clear picture of the operating status of universities' data centres, including real-time information on health, connectivity and resource utilisation, so that they can manage the data centres effectively. A comprehensive data centre management solution also integrates information technology and facility / infrastructure management disciplines to provide the security controls necessary to protect universities' information assets from various threats.
Some basic components of data centre management are:
The physical environment of a data centre, including temperature, humidity and power supply, is rigorously controlled, since system hardware can only operate normally within defined ranges of temperature, humidity and voltage. Environmental control devices usually include air conditioning, ventilation systems, temperature / humidity sensors, power surge suppressors and UPS units.
In addition, current data centre management practice also aims to protect IT assets from environmental hazards, such as fire and floods, by deploying fire suppression systems and raised floors.
Data centre security is becoming an integral part of robust data centre management solutions. Systems and devices hosted within data centres store sensitive information and support mission-critical services of universities. With the increasing reliance on IT services in universities' daily operations, data centres have become high-value targets and should be adequately protected from being compromised, both logically and physically.
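The idea of keeping readings within defined ranges can be sketched in a few lines of Python. This is only an illustration: the metric names, the `LIMITS` table and the numeric limits are assumptions made for the example, not values from this article, and real limits should come from the hardware vendors' specifications.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Range:
    """Inclusive operating range for one environmental metric."""
    low: float
    high: float

    def contains(self, value: float) -> bool:
        return self.low <= value <= self.high


# Hypothetical limits, for illustration only.
LIMITS: Dict[str, Range] = {
    "temperature_c": Range(18.0, 27.0),
    "humidity_pct": Range(40.0, 60.0),
    "voltage_v": Range(209.0, 231.0),
}


def out_of_range(reading: Dict[str, float]) -> List[str]:
    """Return the names of the metrics in `reading` that violate their range."""
    return [
        name
        for name, limit in LIMITS.items()
        if name in reading and not limit.contains(reading[name])
    ]


reading = {"temperature_c": 29.5, "humidity_pct": 45.0, "voltage_v": 220.0}
print(out_of_range(reading))  # ['temperature_c']
```

In practice the readings would come from the temperature / humidity sensors and power equipment, and a non-empty result would trigger an incident response rather than a print statement.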
Centralised Asset Inventory or Repository
A centralised asset inventory or repository is an accurate and authoritative database that records all data centre assets of a university, including application servers, network devices, power equipment (e.g. Uninterruptible Power Supply), and Heating, Ventilating & Air Conditioning ("HVAC") devices. By implementing IT asset discovery and tracking functions, IT management is able to accurately record the details of data centre assets, such as hardware / software specifications, supported IT services, target user groups and interdependencies with the university's other information systems or operations. For large-scale data centres, automated asset inventory or repository software is installed to achieve efficiency and avoid human error.
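A minimal sketch of such a repository, assuming an in-memory store, might look like the following. The `Asset` fields, the `AssetInventory` class and the sample asset identifiers are hypothetical names chosen for the example; a production inventory would normally sit on a database and be fed by discovery tools.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Asset:
    asset_id: str
    kind: str                     # e.g. "server", "network", "power", "hvac"
    spec: str                     # free-text hardware / software specification
    services: List[str] = field(default_factory=list)    # supported IT services
    depends_on: List[str] = field(default_factory=list)  # ids of assets it relies on


class AssetInventory:
    """Single authoritative record of data centre assets (in-memory sketch)."""

    def __init__(self) -> None:
        self._assets: Dict[str, Asset] = {}

    def register(self, asset: Asset) -> None:
        # Reject duplicates so that each asset has exactly one record.
        if asset.asset_id in self._assets:
            raise ValueError(f"duplicate asset id: {asset.asset_id}")
        self._assets[asset.asset_id] = asset

    def dependents_of(self, asset_id: str) -> List[str]:
        """Assets that would be affected if `asset_id` failed."""
        return sorted(
            a.asset_id for a in self._assets.values() if asset_id in a.depends_on
        )

    def unused_servers(self) -> List[str]:
        """Servers that currently support no IT service."""
        return sorted(
            a.asset_id
            for a in self._assets.values()
            if a.kind == "server" and not a.services
        )


inventory = AssetInventory()
inventory.register(Asset("ups-01", "power", "10 kVA UPS"))
inventory.register(Asset("srv-01", "server", "2U rack server",
                         services=["student portal"], depends_on=["ups-01"]))
inventory.register(Asset("srv-02", "server", "2U rack server",
                         depends_on=["ups-01"]))

print(inventory.dependents_of("ups-01"))  # ['srv-01', 'srv-02']
print(inventory.unused_servers())         # ['srv-02']
```

Recording dependencies alongside each asset is what makes impact questions ("what is affected if this UPS fails?") answerable from the inventory alone.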
Capacity planning is performed by data centre IT staff to estimate the amount of information resources required to support the desired levels of service. The estimation draws on the data collected during the capacity monitoring process and the requirements gathered from various academic or administrative departments. Effective capacity planning improves the overall performance and availability of the information systems hosted within the data centres by identifying underutilised resources and future capacity needs.
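Both outputs of capacity planning named above, underutilised resources and future capacity needs, can be illustrated with a short Python sketch. The 20% utilisation threshold, the function names and the sample figures are assumptions for the example, and a straight-line trend is only one of many possible forecasting methods.

```python
from typing import Dict, List


def underutilised(avg_utilisation: Dict[str, float], threshold: float = 0.2) -> List[str]:
    """Resources whose average utilisation (0.0 to 1.0) is below `threshold`."""
    return sorted(name for name, u in avg_utilisation.items() if u < threshold)


def linear_forecast(history: List[float], periods_ahead: int) -> float:
    """Project usage `periods_ahead` periods past the last sample.

    Fits an ordinary least-squares line through equally spaced samples.
    """
    n = len(history)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(history) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(history))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept + slope * (n - 1 + periods_ahead)


# Hypothetical monitoring data: average CPU utilisation per server,
# and storage used (TB) at the end of each of the last four months.
cpu = {"srv-01": 0.65, "srv-02": 0.05, "srv-03": 0.12}
storage_tb = [40.0, 44.0, 48.0, 52.0]

print(underutilised(cpu))              # ['srv-02', 'srv-03']
print(linear_forecast(storage_tb, 3))  # 64.0
```

The forecast would then be compared against installed capacity and the requirements gathered from departments before any purchasing decision is made.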
Batch job processing plays a vital role in sustaining the universities' daily operations. IT staff working within the data centre follow defined operational procedures or ad-hoc user requests to execute batch jobs, either manually or automatically through job schedulers. The completion status of those batch jobs is closely monitored to ensure that any job failures are followed up and resolved promptly.
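The follow-up check on completion status could be sketched as below. The `JobRun` record, the status values and the job names are hypothetical; a real job scheduler exposes its own status interface, which this example does not attempt to model.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class Status(Enum):
    SUCCEEDED = "succeeded"
    FAILED = "failed"
    RUNNING = "running"


@dataclass
class JobRun:
    name: str
    status: Status
    runtime_minutes: float
    max_runtime_minutes: float  # expected upper bound for a healthy run


def needs_follow_up(runs: List[JobRun]) -> List[str]:
    """List jobs that failed or have run longer than expected."""
    flagged = []
    for run in runs:
        if run.status is Status.FAILED:
            flagged.append(f"{run.name}: failed")
        elif run.status is Status.RUNNING and run.runtime_minutes > run.max_runtime_minutes:
            flagged.append(f"{run.name}: overrunning")
    return flagged


# Hypothetical nightly batch window.
runs = [
    JobRun("payroll-export", Status.SUCCEEDED, 12, 30),
    JobRun("library-sync", Status.FAILED, 3, 20),
    JobRun("backup-full", Status.RUNNING, 250, 180),
]
print(needs_follow_up(runs))  # ['library-sync: failed', 'backup-full: overrunning']
```

Flagging overrunning jobs as well as failed ones catches runs that have hung without reporting an error.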
Real-Time Data Collection and Monitoring
System information, hardware status and environmental data are collected through manual monitoring or by automated tools. With this information, data centre IT staff can perform real-time capacity monitoring, usage trend analysis and immediate data centre incident response. Monitoring and control are critical to maintaining the desired availability of universities' critical IT services. Sophisticated monitoring software can provide proactive surveillance of the status of data centre assets, enable quick assessment of the present situation, and notify the appropriate IT staff of any threat to the availability of the information systems hosted within the data centres.
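Notifying the appropriate IT staff amounts to routing each alert by its category. The sketch below assumes a simple static routing table; the team names, categories and `DEFAULT_CONTACT` are hypothetical, and a real monitoring product would deliver messages by email, SMS or paging rather than return them in a dictionary.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Alert:
    source: str     # asset that raised the alert
    category: str   # e.g. "environment", "hardware", "network"
    message: str


# Hypothetical routing table: alert category -> responsible team.
ROUTES: Dict[str, str] = {
    "environment": "facilities-team",
    "hardware": "server-team",
    "network": "network-team",
}
DEFAULT_CONTACT = "duty-manager"  # fallback so that no alert is dropped


def route(alerts: List[Alert]) -> Dict[str, List[str]]:
    """Group alert messages by the team that should receive them."""
    outbox: Dict[str, List[str]] = {}
    for alert in alerts:
        contact = ROUTES.get(alert.category, DEFAULT_CONTACT)
        outbox.setdefault(contact, []).append(f"[{alert.source}] {alert.message}")
    return outbox


alerts = [
    Alert("crac-02", "environment", "inlet temperature above limit"),
    Alert("srv-07", "hardware", "disk predicted to fail"),
    Alert("door-1", "access", "door held open"),
]
for contact, messages in route(alerts).items():
    print(contact, messages)
```

The fallback contact matters: an alert in a category nobody anticipated still reaches a person instead of being silently discarded.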
Reporting and Communication
A data centre management solution establishes specific responsibilities for each functional team or IT staff member, and defines clear reporting lines to enable the timely exchange of operational status, incidents and management decisions. The right IT staff members are assigned to handle routine batch tasks, ad-hoc maintenance requests, installation jobs or urgent problems. Regular reporting by the data centre management team also gives IT management up-to-date information for better data centre planning and administration.
Key Benefits Achieved through Effective Data Centre Management
- Cost and Energy Saving - Idle servers or devices do not contribute any value to universities, yet they still consume energy. With a centralised asset inventory or repository, universities are able to detect unused equipment and decide whether it should be turned off or re-commissioned for other services.
Meanwhile, with proper capacity planning, universities can determine the resources actually required to support daily operations and the agreed levels of service. This helps to identify underutilised resources so that they can be consolidated or re-purposed, rather than simply increasing IT operational expenditure to acquire additional assets.
- Reduced Server Downtime - Server downtime impacts the availability of universities' information systems and reduces the level of service provided to their students, staff or other related parties. A service interruption may be caused by physical sabotage, malfunctioning servers, software bugs, environmental hazards or scheduled installation / upgrade / reconfiguration. Well-established physical security controls over the data centres can effectively prevent unauthorised physical access attempts. Comprehensive HVAC equipment enables fast detection of, or recovery from, environmental hazards. Regular monitoring assists IT management in promptly identifying malfunctioning hardware or software and initiating the incident response procedure. Capacity planning allows better anticipation of future resource needs and prevents forced addition or hot-swapping of resources due to overload.
- Increased Productivity - A comprehensive data centre management solution brings efficiency in resource usage and allocation. Underutilised or unused equipment is re-commissioned for other resource-demanding services. Overloaded systems are detected promptly and allocated additional capacity. Higher productivity is therefore gained through efficient provisioning of information resources, including processing, storage and networking.
- Improved Security - The information security of data centres is enhanced when most known vulnerabilities or weaknesses are considered during a comprehensive data centre planning phase. The mitigating strategies are then incorporated into the data centre management solution and continuously enforced by IT management.
- Compliance - Universities are required to maintain compliance with regulatory mandates (e.g. Personal Data (Privacy) Ordinance), information security standards (e.g. ISO 27001) or other internal IT governance policies. A data centre management solution can incorporate compliance requirements into its operational procedures and assign specific responsibilities to IT staff members with proper qualifications.