The Key Environmental Factors to Monitor in Data Centres
In a server room or data centre environment, critical infrastructure systems provide the managed and controlled environment in which to run IT operations. The two most critical are power and cooling, followed by emergency protection systems including fire suppression. Continuous monitoring of environmental factors can provide advance warnings and alert messages before a critical system failure. For the environment monitoring system to cover every critical system, the right metrics must be generated and reported on in a timely manner.
Monitoring Data Centre Temperature Levels
One of the biggest operational costs for any data centre manager is electricity, both the amount used for IT and the amount used for IT-related services such as cooling. Regulating the ambient environment is of paramount importance within a data centre to prevent fire risks, higher energy usage (from cooling fans) and increased wear and tear. Temperatures can rise quickly due to IT workloads and potential issues with a cooling system. Monitoring and reporting on the ambient metrics can help to identify an issue with a cooling system or air conditioner ahead of failure, allowing emergency inspection and preventative maintenance to take place.
Depending upon the size of the server room or data centre, multiple ambient temperature points should be monitored. A best practice approach is to monitor up to 6 points within a server rack or cabinet. These include the top, middle and bottom areas of the rack, the front air intake, and the exhaust. The data collected from temperature sensors can be used to provide a heatmap of the internal temperatures and identify hotspots. Temperature sensors can also be placed outside the racks, within hot/cold aisle containment, and at the end and middle of server cabinet rows.
The ideal room ambient temperature for a server room or data centre is 18-25°C.
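As a rough illustration of how per-rack readings could be checked against the ambient range, the following Python sketch flags sensor positions above the upper limit. The rack names, sample readings and the `find_hotspots` helper are hypothetical, and the sketch assumes exhaust sensors are excluded from the ambient check because exhaust air is expected to run warmer than the room.

```python
# Recommended room ambient range quoted in the article (degrees Celsius).
TEMP_MIN_C = 18.0
TEMP_MAX_C = 25.0


def find_hotspots(readings, max_c=TEMP_MAX_C, skip_positions=("exhaust",)):
    """Return (rack, position, temp) tuples above max_c, hottest first.

    Positions listed in skip_positions are ignored (assumption: exhaust
    air is not compared against the room ambient limit).
    """
    hot = [
        (rack, position, temp)
        for rack, positions in readings.items()
        for position, temp in positions.items()
        if position not in skip_positions and temp > max_c
    ]
    return sorted(hot, key=lambda item: item[2], reverse=True)


# Hypothetical sample data: one reading per monitored point in each rack.
readings = {
    "rack-A1": {"top": 24.5, "middle": 22.0, "bottom": 20.5,
                "intake": 21.0, "exhaust": 31.0},
    "rack-A2": {"top": 27.5, "middle": 23.0, "bottom": 20.0,
                "intake": 21.5, "exhaust": 33.5},
}

for rack, position, temp in find_hotspots(readings):
    print(f"Hotspot: {rack} {position} at {temp:.1f} C")
```

In practice the limit and the set of positions to check would be set per site; the same structure could feed a heatmap by plotting every reading rather than only those above the limit.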
Humidity Monitoring in Data Centres
Coupled with temperature metrics are those for humidity. Whilst humidity may be less of an issue for small server rooms, it can present more of an issue in larger data centres due to the size of the server hall, the amount of equipment and, to a degree, the number of personnel. The target for a data centre cooling system is to maintain a relative humidity (RH) between 40 and 60%. If the humidity is too low, the potential for static electricity and electrostatic discharge (ESD) increases. If the humidity is too high, condensation can form on cold metal (and plastic) surfaces, leading to corrosion and a potential fire risk.
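The humidity band and its two failure modes can be expressed as a simple classification. This is a minimal Python sketch, not a vendor implementation: the function name and the returned labels are hypothetical, and only the 40 to 60% band comes from the text.

```python
# Relative humidity band quoted in the article (percent RH).
RH_LOW = 40.0
RH_HIGH = 60.0


def classify_humidity(rh_percent, low=RH_LOW, high=RH_HIGH):
    """Map a relative humidity reading to the risk described in the text."""
    if not 0.0 <= rh_percent <= 100.0:
        raise ValueError(f"RH must be between 0 and 100, got {rh_percent}")
    if rh_percent < low:
        return "low: increased ESD risk"
    if rh_percent > high:
        return "high: condensation risk"
    return "ok"


# Hypothetical readings from three sensors.
for rh in (35.0, 50.0, 65.0):
    print(f"{rh:.0f}% RH -> {classify_humidity(rh)}")
```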
Most environment monitoring systems offer a choice between humidity sensors and combined temperature and humidity sensors. Fewer humidity sensors are needed than temperature sensors. They should be placed within both open and confined areas, such as containment aisles.
Data from humidity monitoring sensors, combined with temperature sensors, will provide a good overview of the server room or data centre ambient environment. The placement of several sensors will also help to identify cooling system issues and potential failure points.
For more information on humidification strategies for data centres and server rooms see:
https://download.schneider-electric.com/files?p_Doc_Ref=SPD_NRAN-5TV85S_EN
Data Centre Water Leakage Monitoring
In addition to temperature and humidity, water leakage detection has become more important for several server room and data centre operators. When designing a new build, avoiding flood plains is of extreme importance for any critical facility, but for brownfield and existing sites this is not always possible. Aside from external floods, internal water and cooling pipes can rupture. For some water or liquid carrying pipework it is possible to have self-sealing mechanisms to provide protection.
Within the server room or data centre the other source of water is the air conditioning or cooling system. A fault or incorrect setting can lead to more humidity in the air and droplets of moisture. These can spill out of the a/c unit or overflow the drip tray underneath.
A water leakage rope run around a server room or within critical areas of the data centre will trigger an alarm condition if it detects even small droplets of water. The alarm notification provides an opportunity to inspect the source of the water and take corrective action before major damage occurs.
Power Protection Systems
A power protection plan may rely on an uninterruptible power supply (UPS) or a combination of a UPS and a standby generator system. The UPS will provide protection from mains-borne electrical pollution and provide battery backup when the mains power supply fails. The battery runtime will be limited, and a standby power generator will provide a source of AC power to the UPS for prolonged power outages.
In terms of monitoring, uninterruptible power supplies for server room and data centre applications can be monitored via SNMP interfaces. The communications and information provided can be brought into a data centre infrastructure management (DCIM) package or UPS monitoring software supplied by the UPS manufacturer. The UPS system may also have a relay card allowing digital inputs (DI) to be taken into an environment monitoring system. If the UPS alarms, the monitoring system can detect the alarm and initiate an alarm script. A similar setup can be achieved for standby generators installed with a suitable interface. Additional sensors can be connected to the environment monitor to detect power failure, and even fuel tank levels for a standby power generator.
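The relay-card arrangement described above can be sketched as a comparison between two scans of the digital inputs, with a callback fired on each change. This Python example is illustrative only: the channel numbers, alarm names and the `process_scan` helper are assumptions, and a real environment monitor would supply its own input readings and alarm scripts.

```python
from typing import Callable, Dict

# Hypothetical mapping of relay-card contacts to alarm names.
DI_LABELS = {
    0: "ups_on_battery",
    1: "ups_battery_low",
    2: "ups_general_alarm",
    3: "generator_running",
}


def changed_inputs(previous: Dict[int, bool],
                   current: Dict[int, bool]) -> Dict[int, bool]:
    """Return only the channels whose state differs from the last scan."""
    return {ch: state for ch, state in current.items()
            if previous.get(ch, False) != state}


def process_scan(previous: Dict[int, bool], current: Dict[int, bool],
                 on_alarm: Callable[[str], None],
                 on_clear: Callable[[str], None]) -> None:
    """Call on_alarm or on_clear once for each input that changed state."""
    for channel, active in sorted(changed_inputs(previous, current).items()):
        label = DI_LABELS.get(channel, f"input_{channel}")
        if active:
            on_alarm(label)   # stand-in for the monitor's alarm script
        else:
            on_clear(label)


# Example: the UPS switches to battery between two scans.
events = []
process_scan(
    previous={0: False, 1: False, 2: False},
    current={0: True, 1: False, 2: False},
    on_alarm=lambda name: events.append(("ALARM", name)),
    on_clear=lambda name: events.append(("CLEAR", name)),
)
print(events)
```

Acting only on state changes, rather than on every active input, avoids repeating the same alarm on each scan while the condition persists.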
UPS Battery Monitoring Systems
Batteries are a critical component within a UPS and generator. In a UPS system, the batteries power the inverter section on mains power supply failure. In a generator, the battery powers the starter motor. UPS battery monitoring systems can be installed to provide a further level of power monitoring. These can be wired or wireless and are connected to each individual battery block within a battery string or battery set. Continuous monitoring of batteries can help to identify potential weaknesses in the battery sets that could very quickly lead to battery deterioration when the battery is placed under load, for example during a mains power failure or when a generator needs to start. The information recorded from each battery block, including temperature, voltage, impedance and ohmic values, can also help predict the optimum time to replace the battery set. In a professionally managed environment, the use of battery monitoring can help to extend the replacement period by 1-2 years.
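One way per-block readings might be used to spot a weak block is to compare each block's impedance with its own baseline and its voltage with the string average. The Python sketch below assumes this approach; the `BlockReading` type, the thresholds (25% impedance rise, 0.3 V deviation) and the sample values are hypothetical and are not taken from any battery manufacturer's guidance.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List


@dataclass
class BlockReading:
    block_id: str
    voltage_v: float
    impedance_mohm: float
    baseline_impedance_mohm: float  # value recorded at commissioning


def flag_weak_blocks(readings: List[BlockReading],
                     max_impedance_rise: float = 0.25,
                     max_voltage_dev_v: float = 0.3) -> Dict[str, List[str]]:
    """Return {block_id: reasons} for blocks outside the assumed limits."""
    if not readings:
        return {}
    avg_v = mean(r.voltage_v for r in readings)
    flagged: Dict[str, List[str]] = {}
    for r in readings:
        reasons = []
        # Fractional rise in impedance relative to the block's baseline.
        rise = ((r.impedance_mohm - r.baseline_impedance_mohm)
                / r.baseline_impedance_mohm)
        if rise > max_impedance_rise:
            reasons.append(f"impedance up {rise:.0%} on baseline")
        if abs(r.voltage_v - avg_v) > max_voltage_dev_v:
            reasons.append("voltage deviates from string average")
        if reasons:
            flagged[r.block_id] = reasons
    return flagged


# Hypothetical string of four 12 V blocks.
string_readings = [
    BlockReading("B1", 13.6, 4.1, 4.0),
    BlockReading("B2", 13.5, 5.2, 4.0),
    BlockReading("B3", 13.6, 4.0, 4.0),
    BlockReading("B4", 12.9, 5.4, 4.0),
]

for block_id, reasons in flag_weak_blocks(string_readings).items():
    print(block_id, "->", "; ".join(reasons))
```

Trending these values over time, rather than checking a single snapshot, is what would support a prediction of when to replace the battery set.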
Summary
A central data centre environmental monitoring system can be used to initiate corrective and preventative actions when it comes to critical infrastructure systems. Key aspects to monitor include temperature, humidity, water leakage, and power protection related metrics to ensure that the cooling and power protection systems are running as they are designed to and ready to provide emergency cover when the mains power supply fails.