Why Remote Monitoring Solutions Are Important For Business Continuity
Earlier this year, almost every organisation in the UK had to initiate its business continuity plan and move to widescale remote working in a bid to protect the NHS and general population from COVID-19. For many, the speed and scale of the move increased the pressure on already stretched IT teams. It also brought a greater focus to an area usually associated with larger-scale datacentre operations, and one that has typically received more investment there than in small-to-medium sized computer or server room installations: remote monitoring solutions.
Remote Monitoring Best Practice
For computer and server room operators, datacentres often set the standards, including those for information security and energy efficiency. Though smaller in scale than a datacentre, on-site IT networks will be supported by similar critical infrastructure systems. Examples include air conditioners and uninterruptible power supplies and/or local standby power generation. Fire suppression systems may also be installed, though this is less likely for a small on-site computer room with only one or two servers. A fire suppression system is typically installed when there are more servers and racks within a server room, due to the confined space and higher power demand.
In more normal times, IT managers may have the luxury of being able to review the best practices deployed by datacentre managers. The use of data centre infrastructure management (DCIM) packages like Schneider EcoStruxure within a datacentre environment is a good example. This comprehensive package provides a complete overview of the entire IT, power and cooling estate. Because it supports multiple protocols, almost any device can be connected to the platform for remote monitoring, control and support.
The speed and scale with which organisations have had to move to remote working prevented this and meant that quick solutions had to be identified and deployed. With fewer workers on-site, there was less opportunity for smaller organisations to identify on-site alarms and respond to them accordingly. Added to this were travel restrictions, which prevented some IT personnel from being able to easily return to their sites in an emergency for an alarm or system reset and problem diagnosis.
Fortunately, a solution already existed, and one that is well developed and in use in far more environments than purely IT. Remote monitoring is a specialist niche within the wider IT marketplace, with solutions for monitoring a wide range of environmental factors not just within computer and server rooms, but also in datacentre, industrial, retail and food distribution, pharmaceutical and telecoms applications.
On-Premise Environment Monitoring Options
Most environment monitoring devices can be installed within a relatively short time frame. Typical examples include the STE2 and Room Alert 4ER. These devices are powered via an AC adapter or can use Power over Ethernet (PoE). They have a built-in webserver to assist connection to the local network. Sensors are either built-in, for example for temperature, or can be connected via external plug-in sensors for temperature, temperature & humidity, water leakage, smoke, fire, air flow and even power outages.
The on-premise software packages available provide similar features to DCIM packages. They provide an overview of the IT environment based on the sensors and detectors installed, but not on the holistic scale required for a larger datacentre. DCIM packages provide more comprehensive information, covering cooling and power usage per server rack, to better assist loading and capacity planning.
Smaller organisations may describe their IT operations as a datacentre and to some degree they are right. A datacentre is a managed and secure environment in which to run IT servers. Computer and server rooms will have some if not all the critical infrastructure elements of a datacentre and most will have air conditioning and some form of uninterruptible power. The differences are ones of scale, and the comprehensiveness of their deployed infrastructure solutions as well as their resilience and levels of N+(x) redundancy.
Environment monitoring may be overlooked as a necessity, however. In a typical set-up, IT system components including servers, storage devices and networking switches will be housed in server racks and the racks will be arranged into an array that makes best use of the local air conditioning and cooling. For most this will be a wall mounted air conditioner. Uninterruptible power supplies may also be deployed to provide emergency backup power if the mains power supply fails. The UPS may also be rack mounted or installed as a floor standing tower system in such a way as to provide power to the server racks and their power distribution units (PDUs).
One of the most monitored critical infrastructure devices within the server room will be the uninterruptible power supply. This will typically be installed with a slot-in SNMP card to allow remote monitoring, either directly via a web browser over HTTPS or through a locally installed UPS monitoring and control software package.
Most air conditioners are installed without any remote monitoring, even though most can provide signal contact status and alarms via a plug-in interface card. More modern systems have Wi-Fi capability and can send remote alarms via a mobile app.
During normal working hours and conditions both approaches may be sufficient. In addition, a UPS system and air conditioner will provide visual and audible alarms via their front panels. These alarms may or may not be noticed as employees pass by computer or server rooms.
The addition of a dedicated environment monitoring solution into this type of installation provides a centralised platform and alert system. Sensors for specific concerns can be added to the monitoring device and local on-premise software used to monitor over the local network and send alert messages when readings move outside pre-set ‘normal operation’ ranges.
Temperature is the most monitored aspect. An over-temperature condition can indicate that the local air conditioning has failed or is not performing to specification. Higher temperatures in a server room may or may not be critical for short periods, especially in unmanned rooms. However, temperatures above 25°C can start to ‘cook’ UPS batteries and lead to increased component failures. If the general room ambient is 25°C, there could be far hotter ‘hot-spots’ inside server racks which, if left unchecked, could lead to potential fire risks.
Humidity and water leakage are other areas that are important to monitor. Higher levels of humidity can lead to increased condensation on cooler areas, with the creation of liquid pools which can lead to a short-circuit. Water leakage from poor cooling infrastructure or local plumbing bursts can also disrupt IT operations.
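The range-based alerting described above can be illustrated with a minimal Python sketch. It is not the logic of any particular monitoring product. The 25°C upper temperature limit comes from the discussion above, while the lower temperature limit, the humidity range and all names (SensorRange, RANGES, check_readings) are assumptions made for illustration only.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class SensorRange:
    """Pre-set 'normal operation' range for one sensor type."""
    low: float
    high: float
    unit: str


# Hypothetical limits: only the 25 °C upper bound is taken from the article.
RANGES = {
    "temperature": SensorRange(low=18.0, high=25.0, unit="°C"),
    "humidity": SensorRange(low=40.0, high=60.0, unit="%RH"),
}


def check_readings(readings: dict[str, float]) -> list[str]:
    """Return one alert message per reading that is outside its range."""
    alerts = []
    for sensor, value in readings.items():
        limits = RANGES.get(sensor)
        if limits is None:
            continue  # no range configured for this sensor, so nothing to check
        if value < limits.low:
            alerts.append(
                f"{sensor} below range: {value}{limits.unit} "
                f"(minimum {limits.low}{limits.unit})"
            )
        elif value > limits.high:
            alerts.append(
                f"{sensor} above range: {value}{limits.unit} "
                f"(maximum {limits.high}{limits.unit})"
            )
    return alerts


if __name__ == "__main__":
    for message in check_readings({"temperature": 27.5, "humidity": 50.0}):
        print(message)
```

In practice the ranges would be set per room or per rack, since a hot-spot inside a rack can sit well above the room ambient.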
Remote Access Cloud Platform Software
Most environment monitoring platforms offer on-premise and Cloud-based software portals. Cloud-based monitoring portals remove the need to VPN into a local network and are more easily monitored remotely than LAN-based platforms. Cloud-based systems can also provide more functionality when managing multiple locations and estates, such as geographic maps and more comprehensive dashboards.
Both types can be configured to provide email alert status updates for any of the sensors installed. If the IP/Ethernet network they are connected to goes down, a ‘disconnected’ alert is provided.
In addition, it may be possible to configure SMS text alerts and phone calls via a software platform. These may be actioned via an SMS gateway connected to the network or an email-to-SMS service. With the latter, the Cloud monitoring portal sends an email to a third-party platform, where a subscription account is set up to convert the email to an SMS text and distribute it to a defined list of mobile phone numbers. For IT managers swamped with emails, a text alert can be noticed and responded to more quickly.
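One common way to produce a ‘disconnected’ alert is for the portal to record when each monitoring device last reported in and to flag any device that has been silent for too long. The Python sketch below shows that idea only. The five-minute timeout, the device names and the function name find_disconnected are all hypothetical, and real platforms may detect disconnection differently.

```python
from __future__ import annotations

from datetime import datetime, timedelta, timezone

# Assumed timeout: a device silent for longer than this is treated as disconnected.
HEARTBEAT_TIMEOUT = timedelta(minutes=5)


def find_disconnected(
    last_seen: dict[str, datetime],
    now: datetime,
    timeout: timedelta = HEARTBEAT_TIMEOUT,
) -> list[str]:
    """Return the names of devices whose last report is older than the timeout."""
    return sorted(
        device for device, seen in last_seen.items() if now - seen > timeout
    )


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    last_seen = {
        "server-room-a": now - timedelta(minutes=2),   # reporting normally
        "server-room-b": now - timedelta(minutes=10),  # silent for too long
    }
    for device in find_disconnected(last_seen, now):
        print(f"ALERT: {device} is disconnected")
```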
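The email-to-SMS route can be sketched in Python using only the standard library. The sketch assumes a gateway that accepts mail addressed to number@gateway-domain, which is a common pattern but not universal, so the addressing format of the chosen service should be checked. The domain, sender address, phone number and function name build_sms_emails are placeholders. The code only builds the messages; sending them (for example with smtplib) is left out.

```python
from __future__ import annotations

from email.message import EmailMessage

SMS_SEGMENT_LENGTH = 160  # length of a single standard SMS segment


def build_sms_emails(
    alert_text: str,
    phone_numbers: list[str],
    gateway_domain: str,
    sender: str,
) -> list[EmailMessage]:
    """Build one email per recipient for a hypothetical email-to-SMS gateway."""
    messages = []
    for number in phone_numbers:
        msg = EmailMessage()
        msg["From"] = sender
        # Assumed addressing scheme: <mobile number>@<gateway domain>
        msg["To"] = f"{number}@{gateway_domain}"
        msg["Subject"] = "Server room alert"
        # Truncate so the alert fits in a single text message.
        msg.set_content(alert_text[:SMS_SEGMENT_LENGTH])
        messages.append(msg)
    return messages


if __name__ == "__main__":
    emails = build_sms_emails(
        alert_text="Server room temperature above 25 C",
        phone_numbers=["07700900123"],          # placeholder number
        gateway_domain="sms-gateway.example",   # placeholder domain
        sender="alerts@example.org",            # placeholder sender
    )
    for email in emails:
        print(email["To"], "->", email.get_content().strip())
```

Keeping the alert text short matters here, because the most useful part of the message needs to be readable on a phone lock screen.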
Additional Information Security Considerations
In addition to monitoring the local environment, it is also important that a Business Continuity plan covers other aspects that can affect information security, availability, and uptime:
- Network Security: increased remote working can require the granting of wider network access to more employees than would normally be office based. This can put a strain on existing broadband configurations and require increasing the width of the ‘pipes’ into the IT system. Network firewalls, password and access policies, USB usage, and anti-virus software applications should also be reviewed against framework standards such as ISO/IEC 27001 Information Security Management or the UK government backed Cyber Essentials certification.
- Data Backup: multiple data backup procedures provide resilience in the event of a data breach or data loss. Cloud-based services are commonly deployed now and provide rolling and incremental backups. Remote working practices should be reviewed with regard to locally stored data and system usage to ensure that critical data worked on remotely is also backed up.
- Secure Access Facilities: it is always best practice to ensure that access to a computer or server room is controlled, and this can also be extended to individual server racks. Not only does this restrict access to authorised credentials, it also provides an audit trail of access activity. In addition, local IP cameras with motion detection can be deployed to provide additional security.
- Asset Maintenance: uninterruptible power supplies, air conditioners and fire suppression systems should be under service contracts. These critical systems require annual inspections and have consumable items that must be inspected and replaced to ensure system integrity. Typical examples include batteries, fans, and filters. A local standby generator if installed as part of the critical power protection system may also require 6-monthly inspection and regular start-ups. During lockdowns it is vital that these systems are maintained and that preventative maintenance visits are not postponed indefinitely. Engineers for these types of systems are typically classed as ‘critical workers’ and so can travel to sites to provide routine or emergency cover under appropriate risk assessment and method statements.
- Unified Communications Platforms: there are several available to help organisations communicate over the internet. Microsoft Teams is one of the most popular platforms but there are others including Skype, Slack, Google Chat, Google Hangouts and Zoom. It is important for the organisation to standardise on a single platform for all employees and to ensure that it is secure, given the amount and type of information that may be shared over the platform. Having a unified approach here can also help with employee welfare and the building of team spirit and connectivity during extended periods of remote working.
For more information on Information Security Management and Cyber Security standards visit: https://www.bsigroup.com/en-GB/iso-27001-information-security/ or https://www.gov.uk/government/publications/cyber-essentials-scheme-overview.
Summary
Business Continuity plans have rarely been flexed or implemented on the scale seen earlier this year. The move to remote working may be temporary or could become the norm for many organisations. Some global multinationals have already stated that their employees will not return to work in their office buildings until at least 2021, if ever. The change in how we work means fewer people on site and a need for remote monitoring solutions for IT infrastructure. With such monitoring systems in place, IT managers are in a stronger position to maintain the availability and uptime of their computer and server rooms and to prevent downtime from air conditioning failures and over-temperatures, humidity problems, water leakages and other environmental factors that could disrupt operations and services.