Critical IT systems and data require high-availability Datacenters with redundant, fault-tolerant infrastructure, well-organized management, and reliable, thorough maintenance procedures. A Datacenter can only deliver high availability and redundancy if its maintenance procedures are defined and the maintenance team is trained to use them. With Datacenter Processes and Procedures in place, the Datacenter Operations and Technical teams know the health status of the Datacenter at any point in time and can respond in a timely manner to any event that occurs.
Comprehensive Maintenance Processes and Procedures are a key component in reducing Datacenter outages caused by human error. However, they are only effective when they are complete, up to date, accurate, and precisely followed.
Datacenter technical specialists must be trained, not only on the devices and systems of the Datacenter infrastructure but also on the different processes, procedures, and policies in place. For procedures in particular, the technical specialists and Datacenter operations staff must know why it is important to use them, where to find them, and how they must be used. If the maintenance procedures are complete and accurate and the technicians follow a disciplined approach to using them, the risk of an outage caused by human error will be minimized.
The Methodology for Datacenter Maintenance was developed based on over 20 years of experience in IT, Project Management and Datacenter Management & Maintenance. Over the years, the need arose to standardize processes and procedures so that a high standard of quality is maintained throughout the Datacenter's lifetime.
All processes and procedures are customized together with the respective Datacenter Operations team and the Datacenter Technical Specialists to obtain their input and buy-in and to ensure a smooth transition when the procedures are implemented.
Over time, the procedures are updated to reflect new requirements and lessons learned. The key is to keep procedures to the point: overly long procedures and excessive detail may motivate technicians to cut corners, rush through steps, or lose focus. Constant training, peer review and quality checks are required because technicians may become bored or fatigued, or may attempt to simplify the procedure. Overly complex layouts may frustrate a technician who must search the document for the right piece of information or for subsequent steps, which can lead to errors.
The Datacenter Maintenance Methodology consists of four layers:
Policies are the basis for all processes and procedures and supersede any process or procedure within the organisation. In some cases policies are defined outside the department that manages the Datacenter and apply to a wider part of the organisation.
Processes are divided into Core Processes, which apply to all Datacenter Projects and Operations, and Operational Processes, which apply specifically to Datacenter Operations.
Procedures describe activities triggered by individual processes and relate to specific systems or devices. Each system has procedures for different maintenance levels.
Records are the documented outputs of processes. All records are kept as proof and evidence of the activities performed. Records include memos, service reports, request forms, confidentiality agreements, etc.
All policies and procedures should be reviewed regularly, at intervals set by their defined life cycle, to ensure that they are up to date and relevant. They should also be reviewed after any change to the Datacenter infrastructure so that they reflect the new configuration and new requirements.
For each process and procedure there is a peer review and approval process, followed by training or a pilot run that rehearses and verifies the procedure through dry runs or walk-throughs before it is executed.
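The two review triggers described above (a life-cycle interval and a change to the infrastructure) could be tracked with a small script. The following is only a minimal sketch under stated assumptions: the class `ControlledDocument`, its fields, the function `review_due`, and the 365-day interval are hypothetical names and values chosen for illustration and are not part of the methodology itself.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class ControlledDocument:
    """A policy, process, or procedure under review control (hypothetical model)."""
    title: str
    last_reviewed: date
    review_interval_days: int  # assumed life-cycle length; the text gives no value


def review_due(doc: ControlledDocument, today: date,
               last_infra_change: Optional[date] = None) -> bool:
    """Return True if the document should be reviewed.

    Trigger 1: the life-cycle interval has elapsed since the last review.
    Trigger 2: the infrastructure changed after the last review.
    """
    interval_elapsed = (today - doc.last_reviewed
                        >= timedelta(days=doc.review_interval_days))
    changed_since_review = (last_infra_change is not None
                            and last_infra_change > doc.last_reviewed)
    return interval_elapsed or changed_since_review


# Illustrative data only; the title and dates are made up for this sketch.
ups_procedure = ControlledDocument(
    title="UPS maintenance procedure",
    last_reviewed=date(2024, 1, 10),
    review_interval_days=365,
)

# Within the interval and no infrastructure change: no review needed yet.
print(review_due(ups_procedure, today=date(2024, 6, 1)))
# An infrastructure change after the last review forces a review.
print(review_due(ups_procedure, today=date(2024, 6, 1),
                 last_infra_change=date(2024, 3, 1)))
```

A check like this only flags that a review is due. The peer review, approval, and dry run remain manual steps carried out by the Datacenter Operations team and Technical Specialists.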