Change Control
When making changes to a house, you would not casually knock out a wall to expand a room, because it may be a load-bearing wall. You would not randomly dig a hole in your front yard, because a sewer line may be there. Any modification to your house gets a great deal of thought and planning. Frameworks such as ITIL and Six Sigma are formalized methods that supply the same methodology and thought process for changes to your IT infrastructure. Change Control, even in its most basic form, is critical to the smooth operation and availability of your network services.
An IDC study from 2004 found that operator error accounted for a full 60% of the cases in which service levels were not met. Application failure accounted for 20%, and other non-security events for 15%. Only 5% came from what is traditionally thought of as security-related events (cyber-attacks).
Change Control concerns the often-neglected leg of the CIA triad: Availability. If your IT systems are not available to authorized users, that is a security failure. It is the responsibility of a Security Architect to make sure the systems are always available. If operator error accounts for 60% of the cases in which service levels are not met, then a lot of focus should be put on solving this problem.
There are various frameworks for implementing Change Control, such as ITIL and Six Sigma, and it takes a very mature IT program to implement them fully. To start, focus on reducing operator error, as this is perhaps the most important factor in meeting Service Levels for the Availability piece of the CIA triad. If you set controls to reduce operator error, you may well find that you have reduced the other causes at the same time. Traditionally, IT has put almost all of its money and time into the 5% that are actual security issues (meaning what is commonly thought of as cyber-attacks or hacking). That is time and money poorly spent, and it will not solve the larger problems.
Reducing operator error with Change Control means that, before a change takes place in a production system, due care and due diligence have been performed to make sure the change will succeed, all pertinent parties have been informed of the change, and there is a way to put everything back the way it was if the change fails.
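The three conditions above (diligence done, parties informed, rollback available) can be expressed as a simple pre-change gate. The following is only a minimal sketch in Python: the ChangeRequest record, its field names, and the sample change are hypothetical and do not come from ITIL or any particular ticketing tool.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ChangeRequest:
    """Hypothetical record describing one proposed production change."""
    summary: str
    tested_in_staging: bool = False          # evidence of due diligence
    rollback_plan: str = ""                  # how to put everything back
    stakeholders: list[str] = field(default_factory=list)
    notified: set[str] = field(default_factory=set)


def blocking_issues(change: ChangeRequest) -> list[str]:
    """Return the reasons the change is not ready; an empty list means go."""
    issues = []
    if not change.tested_in_staging:
        issues.append("change has not been tested outside production")
    if not change.rollback_plan.strip():
        issues.append("no rollback plan recorded")
    missing = sorted(set(change.stakeholders) - change.notified)
    if missing:
        issues.append("not yet notified: " + ", ".join(missing))
    return issues


# Illustrative request: tested and reversible, but one party was not told.
request = ChangeRequest(
    summary="Upgrade firmware on core switch",
    tested_in_staging=True,
    rollback_plan="Reflash previous firmware image from backup",
    stakeholders=["network-team", "help-desk"],
    notified={"network-team"},
)
print(blocking_issues(request))  # ['not yet notified: help-desk']
```

Even a checklist this small forces the question of rollback and notification to be answered before the change window opens, rather than during an outage.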
Change Control aimed at operator error also addresses the conditions that make mistakes likely. Can a production console be confused with a development console? Can a habitual command be double-checked before execution? Are network wiring rooms clean and orderly, and does all equipment have accurate labels?
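One common way to double-check a habitual command is to make the operator retype the hostname before anything runs on production. The sketch below assumes a hypothetical run_guarded wrapper and made-up host names; the environment label would have to come from your own inventory, and the wrapper only decides whether the command may run, it does not execute it.

```python
def run_guarded(hostname, environment, command, ask=input):
    """Return the command if it may run, or None if the operator was stopped.

    Non-production hosts pass straight through. On production, the operator
    must retype the exact hostname, which forces a pause and a second look
    at which console is actually in front of them.
    """
    if environment != "production":
        return command
    typed = ask(
        f"PRODUCTION host {hostname}: retype the hostname to run '{command}': "
    )
    if typed.strip() == hostname:
        return command
    return None


# Demo with a scripted reply instead of keyboard input: the operator
# believed they were on the development box, so the command is refused.
result = run_guarded(
    "db-prod-01",
    "production",
    "systemctl restart postgresql",
    ask=lambda prompt: "db-dev-01",
)
print(result)  # None
```

The ask parameter defaults to the built-in input function, so the same function works interactively; passing a different callable keeps it easy to test.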
Lastly, Change Control should provide predictive capabilities, change prevention, change alerts, and remediation. Systems should be in place to detect thresholds that may indicate problematic changes, such as hard drives or circuits nearing full capacity. Systems should be in place to prevent certain kinds of changes (whitelists or read-only states). Systems should also be in place to detect changes and put things back the way they were. Of course, these systems should also notify the appropriate parties when any of this happens.
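Two of these capabilities, threshold detection and detect-then-revert, can be illustrated with the Python standard library alone. This is a rough sketch under stated assumptions: the 90% threshold, the file names, and the notify callback are all illustrative, and a real deployment would more likely rely on dedicated monitoring and configuration management tools. Change prevention (whitelists, read-only states) is not shown.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def disk_nearly_full(path, threshold=0.90):
    """Predictive check: True when the filesystem holding `path` is at or
    above the given fraction of its capacity."""
    usage = shutil.disk_usage(path)
    return usage.used / usage.total >= threshold


def sha256_of(path):
    """Hex digest of a file's contents, used as its fingerprint."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def detect_and_revert(live_file, known_good_file, notify=print):
    """Restore the known-good copy if the live file is missing or differs.

    Returns True when a revert happened, False when nothing had changed.
    """
    live = Path(live_file)
    drifted = not live.exists() or sha256_of(live) != sha256_of(known_good_file)
    if not drifted:
        return False
    shutil.copyfile(known_good_file, live)
    notify(f"Unapproved change to {live} detected and reverted")
    return True


# Demo in a throwaway directory: an unapproved line is added and rolled back.
with tempfile.TemporaryDirectory() as tmp:
    approved = Path(tmp, "router.conf.approved")
    live = Path(tmp, "router.conf")
    approved.write_text("hostname core-rtr\n")
    live.write_text("hostname core-rtr\nip route 0.0.0.0 0.0.0.0 10.9.9.9\n")
    detect_and_revert(live, approved)
```

Run on a schedule, checks like these cover the alerting and remediation pieces: the threshold check warns before capacity runs out, and the hash comparison notices a change nobody approved and restores the last known good state.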
Implementing a Change Control process with all these characteristics will also provide a powerful set of mitigations to the remaining reasons for a lack of availability. Application failures (the 20%) can be prevented by predictive change detection and automated change reversal to a known good state. Other events (the 15%) and actual cyber-attacks (the 5%) can be prevented by the change prevention systems.
Least Privilege is your design and Change Control is your measuring tape. You can’t build secure IT systems without both.