In the first article, we talked about the importance of network inventory as a first step in ensuring network consistency and security. In the second article, we covered how regular network configuration audits contribute to modern network management.
Configuration Changes: The Primary Source of Network Failures
Given the growing complexity, scale, and importance of enterprise networks, it is not surprising that they are also a growing source of failures. Numerous studies have examined the causes of these failures. Sure, there are link failures, power failures, and device component failures. But the biggest source, on the eye-popping order of 40% to 80%, is human error (humorously named Finger Defined Networking by Jason Davis, a network management expert at Cisco).
Configuration backup is an important element of disaster recovery, no matter how big the disaster is. It could be the failure of a single device, a lightning storm that destroys multiple devices, or a regional event like a hurricane or earthquake.
In this article we’ll look at some of the details of effective configuration management, the tradeoffs involved, and how you can avoid network failures in the future.
Intelligent Configuration Management: Start with a Config Backup
Tracking network configuration changes is critical to understanding the state of your network and any potential threats, regardless of whether your network consists of tens, hundreds, or thousands of devices. It starts with developing a configuration backup mechanism that can detect each configuration change and, when needed, restore your “golden” or optimal configuration.
Network changes can be detected in multiple ways. Two low impact, lightweight mechanisms are syslog and SNMP.
With syslog, a device reports a configuration change in a syslog message, and the syslog processing system triggers a configuration backup task. There are some pros and cons to this method. Syslog can be configured to use either UDP or TCP for its transport, and UDP messages are not guaranteed to reach the management platform. If syslog is used, it is therefore useful for the management platform to fall back to one of the other mechanisms below. The advantage of syslog is that it scales up very well, because only changed configurations need to be retrieved.
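As a rough illustration of the syslog trigger, the sketch below scans received messages for a configuration-change event and returns the devices that need a backup. It assumes Cisco IOS-style `%SYS-5-CONFIG_I` messages and a hypothetical collector that prefixes each line with the device hostname; other vendors use different message formats, so the pattern would need adjusting.

```python
import re

# Assumed pattern: Cisco IOS-style config-change message; other vendors differ.
CONFIG_CHANGE = re.compile(r"%SYS-5-CONFIG_I")


def devices_needing_backup(syslog_lines):
    """Return hosts that reported a configuration change, in first-seen order."""
    hosts = []
    for line in syslog_lines:
        if not CONFIG_CHANGE.search(line):
            continue
        # Assumption: the (hypothetical) collector writes "<host> <message>".
        host = line.split(maxsplit=1)[0]
        if host not in hosts:
            hosts.append(host)
    return hosts


sample = [
    "core-sw1 Mar  1 10:00:01: %SYS-5-CONFIG_I: Configured from console by admin on vty0 (10.0.0.5)",
    "core-sw1 Mar  1 10:00:09: %LINK-3-UPDOWN: Interface Gi0/1, changed state to up",
    "edge-rtr2 Mar  1 10:02:30: %SYS-5-CONFIG_I: Configured from console by ops on vty1 (10.0.0.6)",
]
print(devices_needing_backup(sample))
```

Because a UDP message can be lost, a device that never appears in this list is not proof that its configuration is unchanged, which is why a periodic fallback remains useful.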
Another low-impact mechanism relies on an SNMP variable or CLI text string that indicates the timestamp of the last change. The backup mechanism tracks the timestamp and downloads a new configuration only when the timestamp is updated. There are some potential caveats with this mechanism. The first is that the network device’s configuration may not have actually changed, because simply entering and exiting configuration mode will update the timestamp. To make this mechanism reliable, it must detect the new timestamp, download the configuration, and compare it with the previous configuration to determine whether any changes were made. A configuration change notification is generated only when an actual change is detected. An advantage of the timestamp-triggered mechanism is that it keeps downloads to a minimum, since only devices with an updated timestamp are retrieved. The disadvantage is that it requires SNMP or parsing CLI output to detect changes. It also requires polling every network device, and that polling load may become a problem when scaling up to very large networks.
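The timestamp-triggered check can be sketched as follows. This is a minimal illustration, not any product’s implementation: `read_timestamp` and `fetch_config` are hypothetical callables standing in for whatever SNMP query or CLI command your platform uses.

```python
def check_device(state, read_timestamp, fetch_config):
    """Poll one device; return True only when its configuration text changed.

    state: dict holding the last seen "timestamp" and "config" for the device.
    read_timestamp, fetch_config: hypothetical callables standing in for an
    SNMP query or CLI command; they are not part of any real library.
    """
    ts = read_timestamp()
    if ts == state.get("timestamp"):
        return False  # timestamp unchanged: skip the download entirely
    state["timestamp"] = ts
    new_config = fetch_config()
    if new_config == state.get("config"):
        return False  # config mode was entered and exited without an edit
    state["config"] = new_config
    return True  # a real change: raise the notification here


state = {"timestamp": 100, "config": "hostname r1\n"}
print(check_device(state, lambda: 100, lambda: "hostname r1\n"))      # timestamp unchanged
print(check_device(state, lambda: 200, lambda: "hostname r1\n"))      # new timestamp, same text
print(check_device(state, lambda: 300, lambda: "hostname r1-new\n"))  # real change
```

The second call shows the caveat described above: the timestamp moved, but no notification is generated because the downloaded text matches the stored copy.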
A simpler mechanism skips the timestamp checking and simply downloads a new configuration on a periodic basis, like every day or every few hours. This mechanism will obviously have problems when scaling up to handle very large networks. Its advantage is that it captures any configuration changes.
With any of the above mechanisms, the new configuration should be compared with the old configuration to verify that a change was actually performed. There is no reason to store a downloaded configuration if it is identical to the previously retrieved configuration.
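One simple way to avoid storing identical copies is to compare a hash of each downloaded configuration with the hash of the last stored version. The sketch below assumes an in-memory archive; the `ConfigArchive` class and its method names are illustrative only.

```python
import hashlib


class ConfigArchive:
    """Keep every distinct configuration version per device (in memory)."""

    def __init__(self):
        # device name -> list of (sha256 hex digest, config text), oldest first
        self._versions = {}

    def save_if_changed(self, device, config_text):
        """Store config_text only if it differs from the latest stored version."""
        digest = hashlib.sha256(config_text.encode("utf-8")).hexdigest()
        history = self._versions.setdefault(device, [])
        if history and history[-1][0] == digest:
            return False  # identical to the previous retrieval: nothing to store
        history.append((digest, config_text))
        return True

    def version_count(self, device):
        return len(self._versions.get(device, []))


archive = ConfigArchive()
print(archive.save_if_changed("r1", "hostname r1\n"))      # first copy is stored
print(archive.save_if_changed("r1", "hostname r1\n"))      # duplicate is skipped
print(archive.save_if_changed("r1", "hostname r1-new\n"))  # change is stored
print(archive.version_count("r1"))
```

Real configurations may contain volatile lines, such as embedded timestamps, that would need to be stripped before hashing so they do not register as changes.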
The scaling problems can often be resolved by using a distributed configuration management system. Very large networks are naturally divided into regions and there are significant advantages to deploying one configuration manager per region.
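As a small sketch of the regional split, the function below groups an inventory by region so that each regional configuration manager polls only its own devices. The inventory format and region names are assumptions made for illustration.

```python
from collections import defaultdict


def partition_by_region(inventory):
    """Group (device, region) pairs into one work list per regional manager."""
    work = defaultdict(list)
    for device, region in inventory:
        work[region].append(device)
    return dict(work)


# Hypothetical inventory: device name paired with the region it belongs to.
inventory = [
    ("nyc-core1", "americas"),
    ("lon-edge1", "emea"),
    ("nyc-fw1", "americas"),
    ("sgp-sw1", "apac"),
]
for region, devices in partition_by_region(inventory).items():
    print(region, devices)
```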
It is a good idea to store all backup configurations for as long as a device is deployed. Disk storage is inexpensive enough to hold all the configurations you’ll ever make.
Tracking Configuration Changes
Once the configurations are regularly saved, you can use these backups to help reduce the extent and duration of network outages. The configuration management system should create alerts whenever a configuration change is found. Changes are typically displayed using a split-screen reviewer in which the original configuration is in one window and the new configuration is in the second window, with the changes highlighted. The most common highlighting uses one color each for deleted lines, new lines, and changed lines (often red, green, and a third color, respectively).
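The classification behind such a display can be approximated with Python’s standard `difflib` module. The sketch below sorts differences into deleted, added, and changed lines; a real reviewer would add the rendering and color-coding on top, and the sample configurations are invented.

```python
import difflib


def classify_changes(old_config, new_config):
    """Split the differences between two configs into deleted/added/changed."""
    old_lines = old_config.splitlines()
    new_lines = new_config.splitlines()
    result = {"deleted": [], "added": [], "changed": []}
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "delete":
            result["deleted"].extend(old_lines[i1:i2])
        elif tag == "insert":
            result["added"].extend(new_lines[j1:j2])
        elif tag == "replace":
            # Pair the old block with the new block that replaced it.
            result["changed"].append((old_lines[i1:i2], new_lines[j1:j2]))
    return result


old = "hostname r1\nntp server 10.0.0.1\nlogging host 10.0.0.9\n"
new = "hostname r1\nntp server 10.0.0.2\nlogging host 10.0.0.9\nsnmp-server location lab\n"
for kind, lines in classify_changes(old, new).items():
    print(kind, lines)
```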
Since operator error is the most common source of network outages, it makes sense to check for recent configuration changes when a network failure occurs. Be aware that some configuration changes may not create a problem when they are made; they become a problem only when another configuration change occurs or when a device reboots. Delayed failure can be a problem with functions that don’t incorporate pre-emption functionality, like root bridge selection.
Tracking changes is also a great way to determine where automation will provide the most benefit. What parts of the configuration change most? Is it an ACL or something else? Are the changes applied across many devices, or only to a few devices? You can also identify which devices are being changed the most. These are all very valuable pieces of information for planning how to apply automation and reduce manual processes.
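These questions can be answered with a simple tally over recorded changes. The sketch below assumes a hypothetical change log of (device, changed line) pairs and uses the first word of each line as a rough stand-in for the configuration section.

```python
from collections import Counter, defaultdict


def summarize_changes(change_log):
    """Count changes per config keyword and per device, plus keyword spread."""
    by_keyword = Counter()
    by_device = Counter()
    devices_per_keyword = defaultdict(set)
    for device, line in change_log:
        words = line.split()
        if not words:
            continue  # ignore blank lines
        keyword = words[0]  # rough proxy for the config section
        by_keyword[keyword] += 1
        by_device[device] += 1
        devices_per_keyword[keyword].add(device)
    # spread: how many distinct devices each kind of change touched
    spread = {k: len(v) for k, v in devices_per_keyword.items()}
    return by_keyword, by_device, spread


# Hypothetical change records for illustration only.
change_log = [
    ("fw1", "access-list 101 permit tcp any host 10.0.0.5 eq 443"),
    ("fw1", "access-list 101 deny ip any any"),
    ("fw2", "access-list 102 permit udp any any eq 53"),
    ("sw1", "ntp server 10.0.0.1"),
]
by_keyword, by_device, spread = summarize_changes(change_log)
print(by_keyword.most_common(1))
print(by_device.most_common(1))
print(spread)
```

In this invented sample, ACL edits dominate and touch more than one device, which would make them a natural first candidate for automation.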
Gluware Configuration Drift
The Gluware Configuration Drift and Audit application facilitates configuration compliance and change tracking. The combination is a very powerful tool for standardizing configurations and tracking their changes. We discussed the use of configuration audits to standardize configurations in the prior article Put Configuration Audits at the Center of Your Network Management.
Tracking changes is applicable across all configuration management, regardless of standardization. It allows you to quickly answer the question of “What changed about my configuration?” It works for existing (brownfield) networks, newly built (greenfield) networks, and the tapestry of old and new hardware, software, and firmware that makes up today’s large enterprise networks.
With Config Drift, downloaded configurations are marked as valid snapshots, which form a baseline. Config Drift identifies any changes that are made to that baseline. Devices with configuration changes are easily identified, and the side-by-side, two-window display with color-coding makes it easy to find what changed. Configurations are retrieved on schedules that you define, such as once a day. If a network problem occurs, it is easy to perform an on-demand configuration retrieval on groups of devices to check for changes. Configurations can be saved as the golden standard for future reference as well.
Network engineers and operations teams responsible for many devices will be concerned about the scalability of any solution. The Gluware Intelligent Network Automation architecture supports the use of multiple backup servers that are based on easy-to-install virtual machines. It can handle configuration management for thousands of devices from a central console.
Multi-vendor coverage is also a key requirement. Few networks are based on a single vendor’s equipment. There are frequently major differences between device types (routers, switches, firewalls, load balancers, wireless controllers, etc.), even within a single vendor’s portfolio. Gluware covers 16 vendors and 21 operating systems, making it the best configuration management solution in the industry.
The benefits don’t stop within your organization. Monitoring outsourced configuration management service providers can give you visibility into the frequency, timeliness, and accuracy of their changes. Regulatory compliance is another area in which configuration documentation is valuable. There is significant value in being able to demonstrate configuration control and to provide snapshots of configurations at any point in time to IT auditors.
Gluware Config Drift and Audit is an ideal solution for reducing the burden of configuration management. The combination of configuration audit and drift provides powerful tools for tracking changes and getting configuration management under control.
Interested in learning more?
Try Gluware for yourself – Request a Test Drive