As corporate networks grow in complexity and importance, proper network change and configuration management (NCCM) plays an increasingly important role in keeping these networks running optimally, a recent study released by Enterprise Management Associates (EMA) concluded.
"Originally, all of this configuration stuff was ignored; you set it and you forget it," said Jeffrey Nudler, senior analyst at EMA. "Unfortunately, in a complex network with self-healing devices, when a system reconfigures itself, it does it much faster than any human can keep up and you can quickly run into a million-dollar problem."
Self-healing devices are cropping up everywhere in large corporate networks: routers that automatically switch traffic off congested ports, wireless connections that automatically re-route when reception is lost, and dozens of other devices that attempt to logically fix problems in real time. Unfortunately, what may work well in isolation can cause havoc as system administrators play catch-up with devices that don't file help tickets, don't log their complaints, and take no more than a split second to decide what to do and then do it.
Worse still, these self-correcting changes often snowball across networks as intelligent devices react to one another and create disruptions in the network.
"In a large system, there can be thousands of devices changed a day," said Nudler, adding that without constant monitoring and management, the whole IT infrastructure could tumble into instability.
But it is not just automation that poses a problem: Human error, whether in unauthorised changes or simply undocumented upgrades, can wreak just as much havoc in a system.
"Trying to get people to document everything they're doing is impossible," said Mike Pennacchi, executive network analyst for Network Protocol Specialists. He said even military-grade networks he has seen have sparse documentation on network modifications.
Compounding that problem, Pennacchi said, is the proliferation of non-interoperable protocols, with devices often requiring Telnet or SNMP connections just to be modified, which makes standardisation of management an imposing task.
Pennacchi said that there have traditionally been two NCCM methodologies. One is a proactive approach that carefully vets network changes before they are made, documenting decisions and conservatively updating the network as needed. The other, more reactive stance includes active monitoring and responsiveness, documenting changes after the fact.
The two methods are not mutually exclusive, but both invariably run into documentation problems, Pennacchi said, as few solutions currently monitor all changes being made in a mixed-vendor network.
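The reactive approach, documenting changes after the fact, can be illustrated with a minimal sketch. It assumes a device's configuration can be retrieved as plain text and compared with the last documented copy. The function name `config_changes` and the sample configuration lines are hypothetical, and the sketch uses Python's standard `difflib` module rather than any vendor's tool.

```python
import difflib


def config_changes(documented: str, running: str) -> list[str]:
    """Return the lines removed from ("-") or added to ("+") the documented config."""
    diff = list(
        difflib.unified_diff(
            documented.splitlines(),
            running.splitlines(),
            fromfile="documented",
            tofile="running",
            lineterm="",
            n=0,  # no context lines, only the changes themselves
        )
    )
    # The first two lines are the file headers; "@@" lines mark hunk positions.
    return [line for line in diff[2:] if not line.startswith("@@")]


# Hypothetical configs: the MTU was changed without being documented.
documented = "hostname r1\ninterface eth0\n mtu 1500"
running = "hostname r1\ninterface eth0\n mtu 9000"
changes = config_changes(documented, running)
```

An empty result would mean the running configuration still matches the documentation; anything else is a change that someone, or some device, made without recording it.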
Why is documentation so important? According to Nudler, 60% of downtime is due to human error, and in one case he reviewed, one in eight changes led to "catastrophic errors" severely disabling the network.
Such disastrous consequences often make it preferable to "roll back" to a functioning state rather than to try to fix the problem, but Pennacchi said that ability was often missing from common deployments and that there was no solution available to take "deep freeze" snapshots of a configuration and later restore it.
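The idea of snapshot and rollback can be sketched in a few lines, assuming each device's configuration is available as a text blob. The class `ConfigHistory` and its methods are hypothetical names for illustration, not part of any product mentioned here, and the sketch only tracks configurations in memory; pushing a restored configuration back to a real device is out of scope.

```python
from dataclasses import dataclass, field


@dataclass
class ConfigHistory:
    """Keeps every saved configuration of one device, oldest first."""

    snapshots: list[str] = field(default_factory=list)

    def snapshot(self, config: str) -> int:
        """Store the config and return its snapshot id."""
        self.snapshots.append(config)
        return len(self.snapshots) - 1

    def rollback(self, snapshot_id: int) -> str:
        """Return an earlier config and record it as the newest state."""
        restored = self.snapshots[snapshot_id]
        self.snapshots.append(restored)
        return restored


history = ConfigHistory()
known_good = history.snapshot("interface eth0\n mtu 1500")
history.snapshot("interface eth0\n mtu 9000")  # a change that breaks the network
restored = history.rollback(known_good)
```

Recording the rollback as a new snapshot, instead of deleting the failed change, keeps the full history available for later review.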
That may soon change, however, as NCCM tools become more robust and compatible with a wide variety of devices. Evelyn Hubbert, a senior analyst at Forrester Research, said larger corporations have initiated an acquisition frenzy in trying to expand their portfolios to provide comprehensive NCCM solutions. HP and IBM are both notable purchasers, and Nudler predicted that both would begin to make inroads in managed NCCM services in the next two years.
"The tools have to be heterogeneous," Hubbert said. "Otherwise you can make changes on certain devices but not others."
She added that although Cisco has dominant market share, 35 other vendors also have commonly deployed offerings. One solution, aside from acquiring a variety of specialised vendors, is to open up APIs to developers, who can then share their interfaces with other IT departments. Hubbert said that this was the path taken by AlterPoint, whose embrace of the open source community has allowed it to support "any type of device imagined."
One way or another, such interoperability will be a necessary feature for NCCM solutions. Nudler said that he also sees an increase in predictive capabilities as HP and IBM integrate their purchases with their existing technologies.
"Prediction and prevention is the future," he said.