MOM Is Still Watching You

It takes a powerful server to watch over an entire network. Is Microsoft Operations Manager up to the task?

8:05 a.m. It’s Gloria in purchasing. She can’t log on to the network. She phrases the problem to you this way: “The Internet must be down because I can’t sign on. It just won’t accept my password.”

8:10 a.m. You pull up your Microsoft Operations Manager (MOM) console in order to see if any new alerts have been put up for the network. Sure enough, after drilling into the MOM hierarchy a bit, you find that the DHCP service is reporting that it’s out of IP addresses.

8:15 a.m. You add IP addresses using the DHCP admin console and ask Gloria if she’s now able to log on. She is. You lean back in your Sam’s Club BackSaver chair and let out a satisfied sigh. You’re a hero.

Well, not quite a hero. Gloria expects the network, including her logon, the Internet, Internet e-mail and regular e-mail, and all of her applications to be up 24/7. You’re glad MOM was there to alert you that you were out of IP addresses. But how could you use MOM to help you proactively if this problem crops up again—perhaps by automatically creating some new addresses for you?

Product Information
Microsoft Operations Manager 2000
$850/processor for MOM; $950/processor for MOM Application Management Pack
Microsoft Corp.
Redmond, Washington
www.microsoft.com/mom/

Unfortunately, you can’t do that with this first release. MOM’s current function is event and performance monitoring, rather than proactively handling crisis situations. MOM can collect and analyze system events, compile them until a pre-configured threshold counter is met and then alert you, but it can’t actively solve problems you’re having with your network. (MOM alerts can execute scripts—but there might not be any action that you could script to handle the case of DHCP running out of IP addresses.)
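To make the collect-until-threshold idea concrete, here is a minimal Python sketch of the general technique. It is not MOM's code or API: the `ThresholdAlerter` class, the computer name `DHCP01`, the event label `scope-out-of-addresses` and the threshold of 3 are all hypothetical, chosen only for illustration.

```python
from collections import Counter


class ThresholdAlerter:
    """Counts matching events and raises one alert when a threshold is reached.

    Illustrative only: this is not MOM's implementation or API.
    """

    def __init__(self, threshold):
        self.threshold = threshold
        self.counts = Counter()   # (computer, event_name) -> occurrences so far
        self.alerts = []          # human-readable alert messages

    def record(self, computer, event_name):
        """Record one event; return True if this event triggered an alert."""
        key = (computer, event_name)
        self.counts[key] += 1
        if self.counts[key] == self.threshold:
            self.alerts.append(
                f"{event_name} seen {self.threshold} times on {computer}"
            )
            return True
        return False


# Hypothetical usage: the third identical event trips the alert.
alerter = ThresholdAlerter(threshold=3)
for _ in range(3):
    fired = alerter.record("DHCP01", "scope-out-of-addresses")
print(alerter.alerts)
```

Note that the sketch only notifies. Like MOM in this release, it does nothing to fix the underlying problem.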

Systems Management vs. Operations Management
As network implementations grow and become more complex, a plethora of management issues surface: things like how to make sure thousands of PCs are all running the same version of software, or how to know when a server crashes. Today we have a very robust product, Microsoft Systems Management Server (SMS), that allows us to solve the former problem, but the tools we have to monitor the latter aren’t as granular as we’d like. SMS handles change and configuration management. What it doesn’t do, and never has, is monitor things like non-SMS server services, printing functions, server hardware and other key elements of your network, and then report back to administrators via some alerting mechanism that a system is in trouble. This latter capability is called operations management.

The folks at Microsoft had the change and configuration management thing handled with SMS, but weren’t doing anything with operations management. Now they’ve plugged this hole in their systems management software with MOM. Microsoft purchased the code that underlies MOM from NetIQ (www.netiq.com), a vendor actively involved in Windows NT- and 2000-based metrics and operations management monitoring.

MOM System Requirements
You can install MOM on a single computer or distribute its components among multiple computers. When you do a single-computer installation, you install the MOM administrator console (which, of course, uses MMC as its interface); MOM reporting; and the Web console (which allows administrators to pull up the MOM information from a browser on any computer). A MOM server configured in this manner has some minimum recommended hardware requirements:

  • 550 MHz Intel Pentium III processor.
  • Minimum of 5GB of free disk space.
  • 512MB of RAM.
  • Video adapter capable of rendering 256 colors or more at 800x600 resolution.
  • Minimum of 10Mbps network speed.
  • Windows 2000 Server or Advanced Server running Service Pack 2.

Microsoft minimum requirements are just that—bare minimums. As such, they need to be beefed up by at least 50 percent. If I were building a production MOM box, I’d put it on a two- or four-way Pentium III 1GHz or better with at least a full gigabyte of RAM and as much disk space as I could cram into the box. Better yet, I’d consider splitting the load out to multiple servers. There’s no need to install the software on a domain controller—Win2K Server running as a member server is fine for MOM.

You also need to tell MOM what to monitor. MOM’s monitoring rules are contained in knowledge modules—software components that run atop the MOM engine and contain Knowledge Base articles, preset threshold values, event IDs and other pertinent information designed to keep track of a particular component. I’ll review the included and optional add-on knowledge modules later in this article.

Computers that are being monitored must run an agent and, thus, also have minimum requirements. A monitored computer needs to have at least a 200MHz Pentium CPU, 100MB of free disk space, 64MB of RAM, and Windows NT 4.0 SP4 or Win2K. MOM also contains a modicum of Unix support; it can read Syslog files shipped from a Unix computer.
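If you want to screen an inventory of candidate machines against those agent minimums before rolling anything out, a few lines of Python will do. This is only a sketch: the numbers come from the paragraph above, but the dictionary keys and the sample machine are hypothetical, and the operating system check (Windows NT 4.0 SP4 or Win2K) is left out.

```python
# Agent minimums quoted in the article; the key names are made up for this sketch.
AGENT_MINIMUMS = {"cpu_mhz": 200, "free_disk_mb": 100, "ram_mb": 64}


def unmet_requirements(machine, minimums=AGENT_MINIMUMS):
    """Return the sorted names of any requirements the machine falls short of."""
    return sorted(
        name for name, needed in minimums.items()
        if machine.get(name, 0) < needed
    )


# Hypothetical inventory record for an aging workstation.
old_box = {"cpu_mhz": 166, "free_disk_mb": 500, "ram_mb": 32}
print(unmet_requirements(old_box))
```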

MOM Installation
One of the interesting features you see when you first run the MOM setup program is the Prerequisite Verifier, which takes a gander at your system and says, “Hey, you need to install or upgrade these things before I can install the product.” Figure 1 shows the Prerequisite Verifier in action. One feature of particular interest is the clickable link at the bottom of the Prerequisite Verifier that will carry out the needed configuration actions. The instructions in the details pane of the Prerequisite Verifier are quite good and tell you exactly what you need to do. I simply cut and pasted them into a Word document, then printed out the whole thing.

Figure 1. The Prerequisite Verifier checks a variety of software before it will let you begin a new MOM installation.

MOM uses a SQL Server database to store all the information it collects about the systems you choose to monitor. The full edition of SQL Server 2000 is the preferred database solution, but MOM comes with the MSDE version to use for evaluation. MOM uses Microsoft Access 2000 for its reports; you can use the included Runtime edition or install the full product (which you’ll need if you want to modify or create reports).

MOM’s estimated retail price is based solely on the number of processors in each MOM computer, at $850 per processor. There’s an optional Application Management Pack that I’ll talk about in a minute. If you decide to purchase this, you’ll pay another $950 per processor. There’s no need to purchase a license for the computers that you’re monitoring.
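The licensing arithmetic is simple enough to sketch in Python for budgeting purposes. The two prices are the estimated retail figures quoted above. The function name is hypothetical, and the sketch assumes the Application Management Pack is licensed on the same processors as MOM itself, which is how I read the pricing but is worth confirming with your reseller.

```python
MOM_PRICE_PER_CPU = 850        # estimated retail price quoted in the article
APP_PACK_PRICE_PER_CPU = 950   # optional Application Management Pack


def mom_license_cost(processors, with_app_pack=False):
    """Estimated cost for the MOM server(s), priced per processor.

    Monitored computers are not counted; they need no license.
    """
    per_cpu = MOM_PRICE_PER_CPU
    if with_app_pack:
        per_cpu += APP_PACK_PRICE_PER_CPU
    return processors * per_cpu


# A hypothetical four-way MOM server, without and with the add-on pack.
print(mom_license_cost(4))
print(mom_license_cost(4, with_app_pack=True))
```

On those assumptions, the four-way box comes to $3,400 for MOM alone and $7,200 with the Application Management Pack.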

What Comes with MOM?
Great question! Remember that MOM’s job is to watch Windows-based computers’ event logs, read the events posted there, and then alert the administrator of that event and possibly even perform some action. Here are the things that MOM brings to the table:

  • Event Management. MOM watches multiple computers and aggregates their events into a central database. Because of this capability, it’s possible for administrators to get an overview of how the server farm is performing (metrics) or to drill down on a specific computer to gather more information about an event (alerting).
  • Rules. The administrator can create rules that can perform a specified function when an event occurs. A pretty cool thing that rules can do is hook up with a Microsoft Knowledge Base (KB) article referencing the difficulty you’re having. Figure 2 shows a DNS dynamic lookup rule that points to a particular KB article. Some of the KB references in the rules that Microsoft supplies are pretty generic; others are in-depth and right to the heart of the problem.
  • Alerts. Administrators may set up alerts that examine a single computer event, a string of events from a given computer, or a string of events from several computers. They can be set up with varying severity levels. Alerts can be drilled down upon to pinpoint the data that led up to the alert, as well as be set up to e-mail or page administrators. They can be set up to send Simple Network Management Protocol (SNMP) traps or be provided with a script that redirects their information to other management systems (such as Hewlett-Packard’s OpenView, CA Unicenter, IBM Tivoli, or BMC Patrol). You can also view alerts directly in the MOM management console, as shown in Figure 3.
  • Performance. MOM can be configured to monitor performance thresholds by using System Monitor counters. This kind of information is not only useful for alerting, but also for capacity planning and historical tracking of system behavior. Thresholds can be set for local events or you can aggregate the events into a system-wide collection.
Figure 2. Some events reference the Microsoft Knowledge Base directly for more information.

Figure 3. Some alerts in the MOM management console. Most of these alerts relate to a SQL Server database that was running out of space.
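The difference between a local performance threshold and a system-wide aggregate, described in the Performance item above, can be sketched in a few lines of Python. This is only an illustration of the idea, not how MOM stores or evaluates counters: the server names, the sample values (imagine percent processor time) and the 80 percent threshold are all invented.

```python
from statistics import mean

# Hypothetical counter samples, keyed by computer name.
samples = {
    "WEB01": [35.0, 40.0, 45.0],
    "WEB02": [90.0, 95.0, 85.0],
    "SQL01": [50.0, 60.0, 70.0],
}
THRESHOLD = 80.0


def over_threshold(samples_by_computer, threshold):
    """Local check: computers whose own average exceeds the threshold."""
    return sorted(
        name for name, values in samples_by_computer.items()
        if mean(values) > threshold
    )


def farm_average(samples_by_computer):
    """System-wide view: one average across every sample from every computer."""
    all_values = [v for values in samples_by_computer.values() for v in values]
    return mean(all_values)


print(over_threshold(samples, THRESHOLD))
print(round(farm_average(samples), 1))
```

With these made-up numbers, only WEB02 crosses the local threshold, while the farm-wide average sits comfortably below it. That is why you generally want both views: the aggregate is useful for capacity planning, but it can hide a single struggling box.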

Management packs (collections of knowledge modules) are the brains behind the MOM operation. They’re preconfigured rule-sets and Knowledge Base articles that snap into MOM and provide administrators a foundation for their network monitoring. The rules can be modified according to your specific needs. With the base MOM product, you receive Management Packs that can be set up to monitor these components:

  • Win2K
  • Active Directory
  • File Replication Service (FRS)
  • DNS
  • WINS
  • IIS
  • DHCP
  • RRAS
  • Microsoft Transaction Server (MTS)
  • Microsoft Message Queuing (MSMQ)
  • Microsoft Distributed Transaction Coordinator (MSDTC)
  • SMS
  • MOM
  • Terminal Server
  • Windows NT 4.0 systems logs

You can also purchase the optional Application Management Pack that covers various BackOffice and .NET server products:

  • Exchange 2000 Server
  • SQL Server 2000 and 7.0
  • Exchange 5.5
  • Site Server 3.0
  • Proxy Server 2.0
  • SNA Server 4.0

Other vendors can write application-management packs that snap into MOM, as well. NetIQ has been actively involved in this area and offers eXtended Management Packs (XMPs) for MOM to monitor Oracle, Web services and antivirus software, as well as extended capabilities for monitoring Windows networks. NetIQ agents use each application’s API to extract more information than is possible from simply reading a system’s event logs and System Monitor counters, which is really all that MOM does today. As a result, NetIQ’s XMPs (as well as other third-party offerings) will be the real added value that makes MOM go over the top.

The Down Side of Uptime

Many organizations depend on server uptime as a key metric. If you present a report to a group of people interested in the uptime of a given set of servers and you fail to include in your report that certain key server services were down for a brief time during your reporting period, are you lying when you say the server was up?

Think about it this way. You're running an Exchange server that gets a lot of use during the work week. You check the server each morning: Yep, still up and running. One morning you get a call telling you that e-mail isn't working. You go to the server and, horrified, figure out that one of the Exchange server services has stopped, along with its dependent services, for no apparent reason. Your heart skips a beat. What if this thing's in the tank? Fortunately, you're able to restart the services OK and get on with your life. The logs reveal that the thing halted in the middle of the night. E-mail services have been stopped for more than five hours.

Does that episode count as a server outage? The server was up the entire time, but what about its services? Do you see the distinction? When you present your reports, it doesn't mean diddly that your servers were up if the crucial services they were supposed to run weren't running for whatever reason. You technically still had a server outage on your hands because (and here's the key part) clients couldn't access the computer. That's the real deal with uptime. Were clients able to utilize the server? If not, even though the server was perfectly operational the whole time, in actuality you had a server outage on your hands. (Note that clients, in this context, could imply another server needs to access your problem box to do, say, a database lookup. If the service is down, the server's out and your uptime reports need to reflect it that way.)

I've actually had people who'll argue this uptime point with me, even though it seems so blatantly obvious. If the spring was broken on your garage door but the electric garage door opener was perfectly operational, would you say that you could still use the garage door? No! It doesn't matter that the system is running if a key operational component is down.
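To put numbers on the Exchange example, here's a small Python sketch that reports availability from the client's point of view. The one-week reporting period is my assumption, and the five-hour outage is rounded down from the "more than five hours" in the story; the function name is hypothetical.

```python
def availability_percent(period_hours, outage_hours):
    """Percentage of the reporting period during which clients were served."""
    if period_hours <= 0:
        raise ValueError("period_hours must be positive")
    return 100.0 * (period_hours - outage_hours) / period_hours


week = 7 * 24  # assumed reporting period: one week, in hours

# The box itself never went down...
server_uptime = availability_percent(week, 0)
# ...but the mail services were stopped for about five hours.
service_availability = availability_percent(week, 5)

print(f"Server uptime:        {server_uptime:.2f}%")
print(f"Service availability: {service_availability:.2f}%")
```

The first figure comes out at 100 percent and the second at roughly 97 percent. The second number is the one your uptime report should carry.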

MOM also includes a Reporting tool with the capability of viewing reports in chart or HTML format. Load-balancing and redundancy are fully supported. MOM’s server/agent architecture keeps network traffic down yet provides a method for central data collection. You can push the agents to targeted groups of computers via an intelligent push and install mechanism, thus reducing the amount of administrative overhead.

When you start MOM to view the alerts for various systems, you’re taken to the default MOM node and given a complete system overview, similar to what you see in Figure 4. In the details pane of this view you’re given an “executive overview” of the status of the system, including the number of alerts you’ve not yet addressed, the performance monitors you’re watching, the events you’ve received, computers you’re monitoring, and so on. Note the excessive number of rules that are processed, by default, within the MOM system.

Figure 4. The default node in the MOM management console, showing an overload of rules to be monitored.

Minor MOM Annoyances
There are some annoying things about MOM in this release. MOM is extremely verbose and can generate heavy CPU usage and network traffic, as well as copious output. It’s difficult to manage rules due to the depth of the hierarchy that contains them. And installing a knowledge module enables all the rules in that module, making it easy to swamp yourself in data and bog down your network. My advice: Limit the number of knowledge modules you install.

MOM isn’t integrated with SMS, apart from a knowledge module that can monitor SMS computers. MOM personnel in Redmond tell me that there are plans to allow for the integration of SMS with MOM sometime in the future.

Somebody Else's Code Under the Sheets

As I mentioned in the main article, MOM is based on code that Microsoft purchased from NetIQ. Lest you think that buying and repackaging code is something new and sneaky on Microsoft's part, think again. Microsoft has been at this kind of thing for a long time. For example:

  • NTBACKUP.EXE, that old tape backup stalwart that shipped clear back with NT 3.5, was actually written by Veritas.
  • The Windows 2000 disk defragmentation code was taken from a great third-party NT defragmentation tool called Diskeeper.
  • Windows Terminal Services is a little brother to Citrix MetaFrame. (Microsoft has a sort of symbiotic relationship going with Citrix. Citrix is the only developer authorized to bundle its code directly over the NT kernel. When you buy MetaFrame, you're buying Microsoft OS code disguised as Citrix; when you run Terminal Services, you're running highly scaled-down Citrix code.)

Some would even say that Windows XP's icons look an awful lot like Mac OS X's, but I'm not gonna go there! Some would also say that Microsoft purchases lots of code: FrontPage, PowerPoint, Visio and a bunch of games, for starters. In fact, an anti-Microsoft site, www.vcnet.com/bms/departments/catalog/yrcatalog.shtml, manages to point out an entire shopping list of code that was developed by someone else and then purchased by Microsoft.

My guess is that a lot of people who start small software development companies would like nothing more than for Microsoft to nail down a deal with them that made the company financially solid and its owners millionaires. I also offer that Microsoft has been pretty good at taking a software product that was initially developed with some sort of standalone use in mind and then integrating it into the appropriate Microsoft suite. So, even if you believe that Microsoft isn't any good at writing code (a claim I don't believe and would never defend), you've got to admit that Microsoft is awfully good at making the code work well and become a viable part of the company's offerings.

The Long and Short of It
So, should you rush to install MOM on your own network?

If you’ve already invested in NetIQ’s AppManager suite of products, stay there until MOM has been through the second release cycle and some of the third-party offerings have been released and tweaked. AppManager’s current functionality beats that of the just-released MOM.

If you haven’t invested yet in NetIQ and its plethora of Windows-based management products and you have a favorable licensing agreement with Microsoft, such as the Select or Enterprise Agreement programs, then it may be to your benefit to investigate the use of MOM instead of AppManager. Even then, you should consider purchasing XMPs for the applications you want to monitor within your MOM system as they’re released. Be sure to leave adequate room in your budget to cover these additional purchases.

Some Microsoft products are excellent in their first release and get better and better as service packs and new revisions come out. I think Exchange and ISA Server are great examples of this. But other Microsoft code seems to come out the door not quite ready for prime time. SMS 2.0 was a great example of this: Microsoft released the product right at Y2K crush time and it wasn’t until SP2 for SMS (now at SP3) that the product really got to where it needed to be. MOM is somewhere in the middle. It’s not the fully robust code that I’d like to see; but neither is it as buggy as SMS 2.0 was when it first shipped. If you’re anxious to get into the operations management game, then you can safely go forward with MOM, but I’d anticipate fairly fast service pack releases coupled with a rush of XMPs.

Personally, I don’t think MOM buys you a heck of a lot at this juncture because it’s more about event and System Monitor counter consolidation than it is about operations management. That said, let me speak out of both sides of my mouth and say that if you’re willing to invest the time to install, tune and understand what MOM’s telling you, you’ll end up with a system that—in 15 minutes—can give you your entire network’s heartbeat. And that may be worth the price of admission.
