
7 Tips for MOM

Advice from an in-the-trenches expert for getting the most out of Microsoft Operations Manager.

Server management is critical in nearly any shop, and it becomes more critical as the environment grows.

Here at the Kentucky Department of Education Office of Education Technology (OET), we provide technical standards and services to all 1,400 K-12 public schools, for nearly 700,000 student and staff users throughout the Commonwealth. Our infrastructure consists of 180 fully managed and monitored domains ranging in size from 200 to 110,000 users.

For the past two years, our three-member OET Directory Services Team has had great success using Microsoft Operations Manager (MOM) to monitor this infrastructure. Since implementing MOM, we've reduced the number of break/fix help desk tickets by more than 90 percent for monitored machines and related services. Just the fact that we can monitor and maintain an environment of about 400 servers and nearly three-quarters of a million users with three people speaks to MOM's abilities in massive enterprise settings. During that time, we've learned a thing or two about using MOM. We hope you can benefit from these tips for getting the most out of MOM.

Tip #1: Take Advantage of the Management Packs
Microsoft currently lists 132 management packs and 13 product connectors. Management packs contain scripts, performance-gathering tools and Knowledge Base information for components MOM can monitor (more about the Knowledge Base later). Product connectors allow MOM information to be forwarded to other management products such as HP OpenView or Tivoli TEC for consolidated alerting.

Bonus Tip
You’ll need to determine which management packs fit into your environment, but be careful to install only the minimum number of packs necessary to fulfill your monitoring requirements. Every management pack adds work to your management servers and adds size to the agents deployed on your managed machines.

The Active Directory management pack has been worth its weight in gold to the OET Directory Services Team. On several occasions MOM has alerted the team to replication problems that were quickly resolved using its Knowledge Base.

And it goes beyond software monitoring. The Dell Hardware management pack (we use identically configured Dell PowerEdge 2600 servers) alerts the team to potential hardware failures from our domain controllers. It provides information about memory errors, predicts hard drive failures, chassis intrusion and many other hardware-related items.

Tip #2: Know Your Ports to Head off the Storm
Firewalls are an integral part of any organization's security infrastructure, but they can also wreak havoc on a MOM deployment. OET found this out the hard way when a rogue firewall rule produced a communications failure between the MOM management servers and a number of their managed servers. Alerts destined for the management servers were dropped by the firewall due to port restrictions, so the MOM operators never knew the alerts were happening.

In the meantime, those same firewall rules were blocking replication. The result was an ugly mess of replication failures that took several days to reconcile once the rogue rule was discovered and corrected. The MOM 2005 Security Guide details all the ports needed for MOM to function properly.
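
A quick reachability check run from a managed server can catch this kind of silent failure early. The following is a minimal Python sketch, not a MOM tool: the function name is our own, and the host and port values you pass in are up to you (take the real port list from the MOM 2005 Security Guide). It only tests whether a TCP connection can be opened, so it says nothing about UDP traffic or about whether the agent itself is healthy.

```python
import socket


def check_tcp_ports(host, ports, timeout=3.0):
    """Return a dict mapping each port to True if a TCP connection succeeded.

    Hypothetical helper for illustration; it is not part of MOM.
    """
    results = {}
    for port in ports:
        try:
            # create_connection completes a full TCP handshake or raises.
            with socket.create_connection((host, port), timeout=timeout):
                results[port] = True
        except OSError:
            # Refused, timed out, filtered by a firewall, or host not resolvable.
            results[port] = False
    return results
```

Calling something like `check_tcp_ports("mom-mgmt01.example.org", [1270])` (a placeholder host name and an example port value) returns a dictionary you can log or alert on. Scheduling such a check outside of MOM matters here, because a blocked port also blocks the alert that would have told you about it.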

Bonus Tip
MOM 2005 is a more pleasant experience right out of the box than the previous version, as many of the noisiest rules have been eliminated. Before you make any rule changes, document and test each individually. If you find yourself making several new rules, create a folder specifically for your rules so that other administrators can easily find them. We’ve found that creating a folder for each MOM administrator is helpful. An example is shown in Figure 1.

Tip #3: Play by the Rules
Once you've established communication between the individual MOM components and successfully deployed the agents, you can begin tweaking the MOM rules and scripts. Depending on the size of your environment, this can take 10 minutes or 10 months.

The directory services team at OET added nearly 20 new rules and turned off several noisy rules while running MOM 2000 SP1. Noisy rules are those that spit out events or alerts en masse or unnecessarily. Examples in MOM 2000 SP1 include rules that send successful Netlogon events to the management servers. In an environment with a large number of users, this can grow your MOM database tremendously. We also significantly tuned performance monitoring rules to reduce the size of the database.
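
One way to spot noisy rules is to count how many events each rule generated over a period and look at the top of the list. The sketch below assumes you have already exported event data as (rule name, computer) pairs. The export step, the rule names and the threshold are all hypothetical and are not part of MOM.

```python
from collections import Counter


def find_noisy_rules(events, threshold):
    """Return (rule, count) pairs with at least `threshold` events, largest first.

    `events` is an iterable of (rule_name, computer_name) tuples; this shape
    is an assumption made for the example, not a MOM export format.
    """
    counts = Counter(rule for rule, _computer in events)
    return [(rule, n) for rule, n in counts.most_common() if n >= threshold]


# Made-up sample data: one chatty rule and one quiet rule.
sample = [("Netlogon success", "DC01")] * 5 + [("Disk space low", "FS01")]
print(find_noisy_rules(sample, threshold=3))
# prints [('Netlogon success', 5)]
```

A count alone does not say whether a rule is unnecessary, only that it is loud. Each candidate still needs to be reviewed, documented and tested before it is disabled.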

Figure 1. Creating a Rule Group Folder makes it easier for other administrators to find and use rules.

Tip #4: Increase Your Knowledge Base
As you create new rules and groups of rules, MOM lets you add them to its database. When the Operator Console raises alerts, you can add your problem resolution steps into MOM 2005 by selecting the alert, right-clicking on the Company Knowledge Base tab, clicking Edit and entering the properly formatted information.

This has proven very beneficial for OET. It reduces the number of Tier 3 support calls, which translates into lower support costs. Adding the name of the person entering the information (Figure 2) and the date to the Knowledge Base gives the MOM operator a person to contact if there are questions about the solution.
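
A consistent layout makes these entries easier to scan. As a rough illustration, the Python sketch below builds an entry that always ends with the author and date. The field names and layout are our own assumptions for the example, not OET's actual format or anything MOM requires.

```python
from datetime import date


def format_kb_entry(summary, steps, author, entered_on):
    """Build a plain-text Knowledge Base entry with numbered resolution steps.

    Hypothetical layout for illustration only.
    """
    lines = [f"Summary: {summary}", "Resolution:"]
    lines += [f"  {i}. {step}" for i, step in enumerate(steps, start=1)]
    # The author and date give the operator someone to contact with questions.
    lines.append(f"Entered by: {author} on {entered_on.isoformat()}")
    return "\n".join(lines)


# Example with made-up content.
print(format_kb_entry(
    "Replication failure between domain controllers",
    ["Verify firewall rules for the affected site", "Force replication and recheck"],
    "J. Smith",
    date(2005, 3, 1),
))
```

The resulting text can then be pasted into the Company Knowledge Base tab for the alert.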

Figure 2. The Office of Education Technology formats Knowledge Base information so it can recall that data for troubleshooting.

Tip #5: Keep MOM Secure
MOM agents stored on domain controllers require special permissions to run vast suites of scripts.

To help keep the security folks happy, MOM 2005 agents can run under a reduced security context on domain controllers without impacting their effectiveness. This is accomplished using a "MOM Action Account."

That account, which you can use to install agents, run scripts and gather data from managed machines, must be part of the Local Administrators (not Domain Admins) and Performance Monitor Users groups. It must also have the "Log on Locally" and "Manage Auditing and Security Log" rights enabled in the Default Domain Controller Security Policy; the local Administrators group has these rights by default in Windows Server 2003. All of the security settings and permissions required for properly operating MOM are detailed in the MOM 2005 Security Guide.

Tip #6: Eliminate Replication Headaches
MOM 2005 suffers from some of its predecessor's ailments. Microsoft Knowledge Base article 889054 describes a problem that occurs when replprov.dll tries to access an invalid pointer. It generates error messages when the file can't determine the replication status of the domain controller.

This alert can cause major headaches whether you're monitoring a handful of domain controllers or hundreds, but fortunately a hotfix is available and works well. If you see the alert (as presented in Figure 3), you're a prime candidate for this hotfix, which is applicable to both MOM 2000 and 2005.

Figure 3. If you see this alert, KB article 889054 is where you need to look for answers.

Tip #7: Consider Trading Up
If your business only requires "best effort" uptime, then don't worry about purchasing a monitoring product. However, if your customers are as finicky as mine, MOM is a solid tool regardless of the size of your computing environment. With all the changes and new features MOM 2005 has to offer, an upgrade from MOM 2000 SP1 is a must.
