
Blog


    6/2/2009

    Scheduling Backups for OpsMgr in Windows Server 2008

    The System Center Operations Manager 2007 Unleashed book included a set of scripts on the CD (Chapter 12 content) that provided an automated backup process you could then schedule to execute as a regular task. These were also discussed at MMS 2009 as part of the SO34 “Don't Flirt With Disaster” presentation by Kerrie Meyler and Andy Dominey.

    With Windows Server 2008, the process for scheduling these backups has changed because the task scheduling mechanism has changed. The following steps schedule the batch job to run on each of the OpsMgr servers in your environment:

    1. Install the backup programs as discussed on the CD with the OpsMgr 2007 Unleashed book.
      • Copy the full "\backups" folder on the CD accompanying this book to each OpsMgr 2007 server (including database servers). (This content should include backup.bat, exportmp.ps1, and savekey.exe.)
      • On the RMS, copy SecureStorageBackup.exe from the \SupportTools folder on the installation media to the %ProgramFiles%\System Center Operations Manager folder.
    2. On each of your OpsMgr 2007 servers, customize the script to enable the components installed on that server.
    3. Perform the following procedure for Windows Server 2008:
      • Start -> Programs -> Accessories -> System Tools -> Task Scheduler (or go to Search and type Task Scheduler)

    Scheduling OpsMgr Backup in Server 2008 - 01

      • Create a New Basic Task
    Scheduling OpsMgr Backup in Server 2008 - 04
      • Schedule it to run Daily

    Scheduling OpsMgr Backup in Server 2008 - 05

      • Schedule for just before midnight and to recur every day

    Scheduling OpsMgr Backup in Server 2008 - 06

      • Set the action to Start a program

    Scheduling OpsMgr Backup in Server 2008 - 07

      • Specify the location of the batch file (in this example it is stored in c:\backups, but in general it is most likely on another drive such as e:\backups)

    Scheduling OpsMgr Backup in Server 2008 - 08

      • The Finish screen should look like this

    Scheduling OpsMgr Backup in Server 2008 - 09

      • Open the properties of the scheduled task, change it to run as a user account that has the permissions the backup needs, and configure it to run whether or not a user is logged on.

    Scheduling OpsMgr Backup in Server 2008 - 11   

    And there you go! Backups are now configured to run daily on the server where you created the scheduled task. An enterprise backup program should then pick up the output backup folders to complete the process.
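If you would rather script the task than click through the wizard on every server, the same settings can be passed to the built-in schtasks.exe command. The following is only a minimal Python sketch that assembles that command line; the task name, the 23:45 start time, and the CONTOSO\svc-backup account are placeholders we made up for illustration, and the script only prints the command so you can review it before running it yourself.

```python
import subprocess


def build_schtasks_command(task_name, batch_file, start_time="23:45", run_as=None):
    """Return the argument list for a daily scheduled task that runs batch_file."""
    args = [
        "schtasks", "/Create",
        "/TN", task_name,    # task name shown in Task Scheduler
        "/TR", batch_file,   # the program to start (the backup batch file)
        "/SC", "DAILY",      # recur every day
        "/ST", start_time,   # start time in 24-hour HH:mm format
        "/F",                # overwrite an existing task with the same name
    ]
    if run_as:
        # "/RP *" asks schtasks to prompt for the password instead of
        # putting it on the command line.
        args += ["/RU", run_as, "/RP", "*"]
    return args


# Placeholder values for illustration only; adjust to your environment.
command = build_schtasks_command(
    "OpsMgr Backup", r"c:\backups\backup.bat", "23:45", r"CONTOSO\svc-backup"
)
print(subprocess.list2cmdline(command))
```

A task created with a stored account and password is not tied to an interactive session, which corresponds to the "run whether or not a user is logged on" setting in the last step above. Verify the result in Task Scheduler after creating it.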

    Scheduling OpsMgr Backup in Server 2008 - 10

    4/29/2009

    MMS 2009 S034 Session – Don’t Flirt with Downtime – Presentation Links

    The following URLs are provided in conjunction with the presentation made April 29, 2009 by Kerrie Meyler and Andy Dominey during MMS:

    http://www.out-law.com/page-4464 – Forrester Research Consulting study on the cost of downtime

    Backups:

    Log Shipping:

    High Availability and Clustering:

    AD Integration:

    Database Grooming:

    Partitioning the Data Warehouse database – see Chapter 12, System Center Operations Manager 2007 Unleashed, pp 550-552

    MMS 2009 S032 Session – Common Mistakes When Using Operations Manager and How To Avoid Them – Presentation Links

    The following article summarizes the topics presented in the MMS session “Common Mistakes When Using Operations Manager and How to Avoid Them” and lists all links referenced in the presentation. The links are sorted alphabetically by main topic.

     

    Anti-Virus:

    AV Exclusions:

    http://blogs.technet.com/kevinholman/archive/2007/12/12/antivirus-exclusions-for-mom-and-opsmgr.aspx

     

    Authoring Management Packs:

    Debugging Discovery failures:

    http://blogs.msdn.com/boris_yanushpolsky/archive/2008/12/19/what-if-my-discovery-script-fails.aspx

    http://ianblythmanagement.wordpress.com/2009/02/25/discoveries/

     

    Hotfixes with OpsMgr:

    Hotfixes that didn’t really install:

    http://www.systemcenterforum.org/news/critical-hotfixes-for/

    http://nocentdocent.wordpress.com/2009/03/11/scom-patching-blues/

    What hotfixes to apply?

    http://blogs.technet.com/kevinholman/archive/2009/01/27/which-hotfixes-should-i-apply.aspx

    Hotfix Matrix:

     http://weblogwally.spaces.live.com/Blog/cns!A913F865098E0556!560.entry

    Hotfix potential side-effects:

    http://nocentdocent.wordpress.com/2009/03/12/patching-blues-qfe-958490-pumps-up-healthservice-cpu-usage-on-agents/

    Automated deployment of hotfixes with ConfigMgr:

    http://ops-mgr.spaces.live.com/blog/cns!3D3B8489FCAA9B51!1285.entry

     

    Management Packs:

    Alert Storms:

    http://pavleck.net/2008/10/06/its-bound-to-happen-how-to-handle-alert-storms/  

    All-in-One environment for testing:

    http://cameronfuller.spaces.live.com/blog/cns!A231E4EB0417CB76!903.entry

    Cleaning up the default MP:

    http://blogs.technet.com/kevinholman/archive/2008/11/11/cleaning-up-the-default-mp.aspx

    http://blogs.technet.com/momteam/archive/2007/05/03/removing-dependencies-on-the-default-management-pack.aspx

    http://blogs.technet.com/tnmag/archive/2009/04/01/blog-tales-cleaning-up-the-default-mp.aspx  

    SCCM/ConfigMgr and Performance counters: http://blogs.technet.com/configmgrteam/archive/2009/02/13/monitoring-configuration-manager-2007-with-operations-manager-2007-in-a-64-bit-environment.aspx  

    MP bugs:

    http://nocentdocent.wordpress.com/2009/02/14/kms-management-pack-bug/

     

    Ops Databases:

    Useful queries in the opsmgr database for Information gathering: http://blogs.technet.com/kevinholman/archive/2007/10/18/useful-operations-manager-2007-sql-queries.aspx

    Modify grooming settings:

    http://ops-mgr.spaces.live.com/blog/cns!3D3B8489FCAA9B51!176.entry

    Database maintenance:

    http://blogs.technet.com/kevinholman/archive/2008/04/12/what-sql-maintenance-should-i-perform-on-my-opsmgr-databases.aspx

    Reducing retention periods:

    http://blogs.technet.com/kevinholman/archive/2008/11/04/boosting-opsmgr-performance-by-reducing-the-opsdb-data-retention.aspx

    Localized text table grows and does not clean up. This is fixed in R2: http://blogs.technet.com/kevinholman/archive/2008/10/13/does-your-opsdb-keep-growing-is-your-localizedtext-table-using-all-the-space.aspx

     

    Password Changes:

    Challenges changing passwords for accounts in OpsMgr:

    http://thoughtsonopsmgr.blogspot.com/2009/04/resetting-sdk-and-action-account.html

     

    Resetting Monitors:

    Reset all Monitors:

    http://blogs.msdn.com/mariussutara/archive/2009/01/19/how-to-restart-monitoring-of-my-environment-another-version-update.aspx   

    Reset to green:

    http://blogs.technet.com/timhe/archive/2009/01/15/announcing-the-greenmachine-utility-for-operationsmanager-rtm-sp1-and-r2.aspx

     

    Root Management Servers & Management Servers:

    Memory Examples for the RMS:  http://cameronfuller.spaces.live.com/blog/cns!A231E4EB0417CB76!1180.entry

    Moving the Operations Manager Folder: http://cameronfuller.spaces.live.com/blog/cns!A231E4EB0417CB76!1247.entry

    RMS promotion:

    http://technet.microsoft.com/en-us/library/cc540401.aspx

    Deploying agents using ConfigMgr:

    http://ops-mgr.spaces.live.com/blog/cns!3D3B8489FCAA9B51!1034.entry    

    How many consoles are connected:

    http://blogs.technet.com/kevinholman/archive/2008/10/27/how-many-consoles-are-connected-to-my-rms.aspx

    High Handle count KB:

    http://support.microsoft.com/kb/951979

    Work-around for High Handle Count: http://r0nwilliams.spaces.live.com/blog/cns!62A0019E9E556103!330.entry

     

    Virtualization with OpsMgr:

    Thoughts on what to virtualize:

    http://blogs.technet.com/momteam/archive/2007/10/02/virtualizing-opsmgr-2007-roles.aspx

    Virtualization support policy:

    http://support.microsoft.com/kb/897615

     

    Windows 2008:

    5 part series for OpsMgr installation on Windows 2008:

    http://ops-mgr.spaces.live.com/blog/cns!3D3B8489FCAA9B51!768.entry

    UAC and OpsMgr:

    http://cameronfuller.spaces.live.com/blog/cns!A231E4EB0417CB76!1557.entry

    Installing Web console on 2008:

    http://blogs.technet.com/momteam/archive/2008/12/17/installing-web-console-on-windows-server-2008.aspx

    Windows 2008 with Hyper-V counter issues: http://cameronfuller.spaces.live.com/Blog/cns!A231E4EB0417CB76!1318.entry  

     

    4/25/2009

    OpsMgr 2007 Breakout sessions at MMS

    If you’re attending MMS, here are the OpsMgr 2007 breakout sessions. Cameron’s session is S032 on Tuesday, and Kerrie and Andy are co-presenting at S034 Wednesday.

    • SO01 – Overview of Ops Mgr 2007 R2 (Tue @ 11:45)
    • SO02 – Ops Mgr 2007 R2 Platform Architecture (Tue @ 2:15)
    • SO03 – Administrative and Implementation Best Practices (Tue @ 4:00)
    • SO04 – Understanding Ops Mgr 2007 Performance & Scalability (Wed @ 10:15)
    • SO05 – Developing Custom Reports and Operational Dashboards (Thu @ 4:00)
    • SO06 – Monitoring .NET Web Applications (Wed @ 4:00)
    • SO07 – Monitoring Windows Server 2008, SQL Server 2008 and Exchange 2007 (Wed @ 11:45)
    • SO08 – Advanced Management Pack Authoring (Fri @ 10:00)
    • SO09 – Monitoring UNIX/Linux (Wed @ 2:15)
    • SO10 – Understanding the LAMP stack and monitoring it (Thu @ 8:30)
    • SO11 – Interop Connectors for System Center (Thu @ 10:15)
    • SO12–SO15 – Management Pack Design Series 1-4 (Mon @ 12:00, 1:30, 3:00, and 4:30)
    • SO25 – Targeting in Operations Manager 2007 (Thu @ 2:30)
    • SO26 – Monitors, Aggregations, Dependencies, Rollup (Thu @ 11:45)
    • SO30 – Taking Control of Ops Mgr Alerts (Thu @ 11:45)
    • SO31 – Inside Active Directory Integration in Large Environments (Fri @ 8:30)
    • SO32 – Common Mistakes when using Ops Mgr and how to avoid them (Tue @ 4:00) Cameron Fuller
    • SO33 – Operations Manager 2007 R2: Driving Compliance with ACS (Fri @ 11:30)
    • SO34 – Configuring Ops Mgr for High Availability (Wed @ 10:15) Kerrie Meyler and Andy Dominey

    Cameron, Kerrie, and Andy will be giving out copies of System Center Operations Manager 2007 Unleashed for the best questions asked during their sessions.

    3/25/2009

    OpsMgr 2007 R2 now at Release Candidate status

    Tonight Microsoft announced the availability of the Operations Manager 2007 R2 Release Candidate on Connect (http://connect.microsoft.com). There are a number of enhancements over the beta released in November, including:

    • New Power Management MP template (the monitored system must be either Windows Server 2008 R2 or Windows 7)
    • Updated branding across all user interfaces, including a new skin
    • Improved trace configuration tools to help support issues escalated to Customer Support (if applicable)
    • Improved Run As Account Distribution Configuration
    • Ability to run inline tasks for non-Microsoft servers
    • Support for upgrade from Beta deployments to the Release Candidate
    • New and updated documentation, including the Usage Guide, Design Guide, Deployment Guide, Upgrade Guide, Security Guide, and Operations Guide

    The RC should be upgradeable to the RTM version once that is available, but since this is a test version, do not run it in a production environment unless you have made special arrangements with Microsoft.

    3/22/2009

    Using ConfigMgr to deploy OpsMgr Server 2008 Hotfixes

    There are two hotfixes that must be applied to all Windows Server 2008 systems that will have an OpsMgr agent installed on them. Fellow MVP Blake Mengotto did a really good job summarizing what we needed to know to determine the hotfixes required for Windows Server 2008 with OpsMgr agents.

    The fixes are 952664 & 953290, with full details available at Blake’s blog, http://discussitnow.spaces.live.com/blog/cns!A4408C121568CAA4!6047.entry.

    Use a Software Update?

    One thought on deploying the hotfixes was to provide them as a software update using Configuration Manager. Interestingly, however, these two patches do not show up as available in the Update catalog. Working with a ConfigMgr SME, we tried to add them using CAB files, because these updates are actually MSU files and an MSU can be extracted into CAB files. To do this, perform the following steps:

    • Extract the MSU files into the CAB files, for example: expand -f:* "<path>\Windows6.0-KB952664-v3-x64.msu" <destination folder>
    • http://support.microsoft.com/kb/934307/en-US discusses the Windows Update Stand-alone Installer
    • We extracted both of the fixes into their own directories. However, they failed on import into ConfigMgr when attempting to add the CAB files.

    So that approach wasn’t going to be useful here.

    Find Systems Missing the Patches?

    Next, we thought about how to determine what systems do and do not have these hotfixes installed. To address this, we checked to see if ConfigMgr brings back this information. It was verified that ConfigMgr does bring back information from Add/Remove Programs, but does not bring back information about the updates installed on the system (which can be viewed in Add/Remove programs on the Windows Server 2008 systems). There appeared to be no easy way to find this in ConfigMgr. Kevin Holman has good information on how to find this out using an OpsMgr report, documented at http://blogs.technet.com/kevinholman/archive/2008/06/27/a-report-to-show-all-agents-missing-a-specific-hotfix.aspx.

    Add/Remove Programs shows this information, as displayed below:

    Hotfixes installed 
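For a quick spot check on a single server, one option (not something we used above, so treat it as an assumption) is to list the installed updates with a command such as `wmic qfe get HotFixID` and compare the result against the two required KB numbers. The following Python sketch shows only the comparison step; the sample listing is made up, and it assumes the hotfix IDs appear in the listing in the form KBnnnnnn.

```python
import re

# The two hotfixes required on Windows Server 2008 systems with an OpsMgr agent.
REQUIRED_HOTFIXES = ("KB952664", "KB953290")


def missing_hotfixes(qfe_output, required=REQUIRED_HOTFIXES):
    """Return the required KB IDs that do not appear in the update listing."""
    installed = {
        kb.upper()
        for kb in re.findall(r"\bKB\d+\b", qfe_output, flags=re.IGNORECASE)
    }
    return [kb for kb in required if kb.upper() not in installed]


# Made-up sample listing for illustration; KB9999999 is not a real update.
sample_listing = """HotFixID
KB952664
KB9999999
"""
print(missing_hotfixes(sample_listing))
```

An empty list means both hotfixes were found in the listing. For reporting across all agents, the OpsMgr report from Kevin Holman referenced above remains the better fit.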

    Using Software Deployment: Collections

    Having determined that software updates were out, the next approach is a software deployment. To facilitate this, we created a custom collection that met the following criteria:

    The top level collection includes Windows Server 2008 systems with the OpsMgr agent that are not OpsMgr management servers (excluding another collection). (This query will eventually have to be changed to disregard all Windows Server 2008 systems with SP1 installed because these hotfixes will be included, but this syntax works currently.)

    Using Software Deployment: Top-Level collection

    We created a top-level collection using the following syntax:

    select SMS_R_SYSTEM.ResourceID,SMS_R_SYSTEM.ResourceType,SMS_R_SYSTEM.Name,SMS_R_SYSTEM.SMSUniqueIdentifier,SMS_R_SYSTEM.ResourceDomainORWorkgroup,SMS_R_SYSTEM.Client from SMS_R_System inner join SMS_G_System_SERVICE on SMS_G_System_SERVICE.ResourceID = SMS_R_System.ResourceId where SMS_R_System.Client = 1 and SMS_R_System.OperatingSystemNameandVersion like "%Server 6.0%" and SMS_G_System_SERVICE.DisplayName = "OpsMgr Health Service"  and SMS_R_System.ResourceId not in (Select ResourceID from SMS_FullCollectionMembership where CollectionID="ABC0003E")  

    Using Software Deployment: Sub-collections

    Next, we created sub-collections for I386, X64, and IA64 based upon the top-level collection. Each of these collections was defined to exclude the members of a custom collection created for management servers, which has a collection ID of ABC0003E. The SystemType value in each query varies with the architecture of the particular operating system.

    The syntax for the sub-collection is:

    select SMS_R_SYSTEM.ResourceID,SMS_R_SYSTEM.ResourceType,SMS_R_SYSTEM.Name,SMS_R_SYSTEM.SMSUniqueIdentifier,SMS_R_SYSTEM.ResourceDomainORWorkgroup,SMS_R_SYSTEM.Client from SMS_R_System inner join SMS_G_System_COMPUTER_SYSTEM on SMS_G_System_COMPUTER_SYSTEM.ResourceId = SMS_R_System.ResourceId where SMS_G_System_COMPUTER_SYSTEM.SystemType = "x64-based PC"
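Because the sub-collection queries differ only in the SystemType value, they can be generated from one template. The following Python sketch shows the idea. The "x64-based PC" value comes from the query above; the "X86-based PC" value is our assumption of what hardware inventory reports for 32-bit systems, so check it against your own inventory data before using it.

```python
# Template built from the sub-collection query shown above.
SUB_COLLECTION_TEMPLATE = (
    "select SMS_R_SYSTEM.ResourceID,SMS_R_SYSTEM.ResourceType,SMS_R_SYSTEM.Name,"
    "SMS_R_SYSTEM.SMSUniqueIdentifier,SMS_R_SYSTEM.ResourceDomainORWorkgroup,"
    "SMS_R_SYSTEM.Client from SMS_R_System "
    "inner join SMS_G_System_COMPUTER_SYSTEM "
    "on SMS_G_System_COMPUTER_SYSTEM.ResourceId = SMS_R_System.ResourceId "
    'where SMS_G_System_COMPUTER_SYSTEM.SystemType = "{system_type}"'
)

SYSTEM_TYPES = {
    "x64": "x64-based PC",  # taken from the query in this post
    "x86": "X86-based PC",  # assumed value; verify in your hardware inventory
}


def sub_collection_query(arch):
    """Return the WQL for the sub-collection of the given architecture."""
    return SUB_COLLECTION_TEMPLATE.format(system_type=SYSTEM_TYPES[arch])


for arch in SYSTEM_TYPES:
    print(arch, "->", sub_collection_query(arch))
```

Paste the generated text into the query rule of each sub-collection, limited to the top-level collection.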

    Using Software Deployment: The Packages & Programs

    We ended up creating four different packages for these, two for each hotfix. The packages were 952664-x64, 952664-x86, 953290-x64 and 953290-x86 as shown below.

    Hotfix01 

    Each was created as a new package (without a definition because this is not an MSI, PDF, or SMS file), and each had a single program defined to install the hotfix. For the package, the customized settings included:

    hotfix05 

    For the program, the syntax looks like:

    Command line: wusa.exe \\<ServerFQDN>\952664-x64\Windows6.0-KB952664-v3-x64.msu /quiet /norestart

    Hotfix02_new 

    This was based on the KB article at http://support.microsoft.com/kb/934307. Each patch needed its own share defined for the hotfix to install. This share is referenced in the program command line above to specify where to run the hotfix from.
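With four packages, it is easy to mistype a share or file name in one of the program command lines. The following Python sketch builds the command line from its parts so that the share name always follows the same pattern. It assumes one share per package named after the KB number and architecture; the server name cm01.contoso.com is a placeholder, and only the MSU file name shown earlier in this post is used.

```python
def wusa_command(server_fqdn, kb_number, arch, msu_file):
    """Build the program command line for one hotfix package."""
    share = f"{kb_number}-{arch}"  # assumed share naming, e.g. 952664-x64
    unc_path = "\\\\" + "\\".join([server_fqdn, share, msu_file])
    # /quiet suppresses the UI and /norestart prevents an automatic reboot.
    return f"wusa.exe {unc_path} /quiet /norestart"


# Placeholder server name for illustration only.
print(wusa_command(
    "cm01.contoso.com", "952664", "x64", "Windows6.0-KB952664-v3-x64.msu"
))
```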

    Hotfix03

    The program was configured to run for a maximum of 10 minutes (above), whether or not a user is logged on (below).

    Hotfix04

    Using Software Deployment: The Advertisement

    The advertisements were configured as mandatory assignments targeted at the appropriate collections (as an example, the 953290-x86 package was targeted to the I386 – OpsMgr Hotfixes collection). These advertisements were set to install regardless of maintenance windows but did not allow system restart outside of maintenance windows, shown below:

    Hotfix07

    We next configured the distribution points so that the programs would run directly from them as shown below:

    Hotfix08

    Software Deployment: Results

    Some of the Windows Server 2008 systems reported back an error of 1, but these appear to have been systems where the hotfix was previously deployed successfully. The hotfixes appeared to deploy without issue to the systems that did not already have them installed, although those systems require a reboot to complete the hotfix installation.

    Summary:

    ConfigMgr 2007 can be used to automate the deployment of the required Windows Server 2008 hotfixes to machines running the OpsMgr agent. This was not exactly a simple process, though, and hopefully there are better ways out there. If you know about them, please let us know!

    2/25/2009

    The OpsMgr 2007 R2 book

    Update on this for all our readers - the R2 book will be electronic only, and will contain supplemental information to what is currently included in System Center Operations Manager 2007 Unleashed.

    We are planning seven chapters, with updated appendices for the OpsMgr by Example series and Reference URLs. The seven chapters themselves will be entirely new content. Authors are:

    • Kerrie Meyler
    • Cameron Fuller
    • John Joyner
    • Andy Dominey
    • Marco Shaw, contributor

    A link to purchase the ebook from the Amazon page currently listing System Center Operations Manager 2007 Unleashed (www.tinyurl.com/27mqnm) will be available at a later date.

    1/5/2009

    OpsMgr by Example: The Active Directory 2008 Management Pack

    This blog entry is the next in a series of Operations Manager-related items that review the steps performed to install, configure, and tune management packs in real-world environments:

    Installation

    1) Download the Active Directory Management Pack (http://www.microsoft.com/downloads/details.aspx?FamilyId=008F58A6-DC67-4E59-95C6-D7C7C34A1447&displaylang=en). The Active Directory Management Pack Guide is included in the download and labeled “OM2007_MP_AD2008.doc.”

    2) Read the Management Pack guide – cover to cover. This document spells out in detail some important pieces of information you will need to know.

    3) Import the AD Management Pack (using either the Operations console or PowerShell).

    4) Deploy the OpsMgr agent to all domain controllers (DCs). The agent must be deployed to all DCs. Agentless configurations will NOT work for the AD Management Pack.

    5) Get a list of all domain controllers from the Operations console. In the Authoring space, navigate to Authoring -> Groups -> AD Domain Controller Group (Windows 2008 Server). Right-click on the group(s) and select View Group Members.

    6) Enable Agent Proxy configuration on all Domain Controllers identified from the groups. This is in the Administration space, under Administration -> Device Management -> Agent Managed. Right-click each domain controller, select Properties, click the Security tab, and then check the box labeled “Allow this agent to act as a proxy and discover managed objects on other computers.” Perform this action for every domain controller, even if the DC is added after your initial configuration of OpsMgr.

    7) Configure the Replication account in the Operations console, under Administration -> Security (full details for this are in the AD MP Guide). Do this for every domain controller, even if a DC is added after your initial OpsMgr configuration.

    8) Validate the existence of the “OpsMgrLatencyMonitors” container. Within this container, create sub-folders for each DC, using the name of each domain controller. If the container does not exist, it is often due to insufficient permissions. (See information configuring the Replication account within the AD MP Guide for details.)

    9) Open the Operations console. Go to the Monitoring node and navigate to Monitoring -> Microsoft Windows Active Directory -> Topology Views and validate functionality. (You may have to set the scope to the AD Domain Controllers Group to get these views to populate).

    10) Check to make sure Active Directory shows up under Monitoring -> Distributed Applications as a distributed application that is in the Healthy, Warning or Critical state. If it is in the “Not Monitored” state, check for domain controllers that are not installed or are in a “gray” state.

    11) Create a MicrosoftWindowsActiveDirectory_Overrides management pack to contain any overrides required for the MP (hey, if it’s not created now we’ll never remember to create it and we’ll end up using the default MP and that’s not good – see http://cameronfuller.spaces.live.com/blog/cns!A231E4EB0417CB76!1152.entry or System Center Configuration Manager 2007 Unleashed for details there).

    Deploying the Active Directory 2008 Management Pack was relatively painless. After importing the management pack, there was no significant impact on processors seen on the domain controllers. The Active Directory Topology Root appeared as a distributed application and showed a health state of green. The Active Directory diagram view also worked as expected.

    Tuning/Alerts to Look For

    We encountered and resolved the following alerts while tuning the Active Directory management pack.

    Alert: The AD Last Bind latency is above the configured threshold.

    Issue: One domain controller had consistently high AD Last Bind Latency. Logon to the system showed it as extremely unresponsive.

    From product knowledge, we used the suggested tasks to validate that the bind was not slow, and we identified no high-CPU processes on the system. The view available in product knowledge pointed to a large spike in the time required for the LDAP query (checking the Active Directory Last Bind counter). The spike occurred during a period of very heavy processor utilization on one of the domain controllers. This monitor checks every 5 minutes. The alert auto-resolved once the LDAP query was responding in an acceptable timeframe.

    Resolution: Attempts to debug the issue were inconclusive and extremely difficult due to the performance issue with the system. We rebooted the domain controller, it came back online, and the AD Last Bind Latency returned to normal values.

    Alert: A problem has been detected with the trust relationship between two domains.

    Issue: A server in a location (site 1) lost communication with domain controllers that existed in a second location (site 2). This critical alert did NOT auto-resolve. This was detected by the alert rule “A problem has been detected with the trust relationship between the two domains.” We verified that the Last Modified date occurred during the outage (add this column to the display by personalizing the view on the Active Alerts to include the field) and the Repeat Count was not incrementing.

    Resolution: We used the Active Directory Domain Controller Server 2008 Computer Role Task of Enumerate Trusts to validate all trusts were working after site connectivity was re-established. We then logged into the domain controller reporting the error and used the Active Directory Domains and Trusts UI to validate each of the trusts. We closed the alert manually.

    Alert: A problem with the inter-domain trusts has been detected.

    Issue: A server in a location (site 1) lost communication with domain controllers that existed in a second location (site 2). This critical alert did NOT auto-resolve. This was detected by the AD Trust Monitoring monitor which runs every 5 minutes using the AD Monitor Trusts script. We verified that the Last Modified date occurred during the outage (add this column to the display by personalizing the view on the Active Alerts to include the field) and the Repeat Count was not incrementing.

    Resolution: We used the Active Directory Domain Controller Server 2008 Computer Role Task of Enumerate Trusts to validate all trusts were working after site connectivity was re-established. We next logged into the domain controller reporting the error and used the Active Directory Domains and Trusts UI to validate each of the trusts. This alert should auto-resolve when the trust relationships are working, but that functionality does not appear to work. We manually closed the alert.

    Alert: AD Op Master is inconsistent.

    Issue: We tested using the Alert Monitor “AD Replication Partner Op Master Consistency,” which runs every minute, to verify the incoming replication partners for the domain controller show the same operations masters. We also used the REPADMIN Replsum task in the Active Directory MP.

    Resolution: The REPADMIN Replsum command validated that replication was functioning correctly (we had to override the “Support Tools Install Dir” on Windows 2008 to %windir%\system32 to make the task work correctly). The link between the domain controllers had been running close to fully saturated. The alert auto-resolved once the network utilization slowed down.

    Alert: AD Client Side - Script Based Test Failed to Complete.

    Issue: This alert is generated by the “AD Replication Partner Op Master Consistency” monitor. The system reporting the error was logging event ID 45 in the Operations Manager log from the Health Service Script source.

    This event is occurring on an hourly basis (12:57, 1:58, and so on):

    AD Replication Partner Op Master Consistency : The script 'AD Replication Partner Op Master Consistency' failed to execute the following LDAP query: '<LDAP://servername.contoso.com/CN=Configuration,DC=CONTOSO,DC=COM>;(&(objectClass=crossRefContainer)(fSMORoleOwner=*));fSMORoleOwner;Subtree'.

    The error returned was 'Table does not exist.' (0x80040E37)

    This alert is linked to “Could not determine the FSMO role holder.” alerts that are occurring.

    Resolution: We believe this was related to a misconfiguration of the anti-virus settings on the domain controllers in the environment.

    Alert: DC has failed to synchronize its naming context with replication partners.

    Issue: A server in a location (site 1) lost communication with domain controllers that existed in a second location (site 2). The rule generating this alert is “DC has failed to synchronize naming context with its replication partner”.

    Resolution: The alerts occurred when connectivity was lost between the sites. These alerts had a Repeat Count of 0. We used the REPADMIN Replsum command to validate that replication was functioning correctly (we had to override the “Support Tools Install Dir” on Windows 2008 to %windir%\system32 to make the task work correctly). We closed the alerts manually.

    Alert: Could not determine the FSMO role holder.

    Issue: Each domain controller in the environment reported the error when trying to determine the Schema Op Master on the various domain controllers. The rule generating this was “Could not determine the FSMO role holder”.

    Resolution: We used the NETDOM Query FSMO task (changing the Support Tools Install Dir to %windir%\system32) to validate the FSMO role holders on each domain controller.

    Alert: DC has failed to synchronize its naming context with replication partners.

    Issue: One of the domain controllers in the environment went to a grayed out status.

    The server having the issues reported the “DC has failed to synchronize its naming context with replication partners” issue and “A problem has been detected with the trust relationship between two domains” and “AD Replication is occurring slowly” and “Script Based Test Failed to Complete” (for multiple AD related scripts).

    Other domain controllers reported “Could not determine the FSMO role holder” and “AD Client Side – Script Based Test Failed to Complete”.

    Events also occurred on the client system (21006 OpsMgr Connector, 20057 OpsMgr Connector, 21001 OpsMgr Connector).

    Resolution: We installed the Telnet client feature to test connectivity to the management server. Telnet connectivity failed from this system but not from others. We then restarted the OpsMgr Health service but it had no effect on the gray status. After rebooting the system, the status went back to non-gray.
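If you do not want to install the Telnet client feature just to test connectivity, a few lines of script can perform the same TCP check. The following Python sketch is one way to do it. It assumes the agent talks to the management server on TCP port 5723, the OpsMgr default (the port is not stated above), and the server name in the usage comment is a placeholder.

```python
import socket


def can_connect(host, port=5723, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example usage (placeholder server name):
#   print(can_connect("ms01.contoso.com"))
```

As with Telnet, a successful connection only shows that the port is reachable; it does not prove that the Health service is communicating correctly.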

    Alert: AD Client Side - Script Based Test Failed to Complete.

    Issue: AD Replication Partner Op Master Consistency: The script 'AD Replication Partner Op Master Consistency' could not create object 'McActiveDir.ActiveDirectory'. This is an unexpected error. The error returned was 'ActiveX component can't create object' (0x1AD)

    Resolution: In MOM 2005, this was resolved by changing the Action account. In OpsMgr 2007, this alert occurred in a different domain than the one with the OpsMgr RMS server. To resolve this, we created a Run As Account for the domain (DMZ) and assigned the Run As Account to the AD domain controllers in the DMZ domain.

    Alert: Script Based Test Failed to Complete.

    Issue: AD Lost And Found Object Count: The script 'AD Lost And Found Object Count' failed to create object 'McActiveDir.ActiveDirectory'. This is an unexpected error. The error returned was 'ActiveX component can't create object' (0x1AD)

    Resolution: We configured the AD MP Account (Administration / Security / Run As Profiles) for each of the two servers in the domain that were reporting errors.

    Alert: Script Based Test Failed to Complete.

    Issue: AD Database and Log : The script 'AD Database and Log' failed to create object 'McActiveDir.ActiveDirectory'. The error returned was 'ActiveX component can't create object' (0x1AD).

    Resolution: We configured the AD MP Account (Administration -> Security -> Run As Profiles) for each of the two servers in the domain that were reporting errors.

    Alert: Performance Module could not find a performance counter.

    Issue: In PerfDataSource, could not resolve counter DirectoryServices, KDC AS Requests, Module will be unloaded.

    Resolution: We created a Run As Account and configured the AD MP Account (Administration -> Security -> Run As Profiles) for each of the two servers in the domain that were reporting errors.

    Alert: Script Based Test Failed to Complete.

    Issue: AD Database and Log : The script 'AD Database and Log' failed to create object 'McActiveDir.ActiveDirectory'. The error returned was 'ActiveX component can't create object' (0x1AD)

    Resolution: We installed OOMADS from the OpsMgr 2007 SP 1 CD.

    Alert: This domain controller has been promoted to PDC.

    Issue: No issue, this was an informational message. The message was generated when the PDC emulator role was moved between domain controllers.

    Resolution: No actions required, this message is provided for situations where the PDC emulator role was moved unexpectedly.

    Alert: The Domain Changes report has data available.

    Issue: No issue, this was an informational message. This was generated when the PDC emulator role was moved between domain controllers in the environment.

    Resolution: No actions required, this message is provided for situations where the PDC emulator role was moved unexpectedly.

    Alert: AD Domain Performance Health Degraded.

    Issue: More than 60% of the DCs contained in this AD Domain report a Performance Health problem

    Resolution: This alert indicates that more than 60% of the domain controllers in the domain have active alerts. The alert requires no action in itself, but it does call for analysis to determine what is putting the domain controllers into a degraded state.

    Alert: AD Site Performance Health Degraded.

    Issue: More than 60% of the DCs contained in this AD Site report a Performance Health problem

    Resolution: This alert indicates that more than 60% of the domain controllers in the site have active alerts. The alert requires no action in itself, but it does call for analysis to determine what is putting the domain controllers into a degraded state.
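    The rollup behind both of these alerts can be illustrated with a short sketch. This is not the management pack's own logic; the function name, the sample data, and the strictly-greater-than comparison are assumptions taken from the alert text ("more than 60%").

```python
def rollup_degraded(dc_states, threshold_pct=60.0):
    """Return True when more than threshold_pct of the DCs are unhealthy.

    dc_states maps a domain controller name to True (healthy) or
    False (reporting a Performance Health problem).
    """
    if not dc_states:
        return False  # nothing to roll up
    unhealthy = sum(1 for healthy in dc_states.values() if not healthy)
    return unhealthy * 100.0 / len(dc_states) > threshold_pct


# Hypothetical site with five domain controllers, three of them unhealthy.
site = {"DC01": False, "DC02": False, "DC03": False, "DC04": True, "DC05": True}

# 3 of 5 is exactly 60%, which is not "more than 60%".
print(rollup_degraded(site))
```

    A fourth unhealthy domain controller would push the same site over the threshold, which is why the alert is best treated as a pointer to the individual DC alerts underneath it.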

    Alert: Account Changes Report Available.

    Issue: Informational alert, which can be accessed in the AD SAM Account Changes report (available on the right side under Active Directory Domain reports).

    Resolution: No resolution required. We checked the AD SAM Account Changes report (available on the right-side under Active Directory Domain reports) to see the changes that were available.

    During our testing, we had a period of time when we lost network connectivity to a site that had one of the domain controllers. The result was a flurry of alerts listed below:

    Alerts:

    Critical Alerts:

    • A problem with the inter-domain trusts has been detected
    • DNS 2008 Server External Addresses Resolution Alert
    • OleDB: Results Error

    Warnings:

    • A problem has been detected with the trust relationship between two domains
    • AD Client Side - Script Based Test Failed to Complete (multiple)
    • Could not determine the FSMO role holder. (multiple)
    • DC has failed to synchronize its naming context with replication partners (multiple)

    Issue: Loss of network connectivity between one site and another, both of which had domain controllers.

    Resolution: Once network connectivity was re-established, we resolved all issues identified above.

     

    UPDATE: 02/25/09

    Alert: The Op Master Schema Master Last Bind latency is above the configured threshold.

    Issue: A large number of alerts are generated at > 5 seconds for warning and > 15 seconds for error.

    Resolution: Per http://technet.microsoft.com/en-us/library/cc749936.aspx, the effective thresholds should be changed to warning at > 15 seconds and error at > 30 seconds. We created an override for all objects of type Active Directory Domain Controller Server 2008 Computer Role that changes Threshold Error Sec to 30 and Threshold Warning (sec) to 15, and stored it in the ActiveDirectory2008_Overrides management pack.

     

    Alert: The Op Master Domain Naming Master Last Bind latency is above the configured threshold.

    Issue: A large number of alerts are generated at > 5 seconds for warning and > 15 seconds for error.

    Resolution: Per http://technet.microsoft.com/en-us/library/cc749936.aspx, the effective thresholds should be changed to warning at > 15 seconds and error at > 30 seconds. We created an override for all objects of type Active Directory Domain Controller Server 2008 Computer Role that changes Threshold Error Sec to 30 and Threshold Warning (sec) to 15, and stored it in the ActiveDirectory2008_Overrides management pack.
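    To see what the override changes in practice, here is a minimal sketch that classifies a bind latency under the original thresholds (5 and 15 seconds) and under the overridden ones (15 and 30 seconds). The function name and the sample latencies are made up for illustration; the monitor in the management pack is not implemented this way.

```python
def classify_bind_latency(seconds, warning_sec=15, error_sec=30):
    """Classify a last-bind latency against warning and error thresholds."""
    if seconds > error_sec:
        return "error"
    if seconds > warning_sec:
        return "warning"
    return "healthy"


# Compare the original thresholds (5/15) with the overridden ones (15/30).
for latency in (4, 12, 20, 45):
    before = classify_bind_latency(latency, warning_sec=5, error_sec=15)
    after = classify_bind_latency(latency)
    print(f"{latency:>2}s  before override: {before:<8} after override: {after}")
```

    A 12-second bind, for example, raised a warning before the override and is considered healthy afterwards, which is where most of the alert volume went away.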

    1/2/2009

    MOM MVPs - yes we're still here!

    We are happy to announce that Kerrie, Cameron, and Andy were re-elected this month as MOM MVPs for the next 12 months (John renewed in October). The Microsoft MVP designation is awarded on an annual basis. We look forward to continuing to work with the MOM team for the Operations Manager 2007 R2 rollout and keeping you posted on the latest and greatest.

    A happy New Year's to all!

    12/17/2008

    OpsMgr by Example: Agent Deployment using ConfigMgr

    We return to our OpsMgr by Example series with a discussion of deploying the OpsMgr Agent.

    Scenario: If you're looking for an automated way to provision new systems into management and monitoring solutions without human intervention, you will want to consider using Configuration Manager 2007 for deploying the agent, combined with AD Integration to identify the management server.

    We begin the process with required changes for Active Directory.

    Active Directory

    By default, all new computers added to Active Directory are placed in the Computers container. We will be changing this as we need to define a group policy to deactivate the firewall on systems so the ConfigMgr client can be pushed out to those systems. Since Active Directory does not allow applying a group policy to the Computers container, we will change the default location for new systems added to Active Directory.

    We created a new Build OU structure, applied a GPO to remove the firewall policy, and then changed the default location where new computers were added to the domain.

    Defining the Build OU: We started by creating a top-level OU structure called Build.

    GPO to remove the firewall policy: We next defined a group policy that disabled the firewall and linked it to the new OU structure (Build) as shown below:

    ConfigMgrandOpsMgr06

    Setting the default computers location: To change the default location for computer objects, use the redircmp command (sample syntax for this command is redircmp ou=build,dc=contoso,dc=com).

    Once the computers complete their ConfigMgr and OpsMgr agent deployment, they can be moved to their final OU location. Our next step is configuring ConfigMgr agent deployment.

    Configuration Manager 2007

    On the Configuration Manager side, automated agent deployment is pretty straightforward. We need to configure discovery, agent deployment, and site mode settings within Configuration Manager 2007.

    ConfigMgr Discovery: Discovery is used to discover new systems. Active Directory System Discovery is configured to run daily at midnight for all domains in our ConfigMgr environment.

    ConfigMgr Agent Deployment: Agents will be deployed using the Client Push Installation method for servers, workstations, domain controllers and site systems.

    ConfigMgr Site Mode Configuration: We need ConfigMgr to automatically approve all computers in trusted domains. If computers outside the trusted domains will be managed, the only way to automate the process is to select the Automatically approve all computers (not recommended) option, which we do not recommend. Configure the setting on the Site Mode tab of the site's Properties:

    ConfigMgrandOpsMgr01

    After deploying Configuration Manager agents to the systems previously discovered, we can use collections to gather up the targets to which to distribute the OpsMgr Agent. We will also create the required packages and advertisements.

    OpsMgr Package Creation: To effectively target the correct version of the Operations Manager 2007 agent we created three different packages (AMD64, I386, IA64) as displayed here.

    ConfigMgrandOpsMgr02

    For each package, the Per-system unattended program is restricted on its Requirements tab to run only on specific client platforms (as an example, the program in the I386 package can only run on I386 systems).

    OpsMgr Exclusion Collection #1: The Operations Manager agent is not designed to be pushed to Operations Manager servers. To avoid this issue, we created a collection including all the OpsMgr servers, based upon the naming convention in place. For this particular environment, all OpsMgr servers were labeled with OMRM (Operations Manager RMS), OMMS (Operations Manager Management Server), and OMGW (Operations Manager Gateway Server). We defined the collection to update on an hourly basis. The syntax for this collection follows:

    select SMS_R_System.ResourceId, SMS_R_System.ResourceType, SMS_R_System.Name, SMS_R_System.SMSUniqueIdentifier, SMS_R_System.ResourceDomainORWorkgroup, SMS_R_System.Client from SMS_R_System where SMS_R_System.Name like "%OMRM%" or SMS_R_System.Name like "%OMMS%" or SMS_R_System.Name like "%OMGW%" order by SMS_R_System.Name
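    The matching this query performs can be sketched in a few lines. The sketch assumes the comparison is case-insensitive and treats `like "%TAG%"` as a plain substring test; the server names are invented. It also shows why a naming-convention query deserves a quick review of its results: any machine whose name happens to contain one of the tags is swept in as well.

```python
OPSMGR_ROLE_TAGS = ("OMRM", "OMMS", "OMGW")


def is_opsmgr_server(name, tags=OPSMGR_ROLE_TAGS):
    """Approximate: Name like "%OMRM%" or "%OMMS%" or "%OMGW%"."""
    upper = name.upper()
    return any(tag in upper for tag in tags)


# Hypothetical computer names. ROOMMSG01 is not an OpsMgr server,
# but its name contains "OMMS", so the query would still pick it up.
names = ["NYCOMRM01", "dalomgw02", "NYCSQL01", "ROOMMSG01"]
excluded = sorted(n for n in names if is_opsmgr_server(n))
print(excluded)
```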

    OpsMgr Exclusion Collection #2: The Operations Manager agent is not designed to be deployed to domain controllers from ConfigMgr, as AD Integration does not work on domain controllers. To work around this, we created a collection that included all the Domain Controllers based on the role of the server. The collection was defined to update on an hourly basis. The syntax for this collection follows:

    select SMS_R_SYSTEM.ResourceID,SMS_R_SYSTEM.ResourceType,SMS_R_SYSTEM.Name,SMS_R_SYSTEM.SMSUniqueIdentifier,SMS_R_SYSTEM.ResourceDomainORWorkgroup,SMS_R_SYSTEM.Client from SMS_R_System inner join SMS_G_System_COMPUTER_SYSTEM on SMS_G_System_COMPUTER_SYSTEM.ResourceID = SMS_R_System.ResourceId where SMS_G_System_COMPUTER_SYSTEM.Roles like "%Domain_Controller%" order by SMS_R_System.Name

    OpsMgr Top-Level Collection: To determine which systems need the OpsMgr agent deployed, we created a top-level collection (OpsMgr Client Deployment) designed to include all servers in a specific set of domains that are not domain controllers or OpsMgr servers. This collection was defined to update on an hourly basis. The query selects only those systems that are Configuration Manager clients, belong to one of the appropriate domains, and run a server-level operating system. The last two clauses of the query remove the members of the two OpsMgr Exclusion collections defined previously.

    select SMS_R_SYSTEM.ResourceID,SMS_R_SYSTEM.ResourceType,SMS_R_SYSTEM.Name,SMS_R_SYSTEM.SMSUniqueIdentifier,SMS_R_SYSTEM.ResourceDomainORWorkgroup,SMS_R_SYSTEM.Client from SMS_R_System where SMS_R_System.Client = 1 and SMS_R_System.ResourceDomainORWorkgroup in ("domain1","domain2","domain3") and SMS_R_System.OperatingSystemNameandVersion like "%Server%" and SMS_R_System.ResourceId not in (Select ResourceID from SMS_FullCollectionMembership where CollectionID="XYZ0003E") and SMS_R_System.ResourceId not in (Select ResourceID from SMS_FullCollectionMembership where CollectionID="XYZ0003D")
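    Read as plain logic, the query applies four tests to every discovered system. The sketch below restates them; the dictionary keys, the sample inventory, and the resource IDs are hypothetical stand-ins for the SMS_R_System attributes and the two exclusion collections, and the case-insensitive comparisons are an assumption.

```python
def needs_opsmgr_agent(system, domains, excluded_ids):
    """Mirror the four conditions of the top-level collection query."""
    allowed = {d.upper() for d in domains}
    return (
        system["client"] == 1                              # is a ConfigMgr client
        and system["domain"].upper() in allowed            # in a targeted domain
        and "server" in system["os"].lower()               # server operating system
        and system["resource_id"] not in excluded_ids      # not in an exclusion collection
    )


# Hypothetical discovery data.
inventory = [
    {"resource_id": 101, "name": "NYCSQL01", "client": 1, "domain": "DOMAIN1",
     "os": "Microsoft Windows NT Server 5.2"},
    {"resource_id": 102, "name": "NYCOMMS01", "client": 1, "domain": "DOMAIN1",
     "os": "Microsoft Windows NT Server 5.2"},
    {"resource_id": 103, "name": "NYCDC01", "client": 1, "domain": "DOMAIN2",
     "os": "Microsoft Windows NT Advanced Server 5.2"},
    {"resource_id": 104, "name": "NYCWKS07", "client": 1, "domain": "DOMAIN1",
     "os": "Microsoft Windows NT Workstation 5.1"},
    {"resource_id": 105, "name": "LABSRV01", "client": 0, "domain": "DOMAIN3",
     "os": "Microsoft Windows NT Server 5.2"},
]

# Members of the two exclusion collections (OpsMgr servers and DCs).
excluded = {102, 103}

targets = [s["name"] for s in inventory
           if needs_opsmgr_agent(s, ("domain1", "domain2", "domain3"), excluded)]
print(targets)
```

    Only NYCSQL01 passes all four tests: the OpsMgr server and the domain controller are excluded by collection membership, the workstation fails the operating system test, and LABSRV01 is not yet a ConfigMgr client.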

    OpsMgr Sub-Collections: Under the top level collection, we created one sub-collection for each of the different types of OpsMgr agent packages that we designed, displayed here: 

    ConfigMgrandOpsMgr03

    Each sub-collection is defined based upon the type of the computer system. As an example, the AMD64 collection used the following:

    select SMS_R_SYSTEM.ResourceID,SMS_R_SYSTEM.ResourceType,SMS_R_SYSTEM.Name,SMS_R_SYSTEM.SMSUniqueIdentifier,SMS_R_SYSTEM.ResourceDomainORWorkgroup,SMS_R_SYSTEM.Client from SMS_R_System inner join SMS_G_System_COMPUTER_SYSTEM on SMS_G_System_COMPUTER_SYSTEM.ResourceId = SMS_R_System.ResourceId where SMS_G_System_COMPUTER_SYSTEM.SystemType = "x64-based PC"

    Advertisements: Now that we have all the other required pieces, we create an advertisement for each package we created to push to their appropriate sub-collection, as displayed:

    ConfigMgrandOpsMgr04

    We defined the advertisements and then assigned them to their appropriate sub-collection. As an example, the Per-system unattended to AMD64 was assigned to the AMD64 sub-collection under the OpsMgr Client Deployment collection.

    Operations Manager 2007

    Next, to push the Operations Manager 2007 agent out to the appropriate systems, we need to configure AD Integration and set auto-approval for agent installations. 

    AD Integration: Using AD Integration, information is stored in Active Directory specifying the management group in which to add an agent. There are some excellent articles available on configuring AD Integration. We recommend either System Center Operations Manager 2007 Unleashed, or the guide available at System Center Forum (http://www.systemcenterforum.org/downloads/active-directory-integration-in-ops-mgrs-2007).

    Auto-approve within OpsMgr: Once AD Integration is in place, we need to tell OpsMgr that it needs to automatically approve manually installed agents, since a ConfigMgr-installed OpsMgr agent is treated as a manually installed agent. The configuration is shown below as part of settings in the Administration node of the OpsMgr console:

    ConfigMgrandOpsMgr05

    With OpsMgr configured to use AD Integration and to auto-approve manually installed agents, we have completed the OpsMgr piece of this configuration. Now let's see how it all comes together!

    Auto-Deployment Example:

    Let's say a system was added to one of the new domains at the end of day Monday and discovery started the next morning. The deployment process would look similar to the following:

    12:29 am: Active Directory System Discovery occurs within the new domain, and the new computer system is found.

    12:31 am: ConfigMgr agent installation started.

    12:39 am: ConfigMgr agent reporting in.

    12:47 am: New server appears in the top-level OpsMgr Client Deployment collection.

    12:53 am: New server appears in the I386 sub-collection.

    1:48 am: OpsMgr software installed on the new server.

    2:00 am: New server reporting into OpsMgr correctly.
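    A quick calculation over the timeline above shows where the time goes. The times are the ones listed; the helper function and the shortened step labels are ours.

```python
TIMELINE = [
    ("12:29 am", "AD System Discovery finds the new computer"),
    ("12:31 am", "ConfigMgr agent installation started"),
    ("12:39 am", "ConfigMgr agent reporting in"),
    ("12:47 am", "Server appears in the top-level collection"),
    ("12:53 am", "Server appears in the I386 sub-collection"),
    ("1:48 am", "OpsMgr software installed"),
    ("2:00 am", "Server reporting into OpsMgr"),
]


def minutes_since_midnight(clock):
    """Convert a string such as '12:29 am' to minutes after midnight."""
    time_part, meridiem = clock.split()
    hours, minutes = (int(part) for part in time_part.split(":"))
    hours %= 12                      # 12 am is hour 0, 12 pm is hour 12
    if meridiem.lower() == "pm":
        hours += 12
    return hours * 60 + minutes


marks = [minutes_since_midnight(clock) for clock, _ in TIMELINE]
steps = [later - earlier for earlier, later in zip(marks, marks[1:])]
total = marks[-1] - marks[0]

for (clock, label), waited in zip(TIMELINE[1:], steps):
    print(f"{clock:>8}  +{waited:>2} min  {label}")
print(f"Total from discovery to reporting: {total} minutes")
```

    The whole sequence takes 91 minutes, and the 55-minute wait between landing in the sub-collection and the OpsMgr software install is by far the longest step.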

    Summary

    By combining multiple technologies such as Active Directory, Group Policies, ConfigMgr, AD Integration, and OpsMgr, we can provide a fully automated method for deploying both the ConfigMgr and OpsMgr agents, greatly simplifying the process of managing and monitoring servers.

    12/10/2008

    Deploying Operations Manager 2007 in Highly Available and Distributed Enterprise Environments

    Andy Dominey, of System Center Operations Manager 2007 Unleashed fame and a contributor to this blog, has completed a white paper on best practices and considerations for designing a highly available OpsMgr 2007 infrastructure. Andy has quite a bit of experience in this area, and working with Microsoft, has come up with some unique approaches.

    The paper is available at http://www.it-jedi.net/downloads/Deploy_OpsMgr%20in%20HA_EntEnv-ADEd_v1.0.pdf and discussed at http://www.systemcenterforum.org/news/deploying-opsmgr-in-highly-available-and-distributed-enterprise-environments-download/. Happy reading!

    12/9/2008

    WOW - Ops-Mgr blog with 250,000 Hits!

    It was only at the end of August 2008 that this blog crossed the 200,000 hit mark (see http://ops-mgr.spaces.live.com/blog/cns!3D3B8489FCAA9B51!772.entry). Today we are very excited to report we have over 250,000 page views to date.

    Thank you once again to everyone who has contributed to the blog via articles, comments, or questions, and sent words of encouragement and thanks.

    - Kerrie, Cameron, John, and Andy   12/09/08

    12/1/2008

    Fix for SQL Server Management Pack version 6.0.6441.0

    Version 6.0.6441.0 of SQL Server Management pack, which includes support for monitoring SQL Server 2008 databases, was released to Microsoft's Management Pack catalog October 29th of this year. It has since been discovered that the discovery scripts fail to discover SQL Servers with databases larger than 32 GB in size. Not only that, once the discovery fails no new discovery will occur. (Discovery worked fine in earlier versions of the management pack.)

    A fix has been written for this and tested, but due to delays in getting it posted to the Management Pack catalog, the MOM Team is making the download available from their blog (http://blogs.technet.com/momteam). Once the updated management pack is available on the catalog (http://technet.microsoft.com/en-us/opsmgr/cc539535.aspx), it will be removed from the MOM Team blog.

    If you need the fixed management pack immediately, you can download it now from http://blogs.technet.com/momteam/archive/2008/11/13/the-database-discovery-scripts-discoversql200-db-vbs-fail-on-sql-servers-with-dbs-larger-than-32gb-in-the-6-0-6441-0-version-of-the-sql-mp.aspx.

    Using Watcher Nodes with Synthetic Transactions

    We've been asked by several (see http://ops-mgr.spaces.live.com/blog/cns!3D3B8489FCAA9B51!858.entry) about how to use watcher nodes to run synthetic transactions created with VBScript. Here's an approach:

    1. Create a group, add the watcher node(s) to the group and target the monitor to the group. Note that if you need to add this to a distributed application, it is best to create a custom class in the Authoring console.
    2. Next, configure your script to run on the watcher node. As for the script/monitor, you will most likely need to use a probe rule to run the script.

    If you are using a monitor, the script would also have to do something like log to the event log - with a monitor configured to look for these events for its state change.

    This can entail a bit of work up front, but a lot of the stuff such as the class object could be reused.
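    For readers trying to picture what such a script does on the watcher node, here is a rough sketch of an FTP connect-and-upload check of the kind described in the reader question linked above. In OpsMgr 2007 the real script would be VBScript returning a property bag; Python is used here only to show the shape of the check. The function name, the probe file name, the result keys, and the stand-in FTP class are all hypothetical.

```python
import ftplib
import io


def ftp_upload_check(host, user, password, ftp_factory=ftplib.FTP, timeout=30):
    """Connect, log in, upload a small probe file, and report the outcome.

    Returns a dict that plays the role of the property bag: a Status
    the monitor can key on, plus the server reply or error text.
    """
    result = {"Status": "Failed", "Detail": ""}
    try:
        ftp = ftp_factory(host, timeout=timeout)
        try:
            ftp.login(user, password)
            # storbinary raises if the server does not answer with a 2xx reply.
            reply = ftp.storbinary("STOR opsmgr_probe.txt", io.BytesIO(b"probe"))
            result["Status"] = "OK"
            result["Detail"] = reply
        finally:
            ftp.close()
    except ftplib.all_errors as exc:
        result["Detail"] = str(exc)
    return result


class DemoFTP:
    """Stand-in for ftplib.FTP so the sketch runs without a real server."""

    def __init__(self, host, timeout=None):
        self.host = host

    def login(self, user, passwd):
        return "230 Login successful"

    def storbinary(self, cmd, fp):
        return "226 Transfer complete"

    def close(self):
        pass


print(ftp_upload_check("ftp.example.com", "probe", "secret", ftp_factory=DemoFTP))
```

    Leaving `ftp_factory` at its default makes the same function talk to a real server from the watcher node; the returned status is what a monitor would evaluate for its state change.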

    As far as using the Perspective class (asked by some) - that's probably not a good approach. It isn’t really possible to make a web request appear as a member of the Perspective class; when you create a web transaction using the wizard, it goes off and creates a ton of rules/monitors and classes in the background which would be difficult if not impossible to reproduce outside of the wizard.

    Hope this helps!

    Please note that these are perspectives of the coauthors and contributors of System Center Operations Manager 2007 Unleashed.

    11/20/2008

    OpsMgr 2007 R2 Beta 1 available on Connect

    That's right! Yesterday afternoon the MOM Team released Beta 1 to the web. This is a public beta, build 6407, and available for download at http://connect.microsoft.com. We'll be highlighting results of our testing on this blog as we work on the R2 update for System Center Operations Manager 2007 Unleashed.

    P.S. Don't try downloading it using Firefox. A colleague continually received the error Unable to complete application validation... check is overriden until switching to Internet Explorer. :)

    11/5/2008

    OpsMgr 2007 will have a R2

    Yep, it's official! Announced at TechEd Barcelona Monday (November 3), with a public beta available towards the end of November. R2 includes the Cross Platform extensions (X-Plat, see http://www.networkworld.com/community/node/27600 and http://ops-mgr.spaces.live.com/blog/cns!3D3B8489FCAA9B51!857.entry) plus a number of other anticipated enhancements. We expect to see enhanced service level monitoring reports and dashboards, User Interface improvements including a new version of the Operations Management Pack Authoring console, and core product enhancements.

    R2 Content for OpsMgr 2007 Unleashed!

    The authors of System Center Operations Manager 2007 Unleashed are planning an update to encompass the new functionality available with OpsMgr 2007 R2, and we will be participating in the beta test cycle as we develop content for the new edition.

    10/19/2008

    Gartner Reports on OpsMgr 2007 Progress

    Dave Berkowitz gives us a link to a Gartner report on Operations Manager 2007 released earlier this month. You can view the report at http://mediaproducts.gartner.com/reprints/microsoft/vol10/article2and3/article2and3.html.

    Dave's article is posted on the System Center Team blog at http://blogs.technet.com/systemcenter/archive/2008/10/09/gartner-reports-on-opsmgr-2007-progress.aspx. We like that he uses the word 'unleashed' in his text! We include his article below as well (emphasis ours):

    Hello All.

    It wasn’t so long ago (March 2007) that Microsoft unleashed System Center Operations Manager 2007 as a successor to Microsoft Operations Manager 2005, and it’s only been a few months (April 2008) since we followed up with a pretty major announcement of a cross-platform strategy for the product.

    Now, Gartner analyst David Williams has published an in-depth look at OpsMgr, evaluating its enhancements against customer requirements, describing how the product is getting traction in the market and making some predictions about future adoption.

    Unfortunately, I am restricted, as the analyst relations manager for the System Center business, from telling you exactly what’s in the report. But I can tell you it is a good read, provides a nice update on what customers have been doing with OpsMgr 2007 and, while we may not agree with all of the conclusions, it does have some interesting thoughts for customers looking at next steps for monitoring or event correlation and analysis (ECA) products.

    Go read the report at:

    http://mediaproducts.gartner.com/reprints/microsoft/vol10/article2and3/article2and3.html

    10/6/2008

    A question from Andy Howell regarding Synthetic Transactions

    The following message was posted as an email to this blog last Thursday:

    In Operations Manager 2007 Unleashed, you say that it is possible to create synthetic transactions using VBscript. You go on to work through a few examples of synthetic transactions based on pre-defined templates in Operations Manager. As part of this, you describe how to use watcher nodes to run these transactions.
    Unfortunately I've not been able to find many examples of people who are actually doing this, so I'm turning to your site in desperation. I have an FTP server and need to prove that I can establish a connection and transfer a file to it. What I thought was the hard bit - I'm not a coder! - is done. A small VBscript establishes a connection, transfers a file and returns the status code to Operations Manager in a property bag.
    What I'm struggling with is the simple bit: where to run it. I don't want to run it on the FTP server itself, as this wouldn't prove anything. Ideally I would like to use a watcher node, but I can't find a way to do this without using one of the pre-defined templates for synthetic transactions, none of which allow me to run my own VBscript. Am I missing something obvious?
    By the way, and regardless of whether you can answer this post or not, System Center Operations Manager 2007 Unleashed is a truly excellent book. In my 12 year career I've read through many IT titles. This one really stands as an example of what a good IT book should be: it covers the theory, operation and real-life scenarios (for example database sizes) in just the right level of detail.
    Andy

    Unfortunately, Andy's communication preference settings don't allow us to respond to him, and we would like some additional details as we may need to have him try some things.

    Andy, can you email ops-mgr@hotmail.com with an email address that we can reply to? Thanks!

    X-Plat: The OpsMgr Gateway to Linux in the Datacenter

    At MMS 2008 last May, Microsoft announced their direction to use Operations Manager to manage non-Windows systems (for more information, see Kerrie’s articles “Of Flying Pigs” at http://www.networkworld.com/community/node/27600 and “The Dynamic Datacenter” at http://www.networkworld.com/community/node/27354). This article discusses our experiences testing a beta version of the Cross Platform (X-Plat) software.

    The Conventional OpsMgr Gateway Role

    Let’s say you have computers at a branch office, in the offices of a partner or customer, or in a datacenter that resides on an untrusted and/or unconnected network. You put an OpsMgr gateway server on that remote network and connect it to your main OpsMgr management group with certificate-based authentication. Cool technology, and you are now monitoring those remote systems from your main location without standing up any new connectivity and potentially increasing the attack surface.

    New OpsMgr/X-Plat Gateway Scenario

    Before Microsoft introduced the Cross-Platform beta 1 refresh, you could not leverage that secure yet lightweight OpsMgr gateway service for monitoring any Linux computers at your remote location with anything more than a basic SNMP heartbeat. This article reviews this new feature of the Microsoft System Center Operations Manager 2007 Cross Platform Extensions Public Beta 1 Refresh. The software allows OpsMgr gateway servers to discover and fully manage non-Windows computers at remote network locations. This capability opens a new market for Operations Manager with a novel solution to extend management to Linux and other X-Plat systems such as HP-UX or Solaris and even AIX, which were previously out of reach of native System Center tools.

    Note: We review here the second released beta for X-Plat. Features and function will change in the released product. Microsoft plans to release X-Plat as part of an update to OpsMgr in 2009.

    Demo environment

    An OpsMgr management group with Internet-facing gateway servers includes a gateway server at a remote datacenter. All gateway servers trust the same Certificate Authority (CA) and use unique identity certificates issued by the mutually trusted NOC CA for encryption and authentication. There is a Red Hat Enterprise Linux server (RHEL) at the remote site. We want to use the gateway server to monitor the Linux server from the NOC.

    Here are the steps we took to discover and manage the RHEL box at the remote datacenter:

    1. Install the X-Plat extensions on a selected management server and consoles. The official name of the installable is “System Center Operations Manager Cross-Platform Extensions.” Prerequisites include OpsMgr 2007 SP1 and WS-Management (WS-Man) 1.1.

      Something we liked a lot is that you don’t need to touch the RMS or any high-value management servers to use X-Plat. You only need to install X-Plat extensions on the management server you will run the discovery wizard from.

      There are 32-bit and X64 versions of X-Plat, and also full server and console only versions (a total of four .MSI files to select from). Install the console-only executable on other OpsMgr consoles you will use to monitor the cross-platform systems from.

    2. Import the desired X-Plat management packs. The server X-Plat extensions setup defaults to dumping about 14 management packs (for all the operating systems supported by X-Plat) to the %programfiles%\System Center Management Packs folder. You only need to import the libraries and management packs needed to manage your target systems. To manage the RHEL 5 box, we imported these management packs:
        • WS-Management Library
        • Linux Operating System Library
        • Unix View Library
        • Red Hat Operating System Library
        • Red Hat Enterprise Linux Server 5 Operating System management packs
    3. Run ImportXSLT.cmd on those computers where you installed the X-Plat extensions (management server and consoles). This small step changes how the task output and diagnostic and recovery messages generated by Health Explorer on Unix and Linux computers are displayed. This step has to take place after the X-Plat management packs are imported or you will receive an error.
    4. Install the X-Plat extensions on the gateway server. Repeat the installation, similar to the management server. As an additional step, create a UnixAgents folder in the AgentManagement folder of the gateway server and extract the UnixAgents.zip that comes with X-Plat to that folder. When the gateway pushes the agent to the Linux server at the datacenter, the Linux bits will come from that folder.
    5. Configure the management group Run As Accounts. There is some manual work for the OpsMgr administrator to let the X-Plat extensions on the gateway server know what the credentials are to access the Linux computer.
        1. In the Administration -> Security -> Run As Accounts node of the Operations console, create two new Run As Accounts of the Basic Authentication type. One is a normal user account on the Linux computer and one is a privileged account. For the demo, we used the same root account and password for both Run As accounts. Name the accounts something that identifies them with the gateway server.
        2. In the Security -> Run As Profiles node, locate the Unix Privileged Account Run As Profile and associate it with the privileged Run As Account and the target of the gateway server with X-Plat Extensions. Similarly, associate the Unix Action Account Run As Profile with the normal user Run As Account and the target of the gateway server.
        3. This beta release of X-Plat extensions only provides for a single pair of Run As Accounts per management server or per gateway server that performs the discovery and monitoring. To monitor other Linux computers with different sets of credentials requires an additional management server or gateway server for each set of credentials. This is a product limitation we hope is overcome in future releases.

    6. Discover and accept the Linux server from the management server. This is just like using the Discovery Wizard from the Administration space of the Operations console, except you launch the X-Plat discovery process from the Overview page of the Cross Platform management pack in the Monitoring space. (In later releases X-Plat discovery is expected to migrate to the Administration space and integrate with Windows computer and network device discovery.)
        • An issue with this beta release of X-Plat is that it does not yet support discovery of the most current versions of some Linux distributions. In the environment where the demo Linux computer is located, datacenter security policies require that Linux distributions be kept current.

          While RHEL 5.2 is the current release, X-Plat only discovers up to RHEL 5.1. (Our hope and assumption is that the RHEL 5.1 agent will work on 5.2.) We expect that with future releases of X-Plat, there will be a community effort to keep X-Plat management packs updated with discovery support for more versions and releases.

          There is a manual install option for the X-Plat agent, which in this case would be as follows (the RPM file can be found in the UnixAgents folder on the gateway server):

          rpm -i scx.1.0.1-151.rhel.x86.rpm

          Another solution that enables use of the automatic discovery and integrated features of the X-Plat management packs is to 'trick' the discovery into thinking that the RHEL 5.1 version is installed on the target computer. We used this method, and pushed the version RPM file for 5.1 to the target computer running RHEL 5.2 with this command:

          rpm -i --force redhat-release-5Server-5.1.0.2.i386.rpm

          The --force switch is used since there is a file version downgrade. That RPM file is part of the RHEL 5.1 Server distribution. To later restore the RHEL 5.2 version file, it's enough to run the command "yum update redhat-release-5server" for the single package, or "yum update" to update any other pieces with patches since it was installed.

        • Perform the discovery from the console of a management server where X-Plat Extensions is installed. You need privileged access to the Linux server to push the agent. If you don’t have a superuser account, you need to provide the root user password. After you specify the IP address and privileged account information for the target, if the computer is discoverable, it will shortly appear as seen in this screenshot of the Select Computers to Manage step in the Unix and Linux Computer Management Wizard:

      Discovery

      After approving the discovered Linux computer, the gateway server uses SSH to push the System Center Cross-Platform (SCX) agent to the /tmp folder of the Linux computer. After a few minutes you can query the state of the two services that are started by the SCX agent. See this screen shot of an SSH session from the gateway server to the managed Linux server, confirming that the WS-Man daemon and the CIM server are up:

      Putty

      Managing Red Hat Linux with Operations Manager

      Soon after completing these actions, the RHEL computer appeared in the Linux Servers state view of the OpsMgr console. Next, data started appearing in the memory and processor-related views. Some hours later, the disk and network views were populated. We received some alerts regarding invalid SSH authentication attempts, and we immediately had a solid feeling about our ability to really manage Linux boxes from Windows with OpsMgr.

      Here is a screenshot of an alert related to security of the SSH services on the RHEL box:

      SSHAlert

      An Internet-facing web server is going to get a lot of intrusion attempts against any open service. We secured the SSH services on the RHEL box with these host rules (and the alerts stopped!):

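      1. Edit /etc/ssh/sshd_config, the SSH server's configuration file (/etc/ssh/ssh_config affects only outbound client connections, and sshd must be restarted for changes to take effect)
        1. “vi /etc/ssh/sshd_config”
        2. Press “i” to allow modification of file contents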
      2. Modify line to restrict SSH protocol to version 2
        1. Locate line “# Protocol 2,1”
        2. Remove “#” from beginning of line, and “,1” from end of line
      3. Save the file
        1. Press Esc, type “:wq”, and press Enter
      4. Modify hosts.deny file to deny all hosts access to SSH
        1. “vi /etc/hosts.deny”
        2. Press “i” to allow modification of file contents
        3. Add this to the next available blank line: “sshd: ALL”
        4. Press Esc, type “:wq”, and press Enter
      5. Modify hosts.allow file to permit specific hosts to connect via SSH
        1. “vi /etc/hosts.allow”
        2. Press “i” to allow modification of file contents
        3. Add this to the next available blank line: “sshd: <ip address of permitted host> <ip address of permitted host> …” (the “…” means “and so on”; it is not typed literally)
        4. Press Esc, type “:wq”, and press Enter

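The manual edits above can also be scripted. The following is a rough sketch, not the command list we were given: the function name harden_ssh is ours, the file paths are passed in as arguments (the first being the SSH configuration file edited in step 1), and the in-place edit assumes GNU sed as shipped with RHEL. Back up the files before trying anything like this.

```shell
#!/bin/sh
# Sketch of the manual edits as a function. Arguments:
#   $1 = SSH configuration file edited in step 1
#   $2 = hosts.deny path, $3 = hosts.allow path
#   remaining arguments = IP addresses permitted to connect
harden_ssh() {
    conf=$1; deny=$2; allow=$3
    shift 3

    # Uncomment the Protocol line and drop the ",1" fallback (GNU sed -i).
    sed -i 's/^[[:space:]]*#[[:space:]]*Protocol 2,1[[:space:]]*$/Protocol 2/' "$conf"

    # Deny SSH from everywhere by default, unless the rule is already present.
    grep -qx 'sshd: ALL' "$deny" || echo 'sshd: ALL' >> "$deny"

    # Then allow only the listed hosts.
    [ "$#" -gt 0 ] && echo "sshd: $*" >> "$allow"
    return 0
}

# Example call (placeholder addresses; SSH_CONF is the file from step 1):
# harden_ssh "$SSH_CONF" /etc/hosts.deny /etc/hosts.allow 192.0.2.10 192.0.2.11
```

TCP wrappers consult hosts.allow before hosts.deny, which is why the permitted hosts still get in even though hosts.deny blocks everyone.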
                Monitoring Views

                The next screenshot expands all the branches in the Cross Platform Servers view folder (left) created when you import the X-Plat management packs for Red Hat Linux. Focus (right) is on a 24-hour performance view of Physical Disk target “sda” in the RHEL server.

                MonitoringView

                Reports

                When you select a Linux server in the Linux Server State view folder, in the Actions pane you will see a dozen targeted Unix Computer Reports available for on-the-fly generation. Here is the 7-day Memory Performance History (Pages per Sec) report for the RHEL computer:

                Report

                Distributed Application Possibilities

                X-Plat Extensions creates OpsMgr objects for monitored components of discovered Linux computers. This expands the universe of objects available to create Distributed Applications (DAs) to include Linux disks, processors, network interfaces and the like.

                • We created a DA that contains two components of classes Windows 2008 Logical Disks and Linux Logical Disks. This DA represents the health of the logical disks of all the web farm members, regardless of their OS.
                • Relationships are defined as Web Server Farm Logical Disks Uses Linux Logical Disk and Web Server Farm Logical Disks Uses Windows 2008 Logical Disk. See the screenshot of the DA below, open in the Distributed Application Designer:
                  DAD
                  True Cross-Platform Performance Monitoring

                  By creating a Performance view that targets the DA we created, we can assess aggregated logical disk performance across Windows and Linux members of a web server farm in a remote data center. Now we have "apples to apples" metrics in the same pane of management glass! See this screenshot of X-Plat in full motion:

                  DAPerfView

                  Remote Task Execution

                  A final systems management value-add we find in the current X-Plat release is a small collection of Unix Computer Tasks, which are available in both the Operations console and Web console. These tasks are:

                  • Run VMStat (a short report on virtual memory statistics, paging, block I/O, traps, system and CPU usage)
                  • Memory Information (paging and swap data)
                  • Top 10 CPU Processes

                  In this screenshot we demonstrate listing the top 10 CPU processes on the Linux server:

                  Task 
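For comparison, roughly the same information can be gathered by hand in an SSH session. This is a sketch using standard Linux tools; these are not necessarily the commands the management pack tasks run, and the function name top_cpu is ours.

```shell
#!/bin/sh
# List the N most CPU-hungry processes (default 10), plus a header line.
# --sort is a procps (Linux) ps option.
top_cpu() {
    ps -eo pcpu,pid,user,comm --sort=-pcpu | head -n "$(( ${1:-10} + 1 ))"
}

# Virtual memory, paging, block I/O, and CPU: three samples, one second apart.
command -v vmstat >/dev/null 2>&1 && vmstat 1 3

# Memory and swap usage in megabytes.
command -v free >/dev/null 2>&1 && free -m

# Top 10 processes by CPU.
top_cpu 10
```

The advantage of the console tasks is that an operator gets this output without needing an SSH client or a shell account on the Linux server.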


                  Contributors: Thanks to Jacob Linscott, Linux Guru at datacenter provider Softlayer, for help on the RHEL versioning; and to Kevin Clark, NOC Manager at managed services provider ClearPointe, for the command list that secured the SSH service.

                  9/24/2008

                  A New Home for Walter Chomak's blog

                  Our friend Walter Chomak (http://wchomak.spaces.live.com/) posted on September 16th, 2008 that his blog would become less active due to some internal projects he was taking on at Microsoft (see http://wchomak.spaces.live.com/blog/cns!F56EFE25599555EC!1657.entry). Well, a week later he's back, but at a new location! See http://blogs.technet.com/wchomak/ for his latest postings.