Friday, November 23, 2012

Using Monitors for Automated Disk Space Recovery

Thought I would do a new post today. It's the day after Thanksgiving, and guess who drew the after-hours phone and monitoring tickets this week? A bunch of disk space alerts came in, and I decided I didn't want to keep responding to them by hand, especially since I will have after-hours duty during Christmas as well. Lucky me.

The concept I'll illustrate here is to have SCOM take automated action to clean up drive space on its own, hopefully averting an impending disaster and sparing you from leaving your peaceful slumber, camping trip or whatever it is you do when you try to have a life outside of daily IT tasks.

As with any Microsoft product, there are about a dozen ways to do this. What follows is probably the simplest, though it would likely be hard to maintain long term in a large server environment. I'll improve on it and post updates when I do.

First, create a folder on the root drive of a target server. I called the folder "C:\Scripts". Then create a batch file in that folder. I called mine "cleanup.bat".

System Center 2012 SCOM Disk Cleanup Script Folder


I have included the contents of the sample script. It is pretty basic: it clears out temporary files in a variety of locations. It could certainly be expanded to remove old IIS or BlackBerry log files, remove SQL backups, etc. Essentially, any action that can be scripted can be run here.


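@echo off
REM Clear the shared temp folders.
del "C:\Temp\*.*" /s /q
del "%Windir%\Temp\*.*" /s /q

REM Newer profile layout: per-user temp and Temporary Internet Files folders.
IF EXIST "C:\Users\" (
    for /D %%x in ("C:\Users\*") do (
        del /f /s /q "%%x\AppData\Local\Temp\*.*"
        del /f /s /q "%%x\AppData\Local\Microsoft\Windows\Temporary Internet Files\*.*"
    )
)

REM Older profile layout: the same folders under Documents and Settings.
IF EXIST "C:\Documents and Settings\" (
    for /D %%x in ("C:\Documents and Settings\*") do (
        del /f /s /q "%%x\Local Settings\Temp\*.*"
        del /f /s /q "%%x\Local Settings\Temporary Internet Files\*.*"
    )
)
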
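
If wiping everything in those folders feels too aggressive, one possible variation is to delete only files older than a set number of days. The sketch below uses the built-in forfiles command. The seven-day cutoff and the IIS log folder (the IIS 7 default location) are my assumptions for illustration and are not part of the sample script above, so adjust both before trying it.

```bat
@echo off
REM Sketch only: delete files older than MAX_AGE_DAYS from each listed folder.
REM MAX_AGE_DAYS and the folder list are assumptions for illustration.
REM forfiles: /s recurses into subfolders, /d -N selects items last modified
REM N or more days ago, and the @isdir check skips directories.
set "MAX_AGE_DAYS=7"

for %%d in ("C:\Temp" "%Windir%\Temp" "C:\inetpub\logs\LogFiles") do (
    if exist "%%~d\" (
        forfiles /p "%%~d" /s /d -%MAX_AGE_DAYS% /c "cmd /c if @isdir==FALSE del /f /q @path" 2>nul
    )
)
```

The 2>nul only hides the message forfiles prints when nothing matches the age filter. Remove it while testing so you can see what is happening.
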
With the script loaded on the target server, open your SCOM management console, navigate to the Authoring tab and select the "Monitors" option.



After the monitors load, type "Logical Disk" in the search field to narrow the list. Once the search finishes, select one of the Logical Disk Free Space monitors, such as the one for Windows Server 2000, Windows Server 2003 or Windows Server 2008. I chose to alter the settings for the Windows Server 2008 Logical Disk Free Space monitor. Right-click the monitor and select "Properties".



Navigate to the "Diagnostic and Recovery" tab. Here, we want to add a recovery task. You can use the same task for both the warning and critical states if you like.

In the "Recovery Task Type", select "Run Command" and change the destination management pack to either the default client overrides or an alternate management pack you might have for such customizations.

On the next screen, give the recovery task a name that makes sense. You can choose to have the system recalculate the monitor state once the task finishes. If free disk space is back within the threshold, the alert will clear itself automatically.

On the last screen, enter the path to the script on the target server. As outlined at the beginning, I put mine in the "C:\Scripts\" folder in a file called cleanup.bat.
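
Before relying on the recovery task, it may be worth running the script by hand on the target server and capturing its output. As far as I know, the task runs under the agent's action account rather than your own login, so a script that works for you interactively can still fail for lack of permissions. A minimal sketch follows; the log file name is an assumption for illustration.

```bat
REM Sketch only: run the cleanup script once and keep its output for review.
REM The log file path is an assumption for illustration.
cmd.exe /c "C:\Scripts\cleanup.bat" > "C:\Scripts\cleanup.log" 2>&1
type "C:\Scripts\cleanup.log"
```
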

You are basically done at this point. When the warning or critical condition occurs (whichever you set the task up to respond to), the script will kick off and delete the files.

The nice thing about running a recovery task in this manner is that if the alert persists, you know there is a larger problem, because the basic step of clearing out miscellaneous temporary files has already been done. At that point, something out of the ordinary is going on.

You can now test this script by running a program called Philip 2.10. This will create a large file in one of the temporary directories cleared by the sample script. You can download the software from the following location:

http://www.softpedia.com/get/System/File-Management/Philip.shtml
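
If you would rather not download a tool, the built-in fsutil command can probably do the same job. This is a sketch under a few assumptions: the file name and the 1 GB size are made up for illustration, you are at an elevated command prompt, and you will need to pick a size large enough to push the disk past your monitor's threshold.

```bat
REM Sketch only: create a placeholder file in a folder the sample script clears.
REM The file name and the 1 GB size (1073741824 bytes) are assumptions.
if not exist "C:\Temp\" mkdir "C:\Temp"
fsutil file createnew "C:\Temp\scom-test-fill.tmp" 1073741824

REM Show the remaining free space on C: to confirm the file took effect.
fsutil volume diskfree C:
```

Once the monitor changes state and the recovery task runs, the file should be gone and free space should return to where it started.
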

4 comments:

  1. This comment has been removed by the author.

  2. Hi Dave,

    Excellent article. I have a quick question. I am monitoring IIS and disk space on a server, and when the server goes over a pre-defined threshold or when an IIS website is down, I want to remediate it.
    I have created a recovery task for IIS as the following in SCOM:

    Authoring-->Monitors-->IIS 7 Availability-->right click-->Properties-->Diagnostic and recovery-->configure recovery task-->
    Critical-->command line
    Path : C:\Windows\System32\inetsrv\appcmd.exe
    Parameters: start site /site.name:"site name"

    This doesn't work. Should I be configuring the recovery task for Web site availability and not on Availability? Or am
    I making a silly mistake somewhere?
    If I run the command
    C:\Windows\System32\inetsrv\appcmd.exe start site /site.name:"site name" from the server that hosts the IIS locally, I can restart the IIS but not from the monitor in SCOM.
    I am at the end of my knowledge about this. Any help will be greatly appreciated.
    Regards,

  3. The quick and dirty solution for a specific web server would be to shut down each site, look for the resulting alert and then customize a response that turns the website back on for that specific alert. For something more generic and broad that could apply to any web server and its sites, check out some of the PowerShell cmdlets - http://technet.microsoft.com/en-us/library/ee790599.aspx

    You could assign the Get-WebsiteState value to a variable and, if the variable equals "stopped", run Start-Website. Are you limited to using appcmd.exe or is PowerShell an option?

  4. Hi Dave..
    Very nice article. I need help configuring a recovery task in SCOM for the Web Application Availability Monitor.

    Scenario: We have multiple application endpoints hosted on different App Pools in IIS. We need to set up a recovery task in SCOM so that if there is an alert from the Web Application Availability Monitor for a particular endpoint, it restarts the App Pool that hosts that endpoint and not the other App Pools.

    Environment: SCOM MS in Azure, Agents in In-house DC

    Just wanted to check whether this kind of customization is possible in SCOM. If yes, please guide me on how to set this up.
