Nagios: Starting NSClient++ service fails with the error ‘NSClient++ (x64) is not a valid Win32 application.’

I was recently investigating an issue where the Nagios monitoring agent (NSClient++) service failed to start with the following error message:

The NSClient++ (x64) service failed to start due to the following error:
NSClient++ (x64) is not a valid Win32 application.

In the first instance I attempted to resolve the issue by uninstalling and installing the monitoring agent on the impacted host. To which the same behaviour was experienced. On investigation, I found the following knowledge base article which describes the above symptom, where the cause of the issue is described as below:

  • The path of a service‚Äôs executable file contains spaces.
  • There is a file or folder on your computer‚Äôs hard disk that has the same name as a file or folder in the path to the service’s executable file

The first cause  above describes the condition that is causing the issue, so in order to attempt to resolve the issue I to wrapped the image path filename in quotations, as follows:

1) Browse to the registry key ‘HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\NSClientpp’.

2) Modify the REG_EXPAND_SZ value data for ImagePath to be¬†“C:\Program Files\NSClient++\nsclient++.exe”.

Following the modification I was successfully able to start the NSClient++ service and continue monitoring the host.

Monitor the status of SQL Server Agent jobs with Nagios XI

I am looking into monitoring the status of  SQL Server Agent jobs and reporting them with Nagios with the following requirements

  • For each step report the ‘run_status’.
  • Provide date arithmetic to determine a time span for when the step was last run in days.
  • Return a status of Critical if the run_status is ¬†reported as ‘0’ and Warning if the run_status is reported as ‘1’ and the ‘run_date’ is greater than the time span.

The powershell script will require two parameters for both the SQL Agent step name and the number of days to compare the ‘run_date’ agaisnt a calculated time span object.

Param ([string] $StepName,[string] $Days)

As I will be querying the SQL Server database engine instance and in this case a majority of those are SQL Server 2008 R2 I will need to import the SQLPS module into my session.

Import-Module SQLPS -DisableNameChecking

Below is my the query I will be invoking to return the information for the last run where the step_name is specified as a parameter.

$Query = Invoke-SQLCmd - ServerInstance "." -Query ("SELECT TOP 1 * from msdb..sysjobhistory h WHERE step_name = " + $StepName + " order by instance_id desc")

Once the information has been returned I will need to use¬†DateTime.ParseExact method in order to convert ¬†the ‘run_date’ value to a date string of ‘dd/MM/yyyy’ and store this as a variable.

$LastRun = [datetime]::parseexact(($Query.run_date),"yyyyMMdd",$null).toString('dd/MM/yyyy')

The run_duration value returns the amount of time the step was running in seconds, I want to return this as a string ‘dd:hh:mm:ss’ by using the New-TimeSpan cmdlet and formatting the output.

$Duration = (New-TimeSpan -Seconds $Query.run_duration).ToString()

In order to provide the date arithmetic I will subtract the  number of days specified in the parameter from the current date.

$Timespan = (Get-Date) - (New-TimeSpan -Days $Days -Hours 0 -Minutes 0)

In order to configure the service status and the status information conditional logic will be used to determine the return code. As per my requirements, I require the to report the service status to be ‘OK’ if the run_status is equal to ‘1’ and the ‘run_date’ is greater than the TimeSpan variable generated from the date arithmetic.

If ($Query.run_status -eq "1" -and [datetime]::parse($LastRun ) -ge [datetime]::parse($TimeSpan) )
{ 
$returncode = 0
"Completed succesfully on " + $LastRun + " with run duration " + $Duration
}

We will report a service status of ‘Warning if the ‘run_status’ is equal to ‘1’ and the ‘run_date’ is less then the TimeSpan variable generated from the date arithmetic.

ElseIf ($Query.run_status -eq "1" -and $LastRun -lt $Timespan )
{ 
$returncode = 1
"Completed succesfully on " + $LastRun + " with run duration " + $Duration
}

Finally, if the ‘run_status’ is equal to ‘0’ the service status will be reported as ‘Critical.

ElseIf ($Query.run_status -eq "0")
{ 
$returncode = 2
"Completed with failure on " + $LastRun  + " with run duration " + $Duration
}

Once the status information has been generated the script will exit the powershell session returning the exit code.

exit $returncode

While the script was created to be executed as an external script within Nagios, this can be run standalone from Windows Powershell as below.

./Check-SQLServerAgentJob,ps11 -StepName <step_name> -Days <number of days>

If your are looking to add external scripts to Nagios such as this one see the below link for more information;

https://deangrant.wordpress.com/2013/09/12/creating-and-running-external-scripts-within-nagios-xi/

The full Windows Powershell script can be downloaded from the below link:

https://app.box.com/s/ph6h3qmzaytw5oiel1eb

NSClient reports “Request string contained illegal metachars!”

I as recently compiling an external script within Nagios XI to run using the NSClient where upon execution I was receiving the following message in the nsclient.log:

2013-11-21 17:02:10: error:modules\CheckExternalScripts\CheckExternalScripts.cpp:202: Request string contained illegal metachars!

This is due to the NRPE daemon by default not accepting the use of the the characters `&><‘”\[]{} in arguments.

In order to resolve the issue I had to enable the ‘allow_nasty_meta_chars’ setting for External Scripts and restart the NSClient++ service:

[External Script]
allow_nasty_meta_chars=1