Removing mounted disks and consolidating open snapshots after virtual machine backup

At times, one or more disks from a virtual machine will remain mounted and its snapshots will remain open following a backup. This is generally caused by there not being enough time to unmount the disks and consolidate the snapshots when the job completes.

Resolving this requires user interaction to first unmount the virtual machine disks and then consolidate the virtual disks. As we can agree, this is a manual task that is a perfect candidate for automation.

The below solution has a couple of caveats: the backup solution/appliance is a virtual machine, and that virtual machine contains no hard disks of its own with the disk type ‘Independent – Nonpersistent’ (so any such disk found attached is a leftover backup disk).

The script in its entirety can be downloaded from here. However, the below describes the logic used to remove the virtual machine hard disks and consolidate the virtual machine disks.

Once we have established a connection to the vCenter Server and specified the virtual machine running the backup appliance/solution, we will retrieve a collection of virtual machines where the ‘Runtime.ConsolidationNeeded’ value is equal to ‘True’.

$VMs = Get-View -ViewType VirtualMachine -Property Name, RunTime | Where-Object {$_.RunTime.ConsolidationNeeded -eq $True}

Then we will query the virtual machine hosting the backup appliance/solution and retrieve all of its hard disks where the persistence value is equal to ‘IndependentNonPersistent’.

$HardDisks = Get-VM $ComputerName | Get-HardDisk | Where-Object {$_.Persistence -eq 'IndependentNonPersistent'} 

We will then invoke the Remove-HardDisk cmdlet to remove the filtered virtual machine hard disks. In this instance we are only removing the hard disks and not deleting them permanently.

$HardDisks | Remove-HardDisk -Confirm:$False

Once the virtual machine hard disks have been removed, we will now consolidate the virtual machine disks on each of the virtual machines returned in the collection.

ForEach ($VM in $VMs)
{
    (Get-VM $VM.Name).ExtensionData.ConsolidateVMDisks() | Out-Null
}

The full script provides console output to report the status of the invocation and can be triggered as a scheduled task, returning an exit code dependent on the status of the invocation.
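Putting the snippets together, a minimal end-to-end sketch is below. This is my own assembly rather than the downloadable script (it omits the console reporting and exit-code handling); it assumes VMware PowerCLI is loaded, and the parameter names are my own:

Param (
    [parameter(Mandatory = $true)][string] $VIServer,
    [parameter(Mandatory = $true)][string] $ComputerName
)

# Connect to the vCenter Server instance.
Connect-VIServer -Server $VIServer | Out-Null

# Retrieve all virtual machines flagged as requiring disk consolidation.
$VMs = Get-View -ViewType VirtualMachine -Property Name, Runtime | Where-Object {$_.Runtime.ConsolidationNeeded -eq $True}

# Remove any leftover 'IndependentNonPersistent' hard disks still mounted
# to the backup appliance (removed, not deleted permanently).
$HardDisks = Get-VM $ComputerName | Get-HardDisk | Where-Object {$_.Persistence -eq 'IndependentNonPersistent'}
If ($HardDisks)
{
    $HardDisks | Remove-HardDisk -Confirm:$False
}

# Consolidate the virtual disks on each flagged virtual machine.
ForEach ($VM in $VMs)
{
    (Get-VM $VM.Name).ExtensionData.ConsolidateVMDisks() | Out-Null
}

Disconnect-VIServer -Server $VIServer -Confirm:$False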

Configure vRanger Date Format to use system locale

The date format within the vRanger Backup & Replication product does not use the system locale and by default is configured to use ‘en-US’.

In order for the date format to use the system locale, you will need to remove the default ‘ForceCulture’ value in the ‘C:\Program Files\Dell\vRanger\Vizioncore.vRanger.Client.Shell.exe.config’ file.

The default value is as below:

 <add key="ForceCulture" value="en-US"/></appSettings>

By removing the value and restarting the Dell vRanger service, the date format will now use the system locale.

 <add key="ForceCulture" value=""/></appSettings>
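If you prefer to script the change, a minimal sketch is below. It assumes the default installation path quoted above and the standard .NET configuration/appSettings layout; the service display name passed to Restart-Service is also an assumption to verify against your installation:

# Blank the ForceCulture value so the client shell follows the system locale.
$Path = 'C:\Program Files\Dell\vRanger\Vizioncore.vRanger.Client.Shell.exe.config'
[xml]$Config = Get-Content $Path
$Setting = $Config.configuration.appSettings.add | Where-Object {$_.key -eq 'ForceCulture'}
$Setting.value = ''
$Config.Save($Path)

# Restart the vRanger service to apply the change (display name assumed).
Restart-Service -DisplayName 'vRanger Service'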

Using a least privilege user account for vRanger

I was recently configuring a vRanger deployment for which I wanted the service account running the various services to run under the context of a least privilege user.

The service account will require the ‘Log on as a service’ right, db_owner permissions on the vRanger database on the SQL Server instance, and write access to the repositories you have configured.
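As an illustration, the database permission could be granted with something like the following, run from a host with the SQL Server PowerShell module available. The instance name, database name, and service account below are placeholders:

# Grant db_owner on the vRanger database to the service account.
# Assumes the login already exists at the SQL Server instance level;
# 'SQLSERVER\INSTANCE', [vRanger] and DOMAIN\svc-vranger are placeholders.
Import-Module SQLPS -DisableNameChecking
Invoke-Sqlcmd -ServerInstance 'SQLSERVER\INSTANCE' -Query @"
USE [vRanger];
CREATE USER [DOMAIN\svc-vranger] FOR LOGIN [DOMAIN\svc-vranger];
EXEC sp_addrolemember N'db_owner', N'DOMAIN\svc-vranger';
"@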

For reference, the required privileges for the vCenter installation can be found on the Dell support site (http://documents.software.dell.com/DOC21485).

vRanger Pro Powershell: Could not connect to http://localhost:2480/VAPIHost.svc

I recently wrote a number of PowerShell scripts (https://deangrant.wordpress.com/2013/10/01/report-job-status-from-vranger-to-nagios/) with the vRanger Pro PowerShell snap-in which queried the job status using the Get-JobTemplate cmdlet.

I have since noticed that, on a number of occasions, retrieving the status fails with the following error message:

Get-JobTemplate : Could not connect to http://localhost:2480/VAPIHost.svc. TCP

In order to resolve this issue and have the status information returned, I made the following change to the configuration file ‘C:\Program Files (x86)\Quest Software\vRanger\PowerShell\vRanger.API.PowerShell.dll.config’, increasing the openTimeout value from the original configuration of 00:01:00 to 00:03:00. After restarting the vRanger API service, the cmdlet used to retrieve the job status now seems to be more stable.

openTimeout="00:03:00" receiveTimeout="00:10:00" sendTimeout="00:01:00"

vRanger backup jobs fail with host could not be identified

I recently was investigating an issue where backup jobs in vRanger failed for all VMs with the following error message:

An internal error occurred during execution, please contact Quest support if the error persists. Error Message: <VM>'s host could not be identified

This appears to be an inventory issue between vRanger and the vCenter server, caused by the VMware VirtualCenter Server service not responding and requiring a restart prior to the jobs being invoked.

The above issue may be resolved by performing the following from the vRanger server (a scripted sketch of the sequence follows the list):

1) Stop the vRanger API and FLR services.

2) Restart the vRanger service.

3) Start the vRanger API service.

4) Start the vRanger FLR service.
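For convenience, the sequence can be scripted as below. Note the service display names here are assumptions and may differ between vRanger versions, so verify them against the Services console first:

# Stop the dependent services, restart the core service, then start them again.
# Display names are assumptions; adjust to match your installation.
Stop-Service -DisplayName 'vRanger API Service'
Stop-Service -DisplayName 'vRanger FLR Service'
Restart-Service -DisplayName 'vRanger Service'
Start-Service -DisplayName 'vRanger API Service'
Start-Service -DisplayName 'vRanger FLR Service'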

After performing the above I was able to successfully run a backup job which had previously failed.

Report Job Status from vRanger to Nagios

I have recently been creating a number of external scripts within Nagios to bring together a number of services to be monitored. I am now looking at reporting the job status of vRanger backup jobs to Nagios based on the following criteria:

  • Report the last completed run of a backup job.
  • Report the job status and information to Nagios, where successful jobs are reported with the OK status and failed jobs with Critical status.

I will be able to do this by calling a number of cmdlets from the vRanger API PowerShell snap-in, which is installed with vRanger by default.

The script is also required to return the status of multiple backup jobs, and I do not want to create multiple scripts as well as multiple services within Nagios. Therefore, the script defines a mandatory parameter for the backup job name, which is passed with the JobName argument.

Param ([parameter(Mandatory = $true)][string] $JobName)

As mentioned previously, we will be using the vRanger API PowerShell snap-in to query the backup job status, so we will need to import the snap-in into the current session:

if (-not (Get-PSSnapin vRanger.API.PowerShell -ErrorAction SilentlyContinue))
{
    Add-PSSnapin vRanger.API.PowerShell > $null
}

Now we need to filter the job templates for the backup job name specified by the parameter.

$Template = Get-JobTemplate | Where-Object {$_.JobName -eq $JobName}

Now that we have the job template, we will return the status from the Get-Job cmdlet by matching each job's ParentJobTemplateId against the template's TemplateVersionId, filtering for jobs where the state is Completed, and selecting the most recent job history entry.

$Job = Get-Job | Where-Object {$_.ParentJobTemplateId -eq $Template.TemplateVersionID -and $_.JobState -eq "Completed"} | Select -Last 1

Now that we have the job status, we need to generate return codes for the service status. The criteria, as mentioned above, are for jobs with the status Success to be returned with the service status OK (0) and for jobs not reported as successful to be returned as Critical (2).

If ($Job.JobStatus -eq "Success")
{
    $returncode = 0
}
Else
{
    $returncode = 2
}

Finally I want to output the service status information for the backup job being monitored and exit the session returning the error code:

"" + $Job.JobState + " with " + $Job.JobStatus + " on " + $Job.CompletedOn
exit $returncode
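Assembled from the snippets above, Get-vRangerBackupStatus.ps1 in full:

Param ([parameter(Mandatory = $true)][string] $JobName)

# Load the vRanger API PowerShell snap-in if not already present.
if (-not (Get-PSSnapin vRanger.API.PowerShell -ErrorAction SilentlyContinue))
{
    Add-PSSnapin vRanger.API.PowerShell > $null
}

# Retrieve the job template matching the supplied job name.
$Template = Get-JobTemplate | Where-Object {$_.JobName -eq $JobName}

# Return the most recent completed job for that template.
$Job = Get-Job | Where-Object {$_.ParentJobTemplateId -eq $Template.TemplateVersionId -and $_.JobState -eq "Completed"} | Select -Last 1

# Map the job status to a Nagios return code: OK (0) for Success, otherwise Critical (2).
If ($Job.JobStatus -eq "Success")
{
    $returncode = 0
}
Else
{
    $returncode = 2
}

# Output the service status information and exit with the return code.
"" + $Job.JobState + " with " + $Job.JobStatus + " on " + $Job.CompletedOn
exit $returncode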

One issue I found was that the vRanger API PowerShell snap-in is only available in a 32-bit version, which requires the snap-in to be imported into the 32-bit version of Windows PowerShell. This in turn requires the external script's check command to call the 32-bit executable:

check_vrangerbackupstatus= cmd /c echo scripts\Get-vRangerBackupStatus.ps1 -JobName "$ARG1$"; exit($lastexitcode) | %SystemRoot%\syswow64\WindowsPowerShell\v1.0\powershell.exe -command -

While the script was created to be executed as an external script within Nagios, it can also be run standalone from Windows PowerShell as below.

%SystemRoot%\syswow64\WindowsPowerShell\v1.0\powershell.exe -command ./Get-vRangerBackupStatus.ps1 -JobName <Job Name>

If you are looking to add external scripts to Nagios such as this one, see the below link for more information:

https://deangrant.wordpress.com/2013/09/12/creating-and-running-external-scripts-within-nagios-xi/

The full Windows PowerShell script can be downloaded from the below link:

https://app.box.com/s/3sgeu21nxxvv1oi02zte

vRanger: Backup jobs fail with ‘Message: 2712 – can’t create the directory’

I was recently looking at an issue where a number of virtual machine backups were failing with the following error message:

An internal error occurred during execution, please contact Quest support if the error persists. Error Message: 2712 - can't create the directory cifs:<username>@<ip address of repository>/<repository path>

On investigation, I found the following knowledge base article which discussed modifying the ‘CifsTimeout’ value in the ‘C:\Program Files (x86)\Quest Software\vRanger\Service\Vizioncore.vRanger.Service.exe.config’ file (https://support.quest.com/SolutionDetail.aspx?id=SOL99156).

Following the recommended change being applied and a restart of the ‘vRanger Service’, the backup job failed with the same symptom on the next run.

Further investigation highlighted an issue with the transport selection and the version of the virtual appliance being used. In order to resolve this, I implemented a workaround to modify the transport selection to be the vRanger machine rather than the virtual appliance.

The current version of vRanger being used is 5.5.0.25454 and the virtual appliance is 1.7.0; the issue appears to be resolved in virtual appliance versions above 1.8.0. The next plan is therefore to upgrade the current version of vRanger, deploy upgraded virtual appliances, and then revert the transport selection to the virtual appliance.