Logging – Part Six: Installing and Configuring vSphere ESXi Dump Collector

ESXi host systems can be configured to dump the VMkernel memory to a network server rather than to a local disk. This is useful in situations where the ESXi host system does not have a local disk or is an Auto Deployed ESXi host system. During a critical failure on an ESXi host system, the panic routine attempts to write a core dump using either or both of the DiskDump (local disk) and NetDump (Dump Collector) mechanisms. If NetDump has been configured, the ESXi host system opens a connection from the VMkernel network to the remote IP address of the vSphere ESXi Dump Collector on UDP service port 6500 and transmits a compressed core dump.

To install the vSphere ESXi Dump Collector, launch the vSphere installation media, select vSphere ESXi Dump Collector from the VMware vCenter Support Tools section and follow the steps below:

1) Specify the vSphere ESXi Dump Collector installation and repository location and the repository maximum size and select Next.

Dump-1

2) Specify the installation type for the vSphere ESXi Dump Collector and select Next. In this example we will be specifying a VMware vCenter Server installation.

Dump-2

3) Specify the vCenter Server address and login credential information and select Next.

Dump-3

4) Specify the vSphere ESXi Dump Collector port settings and select Next.

Dump-4

5) Specify the vSphere ESXi Dump Collector identification name on the network and select Next.

Dump-5

6) Select Install.

Now that we have installed and configured the vSphere ESXi Dump Collector, we need to configure each ESXi host system, using the esxcli system coredump namespace, to send its core dumps to the Network Dump Collector.

1) Connect to the ESXi host system using a SSH client.

2) Verify connectivity on UDP service port 6500 to the Network Dump Collector.

nc -uz deanvc1.dean.local 6500
Connection to deanvc1.dean.local 6500 port [udp/*] succeeded!

3) Configure the Network Dump Collector using the esxcli system coredump namespace.

esxcli system coredump network set --interface-name vmk0 --server-ipv4 192.168.112.89 --server-port 6500

4) Finally, we need to enable the Network Dump Collector.

esxcli system coredump network set --enable true

Once configured, the vSphere ESXi Dump Collector will store dump files in the directory ‘C:\ProgramData\VMware\VMware ESXi Dump Collector\Data’.
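To confirm that the host has picked up the settings, the same esxcli system coredump network namespace can be queried. The following is a minimal sketch rather than a definitive procedure: it assumes a host release that includes the check sub-command (to my knowledge ESXi 5.1 and later), and the verify_netdump function name and the esxcli guard are illustrative only.

```shell
#!/bin/sh
# Sketch: confirm the network core dump settings on an ESXi host.
# verify_netdump is a hypothetical helper name, not part of esxcli.
verify_netdump() {
    # Show the configured VMkernel interface, collector IP address and port.
    esxcli system coredump network get
    # Ask the host to test that the configured collector is reachable.
    esxcli system coredump network check
}

# Guard so the snippet is harmless when pasted outside the ESXi shell.
if command -v esxcli >/dev/null 2>&1; then
    verify_netdump
else
    echo "esxcli not found: run this from the ESXi shell" >&2
fi
```

If the get output does not show the interface, IP address and port set in step 3, repeat the set commands before relying on the collector.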

Logging – Part Five: Analyse and Test Logging Configuration Information

In order to analyse log files and resolve issues we need to view the log files, such as vmkernel.log and vmksummary.log, using a program such as tail to display the end of the log file. In this example we will examine the vmkernel.log file, which displays messages related to storage and contains SCSI sense codes. The SCSI sense codes are an industry standard maintained by Technical Committee T10, to which the ESXi host system conforms. Further information on SCSI sense codes can be found at http://www.t10.org/lists/1spc-lst.htm.

The SCSI status code is sent during the status phase, which occurs prior to the Command Complete message, and indicates success or failure. Whenever a SCSI command is sent to a target, the initiator expects a completion status. The various status codes and sense keys are displayed below:

Status Code      Description
00h              Good
02h              Check Condition
04h              Condition Met
08h              Busy
18h              Reservation Conflict
28h              Task Set Full
30h              ACA Active
40h              Task Aborted

SCSI Sense Key   Description
0h               No Sense
1h               Recovered Error
2h               Not Ready
3h               Medium Error
4h               Hardware Error
5h               Illegal Request
6h               Unit Attention
7h               Data Protect
8h               Blank Check
9h               Vendor Specific
Ah               Copy Aborted
Bh               Aborted Command
Dh               Volume Overflow
Eh               Miscompare
Fh               Completed

So, if we take a log entry from the vmkernel.log file, we can analyse the error message being reported and isolate the SCSI event.

2015-02-14T08:29:47.778Z cpu1:32784)ScsiDeviceIO: 2337: Cmd(0x412e80896940) 0x1a, CmdSN 0x3884 from world 0 to dev "mpx.vmhba32:C0:T0:L0" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.
H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0

From the Status Code received we can determine that the device is reporting a Check Condition.

  • Host Status – H:0x0 = Good
  • Device Status – D:0x2 = Check Condition
  • Plugin Status –  P:0x0 = Good

From the SCSI sense key (0x5) we can determine that the error message is being reported due to an Illegal Request. Finally, we need to determine the cause of the error by analysing the additional sense code (0x20) and the additional sense code qualifier (0x0). The list of additional SCSI sense data is far too detailed to document in this blog, so the following can be used as a reference: http://www.t10.org/lists/asc-alph.txt.

From the additional sense code and qualifier, we can now determine that the error message reported by the device is ‘INVALID COMMAND OPERATION CODE’, which may be due to the device not supporting VPD pages.
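The manual decoding above can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function names device_status, sense_key and decode_scsi_line are hypothetical, the lookup tables simply mirror the two tables in this post, and the host and plugin status fields are not decoded because their values are not listed here.

```shell
#!/bin/sh
# Sketch: decode the device status and sense key from a vmkernel.log line.
# All function names are illustrative; the tables mirror the ones in this post.

device_status() {
    case "$1" in
        0x0)  echo "Good" ;;
        0x2)  echo "Check Condition" ;;
        0x4)  echo "Condition Met" ;;
        0x8)  echo "Busy" ;;
        0x18) echo "Reservation Conflict" ;;
        0x28) echo "Task Set Full" ;;
        0x30) echo "ACA Active" ;;
        0x40) echo "Task Aborted" ;;
        *)    echo "Unknown" ;;
    esac
}

sense_key() {
    case "$1" in
        0x0) echo "No Sense" ;;
        0x1) echo "Recovered Error" ;;
        0x2) echo "Not Ready" ;;
        0x3) echo "Medium Error" ;;
        0x4) echo "Hardware Error" ;;
        0x5) echo "Illegal Request" ;;
        0x6) echo "Unit Attention" ;;
        0x7) echo "Data Protect" ;;
        0x8) echo "Blank Check" ;;
        0x9) echo "Vendor Specific" ;;
        0xa|0xA) echo "Copy Aborted" ;;
        0xb|0xB) echo "Aborted Command" ;;
        0xd|0xD) echo "Volume Overflow" ;;
        0xe|0xE) echo "Miscompare" ;;
        0xf|0xF) echo "Completed" ;;
        *)   echo "Unknown" ;;
    esac
}

decode_scsi_line() {
    # Extract the D: value and the three values after "Valid sense data:".
    fields=$(printf '%s\n' "$1" | sed -n 's/.* D:\(0x[0-9a-fA-F]*\) .*Valid sense data: \(0x[0-9a-fA-F]*\) \(0x[0-9a-fA-F]*\) \(0x[0-9a-fA-F]*\).*/\1 \2 \3 \4/p')
    if [ -z "$fields" ]; then
        echo "no SCSI sense data found" >&2
        return 1
    fi
    # Split the four extracted values into $1..$4.
    set -- $fields
    echo "Device status: $1 ($(device_status "$1"))"
    echo "Sense key: $2 ($(sense_key "$2"))"
    # The ASC/ASCQ pair still has to be looked up in the T10 list.
    echo "ASC/ASCQ: $3/$4"
}

# Usage with the sample entry from this post.
decode_scsi_line 'ScsiDeviceIO: 2337: Cmd(0x412e80896940) 0x1a, CmdSN 0x3884 from world 0 to dev "mpx.vmhba32:C0:T0:L0" failed H:0x0 D:0x2 P:0x0 Valid sense data: 0x5 0x20 0x0.'
```

For the sample entry this reports a device status of 0x2 (Check Condition), a sense key of 0x5 (Illegal Request) and an ASC/ASCQ pair of 0x20/0x0, matching the manual analysis.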

Logging – Part Four: Centralized Logging on ESXi Hosts

In vSphere 5, logging is standardised on Syslog, which handles log messages from the VMkernel, daemons, the logger program and other programs and processes. For remote logging, log messages may be sent to a centralized logging system and system panics can be sent to a remote dump collector. This allows for troubleshooting from the remote Syslog server when the local log files are stored in memory and are therefore not available after the ESXi host system is rebooted or crashes.

The VMkernel, daemons and programs send log messages to the /dev/klog socket, which vmsyslogd queries for incoming messages. On receipt, the daemon needs to know what to do with the information. This is determined by the configuration of the ESXi host system, which can be modified using one of the following methods.

Configuring Syslog for an ESXi host using vSphere Web Client

1) Select the ESXi host system and browse to Manage > Settings > Advanced Settings

2) Modify one or more of the following syslog configuration options as below:

Name                          Value
Syslog.global.logDir          Location to store local or remote syslog files
Syslog.global.logHost         Destination host for remote log files
Syslog.global.logDirUnique    A Boolean value to determine if a child directory for the ESXi host system is created
Syslog.global.defaultRotate   Number of log files retained on the local ESXi host system
Syslog.global.defaultSize     Size at which the log file is rotated

Configuring Syslog for an ESXi host using the esxcli command line

1) Connect to an ESXi host system using a SSH client.

2) In order to modify the Syslog configuration using the esxcli system syslog namespace we can use one of the following set options:

--default-rotate=<long>    Default number of rotated local logs to keep
--default-size=<long>      Default size of local logs before rotation, in KiB
--default-timeout=<long>   Default network retry timeout in seconds if a remote server fails to respond
--logdir=<str>             The directory to output local logs to
--logdir-unique            Place logs in a unique subdirectory of logdir, based on hostname
--loghost=<str>            The remote host(s) to send logs to
--reset=<str>              Reset values to default

For example, if we are required to configure the size of the log file before rotation to be 2048 KiB, we can invoke the following:

esxcli system syslog config set --default-size=2048

3) In order to apply the configuration change, vmsyslogd must reload its configuration, as below:

esxcli system syslog reload
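The set, reload and verify steps can be combined into one helper. This is a minimal sketch, assuming it is run from the ESXi shell; the apply_syslog_rotation name, the argument check and the example values (2048 KiB, 8 rotations) are illustrative and not part of esxcli.

```shell
#!/bin/sh
# Sketch: set the local log size and rotation count, reload, then show the result.
# apply_syslog_rotation is a hypothetical helper name.
apply_syslog_rotation() {
    # Both arguments must be non-empty and contain digits only.
    for value in "$1" "$2"; do
        case "$value" in
            ''|*[!0-9]*)
                echo "usage: apply_syslog_rotation SIZE_KIB ROTATE_COUNT" >&2
                return 2
                ;;
        esac
    done
    esxcli system syslog config set --default-size="$1" --default-rotate="$2"
    # Make vmsyslogd pick up the new values.
    esxcli system syslog reload
    # Print the active configuration so the change can be confirmed.
    esxcli system syslog config get
}

# Guard so the snippet is harmless when pasted outside the ESXi shell.
if command -v esxcli >/dev/null 2>&1; then
    apply_syslog_rotation 2048 8
else
    echo "esxcli not found: run this from the ESXi shell" >&2
fi
```

The config get output lists the current values, which makes it easy to spot a setting that did not apply.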

Installing and Configuring VMware Syslog Collector 

To configure a location for long term storage of log files we can install and configure a VMware Syslog Collector. On a Windows vCenter Server this can be installed from the VMware vSphere installer media as follows:

1) Specify the install directory, repository location, size of the log file before rotation and the number of log rotations to keep and select Next.

Syslog-1

2) Select the installation type for the Syslog Collector and select Next. In this example we will be selecting a VMware vCenter Server installation.

Syslog-2

3) Specify the IP address, HTTP port and login credentials for the vCenter Server system and select Next.

Syslog-3

4) Specify the Syslog Collector port settings and select Next.

Syslog-4

5) Specify how the Syslog Collector should be identified on the network and select Next.

Syslog-5

6) Select Install.

Now that we have installed and configured the Syslog Collector, we can configure the ESXi host system to use a remote host. Firstly, we should confirm that the ESXi host system can communicate with the Syslog Collector, which requires the ESXi firewall to permit outbound connections on TCP and UDP service port 514.

We can confirm the connectivity on the ESXi host system by invoking the following command to determine if the host system can communicate with the remote host on TCP service port 514.

nc -z deanvc1.dean.local 514

Once we have verified a successful connection we can configure the syslog settings as follows:

esxcli system syslog config set --loghost=tcp://deanvc1.dean.local:514
esxcli system syslog reload
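If nothing arrives at the collector, the outbound firewall ruleset is worth checking, and a marker message gives us something to search for on the remote side. The following is a hedged sketch, assuming the built-in ruleset is named syslog on the host; the enable_remote_syslog_test name and the marker text are illustrative.

```shell
#!/bin/sh
# Sketch: allow outbound syslog traffic and send a test marker to the collector.
# enable_remote_syslog_test is a hypothetical helper name.
enable_remote_syslog_test() {
    # Enable the "syslog" firewall ruleset and reload the firewall rules.
    esxcli network firewall ruleset set --ruleset-id=syslog --enabled=true
    esxcli network firewall refresh
    # Write a marker line to the logs so it can be searched for on the collector.
    esxcli system syslog mark --message="$1"
}

# Guard so the snippet is harmless when pasted outside the ESXi shell.
if command -v esxcli >/dev/null 2>&1; then
    enable_remote_syslog_test "syslog collector test from $(hostname)"
else
    echo "esxcli not found: run this from the ESXi shell" >&2
fi
```

Searching the Syslog Collector repository for the marker text then shows whether messages from this host are being received.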

Logging – Part Three: Log Files

On an ESXi host system there are a number of log files that can be used for troubleshooting, which contain configuration and operational errors as well as performance information. Some of the most important ESXi log files are listed below. The log files can be viewed from the ESXi shell using a utility such as tail, which allows changes to be viewed as they are written to the log file. Alternatively, you can connect directly to the ESXi host system using the vSphere Client to view the log files.

Log File                   Description
/var/log/auth.log          ESXi shell authentication success and failure
/var/log/dhclient.log      DHCP client logs, includes discovery, address lease, requests and renewals
/var/log/esxupdate.log     ESXi patches and update installation logs
/var/log/hostd.log         Host management service logs, includes virtual machine and host tasks and events, communication with the vSphere Client and vCenter Server vpxa agent, and SDK connections
/var/log/shell.log         Contains a history of all commands that have been run on the host
/var/log/sysboot.log       VMkernel startup and module loading
/var/log/boot.gz           Compressed file containing boot log information
/var/log/syslog.log        Management service initialization, watchdogs, scheduled tasks and DCUI use
/var/log/usb.log           USB device events, such as discovery and pass-through to virtual machines
/var/log/vob.log           VMkernel observation events
/var/log/vmkernel.log      Core VMkernel logs, includes device discovery, storage and networking device and driver events and virtual machine startup
/var/log/vmkwarning.log    VMkernel warning and alert log messages
/var/log/vmksummary.log    Summary of ESXi host startup and shutdown and hourly heartbeat with uptime, number of virtual machines running and service resource consumption
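Viewing the end of one of these files from the ESXi shell can be sketched as below. This is an illustration only: the recent_log helper and the LOG_DIR override are hypothetical names introduced here, and the default of 20 lines is arbitrary.

```shell
#!/bin/sh
# Sketch: print the last lines of one of the ESXi log files listed above.
# recent_log and LOG_DIR are hypothetical; LOG_DIR defaults to /var/log.
recent_log() {
    log_file="${LOG_DIR:-/var/log}/$1"
    if [ ! -r "$log_file" ]; then
        echo "cannot read $log_file" >&2
        return 1
    fi
    # Second argument is the number of lines to show (default 20).
    tail -n "${2:-20}" "$log_file"
}

# Example: the last 20 lines of vmkernel.log when run on an ESXi host.
# To watch new entries as they are written, use: tail -f /var/log/vmkernel.log
if [ -r /var/log/vmkernel.log ]; then
    recent_log vmkernel.log 20
fi
```

The same helper works for any file name in the table, for example recent_log hostd.log 50.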

The vCenter Server system log files can be located by browsing to ‘C:\ProgramData\VMware\VMware VirtualCenter\Logs’, where the following log files contain data useful for troubleshooting.

Log File                Description
vpxd.log                Communication data between vCenter Server and the vCenter Server Agent (vpxa) on the ESXi host system
vpxd-profiler           Operational and performance counters used to profile vCenter Server operation
\drmdump\clusternnn\    DRS actions grouped by DRS cluster. The log files are gzipped.

Logging – Part Two: Configuring vCenter Server Logging Level

Depending on the level of information you require the vCenter Server System to collect you may decide to adjust the logging level option. The logging levels for vCenter Server are as follows:

  • None (Disable Logging) – Turns off logging
  • Error (Errors Only) – Displays only error log entries
  • Warning (Errors and Warnings) – Displays warning and error log entries
  • Information (Default Logging) – Displays information, error and warning log entries. This is the default and recommended setting.
  • Verbose (Verbose) – Displays information, error, warning and verbose log entries.
  • Trivia (Extended Verbose) – Displays information, error, warning, verbose and trivia log entries. This will generate the most detail and may be most beneficial in troubleshooting.

In order to modify the vCenter Server logging level using the vSphere Client:

1) Browse to Manage > Settings > General and select Edit.

2) Select Logging.

3) Select the logging level required from the drop down menu and select OK.

LoggingLevel