Channel: VMware Communities : Popular Discussions - vSphere Hypervisor

Loss in guest heartbeat

I have ESXi 4.1.0 GA (build-260247), kernel 4.1.0 (x86_64), installed on a Dell PowerEdge 2950 server (two Xeon 5160 CPUs @ 3.00GHz with 24GB of memory). The datastore is on local storage (PERC 5/i Integrated RAID Controller). Dell Server Administrator 6.3.0 System Management Software is installed on ESXi.
The Dell server was thoroughly tested before installation, so a hardware malfunction is very unlikely.
Ten virtual machines are running on this server. The guest operating systems on these machines are Debian Lenny and CentOS 5.5 with the latest updates, and each has the latest VMware Tools installed in the matching version.
Everything seemed fine until we configured ESXi to send syslog messages and SNMP traps to our syslog and SNMP trap server. Since then we have been getting disturbing messages from ESXi: a few times a day it sends messages like the following about one of the virtual machines. For instance (the syslog wrongly shows the time one hour later):


********** syslog  (grep -i heartbeat vmware.log)  ***********

Feb 24 03:10:50 <local4.info> 10.5.0.41 Hostd: [2011-02-24       03:10:50.718 707DAB90 verbose       'vm:/vmfs/volumes/4c91ff62-f82b4bf1-5081-00188b2f6167/aaa2.oktv.hr/aaa2.oktv.hr.vmx']        Updating current heartbeatStatus: yellow
Feb 24 03:10:50 <local4.info> 10.5.0.41 Vpxa: [2011-02-24       03:10:50.720 1EEF5B90 verbose 'App'] [VpxaHalVmHostagent] 304:       guestHeartbeatStatus changed to yellow

Feb 24 03:10:50 <local4.info> 10.5.0.41 Vpxa: [2011-02-24       03:10:50.720 1EEF5B90 verbose 'App'] [VpxaHalServices]       VmHeartbeatChange Event for vm(17) 304
Feb 24 03:10:50 <local4.info> 10.5.0.41 Vpxa: [2011-02-24       03:10:50.720 1EEF5B90 verbose 'App'] [VpxaInvtVmChangeListener]       Guest HeartbeatStatus Changed

Feb 24 03:11:20 <local4.info> 10.5.0.41 Hostd: [2011-02-24       03:11:20.718 70CD2B90 verbose       'vm:/vmfs/volumes/4c91ff62-f82b4bf1-5081-00188b2f6167/aaa2.oktv.hr/aaa2.oktv.hr.vmx']        Updating current heartbeatStatus: green
Feb 24 03:11:20 <local4.info> 10.5.0.41 Vpxa: [2011-02-24       03:11:20.720 1EF77B90 verbose 'App'] [VpxaHalVmHostagent] 304:       guestHeartbeatStatus changed to green

Feb 24 03:11:20 <local4.info> 10.5.0.41 Vpxa: [2011-02-24       03:11:20.721 1EF77B90 verbose 'App'] [VpxaHalServices]       VmHeartbeatChange Event for vm(17) 304
Feb 24 03:11:20 <local4.info> 10.5.0.41 Vpxa: [2011-02-24       03:11:20.721 1EF77B90 verbose 'App'] [VpxaInvtVmChangeListener]       Guest HeartbeatStatus Changed

Feb 24 03:11:20 <local4.info> 10.5.0.41 Vpxa: [2011-02-24       03:11:20.721 1EF77B90 verbose 'App'] [VpxaInvtHost] Increment       master gen. no to (88663): VmRuntime:GuestHeartbeatStatusChanged

******************************************************************************
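To see how long each heartbeat loss lasts, the hostd lines above can be paired up programmatically. The following is only a minimal sketch: the function name `heartbeat_gaps` and the regular expression are my own assumptions, written against the line format shown in this excerpt, and other ESXi builds may log differently.

```python
import re
from datetime import datetime

# Matches hostd lines such as:
#   [2011-02-24 03:10:50.718 707DAB90 verbose 'vm:/vmfs/.../x.vmx'] Updating current heartbeatStatus: yellow
# The pattern is an assumption based on the excerpt above, not a documented format.
HEARTBEAT_RE = re.compile(
    r"\[(?P<date>\d{4}-\d{2}-\d{2})\s+(?P<time>\d{2}:\d{2}:\d{2}\.\d{3})\s+\S+\s+\w+\s+"
    r"'vm:(?P<vmx>[^']+)'\]\s+Updating current heartbeatStatus:\s+(?P<status>\w+)"
)


def heartbeat_gaps(lines):
    """Return (vmx_path, start, end, seconds) for each non-green interval."""
    degraded_since = {}  # vmx path -> time the status first left green
    gaps = []
    for line in lines:
        m = HEARTBEAT_RE.search(line)
        if not m:
            continue
        ts = datetime.strptime(f"{m['date']} {m['time']}", "%Y-%m-%d %H:%M:%S.%f")
        vmx, status = m["vmx"], m["status"]
        if status != "green":
            # Remember only the first non-green timestamp of an interval.
            degraded_since.setdefault(vmx, ts)
        elif vmx in degraded_since:
            start = degraded_since.pop(vmx)
            gaps.append((vmx, start, ts, (ts - start).total_seconds()))
    return gaps
```

Fed the two "Updating current heartbeatStatus" lines above, this reports a single yellow interval of 30 seconds for aaa2.oktv.hr.vmx.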


********************************** SNMP trap       **********************************

Feb 24 04:10:50 <local3.warn> nstorage1 snmptrapd[26851]:       10.5.0.41: Enterprise Specific Trap       (VMWARE-VMINFO-MIB::vmwVmHBLost) Uptime: 19 days, 11:51:45.87,       VMWARE-VMINFO-MIB::vmwVmID = INTEGER: 1,       VMWARE-VMINFO-MIB::vmwVmConfigFilePath = STRING:       /vmfs/volumes/4c91ff62-f82b4bf1-5081-00188b2f6167/aaa2.oktv.hr/aaa2.oktv.hr.vmx,        VMWARE-VMINFO-MIB::vmwVmDisplayName.1 = STRING: aaa2.oktv.hr-2

Feb 24 04:11:20 <local3.warn> nstorage1 snmptrapd[26851]:       10.5.0.41: Enterprise Specific Trap       (VMWARE-VMINFO-MIB::vmwVmHBDetected) Uptime: 19 days, 11:52:15.88,       VMWARE-VMINFO-MIB::vmwVmID = INTEGER: 1,       VMWARE-VMINFO-MIB::vmwVmConfigFilePath = STRING:       /vmfs/volumes/4c91ff62-f82b4bf1-5081-00188b2f6167/aaa2.oktv.hr/aaa2.oktv.hr.vmx,        VMWARE-VMINFO-MIB::vmwVmDisplayName.1 = STRING: aaa2.oktv.hr-2

******************************************************************************
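The trap pair can be checked the same way, using the Uptime field instead of the receiver's clock, which avoids the one-hour offset between the two log sources. This is a rough sketch only: `parse_trap` is a hypothetical helper, and the Uptime pattern is assumed from the two snmptrapd lines above.

```python
import re
from datetime import timedelta

# "Uptime: 19 days, 11:51:45.87" as printed in the excerpt above; the day part
# is treated as optional. Both patterns are assumptions based on that excerpt.
UPTIME_RE = re.compile(r"Uptime:\s+(?:(\d+) days?,\s+)?(\d+):(\d{2}):(\d{2})\.(\d{2})")
TRAP_RE = re.compile(r"VMWARE-VMINFO-MIB::(vmwVmHBLost|vmwVmHBDetected)\b")


def parse_trap(line):
    """Return (trap_name, host_uptime) or None if the line is not a heartbeat trap."""
    trap = TRAP_RE.search(line)
    uptime = UPTIME_RE.search(line)
    if not (trap and uptime):
        return None
    days, hours, minutes, seconds, centis = (int(g or 0) for g in uptime.groups())
    return trap.group(1), timedelta(
        days=days, hours=hours, minutes=minutes, seconds=seconds, milliseconds=centis * 10
    )
```

Subtracting the uptime of the vmwVmHBLost trap from that of the following vmwVmHBDetected trap gives 30.01 seconds for the pair above, which matches the yellow-to-green interval in the hostd log.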

The vSphere Client shows that server CPU usage is only 1/4 of the maximum and memory usage is only 1/2 of the maximum, without significant peaks. The guest OS system logs do not show any problems. VMware says that heartbeat loss can be caused by improperly installed VMware Tools or an unresponsive guest OS. I'm pretty sure that the guest OS and VMware Tools are configured and updated properly.

    
I have found a few posts on VMware Communities describing a similar problem: "Lots of Lost VM heartbeat snmp alerts" (http://communities.vmware.com/thread/196092) and "Problem with VM Heartbeat / Heartbeat Alarms - Alerts all the time?!" (http://communities.vmware.com/thread/231717), but neither offered a solution. At the end of the second thread, tsc09 claims: "VMware have finally acknowledged the problem as a bug (#616568).  Apparently it affect 4.1 too. No expected release date for a fix yet." I could not find an official description of this bug on VMware's web pages.

I wonder whether VMware has officially acknowledged this problem as a bug, and where on VMware's site the bug description can be found. If it is not a bug, what explains this behavior, and how can the problem be solved? My biggest concern is the VM running a VoIP server on this ESXi host. We have experienced some problems with VoIP that we could not explain. These messages could mean that virtual machines sometimes freeze for a short period of time and become unresponsive, although ESXi has enough resources to run the VMs without problems!


