Monday, January 13. 2014
High co-stop values seen during virtual machine snapshot activities
The red virtual machine has a %CSTP value of 19.59%, which means that roughly one fifth of the time its vCPUs are stalled, waiting to be co-scheduled. I noticed the red VM had a snapshot of 30 GB. After committing (deleting) the snapshot and lowering the number of vCPUs, the virtual machine ran fine again.
This behaviour is also described in the knowledge base article “High co-stop values seen during virtual machine snapshot activities”. Duco Jaspars has also written an article about it. Virtual machine performance can be adversely affected during snapshot operations for a number of reasons, both because of how snapshots work and because of environmental issues.
Snapshots introduce complexity to storage I/O. Due to the nature of snapshots, every read operation must traverse every snapshot disk and then the base disk in order to verify the appropriate disk block to return. Because these extended read operations are required, snapshots are the most performance-intensive disk format for virtual disks (as opposed to thin-provisioned, thick-provisioned, or eager-zeroed thick-provisioned virtual disks).
As storage I/O for snapshots grows, co-stop (%CSTP) values for a VM with multiple vCPUs can increase as the vCPUs wait on I/O completion. To reduce the high %CSTP values and increase virtual machine performance, consolidate any snapshots into the main virtual disk. After consolidation, the %CSTP value is reduced or eliminated and VM performance is improved.
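As a rough illustration of how you might spot this, esxtop's batch mode (`esxtop -b`) dumps its counters as CSV, which can be filtered for high co-stop values. The sample data, column positions and the 3% threshold below are assumptions for illustration only, not real esxtop output:

```shell
# Hypothetical sample of esxtop batch-mode CPU counters; real batch output
# is a much wider CSV, so the columns here are invented for illustration.
cat > /tmp/cstp_sample.csv <<'EOF'
"Group Name","%USED","%RDY","%CSTP"
"vm-red","85.10","4.20","19.59"
"vm-blue","40.00","1.10","0.40"
EOF

# Flag any VM whose %CSTP exceeds 3, a common rule-of-thumb threshold.
flagged=$(awk -F'","' 'NR > 1 {
    name = $1; cstp = $4
    gsub(/"/, "", name); gsub(/"/, "", cstp)
    if (cstp + 0 > 3) print name, cstp
}' /tmp/cstp_sample.csv)
echo "$flagged"
```

On a real host you would feed `esxtop -b -n 1` through a similar filter after locating the %CSTP columns for the VM groups you care about.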
Thursday, January 9. 2014
Are you afraid of the ballooning ghost?
What about this ballooning ghost? Well, when a virtual machine is configured with a memory limit lower than its configured memory, the VM will experience ballooning, compression and swapping. How does it work? If the virtual machine sees 3 GB of configured memory and tries to access it, it will only receive physical memory until the limit is reached. So if you set a limit of 2 GB and the virtual machine tries to use 3 GB, 1 GB will be ballooned, compressed and eventually swapped.
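The arithmetic from the example above can be sketched as a quick calculation (the 3 GB configured / 2 GB limit figures are just the numbers used in this post):

```shell
# Numbers from the example above: 3 GB configured, 2 GB limit.
configured_mb=3072
limit_mb=2048

# Everything the guest uses above the limit must be reclaimed by the
# VMkernel: ballooned, compressed, or in the worst case swapped.
reclaim_mb=$((configured_mb - limit_mb))
echo "Up to ${reclaim_mb} MB subject to reclamation"
```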
You can easily track down virtual machines configured with a limit by using the vMemory tab in RVTools. I’ve never heard a valid reason for using memory limits, so get rid of them. This behaviour is also described in knowledge base article “Impact of virtual machine memory and CPU resource limits”.
When a memory limit is set lower than the virtual machine's provisioned memory, it is considered the upper boundary for the amount of physical memory that can be directly assigned to this particular virtual machine. The guest operating system is not aware of this limit, and it optimizes memory management options to the assigned memory size.
When the limit is reached or exceeded, the guest operating system can still request new pages, but due to the limit the VMkernel does not allow the guest to directly consume more physical memory and treats the virtual machine as if the resource is under contention. As such, memory reclamation techniques are used to enable the virtual machine to consume what it has requested. Depending on the amount of pages requested by the virtual machine, the VMkernel might, in the worst case scenario, resort to VMkernel swap to fulfil the request.
The VMkernel first tries to reclaim memory by inflating the balloon driver, letting the guest memory manager decide what to page out. Starting with ESX 4.1, the VMkernel also tries to compress memory pages before swapping them out. You can verify the impact of a memory limit by running esxtop and looking at MCTLSZ and MCTLTGT, SWCUR and SWTGT, and CACHEUSD.
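As a minimal sketch of what that check might look like, here is an invented fragment of esxtop batch-mode memory counters, filtered for VMs under active reclamation. The column layout, VM names and values are assumptions for illustration, not real esxtop output:

```shell
# Hypothetical sample of esxtop batch-mode memory counters; real batch
# output is a much wider CSV, so the columns here are invented.
cat > /tmp/mem_sample.csv <<'EOF'
"Group Name","MCTLSZ","MCTLTGT","SWCUR","SWTGT"
"vm-limited","512.00","768.00","256.00","256.00"
"vm-healthy","0.00","0.00","0.00","0.00"
EOF

# A non-zero MCTLSZ (balloon size) or SWCUR (currently swapped memory)
# means the VMkernel is actively reclaiming memory from that VM.
reclaiming=$(awk -F'","' 'NR > 1 {
    name = $1; balloon = $2; swcur = $4
    gsub(/"/, "", name)
    if (balloon + 0 > 0 || swcur + 0 > 0) print name
}' /tmp/mem_sample.csv)
echo "$reclaiming"
```

If a targets counter (MCTLTGT or SWTGT) stays above the current value, the VMkernel is still trying to reclaim more from that VM.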
Sunday, December 15. 2013
VMware Virtual SAN explained by Melina McLarty
Senior staff engineer Melina McLarty discusses Virtual SAN, which virtualizes the local physical storage resources of ESXi hosts and turns them into storage pools that can be carved up and assigned to virtual machines and applications according to their quality-of-service requirements.
Friday, November 8. 2013
Video - VMware vSphere 5 Memory Management and Diagram
This video expands on the diagram provided in knowledge base article "VMware vSphere 5 Memory Management and Monitoring diagram (2017642)". It provides a comprehensive look into the ESXi memory management mechanisms and reclamation methods, and also covers the relevant monitoring components in vCenter Server and troubleshooting tools such as esxtop.
Tuesday, July 23. 2013
VMware Knowledge Base (KB) - Linux 2.6 kernel-based virtual machines experience slow disk I/O performance (2011861)
If you're using Linux 2.6 kernel-based virtual machines and you're experiencing slow storage performance compared to physical hosts, there might be a problem with the I/O scheduler. As of Linux kernel 2.6, the default I/O scheduler is Completely Fair Queuing (CFQ).
The default scheduler will affect all disk I/O for VMDK and RDM-based virtual storage solutions. In virtualized environments, it is often not beneficial to schedule I/O at both the host and guest layers.
If multiple guests use storage on a filesystem or block device managed by the host operating system, the host may be able to schedule I/O more efficiently because it is aware of requests from all guests and knows the physical layout of storage, which may not map linearly to the guests' virtual storage.
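The usual remedy from the KB article is to switch the guest's scheduler from CFQ to NOOP, deferring scheduling decisions to the hypervisor and the storage array. A sketch of the change for a Linux 2.6 guest follows; `sda` is an example device name, so adjust for your own disks. This is a guest configuration fragment, not something to run blindly:

```shell
# Show the current scheduler; the active one appears in brackets,
# e.g. "noop anticipatory deadline [cfq]".
cat /sys/block/sda/queue/scheduler

# Switch to NOOP at runtime (lost on reboot).
echo noop > /sys/block/sda/queue/scheduler

# To make the change persistent, append "elevator=noop" to the kernel
# line in the bootloader config (e.g. /boot/grub/menu.lst).
```

The runtime change takes effect immediately per block device; the boot parameter applies it to all devices at startup.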
Saturday, September 29. 2012
Mike Laverick's VMwareWag - vSphere Storage 5.1 - Part 1 with Cormac Hogan
In part one Mike Laverick discusses with Cormac Hogan the new vSphere 5.1 storage features, including: increased VMFS file sharing, Space-Efficient Sparse virtual disks, new vSphere Storage APIs and support for 5-node MSCS clusters. Follow Cormac at http://www.cormachogan.com and at @VMwareStorage