As containers revolutionize how applications are built and deployed, security challenges have grown just as quickly. The VMware Tanzu Platform offers a comprehensive, layered approach to container security—designed to protect applications throughout their entire lifecycle.
Why Container Security Matters
Containers are lightweight, fast, and efficient—but they come with unique risks. They share the host’s kernel and resources, making them more exposed than traditional virtual machines. Misconfigured containers, outdated images, and exposed secrets are just a few of the common vulnerabilities that attackers exploit.
Tanzu’s Approach to Container Isolation
Tanzu Platform enforces strong isolation by using Linux namespaces for each container. This includes separate IPC, PID, user, network, and mount namespaces, effectively shielding containers from one another and from the host system. Each container runs unprivileged by default, minimizing the risk of breakout attacks.
Filesystem and Resource Isolation
Using a combination of OverlayFS and XFS, Tanzu ensures each container has a read-only root filesystem and tightly controlled write access. It enforces disk quotas at the directory level and leverages cgroups to control CPU and memory usage. Tanzu also prevents fork-bomb attacks and restricts access to critical devices.
Reducing the Attack Surface
Tanzu eliminates unnecessary Linux capabilities to minimize privilege escalation opportunities. By default, capabilities like CAP_SYS_ADMIN, CAP_NET_ADMIN, and CAP_SYS_PTRACE are dropped. Tanzu also hardens its OS stemcells and root filesystem, removing unnecessary packages and disabling risky protocols.
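To make the capability list more tangible, here is a minimal Python sketch that checks a capability bitmask for the three capabilities named above. It assumes the mask is a hex string in the format of the CapEff field in /proc/<pid>/status; the function names are made up for illustration and are not part of any Tanzu tooling.

```python
# Bit positions as defined in the Linux header linux/capability.h.
CAPABILITY_BITS = {
    "CAP_NET_ADMIN": 12,
    "CAP_SYS_PTRACE": 19,
    "CAP_SYS_ADMIN": 21,
}


def held_capabilities(cap_mask_hex: str) -> set[str]:
    """Return which of the three tracked capabilities are set in a hex mask."""
    mask = int(cap_mask_hex, 16)
    return {name for name, bit in CAPABILITY_BITS.items() if mask & (1 << bit)}


def is_hardened(cap_mask_hex: str) -> bool:
    """True when none of the three high-risk capabilities is present."""
    return not held_capabilities(cap_mask_hex)
```

Inside a Linux container, the mask could be taken from the output of `grep CapEff /proc/self/status` and passed to `is_hardened`.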
Defense-in-Depth with AppArmor and Seccomp
Tanzu uses AppArmor to enforce access controls at the process level, blocking access to sensitive files like /proc/kcore and /proc/sysrq-trigger. In addition, Seccomp filters system calls, allowing only the ones explicitly needed by the container, further reducing the risk of kernel-level exploits.
Protecting Secrets and Communications
Tanzu integrates CredHub for secure credential management and rotates application identity certificates every 24 hours. It also includes built-in support for mTLS (mutual TLS), ensuring encrypted communication between containers and the platform’s routing layer.
Immutable Infrastructure and Automated Patching
By treating containers and VMs as disposable and replacing them from golden images, Tanzu avoids configuration drift and speeds up patching. The platform supports zero-downtime upgrades using BOSH, enabling security updates without disrupting application availability.
Secure by Default
From buildpacks that manage application dependencies to sidecar proxies that enforce TLS, Tanzu ensures that security is built-in—not bolted on. The platform is constantly scanned for vulnerabilities, and patches are released rapidly and deployed automatically.
Conclusion
VMware Tanzu Platform provides an enterprise-grade foundation for securely running containerized applications. With its defense-in-depth model, Tanzu reduces risk at every layer of the stack—giving InfoSec and platform teams peace of mind and developers the freedom to innovate.
Sunday, July 6. 2025
Container Security in VMware Tanzu Platform: Defense in Depth for Modern Apps
Friday, July 4. 2025
New Release: VMware Cloud Foundation 9 Design Guide Now Available
The much-anticipated VMware Cloud Foundation (VCF) 9 Design Guide is now officially released, providing cloud architects, platform engineers, and VI admins with a comprehensive framework for designing robust, scalable, and efficient private cloud infrastructures using VCF.
Whether you're building a new deployment from scratch or optimizing an existing environment, this guide delivers actionable insights, structured blueprints, and a robust decision-making framework tailored to the latest VCF release.
The guide is meticulously structured into several core sections, each designed to help you make informed decisions during every phase of your VCF design and deployment journey.
Explore a high-level overview of each VCF component — including compute, storage, and networking — along with the trade-offs, benefits, and implications of various architectural choices.
Jumpstart your implementation using pre-defined blueprints tailored to specific use cases. These blueprints provide end-to-end design recommendations that can be fully adopted or used as a foundation to customize based on your organizational needs.
Available blueprints include:
- VCF Fleet in a Single Site with Minimal Footprint
- VCF Fleet in a Single Site
- VCF Fleet with Multiple Sites in a Single Region
- VCF Fleet with Multiple Sites Across Multiple Regions
- VCF Fleet with Multiple Sites in a Single Region plus Additional Regions
Dive deep into the design considerations for each individual VCF component. Each entry includes:
- Design Requirements – Mandatory configurations necessary for VCF to operate as intended.
- Design Recommendations – Best practices based on field experience and engineering validation.
- Design Choices – Decision points where multiple valid options exist, along with guidance on when to choose each.
Tuesday, June 24. 2025
Resilience Capabilities for vSAN in VMware Cloud Foundation 9.0
vSAN availability technologies are designed to ensure data resilience and minimize downtime in the event of failures. These technologies work in tandem with VMware vSphere HA and the storage policy-based management (SPBM) framework to maintain the integrity of data and the availability of virtual machines. Understanding how vSAN handles failures at various levels is critical to maintaining a highly available environment.
vSAN achieves data availability by distributing data components across multiple hosts and fault domains based on policies. The most commonly used policy is failures to tolerate. This defines the number of host, disk, or fault domain failures a virtual machine object can withstand without data loss. For example, a RAID-1 mirror with one failure to tolerate (FTT=1) stores two copies of data on separate hosts. If one host fails, the other retains availability.
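The mirroring arithmetic can be sketched in a few lines of Python. This is an illustration only: it assumes the commonly cited rule that RAID-1 with FTT=n keeps n+1 replicas and needs 2n+1 hosts (or fault domains) so that witness components can break ties. The `MirrorLayout` class and `raid1_layout` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MirrorLayout:
    replicas: int           # full copies of the data
    min_hosts: int          # hosts or fault domains needed, including witness placement
    raw_capacity_gb: float  # raw capacity consumed for the given usable size


def raid1_layout(ftt: int, usable_gb: float) -> MirrorLayout:
    """Estimate replica count, host count, and raw capacity for RAID-1."""
    if not 1 <= ftt <= 3:
        raise ValueError("this sketch only covers FTT values from 1 to 3")
    replicas = ftt + 1
    min_hosts = 2 * ftt + 1
    return MirrorLayout(replicas, min_hosts, usable_gb * replicas)
```

For a 100 GB object with FTT=1, this gives two replicas, three hosts, and 200 GB of raw capacity.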
Beyond mirroring, vSAN also supports erasure coding with RAID-5 and RAID-6 configurations. RAID-5 supports one failure while RAID-6 tolerates two. These configurations offer space efficiency but require more hosts to satisfy policy requirements. Erasure coding is ideal for capacity-focused workloads that can tolerate slightly higher write latencies compared to mirroring.
One of the key availability mechanisms is vSAN’s ability to detect and respond to failures. When a component becomes inaccessible, vSAN determines whether the failure is transient or permanent. It uses a time-based mechanism to wait before rebuilding data, preventing unnecessary resyncs due to brief outages. The default delay is 60 minutes, after which a repair operation may commence if the component remains unavailable.
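The time-based decision can be expressed as a simple comparison. The following Python sketch is not vSAN code; it only models the rule described above, with the 60-minute default taken from the text and all names chosen for illustration.

```python
from datetime import datetime, timedelta

# Default delay from the text above; treated here as a plain constant.
DEFAULT_REPAIR_DELAY = timedelta(minutes=60)


def repair_due(absent_since: datetime, now: datetime,
               delay: timedelta = DEFAULT_REPAIR_DELAY) -> bool:
    """Return True once a component has been absent for at least `delay`."""
    return now - absent_since >= delay
```

A host that reboots and returns after 10 minutes would therefore not trigger a rebuild in this model, while one that stays away for the full hour would.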
In addition to failure handling, vSAN includes automatic rebalancing features. When disk space across the cluster becomes uneven or after repairs, vSAN may redistribute components to optimize performance and avoid hotspots. This helps ensure balanced use of storage resources and sustained performance.
Witness components play an important role in maintaining quorum during failures. These are small metadata objects that participate in cluster decisions. For example, in a 2-node configuration, a witness resides on a separate host or appliance to act as a tie-breaker. This ensures that split-brain scenarios are avoided, and availability decisions are made accurately.
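The tie-breaker idea boils down to a majority check. Below is a minimal Python sketch, assuming one vote per component and the rule that an object stays accessible only while more than half of its votes are reachable. Real vote assignment in vSAN can be more nuanced, so treat this as a model rather than the actual algorithm.

```python
def has_quorum(votes_reachable: int, votes_total: int) -> bool:
    """Return True when strictly more than half of all votes are reachable."""
    if votes_total <= 0 or not 0 <= votes_reachable <= votes_total:
        raise ValueError("vote counts are out of range")
    return votes_reachable * 2 > votes_total
```

In a 2-node cluster with a witness (three votes), losing one participant leaves two of three votes and the object stays available. Without the witness, a partition between the two nodes would leave each side with exactly half, and neither could safely continue.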
vSAN also incorporates features to protect against storage device failures. It can mark devices as degraded or failed and migrate components as needed. When a disk exhibits signs of wear or IO errors, vSAN initiates evacuation processes and alerts administrators via health checks and vCenter alarms.
Stretched cluster configurations allow for availability across two sites with a third witness site. Each site has its own fault domain, and vSAN ensures synchronous writes between them. In the event of a site failure, vSAN maintains availability by failing over to the other site. The witness in a third site helps determine which site should remain active, ensuring consistency and preventing data corruption.
Maintenance mode operations are also availability-aware. vSAN supports three modes: ensure accessibility, full data migration (evacuate all data), and no data migration. The chosen mode affects whether VMs remain accessible during host maintenance and how data is protected. For example, ensure accessibility keeps data available but may reduce resilience until the host returns. Full data migration preserves both availability and the configured resilience by moving all components off the host in advance.
Proactive monitoring and alerting are critical to vSAN availability. Skyline Health checks for cluster, network, and hardware issues, while vCenter provides event logs and alerts for deeper investigation. Administrators can define alarms for capacity thresholds, degraded hardware, and other risk indicators to prevent outages.
In summary, vSAN offers a comprehensive set of availability technologies that span hardware failure detection, policy-driven resilience, automated repair and rebalancing, and integration with vSphere HA. These mechanisms ensure that virtualized workloads remain protected, resilient, and highly available, even in the face of component or site failures.
Diagnosing and mitigating vSAN performance issues in VCF 9.0 environments
Troubleshooting vSAN performance requires a structured approach that considers the various components involved in a vSAN cluster. Performance issues can originate from compute, network, or storage layers, and identifying the root cause demands visibility into all three. Administrators should begin with clear problem definitions. Determine whether the issue is affecting a single VM, a set of workloads, or the entire cluster. It’s important to assess whether the issue is read or write related, and whether it is consistent or intermittent. These distinctions help narrow down the scope and focus the investigation.
vSAN Health and Skyline Health provide the first layer of diagnostics. These tools can indicate common misconfigurations, hardware issues, and network latency. Health checks should be reviewed for signs of contention, failed components, or cluster imbalance. Next, use vSAN Performance Service to gather metrics on IOPS, latency, and throughput at various levels including the cluster, disk group, and object layers. This visibility helps identify hotspots or bottlenecks that may not be immediately apparent.
Key metrics to analyze include write buffer usage, backend throughput, and resync activity. Elevated resync traffic from recent host failures or policy changes can reduce available bandwidth for regular I/O. Administrators should examine whether the congestion is due to background operations such as data rebuilds, rebalancing, or deduplication tasks. Disk group saturation, especially when cache drives are full or underperforming, is another common cause of elevated latency.
Network troubleshooting is essential. Check for dropped packets, high retransmissions, or latency spikes. vSAN is sensitive to network performance, and even minor disruptions can lead to delays in data replication or acknowledgments. Confirm that network hardware supports adequate throughput and low latency. Ensure all vSAN VMkernel interfaces are properly configured, with consistent MTU settings and failover policies.
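A quick way to spot inconsistent MTU settings is to compare every vSAN VMkernel interface against the most common value in the cluster. The Python sketch below assumes the MTU values have already been collected into a dictionary, for example from an inventory export; the interface labels and the function name are hypothetical.

```python
from collections import Counter


def mtu_outliers(vmk_mtus: dict[str, int]) -> dict[str, int]:
    """Return the interfaces whose MTU differs from the most common MTU."""
    if not vmk_mtus:
        return {}
    expected, _count = Counter(vmk_mtus.values()).most_common(1)[0]
    return {name: mtu for name, mtu in vmk_mtus.items() if mtu != expected}
```

For example, `mtu_outliers({"esx01/vmk1": 9000, "esx02/vmk1": 9000, "esx03/vmk1": 1500})` flags only the third host. Matching MTU on the physical switch ports still has to be checked separately.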
Another important area to investigate is workload behavior. Some performance issues are tied to application characteristics rather than infrastructure faults. Identify whether the workload is generating large block writes, unaligned I/O, or excessive metadata operations. Test whether performance improves when running on alternative hardware or in a different vSAN cluster. This can help isolate whether the issue is systemic or localized.
Where possible, leverage performance graphs in vCenter, esxtop, or vSAN Observer for deeper analysis. Esxtop can be used to monitor CPU, memory, and disk contention in real time. Look for signs such as high device latency (DAVG), queue latency (QAVG), or kernel latency (KAVG). vSAN Observer can display the internal behavior of vSAN such as component ownership and object layout, giving insights into imbalance or misdistribution of data.
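To interpret the three esxtop counters together, it helps to remember how they relate. The Python sketch below assumes the usual reading of these counters: the latency seen by the guest (GAVG) is roughly DAVG plus KAVG, and QAVG is the queueing portion of KAVG. The function names and the classification logic are illustrative only, not a VMware-defined method.

```python
def guest_latency_ms(davg_ms: float, kavg_ms: float) -> float:
    """Approximate guest-observed latency (GAVG) as DAVG + KAVG."""
    return davg_ms + kavg_ms


def dominant_latency_source(davg_ms: float, kavg_ms: float, qavg_ms: float) -> str:
    """Name the largest contributor: 'device', 'queue', or 'kernel'.

    QAVG is treated as a part of KAVG, so the non-queue kernel time
    is kavg_ms - qavg_ms.
    """
    if davg_ms >= kavg_ms:
        return "device"
    if qavg_ms >= kavg_ms - qavg_ms:
        return "queue"
    return "kernel"
```

A result of "device" points toward the storage device or fabric, while "queue" suggests looking at queue depths before suspecting the hypervisor itself.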
If a performance problem remains unresolved, consider collecting support bundles and engaging VMware GSS. Include a clear problem statement, the specific time range of impact, and any changes made in the environment that may have contributed. VMware recommends enabling advanced logging options before recreating the issue to capture sufficient detail.
Proactive steps can also prevent performance issues. Keep firmware and drivers up to date, validate hardware compatibility using the VMware Compatibility Guide, and avoid unnecessary policy changes during peak workloads. Regularly review capacity usage and aim to maintain at least 30% free space in vSAN to allow room for internal operations.
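The free-space guideline is easy to turn into a small check. This Python sketch assumes capacity and usage figures are already available in terabytes, for example copied from the capacity view; the 30% target comes from the paragraph above and the function names are made up for illustration.

```python
def free_space_ratio(capacity_tb: float, used_tb: float) -> float:
    """Return the fraction of raw capacity that is still free."""
    if capacity_tb <= 0 or not 0 <= used_tb <= capacity_tb:
        raise ValueError("capacity and usage values are out of range")
    return (capacity_tb - used_tb) / capacity_tb


def below_free_space_target(capacity_tb: float, used_tb: float,
                            target: float = 0.30) -> bool:
    """True when free space has dropped below the target fraction."""
    return free_space_ratio(capacity_tb, used_tb) < target
```

A 100 TB cluster with 75 TB used has 25% free and would be flagged, while the same cluster at 60 TB used would not.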
By combining structured analysis, proper tooling, and best practices, administrators can identify and resolve vSAN performance issues efficiently and minimize downtime across critical workloads.
Delivering Integrated File Services with vSAN in VMware Cloud Foundation 9.0
vSAN File Services in VMware Cloud Foundation 9.0 introduces a powerful and streamlined way for administrators to provide file sharing capabilities directly from within the hypervisor layer. Whether serving SMB shares to Windows clients or NFS exports to Linux systems and cloud-native applications, vSAN File Services eliminates the need for traditional filers or separate virtual appliances. Its integration with vSphere makes deployment and ongoing administration simple, with configuration and management fully accessible through the vSphere Client UI.
This built-in service supports up to 500 file shares per cluster, with a limit of 100 SMB shares for Windows environments. Shares are intelligently distributed across the cluster using vSAN’s Cluster-Level Object Manager, and a dedicated protocol services layer ensures fair access and automatic load balancing. vSAN File Services supports Kerberos authentication for both NFS and SMB protocols, enhancing security and integration with enterprise environments.
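The share limits can be modeled as a simple pre-check before creating a new share. The Python sketch below assumes that the 500-share limit counts NFS and SMB shares together and that at most 100 of them may be SMB. This is one reading of the numbers above, and the function is purely illustrative rather than part of any vSAN API.

```python
MAX_SHARES_PER_CLUSTER = 500  # total shares, from the text above
MAX_SMB_SHARES = 100          # SMB shares, from the text above


def can_add_share(nfs_count: int, smb_count: int, protocol: str) -> bool:
    """Return True if one more share of `protocol` fits within both limits."""
    if protocol not in ("nfs", "smb"):
        raise ValueError("protocol must be 'nfs' or 'smb'")
    if nfs_count + smb_count >= MAX_SHARES_PER_CLUSTER:
        return False
    if protocol == "smb" and smb_count >= MAX_SMB_SHARES:
        return False
    return True
```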
Mounting shares is simplified through guided syntax in the UI, providing administrators with exact commands for mounting in Linux or connecting in Windows. This feature accounts for protocol version differences, including NFS v3 and NFS v4.1, and handles the appropriate redirection for optimal access.
vSAN File Services supports multiple deployment models, including hyperconverged clusters, disaggregated storage clusters, stretched clusters, and 2-node clusters. This flexibility makes it suitable for everything from core data centers to edge sites. Administrators can configure placement policies to ensure optimal site affinity, and services are automatically balanced to maintain performance and availability.
From a management perspective, everything is handled centrally via vCenter. Administrators can adjust storage policies, set quotas, configure access controls, and monitor usage. Integrated health checks appear in Skyline Health, covering file server availability, infrastructure health, and share status. Performance metrics such as IOPS, latency, and throughput are provided per share, alongside capacity usage statistics.
Security and governance features include quota enforcement and Access Based Enumeration, which hides files and folders users do not have permission to access. These features help reduce data exposure and support compliance goals.
Architecturally, vSAN File Services runs protocol services as stateless containers within agent VMs on each host. These containers handle I/O presentation for NFS or SMB shares but do not store data. Instead, the underlying vSAN Virtual Distributed File System handles data placement and storage, using vSAN objects to represent shares and apply policies for resilience and availability.
The system uses a zero-copy data path to minimize latency and reduce processing overhead. Failover mechanisms automatically reinstantiate containers on healthy hosts if needed, and containers are rebalanced every 30 minutes based on the number of shares served.
Scaling is straightforward. Adding hosts increases both capacity and share distribution, while growing share size triggers automatic creation of additional backing objects. This allows vSAN File Services to scale up or out depending on workload requirements.
Key considerations include reserving IP addresses for containers, ensuring proper DNS setup, and selecting either NFS or SMB per share. vSAN File Services does not support presenting NFS datastores to ESXi for VM storage, and snapshot capabilities are only available via API. Replication must be performed using external tools such as rsync or Robocopy.
vSAN File Services is a modern, hypervisor-integrated solution for providing enterprise-grade file shares across a wide range of deployment scenarios. It reduces complexity, improves manageability, and offers flexible performance and capacity benefits without the cost and overhead of separate file servers or appliances. For organizations already using vSAN, enabling File Services can simplify infrastructure and consolidate storage services into a single platform.
Boosting Storage Efficiency with VMware vSAN in Cloud Foundation 9
As data volumes continue to grow, space efficiency becomes a key concern for modern IT infrastructures. With the release of VMware Cloud Foundation 9.0, organizations can take advantage of new capabilities in vSAN to significantly optimize storage consumption without compromising performance.
In this article, we explore the opportunistic and deterministic space efficiency features built into vSAN, comparing the capabilities of the Original Storage Architecture (OSA) and the newer Express Storage Architecture (ESA).
Opportunistic features deliver savings based on data compressibility and structure, but results may vary. Key capabilities include:
Compression (ESA): In ESA, compression happens immediately at the top of the vSAN stack, before data is written to disk or transmitted across the network. This drastically reduces CPU and network utilization while improving performance—especially in stretched clusters.
Global Deduplication (ESA): Unlike OSA, where deduplication is disk group-bound, ESA enables cluster-wide deduplication with adaptive throttling. Deduplication is performed post-write and prioritizes cold data, minimizing performance impact.
Deduplication & Compression (OSA): In OSA, deduplication and compression are performed during destaging, which may degrade performance depending on the capacity tier. It’s a viable option for legacy environments but introduces latency in high-throughput workloads.
Compression-Only Mode (OSA): Introduced in vSAN 7 U1, this mode offers a middle ground—providing space savings without the overhead of deduplication. It’s particularly beneficial for performance-sensitive environments using modern flash storage.
Deterministic features guarantee capacity savings by using RAID-5/6 erasure coding instead of RAID-1 mirroring.
RAID-5/6 (ESA): With ESA, RAID-5 and RAID-6 deliver deterministic space efficiency without performance penalties. For instance, RAID-5 (4+1) consumes just 1.25x the data footprint, making it ideal for clusters with 6+ nodes.
RAID-5/6 (OSA): OSA uses a 3+1 or 4+2 data placement strategy. While this offers a significant reduction in capacity consumption compared to RAID-1, it comes at the cost of higher CPU, network, and I/O overhead—especially under high write loads.
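The capacity figures above follow directly from the stripe layout: a stripe of d data fragments and p parity fragments consumes (d + p) / d times the usable data. A short Python sketch of that arithmetic, with a hypothetical function name, looks like this:

```python
from fractions import Fraction


def capacity_multiplier(data_fragments: int, parity_fragments: int) -> Fraction:
    """Raw capacity consumed per unit of usable data for a d+p stripe."""
    if data_fragments < 1 or parity_fragments < 0:
        raise ValueError("fragment counts are out of range")
    return Fraction(data_fragments + parity_fragments, data_fragments)
```

With this, a 4+1 stripe yields 5/4 (1.25x), a 3+1 stripe yields 4/3 (about 1.33x), and a 4+2 stripe yields 3/2 (1.5x). A RAID-1 mirror with FTT=1 consumes 2x for comparison.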
Additional vSAN Space Saving Tools:
- Thin Provisioning: Efficient allocation of disk capacity on demand.
- TRIM/UNMAP: Reclaims unused space after guest OS deletions.
- Storage Policies: Fine-tuned controls for per-object resiliency and efficiency.
Recommendations:
Choose ESA whenever possible – It delivers superior space savings with no performance trade-offs.
Use Compression-only (OSA) for performance-sensitive workloads on legacy hardware.
Apply RAID-5/6 erasure coding in ESA to reduce capacity usage while maintaining high resilience.
Monitor space savings via vCenter Capacity View, and stay current with the latest vSAN releases to benefit from performance and efficiency enhancements.
With vSAN in VMware Cloud Foundation 9, organizations can confidently strike a balance between performance, resilience, and capacity optimization. Whether you’re modernizing with ESA or managing legacy OSA clusters, understanding and applying the right space efficiency techniques is key to maximizing your investment.
For hands-on workshops or custom vSAN training sessions, feel free to get in touch.
Tuesday, June 17. 2025
What's New in VCF 9 in 10 slides
What's New in VCF 9
VCF 9 Overview
Simplifying Modern Infrastructure Deployment and Operations
Core Mission
VCF 9 streamlines the transition from siloed IT environments to a unified, integrated private cloud platform, making deployment, consumption, and management faster and easier than ever before.
Dual Persona Support
Key Improvements
Unified Management
Single system for entire infrastructure management
Enhanced Security
Advanced security features and compliance
Simplified Operations
Streamlined deployment and management workflows
Multi-Tenancy
Secure, isolated environments on shared infrastructure
Identity Management in VCF Operations
Configure identity federation for VCF with enhanced security and SSO capabilities
Capabilities
Configure identity federation for VCF
Single Sign On (SSO) for VCF stack
High-availability of identity broker
Service accounts for communications between VCF components
Integrations with industry-standard IDPs
Benefits
Choice to use the embedded identity broker in vCenter or deploy an external one through VCF Operations
Multiple VCF Identity Broker deployments for geo-separation or other use cases
Eliminate password-expiry problem for inter-service communication
Integrate with IDPs such as Okta and PingIdentity
Certificate Management
View, rotate, automate, and schedule certificate management across VCF components
Capabilities
Centralized management of TLS certs of VCF components
Single Sign On (SSO) for VCF stack
OOTB alerts for certificate expiry
Automated workflows to replace certificates using MSCA, VMCA & OpenSSL CA
Third Party Signed Certificate support
Benefits
Single pane of glass view for VCF component certificates
Minimizes downtime risks due to expired certificates
Auto-renewal with non-disruptive certificate upgrade for VCF components
Password Management
Centralized, automated password management across VCF components
Capabilities
Centralized password management for local accounts of VCF components
Out of the box alerts and notifications for password expiry
Password update and rotations
Password Status Dashboard
Benefits
Overall visibility across VCF components
Simplifies management for administrators
Minimizes downtime risks and ensures compliance with security protocols
Configuration Management
Manage vCenter configuration using desired state templates
Capabilities
View drift summary across the environment
Monitor configuration across vCenters
Desired state template from vCenters
Reporting / notification based on policy
Integration with Git repository
Benefits
Detect configuration drifts across the environment
Schedule template-based drift detection
View all drifts in a single console
Control template versions with source control integration
Configuration Status
Tag Management
Centralized View and Management of tags across VCF components
Capabilities
Create, edit, delete categories and tags from a single pane of glass
Import brownfield categories and tags from vCenters and evaluate conflicts
Push categories and tags to vCenters
Tag Categories
Benefits
Single Pane of Glass for Tag Management
Centralized consistent behaviour across services to create and manage tags
Easy identification and elimination of duplicate tags across vCenters
Pushed tags are persisted across vCenters after vMotion
Tag Management Workflow
Integrated Operations Suite
Comprehensive operations capabilities for VCF environments
VCF Health & Diagnostics
Discover and remediate issues impacting VCF software
- Single console to diagnose known issues
- View security risks based on CVE
- Curated remediation steps
Integrated Log Operations
In-context logs for monitoring and troubleshooting
- Auto collect logs from all VCF components
- Powerful queries and visualization
- Create alerts based on operational data
Integrated Network Operations
Network monitoring and troubleshooting
- Overview of VCF network inventory
- Monitor health of network components
- Traffic summary and flow analysis
Storage Operations
Unified operations across storage technologies
- Federated view of storage components
- vSAN cluster health monitoring
- Performance insights and planning
Security Operations
User and Infrastructure Security
- Holistic view of security stance
- Overview dashboard for security
- VCF deployment security posture
Troubleshooting & Observability
Enhanced visibility and faster issue resolution
- Faster time to value for customers
- Accelerated troubleshooting
- Reduced support requests
Compute Enhancements
Advanced compute capabilities for modern workloads
Advanced Memory Tiering with NVMe
Optimizes memory management by offloading cold data to NVMe storage while keeping hot data in DRAM.
Confidential Computing
Leveraging Intel TDX and AMD's SEV-SNP for advanced security by isolating and encrypting workloads.
vSphere Kubernetes Service Enhancements
Key Benefits
Enhanced Performance
Better resource utilization and workload efficiency
Advanced Security
Hardware-level encryption and isolation
Container Flexibility
Support for diverse containerized applications
Storage & Networking Improvements
Enhanced vSAN capabilities and NSX networking innovations
vSAN Storage Enhancements
Native vSAN-to-vSAN Data Protection with Deep Snapshots
Integrated vSAN Global Deduplication
vSAN ESA Stretched Site Recovery
Business continuity during dual-site failures
NSX Networking Innovations
Native VPCs in vCenter and VCF Automation
Simplifies creation and management of secure, isolated networks
High-Performance Network Switching with NSX Enhanced Data Path
Easy Transition from VLAN to VPC
Streamlines migration to modern network architecture
Combined Benefits
What’s New in VMware Cloud Foundation 9
VMware Cloud Foundation (VCF) 9 is here—and it’s a game-changer for private cloud operations. With an architecture built for simplicity, security, and unified management, VCF 9 addresses long-standing operational pain points and sets the new standard for modern datacenter automation. Whether you're a cloud administrator or a platform engineer, this release brings capabilities that dramatically reduce overhead and enhance control.
A Unified Vision for Modern Infrastructure
At its core, VCF 9 delivers on the promise of a unified private cloud. It eliminates traditional silos by offering a single control plane for deploying, consuming, and managing infrastructure at scale. Whether you're building greenfield environments or managing brownfield deployments, VCF 9 adapts to your operational model.
Highlights of VCF 9
🔐 Enhanced Security
- Confidential Computing: Leverages Intel TDX and AMD SEV-SNP for hardware-enforced workload isolation.
- Identity Federation: Supports SSO and high-availability configurations with integration to identity providers like Okta and PingIdentity.
- Certificate Lifecycle Management: Central dashboard for visibility, rotation, and automation across all VCF components.
- Centralized Password Management: Tracks and rotates local passwords across the stack with visibility dashboards and expiry alerts.
🚀 Simplified Operations
- Tag Management: Create, edit, and synchronize tags from a single interface with vMotion persistence.
- Configuration Management: Monitor and enforce desired state configurations across vCenters, including drift detection with GitOps-style workflows.
- Integrated Operations Suite:
  - Health & Diagnostics: Diagnose VCF stack issues with curated remediation steps and CVE exposure insights.
  - Log Management: Auto-collection, powerful querying, and alerting across VCF services.
  - Network Operations: Monitor health, traffic flow, and inventory for NSX-based components.
  - Storage Visibility: Unified view of vSAN health, deduplication, performance, and cluster status.
🧠 Compute & Kubernetes Enhancements
- Advanced Memory Tiering: Offloads cold data to NVMe, yielding a 40% improvement in server consolidation.
- Windows Container Support: Fully integrated Kubernetes Service now includes Windows containerization and OVF support.
- Direct Networking for Containers: Native VPC integration enables simplified and secure container networking.
🗄️ Storage & Networking Innovations
- vSAN Deep Snapshots & RPO: Achieve 1-minute RPO with native vSAN-to-vSAN protection.
- Global Deduplication: Reduces cost per TB by up to 46%.
- vSAN ESA Stretched Clusters: Ensures continuity across dual-site failure scenarios.
- NSX VPCs: Native VPCs simplify network design and accelerate VLAN-to-VPC transitions.
- Enhanced Data Path (EDP): NSX delivers up to 3x improved switching performance for high-throughput workloads.
Real Benefits, Real Impact
Area | Outcome |
---|---|
🔐 Security | Hardware-based isolation, federated identity |
⚙️ Operations | Drift detection, log observability |
📊 Visibility | Single-pane monitoring and compliance |
💾 Storage Costs | Up to 46% savings via deduplication |
📡 Network | 3x performance, simplified architecture |
Designed for Dual Personas
VCF 9 recognizes the distinct needs of Cloud Administrators and Platform Engineers, offering:
- Automation and infrastructure lifecycle management for operators.
- Consumption-ready, secure environments for developers and tenants.
Final Thoughts
With VCF 9, VMware pushes beyond incremental improvements. It represents a comprehensive redesign of how modern infrastructure should be deployed, secured, and operated. From identity to observability, from vSAN to NSX, and from SSO to GitOps—the platform has been reengineered for agility and resilience.
Stay tuned for more deep-dives and how-to guides once the embargo lifts after the official release this Tuesday.
Wednesday, May 28. 2025
New Certification: VMware Certified Professional - Private Cloud Security Administrator

What does the certification cover?
The VCP – Private Cloud Security Administrator validates your ability to secure a private cloud environment using:
- distributed and gateway firewalls
- advanced threat prevention
- security intelligence
- zero-trust architecture with VMware vDefend
How to earn the certification
The certification path includes two steps:
Step 1: Gain experience and complete training
- Take the VMware vDefend Security for VCF 5.x Administrator training course
- Gain hands-on experience with VMware vDefend in real-world scenarios
Step 2: Pass the certification exam
- Exam title: VMware vDefend Security for VCF 5.x Administrator (6V0-21.25)
- Duration: 90 minutes
- Number of questions: 75
- Passing score: 70%
- Format: multiple-choice and multiple-select
- Language: English
- Price: $250 USD
- Exam registration: available through the VMware certification portal
My experience
I took the exam last Friday and successfully passed it. It was a tough exam, featuring a number of nuanced and tricky questions that tested both theoretical knowledge and practical insight. Proper preparation is key — if you're planning to take the exam, I strongly recommend reviewing the NSX 4.x training again, even though “vDefend” is now the updated branding for NSX security features within VCF.
Also worth noting: the description on the Credly badge for this certification currently suggests that attending the training course alone is sufficient to earn the badge. This is incorrect — passing the certification exam is required. Furthermore, the training is only available to VMware partners and paying customers, making self-study and hands-on lab experience even more critical.
What’s new: Detailed Score Report
Another recent and useful addition is the provisional examination score report, which breaks down your performance per domain, based on the sections listed in the official exam guide. This provides valuable insight into your strengths and areas for improvement.
For example, my section scores included:
- VMware vDefend Firewall Architecture: 75%
- Lateral Protection with vDefend: 100%
- Security Automation: 50%
- IDPS & Threat Prevention: 83%
- Network Detection and Response (NTA/NDR): 78%
This detailed feedback makes the exam not just a validation but also a learning opportunity — helping professionals identify which NSX security components may need further study or practice. The Exam Guide contains the full list.
It’s also highly recommended to study the sample questions provided in the official exam guide very carefully — some of them (or very similar variants) may appear on the actual exam. Reviewing these questions can give you a clear idea of how VMware frames its scenarios and where the focus areas lie.
Tuesday, April 15. 2025
InfiniBand on VMware vSphere 8: Updated Setup and Performance Insights
With the increasing demand for high-performance networking in virtualized environments, configuring InfiniBand on VMware vSphere 8 has become a critical task for many IT teams. This blog explores the updated process and considerations for setting up InfiniBand using the latest tools and practices, providing insights into performance, SR-IOV configuration, and common troubleshooting scenarios.
The integration of InfiniBand on vSphere 8 has been streamlined with enhancements to the vSphere Lifecycle Manager and improved native driver support. For those familiar with vSphere 7, many procedures remain consistent, but vSphere 8 introduces some important changes worth noting.
The first step is to confirm that the native Mellanox driver is already present on the ESXi host. With vSphere 8, the driver comes pre-installed, and verification can be done using a simple esxcli command.
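As a rough illustration of that check, the sketch below filters the output of `esxcli software vib list` for entries that mention `nmlx`. The post only refers to "a simple esxcli command", so both the exact command and the `nmlx` name prefix are assumptions here, and the helper names are hypothetical.

```python
import subprocess


def find_mellanox_vibs(vib_list_output):
    """Return the lines of a VIB listing that mention the nmlx driver family.

    Assumption: the native Mellanox driver packages contain "nmlx" in their name.
    """
    return [
        line.strip()
        for line in vib_list_output.splitlines()
        if "nmlx" in line.lower()
    ]


def query_host_for_mellanox_vibs():
    """Run the listing on an ESXi host and filter it.

    Assumption: `esxcli software vib list` is the verification command;
    this function only works where esxcli is on the PATH.
    """
    result = subprocess.run(
        ["esxcli", "software", "vib", "list"],
        capture_output=True,
        text=True,
        check=True,
    )
    return find_mellanox_vibs(result.stdout)
```

An empty list from `query_host_for_mellanox_vibs()` would suggest the driver is not installed, or that it is packaged under a different name than assumed.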
Installing Mellanox Firmware Tools, specifically MFT and NMST, is now much easier thanks to vSphere Lifecycle Manager. Instead of deploying packages manually across hosts, admins can use Lifecycle Manager to import and remediate clusters efficiently. These packages can be downloaded from NVIDIA’s website, and after uploading them to the vSphere Client, the entire cluster can be updated with minimal downtime.
In some cases, InfiniBand cards may not be visible via the mst status command after installation. This can typically be resolved by putting the native Mellanox driver into recovery mode using specific esxcli module parameters, followed by a couple of host reboots.
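A minimal sketch of how such a recovery-mode command might be assembled is shown here. The post does not name the module or parameter, so the module name `nmlx5_core` and the parameter `mst_recovery` are assumptions, and `recovery_mode_command` is a hypothetical helper. The function only builds the argument list; it does not run anything.

```python
def recovery_mode_command(enable=True, module="nmlx5_core"):
    """Build an esxcli argument list that toggles driver recovery mode.

    Assumptions: the native driver module is "nmlx5_core" and recovery mode
    is controlled by a module parameter named "mst_recovery".
    """
    value = 1 if enable else 0
    return [
        "esxcli", "system", "module", "parameters", "set",
        "-m", module,
        "-p", "mst_recovery={}".format(value),
    ]


if __name__ == "__main__":
    # Print the command for review instead of executing it.
    print(" ".join(recovery_mode_command(enable=True)))
```

Printing the command first makes it easier to review before running it on a host. Any change to module parameters only takes effect after the reboots described above.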
Enabling SR-IOV is particularly relevant for workloads like large language model training that require multiple IB cards. A script using mlxconfig can be used to enable advanced PCI settings and configure virtual functions on each device. It's important to remember that in vSphere 8, InfiniBand VFs may appear as 'Down' in the UI, which is expected behavior. The actual link state should be verified at the VM level.
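The kind of script described above could look roughly like the following sketch, which generates one pair of `mlxconfig` invocations per card. The `mlxconfig` path, the firmware parameter names (`ADVANCED_PCI_SETTINGS`, `SRIOV_EN`, `NUM_OF_VFS`), and the device names are assumptions that are not taken from the post; device names in particular should be read from `mst status` on the actual host.

```python
def sriov_enable_commands(devices, num_vfs,
                          mlxconfig="/opt/mellanox/bin/mlxconfig"):
    """Build mlxconfig argument lists that enable SR-IOV on each device.

    Assumptions: mlxconfig lives at the given path, and the firmware exposes
    ADVANCED_PCI_SETTINGS, SRIOV_EN and NUM_OF_VFS as settable parameters.
    """
    if num_vfs < 1:
        raise ValueError("num_vfs must be at least 1")
    commands = []
    for device in devices:
        # Step 1: unlock the advanced PCI settings on the card.
        commands.append(
            [mlxconfig, "-d", device, "-y", "set", "ADVANCED_PCI_SETTINGS=1"]
        )
        # Step 2: enable SR-IOV and set the number of virtual functions.
        commands.append(
            [mlxconfig, "-d", device, "-y", "set",
             "SRIOV_EN=1", "NUM_OF_VFS={}".format(num_vfs)]
        )
    return commands


if __name__ == "__main__":
    # Hypothetical device names, for illustration only.
    for cmd in sriov_enable_commands(["mt4129_pciconf0", "mt4129_pciconf1"], 8):
        print(" ".join(cmd))
```

Generating the commands as lists keeps the per-card loop easy to review and lets the same helper cover hosts with any number of IB cards.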
For environments that continue to use PCI passthrough rather than SR-IOV, disabling SR-IOV can be done with a similar script that reverts the card settings and resets the ESXi module parameters.
On the switching side, administrators should ensure their IB switches are running an up-to-date MLNX-OS. Compatibility between switch firmware and adapter firmware is key to avoiding communication issues. Enabling Open-SM virtualization support on the switches is also critical to support SR-IOV functionality.
The technical paper summarized in this post outlines several behavioral differences between passthrough and SR-IOV. For instance, certain Open-SM utilities may not function when a virtual function is passed to a VM, which is normal. Likewise, mst status behavior may differ depending on whether a physical or virtual function is used.
Troubleshooting steps are also provided for cases where MLX cards fail to appear in the mst status output. These include reloading the appropriate kernel modules and signaling system processes, after which recovery mode is temporarily enabled until the next host reboot.
Performance tests revealed that unidirectional bandwidth reached 396.5 Gbps using four queue pairs, nearly saturating the theoretical line rate of the InfiniBand cards. Bidirectional bandwidth tests showed performance scaling up to 790 Gbps with two cards, confirming the setup’s ability to handle demanding HPC and AI workloads.
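To put the unidirectional figure in context, the short calculation below expresses measured bandwidth as a percentage of line rate. The post does not state the line rate, so the 400 Gbps per-card value used in the example is an assumption; under it, 396.5 Gbps works out to roughly 99% utilization.

```python
def line_rate_utilization(measured_gbps, line_rate_gbps):
    """Return measured bandwidth as a percentage of the theoretical line rate."""
    if line_rate_gbps <= 0:
        raise ValueError("line_rate_gbps must be positive")
    return 100.0 * measured_gbps / line_rate_gbps


if __name__ == "__main__":
    # Assumption: a 400 Gbps line rate per card (not stated in the post).
    print(line_rate_utilization(396.5, 400.0))
```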
In conclusion, VMware vSphere 8 enhances the experience of deploying InfiniBand by introducing automation through Lifecycle Manager and retaining robust performance tuning capabilities. With updated best practices, simplified installation, and clear guidance on SR-IOV and troubleshooting, IT teams can now fully leverage InfiniBand’s potential in virtualized environments, including VMware Cloud Foundation.
This blog captures the essence of the technical paper authored by Yuankun Fu, who has a strong background in HPC and AI performance optimization within VMware. His guidance in this paper provides both practical instructions and valuable performance data for teams looking to adopt or enhance InfiniBand in their vSphere environments.