Some milestones make you stop and think. Receiving the VMware vExpert 2026 award, for the eighteenth consecutive year, is one of them.
Eighteen years. That's a long time in any industry, but in the world of enterprise virtualization and cloud infrastructure, eighteen years feels like several lifetimes. When I received my first vExpert recognition, VMware ESX was still a thing, vSphere was just emerging, and the idea of software-defined networking was barely on the horizon. Today, the conversation is about VMware Cloud Foundation, NSX, automation pipelines, and multi-cloud architectures. The technology has changed enormously. The community spirit, fortunately, has not.
What the vExpert Program Actually Means
The vExpert award isn't a certification. It's not something you study for or pass an exam to earn. It's a recognition by VMware — now part of Broadcom — of ongoing, real-world contributions to the broader VMware community. That can take many forms: blogging, presenting at VMUG events, creating training content, helping peers troubleshoot complex issues, or simply being a consistent and reliable voice in the community.
For me, it has always been about sharing what I learn in the field. Enterprise environments are messy, complex, and full of edge cases that no documentation ever quite covers. Over the years, the things I've written about, the sessions I've delivered, and the conversations I've had with fellow engineers have all come from that same place: "I just solved something hard, and someone else will run into this too."
That spirit of practical knowledge sharing is exactly what the vExpert program recognizes, and it's why being part of this community still feels meaningful after nearly two decades.
Looking Back — and Forward
Reflecting on eighteen years in the vExpert program means reflecting on eighteen years of the industry itself. I've watched VMware grow from a virtualization powerhouse into a full-stack enterprise platform vendor. I've seen NSX go from a niche SDN product to the backbone of zero-trust network segmentation in major enterprises. I've lived through the shift from manual deployments to infrastructure-as-code, from on-premises-only thinking to hybrid and multi-cloud reality.
Through all of it, the core of my work has remained consistent: helping organizations make sense of complex VMware environments, whether that means designing a VCF deployment, troubleshooting an elusive NSX routing issue, or building automation that makes life easier for platform teams.
What has changed is the pace. The ecosystem moves faster now. Broadcom's acquisition of VMware brought significant changes to licensing, product bundling, and the roadmap — changes that many customers are still navigating. That makes community knowledge sharing even more important than it used to be. When the vendor documentation lags behind reality, the community fills the gap.
What's Coming in 2026
My focus for the year ahead will be squarely on VMware Cloud Foundation and NSX, with a strong emphasis on automation and practical troubleshooting. VCF has become the strategic platform for many of the organizations I work with, and there's enormous demand for content that goes beyond the marketing layer — content that addresses real deployment challenges, upgrade paths, integration considerations, and day-two operations.
I'll also continue contributing to training. One of the most rewarding parts of this work is seeing someone go from confused to confident on a technology they've been struggling with. If a blog post, video, or training session makes that happen for even a handful of engineers, it's time well spent.
Thank You
To the vExpert program team — Corey Romero and the rest of the community and advocacy team — thank you for continuing to run a program that genuinely values practitioners. And to everyone in the VMware community who reads, comments, shares, or simply nods along because you've been in the same situation: this is for you.
Here's to year eighteen, and to whatever challenges year nineteen will bring.
Broadcom recently published an updated technical paper on VMware vSAN Stretched Clusters, written by Pete Koehler (December 2025). This comprehensive guide covers everything you need to know about designing, deploying, and operating stretched clusters across two geographically separated sites using VMware Cloud Foundation (VCF) 9.0 with vSAN 9.
In this article, I summarize the key concepts, architecture decisions, and best practices from the whitepaper, and share my own thoughts on what matters most when planning a vSAN stretched cluster deployment.
You can download the full whitepaper from Broadcom here.
What Is a vSAN Stretched Cluster?
A vSAN stretched cluster extends a single vSAN cluster across two physical sites (referred to as the preferred and secondary fault domains), with a lightweight witness appliance running at a third location. This architecture provides site-level disaster recovery with near-zero RPO and automated failover using native vSphere HA — without requiring third-party replication software.
The key benefit is that data is synchronously mirrored between the two sites, meaning that if one site goes down completely, the surviving site has a full copy of all data and workloads can be restarted automatically.
Supported Architectures: OSA and ESA
vSAN stretched clusters are supported on both the Original Storage Architecture (OSA) and the Express Storage Architecture (ESA). However, there are important differences.
With ESA, the storage architecture is more efficient by design. ESA uses a single-tier storage pool (NVMe only), which simplifies disk group management and eliminates the caching tier complexity found in OSA. For stretched clusters specifically, ESA supports adaptive erasure coding within each site — meaning you can use RAID-5 or RAID-6 locally at each site while still mirroring data across sites. This combination provides excellent space efficiency without sacrificing resilience.
OSA stretched clusters still work well, but ESA is the recommended path forward for new deployments.
The Witness Appliance
The witness appliance is a critical component of any vSAN stretched cluster. It does not store actual VM data — instead, it holds witness components that act as a tiebreaker during network partitions or site failures. The witness must be deployed at a third location, separate from both data sites.
Key points about the witness:
It runs as a small virtual appliance (tiny, medium, or large, depending on the number of components)
It requires network connectivity to both data sites, but latency requirements are more relaxed (up to 200ms RTT to each site)
It should never run inside the stretched cluster itself
A single physical host at the third location can run the witness appliances for multiple stretched clusters, but each cluster needs its own witness appliance
The witness does not need high bandwidth — 100 Mbps is sufficient for most deployments
A common mistake is placing the witness at one of the data sites. This defeats the purpose of having a third fault domain and creates a single point of failure during site isolation events.
Network Requirements
Networking is arguably the most critical design consideration for vSAN stretched clusters. The two data sites must have low-latency, high-bandwidth connectivity between them.
The requirements are straightforward but non-negotiable. Between the two data sites, you need a maximum of 5ms RTT latency for vSAN traffic. A minimum of 10 Gbps bandwidth for vSAN traffic is required, though 25 Gbps is recommended. Between each data site and the witness, up to 200ms RTT latency is acceptable, and 100 Mbps bandwidth is sufficient.
vSAN traffic between sites should be on a dedicated or isolated network segment, and jumbo frames (MTU 9000) are highly recommended for optimal performance. For ESA deployments with RDMA, RoCE v2 must be supported within each site (RDMA is not used across sites).
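To make the thresholds above easier to apply during a design review, here is a minimal Python sketch that checks measured link values against them. The `Link` class, its field names, and the two check functions are my own illustration, not part of any vSAN tooling, and the measurements are assumed to come from whatever network testing you already do.

```python
from dataclasses import dataclass

# Thresholds as quoted in this article; adjust them if your design guide differs.
INTER_SITE_MAX_RTT_MS = 5.0
INTER_SITE_MIN_GBPS = 10.0
INTER_SITE_RECOMMENDED_GBPS = 25.0
WITNESS_MAX_RTT_MS = 200.0
WITNESS_MIN_MBPS = 100.0


@dataclass
class Link:
    """A measured link (hypothetical structure, not a vSAN API object)."""
    name: str
    rtt_ms: float
    bandwidth_mbps: float


def check_inter_site(link):
    """Return a list of findings for the link between the two data sites."""
    findings = []
    if link.rtt_ms > INTER_SITE_MAX_RTT_MS:
        findings.append(
            f"{link.name}: RTT {link.rtt_ms} ms exceeds the {INTER_SITE_MAX_RTT_MS} ms limit")
    gbps = link.bandwidth_mbps / 1000.0
    if gbps < INTER_SITE_MIN_GBPS:
        findings.append(
            f"{link.name}: {gbps} Gbps is below the {INTER_SITE_MIN_GBPS} Gbps minimum")
    elif gbps < INTER_SITE_RECOMMENDED_GBPS:
        findings.append(
            f"{link.name}: {gbps} Gbps meets the minimum but not the "
            f"recommended {INTER_SITE_RECOMMENDED_GBPS} Gbps")
    return findings


def check_witness(link):
    """Return a list of findings for a data-site-to-witness link."""
    findings = []
    if link.rtt_ms > WITNESS_MAX_RTT_MS:
        findings.append(
            f"{link.name}: RTT {link.rtt_ms} ms exceeds the {WITNESS_MAX_RTT_MS} ms limit")
    if link.bandwidth_mbps < WITNESS_MIN_MBPS:
        findings.append(
            f"{link.name}: {link.bandwidth_mbps} Mbps is below the {WITNESS_MIN_MBPS} Mbps minimum")
    return findings


# Example: a 10 Gbps inter-site link passes the minimum but triggers the advisory.
for line in check_inter_site(Link("site-a to site-b", rtt_ms=3.2, bandwidth_mbps=10_000)):
    print(line)
```

An empty list means the link meets every threshold. Keeping the 25 Gbps recommendation as a separate advisory, instead of a hard failure, mirrors the wording of the requirement.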
Fault Domains and Site Affinity
In a vSAN stretched cluster, you define two fault domains — one for each data site. Every ESXi host is assigned to either the preferred or secondary fault domain. The witness appliance operates as its own implicit fault domain.
When a VM is created, vSAN places one full copy of data at the preferred site and one full copy at the secondary site, plus a witness component on the witness appliance. This ensures that any single site failure still leaves a quorum of components available.
Site affinity is an important concept for workloads that should preferably run at a specific site. You can use VM-Host affinity rules in DRS to keep VMs at their designated site during normal operations, while still allowing failover to the other site during a disaster.
vSphere HA and DRS Configuration
Getting the HA and DRS settings right is essential for proper stretched cluster behavior.
For vSphere HA, the recommendation is to set the admission control policy to 50% for both CPU and memory. This reserves enough capacity at each site to absorb the full workload from the other site during a failure. Host monitoring should use vSAN network heartbeating, and the isolation response should be set to "Power off and restart VMs."
For DRS, the automation level should be set to "Fully Automated." Use VM-Host affinity rules (should rules, not must rules) to prefer VMs at their designated site. This allows DRS to override placement during a failover event. During normal operations, DRS will respect the affinity rules and keep VMs at their preferred site.
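The reasoning behind the 50% admission control setting can be shown with a little arithmetic. The sketch below assumes a symmetric cluster (the same number of identical hosts at each site) and treats capacity and demand as a single number such as GB of memory. The function and parameter names are hypothetical, and real admission control works on reservations and overheads that this ignores.

```python
def site_failure_check(hosts_per_site, capacity_per_host, total_demand,
                       reserved_fraction=0.5):
    """Compare what admission control admits with what one site can hold.

    Assumes both sites have identical hosts; all values share one unit.
    """
    site_capacity = hosts_per_site * capacity_per_host
    cluster_capacity = 2 * site_capacity
    # Admission control only lets workloads use the unreserved share.
    admitted_limit = cluster_capacity * (1 - reserved_fraction)
    return {
        "admitted": total_demand <= admitted_limit,
        "fits_on_one_site": total_demand <= site_capacity,
    }


# 4 hosts per site with 512 GB each, 1800 GB of total demand.
print(site_failure_check(4, 512, 1800))
# Same cluster with only 25% reserved and 2500 GB of demand.
print(site_failure_check(4, 512, 2500, reserved_fraction=0.25))
```

With 50% reserved on a symmetric cluster, the admitted limit equals the capacity of one site, so anything admission control accepts also fits on the surviving site. With a lower reservation, the second call shows a workload that is admitted but would not fit after a site failure.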
Storage Policies for Stretched Clusters
vSAN stretched clusters use a specific storage policy setting: the site disaster tolerance policy. This is set separately from the standard FTT (failures to tolerate) setting.
The site disaster tolerance defines how data is mirrored across sites. The most common configuration is "Site mirroring" (also called dual site mirroring), which creates one full copy at each data site.
Within each site, you can additionally configure local FTT protection. For example, on ESA you can set FTT=1 with RAID-5 at each site, combined with site mirroring across sites. This means data is protected against both a full site failure and an additional host failure at the surviving site.
This layered protection model is one of the major advantages of vSAN stretched clusters — you get both local resilience and site-level disaster recovery in a single, unified storage policy.
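A rough way to see what the layered policy costs in raw capacity is to multiply the site mirroring factor by the local protection overhead. The sketch below is a back-of-the-envelope estimate only. The overhead factors are my assumptions based on the commonly described layouts (RAID-1 mirror, ESA RAID-5 as 2+1 or 4+1, RAID-6 as 4+2), and the scheme names are labels I made up, not policy identifiers.

```python
# Assumed raw-to-usable multipliers for local (per-site) protection.
LOCAL_OVERHEAD = {
    "none": 1.0,          # no additional protection inside the site
    "raid1_ftt1": 2.0,    # two full copies inside the site
    "raid5_2+1": 1.5,     # 2 data + 1 parity
    "raid5_4+1": 1.25,    # 4 data + 1 parity
    "raid6_4+2": 1.5,     # 4 data + 2 parity
}


def raw_capacity_gb(usable_gb, local_scheme, site_mirroring=True):
    """Estimate the raw capacity consumed across the cluster for usable_gb of data."""
    copies = 2 if site_mirroring else 1  # site mirroring keeps one copy per site
    return usable_gb * copies * LOCAL_OVERHEAD[local_scheme]


# 1000 GB of VM data with site mirroring plus RAID-5 (4+1) at each site.
print(raw_capacity_gb(1000, "raid5_4+1"))
# The same data with site mirroring plus a local RAID-1 mirror.
print(raw_capacity_gb(1000, "raid1_ftt1"))
```

The estimate ignores metadata, slack space, and witness components, so treat it as a first-pass comparison between schemes and not as a sizing tool.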
Maintenance and Day-2 Operations
Operating a stretched cluster requires understanding the impact of maintenance activities at each site.
When placing a host in maintenance mode, you should choose "Ensure accessibility" for routine maintenance tasks. This avoids unnecessary data migrations between sites. For permanent host removal, use "Full data migration" to evacuate all components.
During a planned site maintenance event (such as power maintenance at one data center), you can use the "Decommission Fault Domain" workflow. This gracefully migrates all VMs to the other site and ensures data integrity before the site goes offline.
Lifecycle management through VCF SDDC Manager handles firmware and software upgrades in a rolling fashion, maintaining availability throughout the update process.
Failure Scenarios and Recovery
The whitepaper covers multiple failure scenarios in detail. Here are the most important ones.
In a single host failure, VMs are restarted on remaining hosts at the same site. vSAN rebuilds missing components using hosts at the same site if possible.
In a full site failure, vSphere HA restarts all affected VMs at the surviving site. Because a full data copy exists at the surviving site, VMs can restart immediately without waiting for data rebuilds.
In a network partition between sites, the preferred site retains quorum (because it has data + witness), and VMs at the secondary site are powered off and restarted at the preferred site. This is why the "preferred" designation matters.
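If the witness is isolated, both data sites continue operating normally and no VMs are affected. The witness only acts as a tiebreaker during site-level events, although until it returns the cluster cannot tolerate an additional site failure.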
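The site-level scenarios above all come down to which group of fault domains still holds a majority. The following sketch is a deliberately simplified model that gives each of the three fault domains one vote. Real vSAN assigns votes per object component, so this illustrates the principle and not the actual algorithm, and the function name is my own.

```python
FAULT_DOMAINS = ("preferred", "secondary", "witness")


def accessible_sites(partitions):
    """Return the data sites that keep their objects accessible.

    partitions: list of sets, each holding the fault domains that can still
    reach each other. Failed fault domains are simply left out.
    """
    result = set()
    for group in partitions:
        # A group needs more than half of all votes to keep quorum.
        if len(group) * 2 > len(FAULT_DOMAINS):
            result |= {fd for fd in group if fd != "witness"}
    return sorted(result)


scenarios = {
    "healthy": [{"preferred", "secondary", "witness"}],
    "secondary site failure": [{"preferred", "witness"}],
    "inter-site partition": [{"preferred", "witness"}, {"secondary"}],
    "witness isolation": [{"preferred", "secondary"}, {"witness"}],
    "site failure while witness is down": [{"preferred"}],
}

for name, partitions in scenarios.items():
    print(f"{name}: {accessible_sites(partitions)}")
```

The last scenario is the one worth remembering: a single surviving data site without the witness has only one of three votes, which is why a witness outage should be repaired promptly even though it causes no immediate impact.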
My Recommendations
Based on my experience with vSAN stretched clusters, here are a few practical recommendations.
First, invest in network quality. The number one cause of stretched cluster issues is network problems between sites. Ensure you have redundant, low-latency links with proper QoS for vSAN traffic.
Second, use ESA if you are deploying new infrastructure. The performance and efficiency benefits of ESA are significant, especially the ability to use erasure coding within each site.
Third, test your failure scenarios. Before going into production, simulate site failures, network partitions, and witness outages. Verify that HA and DRS behave as expected.
Fourth, document your affinity rules. Keep a clear mapping of which VMs belong to which site, and review this regularly as workloads change.
Finally, size your witness appropriately. For large environments with many VMs, use the large witness appliance to handle the additional component count.
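The fourth recommendation, documenting affinity rules, is easier to keep up if the documented mapping can be compared against reality. Here is a minimal sketch of such a comparison. Both dictionaries are hypothetical inputs (for example from a spreadsheet and an inventory export), and the VM and site names are made up.

```python
def find_drift(designated, current):
    """Compare the documented site per VM with where each VM currently runs.

    designated: dict of VM name -> documented site
    current:    dict of VM name -> site the VM is running at right now
    Returns (drift, undocumented): drift maps VM -> (documented, actual),
    undocumented lists running VMs that are missing from the documentation.
    """
    drift = {}
    for vm, site in designated.items():
        running_at = current.get(vm)
        if running_at is not None and running_at != site:
            drift[vm] = (site, running_at)
    undocumented = sorted(set(current) - set(designated))
    return drift, undocumented


designated = {"app01": "site-a", "db01": "site-a", "web01": "site-b"}
current = {"app01": "site-a", "db01": "site-b", "web01": "site-b", "test99": "site-a"}

drift, undocumented = find_drift(designated, current)
print("Running away from documented site:", drift)
print("Not documented:", undocumented)
```

Because should rules allow DRS and HA to move VMs during a failover, some drift after an incident is expected. The point of the check is to notice when it persists.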
Conclusion
vSAN stretched clusters remain one of the most elegant solutions for site-level disaster recovery in a VMware environment. With VCF 9.0 and vSAN 9, the technology has matured significantly — particularly with ESA bringing better performance and simpler operations.
The official Broadcom whitepaper is an excellent resource for anyone planning or operating a stretched cluster. I highly recommend reading it in full.
Broadcom has announced three specialized VMware Certified Advanced Professional (VCAP) certifications for VMware Cloud Foundation, coming very soon. These new credentials - Administrator, Architect, and Support - validate practical, role-specific expertise in full-stack private cloud operations.
As organizations consolidate infrastructure investments around integrated platforms, VCF has become the strategic foundation for enterprise workloads. These certification paths reflect this shift, moving beyond component-level knowledge toward holistic platform competency.
The New VCAP-VCF Certification Trio
Broadcom has announced three distinct advanced professional certifications, each targeting a specific operational role within VCF environments:
VMware Certified Advanced Professional – VMware Cloud Foundation Administrator (3V0-11.26)
This certification validates the skills required to manage day-to-day operations across the VCF stack. Administrators holding this credential demonstrate proficiency in lifecycle management, workload domain operations, automation workflows, and platform monitoring. The exam focuses on operational tasks that keep VCF environments running efficiently—from deploying new services to managing capacity and ensuring compliance with organizational policies.
VMware Certified Advanced Professional – VMware Cloud Foundation Architect (3V0-12.26)
The Architect track addresses design and strategic planning for VCF deployments. This certification confirms expertise in architecting scalable, resilient private cloud infrastructures that align with business requirements. Architects are expected to make informed decisions about workload domain strategies, network segmentation, storage architecture, and disaster recovery design. The exam emphasizes design principles, solution sizing, and architectural patterns specific to VCF environments.
VMware Certified Advanced Professional – VMware Cloud Foundation Support (3V0-13.26)
The Support certification is tailored for professionals responsible for troubleshooting and maintaining VCF platforms. This credential validates hands-on diagnostic skills, performance optimization techniques, and root cause analysis capabilities. Support engineers need deep technical knowledge of VCF components, log analysis, and remediation procedures to maintain service level objectives and resolve complex incidents efficiently.
Availability
All three exams are listed as "coming very soon" on the Broadcom certification website, with registration expected to open in the near future. Each exam follows a consistent format: 60 questions delivered over 135 minutes, with a passing score of 300 on a scaled scoring system. Each exam costs $250 USD and will be proctored through Pearson VUE.
Why These Certifications Matter
Role-specific VCF certifications reflect a fundamental shift from managing individual components to operating integrated platforms. VCF professionals need full-stack understanding—how vSphere compute, NSX networking, vSAN storage, and automation interact as a unified system.
Each exam covers common technical foundations but evaluates them differently: administrators on operational execution, architects on design decisions, and support engineers on diagnostic methodologies. This structure aligns certification with real-world responsibilities.
For enterprises, these credentials provide hiring benchmarks and skills validation. As VCF adoption accelerates in sovereign cloud and security-focused deployments, organizations need verified expertise. The VCAP-VCF certifications deliver that assurance.
Role-Specific Expertise Breakdown
Administrator Track: Operational Excellence
The Administrator certification validates day-2 operations: workload domain lifecycle management, certificate rotation, password management, and platform upgrades. Administrators demonstrate competency with VCF Operations monitoring, diagnostic findings, and health checks.
Automation capabilities are critical—VCF environments operate through declarative APIs and workflows. The exam tests practical knowledge of automation frameworks, API interactions, and orchestration tool integration. Network and storage operations, including network pool management, distributed switch configuration, and vSAN health monitoring, complete the operational skillset.
Support Track: Diagnostic Mastery
The Support certification emphasizes troubleshooting methodology across compute, network, storage, and platform services. Support engineers must master VCF log structures, diagnostic tools, and common failure patterns.
Performance optimization extends beyond component-level analysis to understanding integrated stack behavior under load. Root cause analysis skills—correlating symptoms across VCF layers, interpreting logs, and applying structured troubleshooting—are central to this track. The exam validates when issues require internal resolution versus vendor escalation.
Architect Track: Strategic Design
The Architect certification validates design decisions impacting VCF deployment success: workload domain models, network segmentation strategies, and scalability planning. Architects balance technical constraints with business requirements, making informed trade-offs between performance, cost, and operational complexity.
Resilience and disaster recovery design are substantial exam components, including high availability mechanisms, backup strategies, and multi-site deployment patterns. Capacity planning—forecasting growth, cluster sizing, and workload distribution—completes the architectural skillset.
Market Impact and Industry Relevance
These certifications align with organizations consolidating infrastructure investments toward integrated private cloud platforms. VCF addresses this need through a validated stack bundling compute, network, storage, and management.
Sovereign cloud initiatives drive VCF adoption in regions with data residency requirements. Government agencies and regulated industries need private cloud solutions offering public cloud automation with complete infrastructure control. Security-focused deployments leverage VCF's integrated security posture, including NSX micro-segmentation and vSAN encryption, combined with centralized lifecycle management.
Large-scale deployments benefit from VCF's operational scalability through declarative APIs and automated lifecycle management—critical for organizations managing hundreds or thousands of workloads.
Who Should Pursue These Certifications
These VCAP certifications target mid to senior-level professionals with hands-on VCF experience. The exams assume foundational knowledge validated by VCP-VCF credentials and test advanced, practical skills developed through real-world platform management.
Administrators currently managing VCF environments should consider the Administrator track to validate their operational expertise. This certification confirms proficiency with the daily tasks required to maintain VCF deployments and positions professionals for senior operational roles.
Engineers responsible for troubleshooting and maintaining VCF platforms benefit from the Support certification. This credential demonstrates diagnostic capabilities and technical problem-solving skills valued in support engineering and site reliability engineering positions.
Architects designing private cloud solutions should pursue the Architect track. This certification validates strategic design skills and positions professionals for senior architecture roles, pre-sales engineering, and consulting engagements.
For organizations, investing in certified staff reduces operational risk and improves platform reliability. Teams with validated VCF expertise deliver better outcomes, from smoother deployments to faster incident resolution.
Looking Forward
These specialized VCAP certifications signal Broadcom's commitment to VCF as the strategic private cloud platform. As VCF evolves with enhanced automation and deeper Tanzu and Aria integration, certification paths will advance alongside the platform.
For VMware professionals, these credentials differentiate expertise in a consolidating market. The shift from product-specific to integrated platform certifications reflects infrastructure's evolution toward full-stack requirements. As private cloud adoption scales, demand for certified professionals grows correspondingly.
The investment extends beyond credential acquisition—preparation builds capabilities translating directly to improved performance. The certification validates expertise, but real value lies in skills developed through the process.
I had the absolute honor of hosting this week's knowledge transfer sessions and mock exams for the VCF Professional Services Airlift event at the Sheraton Amsterdam Airport Hotel (Feb 2-6, 2026).
It was truly inspiring to facilitate the learning journey for so many dedicated professionals as they prepared for their VMware Cloud Foundation certifications. The energy, commitment, and expertise in the room were extraordinary!
I'm incredibly proud of all participants who successfully earned their VCF certifications this week - whether it was the VCF Administrator VCP, the VCF Architect VCP or VCAP. Your dedication and hard work paid off! 🎉
The week covered:
VMware Cloud Foundation: Build, Manage and Secure
VMware Cloud Foundation: Automate and Operate
VMware Cloud Foundation: Solution Architecture & Design
On a personal note, I'm thrilled to share that I also achieved two significant milestones myself during this week:
✅ VCAP - Storage
✅ VCAP - Networking