Azure VMware Solution datastore performance considerations for Azure NetApp Files


This article provides performance considerations for Azure VMware Solution (AVS) datastore design and sizing when used with Azure NetApp Files. This content is intended for virtualization administrators, cloud architects, or storage architects.

The considerations outlined in this article can help you achieve the highest levels of performance from your applications at an optimized cost.

Azure NetApp Files provides highly reliable, high-performance, and instantly scalable storage services for AVS. Testing included many different configurations between AVS and Azure NetApp Files. These tests were able to generate over 10,500 MiB/s and over 585,000 input/output operations per second (IOPS) using just four AVS/ESXi hosts and an Azure NetApp Files capacity pool.

Achieving higher storage performance for AVS with Azure NetApp Files

Provisioning multiple, possibly larger, datastores in a single service tier can reduce costs while providing higher performance. The reason is load sharing across multiple TCP streams from each AVS host to the multiple datastores. You can use the Azure NetApp Files for Azure VMware Solution TCO Estimator to estimate potential cost savings by uploading an RVTools report or by entering average VM sizes manually.

When deciding how to configure datastores, the easiest solution from an administrative perspective is to create a single Azure NetApp Files datastore, mount it, and place all of the virtual machines on it. This strategy works well in many situations until more throughput or IOPS is required. To identify the different thresholds, the tests used a synthetic workload generator, fio, to evaluate a range of workloads for each scenario. This analysis can help you determine how to configure Azure NetApp Files volumes as datastores to maximize performance and optimize cost.

Before you start

For Azure NetApp Files performance data, see:

  • Azure NetApp Files: How to get the most out of cloud storage

    A single network flow (TCP connection) is created per NFS datastore on the AVS host, akin to using nconnect=1 in the Linux mount options discussed in that document's tuning section. This fact is key to understanding how well AVS scales performance across multiple datastores.

  • Azure NetApp Files datastore performance benchmarks for Azure VMware Solution

Test Methods

This section describes the methods used for testing.

Test scripts and iterations

This testing follows the "four corners" methodology, which covers read and write operations for both sequential and random input/output (I/O). Test variables included one-to-many AVS hosts, Azure NetApp Files datastores, VMs (per host), and VM disks (VMDKs) per VM. The following scaling data points were chosen to find the maximum throughput and IOPS for a given scenario:

  • Scaling VMDKs, each on its own datastore, for a single virtual machine.
  • Scaling the number of virtual machines per host on a single Azure NetApp Files datastore.
  • Scaling the number of AVS hosts, each with one virtual machine sharing a single Azure NetApp Files datastore.
  • Scaling the number of Azure NetApp Files datastores, with VMDKs spread evenly across the AVS hosts.

Testing both small and large block operations, and iterating through sequential and random workloads, ensures that all components in the compute, storage, and network stacks are tested "to the edge". To cover the four corners with block size and randomization, the following common combinations are used (the fio sketch after this list illustrates them):

  • 64 KB sequential test
    • Large-file streaming workloads typically read and write in large block sizes, and 64 KB is also the default MSSQL extent size.
    • Large block tests generally yield the highest throughput (in MiB/s).
  • 8 KB random test
    • This setting is a commonly used block size for database software, including software from Microsoft, Oracle, and PostgreSQL.
    • Small block tests typically yield the highest number of IOPS.
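
The following is a minimal sketch of how such a four-corners run might be driven with fio from one of the guest VMs. The target path, file size, runtime, and queue depth shown here are illustrative assumptions, not the exact parameters used in the tests.

```
#!/usr/bin/env bash
# Illustrative four-corners fio runs against a test VMDK mounted at /mnt/data.
# TARGET, size, runtime, iodepth, and numjobs are assumptions for this sketch.
TARGET=/mnt/data/fio.bin

# 64 KB sequential write and read (throughput-oriented)
for RW in write read; do
  fio --name=seq64k-$RW --filename=$TARGET --size=32G \
      --rw=$RW --bs=64k --ioengine=libaio --direct=1 \
      --iodepth=32 --numjobs=4 --time_based --runtime=300 --group_reporting
done

# 8 KB random write and read (IOPS-oriented)
for RW in randwrite randread; do
  fio --name=rand8k-$RW --filename=$TARGET --size=32G \
      --rw=$RW --bs=8k --ioengine=libaio --direct=1 \
      --iodepth=32 --numjobs=4 --time_based --runtime=300 --group_reporting
done
```

Mixed %read/%write points, such as those referenced in the results below, can be produced with fio's rw=randrw and rwmixread options.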

Note

This article covers testing for Azure NetApp Files only. It does not cover vSAN storage that comes with AVS.

Environment details

The results in this article were achieved using the following environment configuration:

  • AVS host:
    • Size: AV36
    • Number of hosts: 4
    • VMware ESXi version 7u3
  • AVS private cloud connectivity: UltraPerformance gateway with FastPath
  • Guest VM:
    • OS: Ubuntu 20.04
    • CPU/Memory: 16 vCPU/64 GB memory
    • LSI Logic SAS virtual SCSI controller with a 16 GB OS disk on the AVS vSAN datastore
    • Paravirtual SCSI controller for the test VMDKs
    • LVM/disk configuration (see the sketch after this list):
      • One physical volume per disk
      • One volume group per physical volume
      • One logical volume per volume group
      • One XFS file system per logical partition
  • Azure NetApp Files datastore protocol: NFS version 3
  • Workload generator: fio version 3.16
  • Broken No.:Cable Analyzer
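
For reference, the guest disk layout described above can be reproduced with standard LVM and XFS tooling. The device and mount point names below (/dev/sdb and /mnt/data1) are assumptions for this sketch, not the exact names used in the test environment.

```
# Illustrative layout for one test VMDK; repeat per additional disk.
pvcreate /dev/sdb                           # one physical volume per disk
vgcreate vg_data1 /dev/sdb                  # one volume group per physical volume
lvcreate -l 100%FREE -n lv_data1 vg_data1   # one logical volume per volume group
mkfs.xfs /dev/vg_data1/lv_data1             # one XFS file system per logical volume
mkdir -p /mnt/data1
mount /dev/vg_data1/lv_data1 /mnt/data1
```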

Test results

This section describes the results of the tests performed.

Single VM Scaling

When configuring the storage provided by a datastore in an AVS virtual machine, the impact of the file system layout must be considered. Configuring multiple VMDKs spread across multiple datastores provides the most available bandwidth. Configuring one-to-many VMDKs residing on a single datastore ensures maximum simplicity for backup and disaster recovery operations, at the expense of a lower performance ceiling. This article provides empirical data to help you make your decision.

To maximize performance, it is common to scale a single VM across multiple VMDKs and spread those VMDKs across multiple datastores. A single VM with only one or two VMDKs can be throttled by a single NFS datastore, because the datastore is attached over a single TCP connection to a given AVS host.

For example, engineers often provision one VMDK for database logs and one or more VMDKs for database files. With multiple VMDKs, there are two options. The first option is to use each VMDK as a separate file system. The second option is to use a storage management utility, such as LVM, MSSQL filegroups, or Oracle ASM, to balance I/O by striping across the VMDKs. When VMDKs are used as separate file systems, distributing workloads across multiple datastores is a manual effort and can be cumbersome. Using a storage management utility to spread files across the VMDKs is what enables workload scalability.
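
As an illustration of the second option, a striped logical volume can be created across several VMDKs with LVM. The device names and the stripe size below are assumptions for this sketch, not values prescribed by the tests.

```
# Illustrative LVM striping across four VMDKs (assumed to appear as /dev/sd[b-e]).
pvcreate /dev/sdb /dev/sdc /dev/sdd /dev/sde
vgcreate vg_db /dev/sdb /dev/sdc /dev/sdd /dev/sde
# -i 4 stripes the logical volume across all four physical volumes; the 64 KB
# stripe size (-I 64) is an assumption chosen to match the large-block test size.
lvcreate -i 4 -I 64 -l 100%FREE -n lv_db vg_db
mkfs.xfs /dev/vg_db/lv_db
mkdir -p /mnt/db
mount /dev/vg_db/lv_db /mnt/db
```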

If you split a volume across multiple disks, make sure your backup or disaster recovery software supports backing up multiple virtual disks at the same time. Because a single write may be spread across multiple disks, the file system must ensure that the disks are "frozen" during backup or snapshot operations. Most modern file systems include a freeze or snapshot feature, such as xfs (xfs_freeze) and NTFS (volume shadow copies), which backup software can take advantage of.
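
As a minimal sketch of such a freeze window on an XFS file system (the mount point and the snapshot step are placeholders; in practice your backup tool drives this for you):

```
# Quiesce the XFS file system before the snapshot, then thaw it afterwards.
# /mnt/db is an assumed mount point; "take-snapshot" stands in for your backup tool.
xfs_freeze -f /mnt/db     # freeze: flush and block new writes
take-snapshot /mnt/db     # placeholder for the actual snapshot/backup action
xfs_freeze -u /mnt/db     # unfreeze: resume writes
```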

To see how well a single AVS virtual machine scales as more virtual disks are added, tests were performed with one, two, four, and eight datastores (each containing one VMDK). The chart below shows that a single disk averaged approximately 73,040 IOPS (across the workload mixes from 100% write/0% read to 0% write/100% read). When the test was increased to two drives, performance improved by 75.8% to 128,420 IOPS. Increasing to four drives began to show diminishing returns relative to what a single virtual machine at this test scale could drive. The observed peak was 147,000 IOPS with 100% random reads.

Single host scaling - single datastore

Increasing the number of VMs that drive I/O from a single host to a single datastore scales poorly. The reason is the single network flow: maximum performance for a given workload is usually bounded by the single queue over the single TCP connection between the host and its NFS datastore. At an 8 KB block size, scaling from one VM with a single VMDK to four VMs with a total of 16 VMDKs (four per VM, all on one datastore) increased total IOPS by only 3-16%.

Increasing the block size to 64 KB for large-block workloads had similar results, peaking at 2,148 MiB/s (single VM, one VMDK) and 2,138 MiB/s (four VMs, 16 VMDKs).

Single host scaling - multiple datastores

In the context of a single AVS host, while a single datastore allowed a virtual machine to drive approximately 76,000 IOPS, spreading the workload across two datastores increased overall performance by an average of 76%. Going from two to four datastores resulted in a 163% increase over one datastore (a 49% increase from two to four), as shown in the chart below. Although there were still performance gains, deploying more than eight datastores showed diminishing returns.

Multi-host scaling - single datastore

A single datastore driven from a single host yields 64 KB sequential throughput of over 2,000 MiB/s. Distributing the same workload across all four hosts yielded a gain of 135%, driving over 5,000 MiB/s. This result likely represents the upper limit of throughput for a single Azure NetApp Files volume.

Reducing the block size from 64 KB to 8 KB and repeating the same iteration, the four virtual machines yielded 195,000 IOPS, as shown in the figure below. Performance scales with both the number of hosts and the number of datastores, because the number of network flows increases. Performance scales by the number of hosts multiplied by the number of datastores, since a network flow exists per datastore per host (for example, four hosts each mounting four datastores create 16 network flows).

Multi-host scaling - multiple datastores

A single datastore with four virtual machines spread across four hosts generated over 5,000 MiB/s of 64 KB sequential I/O. For more demanding workloads, each virtual machine was moved to a dedicated datastore, generating a total of over 10,500 MiB/s, as shown in the figure below.

For random small-block workloads, a single datastore yielded 195,000 random 8 KB IOPS. Scaling across four datastores resulted in over 530,000 random 8 KB IOPS.

Impact and Recommendations

This section explains why distributing virtual machines across multiple datastores has significant performance benefits.

As the test results show, Azure NetApp Files performance is plentiful:

  • Testing shows that, on average, a datastore can drive ~148,980 8 KB IOPS or ~4,147 MiB/s with 64 KB I/O in the four-host configuration (averaged across all %write/%read tests).
  • One VM on one datastore:
    • If individual virtual machines need more than ~75,000 8 KB IOPS or more than ~1,700 MiB/s, spread their file systems across multiple VMDKs to scale virtual machine storage performance.
  • One VM across multiple datastores: A single virtual machine spanning eight datastores drove ~147,000 8 KB IOPS or ~2,786 MiB/s at a 64 KB block size.
  • One host: Each host can support ~198,060 8 KB IOPS or ~2,351 MiB/s when at least four Azure NetApp Files datastores are used with at least four virtual machines per host. You can therefore balance provisioning enough datastores for potential burst peak performance against cost and management complexity.

Recommendations

When the performance of a single datastore is not enough, spread your virtual machines across multiple datastores for further scaling. Simple is usually best, but performance and scalability may justify the extra but limited complexity.

Four Azure NetApp Files datastores provide up to 10 GBps of usable bandwidth for large sequential I/O, or the ability to drive up to 500,000 random 8 KB IOPS. For the best performance, start with at least four datastores.

For fine-grained performance tuning, both Windows and Linux guest operating systems support striping across multiple disks. Therefore, you should stripe file systems across multiple VMDKs spread across multiple datastores. However, if application snapshot consistency is a concern and can't be overcome with LVM or Storage Spaces, consider mounting Azure NetApp Files from the guest OS, or explore application-level scale-out, for which Azure has many good options.
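
As a minimal sketch of mounting an Azure NetApp Files volume directly from a Linux guest, the volume IP, export path, mount point, and mount options below are placeholders; see the linked Linux NFS mount options article for the recommended options.

```
# Mount an Azure NetApp Files NFS volume directly inside the guest OS.
# 10.0.0.4:/testvol and /mnt/anf are placeholder values for this sketch.
sudo mkdir -p /mnt/anf
sudo mount -t nfs -o rw,hard,vers=3,rsize=262144,wsize=262144,tcp \
  10.0.0.4:/testvol /mnt/anf
# On kernels that support it, adding nconnect=<n> to the options opens multiple
# TCP connections per mount (the option referenced earlier in this article).
```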

If you split a volume across multiple disks, make sure your backup or disaster recovery software supports backing up multiple virtual disks at the same time. Because a single write is spread across multiple disks, the file system must ensure that the disks are "frozen" during backup or snapshot operations. Most modern file systems include a freeze or snapshot feature, such as xfs (xfs_freeze) and NTFS (volume shadow copies), which backup software can take advantage of.

Because Azure NetApp Files bills on the provisioned capacity of the capacity pool rather than on allocated capacity (datastores), you pay the same for, say, 4 x 20 TB datastores or 20 x 4 TB datastores. If needed, datastore capacity and performance can be adjusted on demand, dynamically, through the Azure API or console.
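
A minimal sketch of such an on-demand adjustment with the Azure CLI is shown below. The resource group, account, pool, and volume names are placeholders, and the size units should be verified against the current CLI behavior before use.

```
# Grow a datastore volume on demand; --usage-threshold is specified in GiB here.
az netappfiles volume update \
  --resource-group myRG --account-name myAnfAccount \
  --pool-name myPool --name datastore1 \
  --usage-threshold 20480

# Grow the capacity pool that backs the datastores (pool size in TiB).
az netappfiles pool update \
  --resource-group myRG --account-name myAnfAccount \
  --pool-name myPool --size 40
```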

For example, suppose that as the end of the fiscal year approaches, you find that a set of datastores needs more storage performance. You can increase the service level of those datastores for a month to give all the virtual machines on them more available performance, while keeping other datastores at a lower service level. You not only save cost, but also gain performance by having the workload spread across more TCP connections between each datastore and each AVS host.
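
One way such a temporary service-level change can be scripted is by moving the datastore volume to a capacity pool at a higher service level and back again. The sketch below assumes a destination pool at the higher service level already exists; resource names are placeholders, and the command and parameter spelling should be verified with the Azure CLI help before use.

```
# Move a datastore volume to a higher service level pool for a period of time.
# The destination pool resource ID below is a placeholder for this sketch.
az netappfiles volume pool-change \
  --resource-group myRG --account-name myAnfAccount \
  --pool-name myStandardPool --name datastore1 \
  --new-pool-resource-id "/subscriptions/<sub-id>/resourceGroups/myRG/providers/Microsoft.NetApp/netAppAccounts/myAnfAccount/capacityPools/myPremiumPool"
```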

You can monitor your datastore metrics through vCenter or through the Azure portal/API. From vCenter, you can monitor the datastore's aggregate average IOPS in the Performance/Advanced charts, as long as you enable Storage I/O Control statistics collection on the datastore. The Azure API and portal expose volume metrics such as WriteIops, ReadIops, ReadThroughput, and WriteThroughput, among others, to measure your workload at the datastore level. With Azure metric alerts, you can define alert rules and actions to resize datastores automatically through Azure Functions, webhooks, or other actions.
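
A minimal sketch of pulling one of these volume metrics with the Azure CLI is shown below. The volume resource ID is a placeholder, and the metric names should be confirmed against the current Azure NetApp Files metric definitions.

```
# Query a datastore volume's ReadIops metric; swap in WriteIops, ReadThroughput,
# or WriteThroughput as needed. The resource ID is a placeholder for this sketch.
VOLUME_ID="/subscriptions/<sub-id>/resourceGroups/myRG/providers/Microsoft.NetApp/netAppAccounts/myAnfAccount/capacityPools/myPool/volumes/datastore1"

az monitor metrics list \
  --resource "$VOLUME_ID" \
  --metric ReadIops \
  --interval PT5M \
  --output table
```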

Next steps

  • Disk striping in Azure
  • Create a striped volume on Windows Server
  • Azure VMware Solution storage concepts
  • Attach an Azure NetApp Files datastore to an Azure VMware Solution host
  • Attach Azure NetApp Files to an Azure VMware Solution virtual machine
  • Performance issues with Azure NetApp Files
  • Linux NFS mount options for Azure NetApp Files best practices
