Nutanix Announces Xtract for VMs, Simplifying the Migration to AHV

Today Nutanix announced a new product called “Xtract for VMs”, a tool that simplifies migration from other hypervisors (currently ESXi only) to Nutanix AHV.

While several options already exist for migrating from ESXi to AHV, such as in-place cluster conversion, Cross-Hypervisor DR (if both source and destination are Nutanix clusters), or a more manual svMotion/import, Xtract for VMs provides a prescriptive and controlled approach for moving workloads from ESXi to AHV.

Xtract for VMs uses a simple “one click” wizard approach to target one or more VMs for migration.  To migrate a VM, a “Migration Plan” is created using a simple wizard, and the following criteria are configured:

  • Select one or more VMs (batch processing for efficiency)
  • Specify guest operating system credentials (to install AHV device drivers)
  • Specify network mappings to retain network configuration (correlate the source network in vSphere to destination network in AHV)
  • Specify a migration schedule, if required, to seed data in advance


When a VM is configured for migration, a copy of the source ESXi VM is created in AHV, and then any changes to the source VM are synchronized to the destination VM up until the point of cutover.  Downtime for the cutover is minimized and only incurred when the source ESXi VM is powered down and the destination AHV VM is powered up.  The source VM is left on the ESXi host in an unmodified state so that it can be reverted to if an issue is encountered during testing.  Migration can be paused, resumed, or canceled at any time.
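The seed-then-delta-sync-then-cutover flow described above isn’t unique to Xtract, and the general pattern can be sketched with ordinary shell commands, using plain files in temp directories standing in for virtual disks.  This is purely illustrative and is not Xtract’s actual replication mechanism (Xtract ships only changed data rather than re-copying everything):

```shell
# Illustrative sketch of the seed -> delta sync -> cutover pattern.
# Plain files stand in for virtual disks; this is NOT Xtract's implementation.
set -eu
src=$(mktemp -d)   # source "ESXi" side
dst=$(mktemp -d)   # destination "AHV" side

echo "base disk contents" > "$src/disk0"            # the running source VM's disk

cp -p "$src/disk0" "$dst/disk0"                     # 1. initial seed copy

echo "writes made after the seed" >> "$src/disk0"   # source keeps changing

cp -p "$src/disk0" "$dst/disk0"                     # 2. delta sync (a naive full
                                                    #    re-copy here; Xtract only
                                                    #    moves the changes)

diff -q "$src/disk0" "$dst/disk0" \
  && echo "in sync - ready for cutover"             # 3. cutover: power down source,
                                                    #    power up destination
```

The key property, and the reason downtime is minimal, is that steps 1 and 2 happen while the source is still running; only step 3 requires an outage.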


Xtract for VMs is available at no additional charge to all Nutanix customers.  However, there are some caveats and requirements for use, which can be found on Nutanix’s support site, including but not limited to:

  • Source node must be running ESXi 5.5 or higher
  • vCenter 5.5 or higher must be present and used for migration
    • Migrating directly from ESXi hosts is not possible
  • Certain disk configurations aren’t supported, such as:
    • Independent disks
    • Raw Device Mappings (RDMs)
    • Multi-writer (shared) disks
  • Guest OS must be supported by AHV

Nutanix has built an increasingly compelling argument for migrating to AHV from other hypervisors.  Acquisitions (Calm, for automation and orchestration) and product enhancements (such as network visualization, micro-segmentation, a self-service portal, AFS/ABS services, etc.) have made their solution more than “just another hypervisor”, with an answer for most any use case or requirement.

Customers who were hesitant to adopt a “relatively new” hypervisor a couple of years ago, or who had a particular use case tying them to vSphere (such as micro-segmentation via VMware NSX), may now have a viable alternative.  I suspect that more customers will, at a minimum, be investigating the possibility of migrating away from their existing hypervisor.

If you’re like me, you like options and flexibility in a solution.  Competition is good, and if nothing else, maybe your next VMware renewal will be a little bit cheaper 😉  Easy-to-use, in-box tools like Xtract for VMs that simplify the migration process and increase its probability of success make it an easier “sell” beyond the “dollars and cents” argument.

Read more about Xtract for VMs on the Nutanix Blog, download it HERE, or read the user guide HERE.




Nutanix Announces Support for HPE Servers and a New Consumption Model

Today Nutanix announced support for HPE ProLiant server hardware and a new consumption model called “Nutanix Go”.  Both announcements support Nutanix’s position that the “enterprise cloud” should be flexible, easy to consume, and as powerful as the public cloud…what I like to call the “have it your way” model.


HPE ProLiant Support

The announcement of support for HPE server hardware probably doesn’t come as a surprise to many, because it’s very similar in nature to the announcement of support for Cisco UCS hardware just a few months ago.  While Nutanix had OEM agreements in place with both Dell and Lenovo, customers wanted the flexibility to use Cisco UCS, their existing server hardware standard.  After a validation process, Nutanix offered a “meet in the channel” procurement model in which a customer buys the Nutanix software from an authorized Nutanix reseller and the validated server hardware from an authorized Cisco reseller.  The HPE announcement follows this same model using select HPE ProLiant server hardware (currently the DL360-G9 and DL380-G9).

While it’s safe to say there will probably be some gnashing of teeth over this announcement, just like there was with the Cisco UCS one (especially in light of HPE’s recent acquisition of SimpliVity), I see it as a win for everyone involved – the customer gets another choice of server hardware and the software that runs on it, channel partners have more “tools in their tool chests” to offer best-in-class solutions to their customers, and vendors get to move more boxes.

As mentioned earlier, Nutanix plans to support two HPE ProLiant server models initially – DL360-G9 and DL380-G9.  The DL360 is a 1U server with 8 small form factor drive slots and 24 DIMM slots.  The targeted workload for this server (VDI, middleware, web services) would be similar to the Nutanix branded NX3175…things that may be more CPU intensive than storage IO/capacity intensive.  The DL380 is a 2U server with 12 large form factor drive slots and 24 DIMM slots.  The targeted workload for this server would be similar to the Nutanix branded NX6155/8035…things that may generate larger amounts of IO or require more storage capacity.

Nutanix will offer both Acropolis Pro and Ultimate editions in conjunction with the HPE ProLiant server hardware.  Starter and Xpress editions will not be available at this time.  However, one interesting tidbit is that software entitlements are transferable across platforms, meaning a customer could run Nutanix software on an existing HPE server hardware investment (assuming it met the validated criteria) and later “slide” that software over to a different HPE server model, or perhaps a Cisco UCS server, at the time of a hardware refresh, if they so chose.

Support is bundled with the software license as a subscription in 1, 3, or 5 year terms.  Just like the model with Nutanix running on Cisco UCS hardware, the server hardware vendor fields hardware concerns and Nutanix supports the software side.  When in doubt, call Nutanix – if the issue turns out to be on the hardware side, it will be escalated through TSANet for handoff to HPE support.

As far as availability timelines are concerned, it should be possible to get quotes for this solution at announcement (today, May 3, 2017), with the ability to place orders expected in Q3 2017 and general availability targeted for Q4 2017.

Nutanix Go

Nutanix labels Nutanix Go as “On-premises Enterprise Cloud infrastructure with pay-as-you-Go billing”.  In a nutshell, a customer now has the ability to “rent” a certain number of servers for a defined term, ranging from 6 months to 5 years depending on configuration and model, with pricing incentives for longer term agreements, and billing / payment occurring monthly.

While an outright purchase is probably still the most advantageous in terms of price, there are plenty of scenarios where the flexibility is worth it: quickly scaling up or down in a short time period without keeping hardware with a 3 or 5 year lifecycle on the books, having costs fall under OPEX instead of CAPEX, “de-risking” projects with uncertain futures, augmenting existing owned Nutanix clusters, etc.  Customers will have the ability to mix “rented” nodes with “owned” nodes within the same cluster, enabling a sort of “on premises cloud bursting” capability.

The pricing for Nutanix Go is structured so that the TCO is supposed to be significantly less than running a similar workload in AWS, while addressing some of the use cases that traditionally push customers toward a public cloud.

Nutanix Go includes hardware, software, entitlements, and support under one SKU.  It’s priced per block, per term length, and as mentioned previously, billing and payment occur monthly.  Currently, a minimum of 12 nodes is required for an agreement, which in my opinion is a bit high.  I’d like to see something more along the lines of the required minimum for a Nutanix cluster…something like 3 or 4 nodes, which might be more attractive to small and medium-sized businesses.  On the flip side, since it is Nutanix keeping the hardware on its books and allowing the customer to rent it, I can see why they’d want a certain minimum to make it worth their while.  Perhaps this will change in the future.

As far as availability is concerned, Nutanix Go is initially only available to US customers, with rollout country by country for the rest of the world in the second half of 2017.


In summary, “more choices” is always a good thing, and this is further proof that the “power” is in the software.  I’m sure many customers, both potential and existing, will find these new consumption models to be a welcome addition.

Installing Nutanix NFS VAAI .vib on ESXi Lab Hosts

This post covers the installation of a Nutanix NFS VAAI .vib on some “non-Nutanix” lab hosts.

Why would one do this?  Several months ago I stood up a three-node lab environment accessing “shared” storage using a Nutanix filesystem whitelist (which allows defined external clients to access the Nutanix filesystem via NFS).  While the Nutanix VAAI plugin for NFS would normally be installed on the host as part of a Nutanix deployment, it obviously was not there on my vanilla ESXi 6.0 Dell R720 servers accessing the whitelist…which made things like deploying VMs from template, and other tasks normally offloaded to the storage, unnecessarily slow.

Since Nutanix just released Acropolis Block Services (ABS) as GA in AOS 4.7 (read more about it at the Nutanix blog), there’s probably less reason to use filesystem whitelists for this purpose now, but maybe someone will find it useful.  (*edit* – it’s worth noting that ABS doesn’t currently support ESXi.  I haven’t tried to see if it actually works yet, but needless to say, don’t do it in a production environment and expect Nutanix to help you.  *edit 1/27/17* – as of AOS 5.0, released earlier this month, ESXi is supported with ABS.)  At the time of this blog post, Windows 2008 R2/2012 R2, Microsoft SQL Server and Exchange, Red Hat Enterprise Linux 6+, and Oracle RAC are supported.  NFS whitelists aren’t supported by Nutanix for the purpose of running VMs, either.

  1. The first step is to SCP the Nutanix NFS VAAI .vib from one of your existing CVMs.  Point your favorite SCP client to the CVM’s IP, enter the appropriate credentials, and browse to the following directory: /home/nutanix/data/installer/%version_of_software%/pkg.  Copy the “nfs-vaai-plugin.vib” file to your workstation so that it can be uploaded to storage connected to your ESXi hosts using the vSphere Client.
  2. Once the .vib is uploaded to storage accessible by all ESXi hosts, SSH to the first host to begin installation.  You may need to enable SSH access on the host as it’s disabled by default.  This can be done by starting the SSH service in %host% > Configuration > Security Profile > Services “Properties” in the vSphere Client.
  3. Once logged in to the ESXi host, verify that the NFS VAAI .vib is missing by issuing the “esxcli software vib list” command.  If the .vib were present, we’d see it at the top of the list.
  4. Now we need the exact path to the location where you placed the .vib on your storage.  Issue the “esxcli storage filesystem list” command to see all storage accessible to the host, including each volume’s name, UUID, and mount point.  Copy the mount point of the appropriate storage volume so it can be pasted into the next command.  Alternatively, you could use the volume name in place of the UUID in the mount point path, but this was easier for me.
  5. Next, install the .vib file using the “esxcli software vib install -v /vmfs/volumes/%UUID_or_volume_name%/%subdir_name%/nfs-vaai-plugin.vib” command.  I created a subdirectory called “VIBs” and placed the nfs-vaai-plugin.vib file in it.  Be careful, as the path to the file is case sensitive.  If the install was successful, you should see a message indicating it completed and that a reboot is required for it to take effect.  Assuming your host is in maintenance mode and has no running VMs on it, go ahead and reboot now.
  6. Once the host has rebooted and is back online, start a new SSH session and issue the “esxcli software vib list” command again; you should see the new .vib at the top of the list.  Voila!  You can now deploy VMs from template in seconds.
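The presence check in steps 3 and 6 is easy to script if you’re touching many hosts.  The output below is a trimmed, hand-typed approximation of what “esxcli software vib list” returns (real output has more columns and dozens of rows); the grep is the entire check:

```shell
# Check a host's VIB list for the Nutanix NFS VAAI plugin.
# In real use, capture this with: vib_list=$(esxcli software vib list)
# A trimmed sample is embedded here so the check can be demonstrated end to end.
vib_list=$(cat <<'EOF'
Name             Version              Vendor   Acceptance Level  Install Date
---------------  -------------------  -------  ----------------  ------------
nfs-vaai-plugin  1.0-109              Nutanix  VMwareAccepted    2016-06-27
ata-pata-amd     0.3.10-3vmw.600.0.0  VMware   VMwareCertified   2016-01-15
EOF
)

if printf '%s\n' "$vib_list" | grep -q '^nfs-vaai-plugin'; then
  echo "nfs-vaai-plugin is installed"
else
  echo "nfs-vaai-plugin is missing - install the .vib and reboot"
fi
```

Swap the embedded sample for the live esxcli call and you have a quick pre/post-install sanity check per host.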

Nutanix Acropolis Base Software 4.5 (NOS release)…be still my heart!

Today Nutanix announced the release of “Acropolis Base Software” 4.5…the software formerly known as NOS.  I happened to be in the Nutanix Portal this morning and didn’t even notice the release in the “Downloads” section due to the name change…thankfully @tbuckholz was nice enough to alert me to this wonderful news.


I read through the Release Notes and was pretty excited by what I found – a bunch of new features that solve some challenges and enhance the environment I’m responsible for the care and feeding of on a daily basis.  Some of these features I knew were coming; others were a surprise.  There’s a ton of good stuff in this release, so I encourage you to check it out for yourself.

A short list of some of the things particularly interesting to me in no particular order…

  1. Cloud Connect for Azure – prior to this release, Nutanix Cloud Connect supported AWS only…it’s good to have options.  I was actually having a conversation with a coworker just yesterday about the possibility of sending certain data at our DR site up to cloud storage for longer/more resilient retention.

    The cloud connect feature for Azure enables you to back up and restore copies of virtual machines and files to and from an on-premise cluster and a Nutanix Controller VM located on the Microsoft Azure cloud. Once configured through the Prism web console, the remote site cluster is managed and monitored through the Data Protection dashboard like any other remote site you have created and configured. This feature is currently supported for ESXi hypervisor environments only. [FEAT-684]

  2. Erasure Coding – lots of good info is out there on this feature, which was announced this summer, so I won’t go into too much detail.  Long story short, it can allow you to get more effective capacity out of your Nutanix cluster.  A lower $:GB ratio is always welcome.  @andreleibovici has a good blog post describing this feature on his site.

    Complementary to deduplication and compression, erasure coding increases the effective or usable cluster storage capacity. [FEAT-1096]

  3. MPIO Access to iSCSI Disks – another thing I was fighting Microsoft support and a couple of other misinformed people about just last week.  One word: Exchange.  Hopefully this will finally put to rest any pushback from Microsoft or others about “NFS” being “unsupported”.  I spent a bunch of time last week researching the whole “NFS thing” and it was a very interesting discussion.  @josh_odgers has spent a lot of time “fighting the FUD”, if you will, detailing why Microsoft should support Exchange on “NFS”-backed storage.  A few of my favorite links: THIS, THIS (my favorite), and THIS WHOLE SERIES.

    Acropolis base software 4.5 adds a feature to help enforce access control to volume groups and expose volume group disks as dual namespace disks.

  4. File Level Restore (Tech Preview) – this was one of the “surprises” and also one of my favorites.  We are leveraging Nutanix Protection Domains for local and remote snapshots for VM level recovery and Veeam for longer term retention / file based recovery.  However, the storage appliance that houses our backup data can be rather slow for large restores so the ability to recover SOME or ALL of a VM using the Nutanix snapshots I already have in place is a big deal for me.

    The file level restore feature allows a virtual machine user to restore a file within a virtual machine from the Nutanix protected snapshot with minimal Nutanix administrator intervention. [FEAT-680]

  5. Support for Minor Release Upgrades for ESXi hosts – this is nice for those random times you need to do a minor revision upgrade to ESXi because “when ____ hardware is combined with ______ version of software ______, ______ happens”.  We’ve all been there.  Nutanix still qualifies certain releases for one-click upgrade, but there is now support for upgrades using the Controller VM “cluster” command.

    Acropolis base software 4.5 enables you to patch upgrade ESXi hosts with minor release versions of ESXi host software through the Controller VM cluster command. Nutanix qualifies specific VMware updates and provides a related JSON metadata upgrade file for one-click upgrade, but now customers can patch hosts by using the offline bundle and md5sum checksum available from VMware, and using the Controller VM cluster command. [ENG-31506]

It’s always nice to get “new stuff” with a super simple software upgrade.  Thanks for taking the time to read and I encourage you to check out some of the other features that might be of interest to your environment.

Veeam + Nutanix: “Active snapshots limit reached for datastore”

Last night I ran into an interesting “quirk” using Veeam v8 to back up the virtual machines that live on my Nutanix cluster.  We’d just moved the majority of our production workload over to the new Nutanix hardware this past weekend, and last night marked the first round of Veeam backups on it.

We ended up deploying a new Veeam backup server and proxy set on the Nutanix cluster in parallel to our existing environment.  When multiple jobs ran concurrently overnight, many of them sat in a “0% completion” state, and the individual VMs that make up the jobs had a “Resource not ready: Active snapshots limit reached for datastore” message on them.


I turned to the all-knowing Google and happened across a Veeam forum post that sounded very similar to the issue I was experiencing.  Since the forum post referenced Veeam v7, I opened a ticket with Veeam support, and the support engineer confirmed there is indeed a self-imposed limit of 4 active snapshots per datastore – a “protection method” of sorts to avoid filling up a datastore.  On our previous platform the VMs were spread across 10+ volumes, so this issue was never experienced.  However, our Nutanix cluster is configured with a single storage pool and a single container with all VMs living on it, so we hit that limit quickly with concurrent backup jobs.

The default of 4 active snapshots per datastore can be changed by creating a registry DWORD value named MaxSnapshotsPerDatastore under ‘HKEY_LOCAL_MACHINE\SOFTWARE\Veeam\Veeam Backup and Replication\’ with the appropriate hex or decimal value.  I started off with 20 but will move up or down as necessary.  We have plenty of capacity at this time and I’m not worried about filling up the storage container.  However, caveat emptor, because it is still a possibility.
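For reference, the same change expressed as an importable .reg file (the dword is hex, so 0x14 is the decimal 20 I started with).  The key path is the one Veeam support provided above; verify it against your own installation and Veeam version before importing:

```
Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SOFTWARE\Veeam\Veeam Backup and Replication]
"MaxSnapshotsPerDatastore"=dword:00000014
```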

This “issue” isn’t specific to Nutanix at all, but it’s increasingly likely on any platform with a scale-out filesystem that can store hundreds or thousands of virtual machines on a single container.

A tale of two firmware upgrades…

On this fine Friday afternoon, I thought I’d have a little fun comparing and contrasting the firmware upgrade process on two different storage solutions. We recently bought some Nutanix 8035 nodes to replace the existing storage platform. While I wouldn’t necessarily call Nutanix “just” a storage platform, the topic of this discussion will be the storage side of the house. For the sake of anonymity, we’ll call our existing storage platform the “CME XNV 0035”.

One of the biggest factors in choosing the Nutanix platform for our new compute and storage solution was “ease of use”.  And there’s a reason for that – the amount of administrative effort required for the care and feeding of the “CME XNV 0035” was far too high, in my opinion.  Even a “simple” firmware upgrade took days or weeks of pre-planning/scheduling and 8 hours to complete in our maintenance window.  Now that I’ve been through the firmware upgrade on our Nutanix platform, I thought a compare and contrast was in order.

First, let me take you through the firmware upgrade on the “CME XNV 0035”:

  1. Reach out to CME Support, open a ticket, and request a firmware update. They might have reached out to you proactively if there was a major bug/stability issue found. (Their product support structure is methodical and thorough, I will give them that) 30 minutes
  2. An upgrade engineer was scheduled to do the “pre upgrade health check” at a later date.
  3. The “pre upgrade health check” occurs, logs and support data are gathered for later analysis. Eventually it occurred frequently enough I’d just go ahead and gather and upload this data on my own and attach it to the ticket. 1 hour
  4. A few hours to a few days later, we’d get the green light from the support analysis that we were “go for upgrade”. In the meantime, the actual upgrade was scheduled with an upgrade engineer for a later date during our maintenance window…typically a week or so after the “pre upgrade health check” happened.
  5. Day of the upgrade – hop on a Webex with the upgrade engineer and begin the upgrade process.  Logs were gathered again and reviewed.  This was a “unified” XNV 0035, though we weren’t using the file side…..I’m…..not sure why file was even bought at all, but I digress….which meant we still had to upgrade the data movers and THEN move on to the block side.  One storage processor was upgraded and rebooted…took about an hour, and then the other storage processor was upgraded and rebooted…took another hour.  Support logs were gathered again, reviewed by the upgrade engineer, and as long as there were no outstanding issues, the “green light” was given.  6-8 hours

Whew……7.5 – 9 hours of my life down the drain…


Now, let’s review the firmware upgrade process on the Nutanix cluster

  1. Log into Prism, click “Upgrade Software” 10 seconds
  2. Click “Download” if it hasn’t done it automatically 1 minute (longer if you’re still on dial up)
  3. Click “Upgrade”, then click the “Yes, I really really do want to upgrade” button (I paraphrase) 5 seconds
  4. Play “2048”, drink a beer or coffee, etc. 30 minutes
  5. Run a “Nutanix Cluster Check (NCC)”
  6. Done

There you have it, 31 minutes and 15 seconds later, you’re running on the latest firmware.  Nutanix touts “One click upgrades”, but I counted four, technically.  I can live with that.

Yes, this post is rather tongue in cheek, but it is reflective of the actual upgrade process for each solution.  Aside from the initial “four clicks”, Nutanix handles everything else for you and the firmware upgrade occurs completely non-disruptively.


EMC VNX Pool LUNs + VMware vSphere + VAAI = Storage DEATH

**Cliff’s Notes – a bug in the VNX OE causes massive storage latency when using vSphere with VAAI enabled; disabling VAAI fixes the issue**

Hello, and welcome to my very first blog post! I’ve owned this domain and WordPress subscription for nearly a year and a half and am finally getting around to posting something on it. Considering I’ve spent the last 3 years focused on end user computing, and the majority of that being done with Citrix products, I always figured my first post would be in that domain…but alas, that was not the case.

The problem…

I recently started a new gig and one of the first orders of business was untangling some storage and performance issues in a vSphere 5.5 environment running on top of a Gen 1 EMC VNX 5300.  It was reported that there was very high storage latency, often resulting in LUNs being disconnected from the hosts, during certain operations like a Storage vMotion or deploying a new VM from template.

After a general review of the environment I was able to rule out a glaringly obvious misconfiguration, so I turned to a couple of useful performance monitoring tools – Esxtop and Unisphere Analyzer.  While I am by no means an expert with either tool, with a little bit of Google-fu and the assistance of a couple of great blogs (which I’ll link to later in this post), I was able to get the info I needed to verify my theory – a bug involving VAAI that was supposed to be addressed in the latest VNX Operating Environment, but which, at the time of this posting, still exists.

I started out by doing some performance baselining with VisualEsxtop so I could get a picture of what the hosts were seeing during operations that involved VAAI (Storage vMotion, deploy-from-template/clone, etc.)  As you can see in the below screenshot, the VNX is quite pissed off.  The “DAVG” value represents disk latency (in milliseconds) that is likely storage-processor or array related.  The “KAVG” value represents disk latency (in milliseconds) associated with the VMkernel.  Obviously, the latency on either side of the equation is nowhere near a reasonable number.  Duncan Epping has a great overview of Esxtop; I highly recommend you give it a read if you’re newer to the tool like I am.
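If you’d rather baseline from data than screenshots, esxtop’s batch mode (`esxtop -b`) writes the same counters to CSV, and the worst-case DAVG can be pulled out with awk.  The two-row sample below is contrived (real batch output has thousands of columns, and the actual counter names are much longer than the abbreviated headers shown here), but the parsing pattern is the point:

```shell
# Find the worst device latency (DAVG) in an esxtop batch-mode CSV.
# Real capture would look like: esxtop -b -d 2 -n 60 > perf.csv
# The sample below is hand-made with shortened column names for readability.
csv=$(cat <<'EOF'
"Time","naa.600x DAVG/cmd","naa.600x KAVG/cmd"
"09:46:44",412.50,3891.20
"09:47:04",388.10,3520.70
EOF
)

# Skip the header row, strip quotes, track the max of the DAVG column (field 2).
printf '%s\n' "$csv" | tail -n +2 \
  | awk -F',' '{gsub(/"/, ""); if ($2 > max) max = $2} END {printf "worst DAVG: %.2f ms\n", max}'
```

Point the same pipeline at a real capture (adjusting the field number to whichever device column you care about) and you have a number to compare before and after a change like disabling VAAI.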


The next step was to use EMC’s Unisphere Analyzer to get a picture of what was occurring on the storage side during these operations.  If you’re not familiar with Unisphere Analyzer, an EMC employee created a brief video on how to capture and review data with it – it’s a relatively simple tool that you can garner a lot of valuable information from.  I used it to capture storage-side performance metrics during the two following tests.


The first test consisted of a Storage vMotion of a VM with VAAI enabled on the host (1 Gb iSCSI to the VNX).  This test moved the VM from LUN_0 to LUN_6, starting at 9:46:44 AM and finishing at 10:01:53 AM.  If you look at the corresponding time period on the Unisphere Analyzer graph you’ll see that response time is through the roof.  While it did not occur during this test, the hosts would often lose their connection to the LUNs during these periods of high latency…not good, obviously.

These warnings always show up in the vSphere Client when this issue occurs (yeah yeah, I’m not using the Web Client for this):


The second test consisted of a Storage vMotion of the same VM with VAAI disabled.  This test moved the VM back to LUN_0 from LUN_6, starting at 10:04:13 AM and finishing at 10:14:03 AM.  This time, the Unisphere Analyzer data looks MUCH better.


Here is an example of what Esxtop looked like during the test with VAAI disabled:


The LUNs with ~1400 read/write IOs are obviously the ones involved in the Storage vMotion…notice the lack of “SAN choking”.  I re-ran this test multiple times using other LUNs with identical results…it was obvious at this point that there is still an issue with VAAI on this VNX OE.  Fortunately, our production datacenter utilizes 10 GbE for the iSCSI network and Storage vMotions finish in just a minute or two.  I could see this flaw being particularly problematic in larger environments where Storage vMotion is frequent, or in something like VDI where VMs are frequently spun up, torn down, or updated.

The solution…

Obviously, disabling VAAI in vSphere is a guaranteed “workaround”.  I wouldn’t necessarily call this a “fix”, as the VAAI feature becomes unusable, but it will stop the high latency and disconnects when vSphere tries to offload certain storage tasks to the array.  Once I had some hard evidence in hand, I opened a ticket with EMC, and the support engineer was able to confirm this was indeed still a bug that has not been addressed by the latest OE version.

This VMware KB article details the process of disabling VAAI.  I found that only the “DataMover.HardwareAcceleratedMove” parameter in the article had to be disabled.  The EMC support engineer also mentioned they’d had some success increasing the “MaxHWTransferSize” parameter while leaving VAAI enabled, but that it hadn’t worked for everyone.
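For anyone who prefers the host shell to the client UI, the same setting can be flipped with esxcli.  The helper below only prints the command rather than executing it (treat it as a sketch; run the printed line in an SSH session on an actual ESXi host, and note that 0 disables the XCOPY offload while 1 re-enables it):

```shell
# Build the esxcli command that sets DataMover.HardwareAcceleratedMove.
# Printing instead of executing keeps this runnable anywhere as a dry run.
vaai_move_cmd() {
  value="$1"   # 0 = disable hardware-accelerated move, 1 = enable
  echo "esxcli system settings advanced set --option /DataMover/HardwareAcceleratedMove --int-value ${value}"
}

vaai_move_cmd 0   # prints the disable command; paste it into an ESXi SSH session
```

The nice thing about scripting it this way is that re-enabling VAAI after a future OE fix is the same one-liner with a 1 instead of a 0.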

You can see more information in this EMC KB article (you may need an active support account; I had to log in to view the page).  I decided to just disable VAAI and call it a day until a valid bug fix was released in some future OE version.  ***update 11/06/15*** It has come to my attention that the preceding EMC KB191685 can no longer be accessed at the supplied link.  I searched through the support portal and could not find a replacement, so I don’t know if they pulled the KB documenting this issue entirely or merged it into another.  I did, however, find a support bulletin from June 2015 saying that the VAAI improvements had been added to the .217 firmware.  At one point I requested the .217 firmware, only to find out they’d pulled it due to some issue it was causing.  I can only assume the VAAI improvements were added to some subsequent firmware version, but I no longer have my VNXs in production, nor are they under support, so I won’t be able to personally test.

Hopefully this information will be beneficial to someone out there…luckily I found my way through the rabbit hole, but there wasn’t a whole lot publicly available regarding this issue when I was initially seeking a cause.