Nuke Ads from Orbit with Raspberry Pi and Pi-hole [it’s the only way to be sure]

The Problem:

Earlier this week, I turned on my new Samsung “smart TV” and was greeted with a notification that I needed to accept “new terms and conditions”, and that I would “be sent targeted ads” [i.e. they’re collecting your data and selling it (you)]. Not only that, but if I declined, “certain smart features of the TV may no longer work”.  After “declining”, none of my apps worked.  Say what now?  You mean to tell me you’re going to neuter the TV I paid for if I don’t agree to being pimped out?  Bad move, Samsung.  This has obviously ruffled a lot of feathers, and rightly so.  There is a “mega thread” on Reddit that goes into great detail about it.

Imagine watching Netflix and having an ad or “commercial” pop up thanks to your smart TV?  Yeah no.  What’s likely occurring is that Samsung is subsidizing the cost of their TV’s with the revenue generated by their advertisements.  The fact I just bought a mid-upper range 40″ Samsung 4K smart TV for only $275, which just a few years ago would’ve cost over $1,000, is not lost on me.  A clever idea on paper, but horrible in practice, and whichever executive signed off on that idea should be reassigned to the toaster division.

While I don’t like the idea of what Samsung is doing at all, I can either disconnected my TV from my local network and break all the handy smart TV functionality like Netflix, Amazon Prime Video, Youtube, etc. or “deal with it”.  For now, I’ve chosen to “deal with it” by using something called “Pi-hole”, which essentially turns a Raspberry Pi into a DNS server for your local network, which intercepts advertisements being sent to your client devices and replaces them with white space.  While I suppose this may not stop Samsung from collecting the data, it prevents it from disrupting my use of the TV.

While there are browser based plugins like AdBlock that work quite well for computers and mobile devices, that doesn’t help me much with my TV.  There is only one option – nuke the ads from orbit…it’s the only way to be sure.

nuke-from-orbit

The Gear:

I ordered a Raspberry Pi 3 kit from Amazon for about $50, which came with a clear case, power supply, two heat sinks, and the Raspberry Pi board itself.  In addition, I purchased a 16 GB class 10 micro SD card for about $6 bucks (side note, can’t believe how cheap high capacity removable flash media is now).

I won’t bother with the assembly instructions as it’s pretty straight forward, but figured I would create a post that detailed the steps required to install the Raspberry Pi operating system (Raspbian Jessie Lite which is the GUI / desktop-less version), configure basic settings, and install Pi-hole.

The Solution:

  1. Download the latest version of Raspbian Jessie Lite from https://www.raspberrypi.org/downloads/raspbian/ and extract the .IMG file from the .ZIP archive.  The light version contains no GUI, which is probably unnecessary anyway for the way this Raspberry Pi will be used.
  2. Download Win32 Disk Imager from https://sourceforge.net/projects/win32diskimager/ and install it.
  3. Download SD Card Formatter from https://www.sdcard.org/downloads/formatter_4/ and install it.
  4. Plug in the SD card to your computer and launch the SDFormatter App.  Click the “Format Option – Option” button and set “Format Size Adjustment” to “ON”, and then click “Format”.
    rbp1
  5. Next, run Win32 Disk Imager.  Click the “Browse” button and browse to Raspbian Jesse .IMG file you extracted from the .ZIP archive.
    rbp2
    Ensure the correct drive letter for your SD card is selected under “Device” and then click “Write” to burn the .IMG file to it.  This may take several minutes to complete.
    rbp3
    Once it is done writing the image to your SD card, you should receive a success message.
  6. Eject your SD card, insert it in the Raspberry Pi, and power the unit up.  It will be easiest if you hard wire the Raspberry Pi to your router or switch using the ethernet port.  You will need to attach a monitor and keyboard as well so that you can enable SSH for remote administration.
  7. Login to the Raspberry Pi with the default credentials – username = pi and the password = raspberry
  8. Perform the initial configuration by entering the command “sudo raspi-config”.
  9. The “Raspberry Pi Software Configuration Tool” window will open, and a variety of options for configuration will be displayed.The first step we will do is to enable SSH so that we can access the Raspberry Pi from a remote system for the remainder of the configuration.Select “7 Advanced Options” and then “A4 SSH”.  Answer “Yes” when asked “Would you like the SSH server to be enabled?”.

    Select “Finish” from the raspi-config menu and answer “Yes” when asked to “Reboot now?”.  From this point on, you should be able to perform the rest of the configuration from a remote system at the comfort of your own desk.

  10. Assuming you have DHCP enabled on your router or switch, your Raspberry Pi should’ve received an IP address. You will need to know this IP address to connect to use SSH from another system.  I logged into my router’s admin portal and found a device named “raspberrypi”, which is the default name given, and then noted its IP address.
  11. Open an SSH session to the Raspberry Pi’s IP address and login using the default credentials.
    rbp4
  12. Once you’re logged in to the SSH session, you will see a warning that SSH is enabled but the default credentials have not been changed, which poses a security risk.  We will be changing this, among other settings, by running the raspi-config wizard again.  Enter the command “sudo raspi-config”.
    rbp5
  13. First, we will choose “1 Expand Filesystem” to utilize the remainder of the SD card.  Mine is 16 GB and it’d be nice to have it all available for use.  You will see a message that the root partition has been resized and it’ll require a reboot to complete.  No need to reboot yet though.
    rbp6
  14. Next, choose “2 Change User Password”.  Click “Ok” on the next window, and then you will be prompted to enter and re-enter a new password.  Assuming you entered the password correctly twice, you should see a success window.
    rbp7
  15. Next, choose “4 Internationalisation Options”.  We will be configuring our timezone and Wi-fi County here.  Select “I2 Change Timezone” and select your major region, then your applicable timezone.
    rbp8rbp9
  16. Next, select “7 Advanced Options” and then “A2 Hostname”.  This will allow us to change the default hostname from “raspberrypi” to something custom.  Perhaps you have a naming convention on your network that you need to adhere to, or you don’t want it to be blatantly obvious what the device is used for based on the name.  Enter your hostname then select “Ok”.
    rbp10
  17. Now that most of the basic configuration items have been addressed, return to the main raspi-config menu and choose “Finish”. A reboot will be required to apply the changes.
  18. Open a new SSH session to the Raspberry Pi and login with your new credentials.  You will notice that the terminal now shows the customized hostname you selected in the previous steps.
    rbp11
  19. Now that we have basic configuration and network connectivity, it is time to download and install any updates available for your Raspberry Pi.  The first command you need to enter is “sudo apt-get update”.  You should see the updates begin downloading.
    rbp12
  20. When you see the message “Reading package lists… Done”, the download is complete and it’s time to install the updates.  Enter the command “sudo apt-get dist-upgrade”.  You will be notified that some amount of additional disk space will be required, and you will need to answer “Y” to continue.  Depending on the amount of updates available, it could take a few minutes to complete.  Once it’s done installing updates, enter “sudo reboot” for good measure.
    rbp13
  21. Now we will install the Pi-hole software.  Log back into the Raspberry Pi with an SSH session and enter the command “curl –L https://install.pi-hole.net | bash”.  Alternatively, if you don’t want to pipe to bash, you can use the “alternative semi auto installation” instructions located here.
    rbp14
    The install script will launch, which performs various prerequisite checks and downloads the necessary files before launching the “Pi-hole automated installer” wizard.
    rbp15
  22. The first message you will see is a notification that “the installer will transform your device into a network-wide ad blocker”. Well, that is why we’re here after all.  The next window notifies you that the software is free and open sourced, but “powered by your donations”.  If you like the results, kick a little money their way.
  23. The next window states that you need to use a STATIC IP address, since it is after all a server and if its IP were to change, it’d break your Pi-hole DNS service.  Because we never set a static IP address in any of the previous steps, we will have the chance to do so now.
    rbp16
  24. Now you will be asked to select an interface.  “eth0” is the Ethernet port on your Raspberry Pi, and “wlan0” is the WiFi adapter.  I am going to hard wire my Pi-hole to my router for the simplest and most reliable service, so I have selected “eth0”.
    rbp17
  25. The next window asks you to select which protocol(s) to use – since I am not using IPv6 on my network, I’ve left just IPv4 selected.
    rbp18
  26. The next window displays the current IP information and asks if you’d like to use that as your static address.  Since I currently have a DHCP address leased, I do not want to configure a static IP with this address, so I’ve selected “No” and will enter the new information on the next window.  If you decide to reuse the IP address issued to you by DHCP instead of one outside the DHCP pool, it’s possible that a duplicate IP address could be issued (depending on how smart your router/switch is) and cause an issue.
    rbp19
  27. Enter your IP address in “CIDR notation” – meaning that instead of specifying an IP and subnet mask like “255.255.255.0”, you’d enter the IP like “192.168.1.254/24”, implying that it’s a 24 bit mask.
    rbp20
  28. Most likely, the default gateway you received from DHCP will be the same one you want to use when configuring the static IP.
    rbp21
  29. Verify the information is correct and if so, answer “Yes”.
    rbp22
  30. The next screen asks you to select an “Upstream DNS Provider”.  I use OpenDNS currently and have selected that for use by the Pi-hole.  OpenDNS may give you some additional filtering flexibility as opposed to your local internet service provider’s DNS service.
    set-opendns
  31. You will be shown a summary of your DNS configuration – if everything looks correct, select “Yes”.
  32. You will see some commands execute in the background, and then the “Installation Complete” window should appear.  It tells you that in order for the Pi-hole to do its job, the devices on your network need to use it as their DNS server.  Also, since we did assign a new IP address as part of the configuration, a reboot will be required.  Select “Ok”, and then enter the “sudo reboot” once you’re back at the terminal prompt.
    rbp23
  33. At this point, the Pi-hole is ready for use by your clients.  If you’re using DHCP on your router or switch, the easiest way to accomplish this is probably to modify your DHCP options so that the Pi-hole’s IP address is handed out as the DNS server.  If you have static IP addresses set on any of your devices, you will have to modify their DNS server information manually.The option in red below “inserts” my routers IP address into the DHCP config as an available DNS server.  For purposes of verifying that the Pi-hole is doing its job, I have disabled this setting which forces the clients to only use the Pi-hole and nothing else.  For long term use, you could either configure a public DNS server like OpenDNS or Google in the “DNS Server 2” field, or set “Advertise router’s IP in addition….” To “Yes”.
    rbp24png
  34. Now, we will test that the Pi-hole is doing its job and blocking ads.  Before doing this testing, be sure to disable any browser-based ad blockers like “AdBlocker” so that we don’t mask the results of the test.  Also, since you’ve already set your DHCP options to use the Pi-hole, you will need to manually override your device’s IP settings to use the router, or something other than the Pi-hole, as your DNS server.Now, pick a site that is chock full of ads – while I’m not the Hollywood gossip type, I figure they spam you pretty well, so I went to www.thehollywoodgossip.com and sure enough, several Amazon ads were there (I guess someone has been shopping for pastel colored Yeti tumblers in this house).
    rbp25
  35. Now that we have our “control”, go ahead and revert to using the Pi-hole as the DNS server on your device.  Refresh the page (Ctrl + F5 if using Windows) and you shouldn’t see any ads this time.rbp26
  36. Also, the Pi-hole hosts a webpage that gives you ad blocking statistics, which I find really interesting.  In your browser, navigate to http://%5Byour-pihole-ip%5D/adminIn just a few minutes of time, I can see that 16.5% of my internet traffic has been for ads – not an insignificant amount.  It’s also worth noting that this page is open for access by default, so anyone that knew the IP of your Pi-hole and was aware of how to pull up the “admin” page could open your query log and see everything you’ve accessed on the internet.pi-hole-stats

    I found a good blog post that details how to add authentication to force password protection to a page hosted on your Raspberry Pi.  This is probably a good idea to do, for obvious reasons.  I haven’t had a chance to try yet, but will update this post when I get it setup.

Advertisements

Installing Nutanix NFS VAAI .vib on ESXi Lab Hosts

This post covers the installation of a Nutanix NFS VAAI .vib on some “non-Nutanix” lab hosts.

Why would one do this?  Several months ago I stood up a three node lab environment accessing “shared” storage using a Nutanix filesystem whitelist (allows defined external clients to access the Nutanix filesystem via NFS).  While the Nutanix VAAI plugin for NFS would normally be installed on the host as part of the Nutanix deployment, it obviously was not there on my vanilla ESXi 6.0 Dell R720 servers accessing the whitelist….which made things like deploying VM’s from template and other tasks normally offloaded to the storage unnecessarily slow.

Since Nutanix just released “Acropolis Block Services / ABS” GA in AOS 4.7 (read more about it at the Nutanix blog) there’s probably less of a reason to use filesystem whitelists for this purpose now, but alas, maybe someone will find it useful (*edit* – it’s worth noting that ABS doesn’t currently support ESXi.  I haven’t tried to see if it actually workyet but needless to say, don’t do it from a production environment and expect Nutanix to help you *edit 1/27/17* as of AOS 5.0 released earlier this month, ESXi is supported using ABS)  At the time of this blog post, Windows 2008 R2/2012 R2, Microsoft SQL and Exchange, Red Hat Enterprise Linux 6+, and Oracle RAC are supported.  NFS whitelists aren’t supported by Nutanix for the purpose of running VM’s, either.

  1. The first step is to SCP the Nutanix NFS VAAI .vib from one of your existing CVM’s.  Point your favorite SCP client to the CVM’s IP, enter the appropriate credentials, and browse to the following directory:/home/nutanix/data/installer/%version_of_software%/pkg2016-06-27 07_49_20-PhotosCopy the “nfs-vaai-plugin.vib” file to your workstation so that it can be uploaded to storage connected to your ESXi hosts using the vSphere Client.
  2. Once the .vib is uploaded to storage accessible by all ESXi hosts, SSH to the first host to begin installation.  You may need to enable SSH access on the host as it’s disabled by default.  This can be done by starting the SSH service in %host% > Configuration > Security Profile > Services “Properties” in the vSphere Client.
  3. Once logged in to your ESXi host, we can verify that the NFS VAAI .vib is missing by issuing the “esxcli software vib list” command.vib-listIf the .vib were present, we’d see it at the top of the list.
  4. Now we need to get the exact path to location you placed the .vib on your storage.  This can be done by issuing the “esxcli storage filesystem list” command.  You will be presented with a list of all storage accessible to the host, the mount point, the volume name, and UUID.storage-listHighlight the “mount point” of the appropriate storage volume so that we can paste it into the next command.  Alternatively, you could use the “volume name” in place of the UUID in the mount point path, but this was easier for me.
  5. Next, we will  install the .vib file using the “esxcli software vib install -v “/vmfs/volumes/%UUID_or_volume_name%/%subdir_name%/nfs-vaai-plugin.vib”” command.  I created a subdirectory called “VIBs” and placed the nfs-vaai-plugin.vib file in it.  Be careful as the path to the file is case sensitive.vib-installIf the install was successful, you should see a message indicating it completed successfully and a reboot is required for it to take effect.  Assuming your host is in maintenance mode and has no running VM’s on it, go ahead and reboot now.
  6. Once the host has rebooted and is back online, start a new SSH session and issue the “esxcli software vib list” command again and you should see the new .vib at the top of the list.install-confirmationVoila!  You can now deploy VM’s from template in seconds.itsbeautifulmeme

Nutanix – Taking It to the .NEXT Level

I was happy to participate in the opening “Nutanix Champion” event to “ring in” the day one keynote.  When I got back stage during the rehearsal, it was evident a lot of people worked real hard to do something fun for the opening acts (Angelo Luciani, Julie O’Brien  and surely many more…so, kudos to you!).

IMG_20160621_071414.jpg

ClfX_wkVAAAeETb

Picture credit to https://twitter.com/@ClaireBelly

And now, my take on some of the announcements today….

Acropolis File Services:

This announcement fits into the recurring theme of “power through software” – leveraging commodity hardware to deliver additional value based on software upgrades and enhancements.  Acropolis File Services allows you to leverage your existing investment to expose the Nutanix file system for scale out file level storage.

Initially SMB 2.1 will be supported with other protocols (NFS, never versions of SMB, etc) on the roadmap.  AHV and ESXi hypervisors will be supported at GA.  Other features include user driven restore leveraging file level and server level snapshots (think file level recovery and disaster recovery, respectively) with asynchronous replication on the roadmap for Q4.

There are some interesting use cases I can see for this, such as user profile storage and replication for desktop and application virtualization environments and low cost scale out file services using the included Acropolis Hypervisor + Nutanix storage nodes.

Acropolis Block Services:

Similar to Acropolis File Services, Acropolis Block Services exposes the Nutanix file system as an iSCSI target for bare metal servers and applications.  Though the file system is exposed to a bare metal workload, all the great Nutanix features are preserved for them (snapshot, clone, data efficiency services, etc).  Again, this is a demonstration of “power through software” and the evolution of the platform – these features and support were first available for VM’s, then files, and now bare metal.

The Nutanix file system is presented via the iSCSI protocol a little differently than iSCSI is normally implemented.  Instead of resiliency built into the protocol being leveraged (ALUA, multipathing, etc).  Multipathing is handled by the back end, paths are managed dynamically and in the event of a node failure, failover is handled on the backend.  While “best practices” for iSCSI are usually well documented based on the vendor or platform, not having to rely on as much client side configuration and optimization removes the human element and thus risk for “PEBKAC” issues.  I’ve seen the “human element” manifest itself in more than one iSCSI implementation.

I see this feature being a big deal for shops that have both hyperconverged and traditional 3 tier deployed in the same datacenter.  Due to some extenuating circumstance (like a crappy software vendor that STILL doesn’t support virtualization in the year 2016) or an investment in physical servers the business wishes to extract value from, physical servers and/or traditional SAN storage must exist in parallel.  Being able to present traditional “block” storage to a bare metal server or app may remove that last roadblock on the journey.

“All Flash on All Platforms”

As the cost of flash continues to plummet, it continues to become more pervasive in the datacenter.  The capacity of SSD’s has surpassed that of traditional spindles (though at a cost right now) and I foresee a day where “flash first” is the commonplace policy.  As such, starting with the Broadwell / Nutanix G5 platform, all flash config will be available on all platforms. 

Microsoft Cloud Platform System Loaded from Factory

This was announced last week, but in a nutshell Nutanix and Microsoft collaborated to offer the Microsoft Cloud Platform Standard (CPS) installed from the factory.  This offers a more turnkey private cloud that accelerates the time to value and allows for “day one” operation. All “patching operations” are integrated into the Nutanix “One Click Upgrade” platform, further streamlining the day to day care and feeding that historically has burned up so much administrator time.

In addition, Nutanix will support the entire stack from the hardware up to the software, just like they have been doing for vSphere, Hyper-V, and AHV already.  There’s a lot to be said for “one throat to choke” when it comes to technical support.  I’m sure we’ve all been an unwilling participant in the “vendor circular firing squad” at some point.  My experience with Nutanix support has always been excellent, both from the technical capability of the support engineers as well as the customer service they deliver.

Prism Enhancements

Some big enhancements are coming to Prism.  Building upon “Capacity Planning” in Acropolis Base Software 4.6, “What if?” modeling will be added.   Instead of just projections based on existing workloads, you’ll be able to model scenarios such as onboarding of a new client or introduction of a new application or service at a granular level.

One of the benefits of the “building block / right sized” hyperconverged model is being able to accurately size for your existing workload, allowing for sufficient overhead, without overbuying based on best effort projections where your environment might be 3 years out.  Calculation based on existing utilization and growth was the first step, and then modeling “what if _____” is the next evolution of accurately projecting the next “building block” required to meet compute, IOPS, and capacity needs for “just in time forecasting”.  Maybe a “buy node now” button is in order through Prism 😛

Network Visualization

Enhancements to Prism will allow for quick configuration and visualization of the network config in AHV – both the config in the hypervisor and the underlying physical network infrastructure.  This makes finding the root cause of an issue much quicker and lowering the overall time til resolution by being more aware of the underlying network infrastructure.  Josh Odgers has a great blog post covering this with some nice screenshots of the Prism UI so I won’t bother reinventing the wheel http://www.joshodgers.com/2016/06/15/whats-next-2016-prism-integrated-network-configuration-for-ahv/

Community Edition:

Another notable milestone in the Nutanix ecosystem is 12,000+ downloads of Community Edition to date (and nearly 200 activations a week).  Hosted Community Edition trials will be available free in two hour blocks through the Nutanix portal as a “test drive”.  Another option for getting your hands on Nutanix CE are to install it on your own “lab gear” – Angelo Luciani has a great blog post on using an Intel NUC (are multiple NUC’s NUCii?)  https://next.nutanix.com/t5/Nutanix-Connect-Blog/The-Prestige-Continues-Community-Edition/ba-p/10399

Or….maybe on a drone?

20160621_093840.jpg

Other Interesting Notes:

During the general session, some statistics were presented regarding the adoption of Acropolis Base Software 4.6.  500 clusters were updated to 4.6 within 7 days of release, and an overall 43% adoption within 100 days.  It was also noted that there was a significant performance increase available in 4.6 and as such 43% of customers received up to a 4x performance increase at no cost – I’ll say it again, power through software.

20160621_095621.jpg

Another statistic I found interesting was that 15% of the customer base is now running AHV.  I suspect that percentage will increase significantly over the next 12 months with all the new features now native to AHV combined with the ease of “One Click” online hypervisor conversion.

A Picture to Sum It Up:

As someone who’s dealt with infrastructure that was everything but invisible, I think this says it all…

20160621_101445.jpg

“Power Through Software”:

20160621_095004.jpg

Other Blogs to Check Out:

I know I didn’t capture all the announcements here, but some other good blog posts I’ve seen today are worth a read…

Josh Odgers:

http://www.joshodgers.com/ (there’s a whole series here titled “What’s .NEXT 2016”

Marius Sandbu:

https://msandbu.wordpress.com/2016/06/21/whats-coming-from-nutanix-announcements-from-next/

Eduardo Molina:

http://molikop.com/2016/06/nutanix-nextconf-2016-keynote-day-1/

Nutanix Acropolis Base Software 4.5 (NOS release)…be still my heart!

Today Nutanix announced the release of “Acropolis Base Software” 4.5…the software formerly known as NOS.  I happened to be in the Nutanix Portal this morning and didn’t even notice the release in the “Downloads” section due to the name change…thankfully @tbuckholz was nice enough to alert me to this wonderful news.

smiley-with-heart-shaped-eyes

I read through the Release Notes and was pretty excited with what I found – a bunch of new features that solve some challenges and enhance the environment I’m responsible for caring and feeding on a daily basis.  Some of these features I knew were coming, others were a surprise.  There’s a ton of good stuff in this release so I encourage you to check them out for yourself.

A short list of some of the things particularly interesting to me in no particular order…

  1. Cloud Connect for Azure – prior to this release, Nutanix Cloud Connect supported AWS…it’s good to have options.  I was actually having a conversation with a coworker yesterday about the possibility of sending certain data at our DR site up to cloud storage for longer / more resilient retention.

    The cloud connect feature for Azure enables you to back up and restore copies of virtual machines and files to and from an on-premise cluster and a Nutanix Controller VM located on the Microsoft Azure cloud. Once configured through the Prism web console, the remote site cluster is managed and monitored through the Data Protection dashboard like any other remote site you have created and configured. This feature is currently supported for ESXi hypervisor environments only. [FEAT-684]

  2. Erasure Coding – lots of good info out there on this feature that was announced this summer so I won’t go into too much detail.  Long story short it can allow you to get further effective capacity out of your Nutanix cluster.  A lower $ : GB ratio is always welcome. @andreleibovici has a good blog post describing this feature at his myvirtualcloud.net site.

    Erasure CodingComplementary to deduplication and compression, erasure coding increases the effective or usable cluster storage capacity. [FEAT-1096]

  3. MPIO Access to iSCSI Disks – another thing I was fighting Microsoft support and a couple other misinformed people about just last week.  One word:  Exchange.  Hopefully this will finally put to rest any pushback by Microsoft or others about “NFS” being “unsupported”.  I spent a bunch of time last week researching the whole “NFS thing” and it was a very interesting discussion.  @josh_odgers spent a lot of time “fighting the FUD” if you will and detailing why Microsoft should support Exchange with “NFS” backed storage.  A few of my favorite links THIS, THIS (my favorite), and THIS WHOLE SERIES.

    Acropolis base software 4.5 feature to help enforce access control to volume groups and expose volume group disks as dual namespace disks.

  4. File Level Restore (Tech Preview) – this was one of the “surprises” and also one of my favorites.  We are leveraging Nutanix Protection Domains for local and remote snapshots for VM level recovery and Veeam for longer term retention / file based recovery.  However, the storage appliance that houses our backup data can be rather slow for large restores so the ability to recover SOME or ALL of a VM using the Nutanix snapshots I already have in place is a big deal for me.

    The file level restore feature allows a virtual machine user to restore a file within a virtual machine from the Nutanix protected snapshot with minimal Nutanix administrator intervention. [FEAT-680]

  5. Support for Minor Release Upgrades for ESXi hosts – this is nice for those random times that you need to do a minor revision upgrade to ESXi because “when ____ hardware is combined with ______ version of software ______, ______ happens”.  We’ve all been there.  Nutanix still qualifies certain releases for one click upgrade, but there is now support for upgrades using the Controller “VM cluster” command.

    Acropolis base software 4.5 enables you to patch upgrade ESXi hosts with minor release versions of ESXi host software through the Controller VM cluster command. Nutanix qualifies specific VMware updates and provides a related JSON metadata upgrade file for one-click upgrade, but now customers can patch hosts by using the offline bundle and md5sum checksum available from VMware, and using the Controller VM cluster command. [ENG-31506]

It’s always nice to get “new stuff” with a super simple software upgrade.  Thanks for taking the time to read and I encourage you to check out some of the other features that might be of interest to your environment.

Veeam + Nutanix: “Active snapshots limit reached for datastore”

Last night I ran into an interesting “quirk” using Veeam v8 to back up my virtual machines that live on a Nutanix cluster.  We’d just moved the majority of our production workload over to the new Nutanix hardware this past weekend and last night marked the first round of backups using Veeam on it.

We ended up deploying a new Veeam backup server and proxy set on the Nutanix cluster in parallel to our existing environment.  When there were multiple jobs running concurrently overnight, many of them were in a “0% completion” state, and the individual VM’s that make up the jobs had a “Resource not ready: Active snapshots limit reached for datastore” message on them.

veeam 1

I turned to the all-knowing Google and happened across a Veeam forum post that sounded very similar to the issue I was experiencing.  I decided to open up a ticket with Veeam support since the forum post in question referenced Veeam v7, and the support engineer confirmed that there was indeed a self-imposed limit of 4 active snapshots per datastore – a “protection method” of sorts to avoid filling up a datastore.  On our previous platform, the VM’s were spread across 10+ volumes and this issue was never experienced.  However, our Nutanix cluster is configured with a single storage pool and a single container with all VM’s living on it, so we hit that limit quickly with concurrent backup jobs.

The default 4 active snapshot per datastore value can be modified by creating a registry DWORD value in ‘HKEY_LOCAL_MACHINE\SOFTWARE\Veeam\Veeam Backup and Replication\’ called MaxSnapshotsPerDatastore and use the appropriate hex or decimal value.  I started off with ’20’ but will move up or down as necessary.  We have plenty of capacity at this time and I’m not worried at all about filling up the storage container.  However, caveat emptor here because it is still a possibility.

This “issue” wasn’t anything specific to Nutanix at all, but is increasingly likely with any platform that uses a scale-out file system that can store hundreds or thousands of virtual machines on a single container.

NetScaler 10.5 – fsck commands have changed

Just a heads up in case anyone else runs into this, the process for verifying the file system integrity of a NetScaler appliance has apparently changed in the 10.5 firmware, but Citrix’s documentation does not (yet) reflect this.

We had a “surprise network outage” over the weekend which took down iSCSI connectivity, and with it, my NetScaler VPX HA pair.

images

Shortly after everything came back up, we began receiving alerts from NetScaler Command Center that “hardDiskDriveErrors” are seen on the system.

nserror

Alright, fine, let’s verify file system integrity and see what there is to see.  Citrix article CTX122845 outlines the process…or so I thought.  I’d never done this/had to do this before, so I couldn’t compare to a past experience.

In Step 5, you are told to press enter after booting to single user mode and the command prompt of the appliance changes to “\u@\h$” but it doesn’t…just “\u@”.  Then, in Step 6 you are directed to enter the “/sbin/fsck /dev/######” command.  However, that command didn’t work either.

I opened a ticket with Citrix support to rule out a PEBKAC error (quite likely if I’m involved) and they confirmed these commands indeed do not work, and they’ve actually changed in the NS 10.5 firmware.

In this screenshot, we see the error when attempting to enter the command according to Step 6 in CTX122845

ns2

Here is an example of the correct command “\u@fsck -t ufs /dev/######”

ns1

Also, be sure to check out what your actual device names are according to CTX121853, it does vary by model and in the case of the VPX, differs from the example used in CTX122845.

Hopefully Citrix will update their documentation soon, but if not, this might be your work around.

A tale of two firmware upgrades…

On this fine Friday afternoon, I thought I’d have a little fun comparing and contrasting the firmware upgrade process on two different storage solutions. We recently bought some Nutanix 8035 nodes to replace the existing storage platform. While I wouldn’t necessarily call Nutanix “just” a storage platform, the topic of this discussion will be the storage side of the house. For the sake of anonymity, we’ll call our existing storage platform the “CME XNV 0035”.

One of the biggest factors in choosing the Nutanix platform for our new compute and storage solution was “ease of use”. And there’s a reason for that – the amount of administrative effort required to care and feed the “CME XNV 0035” was far too high, in my opinion. Even a “simple” firmware upgrade took days or weeks of pre-planning/scheduling and 8 hours to complete in our maintenance window. Now that I’ve been through the firmware upgrade on our Nutanix platform, I thought a compare and contrast was in order.

First, let me take you through the firmware upgrade on the “CME XNV 0035”

  1. Reach out to CME Support, open a ticket, and request a firmware update. They might have reached out to you proactively if there was a major bug/stability issue found. (Their product support structure is methodical and thorough, I will give them that) 30 minutes
  2. An upgrade engineer was scheduled to do the “pre upgrade health check” at a later date.
  3. The “pre upgrade health check” occurs, logs and support data are gathered for later analysis. Eventually it occurred frequently enough I’d just go ahead and gather and upload this data on my own and attach it to the ticket. 1 hour
  4. A few hours to a few days later, we’d get the green light from the support analysis that we were “go for upgrade”. In the mean time, the actual upgrade was scheduled with an upgrade engineer for a later date during our maintenance window…typically a week or so after the “pre upgrade health check” happened.
  5. Day of the upgrade – hop on a Webex with the upgrade engineer, and begin the upgrade process.  Logs were gathered again and reviewed.  This was a “unified” XNV 0035, though we weren’t using the file side…..I’m…..not sure why file was even bought at all, but I digress….which meant we still had to upgrade the data movers and THEN move onto the block side.  One storage processor was upgraded and rebooted…took about an hour, and then the other storage processor was upgraded and rebooted…took another hour.  Support logs were gathered again, reviewed by the upgrade engineer, and as long as there were no outstanding issues, the “green light” was given.  6-8 hours

Whew……7.5 – 9 hours of my life down the drain…

download

Now, let’s review the firmware upgrade process on the Nutanix cluster

  1. Log into Prism, click “Upgrade Software” 10 seconds
  2. Click “Download” if it hasn’t done it automatically 1 minute (longer if you’re still on dial up)
  3. Click “Upgrade”, then click the “Yes, I really really do want to upgrade” button (I paraphrase) 5 seconds
  4. Play “2048”, drink a beer or coffee, etc. 30 minutes
  5. Run a “Nutanix Cluster Check (NCC)”
  6. Done

There you have it, 31 minutes and 15 seconds later, you’re running on the latest firmware.  Nutanix touts “One click upgrades”, but I counted four, technically.  I can live with that.

Yes, this post is rather tongue in cheek, but it is reflective of the actual upgrade process for each solution.  Aside from the initial “four clicks”, Nutanix handles everything else for you and the firmware upgrade occurs completely non-disruptively.

unnamed