Categories
VMware VMware Cloud on AWS

VMC Host Errors

When you run a large enough infrastructure, failure is inevitable; how you handle it can be a big differentiator. With VMware Cloud on AWS, the hosts are monitored 24×7 by VMware/AWS Support as part of the service. If you pay for X hosts, you should always have X, including during maintenance and failure operations.

I’m not sure “lucky” is the right word, but I did witness a host issue with a customer I was working with. True to the marketing, it was picked up and automatically remediated.

Looking at the log extract above, a new host was being provisioned in the same minute the issue was identified. That host then needed to boot and join the vSphere/vSAN cluster before a full data evacuation could take place on the faulty host, which was finally removed from the cluster.

All of this was seamless to the customer. I only noticed it because a few HA alarms tripped in vCenter (these were cosmetic only).

Just another reason why you should look at the VMware Cloud on AWS service.

Categories
VMware

VMware Certified Master Specialist HCI 2020

I recently sat (and passed) the VMware HCI Master Specialist exam (5V0-21.20). I won’t go into any details of the contents, but I will say that I felt the questions were fair and that there wasn’t anything in it to trip you up. The required knowledge was certainly wider than for the vSAN Specialist exam.

This was my third remote-proctored exam and I must say the experience has improved. Partly that is down to Pearson VUE improving the process, and partly down to me knowing a bit better what to expect.

Together with the vSAN Specialist exam that I passed earlier in the month, this entitles me to a VMware Master Services Competency. It is the second one I hold, and I must say I like the way VMware is delivering a well-thought-out learning path.

Categories
VMware VMware Cloud on AWS

New Host Family

VMware Cloud on AWS has introduced a new host type to its lineup: the “i3en”, based on the i3en.metal AWS instance.

The specifications are certainly impressive, packing in 96 logical cores, 768 GiB of RAM, and approximately 45.84 TiB of raw NVMe storage capacity per host.

It’s certainly a monster, with a 266% uplift in CPU, a 50% increase in RAM and a whopping 440% increase in raw storage per host compared to the i3. Most of the engagements I have worked on so far have turned out to be storage-bound, requiring extra hosts just to handle the required storage footprint. With such a big uplift in storage capacity, hopefully clusters will trend towards filling up CPU, RAM and storage at the same time, which is the panacea of hyperconvergence.

Beyond capacity, there are two other noticeable changes. The first is that the processor comes from a much later Intel family: 3.1 GHz all-core turbo Intel® Xeon® Scalable (Skylake) processors, a much more modern part than the Broadwells in the original i3. This brings a number of instruction set improvements, including Intel AVX, AVX2 and AVX-512.

The second is the networking uplift, with 100 Gbps available to each host.

| Model | pCPU | Memory (GiB) | Networking (Gbps) | Storage (TB) | AWS Host Pricing (On-Demand, US-East-2 Ohio) |
|---|---|---|---|---|---|
| i3.metal | 36* | 512 | 25 | 8 × 1.9 | $5.491 |
| i3en.metal | 96 | 768 | 100 | 8 × 7.5 | $11.933 |

*The i3.metal instance, when used with VMware Cloud on AWS, has hyperthreading disabled.
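If you want to verify this yourself, a quick PowerCLI check (a minimal sketch, assuming you are already connected to the vCenter) is:

```powershell
# Show the physical core count and hyperthreading state for each host
Get-VMHost | Select-Object Name, NumCpu, HyperthreadingActive
```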

At present this host is only available in the newer SDDC versions (1.10v4 or later) and in limited locations.

It also looks like the i3 still has to be used for the first cluster within the SDDC (where the management components reside), and i3en hosts aren’t supported in two-node clusters.

When I first wrote this, pricing from VMware was not available, so the only reference point was the price of the equivalent hosts bought directly from AWS, on the assumption that the VMware costs would fall broadly in line. VMware has since released pricing; for On-Demand hosts in the AWS US-East region, the i3.metal is £6.213287 per hour and the i3en.metal is £13.6221 per hour, giving (I sanity-check the first of these figures just after the list):

  • A cost per GB of SSD instance storage that is up to 50% lower
  • Storage density (GB per vCPU) that is roughly 2.6x greater
  • Ratio of network bandwidth to vCPUs that is up to 2.7x greater
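As a rough sanity check of the first of those figures, here is the cost per GB of raw storage per hour worked out from the AWS On-Demand prices and raw capacities in the table above (the per-vCPU ratios depend on which i3 size AWS compared against, so I’ve left those alone):

```powershell
# Cost per GB of raw NVMe storage per hour, from the table above (illustrative only)
$i3   = @{ PricePerHour = 5.491;  StorageTB = 8 * 1.9 }   # i3.metal
$i3en = @{ PricePerHour = 11.933; StorageTB = 8 * 7.5 }   # i3en.metal

$i3Cost   = $i3.PricePerHour   / ($i3.StorageTB   * 1000)   # USD per GB per hour
$i3enCost = $i3en.PricePerHour / ($i3en.StorageTB * 1000)

"i3.metal   : {0:N6} USD per GB per hour" -f $i3Cost
"i3en.metal : {0:N6} USD per GB per hour" -f $i3enCost
"Reduction  : {0:P0}" -f (1 - ($i3enCost / $i3Cost))        # roughly 45%
```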

This new host type adds another decision point when choosing hosts within VMware Cloud on AWS, but it makes the service an even more compelling solution.

Categories
AWS Veeam VMware VMware Cloud on AWS

Monitoring VMC – Part 1

As previously mentioned, I have been working a lot with VMware Cloud on AWS, and one of the questions that often crops up is how to approach monitoring.

This is an interesting topic because VMC is technically “as a service”, so the monitoring approach is a bit different. AWS and VMware’s SRE teams monitor all of the infrastructure components; however, you still need to monitor your own virtual machines. If it were me, I would still want some monitoring of the infrastructure too, and I see two reasons for doing this:

Firstly, I want to check that the VMware Cloud on AWS service is delivering what I am paying for. Secondly, I still need to monitor my VMs to make sure they are all behaving properly; the added benefit is that with a good real-time view of my workload I can potentially optimise the number of VMC hosts in my fleet, reducing costs.

With that in mind, I decided to look at a few options for connecting monitoring tools to a VMC environment to see what worked and what didn’t. I expected some things might behave differently, as you don’t have the true root/admin access you usually would. All of the tests were done with the cloudadmin@vmc.local account, the highest-privilege account available to a service user within VMC.
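For reference, connecting with that account from PowerCLI works just like any other vCenter, only with the restricted CloudAdmin role behind it. Here’s a minimal sketch (the SDDC FQDN and password are placeholders) that also pulls a few real-time host CPU samples, the sort of data you’d use to judge whether the host count is right:

```powershell
# Connect to the VMC vCenter with the cloudadmin account (FQDN/password are placeholders)
Connect-VIServer -Server 'vcenter.sddc-xx-xxx.vmwarevmc.com' -User 'cloudadmin@vmc.local' -Password 'changeme'

# Grab a handful of recent real-time CPU samples per host to gauge utilisation
Get-VMHost | Get-Stat -Stat cpu.usage.average -Realtime -MaxSamples 6 |
    Select-Object Entity, Timestamp, Value
```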

The first product I decided to test was Veeam ONE. This made sense for a few reasons. Firstly, I’m a Veeam Vanguard and very familiar with the product, and I also have access to the beta versions of the v10 products as part of the Vanguard program. Secondly, it’s pretty easy to spin up a test server to kick the tyres, and finally, the configuration is incredibly quick to implement.

I could easily have added the VMC vCenter to my existing Veeam servers, but I chose to deploy a new server just for this testing. Assuming you have network access between your Veeam ONE server and the VMC vCenter, adding it to Veeam ONE is straightforward; if not, you will need to open up the relevant firewall rules.
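If you’re unsure whether the path is open, a quick check from the Veeam ONE server is something like the following (the vCenter FQDN is a placeholder for your SDDC’s address):

```powershell
# Confirm the Veeam ONE server can reach the VMC vCenter over HTTPS (TCP 443)
Test-NetConnection -ComputerName 'vcenter.sddc-xx-xxx.vmwarevmc.com' -Port 443
```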

Once added, Veeam ONE performs an inventory operation and returns all of the objects you would expect. This test was run shortly after the VMC environment was created, so it doesn’t yet have any workloads migrated to it. However, as you can see below, it is correctly reporting on the hosts and VM workloads, including that the hosts are running ESXi 6.9.1.

I also ran a couple of test reports to check they functioned correctly, and everything worked as I would expect.

In part two I am going to look at Grafana, InfluxDB and Telegraf to see whether this common open-source monitoring stack works with VMC.

Categories
Homelab Storage VMware

NFS 4.1

Switching on NFS 4.1 in the Homelab

I, like a number of homelabbers, use Synology for storage. In my case, I have two: a 2-bay DS216+ and a 4-bay DS918+ that I have filled with SSDs.

NFS has been the preferred storage protocol for most Synology users for two main reasons: the biggest is simplicity, but by all accounts it has also tended to offer better performance than iSCSI.

For me, the performance (especially on the DS918+) is great, with one clear exception: Storage vMotion. It’s not often that I move VMs around, but when I do it’s a tad painful. This is because I only have gigabit networking and NFS was limited to a single connection. However, it’s now possible to fix this…

I tried to find out when Synology officially added NFS 4.1 support but couldn’t find a reliable answer. It has been a CLI option for a while, and it is certainly available in DSM 6.2.1.

The first thing to do is to make sure it’s enabled.

Then from vSphere create a new datastore

Make sure to select NFS 4.1

Then add the name and configuration; this is where the subtle differences kick in.

Note the plus icon on the server line, where multiple entries can be added. In my setup, I have two IP addresses (one for each interface on my DS918+).

Although NFS 4.1 supports Kerberos, I don’t use it.

Finally, mount to the required hosts.

Of course, if you want to do this with PowerShell, that’s also an option.

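Here’s a minimal PowerCLI sketch of that mount; the cluster name, datastore name, export path and NAS IP addresses are placeholders for your own values:

```powershell
# Mount an NFS 4.1 datastore on every host in the cluster, passing both
# Synology IP addresses so the session can use multiple connections.
Get-Cluster 'Homelab' | Get-VMHost | ForEach-Object {
    New-Datastore -VMHost $_ -Nfs -FileSystemVersion '4.1' `
        -Name 'SYN-NFS41-01' `
        -Path '/volume1/nfs41' `
        -NfsHost '192.168.1.21','192.168.1.22'
}
```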

The other really nice thing is that VAAI is still supported. If you want to see the difference, here is a network graph from the Synology during a Storage vMotion, clearly better than single-connection performance. This makes me much happier.

A note of caution for anyone wanting to do this: DON’T have the same NFS datastore presented to VMware over both NFS 3 and NFS 4.1. The locking mechanisms are different, so bad things are likely to happen. I chose to evacuate each datastore, unmount it, and re-present it as NFS 4.1.
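A quick way to double-check which version each NFS datastore is currently mounted with before you re-present anything:

```powershell
# List NFS datastores and the protocol version they are mounted with
Get-Datastore | Where-Object { $_.Type -like 'NFS*' } |
    Select-Object Name, Type, FileSystemVersion
```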

Categories
VMware

vRealize Suite LifeCycle Manager – Environment

Intro

As part of my new role, I have worked extensively on a project deploying VMware’s vRealize Suite Lifecycle Manager (vRSLCM). It’s a product that is fairly new in the VMware ecosystem and not a lot of people have come across it. If you run any of the following products, it’s worth checking out:

  • vRealize Automation
  • vRealize Operations
  • vRealize Log Insight
  • vRealize Network Insight

This is the first of a few posts I’m going to do on vRSLCM, showing a bit of the environment management side and a product deployment.

 

Split Personality

vRSLCM performs two fairly distinct roles: environment management and content management. The content management part of the product is a replacement for Code Stream (Houdini).

Environment management covers deployment, patching, certificate management and environment configuration management.

Environment Management

To deploy a component, you must first set up an environment within vRSLCM. This involves creating a datacentre, where you also add the target vCenter.

 

When that is done, you can create an environment. An environment is like a wrapper for the products, controlling them as a set. You can have multiple environments, with the idea being that you would have production, test, development, etc., and multiple of each if required.

Once your environment is created, it’s time to deploy a product. vRSLCM will do this step for you, but it needs the relevant files within the appliance. These are added in the settings section under Product Binaries.

There are three ways to add the files to the appliance: connect it to an NFS share containing them, manually upload them via SSH, or (by far the easiest) add your “My VMware” details so vRSLCM can download them automatically. The advantage of the My VMware method is that it can also track the available patches for each product.

I used a combination of NFS and My VMware to add the product binaries.

Product Patches

 

Once that was done, I added two separate environments (one for testing and one to simulate production) and then deployed some workloads.

Here you can see that I have vROps deployed in test, and both Log Insight and vROps in the production environment.

I am now going to use vRSLCM to deploy Log Insight into the test environment by adding it as a product.

 

Here you can see I have selected Log Insight to be added (it is possible to add multiple products at the same time). I have gone for a “small” configuration and chosen version 4.6.0.

You will then be asked to confirm the user agreement, and vRSLCM will take you into the deployment step, where you provide the product-specific details. A really nice touch is that, if you provided your My VMware details earlier, the wizard will list your licence keys for you.

 

Most of the infrastructure details are taken from the environment set up earlier, including vCenter, cluster, network details, datastore, NTP, etc. It will also deploy certificates for you at this step (a really nice feature).

 

The only questions the wizard can’t answer for itself are the node size to be deployed, the VM name, the hostname and the IP address. Once you have added these, check that both forward and reverse DNS records are in place before going any further, because on the next step vRSLCM runs a prerequisite check.
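Checking both directions takes seconds from any Windows machine; the hostname and IP address below are placeholders:

```powershell
# Forward lookup: name -> IP
Resolve-DnsName 'vrli-test01.lab.local'

# Reverse lookup: IP -> name
Resolve-DnsName '192.168.10.50'
```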

 

Here you can see that the precheck failed because I had a clash of virtual machine names between my test and production environments. This is an issue as they are in the same vCenter/cluster.

With the precheck passing, you submit the request and vRSLCM goes off and deploys. Obviously, this can take quite a while depending on the configuration you have asked for, and progress can be monitored in the Requests section.

Here you can see all the steps and, if required, troubleshoot any failures. When complete, it should look something like the below.

 

Going back to the environments view, we now see that test matches production.