
    VMworld session – VSP3305 — Upgrading to VMware ESXi 5.0

    August 29th, 2011

    Kyle Gleed presented a session Monday morning covering tips, hints, and gotchas for the migration to ESXi 5.0.  5.0 will be the first version of vSphere to ship only with the stripped-down ESXi.  No more service console.  At work we conducted the migration from ESX 3.5 to ESXi 4.1 in preparation for the release of 5.0.

    Kyle stressed that scripted or command-line management of ESXi will be conducted via the esxcli command set, vicfg, and PowerCLI.  The esxcfg commands have been deprecated.  It is important to note that esxcli commands can be run either locally on the ESXi instance or via the remote CLI suite.

    If you are planning on upgrading your hosts to ESXi 5.0 (as opposed to doing a full install), you must be running ESX 4.x.  NOTE: You can upgrade from ESX 4.x to ESXi 5.0 easily!  You cannot upgrade from ESX 3.5 directly to 5.0.

    One important thing to note is that you must have 5 GB allocated for your boot device, be it SAN or local disk.  The ESXi installer partitions off the first 1 GB for the OS, and the next 4 GB for scratch space.

    The usual upgrade path still applies: vCenter to 5.0, followed by ESXi to 5.0, and then VMware Tools and/or virtual hardware.  These are the same steps that were taken from 3.5 to 4.x.  ESXi 5.0 also introduces VMFS-5, which according to VMware is a hot upgrade that does not affect running virtual machines.


    VAAI and supportability

    June 28th, 2011

    VAAI, or vStorage APIs for Array Integration, is an industry answer for offloading specific storage requests from the ESX(i) servers to the storage array.

    VAAI implements only three functions, as described in VMware KB1021976:

    • Atomic Test & Set
    • Clone Blocks/Full Copy/XCOPY
    • Zero Blocks/Write Same

    As it turns out, only certain arrays support VAAI.  Since VAAI is on by default, at least in 4.1, the hosts will send VAAI commands to the array.  The array will usually respond that they are not supported, and then ESX will fall back to regular SCSI commands.  However, there seem to be occasions when SAN devices that do not support VAAI can nonetheless end up in failure states.  I’ve seen it.  It’s true.

    Now, how do we determine if the SAN handles VAAI?  SSH to your servers and run the following command, via KB1021976:

    esxcfg-scsidevs -l | egrep "Display Name:|VAAI Status:"

    If the array does not support VAAI, you will most likely see results such as:

    Display Name: <Vendor> Fibre Channel Disk (naa.600*)
    VAAI Status: unknown
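
    If you have more than a handful of LUNs, eyeballing that output gets old quickly.  Below is a minimal Python sketch that pairs each Display Name line with the VAAI Status line that follows it.  The function names are my own, and I am assuming the status string for a capable array is exactly "supported"; verify that against what your hosts actually print.

```python
def parse_vaai_status(text):
    """Pair each 'Display Name:' line with the 'VAAI Status:' line after it."""
    results = []
    current_name = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("Display Name:"):
            current_name = line.split(":", 1)[1].strip()
        elif line.startswith("VAAI Status:") and current_name is not None:
            status = line.split(":", 1)[1].strip()
            results.append((current_name, status))
            current_name = None
    return results


def devices_without_vaai(text):
    """Return display names whose status is anything other than 'supported'.

    Assumption: 'supported' is the literal string reported for VAAI-capable LUNs.
    """
    return [name for name, status in parse_vaai_status(text)
            if status != "supported"]


if __name__ == "__main__":
    # Sample taken from the filtered output shown in this post.
    sample = (
        "Display Name: <Vendor> Fibre Channel Disk (naa.600*)\n"
        "VAAI Status: unknown\n"
    )
    for name in devices_without_vaai(sample):
        print(name)
```

    Feed it the saved output of the egrep command above, one host at a time.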

    VMware KB1033665 tells us how to disable VAAI both from the ESX(i) command line and from the vSphere Remote CLI.

    ESX(i) command line:

    esxcfg-advcfg -s 0 /DataMover/HardwareAcceleratedMove

    esxcfg-advcfg -s 0 /DataMover/HardwareAcceleratedInit

    esxcfg-advcfg -s 0 /VMFS3/HardwareAcceleratedLocking

    -s 0 will disable, -s 1 will enable. You can also substitute -s 0 with -g to see the current setting.
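
    Since the three settings tend to be changed together, here is a small Python sketch that builds the command lines for all of them in one go.  It only builds and prints the commands; it does not run anything.  The helper name and the enable/None convention are my own invention, and the option paths are simply the three listed above.

```python
# The three advanced options listed in this post.
VAAI_OPTIONS = (
    "/DataMover/HardwareAcceleratedMove",
    "/DataMover/HardwareAcceleratedInit",
    "/VMFS3/HardwareAcceleratedLocking",
)


def vaai_commands(enable=None):
    """Build esxcfg-advcfg argument lists for all three VAAI options.

    enable=True  -> '-s 1' (enable)
    enable=False -> '-s 0' (disable)
    enable=None  -> '-g'   (show the current setting)
    """
    commands = []
    for option in VAAI_OPTIONS:
        if enable is None:
            commands.append(["esxcfg-advcfg", "-g", option])
        else:
            value = "1" if enable else "0"
            commands.append(["esxcfg-advcfg", "-s", value, option])
    return commands


if __name__ == "__main__":
    # Print the disable commands so they can be reviewed before use.
    for argv in vaai_commands(enable=False):
        print(" ".join(argv))
```

    Printing first and pasting second is deliberate: these are host-wide settings, and I would rather read the commands before they touch a production server.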

    Remote CLI:

    vicfg-advcfg.pl --server <servername> --username <possibly root> --password <password> -s 0 /DataMover/HardwareAcceleratedMove

    vicfg-advcfg.pl --server <servername> --username <possibly root> --password <password> -s 0 /DataMover/HardwareAcceleratedInit

    vicfg-advcfg.pl --server <servername> --username <possibly root> --password <password> -s 0 /VMFS3/HardwareAcceleratedLocking

    where items in < > are dependent on your specific configuration.  You can again substitute -s 0 with -g to see the current setting.

     


    How available is VMware’s Round Robin Path Selection Plugin?

    June 10th, 2011

    VMware wisely introduced Round Robin (RR) as a supported path selection plugin (PSP) as part of their Native Multipathing (NMP) suite.  We originally dealt with Fixed Path (FP) and Most Recently Used (MRU), which led administrators to manage I/O paths directly if their servers had multiple host bus adapters (HBAs) and/or storage targets.  I’m sure most admins stuck with fixed path, which provided for an active/passive configuration, and went on with their business.  I know I have.

    Along comes Round Robin to many cheers and hoorays, and we merrily go about our business and configure our hosts to use it.  We extolled the virtues to our management saying how available our connection to the SAN fabric and storage has become now that we are using one path per I/O.  I know I did.

    Here is the rub.  Round Robin is not designed to provide High Availability.  As it turns out, VMware has designed and built RR (and the other two PSPs, for that matter) to only mark a path dead when it receives certain SCSI sense codes.  You can find the full list on the VMware KB.  They have followed the specifications, and I completely understand why they made their choices.

    But what happens when we are having issues with the SAN fabric and do NOT see those specific sense codes?  If you guessed nothing, you would be very correct.  The implementation of RR will only move to the next path upon a SUCCESSFUL (sense code 0x0 or 0x00) I/O.  If we receive “soft” errors such as 0x2 (Host_Bus_Busy) or 0x28 (Task Set Full), RR will not move to the next path, because it has not received a 0x0 code.  VMware’s definition of SCSI error conditions can be found at VMware KB #1030381.  This means we will be STUCK on a path that is having problems, which results in failed I/Os.  You may notice the ESX(i) server disconnect from vCenter, the ability to run df from the command line diminish, and no way to enumerate the storage.  You will also notice that virtual machines stop responding to pings, and application failures therein.  This is a result of failed I/Os in the virtual machine.  SCSI errors will be clearly visible within the logs of the guest OS.
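
    To make the "stuck" behavior concrete, here is a toy model in Python.  To be clear, this is my simplification of the behavior described above, not VMware’s NMP code: it assumes one I/O per path before rotating, it assumes each path returns the same status for every I/O, and it ignores the sense codes that would mark a path dead.  The path names are made up.

```python
GOOD = 0x0  # the only status that lets the model advance to the next path


def simulate_round_robin(path_statuses, io_count):
    """Return the path used for each I/O in a simplified Round Robin model.

    path_statuses maps a (hypothetical) path name to the status code that
    path returns for every I/O.  The model rotates to the next path only
    after a successful I/O; any other status leaves it on the same path.
    """
    paths = list(path_statuses)
    index = 0
    used = []
    for _ in range(io_count):
        path = paths[index]
        used.append(path)
        if path_statuses[path] == GOOD:
            index = (index + 1) % len(paths)
        # Any non-zero status: stay put and retry on the same path.
    return used


if __name__ == "__main__":
    # Second path keeps answering 0x28 (Task Set Full).
    statuses = {"vmhba1": 0x0, "vmhba2": 0x28, "vmhba3": 0x0}
    print(simulate_round_robin(statuses, 5))
```

    With the sample statuses, the first I/O goes down vmhba1, and every I/O after that lands on vmhba2 and stays there.  The healthy vmhba3 is never tried, which is exactly the symptom described above.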

    You can find the error messages on ESX(i) in the system log from the command line:

    • do: cd /var/log
    • do: grep naa messages

    Or look through the old logs:

    • do: zcat messages*.gz |grep naa

    Due to ESXi’s very quick log rotation, you may be out of luck by the time you respond to an event.  You should take the time to export syslog from your ESXi hosts to a central server such as the vMA appliance.  If you need help setting up syslog, see Kanuj Behl’s blog post on vmwise.com.


    vSphere Round Robin MultiPathing

    March 29th, 2011

    There are a number of blog posts describing the configuration of Round Robin (RR) multipathing on vSphere.  *Note: Content on this page has been distilled from the sources referenced below, as well as from my colleague at vmwise.com.  Check those sites for a deeper dive into the content.  I’ve also removed some identifiers from the output.

    http://www.boche.net/blog/index.php/2010/02/04/configure-vmware-esxi-round-robin-on-emc-storage/

    http://www.yellow-bricks.com/2009/03/19/pluggable-storage-architecture-exploring-the-next-version-of-esxvcenter/

    http://www.ivobeerens.nl/?p=465

    The three commands that are your friends throughout this post:

    esxcli nmp satp list <- Storage Array Type Plugin (SATP)

    esxcli nmp psp list <- Path Selection Plugin (PSP)

    esxcli nmp device list <- List the LUNs from the SAN represented as their device names

    1) SSH in to the server (assuming you enabled remote tech support from the console).

    2) Display the current pathing configuration:

    esxcli nmp device list

    naa.60
    Device Display Name: Fibre Channel Disk (naa.60)
    Storage Array Type: VMW_SATP_DEFAULT_AA
    Storage Array Type Device Config: SATP VMW_SATP_DEFAULT_AA does not support device configuration.
    Path Selection Policy: VMW_PSP_FIXED
    Path Selection Policy Device Config: {preferred=vmhba:C:T:L;current=vmhba:C:T:L}

    3.1) If you have storage from NetApp, do (note: there are two dashes before “psp” and “satp”):

    esxcli nmp satp setdefaultpsp --psp VMW_PSP_RR --satp VMW_SATP_DEFAULT_AA

    3.2) If you have storage from an EMC DMX, do:

    esxcli nmp satp setdefaultpsp --psp VMW_PSP_RR --satp VMW_SATP_SYMM

    These commands will change the default pathing to round robin (PSP or Path Selection Plugin) for the specific SATP (Storage Array Type Plugin).

    3.3) At this point, you can reboot the host if LUNs were already presented.  If no SAN storage is attached yet, scan in the new devices, and they will be automagically set to round robin.  Or, run the following command to set the Path Selection Policy on the existing devices:

    for i in `ls /vmfs/devices/disks/ | grep naa.60` ; do esxcli nmp device setpolicy --device $i -P VMW_PSP_RR ; done

    4) Check the current config, post reboot:

    esxcli nmp device list

    naa.60
    Device Display Name: Fibre Channel Disk (naa.60)
    Storage Array Type: VMW_SATP_DEFAULT_AA
    Storage Array Type Device Config: SATP VMW_SATP_DEFAULT_AA does not support device configuration.
    Path Selection Policy: VMW_PSP_RR
    Path Selection Policy Device Config: {policy=rr,iops=1000,bytes=10485760,useANO=0;lastPathIndex=1: NumIOsPending=0,numBytesPending=0}

    Look at Path Selection Policy.  It now says VMW_PSP_RR instead of VMW_PSP_FIXED.  We are getting closer to our goal.

    5) Now we want to configure the round robin policy to send 1 IO down a path, and then round robin to the next path (note: there are two dashes before “type”, “iops” and “device”).

    for i in `ls /vmfs/devices/disks/ | grep naa.60` ; do echo $i ; esxcli nmp roundrobin setconfig --type "iops" --iops=1 --device $i ; done

    This command will look in the /vmfs/devices/disks/ directory, grab anything that matches naa.60 (which should pick up SAN storage), and then set the round robin policy to 1 IO per path.

    6) Verify the new configuration:

    esxcli nmp device list

    naa.60
    Device Display Name: Fibre Channel Disk (naa.60)
    Storage Array Type: VMW_SATP_DEFAULT_AA
    Storage Array Type Device Config: SATP VMW_SATP_DEFAULT_AA does not support device configuration.
    Path Selection Policy: VMW_PSP_RR
    Path Selection Policy Device Config: {policy=iops,iops=1,bytes=10485760,useANO=0;lastPathIndex=5: NumIOsPending=0,numBytesPending=0}

    Validating our output, we now have our policy=iops, and iops=1.
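
    Checking that by eye is fine for one LUN, less so for fifty.  Here is a minimal Python sketch that pulls the settings out of a “Path Selection Policy Device Config” value and confirms policy=iops and iops=1.  The function names are mine, and the parsing assumes the layout matches the output shown in this post (settings before the semicolon, runtime state after it).

```python
def parse_rr_config(device_config):
    """Parse the key=value settings that precede the ';' in the config string."""
    body = device_config.strip().strip("{}")
    settings_part = body.split(";", 1)[0]
    settings = {}
    for pair in settings_part.split(","):
        key, _, value = pair.partition("=")
        settings[key.strip()] = value.strip()
    return settings


def is_one_io_per_path(device_config):
    """True when the string reports policy=iops with iops=1."""
    settings = parse_rr_config(device_config)
    return settings.get("policy") == "iops" and settings.get("iops") == "1"


if __name__ == "__main__":
    # Both strings are copied from the esxcli output in steps 4 and 6.
    before = ("{policy=rr,iops=1000,bytes=10485760,useANO=0;"
              "lastPathIndex=1: NumIOsPending=0,numBytesPending=0}")
    after = ("{policy=iops,iops=1,bytes=10485760,useANO=0;"
             "lastPathIndex=5: NumIOsPending=0,numBytesPending=0}")
    print(is_one_io_per_path(before))
    print(is_one_io_per_path(after))
```

    The step 4 string fails the check and the step 6 string passes it.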


    ThinApp Howto

    March 19th, 2011

    I attended a brain dump session by Travis Sales (@thinappguru on Twitter), one of the guys that built the original Thinstall prior to VMware’s purchase and re-branding.

    I decided to put together a step by step Howto ThinApp a program like Firefox.  The setup:

    Windows 7 64-bit, VMware Workstation 7.1.3, Windows XP SP2 guest VM and ThinApp Enterprise 4.6.1.  The VM is configured for Host Only networking.  A share has been configured on Windows 7 to hold content that will be used during the ThinApp process (*).

    Per Travis’ suggestion to “Know Thy App,” I have gone with XP SP2 with the following packages installed:

    • VMware Tools
    • Windows Installer 3.1
    • SP2

    I took a snapshot (#1) of the Virtual Machine for rollback.  I then installed ThinApp Enterprise, verified it worked, and took another snapshot (#2).  This will be my “gold” image.

    We are now ready to conduct a ThinApp Capture.  First, fire up ThinApp: Start -> Programs -> VMware -> ThinApp Setup Capture.

    You will see the Welcome Screen, hit Next.

    Select Prescan, in most cases.

    Once the scan is complete, we can install the application; in our case it will be Firefox.

    Install Firefox as usual, and then verify it works.  At that point, click Postscan in the ThinApp window.

    I selected Mozilla Firefox.exe as the only entry point for our ThinApp.  In short, Entry Points are the Windows executables that allow the launch of the ThinApp.  For a more detailed description, check out this ThinApp team blog entry.

    Select the user groups that have permissions to run this ThinApp.  If this machine were connected to a Windows domain, AD groups could be selected here.  ThinApp permissions could then be managed via AD.  Very cool!

    I told our ThinApp to run in WriteCopy mode for security purposes.

    Place the sandbox on a Windows share as described with (*) above.  We do this to allow for rollback of the VM to test the Firefox ThinApp, and keep our data intact.

    On the next few screens, select “No, Do not send info to VMware,” and Next on the plugin section.

    Change the Inventory Name from Mozilla Firefox 3.y.z to Mozilla Firefox 3.  This way we can easily upgrade .y.z versions, and have separate trees for each x version.

    Create the Package Settings with both the EXE entry point we selected above, as well as an MSI file.

    If you want to poke around on the build screen, go ahead.  I hit Build.

    The build is complete!

    Copy the Captures directory back to your file share.  The EXE and MSI will be found in the bin directory.

    Roll back your VM, re-mount the file share, and test the EXE.  Congrats, you just built your first ThinApp!


    VCAP-DCD Round-Up

    January 12th, 2011

    I sat for the VMware Certified Advanced Professional 4 – Datacenter Design (VCAP4-DCD) recently.  This was my first crack at a “design” test, and I didn’t really know what to expect other than the few public documents VMware has distributed.  So I put together this round-up.

    I do suggest sitting for the vSphere Design Workshop  http://mylearn.vmware.com/mgrreg/courses.cfm?ui=www&a=one&id_subject=13754.  It is a good review for the test.  My review can be found here: http://philthevirtualizer.com/2010/07/12/vmware-vsphere-design-workshop/

    Review the VCAP-DCD blueprint: http://mylearn.vmware.com/register.cfm?course=76644

    Watch the Exam UI demo: http://mylearn.vmware.com/courseware/82525/VCAPDCD_Tutorial.swf

    Look at questions by fellow VCAP candidates on the VCAP communities site: http://communities.vmware.com/community/vmtn/certedu/certification/vcap

    Another round-up page:

    http://www.seancrookston.com/2011/01/12/vmware-vsphere-design-workshop/

    And be patient.  You have 4 hours to work through the exam.  Best of luck!

    Update:  I have been assigned VCAPDCD-120!


    The Great Road Trip to the Cloud

    December 22nd, 2010

    Cloud computing is one of the new buzz words of the tech industry.  Everyone is jumping on the bandwagon.  The adoption of virtualization in the Enterprise has led to the rise of Cloud.  Cloud has even gone mainstream with Microsoft’s “To the Cloud” ad campaign.

    I became interested in Cloud when I worked at a SaaS company.  At the time we had to graft three different environments together due to acquisition.  I started to think of a better way to standardize on an application server, Operating System and platform.  In effect we were dealing with a 3x3x3x3x3x3x3 syndrome.  We had 3 different web servers, 3 application servers, 3 operating systems, 3 database platforms, 3 SANs, 3 networks and 3 sets of hardware.  It was painful.

    I stumbled upon the blog of Don MacAskill from a service called SmugMug (http://www.smugmug.com).  He wrote about his version of SkyNet to elastically extend his environment to Amazon AWS.  Needless to say it was a turning point, sort of like when I heard Led Zeppelin I for the first time.

    A Short History

    Virtualization is not new technology.  In fact, it has its roots in Mainframes.  The tech industry is a circular beast.  Central computing with dumb terminals gave way to distributed computing, client/server, and now a hybrid where data can be found on multiple hubs, and a combination of smart and dumb spokes.  The industry also realized that running a data center is not an easy task.  Running multiple data centers incurs huge expense.  Thus, the rise of co-location.  Business realized it could be a cheaper proposition to pay someone else to do some of the dirty work (space, power, cooling, physical security), all the way up to a managed service.

    Business then realized it was still booking Capital Expense (CapEx) and Operational Expense (OpEx) in dealing with co-lo.  Servers were not being used as much as expected.  When growth hit unexpectedly, giant road blocks presented themselves in acquiring gear fast enough, finding space, and still staying within the original co-lo agreement.

    Virtualization nudges itself into the equation because people realized that the focus should be on the application, not on an infrastructure that is a) expensive and b) underutilized.  If you want to focus solely on your application stack, you can now do that.  If you don’t want to go through CapEx to buy infrastructure, you can easily lease CPU time, in effect, from the cloud.

    Cloud

    So now you may ask yourself “what is cloud computing?”  Good question.  A good answer: It all depends on who you ask.

    I’ll give you my opinion on the state of Cloud.

    • Public
    • Private
    • Hybrid
    • SaaS
    • IaaS
    • PaaS
    • AaaS

    Public: The cloud is hosted by a third-party, somewhere on the Internet.

    Private: The cloud is hosted inside the firewalls of the business.

    Hybrid: A grafting of resources from Public and Private clouds, used to augment the infrastructure.  In short, if Public and Private are two circles in a Venn diagram, their intersection is Hybrid.

    SaaS: It could be argued that Software as a Service (SaaS) was the first of the new generation of infrastructure that begat cloud.  A person or business consumes a resource that is hosted, and possibly sold, by a third-party.  Twitter, Facebook and World of Warcraft all fall into this category.  The SaaS provider usually built their own web, application and database servers, storage and network, most likely at great cost.  The environment may have been self-hosted, or in a co-lo.

    IaaS: I believe technology developed by VMware has led to Infrastructure as a Service (IaaS).  I know IBM, Sun and HP have been doing virtualization for years, but only on high-end gear.  VMware was the mainstream player that rammed it down everyone’s throats, turning cheap x86-based servers into powerhouses.  Servers went from scale out to scale up/scale out configurations.  We need bigger, but fewer.  Short provisioning cycles and chargeback models all help to turn IaaS into a business generator, and less of a budget black hole.  Amazon AWS is probably the biggest player in Public Cloud IaaS.

    PaaS: PaaS provides an infrastructure as a bundled stack, where infrastructure is abstracted and is presented as a consumable resource.  It seems to me that VMware’s vCloud Director is going to allow business to provision the private cloud, and sell resources to its internal, and external customers.

    AaaS: I count the App as a Service to be a power-play by vendors.  They give application developers a fully abstracted platform, and expose certain pieces by API calls.  The users and developers on top of this platform do not care at all how the plumbing works, only that it does.  Google App Engine, Microsoft Azure and Salesforce are big players in this arena.  VMware and Red Hat are making in-roads with their latest purchases.

    Conclusion

    The race to the cloud includes a tipping-point for business when consuming public-cloud resources becomes more expensive than building a private-cloud.  There are always use-cases for all the current cloud types I have listed.  Industry is trying to build partnerships to allow private cloud application stacks to migrate to public, and vice-versa.  The technology is not ready as of the end of 2010, but by mid-2011 I do believe we will see the beginnings of true migration paths toward Hybrid clouds and active-active infrastructure.

    This blog post will be a living document as things change.  Stay tuned!


    vCenter 4.1 Upgrade, Datastore Permission Problem

    October 26th, 2010

    After completing a recent upgrade of Virtual Center from 2.5 to vCenter 4.1, we hit an initial problem where we could not allocate Virtual Disks on Datastores.  We had to add the Allocate Space bit to Datastore Privileges for our custom roles.  This can be found under Home->Administration->Roles.  This problem occurs when you have created specific roles in 2.x and move to (at least) 4.1.  The Virtual-Jay blog also has a post about this: http://virtual-jay.blogspot.com/2010/01/watch-out-for-datastore-permissions-in.html


    PowerCLI – Generate Count of Running VM’s

    September 8th, 2010

    If you are like me, you have virtual machines in different states in your virtual environment: running, paused, or powered-off.  If you have ever been asked “How many VMs do we have,” and you know the right answer is technically not what vCenter lists as total VMs, run the following PowerCLI script:

    $vcounter=0
    
    (get-vm )| %{
      $vm = $_
      if( get-vmguest -VM $vm.Name |where-object {$_.State -eq "Running"}){
        $vcounter++
        }
      }
    
    echo $vcounter
    
    

    Let’s break down line 5, where all of the magic happens.  We are running get-vmguest on the current VM pulled from get-vm on line 3, and then determining if its state is “Running.”  PowerCLI and vCenter differ in how they display the state of a VM: “Running” or “Not-Running” in PowerCLI vs “Powered On” or “Powered Off” in vCenter.

    Voila, you now have the count of virtual machines that are Powered-On and theoretically doing work.


    OS X, VMware Fusion, Apple Bootcamp

    August 27th, 2010

    I have been an Apple convert since the second generation iPod and eventually moved to a MacBook Pro over a year ago.  I have been using VMware Fusion on OS X ever since, and it is frankly a “killer app.”  I recently played with Bootcamp to dual boot in to Windows 7.

    Fusion provides the bells and whistles found in VMware Workstation, including Unity mode.   Unity presents your guest OS applications with the look and feel of OS X apps.  All of your app windows are integrated and can be independently min/maximized.  Brilliant.

    Fusion really shines when Bootcamp is employed.  Bootcamp is Apple’s instantiation of multi-boot, and frankly works well.  Fusion will automagically detect a Bootcamp-based OS, in my case Windows 7, and make it available to launch.  There is no need to conduct the import; the machine is simply listed in available machines.  At first run, Fusion will install tools and prep the guest OS to run virtualized hardware.  Booting the OS via Bootcamp will allow the OS to still run on the bare-metal.

    There have been blog posts, and a VMware KB, outlining the requirement to license Windows 7 in bare-metal mode and then again when virtualized via Fusion, because the underlying hardware has technically changed (bare vs virtualized).  However, I was not prompted.