When I was writing an installation document on vSphere, I thought I would have some friends on Twitter check the doc and comment on it. I received a lot of responses and decided to make a blog post out of it, so here it is.
vSphere default installation settings
When I’m out in the field installing vSphere 4, I want to make sure I install it the same way for each customer. Of course I also want to make sure I use the best practices, but a best practice will work in 90% of the cases and sometimes a best practice doesn’t work at all for your special case. What is important when applying these best practices: Keep thinking!!! But if you make changes, document them in the design and explain why the change was made, so that in a year when you’re with that customer upgrading to vSphere 5 (just kiddin’) you know why you changed the default.
Also remember that my way is not the only way. Especially in the partitioning section there is a lot of discussion on what is best and I think it is a discussion that will never end. The same goes for the HA settings: whether to leave VMs powered on or shut them down in response to isolation. Duncan Epping wrote the HA deepdive guide and it is a must read for anyone configuring HA. See: http://www.yellow-bricks.com/vmware-high-availability-deepdiv/. Again, don’t just apply my settings without thinking about whether they fit your environment.
Here we go!
How to configure a default vSphere 4 host.
– Use strong root password
– Create at least one local user account on the host
Partitioning of local disk during install:
- / – 5120MB
- Swap – 1600MB
– Extended Partition:
- /var – 4096MB
- /home – 2048MB
- /opt – 2048MB
- /tmp – 2048MB
– For the local datastore use the name: <esxhostname>-local.
– When NOT using the full local disk to install ESX on, create a VMFS that is at least a few GB bigger than the partitions you define. Otherwise vCenter will keep on reporting that your datastore is over 75% usage.
– Configure at least two NTP servers, e.g. 0.europe.pool.ntp.org and 1.europe.pool.ntp.org
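To put a number on "a few GB bigger", here is a minimal Python sketch that adds up the partition sizes listed above and works out the smallest datastore that stays below the 75% usage warning. It assumes the partitions are the only thing stored on the local VMFS; the function name and the `warn_fraction` parameter are made up for this example.

```python
import math

# Partition sizes in MB, taken from the list above.
PARTITIONS_MB = {
    "/": 5120,
    "swap": 1600,
    "/var": 4096,
    "/home": 2048,
    "/opt": 2048,
    "/tmp": 2048,
}


def min_datastore_mb(partitions_mb, warn_fraction=0.75):
    """Smallest datastore size (MB) that keeps usage strictly below warn_fraction.

    Assumption: the partitions are the only content on the datastore.
    """
    used = sum(partitions_mb.values())
    # floor + 1 guarantees usage is strictly below the threshold,
    # even when used / warn_fraction happens to be a whole number.
    return math.floor(used / warn_fraction) + 1


total_mb = sum(PARTITIONS_MB.values())
needed_mb = min_datastore_mb(PARTITIONS_MB)
print(f"partitions: {total_mb} MB, minimum datastore: {needed_mb} MB")
```

With the sizes above, the partitions add up to 16960 MB and the datastore needs to be at least 22614 MB, so roughly 5.5 GB of headroom. If you put anything else on the local datastore, add it to the total first.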
Network configuration during install:
– Setting IP address, default GW, etc.
– Configure DNS during install
– Configure hostname during install. It is difficult to change the linked certificates after a name change!
– Do not use hosts files to solve your HA problems. That was ESX 3.5
– Do not change the COS memory assignment (default = 300 MB) like you used to do in ESX 3.x. With vSphere 4, this value is automatically changed by vSphere based on the amount of RAM the host has.
– Rename the “Service Console” portgroup to “pg-cos” for simpler use in scripts
– Optional: Create a second service console named “pg-cos2” for heartbeats, to avoid false positives that would trigger an HA event
– Set default number of ports on a vSwitch to 120. It’s not the 120 that is important, but 56 can be too small when running many VMs on a host and can cause strange VMotion problems.
– Unload the VMFS2 driver, which is unfortunately still needed in vSphere.
– In the BIOS of the host enable NUMA and the Intel-VT or AMD-V features, and for Nehalem-type CPUs and higher also enable HyperThreading.
– Service Console network and VMotion network can be combined on one vSwitch.
- Portgroup pg-cos will have vmnic0 active, vmnic1 standby
- Portgroup pg-vmotion will have vmnic0 standby, vmnic1 active
- Portgroup pg-cos2 (optional) will have vmnic0 standby, vmnic1 active (haven’t tried it like this, may not let you put two service consoles on same vSwitch)
– Default loadbalancing on vSwitch is Virtual Port ID. This is the recommended setting.
– Create a vSwitch without physical NICs for a quarantine network. Add a portgroup called pg-quarantine
– Virtual Machine startup/shutdown -> Do not set at host level, only at HA cluster level
– iSCSI targets -> use static discovery as much as possible
– VMFS block size of 8MB
– Advanced settings: Only if NFS is being used there will be a few advanced settings (Check NetApp guides)
- Promiscuous mode: Reject
- Mac address changes: Accept
- Forged Transmit: Accept
– Failover and Load balancing
- Port ID
- Network failure detection: Link status only
- Notify switches: Yes
- Failback: Yes
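The active/standby layout for the combined Service Console and VMotion vSwitch is easy to get backwards when you type it in by hand. Below is a small Python sketch that writes the layout down as data and checks two things: every portgroup has a standby uplink to fail over to, and no NIC is active for two portgroups at once. It talks to nothing on the host; the `TEAMING` structure and `check_teaming` function are made up for this example, and the optional pg-cos2 is left out.

```python
# Teaming layout from the list above (pg-cos2 is optional and left out here).
TEAMING = {
    "pg-cos": {"active": ["vmnic0"], "standby": ["vmnic1"]},
    "pg-vmotion": {"active": ["vmnic1"], "standby": ["vmnic0"]},
}


def check_teaming(teaming):
    """Return a list of problems found in an active/standby uplink layout."""
    problems = []
    for name, nics in teaming.items():
        active, standby = set(nics["active"]), set(nics["standby"])
        if not active:
            problems.append(f"{name}: no active uplink")
        if not standby:
            problems.append(f"{name}: no standby uplink, so no failover")
        if active & standby:
            problems.append(f"{name}: uplink listed as both active and standby")

    # Under normal conditions each NIC should carry only one traffic type.
    active_owner = {}
    for name, nics in teaming.items():
        for nic in nics["active"]:
            if nic in active_owner:
                problems.append(
                    f"{nic}: active for both {active_owner[nic]} and {name}"
                )
            active_owner[nic] = name
    return problems


for line in check_teaming(TEAMING) or ["layout looks consistent"]:
    print(line)
```

The point of the crossed layout is that console and VMotion traffic each get their own NIC while both are up, and share the surviving NIC only when one fails.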
Cluster level settings:
– HA Configuration settings (Remember to look at Duncan Epping Deepdive HA section)
- Enable host monitoring = active
- Admission Control = Enable
- Admission Control policy = Percentage 25% (adapt to specific situation)
- Virtual Machine startup/shutdown -> Domain Controllers, SQL Server that holds the vCenter DB, vCenter at high level.
- VM monitoring not enabled unless specifically mentioned in design
- Isolation response is Shutdown
- When using iSCSI set the isolation response to “Power Off” and also create a secondary service console running on the same vSwitch as the iSCSI network to detect an iSCSI outage and avoid false positives.
- Add das.failuredetectiontime with a value of 60000 (60 secs) to ride out possible spanning tree protocol events at the switch level; this is further mitigated by setting Cisco ports to portfast mode.
– DRS Configuration settings
- Fully automated with default threshold of 3 (conservative = 1, aggressive = 5)
- Rules: Design should specify what VMs to keep together or apart
- Virtual Machine Options -> Exclude the vCenter server and the connected SQL Server from DRS. Set to “disabled”. Always place vCenter on the first ESX host in the cluster. When first host has to be set to maintenance mode, move to second host. For SQL server, place it on the second host and move to first host when in maintenance mode. Also see: http://www.gabesvirtualworld.com/how-to-quickly-recover-from-disaster/
- Virtual machine options under DRS – Make sure to exclude any VM that is using Microsoft clustering (MSCS or Windows Failover Clusters) in order to maintain support with both Microsoft and VMware.
- Power Management (DPM) -> configure iLO settings in a routed network. ESX hosts broadcast the WOL packet over the VMotion network.
– Set VMware EVC mode according to design. Remember to set EVC before running VMs in the cluster, since it can only downgrade CPUs / hosts when VMs are powered off.
– Swapfile location remains default with the VM
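The 25% admission control figure above is marked "adapt to specific situation". One way to adapt it, sketched below in Python, is to reserve the share of the cluster that the hosts you want to survive losing represent. This rule of thumb is my assumption, not something taken from the HA deepdive, and it only holds when all hosts are roughly the same size. The function name is made up for this example.

```python
import math


def reserved_percentage(hosts, host_failures=1):
    """Percentage of cluster resources to reserve for failover.

    Assumption: all hosts are about equal in size, so losing
    `host_failures` hosts removes host_failures / hosts of the capacity.
    The result is rounded up so the reservation is never too small.
    """
    if hosts < 2:
        raise ValueError("an HA cluster needs at least two hosts")
    if not 0 < host_failures < hosts:
        raise ValueError("host_failures must be between 1 and hosts - 1")
    return math.ceil(100 * host_failures / hosts)


for n in (3, 4, 8):
    print(f"{n} hosts, 1 failure tolerated: reserve {reserved_percentage(n)}%")
```

Under that assumption, 25% matches a four-host cluster that should survive one host failure. With three hosts you would need 34%, and with eight hosts 13% would do.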