Like many of you, I downloaded the VMware vSphere 5.1 bits as soon as they were available and started upgrading my homelab to vSphere 5.1. My first step was a fresh install of vCenter Server 5.1 and AutoDeploy. Once that was finished and I had added the new ESXi 5.1 bundle to my AutoDeploy set, I rebooted the first host and ... nothing. The host wouldn’t boot into the new ESXi 5.1 and halted after loading iPXE. iPXE is the replacement for the PXE boot loader used by AutoDeploy 5.0.
Investigating the issue, I learned that the host had received its reserved IP address, had found the TFTP server and had started downloading the iPXE image, but then somehow stopped. The logs of my TFTP server showed the same thing: the host connected and started downloading the image, but then aborted the download. Maybe something was wrong with the binaries in the TFTPRoot directory?
To test whether AutoDeploy worked at all, I created a new VM and had it boot from the network. It picked up an IP address, contacted the TFTP server, loaded all binaries and started booting ESXi. So AutoDeploy and the TFTPRoot files must be OK.
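The same check can also be made without a VM by pulling the boot file from the TFTP server directly. The following is a minimal sketch of a TFTP read (RFC 1350, octet mode), not anything that ships with AutoDeploy. The server address and file name in the usage comment are placeholders; use whatever your DHCP scope hands out as next-server and boot file name. It does no retransmission, so a stalled transfer simply ends in a socket timeout.

```python
import socket
import struct

# TFTP opcodes from RFC 1350
OP_RRQ, OP_DATA, OP_ACK, OP_ERROR = 1, 3, 4, 5
BLOCK_SIZE = 512  # default TFTP block size when no options are negotiated


def build_rrq(filename, mode="octet"):
    """Read request: opcode 1, filename, NUL, transfer mode, NUL."""
    return (struct.pack("!H", OP_RRQ)
            + filename.encode("ascii") + b"\0"
            + mode.encode("ascii") + b"\0")


def build_ack(block):
    """Acknowledgement: opcode 4 followed by the block number."""
    return struct.pack("!HH", OP_ACK, block)


def parse_packet(packet):
    """Split a DATA or ERROR packet into (opcode, block_or_error_code, payload)."""
    opcode, number = struct.unpack("!HH", packet[:4])
    return opcode, number, packet[4:]


def tftp_fetch(host, filename, port=69, timeout=5.0):
    """Download one file over TFTP and return its bytes.

    Raises socket.timeout if the server stops answering and IOError if the
    server sends a TFTP error packet.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    received = bytearray()
    expected = 1
    try:
        sock.sendto(build_rrq(filename), (host, port))
        while True:
            # The server answers from a new port; always reply to that peer.
            packet, peer = sock.recvfrom(4 + BLOCK_SIZE)
            opcode, number, payload = parse_packet(packet)
            if opcode == OP_ERROR:
                message = payload.rstrip(b"\0").decode("ascii", "replace")
                raise IOError("TFTP error %d: %s" % (number, message))
            if opcode != OP_DATA:
                raise IOError("unexpected TFTP opcode %d" % opcode)
            sock.sendto(build_ack(number), peer)
            if number != expected:
                continue  # duplicate block: re-acknowledged above, data ignored
            received += payload
            expected = (expected + 1) % 65536
            if len(payload) < BLOCK_SIZE:
                return bytes(received)  # a short block ends the transfer
    finally:
        sock.close()


# Example (placeholder values, adjust to your own lab):
#   image = tftp_fetch("192.0.2.10", "name-of-your-boot-file")
#   print(len(image), "bytes received")
```

If this download completes from a normal workstation, the TFTP server and the files in TFTPRoot are fine and the problem sits with the booting host.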
Maybe the NICs weren’t supported anymore? Yeah, that is the pain of running a whitebox. To test this I installed ESXi 5.1 on USB and booted my host from USB. The NICs showed up fine and I had no trouble at all. Detail: the NIC I use to PXE boot is a Realtek 8111B/C/D adapter, reported by ESXi 5.1 as Realtek 8168 with driver R8168. That same day I also received a new whitebox server with 32GB of RAM and almost the same Realtek adapter, the Realtek 8111E. I connected the new host to my network, had it PXE boot and wrote down the MAC address of the NIC. I then set the MAC address in DHCP as a reservation, created the AutoDeploy RuleSet and waited. It booted without issues !!! And after it was loaded, ESXi 5.1 reported the NIC as Realtek 8168 and loaded the same driver: R8168. Hmm, so it couldn’t be the NIC type that causes this issue.
To be really sure, I enabled PXE boot on the “Intel Pro 1000 PT Dual Port Server” NIC in my old whitebox. Again, iPXE boot failed. I then moved the Intel card from my old whitebox to my new whitebox and IT WORKED!!! That confirmed the NICs are not the issue. The only other thing I could think of was the mainboard causing the problem; unfortunately, updating the BIOS of my mainboard didn’t help at all.
Let’s fight iPXE then. The iPXE website has little information on problems with iPXE. I hung around on the IRC channel but got no help there, and browsing through the docs and the FAQ turned up nothing.
Can I replace iPXE with good old PXE? Yes you can !!!! I called on Twitter for someone to send me the contents of their AutoDeploy 5.0 TFTPRoot directory and promptly received a response from Roman ( @medea61 https://twitter.com/medea61 ). After placing the AutoDeploy 5.0 contents in the TFTPRoot directory and editing the tramp file to reflect the IP address of my AutoDeploy server, all hosts booted without issues. I’m still not sure why iPXE fails on some hosts, but now at least there is a workaround.
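When the files come from somebody else's environment, the tramp file still points at their AutoDeploy server. A text editor is all you need, but the edit can also be scripted. This is a minimal sketch that makes no assumption about the layout of the tramp file: it only lists the IPv4 addresses it finds and swaps one for another. The path and both addresses in the usage comment are hypothetical.

```python
import re

# Dotted quad that is not glued to other digits or dots on the left,
# nor to other digits on the right.
IPV4 = re.compile(r"(?<![\d.])(?:\d{1,3}\.){3}\d{1,3}(?!\d)")


def find_ipv4(text):
    """Return the distinct IPv4-looking strings in text, sorted."""
    return sorted(set(IPV4.findall(text)))


def replace_ipv4(text, old_ip, new_ip):
    """Replace every whole occurrence of old_ip with new_ip.

    The boundary checks keep 10.0.0.5 from matching inside 110.0.0.5
    or 10.0.0.50.
    """
    pattern = re.compile(r"(?<![\d.])" + re.escape(old_ip) + r"(?!\d)")
    return pattern.sub(lambda match: new_ip, text)


# Example (hypothetical path and addresses, adjust to your own setup):
#   path = r"C:\TFTPRoot\tramp"
#   with open(path) as handle:
#       content = handle.read()
#   print(find_ipv4(content))            # shows the address currently in use
#   with open(path, "w") as handle:
#       handle.write(replace_ipv4(content, "10.0.0.5", "192.168.1.20"))
```

Listing the addresses first shows which server the received file points at before anything is overwritten.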