Friday, October 24, 2025

[Holodeck] - issue fixed: services don't start in DHCP mode

Issue


Holodeck is a powerful toolkit designed to provide a standardized and automated method to deploy nested VMware Cloud Foundation (VCF) environments on a VMware ESX host or a vSphere cluster for homelab learning and testing.

All information about Holodeck is available here.

The appliance can be deployed with a static IP or with DHCP.
What happens if I deploy the appliance using DHCP, then shut it down, and when I power it back on, it receives a different IP address?
The answer is that after reboot, previously configured services may fail to start properly, causing the Kubernetes control plane to become unresponsive.

I retrieve the new IP address and attempt to access the appliance via the web interface ...


Solution


Disclaimer: Use it at your own risk.

As a quick fix, if the old IP address is still available, simply set it as a static IP in the network configuration and restart the network services. The pods will come back up fairly quickly.
If the old IP is unavailable, follow the steps below.
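The quick fix above amounts to switching the systemd-networkd profile to a static address. A minimal sketch of the file, assuming the interface is eth0 and the old lease was 192.168.1.70/24; the gateway here is hypothetical, and the DNS value mirrors the one used in the script at the end of this post (adjust all three to your network):

```ini
# /etc/systemd/network/50-static-en.network
[Match]
Name=eth0

[Network]
Address=192.168.1.70/24
Gateway=192.168.1.1
DNS=10.1.1.1
```

Then restart networking with systemctl restart systemd-networkd.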

1. Stop iptables

First of all, I stop the iptables firewall service so I can connect to the VM via SSH.
# systemctl stop iptables

2. Check status

Connect via SSH to the Holorouter and check the Kubernetes pods with the following command:
# kubectl get pods
What you can see from the image above is that the control plane was expecting a response from a different IP than the one we currently have.
Previous IP: 192.168.1.70
Current IP: 192.168.1.238
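To confirm which endpoint the control plane still expects, you can look at the server address baked into the kubeconfig. A self-contained sketch, simulating the relevant fragment in a temp file with the hypothetical values from this post (on the appliance you would grep the real /etc/kubernetes/admin.conf instead):

```shell
# Simulate the relevant fragment of /etc/kubernetes/admin.conf
# (hypothetical values; grep the real file on the appliance)
cat > /tmp/admin.conf <<'EOF'
apiVersion: v1
clusters:
- cluster:
    server: https://192.168.1.70:6443
EOF

# Extract the API server address kubectl is still pointing at
expected=$(grep -o 'https://[0-9.]*' /tmp/admin.conf)
echo "$expected"   # https://192.168.1.70
```

If this address differs from the VM's current IP, the symptoms described above are expected.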

I check the network configuration as well ...
# cat /etc/systemd/network/50-static-en.network
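When the appliance was deployed in DHCP mode, the file typically contains something like the fragment below. This is a hypothetical example; the exact contents on your appliance may differ:

```ini
[Match]
Name=eth0

[Network]
DHCP=yes
```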

3. Re-init the kubernetes control plane

To reinitialize the Kubernetes control plane and allow the API server and pods to function properly again, I created the following script (downloadable below). The script also changes the network settings from DHCP to static, using the new IP address obtained (in my case, 192.168.1.238).
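The script derives the new address by parsing the output of `ip -o -4 addr show eth0`. A quick self-contained illustration of that parsing, using a hypothetical sample line instead of the live command:

```shell
# Hypothetical sample of `ip -o -4 addr show eth0` output
line='2: eth0    inet 192.168.1.238/24 brd 192.168.1.255 scope global eth0'

# Field 4 is the CIDR address; strip the prefix length to get the bare IP
ip_addr=$(echo "$line" | awk '{print $4}' | cut -d/ -f1)
echo "$ip_addr"   # 192.168.1.238
```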

I create the new file in the root path:
# vi change-control-plane-ip.sh

I paste what you can see in the image (script below); I save and run the script...
# bash change-control-plane-ip.sh
If all went well, it should look something like the output shown in the picture.
Check the current state, pre-reboot:
As you can see from the image above, the pods are in an "Unknown" state.

4. Reboot and check results

I restart the appliance and perform the post-reboot check ...
# reboot

To check if the pods have powered up, I log in to the appliance and run the following command:
# kubectl get pods
If they are not all in a Running state yet, wait a moment until they are completely up.
When the pods are up and running, try connecting via the web.
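If you prefer to poll rather than re-run the command by hand, a small retry helper can wrap the check. This is a sketch; `true` in the last line is a stand-in for a real probe, such as verifying that `kubectl get pods` reports no non-Running pods:

```shell
# Retry a command up to $1 times, one second apart; succeed as soon as it does.
wait_until() {
  local tries=$1; shift
  local i=0
  until "$@"; do
    i=$((i+1))
    [ "$i" -ge "$tries" ] && return 1
    sleep 1
  done
}

# On the appliance, the probe might be (hypothetical):
#   wait_until 60 sh -c '! kubectl get pods --no-headers | grep -qv Running'
wait_until 3 true
```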

Boom!! It works


Below is the script used, change-control-plane-ip.sh:
# change-control-plane-ip.sh
# Stop Services
systemctl stop kubelet docker

# Backup Kubernetes and kubelet
mv -f /etc/kubernetes /etc/kubernetes-backup
mv -f /var/lib/kubelet /var/lib/kubelet-backup

# Keep the certs we need
mkdir -p /etc/kubernetes
cp -r /etc/kubernetes-backup/pki /etc/kubernetes
rm -rf /etc/kubernetes/pki/{apiserver.*,etcd/peer.*}

# Start docker
systemctl start docker

# Get IP address
IP=$(ip -o -4 addr show eth0 | awk '{print $4}' | cut -d/ -f1)

# Init cluster with new ip address
kubeadm init --control-plane-endpoint "$IP" --ignore-preflight-errors=all --v=5

# Verify result
kubectl cluster-info

# Change IP on the configuration file 
cp /etc/systemd/network/50-static-en.network /etc/systemd/network/50-static-en.network.backup 
cat > /etc/systemd/network/50-static-en.network << EOF
[Match]
Name=eth0

[Network]
Address=$(ip -o -4 addr show eth0 | awk '{print $4}')
Gateway=$(ip route show 0.0.0.0/0 dev eth0 | awk '{print $3}')
DNS=10.1.1.1
EOF
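One detail worth noting in the script: because the heredoc delimiter EOF is unquoted, the command substitutions expand when the file is written, so the literal current address ends up in the .network file. A tiny self-contained demonstration with a hypothetical address:

```shell
# With an unquoted heredoc delimiter, variables and $(...) expand at write time
addr='192.168.1.238/24'
cat > /tmp/demo.network <<EOF
[Network]
Address=$addr
EOF

grep '^Address=' /tmp/demo.network   # Address=192.168.1.238/24
```

Quoting the delimiter ('EOF') would instead write the text literally, without expansion.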

That's it.
