(SMA) Support Tip Proactive Forum - How to fix a worker node that has lost its connection to the master

If one of the worker nodes loses connectivity with the master, run kubectl get nodes on the master and check the STATUS column. A disconnected worker typically shows NotReady.
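The status check can be narrowed to the nodes that need attention. The sketch below is only an illustration: the function name not_ready_nodes is hypothetical, and it assumes the default column layout of kubectl get nodes (NAME first, STATUS second).

```shell
#!/bin/sh
# Print the names of nodes whose STATUS column does not start with "Ready".
# Reads the output of `kubectl get nodes --no-headers` on stdin.
# Assumption: NAME is column 1 and STATUS is column 2 (the default layout).
not_ready_nodes() {
    awk '$2 !~ /^Ready/ { print $1 }'
}

# Usage on the master node:
#   kubectl get nodes --no-headers | not_ready_nodes
```

A cordoned but healthy node reports Ready,SchedulingDisabled, so it is not listed by this filter.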

Workarounds: There are two ways to fix this issue 

Option 1: On the master node, run the commands below

kubectl cordon <Master IP>  # cordon marks the node unschedulable, so no new pods are scheduled onto it

kubectl drain <Worker1 IP> --force --grace-period=0 --ignore-daemonsets --delete-local-data 

kubectl drain <Worker2 IP> --force --grace-period=0 --ignore-daemonsets --delete-local-data

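The drain command evicts or deletes the pods running on the node.

Note: drain does not delete pods that are not managed by a controller (ReplicationController, ReplicaSet, Job, DaemonSet or StatefulSet) unless you use --force. On kubectl 1.20 and later, --delete-local-data is deprecated in favor of --delete-emptydir-data.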

Log in to the worker VM that lost connectivity and run the commands below

1)      systemctl status kubelet.service # check whether kubelet is stopped (inactive/dead)

2)      systemctl start kubelet.service # start kubelet
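The two kubelet steps above can be combined so that kubelet is started only when it is not already running. This is a minimal sketch, assuming a systemd-based worker and root privileges; the function name ensure_kubelet_running is hypothetical.

```shell
#!/bin/sh
# Start kubelet only if systemd does not report it as active.
# Assumption: the unit is named kubelet.service, as in the steps above.
ensure_kubelet_running() {
    if systemctl is-active --quiet kubelet.service; then
        echo "kubelet already active"
    else
        echo "starting kubelet"
        systemctl start kubelet.service
    fi
}

# Usage on the affected worker VM (as root):
#   ensure_kubelet_running
```

If kubelet does not stay active after the start, systemctl status kubelet.service shows the most recent log lines for the unit.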

Go back to the master node and run

kubectl uncordon <Worker1 IP> # uncordon makes the node schedulable again

kubectl uncordon <Worker2 IP>

kubectl uncordon <Master IP>

The lost node and the pods it hosted should then come back.
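To confirm the recovery rather than assume it, the node condition and the pods placed on the node can be checked from the master. The sketch below is one possible way to do that; the function name verify_node_back and the 300 second timeout are assumptions, and the node name must match what kubectl get nodes shows (in this post, the node's IP).

```shell
#!/bin/sh
# Wait until the node reports the Ready condition, then list the pods on it.
# $1: node name exactly as shown by `kubectl get nodes`.
verify_node_back() {
    node="$1"
    # Block until the node is Ready, or fail after the (assumed) timeout.
    kubectl wait --for=condition=Ready "node/$node" --timeout=300s || return 1
    # Show every pod currently assigned to that node, across all namespaces.
    kubectl get pods --all-namespaces -o wide --field-selector "spec.nodeName=$node"
}

# Usage on the master node:
#   verify_node_back <Worker1 IP>
```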

Option 2: As a last resort, uninstall CDF from the worker node and re-add the worker node from the management portal.
