Micro Focus Contributor

Issue with SMAX 2018.08 pods failing to come up and kubectl returning a connection refused error

Hello to all,

Thank you in advance for your time and consideration.

I have 4 machines (one master and three workers) with 32 GB RAM each. A month ago I managed to install SMAX 2018.08; however, I am now no longer able to log in to the Suite.

While troubleshooting, I found that some pods are down. I did not succeed in bringing them back up despite deleting and recreating the pods, stopping and starting CDF, stopping and starting the Suite, and running kube-redeploy.sh.
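
For reference, these are roughly the commands behind those attempts (kube-stop.sh and kube-start.sh are my assumption for the stop/start scripts in the CDF bin directory, alongside kube-redeploy.sh):

kubectl delete pod <pod-name> -n core    # the owning controller recreates the pod
$K8S_HOME/bin/kube-stop.sh               # stop CDF
$K8S_HOME/bin/kube-start.sh              # start CDF
$K8S_HOME/bin/kube-redeploy.sh           # redeploy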

Now, kubectl returns this error: The connection to the server pscel0192s1.swinfra.net:8443 was refused - did you specify the right host or port.
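
A quick way to confirm whether the endpoint is reachable at all (both commands are standard; adjust the hostname if needed):

systemctl status kubelet                                  # is the kubelet service itself running?
curl -k https://pscel0192s1.swinfra.net:8443/healthz      # probe the API server directly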

By the way, I edited /etc/hosts as a possible fix, but it made no difference.
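
The entry I added looks like the following (the IP is a placeholder, not my real address):

<master-IP>   pscel0192s1.swinfra.net pscel0192s1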

Does anyone have an idea of what I should do to fix this?

Thank you very much for your help.

Misaq, Acclaimed Contributor

Re: Issue with SMAX 2018.08 pods failing to come up and kubectl returning a connection refused error

This message sometimes just means that the management portal is not up yet.

You can run kube-status.sh to check whether the full CDF is running.

Micro Focus Contributor

Re: Issue with SMAX 2018.08 pods failing to come up and kubectl returning a connection refused error

Thanks for your reply.

I have run kube-status.sh, and full CDF is NOT started!

Here is the result:

[root@pscel0192s2 bin]# ./kube-status.sh
Get Node IP addresses ...

Master servers:  pscel0192s1.swinfra.net
Worker servers:  pscel0192s2.swinfra.net pscel0192s3.swinfra.net pscel0192s4.swinfra.net

Checking status on 16.46.38.210
--------------------------------------
[DockerVersion] Docker:v1.13.1 ....................................... Running
[KubeVersion] Client:v1.9.6  Server:v1.9.6 ........................... Running
[NativeService] docker ............................................... Running
[NativeService] docker-bootstrap ..................................... Running
[NativeService] kubelet .............................................. Running
[NativeService] kube-proxy ........................................... Running
[Etcd]
     ETCD on pscel0192s1.swinfra.net ................................. Running
[Vault]
     Vault on pscel0192s1.swinfra.net ................................ Running
[APIServer] API Server - https://pscel0192s1.swinfra.net:8443 ........ Running
[MngPortal] URL: https://pscel0192s1.swinfra.net:5443 ................ Inactive
[Process] flanneld ................................................... Running
[Process] kubelet .................................................... Running
[Bootstrap] kube_flannel ............................................. Running
[Node]
     (Master) pscel0192s1.swinfra.net ................................ Running
     (Worker) pscel0192s2.swinfra.net ................................ Running
     (Worker) pscel0192s3.swinfra.net ................................ Running
     (Worker) pscel0192s4.swinfra.net ................................ Running
[Pod]
   <pscel0192s1.swinfra.net>
     (core) apiserver-pscel0192s1.swinfra.net ........................ Running
     (core) controller-pscel0192s1.swinfra.net ....................... Running
     (core) scheduler-pscel0192s1.swinfra.net ........................ Running
   <pscel0192s2.swinfra.net>
   <pscel0192s3.swinfra.net>
   <pscel0192s4.swinfra.net>
[DaemonSet]
     (core) kube-registry-proxy ...................................... 1/4
     (core) nginx-ingress-controller ................................. 0/1
     (core) kube-dns ................................................. 1/1
     (core) kube-registry ............................................ 0/1
     (core) kubernetes-vault ......................................... 0/1
[Deployment]
     (kube-system) heapster-apiserver ................................ 1/1
     (core) idm ...................................................... 0/2
     (core) mng-portal ............................................... 0/1
     (core) suite-db ................................................. 0/1
     (core) cdf-apiserver ............................................ 0/1
     (core) suite-installer-frontend ................................. 0/1
     (core) itom-postgresql-default .................................. 0/1
[Service]
     (default) kubernetes ............................................ Running
     (core) default-postgresql-svc ................................... Running
     (core) idm-svc .................................................. Running
     (core) kube-dns ................................................. Running
     (core) kube-registry ............................................ Running
     (core) kubernetes-vault ......................................... Running
     (core) mng-portal ............................................... Running
     (core) suite-db-svc ............................................. Running
     (core) suite-installer-svc ...................................... Running
     (core) cdf-svc .................................................. Running
     (core) cdf-suitefrontend-svc .................................... Running
     (kube-system) heapster .......................................... Running
[Ping] kube-registry-proxy ........................................... noMngPortal
[NFS]
   <PersistentVolume: db-single>
     pscel0192s1.swinfra.net:/var/vols/itom/db-single-vol ............ Passed
   <PersistentVolume: itom-vol>
     pscel0192s1.swinfra.net:/var/vols/itom/core ..................... Passed
   <PersistentVolume: itsma1-db-backup-vol>
     pscel0192s1.swinfra.net:/var/vols/itom/db-backup-vol ............ Passed
   <PersistentVolume: itsma1-db-volume>
     pscel0192s1.swinfra.net:/var/vols/itom/itsma/db-volume .......... Passed
   <PersistentVolume: itsma1-db-volume-1>
     pscel0192s1.swinfra.net:/var/vols/itom/itsma/db-volume-1 ........ Passed
   <PersistentVolume: itsma1-db-volume-2>
     pscel0192s1.swinfra.net:/var/vols/itom/itsma/db-volume-2 ........ Passed
   <PersistentVolume: itsma1-global-volume>
     pscel0192s1.swinfra.net:/var/vols/itom/itsma/global-volume ...... Passed
   <PersistentVolume: itsma1-rabbitmq-infra-rabbitmq-0>
     pscel0192s1.swinfra.net:/var/vols/itom/itsma/rabbitmq-infra-rabbitmq-0  Passed
   <PersistentVolume: itsma1-rabbitmq-infra-rabbitmq-1>
     pscel0192s1.swinfra.net:/var/vols/itom/itsma/rabbitmq-infra-rabbitmq-1  Passed
   <PersistentVolume: itsma1-rabbitmq-infra-rabbitmq-2>
     pscel0192s1.swinfra.net:/var/vols/itom/itsma/rabbitmq-infra-rabbitmq-2  Passed
   <PersistentVolume: itsma1-rabbitmq-pro-rabbitmq-0>
     pscel0192s1.swinfra.net:/var/vols/itom/itsma/rabbitmq-pro-rabbitmq-0  Passed
   <PersistentVolume: itsma1-rabbitmq-pro-rabbitmq-1>
     pscel0192s1.swinfra.net:/var/vols/itom/itsma/rabbitmq-pro-rabbitmq-1  Passed
   <PersistentVolume: itsma1-rabbitmq-pro-rabbitmq-2>
     pscel0192s1.swinfra.net:/var/vols/itom/itsma/rabbitmq-pro-rabbitmq-2  Passed
   <PersistentVolume: itsma1-smartanalytics-volume>
     pscel0192s1.swinfra.net:/var/vols/itom/itsma/smartanalytics-volume  Passed
[DB] defaultdb ....................................................... Error

Client certificate expiration date: Nov  6 16:03:13 2019 GMT, 335 days left

ERROR: Full CDF is NOT started!!

Francis Feugue, Outstanding Contributor

Re: Issue with SMAX 2018.08 pods failing to come up and kubectl returning a connection refused error

That issue occurs most of the time because the CDF is down.

Can you share the output of the disk check command? From the terminal, run: df -h
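
If it is easier, checking just the mounts CDF uses should be enough (/var/lib/docker as the Docker data root is an assumption; adjust to your layout):

df -h / /var/lib/docker /var/vols/itom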

Micro Focus Contributor

Re: Issue with SMAX 2018.08 pods failing to come up and kubectl returning a connection refused error

I have made some attempts to get the pods up (restarting CDF, restarting the kubelet, deleting and recreating the YAML), unfortunately with no success.

I am quite new at this, so excuse my lack of knowledge.

Here are the pods that never reach Running status:

cdf-apiserver-6ccbcbb4d4-q26sn (Status = Pending)

idm-774bf84fb9-55qtt (Status = Pending)

itom-cdf-ingress-frontend-swdxd (Status = Pending)

itom-k8s-dashboard-65c466b967-sgf55 (Status = Pending)

itom-logrotate-pjw24 (Status = Pending)

itom-postgresql-default-55f6776d56-6lbw4 (Status = Pending)

kube-registry-fvmnv (Status = Pending)

kube-registry-proxy-mrtj7 (Status = CrashLoopBackOff)

 

Here are more details:

[root@pscel0192s1 output]# kubectl get pods -n core -o wide
NAME                                        READY     STATUS             RESTARTS   AGE       IP             NODE
apiserver-pscel0192s1.swinfra.net           1/1       Running            1          1d        16.46.39.98    pscel0192s1.swinfra.net
cdf-apiserver-6ccbcbb4d4-q26sn              0/2       Pending            0          4m        <none>         <none>
controller-pscel0192s1.swinfra.net          1/1       Running            1          29d       16.46.39.98    pscel0192s1.swinfra.net
fluentd-4zhr7                               1/1       Running            4          2d        172.16.67.12   pscel0192s2.swinfra.net
fluentd-5hq8z                               1/1       Running            3          2d        172.16.49.2    pscel0192s4.swinfra.net
fluentd-6vvq5                               1/1       Running            3          2d        172.16.61.2    pscel0192s3.swinfra.net
fluentd-m22dt                               1/1       Running            0          6m        172.16.31.13   pscel0192s1.swinfra.net
idm-774bf84fb9-55qtt                        0/2       Pending            0          1m        <none>         <none>
idm-774bf84fb9-8sn6b                        0/2       Pending            0          56s       <none>         <none>
itom-cdf-ingress-frontend-swdxd             0/1       Pending            0          3s        <none>         pscel0192s1.swinfra.net
itom-k8s-dashboard-65c466b967-sgf55         0/2       Pending            0          1m        <none>         <none>
itom-logrotate-2cj2z                        1/1       Running            4          29d       172.16.49.4    pscel0192s4.swinfra.net
itom-logrotate-2d7v9                        1/1       Running            5          29d       172.16.67.13   pscel0192s2.swinfra.net
itom-logrotate-hpmnq                        1/1       Running            4          29d       172.16.61.3    pscel0192s3.swinfra.net
itom-logrotate-pjw24                        0/1       Pending            0          2s        <none>         pscel0192s1.swinfra.net
itom-postgresql-default-55f6776d56-6lbw4    0/2       Pending            0          2m        <none>         <none>
kube-dns-zpktv                              3/3       Running            0          10m       172.16.31.15   pscel0192s1.swinfra.net
kube-registry-fvmnv                         0/2       Pending            0          0s        <none>         pscel0192s1.swinfra.net
kube-registry-proxy-2bbl2                   0/2       Pending            0          1s        <none>         pscel0192s1.swinfra.net
kube-registry-proxy-db2jt                   2/2       Running            946        29d       172.16.67.6    pscel0192s2.swinfra.net
kube-registry-proxy-mrtj7                   1/2       CrashLoopBackOff   1149       29d       172.16.49.6    pscel0192s4.swinfra.net
kube-registry-proxy-zg87k                   1/2       CrashLoopBackOff   947        29d       172.16.61.4    pscel0192s3.swinfra.net
kubernetes-vault-275m5                      0/1       Pending            0          2s        <none>         pscel0192s1.swinfra.net
mng-portal-7df9dd5678-h7srz                 0/2       Pending            0          5m        <none>         <none>
nginx-ingress-controller-fgkb4              0/1       Pending            0          3s        <none>         pscel0192s1.swinfra.net
suite-conf-pod-itsma-78b44bd99d-t72c7       2/2       Running            195        2d        172.16.67.8    pscel0192s2.swinfra.net
suite-db-6b4cd5567d-9rpdk                   0/2       Pending            0          2m        <none>         <none>
suite-installer-frontend-6c4c7bfdcf-hdsw5   0/2       Pending            0          2m        <none>         <none> 
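
To see why the Pending pods are not being scheduled, I am checking the events, for example:

kubectl describe pod cdf-apiserver-6ccbcbb4d4-q26sn -n core    # scheduling failures appear under Events
kubectl get events -n core --sort-by=.lastTimestamp            # most recent events last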

Micro Focus Contributor

Re: Issue with SMAX 2018.08 pods failing to come up and kubectl returning a connection refused error

Thank you for your reply.

Attached is the output of df -h on the master node.

Misaq, Acclaimed Contributor

Re: Issue with SMAX 2018.08 pods failing to come up and kubectl returning a connection refused error

Your resources are a bit low; I don't think you followed the storage specs.

Could you also run free -g (and make sure to follow the recommended sizing/memory allocation), as there may not be enough resources to spawn the pods.
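
When reading the output, the 'available' column is what the node can still hand out to pods:

free -g     # run on every node, master and workers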

https://docs.microfocus.com/itom/Service_Management_Automation_-_X:2018.08/Sizing-for-on-premises-deployment_19895859

 

For a deeper look:

https://docs.microfocus.com/itom/Service_Management_Automation_-_X:2018.08/Troubleshoot-CDF_22183990

Francis Feugue, Outstanding Contributor

Re: Issue with SMAX 2018.08 pods failing to come up and kubectl returning a connection refused error

I also think you are hitting a resource problem.

Try running these and check the result:
kubectl describe node pscel0192s1.swinfra.net
kubectl describe node <FQDN of worker node>

I also see that 98% of the disk space is already used.
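
If old container images are what is filling the disk, Docker 1.13 can report and reclaim that space; review what would be deleted before pruning (this is a suggestion, not an official cleanup procedure):

docker system df        # how much space images, containers and volumes use
docker system prune     # removes stopped containers, dangling images and unused networks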

Micro Focus Contributor

Re: Issue with SMAX 2018.08 pods failing to come up and kubectl returning a connection refused error

I have extended the disk space and restarted CDF, forcing the deletion of some pods. I have also run the RabbitMQ recovery script.

Now CDF is running, but the Suite is still not up, even though I started it three days ago (see attachment).

I checked the description of each pod, and the error is about probe failures: connection refused / connection timeout.
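
For each failing pod I am running roughly the following (pod and container names are placeholders):

kubectl describe pod <pod-name> -n core                       # probe failures show up under Events
kubectl logs <pod-name> -c <container> -n core --previous     # logs from the previously crashed container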

Has anyone encountered this kind of issue?

 

Madddy, Respected Contributor

Re: Issue with SMAX 2018.08 pods failing to come up and kubectl returning a connection refused error

Hi, 

I am new to SMAX. I had issues with the pods as well, but we fixed them by doing a hard reboot of the server, then the VMs, followed by the SMAX services.
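
After the reboot we brought things back with the CDF scripts (the kube-start.sh name is an assumption based on the standard CDF bin directory; verify on your install):

$K8S_HOME/bin/kube-start.sh     # run on the master first, then the workers
$K8S_HOME/bin/kube-status.sh    # confirm that full CDF is running again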

Regards,

Madhan
