Kubernetes Advanced Authentication / Diagnostics

Working with Kubernetes is easy; this is just a tip on how to start diagnosing when a problem appears.

The problem analyzed here is a pod crashing because a filesystem is full, but this way of diagnosing is useful in any case.

The first thing you have to do is install kubectl. My recommendation is to use Linux (SLES/Ubuntu), or, if you are on Windows 10, use the Microsoft Store to install SLES or Ubuntu under the Windows Subsystem for Linux.

For the Linux subsystem (WSL) installation on Windows 10, please refer to the Microsoft documentation.

For the kubectl installation, please refer to the official Kubernetes documentation,

or follow these steps if it is Ubuntu:

$ curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
$ sudo touch /etc/apt/sources.list.d/kubernetes.list 
$ echo "deb http://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee -a /etc/apt/sources.list.d/kubernetes.list
$ sudo apt-get update
$ sudo apt-get install -y kubectl

Or use the following approach for SUSE.
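A minimal sketch, assuming curl and sudo are available; it installs the upstream kubectl binary (the method documented on kubernetes.io) instead of a distribution package:

# Download the latest stable kubectl binary (distro-agnostic, so it also works on SLES)
$ curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
# Install it with the usual permissions and verify
$ sudo install -o root -g root -m 0755 kubectl /usr/local/bin/kubectl
$ kubectl version --client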

For the Azure CLI installation, please refer to the official Microsoft documentation.

When you finish all these installations, the Azure admin must share the subscription ID and resource group, and you must execute these commands:

  • az account set --subscription XXXXXXXXXXXXXXXXXXX
  • az aks get-credentials --resource-group RG-servicios_kubernetes --name AA-devl
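As a hedged note: before the two commands above you normally need to sign in, and afterwards a quick sanity check confirms kubectl points at the intended cluster:

# Interactive browser sign-in to Azure
$ az login
# After az aks get-credentials, confirm which cluster/context kubectl is using
$ kubectl config current-context
$ kubectl cluster-info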

The first thing you have to do is find out your NAMESPACE and POD NAME (a quick filter for crashing pods is sketched after this list):

  • kubectl get deployments --all-namespaces=true
  • kubectl get pods -o wide
  • kubectl describe -n [NAMESPACE_NAME] pod [POD_NAME] > /tmp/runbooks_describe_pod.txt
  • kubectl logs [POD_NAME] --all-containers -n [NAMESPACE_NAME] > /tmp/runbooks_pod_logs.txt
  • kubectl logs [POD_NAME] --all-containers --previous -n [NAMESPACE_NAME] > /tmp/runbooks_previous_pod_logs.txt
  • kubectl logs <podname> -n <namespace> >> c:\temp\podnamelogs.txt
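A quick, hedged way to spot the crashing pods across all namespaces (it assumes a POSIX shell with grep available):

# Keep only pods that are not in a healthy Running/Completed state,
# e.g. the ones stuck in CrashLoopBackOff
$ kubectl get pods -A | grep -Ev 'Running|Completed'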

In these outputs (in particular the describe output) you will find something like the following:

Container ID: containerd://6c6db73beaf6b3194bc5fb5f9e5caf9ba2d3b243b5d9c47495995e338e3500b4
Image: mfsecurity/aaf-searchd:6.3.2.0
Image ID: docker.io/mfsecurity/aaf-searchd@sha256:d2c0169848584280957393b8632ab945501e1a0f4725e86bc96f74e43bc44d6c
Ports: 9200/TCP, 9300/TCP
Host Ports: 0/TCP, 0/TCP
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated


Reason: Error
Exit Code: 1


Started: Tue, 07 Dec 2021 14:48:38 +0000
Finished: Tue, 07 Dec 2021 14:48:40 +0000
Ready: False
Restart Count: 187
Environment:
bootstrap.memory_lock: true
Mounts:
/opt/AuCore/data from aucore-data (rw)
/usr/share/elasticsearch/config from searchd-config (rw)
/usr/share/elasticsearch/data from searchd-data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-jcbgh (ro)

This is the indication of an error:

Reason: Error
Exit Code: 1

Now we know the problem is in the searchd container.

The next step is to find the reason for the problem with the command:

  • kubectl logs <podname> -c searchd -n <namespace>

This command will give us the detail:

Waiting for /opt/AuCore/data/es_data.json (setup reporting step of aucore)
Stalling for Elasticsearch...
[watcher] start monitor /opt/AuCore/data/searchd.RESTART.watcher
[2021-12-07T14:53:53,070][INFO ][o.e.n.Node ] [NODE-1] initializing ...
[2021-12-07T14:53:53,135][WARN ][o.e.b.ElasticsearchUncaughtExceptionHandler] [NODE-1] uncaught exception in thread [main]
org.elasticsearch.bootstrap.StartupException: java.lang.IllegalStateException: Failed to create node environment
at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:136) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:123) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:70) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:134) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.cli.Command.main(Command.java:90) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:91) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:84) ~[elasticsearch-5.6.9.jar:5.6.9]
Caused by: java.lang.IllegalStateException: Failed to create node environment
at org.elasticsearch.node.Node.<init>(Node.java:268) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.node.Node.<init>(Node.java:245) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:233) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:233) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:342) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:132) ~[elasticsearch-5.6.9.jar:5.6.9]
... 6 more


Caused by: java.io.IOException: No space left on device


at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[?:?]
at sun.nio.ch.FileDispatcherImpl.write(FileDispatcherImpl.java:60) ~[?:?]
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) ~[?:?]
at sun.nio.ch.IOUtil.write(IOUtil.java:65) ~[?:?]
at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:211) ~[?:?]
at java.nio.channels.Channels.writeFullyImpl(Channels.java:78) ~[?:1.8.0_252]
at java.nio.channels.Channels.writeFully(Channels.java:101) ~[?:1.8.0_252]
at java.nio.channels.Channels.access$000(Channels.java:61) ~[?:1.8.0_252]
at java.nio.channels.Channels$1.write(Channels.java:174) ~[?:1.8.0_252]
at java.util.zip.CheckedOutputStream.write(CheckedOutputStream.java:73) ~[?:1.8.0_252]
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) ~[?:1.8.0_252]
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) ~[?:1.8.0_252]
at org.apache.lucene.store.OutputStreamIndexOutput.getChecksum(OutputStreamIndexOutput.java:80) ~[lucene-core-6.6.1.jar:6.6.1 9aa465a89b64ff2dabe7b4d50c472de32c298683 - varunthacker - 2017-08-29 21:54:39]
at org.apache.lucene.codecs.CodecUtil.writeCRC(CodecUtil.java:548) ~[lucene-core-6.6.1.jar:6.6.1 9aa465a89b64ff2dabe7b4d50c472de32c298683 - varunthacker - 2017-08-29 21:54:39]
at org.apache.lucene.codecs.CodecUtil.writeFooter(CodecUtil.java:393) ~[lucene-core-6.6.1.jar:6.6.1 9aa465a89b64ff2dabe7b4d50c472de32c298683 - varunthacker - 2017-08-29 21:54:39]
at org.elasticsearch.gateway.MetaDataStateFormat.write(MetaDataStateFormat.java:140) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.env.NodeEnvironment.loadOrCreateNodeMetaData(NodeEnvironment.java:419) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:263) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.node.Node.<init>(Node.java:265) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.node.Node.<init>(Node.java:245) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:233) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:233) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:342) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:132) ~[elasticsearch-5.6.9.jar:5.6.9]
... 6 more

For more information, please use the guide:

https://komodor.com/learn/how-to-fix-crashloopbackoff-kubernetes-error/ 

To get a shell inside the container (or the node), we use a command like the one sketched below.
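A hedged sketch; the container name searchd and the placeholders come from the earlier output, and kubectl exec only works while the container is up (for a tight crash loop you may need a shell on the node or another pod that mounts the same volume):

# Open an interactive shell in the searchd container of the failing pod
$ kubectl exec -it <POD_NAME> -c searchd -n <NAMESPACE_NAME> -- /bin/sh
# Inside the container, check which mount has run out of space
$ df -h /opt/AuCore/data /usr/share/elasticsearch/data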

From here it is Linux as usual. If you can free up space, it ends here. If you need to add more space, these are the next steps:

We then need to verify that allowVolumeExpansion is set to true, using the command:

  • kubectl get sc
  • In case it is false: kubectl edit sc <sc_name> and set allowVolumeExpansion: true (or patch it as sketched below)
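A non-interactive alternative, as a hedged sketch (the StorageClass name is a placeholder to replace with the real one from kubectl get sc):

# Enable volume expansion on the StorageClass
$ kubectl patch sc <sc_name> -p '{"allowVolumeExpansion": true}'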

 

Using the Azure portal, we need to stop the instances.

We scale the number of replicas down to zero using the command:

  • kubectl scale statefulset <name> --replicas=0

 

We need to edit the PVC with the command:

  • kubectl edit pvc <pvc_name> and change the storage size

https://kubernetes.io/blog/2018/07/12/resizing-persistent-volumes-using-kubernetes/
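A hedged, non-interactive alternative to kubectl edit (the claim name, namespace, and the 20Gi value are placeholders only):

# Request a larger size on the existing claim; the new value must be larger than the current one
$ kubectl patch pvc <pvc_name> -n <namespace> -p '{"spec":{"resources":{"requests":{"storage":"20Gi"}}}}'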

 

We need to review the PVC change using the command:

  • kubectl get pvc <pvc_name> -n <namespace> -o yaml
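While the resize is in progress the claim usually reports it in its status; a hedged way to check just that part (placeholders again):

# A FileSystemResizePending condition means the volume has grown but the filesystem
# will only be resized once a pod mounts the claim again
$ kubectl get pvc <pvc_name> -n <namespace> -o jsonpath='{.status.conditions}'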

 

We review the events with the command:

  • kubectl get events -A

 

We restore the number of replicas of the StatefulSet with the command:

  • kubectl scale statefulset <name> --replicas=1

 

We restart the instances with the command:

  • kubectl --namespace <AA_Namespace> delete pod <POD_NAME> --grace-period=0 --force
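Finally, a hedged way to confirm the pod comes back healthy after the forced restart:

# Watch the pods until the new one reaches Running and the restart counter stops climbing (Ctrl+C to exit)
$ kubectl get pods -n <AA_Namespace> -w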

Thanks
