Working with Kubernetes is easy; this is just a tip on how to start diagnosing when a problem appears.
The problem analyzed here is a pod that crashes because a filesystem is full, but the same diagnostic approach can be useful in other cases.
The first thing you have to do is install kubectl. My recommendation is to use Linux (SLES or Ubuntu); if you are using Windows 10, download SLES or Ubuntu from the Windows app store.
For the Linux subsystem installation on Windows 10, please go to:
For the kubectl installation, please refer to:
or follow these steps if you are on Ubuntu:
$ curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -
$ sudo touch /etc/apt/sources.list.d/kubernetes.list
$ echo "deb http://apt.kubernetes.io/ kubernetes-xenial main" | sudo tee -a /etc/apt/sources.list.d/kubernetes.list
$ sudo apt-get update
$ sudo apt-get install -y kubectl
Or these steps for SUSE:
For the Azure CLI, please refer to:
When all of this is installed, the Azure admin must share the subscription ID and resource, and you must execute these commands:
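The commands themselves are not shown here. As a rough sketch, assuming the cluster is an AKS cluster and that the "resource" shared by the admin is a resource group plus a cluster name, the login sequence could look like this. The function name and its three arguments are placeholders for illustration.

```shell
# Hypothetical helper: log in to Azure and fetch kubectl credentials.
# Arguments (all provided by your Azure admin):
#   $1 subscription ID, $2 resource group, $3 AKS cluster name.
# Assumption: the cluster runs on AKS; adapt if yours does not.
connect_to_cluster() {
  az login &&
  az account set --subscription "$1" &&
  az aks get-credentials --resource-group "$2" --name "$3"
}
```

Call it as `connect_to_cluster "<subscription-id>" "<resource-group>" "<cluster-name>"`; afterwards `kubectl get nodes` should answer.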
The first thing you need to know is your NAMESPACE and POD NAME.
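A minimal sketch of how both values can be found, and how the pod details can then be printed. The helper names are made up for this post; the kubectl subcommands are the standard `get pods` and `describe pod`.

```shell
# List pods from every namespace and keep only the ones that are not healthy.
# With --all-namespaces the columns are:
# NAMESPACE NAME READY STATUS RESTARTS AGE, so STATUS is column 4.
list_unhealthy_pods() {
  kubectl get pods --all-namespaces --no-headers |
    awk '$4 != "Running" && $4 != "Completed"'
}

# Print container state, last exit code, restart count and mounts of one pod.
# Arguments: $1 namespace, $2 pod name.
describe_pod() {
  kubectl describe pod "$2" --namespace "$1"
}
```

The first two columns printed by `list_unhealthy_pods` are the namespace and pod name to pass to `describe_pod`.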
In the pod description you will find the following:
Container ID: containerd://6c6db73beaf6b3194bc5fb5f9e5caf9ba2d3b243b5d9c47495995e338e3500b4
Image: mfsecurity/aaf-searchd:6.3.2.0
Image ID: docker.io/mfsecurity/aaf-searchd@sha256:d2c0169848584280957393b8632ab945501e1a0f4725e86bc96f74e43bc44d6c
Ports: 9200/TCP, 9300/TCP
Host Ports: 0/TCP, 0/TCP
State: Waiting
  Reason: CrashLoopBackOff
Last State: Terminated
  Reason: Error
  Exit Code: 1
  Started: Tue, 07 Dec 2021 14:48:38 +0000
  Finished: Tue, 07 Dec 2021 14:48:40 +0000
Ready: False
Restart Count: 187
Environment:
  bootstrap.memory_lock: true
Mounts:
  /opt/AuCore/data from aucore-data (rw)
  /usr/share/elasticsearch/config from searchd-config (rw)
  /usr/share/elasticsearch/data from searchd-data (rw)
  /var/run/secrets/kubernetes.io/serviceaccount from default-token-jcbgh (ro)
This is the indication of an error:
Reason: Error
Exit Code: 1
Now we know that the problem is in searchd.
The next step is to find the reason for the problem with the command:
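The exact command is not shown here; a likely candidate is `kubectl logs`. The sketch below assumes that you want the output of the last crashed run, which is what `--previous` returns for a container stuck in CrashLoopBackOff. The helper name and argument order are only illustrative.

```shell
# Hypothetical helper: print the logs of the previous (crashed) container run.
# Arguments: $1 namespace, $2 pod name, $3 container name (optional).
crash_logs() {
  if [ -n "${3:-}" ]; then
    kubectl logs "$2" --namespace "$1" --container "$3" --previous
  else
    kubectl logs "$2" --namespace "$1" --previous
  fi
}
```

If the pod has never restarted, there is no previous run and `--previous` returns an error; drop the flag in that case.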
This command gives us the details:
Waiting for /opt/AuCore/data/es_data.json (setup reporting step of aucore)
Stalling for Elasticsearch...
[watcher] start monitor /opt/AuCore/data/searchd.RESTART.watcher
[2021-12-07T14:53:53,070][INFO ][o.e.n.Node ] [NODE-1] initializing ...
[2021-12-07T14:53:53,135][WARN ][o.e.b.ElasticsearchUncaughtExceptionHandler] [NODE-1] uncaught exception in thread [main]
org.elasticsearch.bootstrap.StartupException: java.lang.IllegalStateException: Failed to create node environment
at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:136) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:123) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:70) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:134) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.cli.Command.main(Command.java:90) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:91) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:84) ~[elasticsearch-5.6.9.jar:5.6.9]
Caused by: java.lang.IllegalStateException: Failed to create node environment
at org.elasticsearch.node.Node.<init>(Node.java:268) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.node.Node.<init>(Node.java:245) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:233) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:233) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:342) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:132) ~[elasticsearch-5.6.9.jar:5.6.9]
... 6 more
Caused by: java.io.IOException: No space left on device
at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[?:?]
at sun.nio.ch.FileDispatcherImpl.write(FileDispatcherImpl.java:60) ~[?:?]
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) ~[?:?]
at sun.nio.ch.IOUtil.write(IOUtil.java:65) ~[?:?]
at sun.nio.ch.FileChannelImpl.write(FileChannelImpl.java:211) ~[?:?]
at java.nio.channels.Channels.writeFullyImpl(Channels.java:78) ~[?:1.8.0_252]
at java.nio.channels.Channels.writeFully(Channels.java:101) ~[?:1.8.0_252]
at java.nio.channels.Channels.access$000(Channels.java:61) ~[?:1.8.0_252]
at java.nio.channels.Channels$1.write(Channels.java:174) ~[?:1.8.0_252]
at java.util.zip.CheckedOutputStream.write(CheckedOutputStream.java:73) ~[?:1.8.0_252]
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82) ~[?:1.8.0_252]
at java.io.BufferedOutputStream.flush(BufferedOutputStream.java:140) ~[?:1.8.0_252]
at org.apache.lucene.store.OutputStreamIndexOutput.getChecksum(OutputStreamIndexOutput.java:80) ~[lucene-core-6.6.1.jar:6.6.1 9aa465a89b64ff2dabe7b4d50c472de32c298683 - varunthacker - 2017-08-29 21:54:39]
at org.apache.lucene.codecs.CodecUtil.writeCRC(CodecUtil.java:548) ~[lucene-core-6.6.1.jar:6.6.1 9aa465a89b64ff2dabe7b4d50c472de32c298683 - varunthacker - 2017-08-29 21:54:39]
at org.apache.lucene.codecs.CodecUtil.writeFooter(CodecUtil.java:393) ~[lucene-core-6.6.1.jar:6.6.1 9aa465a89b64ff2dabe7b4d50c472de32c298683 - varunthacker - 2017-08-29 21:54:39]
at org.elasticsearch.gateway.MetaDataStateFormat.write(MetaDataStateFormat.java:140) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.env.NodeEnvironment.loadOrCreateNodeMetaData(NodeEnvironment.java:419) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:263) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.node.Node.<init>(Node.java:265) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.node.Node.<init>(Node.java:245) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:233) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:233) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:342) ~[elasticsearch-5.6.9.jar:5.6.9]
at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:132) ~[elasticsearch-5.6.9.jar:5.6.9]
... 6 more
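In a Java stack trace like this one, the last "Caused by:" line is normally the root cause; here it is "No space left on device". A small filter, sketched below with a made-up function name, pulls that line out of a long log.

```shell
# Read a log on stdin and print only the last "Caused by:" line,
# which in a chained Java exception is usually the root cause.
root_cause() {
  grep 'Caused by:' | tail -n 1
}
```

For example, `kubectl logs "$POD" -n "$NAMESPACE" --previous | root_cause`, where `POD` and `NAMESPACE` are placeholders for your own values.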
For more information, please use the guide:
https://komodor.com/learn/how-to-fix-crashloopbackoff-kubernetes-error/
To enter the node, we use the command:
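The command is not shown here. One common way to get a shell is `kubectl exec`, assuming the container stays up long enough to accept it (a pod in CrashLoopBackOff may refuse the connection). The second helper is an extra for spotting the full filesystem from `df -P` output. Both function names are invented for this sketch.

```shell
# Hypothetical helper: open an interactive shell inside a pod.
# Arguments: $1 namespace, $2 pod name.
open_shell() {
  kubectl exec -it "$2" --namespace "$1" -- /bin/sh
}

# Read `df -P` output on stdin and print the mount points whose usage
# is at or above a threshold in percent (default 90).
full_filesystems() {
  awk -v limit="${1:-90}" 'NR > 1 {
    use = $5
    sub(/%/, "", use)
    if (use + 0 >= limit + 0) print $6, $5
  }'
}
```

Inside the pod, `df -P | full_filesystems 90` lists the candidates; the mounts shown in the pod description, such as /usr/share/elasticsearch/data, are the first places to look.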
From here on it is Linux as usual. If you can free up space, this ends here. If you need to add more space, these are the next steps:
We then need to check that allowVolumeExpansion is set to true, using the command:
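A possible check, as a sketch: `kubectl get storageclass` prints an ALLOWVOLUMEEXPANSION column, and the helper below (a made-up name) reads the same field for a single storage class so it can be used in a script.

```shell
# Succeed (exit status 0) only if the given StorageClass allows volume expansion.
# Argument: $1 storage class name, as shown by `kubectl get storageclass`.
can_expand() {
  [ "$(kubectl get storageclass "$1" -o jsonpath='{.allowVolumeExpansion}')" = "true" ]
}
```

Usage: `can_expand "<storage-class>" && echo "expansion allowed"`.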
Using the portal, we need to stop the instances.
We scale down the number of replicas using the command:
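As a sketch, assuming the workload is a StatefulSet (as the later restore step suggests), the replica count can be saved and then set to zero with `kubectl scale`. The helper names and the `NAMESPACE` and `STATEFULSET` variables are placeholders.

```shell
# Print the replica count currently requested for a StatefulSet.
# Arguments: $1 namespace, $2 StatefulSet name.
current_replicas() {
  kubectl get statefulset "$2" --namespace "$1" -o jsonpath='{.spec.replicas}'
}

# Set the replica count of a StatefulSet (0 stops all of its pods).
# Arguments: $1 namespace, $2 StatefulSet name, $3 desired replicas.
scale_statefulset() {
  kubectl scale statefulset "$2" --namespace "$1" --replicas="$3"
}
```

For example, `saved=$(current_replicas "$NAMESPACE" "$STATEFULSET")` followed by `scale_statefulset "$NAMESPACE" "$STATEFULSET" 0`; the same function called with `"$saved"` brings the pods back later.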
We need to edit the PVC with the command:
https://kubernetes.io/blog/2018/07/12/resizing-persistent-volumes-using-kubernetes/
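Interactively this is `kubectl edit pvc <name> -n <namespace>`, changing `spec.resources.requests.storage`. A non-interactive sketch of the same change with `kubectl patch` follows; the function name and the example size are only illustrative.

```shell
# Request a larger size for an existing PVC.
# Arguments: $1 namespace, $2 PVC name, $3 new size (for example 20Gi).
# The new size must be larger than the current one; PVCs cannot shrink.
resize_pvc() {
  patch=$(printf '{"spec":{"resources":{"requests":{"storage":"%s"}}}}' "$3")
  kubectl patch pvc "$2" --namespace "$1" --patch "$patch"
}
```

Usage: `resize_pvc "$NAMESPACE" "<pvc-name>" 20Gi`.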
We need to review the PVC change using the command:
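One way to review it, sketched with made-up helper names, is to compare the size the PVC requests with the capacity it actually reports. The reported capacity may only catch up after the pod has been started again, because the filesystem is grown when the volume is mounted.

```shell
# Print "<requested size> <actual capacity>" for a PVC.
# Arguments: $1 namespace, $2 PVC name.
pvc_sizes() {
  kubectl get pvc "$2" --namespace "$1" \
    -o jsonpath='{.spec.resources.requests.storage}{" "}{.status.capacity.storage}{"\n"}'
}

# Succeed only when the actual capacity matches the requested size.
pvc_resized() {
  sizes=$(pvc_sizes "$1" "$2") || return 1
  requested=${sizes% *}
  actual=${sizes#* }
  [ -n "$actual" ] && [ "$requested" = "$actual" ]
}
```

Usage: `pvc_resized "$NAMESPACE" "<pvc-name>" && echo "resize complete"`.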
We review the events with the command:
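A sketch of one way to do this: `kubectl describe pvc` shows recent events at the bottom of its output, and `kubectl get events` can be filtered to a single PVC as below. The function name is hypothetical.

```shell
# List the events recorded for one PVC, oldest first.
# Arguments: $1 namespace, $2 PVC name.
pvc_events() {
  kubectl get events --namespace "$1" \
    --field-selector "involvedObject.kind=PersistentVolumeClaim,involvedObject.name=$2" \
    --sort-by=.lastTimestamp
}
```

Resize-related messages, if any, appear in the REASON and MESSAGE columns.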
We restore the number of replicas of the StatefulSet.
We restart the instances with the command:
Thanks