Cluster and DaaS Service Test response code: 485

Hi,

We have a two-node cluster running IG 3.7.

With either node up alone, data collection works flawlessly.

When both nodes are up, data collection fails... sometimes.

Both nodes have their own runtime IDs:

com.netiq.iac.runtime.id = ig01

com.netiq.iac.runtime.id = ig02

A test on an LDAP collector can succeed, and when the same test is run again, it can fail.

The log appears on whatever server I am accessing via the load balancer

When it fails, I see this in the log

[FINE] 2023-12-01 17:51:41.950 [com.netiq.iac.server.common.rest.RestCallExecutor] [IG-SERVER]
GET ig.aakiamtest.dk:8443/.../test
   Credentials: iac-service:********
   Token: 2023-12-01T17:53:32.000+0100 blablabla
   Headers
      Authorization=Basic Y249Wdc2VyLG91PXVzZXJzG91PXNhLG89c3lzdGVtOmhlZkxTYXhnN0Z3cVbmtRRENu
      X-Authorization=Bearer blablabla

[FINE] 2023-12-01 17:51:41.981 [com.netiq.iac.persistence.dcs.dce.daas.DaaSService] [IG-SERVER] DaaS Service Test response code: 485
[FINE] 2023-12-01 17:51:41.981 [com.netiq.iac.persistence.dcs.dce.daas.DaaSService] [IG-SERVER] DaaS Service Test error message: Service '516d8ab3-3c46-4bc0-b674-f77d51c2a4e6' is not loaded

and

API server response{"status":200,"body":"{\"Fault\":{\"Code\":{\"Value\":\"Sender\",\"Subcode\":{\"Value\":\"DaasRequestInvalid\"}},\"Reason\":{\"Text\":\"DaaS connector returned error (485): Service 'c46f3208-df94-4562-bcce-700034d93702' is not loaded\",\"Stack\":null}}}"}

and

[SEVERE] 2023-12-01 17:56:14.326 [com.netiq.iac.server.rest.ConnectionService] [IG-SERVER] Test Connection error: DaaS connector returned error (485): Service 'c46f3208-df94-4562-bcce-700034d93702' is not loaded
com.netiq.iac.persistence.spi.exception.DaaSServiceException: com.netiq.iac.common.IacException
        at com.netiq.iac.persistence.dcs.dce.daas.DaaSService.testConnection(DaaSService.java:593)
        at com.netiq.iac.persistence.service.cum.DataCollectionService.testConnection(DataCollectionService.java:245)
        at com.netiq.iac.server.rest.ConnectionService.testConnection(ConnectionService.java:95)
        at sun.reflect.GeneratedMethodAccessor2470.invoke(Unknown Source)
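
Note that the API server response above reports HTTP status 200 on the outside while the actual fault sits in a JSON string nested inside the body field. A minimal Python sketch for pulling the fault details out of such a response follows; the function name extract_fault is hypothetical, and the sample is the response from this post with the wrapped line joined by a single space.

```python
import json

# Sample taken from the log excerpt in this thread (wrapped line joined).
RAW_RESPONSE = r'''{"status":200,"body":"{\"Fault\":{\"Code\":{\"Value\":\"Sender\",\"Subcode\":{\"Value\":\"DaasRequestInvalid\"}},\"Reason\":{\"Text\":\"DaaS connector returned error (485): Service 'c46f3208-df94-4562-bcce-700034d93702' is not loaded\",\"Stack\":null}}}"}'''


def extract_fault(raw):
    """Return (outer_status, fault_subcode, fault_reason) from a response.

    The outer document carries the HTTP-style status; its "body" field is
    itself a JSON document encoded as a string, so it is decoded a second time.
    """
    outer = json.loads(raw)
    inner = json.loads(outer["body"])
    fault = inner["Fault"]
    subcode = fault["Code"]["Subcode"]["Value"]
    reason = fault["Reason"]["Text"]
    return outer["status"], subcode, reason


if __name__ == "__main__":
    status, subcode, reason = extract_fault(RAW_RESPONSE)
    print("outer status:", status)
    print("fault subcode:", subcode)
    print("fault reason:", reason)
```

So a check that only looks at the outer status would count this call as a success, even though the nested fault carries the 485 error.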

When it works, I see a lot more in the log.

I can see that the same server can log both success and failure, seemingly at random.

What can be done to find out why this is happening?

  • 0  

    I usually turn up logging on com.netiq.iac.daas (or whatever the class for data collection is) and see if it shows any errors.  Your error is in a class that looks odd to me:

    com.netiq.iac.persistence.dcs.dce.daas

    I wonder if they moved stuff around.
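
Once logging is turned up, it may help to tally the failures per node rather than reading the logs by eye. Below is a rough Python sketch, assuming the log line format shown in the original post; the function name summarize and the idea of running it once per node's log file are illustrative assumptions, not an IG tool.

```python
import re
from collections import Counter

# Matches e.g. "... [IG-SERVER] DaaS Service Test response code: 485"
CODE_RE = re.compile(r"DaaS Service Test response code: (\d+)")
# Matches e.g. "Service '516d8ab3-...-f77d51c2a4e6' is not loaded"
NOT_LOADED_RE = re.compile(r"Service '([0-9a-f-]{36})' is not loaded")

# Sample lines copied from the log excerpts in this thread.
SAMPLE = [
    "[FINE] 2023-12-01 17:51:41.981 [com.netiq.iac.persistence.dcs.dce.daas.DaaSService] "
    "[IG-SERVER] DaaS Service Test response code: 485",
    "[FINE] 2023-12-01 17:51:41.981 [com.netiq.iac.persistence.dcs.dce.daas.DaaSService] "
    "[IG-SERVER] DaaS Service Test error message: "
    "Service '516d8ab3-3c46-4bc0-b674-f77d51c2a4e6' is not loaded",
    "[SEVERE] 2023-12-01 17:56:14.326 [com.netiq.iac.server.rest.ConnectionService] "
    "[IG-SERVER] Test Connection error: DaaS connector returned error (485): "
    "Service 'c46f3208-df94-4562-bcce-700034d93702' is not loaded",
]


def summarize(lines):
    """Count DaaS test response codes and 'not loaded' service IDs."""
    codes = Counter()
    services = Counter()
    for line in lines:
        code_match = CODE_RE.search(line)
        if code_match:
            codes[code_match.group(1)] += 1
        service_match = NOT_LOADED_RE.search(line)
        if service_match:
            services[service_match.group(1)] += 1
    return codes, services


if __name__ == "__main__":
    codes, services = summarize(SAMPLE)
    print("response codes:", dict(codes))
    print("services reported as not loaded:", dict(services))
```

Running the same function over each node's log file separately (for example with open(path) as the lines argument) would show whether the "is not loaded" errors cluster on one node or on particular service IDs.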

  • 0  

    All the persistence classes have to do with storage of data in the database.  I wonder if you are getting a connection dropped from one of your nodes back to the database?   Do you have a way to get any metrics or logs from the DB side to see if connections are dropping?  Does the credential you are using have any limit on the number of sessions?   You can look at the Tomcat config for database sessions as well; there might be something to tweak or increase.    In my experience it's very hard to see via logging when a DB connection drops, but the end result is not unlike what you've shared.

    --Jim

  • 0   in reply to   

    Thanks for the replies

    The response code means "Invalid JSON Request"

    I still wonder what can be wrong.

    Probably a configuration issue (FAT40 error / Error 40).

    I have opened an SR and will return here with what we discover.