JGroups error found in SM logs

Hi All,

We are using SMA - SM:
Service Manager 9.52 on Windows Server
Service Portal 2018.08

I would appreciate your help in resolving the issue below. As I understand it, JGroups is used for communication between SM servers in a horizontally scaled setup.

This is causing my email integration to stop. We use Connect-It 9.60 for the inbound integration. Everything works fine until the errors below appear; whenever the integration stops working, these error messages are in the log.

I have removed the unwanted adapters on the SM application servers (Windows) and could not find any abnormal entries in the etc/hosts file, but I am still seeing these error messages. Has anyone faced the same or a similar error and found a fix?

4280( 7684) 07/21/2019 12:53:48 JRTE W Session 1531C77E34F68D1E1C8CCC56FB1C0ADB is no longer valid. Sending SOAP fault
4280( 7684) 07/21/2019 12:53:48 JRTE W Send error response: Session no longer valid
4280( 7780) 07/21/2019 12:54:03 BT-VWP-HPSMA-02-4280: no physical address for a9269ee6-2406-fcc2-08ad-a966d38a929d, dropping message
4280( 7780) 07/21/2019 12:54:03 [JGRP00011] BT-VWP-HPSMA-02-4280: dropped message 185,480 from non-member cca4ccae-fbac-3c40-bcc6-c1ed6ef222b0 (view=[BT-VWP-HPSMA-02-4280|3] [BT-VWP-HPSMA-02-4280, BT-VWP-HPSMA-02-7852, BT-VWP-HPSMA-02-3572, BT-VWP-HPSMA-02-212, BT-VWP-HPSMA-02-6664])
4280( 7780) 07/21/2019 12:54:32 BT-VWP-HPSMA-02-4280: no physical address for a9269ee6-2406-fcc2-08ad-a966d38a929d, dropping message
4280( 7780) 07/21/2019 12:54:54 BT-VWP-HPSMA-02-4280: no physical address for a9269ee6-2406-fcc2-08ad-a966d38a929d, dropping message
4280( 688) 07/21/2019 12:55:09 [JGRP00011] BT-VWP-HPSMA-02-4280: dropped message 185,514 from non-member cca4ccae-fbac-3c40-bcc6-c1ed6ef222b0 (view=[BT-VWP-HPSMA-02-4280|3] [BT-VWP-HPSMA-02-4280, BT-VWP-HPSMA-02-7852, BT-VWP-HPSMA-02-3572, BT-VWP-HPSMA-02-212, BT-VWP-HPSMA-02-6664]) (received 35 identical messages from cca4ccae-fbac-3c40-bcc6-c1ed6ef222b0 in the last 66,155 ms)
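
To see which node identifiers are involved and how often, the two message types in the excerpt above can be counted per UUID. The following is a minimal sketch, not an official tool: the regular expressions are derived only from the lines quoted in this post, and the function name `summarize` is hypothetical. It may help to tell whether a single stale sender (one not in the current view) is responsible for most of the dropped messages.

```python
import re
from collections import Counter

# Patterns inferred from the log lines quoted in this post (an assumption:
# other SM or JGroups versions may word these messages differently).
DROPPED = re.compile(
    r"\[JGRP00011\] (?P<local>\S+): dropped message [\d,]+ "
    r"from non-member (?P<sender>[0-9a-f-]{36})"
)
NO_ADDR = re.compile(r"no physical address for (?P<uuid>[0-9a-f-]{36})")


def summarize(lines):
    """Count 'dropped message' and 'no physical address' events per UUID."""
    dropped, no_addr = Counter(), Counter()
    for line in lines:
        match = DROPPED.search(line)
        if match:
            dropped[match.group("sender")] += 1
        match = NO_ADDR.search(line)
        if match:
            no_addr[match.group("uuid")] += 1
    return dropped, no_addr
```

For example, `summarize(open("sm.log", errors="replace"))` returns two counters keyed by UUID; the file name `sm.log` is only a placeholder for your own log path.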

Regards,

Madhan


I had similar issues and fixed them by adjusting the parameters in sm.ini and sm.cfg.
Make sure all settings are correct, especially gossiprouter, groupname, system, and gossiprouterhosts.
Take a look at the manual: https://docs.microfocus.com/SM/9.60/Codeless/Content/serversetup/concepts/Configure_Jgroups_on_TCP_in_a_horizontal_envt.htm
and review your sm.ini and sm.cfg. If you still have problems, share your sm.ini, sm.cfg, and details of your network interfaces, and I can help.
Regards,
Breno Abreu


Thanks Breno,

JGroups is currently using UDP. I will change it to TCP and let you know the result.

Regards,

Madhan 


Hi Breno,

I tried to configure this but had no luck. I got the error messages below when trying it on the primary server.

I am also attaching my sm.ini and sm.cfg files.

6908( 2312) 07/24/2019 20:46:39 JRTE I Starting TRCLIENT thread
6908( 2312) 07/24/2019 20:46:39 JRTE I Waiting for TRCLIENT() to initialize.
6908( 7856) 07/24/2019 20:46:39 RTE I Using "utalloc" memory manager, mode [0]
6908( 7856) 07/24/2019 20:46:39 RTE I Process sm 9.52.2021 (P2) System: 50443 (0x784DFB00) on PC (x64 64-bit) running Windows (6.2 Build 9200) Timezone GMT+03:00 Locale en_US from ServerHost(removed actual servername)
6908( 7856) 07/24/2019 20:46:39 RTE I Host network address: 10.10.10.171
6908( 7856) 07/24/2019 20:46:39 RTE I Thread attaching to resources with key 0x784DFB00
6908( 7856) 07/24/2019 20:46:39 JRTE I ServerSession is created with threadid 7856
6908( 2312) 07/24/2019 20:46:39 JRTE I JGroups 3.6.2.Final

6908( 2312) 07/24/2019 20:46:39 JRTE I JGroups 3.6.2.Final
6908( 2312) 07/24/2019 20:46:40 failed connecting to ServerHost(removed actual servername)/10.10.10.171:7801: java.lang.Exception: Could not connect to ServerHost(removed actual servername)/10.10.10.171:7801
6908( 2312) 07/24/2019 20:46:41 failed reconnecting stub to GR at ServerHost(removed actual servername)/10.10.10.171:7801: java.lang.Exception: Could not connect to ServerHost(removed actual servername)/10.10.10.171:7801
6908( 2312) 07/24/2019 20:46:42 failed reconnecting stub to GR at ServerHost(removed actual servername)/10.10.10.171:7801: java.lang.Exception: Could not connect to ServerHost(removed actual servername)/10.10.10.171:7801
6908( 2312) 07/24/2019 20:46:42 failed fetching members from ServerHost(removed actual servername)/10.10.10.171:7801: java.lang.Exception: Connection to ServerHost(removed actual servername)/10.10.10.171:7801 broken. Could not send GOSSIP_GET request, cause: java.lang.Exception: not connected
6908( 2312) 07/24/2019 20:46:53 failed reconnecting stub to GR at ServerHost(removed actual servername)/10.10.10.171:7801: java.lang.Exception: Could not connect to ServerHost(removed actual servername)/10.10.10.171:7801
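
The "Could not connect" lines above suggest that nothing accepted a TCP connection at the GossipRouter address at that moment. One way to confirm this independently of SM is a plain TCP connection test from the same server. This is a minimal sketch using only the Python standard library; the helper name `can_connect` is hypothetical, and a successful connection only shows that something is listening on the port, not that it is a GossipRouter.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


# Example call, using the address and port from the log above:
# can_connect("10.10.10.171", 7801)
```

If this returns False while the GossipRouter process is supposed to be running, the port in sm.cfg, the bind address, or a local firewall would be the first things to check.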


I think the error occurs because you are using port 7800 for the GossipRouter, and that port is the default for each member. As a test, set a different port for the GossipRouter and remove grouptcpbindport. Also, to rule out name resolution problems, use the IP address instead of the host name.

sm.cfg

sm -GossipRouter -Gossiprouterport:12001 -log:../logs/gossiprouter.log

 

sm.ini

groupname:hpservice
jgroupstcp:1
GossipRouterhosts:10.10.10.171[12001]
groupbindaddress:10.10.10.171
#grouptcpbindport:7802
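
The points above (a separate GossipRouter port, no grouptcpbindport for the test, an IP address instead of a name) can be checked mechanically against an sm.ini file. The following is a rough sketch only: the function names are hypothetical, the parser assumes simple `name:value` lines with `#` comments as in the fragment above, and the comma-separated `host[port]` list format for GossipRouterhosts is an assumption rather than something confirmed in this thread.

```python
import re

# One "host[port]" entry, as in GossipRouterhosts:10.10.10.171[12001].
HOST_PORT = re.compile(r"^(?P<host>[^\[\],]+)\[(?P<port>\d+)\]$")
IPV4 = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def parse_ini(text):
    """Parse 'name:value' lines, skipping blank lines and '#' comments."""
    params = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition(":")
        if sep:
            params[name.strip().lower()] = value.strip()
    return params


def gossip_router_hosts(params):
    """Return [(host, port), ...] from the GossipRouterhosts value."""
    value = params.get("gossiprouterhosts", "")
    hosts = []
    for item in filter(None, (part.strip() for part in value.split(","))):
        match = HOST_PORT.match(item)
        if not match:
            raise ValueError(f"unexpected GossipRouterhosts entry: {item!r}")
        hosts.append((match.group("host"), int(match.group("port"))))
    return hosts


def check(params):
    """Return warnings for the points raised in the reply above."""
    warnings = []
    routers = gossip_router_hosts(params)
    if params.get("jgroupstcp") != "1":
        warnings.append("jgroupstcp is not set to 1")
    if not routers:
        warnings.append("no GossipRouterhosts entry found")
    bind_port = params.get("grouptcpbindport")
    for host, port in routers:
        if bind_port is not None and str(port) == bind_port:
            warnings.append(f"GossipRouter port {port} equals grouptcpbindport")
        if not IPV4.fullmatch(host):
            warnings.append(f"{host} is a name, not an IPv4 address")
    return warnings
```

Running `check(parse_ini(open("sm.ini").read()))` on the fragment above should produce no warnings; an entry such as `GossipRouterhosts:smhost[7800]` combined with `grouptcpbindport:7800` would produce two.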

 

Let me know

Regards,
Breno Abreu


Hi Breno,

I am also facing the same issue. I made the suggested modifications but am still getting the below errors.

 

 


Could you share the errors from the log?
Regards,
Breno Abreu


Thanks Breno. I resolved this error by clearing the scdb system entry from the database and restarting the services on both nodes. It started working fine.

This may not be relevant to this topic, but every Friday in our lower environments the sm.exe process consumes 100% CPU, which forces us to restart the server. We are on 9.60 and recently installed the Oracle 12c client on our dev and test servers. Other than that, we have not made any changes to our Dev and QA servers. What could be causing sm.exe to consume 100% CPU? Any ideas would be very helpful.

That is strange. Are there any hints in the log? Are you able to connect to SM when this happens? If so, go to System Status, System Monitor and check which process is consuming the CPU; if necessary, enable a trace on it to find out what is going on.
Regards,
Breno Abreu
