NNMi cancels Connection Down incident even though the problem still exists

Typical situation:

- Two Cisco Nexus switches connected via Ethernet/fiber
- LLDP and CDP are running as link layer discovery protocols
- NNMi created the connection using LLDP as topology source (LLDP preferred is the default setting in device profiles)
- “Delete Unresponsive Objects Control” in the discovery settings is set to zero (never delete objects when they are down)
- Everything nice and green on the map

Now there is a fiber cut on one side (to test, I pulled the plug):
- NNMi creates an interface down event for each of the two affected interfaces, and additionally a connection down incident.
- On the map the two affected nodes turn yellow and the link between them turns red, everything as it should be.

10 minutes later, while the fiber is still disconnected, I run a configuration poll on either one of the affected nodes and refresh the map:
- NNMi cancels all previously created incidents (Incident cancelled by: Connection deleted from topology; Incident cancelled by: InterfaceUnpolled).
- On the map the link between the nodes has been removed and both nodes have turned green, even though the problem still exists!
- I did the same test while forcing NNMi to use CDP as topology source, with the same results.


The LLDP holdtime on Cisco switches is 120 seconds by default. This means that 120 seconds after a link goes down, the switch removes the neighbor from its neighbor table.
It seems NNMi concludes that because a neighbor is no longer seen in the LLDP or CDP table, the connection no longer exists and can therefore be removed. That conclusion is of course completely wrong.
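
The suspected logic can be sketched as follows. This is a hypothetical illustration, not NNMi code: the names `Neighbor`, `neighbor_table`, `reconcile_naive` and `reconcile_status_aware` are made up, and the aging model is simplified to "entry disappears 120 seconds after the link goes down", as described above. It only shows why a missing neighbor entry on a down interface should not, by itself, justify deleting the connection.

```python
from dataclasses import dataclass

HOLDTIME_S = 120  # LLDP holdtime from the post (Cisco default)


@dataclass(frozen=True)
class Neighbor:
    local_if: str
    remote_node: str
    remote_if: str


def neighbor_table(neighbors, link_down_since, now):
    """Entries the switch would still report at time `now` (seconds)."""
    table = []
    for n in neighbors:
        down_at = link_down_since.get(n.local_if)
        # An entry survives while the link is up, or until the holdtime
        # has expired after the link went down (simplified model).
        if down_at is None or now - down_at < HOLDTIME_S:
            table.append(n)
    return table


def reconcile_naive(known_connections, table):
    """Keep only connections that are still backed by a neighbor entry."""
    return [c for c in known_connections if c in table]


def reconcile_status_aware(known_connections, table, oper_status):
    """Also keep connections whose local interface is down (assumed fix)."""
    return [
        c for c in known_connections
        if c in table or oper_status.get(c.local_if) == "down"
    ]


link = Neighbor("Eth1/1", "nexus-b", "Eth1/1")
# Fiber pulled at t=0, configuration poll 10 minutes (600 s) later.
table = neighbor_table([link], {"Eth1/1": 0}, now=600)

print(reconcile_naive([link], table))  # []
print(reconcile_status_aware([link], table, {"Eth1/1": "down"}) == [link])  # True
```

With the naive rule the connection is gone after the poll, which matches what I see on the map. The status-aware variant is only one possible way to avoid that, not a statement about how NNMi should or does implement it.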

The problem is so obvious that I thought it must have something to do with our individual settings. I checked everything I could think of, but so far I have no idea.

I also opened a high-priority case (5317849227), something I hardly ever do, but this really affects the monitoring. A connection that went down during the night was “cleared” by the scheduled configuration poll, and in the morning the problem was still there while NNMi showed everything green.

Running the latest 10.21 Patch2 on Windows 2012, using the latest device pack.

I just wonder whether anybody else has had such an experience?

Thanks Thomas

  • Verified Answer

    FYI

    The problem was that NNMi deleted and recreated LLDP-discovered Ethernet L2 connections (the UUID changed) every time a configuration poll was done. If the link was down at that time, the connection was not recreated but deleted, and with it all related incidents, which made monitoring very unreliable.

    Thanks Thomas

    Hotfix-NNMI-10.2XP2-DISCOVERY-20170314 has solved the problem
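
To illustrate why the changed UUID loses the incidents, here is a minimal sketch. It is not NNMi's implementation: the `Topology` class, its methods, and the idea of an incident store keyed by connection UUID are assumptions made up for illustration of the behavior described above.

```python
import uuid


class Topology:
    """Toy model: incidents are attached to a connection via its UUID."""

    def __init__(self):
        self.connections = {}  # endpoint pair -> connection UUID
        self.incidents = {}    # connection UUID -> list of open incidents

    def add_connection(self, endpoints):
        conn_id = str(uuid.uuid4())
        self.connections[endpoints] = conn_id
        self.incidents[conn_id] = []
        return conn_id

    def delete_connection(self, endpoints):
        conn_id = self.connections.pop(endpoints)
        # Deleting the connection drops everything attached to it.
        self.incidents.pop(conn_id)

    def config_poll(self, endpoints, neighbor_seen):
        """Delete-and-recreate rediscovery (hypothetical model)."""
        self.delete_connection(endpoints)
        if neighbor_seen:
            # Recreated under a new UUID.
            return self.add_connection(endpoints)
        # Link down: neighbor aged out, so nothing is recreated.
        return None


topo = Topology()
ends = ("nexus-a:Eth1/1", "nexus-b:Eth1/1")
old_id = topo.add_connection(ends)
topo.incidents[old_id].append("ConnectionDown")

new_id = topo.config_poll(ends, neighbor_seen=False)
print(new_id, topo.incidents)  # None {}
```

In this model the Connection Down incident disappears together with the old UUID, and because the neighbor is not visible while the link is down, no new connection takes its place.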

     

     
