Channel: THWACK: Message List

Re: Alerting setup failed us when most needed - suggestions for improvement


Given a choice between SNMP and ICMP, I would always choose ICMP for determining node status. SNMP is a low-priority process on most devices, and a device will happily ignore SNMP queries during even a slight CPU spike. SNMP also normally runs over UDP rather than TCP, so the transport does not retransmit a lost query.

 

But then, the two concerns raised related to this issue need to be addressed:

 

  • If a node responds to SNMP but not ICMP, it should still be considered up. Several appliances do not respond to ICMP by default, and enterprise firewalls often block it as well. The node-down logic should be something like:

 

If (ICMPTimeOut && SNMPTimeOut) { Node.Status = 'Down' } Else { Node.Status = 'Up' }

 

  • If the device doesn't respond to SNMP queries over a prolonged period, we should be alerted (the original premise of this thread). Alerting can be achieved indirectly by setting the status to 'Unknown' and alerting on that status:

 

If (SNMPTimeOut && (!ICMPTimeOut) && (LastSNMPPollTime < Threshold)) { Node.Status = 'Unknown' }
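
To make the intent of the two rules concrete, here is a minimal Python sketch that combines them into one status function. It is only an illustration of the logic described in this post, not how the product evaluates node status. The names `PollResult`, `node_status`, `last_snmp_success` and `max_snmp_silence` are hypothetical, and I am assuming "prolonged period" means the time since the last SNMP poll that got a reply.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


# Hypothetical container for one polling cycle; not a product data structure.
@dataclass
class PollResult:
    icmp_timed_out: bool
    snmp_timed_out: bool
    last_snmp_success: datetime  # time of the last SNMP poll that got a reply


def node_status(result: PollResult, now: datetime,
                max_snmp_silence: timedelta) -> str:
    # Down only when neither ICMP nor SNMP gets an answer.
    if result.icmp_timed_out and result.snmp_timed_out:
        return "Down"
    # ICMP still answers here. If SNMP has been silent for longer than the
    # allowed window, report Unknown so that an alert can be raised on it.
    if result.snmp_timed_out and now - result.last_snmp_success > max_snmp_silence:
        return "Unknown"
    # Either protocol answering counts as Up.
    return "Up"
```

With a 30-minute window, a node that answers ping but last answered SNMP an hour ago comes back as 'Unknown', while a node that answers SNMP but not ping stays 'Up'.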

