Skip to content

Monitor, not pinging hosts, unstable results #17

Description

@STiAT

Hi,

I've used your example code to add 593 of our hosts to the monitor tool, since it basically should do what I need (I need a statistics every 60 seconds which I would want to log away).

  • Of those 593 hosts only 255 are ever reported in the ticker (always 255, always the first 255 I add as targets). It makes no difference weather I use AddTarget or AddTargetDelayed, it's always 255.

  • Hosts which are certainly available are being reported as PacketLoss 100 %, while Wireshark sees the successful UDP packages:
    image

The monitor reports the following:
2020/12/03 14:14:17 hanco: {PacketsSent:10 PacketsLost:10 Best:0 Worst:0 Median:NaN Mean:NaN StdDev:NaN}

Which is certainly the host and it can be pinged:
image

It is certainly true that several of the hosts are currently "down" because behind Firewalls and I'm not testing in the same zone, but I'd expect it to report the loss at the "right" hosts ... and all 593 instead of just 255 ofc ;-).

Current code, which pretty much reflects the example with some little changes:

package slaping

import (
	"net"
	"os"
	"time"
	"log"
	"os/signal"
	"syscall"

	"github.com/digineo/go-ping"
	"github.com/digineo/go-ping/monitor"
)

var (
	pingInterval        = 5 * time.Second
	pingTimeout         = 4 * time.Second
	reportInterval      = 60 * time.Second
	size           uint = 56
	pinger         *ping.Pinger
	targets        []string
)

func (p *SLAPing) PingHosts() {
	// setup, we need to populate cache once before we start to actually have host and maint info
	CMDBHosts, err := p.CMDB.GetSLAHosts()

	if err != nil {
		log.Println(err)
	}

	// Bind to sockets
	if pi, err := ping.New("0.0.0.0", "::"); err != nil {
		log.Printf("Unable to bind: %s\nRunning as root?\n", err)
		os.Exit(2)
	} else {
		pinger = pi
	}
	pinger.SetPayloadSize(uint16(size))
	defer pinger.Close()

	// Create monitor
	monitor := monitor.New(pinger, pingInterval, pingTimeout)
	defer monitor.Stop()

	// Add targets
	targets := CMDBHosts
	validtargets := 0
	for i, target := range CMDBHosts {
		ipAddr, err := net.ResolveIPAddr("", target)
		if err != nil {
			// currently irrelevant
			continue
		}

		err = monitor.AddTarget(string([]byte{byte(i)}), *ipAddr)
		//err = monitor.AddTargetDelayed(string([]byte{byte(i)}), *ipAddr, 10*time.Millisecond*time.Duration(i))

		if err != nil {
			// currently irrelevant
			continue
		}
		validtargets = validtargets + 1
	}

	log.Printf("Total targets: %d, Valid targets: %d, Errors: %d\n", len(targets), validtargets, len(targets)-validtargets)

	// Start report routine
	ticker := time.NewTicker(reportInterval)
	defer ticker.Stop()
	go func() {
		for range ticker.C {
			for i, metrics := range monitor.ExportAndClear() {
				log.Printf("%s: %+v\n", targets[[]byte(i)[0]], *metrics)
			}
		}
	}()

	// Handle SIGINT and SIGTERM.
	// in other words we wait here until interrupted and let the ticker tick
	ch := make(chan os.Signal, 1)
	signal.Notify(ch, syscall.SIGINT, syscall.SIGTERM)
	log.Println("received", <-ch)
}

[edit]
My initial thought was that it may be connected to the ICMP pings getting blocked by firewalls and threads are getting locked up due to the timeouts not properly working there, since it's an uncommon case and may not be that well tested.
Now, due to the result "shuffle" I'm not sure but I have systems in the results which actually are blocked by firewalls, going through the slice sequentially that could explain why it's always the same number, but I didn't see any max threads number to get locked in the library, is there a limit where the channel / parallel work of the library blocking?

[edit2]
I could test it now on a host which has the necessary firewall permissions to actually run this without getting blocked, the results are still the same. Something is wrong here, just can't make out what.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions