
Concurrency Issue on OpenTelemetry Collector Sink #2665

@hkfgo

Report

Stacktrace

System.InvalidOperationException: Operations that change non-concurrent collections must have exclusive access. A concurrent update was performed on this collection and corrupted its state. The collection's state is no longer correct.
   at System.Collections.Generic.HashSet`1.AddIfNotPresent(T value, Int32& location)
   at Promitor.Integrations.Sinks.OpenTelemetry.Collectors.OpenTelemetrySystemMetricsSink.WriteGaugeMeasurementAsync(String name, String description, Double value, Dictionary`2 labels, Boolean includeTimestamp) in /src/Promitor.Integrations.Sinks.OpenTelemetry/OpenTelemetrySystemMetricsSink.cs:line 21
   at Promitor.Core.Metrics.AggregatedSystemMetricsPublisher.WriteGaugeMeasurementAsync(String name, String description, Double value, Dictionary`2 labels, Boolean includeTimestamp) in /src/Promitor.Core/AggregatedSystemMetricsPublisher.cs:line 30
   at Promitor.Integrations.Sinks.Prometheus.Collectors.AzureScrapingSystemMetricsPublisher.WriteGaugeMeasurementAsync(String name, String description, Double value, Dictionary`2 labels) in /src/Promitor.Integrations.Sinks.Prometheus/AzureScrapingSystemMetricsPublisher.cs:line 44
   at Promitor.Core.Scraping.Scraper`1.ReportBatchScrapingOutcomeAsync(BatchScrapeDefinition`1 batchScrapeDefinition, Boolean isSuccessful, Int32 batchSize) in /src/Promitor.Core.Scraping/Scraper.cs:line 216
   at Promitor.Core.Scraping.Scraper`1.BatchScrapeAsync(BatchScrapeDefinition`1 batchScrapeDefinition) in /src/Promitor.Core.Scraping/Scraper.cs:line 139

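Specifically, the HashSet update made in OpenTelemetrySystemMetricsSink.WriteGaugeMeasurementAsync (OpenTelemetrySystemMetricsSink.cs, line 21 in the stack trace above) is not thread safe.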

A previous attempt to fix it, #2239, had to be reverted because it broke the sink entirely.
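
For illustration, here is a minimal sketch of one common way to avoid this class of exception, assuming the sink only needs to remember which metric names it has already seen. It is not Promitor's code and not the change from #2239; GaugeRegistry, TryRegister, and Demo are hypothetical names made up for this example. The idea is to replace the shared HashSet with a ConcurrentDictionary, whose TryAdd is safe to call from several scrape tasks at once.

```csharp
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical stand-in for a sink that tracks which gauges it has registered.
// None of these names come from Promitor.
public sealed class GaugeRegistry
{
    // ConcurrentDictionary tolerates concurrent writers; HashSet<T> does not.
    // The byte value is a placeholder, only the keys matter.
    private readonly ConcurrentDictionary<string, byte> _registered =
        new ConcurrentDictionary<string, byte>();

    // Returns true only for the first caller that registers a given name.
    public bool TryRegister(string name) => _registered.TryAdd(name, 0);

    public int Count => _registered.Count;
}

public static class Demo
{
    // Registers the same names from several parallel workers and returns
    // how many distinct names ended up in the registry.
    public static int RegisterInParallel(int workers, int namesPerWorker)
    {
        var registry = new GaugeRegistry();
        Parallel.For(0, workers, _ =>
        {
            for (var i = 0; i < namesPerWorker; i++)
            {
                registry.TryRegister("metric_" + i);
            }
        });
        return registry.Count;
    }
}
```

Guarding the existing HashSet with a lock around every read and write would be an alternative with the same effect. Which one fits depends on what else the sink does at that point, which this sketch does not model.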

Expected Behavior

All scraped metrics written to OTEL Collector sink successfully.

Actual Behavior

Exception thrown, no metrics written.

Steps to Reproduce the Problem

Run a reasonably sized Promitor deployment with the OTEL Collector sink enabled.

Component

Scraper

Version

2.14.1

Configuration

# Add your scraping configuration here

Logs

example

Platform

None

Contact Details

No response

Labels

bug (Something isn't working)
