
Potential memory leak in v0.20.0 on linux/arm64 with webhook provider #5965

@lexfrei

Description

I'm observing non-deterministic memory growth in external-dns v0.20.0 on linux/arm64. The external-dns container's memory increases from ~14Mi to ~90Mi (roughly a 6x increase) during initialization and stays elevated until the pod is restarted.

I observed this issue several times initially, but have been unable to reproduce it since. The non-deterministic nature suggests a possible race condition or timing-dependent issue.

Environment

External-DNS:

  • Version: v0.20.0
  • Platform: linux/arm64
  • Image: registry.k8s.io/external-dns/external-dns:v0.20.0

Configuration (as reported by the running pod):

Sources: [gateway-httproute service]
Interval: 1m0s
MinEventSyncInterval: 5s
Policy: sync
Registry: txt
TXTOwnerID: unifi
Provider: webhook
ProviderCacheTime: 0s
WebhookProviderURL: http://localhost:8888
WebhookProviderReadTimeout: 5s
WebhookProviderWriteTimeout: 10s
AnnotationPrefix: internal-dns/
LogLevel: info
LogFormat: json
MetricsAddress: :7979
DomainFilter: []
ManagedDNSRecordTypes: [A AAAA CNAME]

Kubernetes:

  • Deployment with 2 containers (external-dns + webhook provider)
  • Webhook provider memory: stable 33-34Mi (NOT affected)
  • DNS records managed: ~10 A records

Expected Behavior

External-DNS memory should remain stable at around 14-18Mi, just as the webhook provider container in the same pod stays consistently at 33-34Mi.

Actual Behavior

Normal state (most of the time):

  • external-dns: 14-18Mi
  • Total pod: 48-52Mi

Problem state (observed several times, cannot reproduce now):

  • external-dns: 90Mi (6x increase!)
  • Total pod: 124Mi
  • Memory stayed elevated until manual pod restart

Important details:

  • All DNS records were already "up to date" - no changes were being made
  • No record manipulations occurred during the high memory state
  • Logs showed only normal operation messages (see below)

Reproducibility

Cannot reliably reproduce:

  • Observed the issue several times on fresh pod starts
  • After manual restarts, the issue sometimes reproduced and sometimes didn't
  • Ran 10 consecutive pod restarts as a test - all showed normal memory (14-18Mi)
  • Problem has not recurred since initial observations

This non-deterministic behavior suggests a race condition or state-dependent issue.

Logs

Logs were completely clean during both normal and high-memory states. No errors, warnings, or unusual messages:

{"level":"info","msg":"All records are already up to date"}

Repeated every minute. No webhook errors, API errors, retries, or any indication of problems.

The clean logs are particularly notable because:

  1. No record changes were happening
  2. No errors to trigger retries or buffering
  3. External-DNS reported normal operation while using 6x its usual memory

Investigation Performed

  1. Webhook provider: Memory stable at 33-34Mi in all cases, logs clean
  2. Configuration: ProviderCacheTime: 0s means no webhook response caching
  3. Go memstats (when operating normally at 14Mi; read from the :7979 metrics endpoint, see the sketch after this list):
    • go_memstats_alloc_bytes: 6.2MB
    • go_memstats_heap_inuse_bytes: 9.6MB
    • go_memstats_stack_inuse_bytes: 1.1MB
  4. Restart behavior: Problem cleared immediately on pod restart
  5. Logs: Clean in both states - no errors or warnings at any point
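
For reference, the memstats above come from the Prometheus endpoint on :7979. Below is a minimal sketch of how I plan to track the same gauges over time if the problem recurs. It assumes the metrics port is reachable locally (e.g. via kubectl port-forward, adjusting the deployment name) and that the standard Go collector metrics are exposed, which the figures above suggest:

// memwatch.go: poll the external-dns Prometheus endpoint and log Go memory
// gauges over time, to characterize the growth if it recurs.
// Assumes the metrics port is reachable locally, e.g. via
// kubectl port-forward deploy/external-dns 7979:7979 (adjust names as needed).
package main

import (
    "bufio"
    "fmt"
    "net/http"
    "strings"
    "time"
)

// Gauges of interest. Comparing heap_inuse against sys/released helps tell
// live-object growth (e.g. informer caches) apart from memory the Go runtime
// holds but has not yet returned to the OS.
var watched = []string{
    "go_goroutines",
    "go_memstats_alloc_bytes",
    "go_memstats_heap_inuse_bytes",
    "go_memstats_heap_released_bytes",
    "go_memstats_sys_bytes",
}

// scrape fetches the metrics page and picks out the watched gauges.
func scrape(url string) (map[string]string, error) {
    resp, err := http.Get(url)
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()

    out := map[string]string{}
    sc := bufio.NewScanner(resp.Body)
    for sc.Scan() {
        line := sc.Text()
        for _, name := range watched {
            // Unlabelled gauge lines look like: "go_memstats_heap_inuse_bytes 9.6e+06"
            if strings.HasPrefix(line, name+" ") {
                out[name] = strings.TrimPrefix(line, name+" ")
            }
        }
    }
    return out, sc.Err()
}

func main() {
    const url = "http://localhost:7979/metrics"
    for {
        vals, err := scrape(url)
        if err != nil {
            fmt.Println("scrape error:", err)
        } else {
            fmt.Println(time.Now().Format(time.RFC3339), vals)
        }
        time.Sleep(30 * time.Second)
    }
}

Running this in both the normal and the problem state should show whether the extra ~75Mi is live Go heap or memory the runtime simply hasn't returned to the OS.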

Hypothesis

Since the webhook provider memory remains stable and there's no caching (ProviderCacheTime: 0s), the issue appears to be in external-dns internal components, possibly:

  • Kubernetes informers (gateway-httproute, service, pods, nodes, namespaces, endpointslices)
  • Platform-specific issue (linux/arm64)
  • Race condition during initialization
  • Regression from v0.19.0 (which had significant memory improvements)

Questions

  1. Are there known issues with v0.20.0 on arm64?
  2. Have others reported similar memory behavior with v0.20.0?
  3. Any known race conditions in informer initialization that could cause this?

Additional Context

  • I can provide heap dumps and goroutine dumps if the problem reproduces (a sketch of how I would capture them follows this list)
  • Willing to test patches or provide additional diagnostics
  • Problem is not critical (pod still functional, restart resolves it)
  • Unable to reproduce on demand, so cannot test downgrade scenarios
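
As a sketch of how I would capture those dumps: the snippet below fetches heap and goroutine profiles over HTTP using the standard net/http/pprof URL layout. This assumes the binary registers pprof handlers on a reachable port, which I have not verified for the stock external-dns image; if it does not, I would need a debug build or pointers to the supported way of profiling it.

// pprofdump.go: fetch heap and goroutine profiles over HTTP.
// ASSUMPTION: this only works if the target binary registers net/http/pprof
// handlers on a reachable port; I have not confirmed that the stock
// external-dns image does.
package main

import (
    "fmt"
    "io"
    "net/http"
    "os"
)

// dump saves one pprof profile (standard net/http/pprof URL layout) to a file.
func dump(base, profile, out string) error {
    resp, err := http.Get(base + "/debug/pprof/" + profile)
    if err != nil {
        return err
    }
    defer resp.Body.Close()
    if resp.StatusCode != http.StatusOK {
        return fmt.Errorf("GET %s: %s", profile, resp.Status)
    }
    f, err := os.Create(out)
    if err != nil {
        return err
    }
    defer f.Close()
    _, err = io.Copy(f, resp.Body)
    return err
}

func main() {
    base := "http://localhost:7979" // assumed port-forwarded; adjust if pprof lives elsewhere
    for profile, out := range map[string]string{
        "heap":      "heap.pprof",
        "goroutine": "goroutine.pprof",
    } {
        if err := dump(base, profile, out); err != nil {
            fmt.Println(err)
        }
    }
}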

I'm filing this issue despite being unable to reproduce it at the moment, in case others encounter the same behavior.
