
AccessViolationException: Attempted to read or write protected memory - caused the Nethermind node to restart #8378

Open
@markoburcul

Description
We experienced an unexpected restart of our Nethermind node. System metrics do not indicate resource exhaustion as the cause of the restart. Upon inspecting the logs, we found the following error occurring shortly before the restart:

Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.

This is the first time we have seen this error on our node, and we cannot reproduce it at this time. Since this is a mainnet node, the restart caused downtime, which is a significant concern for us. We would appreciate any guidance on:

  • Understanding the root cause of this error.
  • Steps to mitigate such incidents in the future.
  • Recommendations for hardening our setup to improve resilience.

Steps to Reproduce
Unfortunately, we cannot provide a reproducible scenario for this issue:

  • The node was running normally.
  • The fatal error occurred, and the node restarted itself.

Actual behavior
The node unexpectedly restarted, resulting in downtime.

Expected behavior
If the error is non-critical, we would expect it to be handled gracefully, allowing the node to continue operating without a restart.

Environment

  • Operating System: NixOS 23.05
  • Version: 1.31.1
  • Installation Method: GitHub Release
  • Consensus Client: Nimbus

Logs
Log excerpt around the crash, newest entries first (so the stack trace reads from the outermost frame down to the innermost):

2025-03-15 18:22:30.279	15 Mar 17:22:30 | Nethermind is starting up
2025-03-15 18:22:06.641	at System.Threading.PortableThreadPool+WorkerThread.WorkerThreadStart()
2025-03-15 18:22:06.641	at System.Threading.ThreadPoolWorkQueue.Dispatch()
2025-03-15 18:22:06.641	at System.Runtime.CompilerServices.AsyncTaskMethodBuilder`1+AsyncStateMachineBox`1[[System.Threading.Tasks.VoidTaskResult, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http.HttpProtocol+<ProcessRequests>d__238`1[[System.__Canon, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], Microsoft.AspNetCore.Server.Kestrel.Core, Version=9.0.0.0, Culture=neutral, PublicKeyToken=adb9793829ddae60]].MoveNext(System.Threading.Thread)
2025-03-15 18:22:06.641	at System.Threading.ExecutionContext.RunFromThreadPoolDispatchLoop(System.Threading.Thread, System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object)
2025-03-15 18:22:06.641	at Microsoft.AspNetCore.Server.Kestrel.Core.Internal.Http.HttpProtocol+<ProcessRequests>d__238`1[[System.__Canon, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].MoveNext()
2025-03-15 18:22:06.641	at Microsoft.AspNetCore.Builder.Extensions.MapMiddleware.Invoke(Microsoft.AspNetCore.Http.HttpContext)
2025-03-15 18:22:06.641	at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[Microsoft.AspNetCore.Builder.Extensions.MapMiddleware+<InvokeCore>d__4, Microsoft.AspNetCore.Http.Abstractions, Version=9.0.0.0, Culture=neutral, PublicKeyToken=adb9793829ddae60]](<InvokeCore>d__4 ByRef)
2025-03-15 18:22:06.641	at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[Prometheus.MetricServerMiddleware+<Invoke>d__7, Prometheus.AspNetCore, Version=8.0.0.0, Culture=neutral, PublicKeyToken=a243e9817ba9d559]](<Invoke>d__7 ByRef)
2025-03-15 18:22:06.641	at Microsoft.AspNetCore.Builder.Extensions.MapMiddleware+<InvokeCore>d__4.MoveNext()
2025-03-15 18:22:06.641	at Prometheus.MetricServerMiddleware+<Invoke>d__7.MoveNext()
2025-03-15 18:22:06.641	at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[Prometheus.CollectorRegistry+<CollectAndSerializeAsync>d__21, Prometheus.NetStandard, Version=8.0.0.0, Culture=neutral, PublicKeyToken=a243e9817ba9d559]](<CollectAndSerializeAsync>d__21 ByRef)
2025-03-15 18:22:06.641	at Prometheus.CollectorRegistry+<CollectAndSerializeAsync>d__21.MoveNext()
2025-03-15 18:22:06.641	at Prometheus.CollectorRegistry+<RunBeforeCollectCallbacksAsync>d__22.MoveNext()	
2025-03-15 18:22:06.641	at System.Runtime.CompilerServices.AsyncMethodBuilderCore.Start[[Prometheus.CollectorRegistry+<RunBeforeCollectCallbacksAsync>d__22, Prometheus.NetStandard, Version=8.0.0.0, Culture=neutral, PublicKeyToken=a243e9817ba9d559]](<RunBeforeCollectCallbacksAsync>d__22 ByRef)
2025-03-15 18:22:06.641	at Prometheus.DotNetStats.UpdateMetrics()
2025-03-15 18:22:06.641	at System.Diagnostics.Process.EnsureHandleCountPopulated()
2025-03-15 18:22:06.641	at System.Collections.Generic.List`1[[System.__Canon, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]]..ctor(System.Collections.Generic.IEnumerable`1<System.__Canon>)
2025-03-15 18:22:06.641	at System.IO.Enumeration.FileSystemEnumerator`1[[System.__Canon, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]].MoveNext()
2025-03-15 18:22:06.641	at System.IO.Enumeration.FileSystemEntry.Initialize(System.IO.Enumeration.FileSystemEntry ByRef, DirectoryEntry, System.ReadOnlySpan`1<Char>, System.ReadOnlySpan`1<Char>, System.ReadOnlySpan`1<Char>, System.Span`1<Char>)
2025-03-15 18:22:06.641	at System.SpanHelpers.NonPackedIndexOfValueType[[System.Byte, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e],[System.SpanHelpers+DontNegate`1[[System.Byte, System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]], System.Private.CoreLib, Version=9.0.0.0, Culture=neutral, PublicKeyToken=7cec85d7bea7798e]](Byte ByRef, Byte, Int32)
2025-03-15 18:22:06.641	Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
2025-03-15 18:22:06.560	15 Mar 17:22:06 | Attempt to request ENR before bonding
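
To make the trace easier to read, here is a minimal Python sketch that puts the lines back in chronological order and lists the stack frames innermost first. The helper names are our own (hypothetical), and the `timestamp<TAB>message` layout is an assumption based on the excerpt above; the sample uses four lines copied from it.

```python
def chronological(lines):
    """Return the non-empty log lines oldest-first (the excerpt is newest-first)."""
    return [line.rstrip() for line in reversed(lines) if line.strip()]


def split_line(line):
    """Split an assumed 'timestamp<TAB>message' line into (timestamp, message)."""
    timestamp, _, message = line.partition("\t")
    return timestamp.strip(), message.strip()


def managed_frames(lines):
    """Return stack frames innermost-first, without the leading 'at '."""
    frames = []
    for line in chronological(lines):
        _, message = split_line(line)
        if message.startswith("at "):
            frames.append(message[len("at "):])
    return frames


# Four lines copied from the excerpt, kept in the order they appear there.
sample = [
    "2025-03-15 18:22:06.641\tat System.Threading.PortableThreadPool+WorkerThread.WorkerThreadStart()",
    "2025-03-15 18:22:06.641\tat Prometheus.DotNetStats.UpdateMetrics()",
    "2025-03-15 18:22:06.641\tat System.Diagnostics.Process.EnsureHandleCountPopulated()",
    "2025-03-15 18:22:06.641\tFatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.",
]

for frame in managed_frames(sample):
    print(frame)
```

Read this way on the full excerpt, the innermost frame is System.SpanHelpers.NonPackedIndexOfValueType, reached through System.Diagnostics.Process.EnsureHandleCountPopulated from Prometheus.DotNetStats.UpdateMetrics, so the crash appears to have happened while a request to the metrics endpoint was being served.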

Additional Context
We discovered that similar issues have been reported previously:

These reports suggest this is not an isolated case. The recurrence raises the question of whether the underlying problem has been fully resolved or whether additional safeguards are needed.
