GlobalMutex.LinuxGlobalMutexAdapter fails to handle EINTR #359

Open
@rmunn

Description

While debugging a failing unit test in the LfMerge project, I got the following exception:

SIL.PlatformUtilities.NativeException : An error with the number, 4, ocurred.
  at SIL.Threading.GlobalMutex+LinuxGlobalMutexAdapter.Wait () [0x0002d] in /var/lib/TeamCity/agent/work/60b6a3b495b7759c/SIL.Core/Threading/GlobalMutex.cs:208 
  at SIL.Threading.GlobalMutex+LinuxGlobalMutexAdapter.Init (Boolean initiallyOwned) [0x0007a] in /var/lib/TeamCity/agent/work/60b6a3b495b7759c/SIL.Core/Threading/GlobalMutex.cs:198 
  at SIL.Threading.GlobalMutex.InitializeAndLock (System.Boolean& createdNew) [0x00006] in /var/lib/TeamCity/agent/work/60b6a3b495b7759c/SIL.Core/Threading/GlobalMutex.cs:98 
  at SIL.FieldWorks.FDO.Infrastructure.Impl.SharedXMLBackendProvider.StartupInternal (Int32 currentModelVersion) [0x00000] in <filename unknown>:0 
  at SIL.FieldWorks.FDO.Infrastructure.Impl.FDOBackendProvider.StartupInternalWithDataMigrationIfNeeded (IThreadedProgress progressDlg) [0x00000] in <filename unknown>:0 
  at SIL.FieldWorks.FDO.Infrastructure.Impl.FDOBackendProvider.StartupExtantLanguageProject (IProjectIdentifier projectId, Boolean fBootstrapSystem, IThreadedProgress progressDlg) [0x00000] in <filename unknown>:0

The error number in SIL.PlatformUtilities.NativeException comes from Marshal.GetLastWin32Error(), which on Linux returns the latest value of errno, the Unix C library's all-purpose error number. http://www.virtsync.com/c-error-codes-include-errno lists errno code 4 as EINTR, "Interrupted system call". According to this SO question, this blog post, and this libc manual entry, the right thing to do when EINTR is received is usually to restart the interrupted system call. (If EINTR was received because the user hit Ctrl-C or ran kill (your process ID), then your code should already be handling that signal elsewhere and shutting down the program.)

In this case, that's certainly the right approach. The LinuxGlobalMutexAdapter needs to check for EINTR and handle it by retrying the interrupted system call, up to a small number of times (say, 5). I don't currently have time to work on a patch for this, but I will have some time next month if nobody gets to this issue before then.
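
To make the proposed fix concrete, here is a minimal sketch of the bounded retry at the libc level. It assumes the interrupted call is something like sem_wait(); the issue does not say which system call Wait() actually makes, so treat that as a guess. The names sem_wait_retrying and MAX_EINTR_RETRIES are hypothetical and used only for illustration. The C# adapter would presumably do the equivalent around its P/Invoke call, checking Marshal.GetLastWin32Error() for 4 before throwing NativeException.

```c
#include <errno.h>
#include <semaphore.h>

/* Hypothetical cap on retries; the issue suggests "say, 5". */
#define MAX_EINTR_RETRIES 5

/*
 * Call sem_wait(), restarting it when a signal interrupts it.
 * Returns 0 on success and -1 on failure with errno set.
 * After MAX_EINTR_RETRIES retries that were all interrupted,
 * it gives up and returns -1 with errno still equal to EINTR.
 * Any other error is returned immediately without retrying.
 */
static int sem_wait_retrying(sem_t *sem)
{
    int retries = 0;

    for (;;) {
        if (sem_wait(sem) == 0)
            return 0;               /* acquired the semaphore */
        if (errno != EINTR)
            return -1;              /* a real error: do not retry */
        if (retries++ >= MAX_EINTR_RETRIES)
            return -1;              /* interrupted too many times */
    }
}
```

With this shape, the caller only sees EINTR once the retry budget is used up, so an occasional stray signal during a test run would no longer surface as an exception. Whether a fixed cap is better than retrying indefinitely is a design choice; the cap follows the suggestion above.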
