[BUG] matchSessions is brittle if sessions use different HasFS instances #663

Open
@jorisdral

-- | Check that all tables in the session match. If so, return the matched
-- session. If there is a mismatch, return the list indices of the mismatching
-- tables.
--
-- TODO: compare LockFileHandle instead of SessionRoot (?). We can write an Eq
-- instance for LockFileHandle based on pointer equality, just like base does
-- for Handle.
matchSessions ::
     (MonadSTM m, MonadThrow m)
  => NonEmpty (Table m h)
  -> m (Either (Int, Int) (Session m h))
matchSessions = \(t :| ts) ->
    withSessionRoot t $ \root -> do
      eith <- go root 1 ts
      pure $ case eith of
        Left i   -> Left (0, i)
        Right () -> Right (tableSession t)
  where
    -- Check that the session roots for all tables are the same. There can only
    -- be one *open/active* session per directory because of cooperative file
    -- locks, so each unique *open* session has a unique session root. We check
    -- that all the table's sessions are open at the same time while comparing
    -- the session roots.
    go _ _ [] = pure (Right ())
    go root !i (t':ts') =
        withSessionRoot t' $ \root' ->
          if root == root'
            then go root (i+1) ts'
            else pure (Left i)

    withSessionRoot t k = withOpenSession (tableSession t) $ k . sessionRoot

matchSessions checks whether a non-empty set of tables all belong to the same session by checking that the session roots (i.e., file paths) are the same. Recall that because of file locking, there can only be one open session at any given time for a specific directory. However, this only works reliably if all sessions are using the same HasFS instance. Each session has its own HasFS instance (with its own mount point if it is the real file system), and the implementations of the HasFS instances might be entirely different (some could be the real file system, some could be simulations). So in general, HasFS instances (and therefore sessions) cannot be uniquely identified by the session root.
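To make the failure mode concrete, here is a minimal sketch. It does not use the real HasFS or Session types: MockFS, MockSession, sameRoot and demo are hypothetical stand-ins invented for illustration. Two independent in-memory file systems each host a session rooted at "/db", and a roots-only comparison treats them as the same session even though they share no state.

```haskell
import Data.IORef (modifyIORef', newIORef, readIORef)

-- | Hypothetical stand-in for a HasFS instance: a record of functions closed
-- over private state, so two instances cannot be compared directly.
data MockFS = MockFS
  { writeFileFS :: FilePath -> String -> IO ()
  , readFileFS  :: FilePath -> IO (Maybe String)
  }

-- | Create an independent in-memory file system.
newMockFS :: IO MockFS
newMockFS = do
  ref <- newIORef []
  pure MockFS
    { writeFileFS = \p s -> modifyIORef' ref ((p, s) :)
    , readFileFS  = \p -> lookup p <$> readIORef ref
    }

-- | Hypothetical session: a file system plus a root path inside it.
data MockSession = MockSession
  { sessionFS   :: MockFS
  , sessionRoot :: FilePath
  }

-- | A roots-only comparison, analogous to what matchSessions checks.
sameRoot :: MockSession -> MockSession -> Bool
sameRoot s1 s2 = sessionRoot s1 == sessionRoot s2

-- | Two sessions with equal roots on different file systems. Returns whether
-- the roots match, and what each file system holds at "/db/marker" after
-- writing through the first session only.
demo :: IO (Bool, Maybe String, Maybe String)
demo = do
  fsA <- newMockFS
  fsB <- newMockFS
  let sA = MockSession fsA "/db"
      sB = MockSession fsB "/db"
  writeFileFS (sessionFS sA) "/db/marker" "A"
  a <- readFileFS (sessionFS sA) "/db/marker"
  b <- readFileFS (sessionFS sB) "/db/marker"
  pure (sameRoot sA sB, a, b)
```

Here demo yields (True, Just "A", Nothing): the roots compare equal, but only the first file system contains the marker file, so the two sessions are clearly distinct.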

In practice, all sessions will be using the same HasFS instance, but it's not guaranteed to be so. Ideally, we'd be able to compare the HasFS instances as well, but it's hard (if not impossible) to compare HasFS instances because they are records of functions. Maybe we could get away with giving each HasFS instance an approximately unique identifier (like wall clock time) to use as a basis for comparisons. If something like this is not achievable, then we should at least mention the brittleness in the documentation of matchSessions and possibly in the public API.
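One possible shape for the identifier idea, as a rough sketch only. Instead of wall clock time it uses Data.Unique from base, whose newUnique yields values that are distinct within a single process run. TaggedFS, tagFS, SessionId, sessionId and matchIds are hypothetical names that do not exist in the library, and the real HasFS record is replaced by a type parameter.

```haskell
import Data.List.NonEmpty (NonEmpty (..))
import Data.Unique (Unique, newUnique)

-- | Hypothetical wrapper pairing a file system implementation with a tag
-- that is unique within this process.
data TaggedFS fs = TaggedFS
  { fsTag  :: Unique
  , fsImpl :: fs
  }

-- | Tag a file system implementation once, when it is created.
tagFS :: fs -> IO (TaggedFS fs)
tagFS impl = do
  u <- newUnique
  pure (TaggedFS u impl)

-- | Hypothetical session identity: file system tag plus session root.
data SessionId = SessionId Unique FilePath
  deriving Eq

sessionId :: TaggedFS fs -> FilePath -> SessionId
sessionId fs root = SessionId (fsTag fs) root

-- | Same result shape as matchSessions: Left (0, i) reports the first
-- index i whose identity differs from the one at index 0.
matchIds :: NonEmpty SessionId -> Either (Int, Int) SessionId
matchIds (s :| ss) = go 1 ss
  where
    go :: Int -> [SessionId] -> Either (Int, Int) SessionId
    go _ [] = Right s
    go i (s' : rest)
      | s == s'   = go (i + 1) rest
      | otherwise = Left (0, i)
```

With this, two sessions that share the root "/db" but live on separately tagged file systems no longer compare equal. One caveat of the sketch: the tag is attached at wrapping time, so wrapping the same underlying file system twice would produce two different tags.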

One could also muse about whether it would be fine for the union's input tables to have different sessions and pick one of the sessions arbitrarily for the output table. I can imagine that this would open a whole other can of worms, so maybe it's best to restrict interaction between sessions as much as possible.
