Sync: server can persist a board referencing a local (short) board id #2218

Description

@RodriSanchez1

Part of #2195 (sync engine resilience epic). Priority: High — produces persistent server-side data corruption with no automatic recovery.

Summary

When local board changes are synced to the API, a board can be saved on the server with a tile whose loadBoard points to a short, locally-generated id (shortid) instead of a server-assigned board id. That short id does not correspond to any board on the server, so the reference is dangling: opening that folder tile from a board pulled on another device (or after a fresh pull) resolves to nothing.

The server state ends up out of sync with local state: locally, the parent correctly points at the child's server-assigned id, but the copy persisted on the server still points at the old short id.

Background

The client generates board ids locally with shortid.generate(). When a board is created on the server (POST /board), the server assigns a new (MongoDB ObjectId-style) id and returns it. Any reference to the old short id must then be rewritten everywhere it appears:

  • tile.loadBoard on parent boards
  • communicator.boards[], communicator.rootBoard, communicator.activeBoardId
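To make the scope of that rewrite concrete, here is a minimal sketch of a pure helper that swaps one id for another in every place listed above. The function name `rewriteBoardId` and the exact state shape are assumptions for illustration; only the field names (`tiles[].loadBoard`, `communicator.boards`, `rootBoard`, `activeBoardId`) come from this issue.

```javascript
// Hypothetical helper, not the project's code: replace every reference to
// oldId with newId across boards and the communicator.
function rewriteBoardId(boards, communicator, oldId, newId) {
  const swap = id => (id === oldId ? newId : id);

  const nextBoards = boards.map(board => ({
    ...board,
    id: swap(board.id),
    // Folder tiles on any parent board may point at the old id.
    tiles: (board.tiles || []).map(tile =>
      tile.loadBoard === oldId ? { ...tile, loadBoard: newId } : tile
    ),
  }));

  const nextCommunicator = {
    ...communicator,
    boards: (communicator.boards || []).map(swap),
    rootBoard: swap(communicator.rootBoard),
    activeBoardId: swap(communicator.activeBoardId),
  };

  return { boards: nextBoards, communicator: nextCommunicator };
}
```

Missing any one of these locations leaves a reference to an id that no longer identifies a board.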

The interactive edit path (updateApiObjects, Board.actions.js:1152) handles this atomically: it creates the child, reads back the new id, rewrites the parent's tile.loadBoard (lines 1163-1168), and pushes the parent in the same operation.
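The ordering that makes the edit path safe can be sketched as follows. This is an illustration under assumptions, not the body of updateApiObjects: `createChildThenPushParent` is a hypothetical name, and `api.createBoard` / `api.updateBoard` are stand-ins that return promises rather than the project's real client signatures.

```javascript
// Hypothetical sketch: the parent is pushed only after the child's
// server-assigned id is known and written into the parent's tile.
async function createChildThenPushParent(api, parent, child) {
  // 1. Create the child; the (assumed) response carries the server id.
  const created = await api.createBoard(child);

  // 2. Point the parent's folder tile at the server id, not the short id.
  const fixedParent = {
    ...parent,
    tiles: parent.tiles.map(tile =>
      tile.loadBoard === child.id ? { ...tile, loadBoard: created.id } : tile
    ),
  };

  // 3. Push the parent. If step 1 throws, this line is never reached,
  //    so the server never sees the short id.
  await api.updateBoard(fixedParent);
  return { parent: fixedParent, child: created };
}
```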

The sync path (pushLocalChangesToApi, Board.actions.js:747) does not. It pushes boards one at a time via updateApiObjectsNoChild, and the parent's loadBoard is only reconciled by a separate, deferred best-effort step (CREATE_API_BOARD_SUCCESS rewriting local state + markToUpdate / shouldCreateBoard re-push via updateApiMarkedBoards). That second step is not atomic with the first push, so it can be lost.

Why it happens

A folder created while logged out / offline never goes through the atomic edit path (handleApiUpdates early-returns when there is no logged-in user, Board.container.js:1087). It produces purely local state:

  • CREATE_BOARD → child board C with a short id, marked PENDING, appended to the end of state.boards (Board.reducer.js:298).
  • CREATE_TILE on parent P → tile with loadBoard = C.shortId, P marked PENDING (Board.reducer.js:328).

On the next sync, pushLocalChangesToApi pushes boards in state.boards array order (Board.actions.js:688). Because the new child was appended last, the parent is pushed before the child:

  1. Parent P is pushed first, while C still has its short id, so the request body carries loadBoard = C.shortId. The server now stores a dangling reference.
  2. Child C is created next; CREATE_API_BOARD_SUCCESS (Board.reducer.js:379) rewrites P's loadBoard in local state and re-marks P PENDING.
  3. A deferred pass (updateApiMarkedBoards) is expected to re-push the corrected P.

Step 1 always happens. Step 3 is a separate, later request — it is what is supposed to fix the server, and it can be dropped.
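The sequence above can be reproduced with a small simulation. Everything here is a stand-in: `pushInArrayOrder` and the `isLocalId` predicate are hypothetical, the "server" is just an array of recorded request bodies, and the local repair only imitates what this issue attributes to CREATE_API_BOARD_SUCCESS.

```javascript
// Simulation only, not the project's sync code.
function pushInArrayOrder(boards, isLocalId) {
  const sent = []; // request bodies, in the order the server would see them
  const local = JSON.parse(JSON.stringify(boards)); // private working copy
  let counter = 0;

  for (const board of local) {
    // Snapshot the body exactly as it exists at push time.
    const body = JSON.parse(JSON.stringify(board));

    if (!isLocalId(board.id)) {
      sent.push({ method: 'PUT', body }); // existing server board: update
      continue;
    }

    sent.push({ method: 'POST', body }); // local-only board: create
    const oldId = board.id;
    const newId = 'server-' + ++counter; // pretend the server assigned this
    board.id = newId;

    // Local-only repair. The corrected parent still needs a later push.
    for (const other of local) {
      for (const tile of other.tiles) {
        if (tile.loadBoard === oldId) tile.loadBoard = newId;
      }
    }
  }
  return { sent, local };
}

const demo = pushInArrayOrder(
  [
    { id: 'parentLong', tiles: [{ id: 't1', loadBoard: 'childShort' }] },
    { id: 'childShort', tiles: [] },
  ],
  id => id === 'childShort'
);
console.log(demo.sent[0].body.tiles[0].loadBoard); // childShort (what the server stored)
console.log(demo.local[0].tiles[0].loadBoard);     // server-1 (local state only)
```

Within a single pass, no request ever carries the corrected parent, which is why the server depends entirely on the deferred re-push.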

REPLACE_BOARD (Board.reducer.js:216), which the sync path dispatches, does not rewrite parent loadBoard references.

Impact

  • Transient (common): on essentially every sync that materializes an offline-created folder, the server briefly holds the parent with a short loadBoard between step 1 and step 3.
  • Persistent (data corruption): the corrected re-push (step 3) can fail to land while the parent push (step 1) already succeeded, leaving the server permanently pointing at a non-existent short id. Known triggers:
    • The child create fails (validation/network) while the parent push succeeded. The parent is now SYNCED on the server with the short id; it is only re-marked PENDING if/when the child is eventually created, so if the child keeps failing the dangling reference is permanent.
    • The sync run is interrupted (tab closed, app killed, connection dropped) between the parent push and the child create / corrected re-push.
    • The deferred re-push is silently dropped — errors in the push loop are swallowed (Board.actions.js:804), and updateApiMarkedBoards aborts the whole pass with return instead of continue when a board id changed mid-loop (Board.actions.js:1098).
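The difference between return and continue in the last trigger is easy to see in isolation. The loop below is a hypothetical reduction, not the code at Board.actions.js:1098: `pushMarked`, `idChanged`, and the `stopOnChange` flag exist only to contrast the two behaviours.

```javascript
// Hypothetical loop: `idChanged` stands in for whatever condition makes the
// real pass bail out on a board.
function pushMarked(boards, idChanged, { stopOnChange }) {
  const pushed = [];
  for (const board of boards) {
    if (idChanged(board)) {
      if (stopOnChange) return pushed; // `return`: every remaining board is skipped
      continue;                        // `continue`: only this board is skipped
    }
    pushed.push(board.id);
  }
  return pushed;
}
```

With boards A, B, C where B's id changed mid-loop, the return variant pushes only A, while the continue variant pushes A and C. Any corrected parent queued after the changed board is lost in the first case.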

Result on affected accounts: a folder tile that opens correctly on the device that created it, but opens to an empty/missing board on any other device or after the local cache is cleared, because the server copy references an id that was never saved.

Reproduction

Manual (real-world)

  1. Log in as a user (so the parent is a real server board with a long id).
  2. Open DevTools → Network → set to Offline.
  3. Inside one of your boards, add a folder tile. This creates child board C with a short id; the parent P now has a tile with loadBoard = C.shortId; both are marked PENDING. Edit a tile inside the new folder so C is dirty.
  4. Go back Online. Sync starts and pushes parent P first, carrying C.shortId.
  5. Force the child to not complete: use DevTools "Block request URL" on the child POST /board (or switch back to Offline) right after the parent's request completes.
  6. Inspect P on the server (fetch it directly, or open it on a second device): its folder tile's loadBoard is the short id, and opening that folder elsewhere resolves to nothing.

If the child create is allowed to succeed, the Network tab still shows the bad first push (parent with the short id) followed by a corrected push — that is the self-heal. The persistent corruption requires the corrected push to be lost (step 5).
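For step 6, a small check over the boards fetched from the server can flag the corruption without opening each folder by hand. This is a sketch under assumptions: `findDanglingLoadBoards` is a hypothetical helper, and it treats any loadBoard that is not among the fetched board ids as dangling, so a tile that legitimately links to a board outside the fetched set would be reported too.

```javascript
// Hypothetical diagnostic: list tiles whose loadBoard does not match the id
// of any board in the given set.
function findDanglingLoadBoards(serverBoards) {
  const knownIds = new Set(serverBoards.map(board => board.id));
  const dangling = [];

  for (const board of serverBoards) {
    for (const tile of board.tiles || []) {
      if (tile.loadBoard && !knownIds.has(tile.loadBoard)) {
        dangling.push({
          boardId: board.id,
          tileId: tile.id,
          loadBoard: tile.loadBoard,
        });
      }
    }
  }
  return dangling;
}
```

Checking membership rather than id length avoids hard-coding any assumption about what a short id looks like.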

Deterministic (automated)

Using the real reducer + thunk with the ../../api module mocked, capture the request payloads and force the child create to reject:

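// Assumes `actions` (the Board.actions exports) and a `makeRealStore` helper
// that builds a store from the real reducer + thunk are already in scope.
import API from '../../api';
jest.mock('../../api');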

it('server receives a tile.loadBoard pointing at a local short id', async () => {
  const parent = {
    id: '1234567890ABCDEF',                 // long id = existing server board
    email: 'me@x.com',
    tiles: [{ id: 't1', loadBoard: 'childShort' }],
  };
  const child = { id: 'childShort', email: 'me@x.com', tiles: [] }; // needs create

  const store = makeRealStore({
    boards: [parent, child],
    syncMeta: {
      '1234567890ABCDEF': { status: 'PENDING' },
      childShort: { status: 'PENDING' },
    },
  });

  const puts = [];
  API.updateBoard.mockImplementation(b => { puts.push(b); return Promise.resolve(b); });
  API.createBoard.mockRejectedValue(new Error('boom')); // child create fails

  await store.dispatch(actions.pushLocalChangesToApi([]));

  // Parent was sent to the server carrying the LOCAL short id:
  expect(puts[0].tiles[0].loadBoard).toBe('childShort');
  // and it is never corrected, because the child never got a server id:
  expect(puts.some(p => p.tiles[0].loadBoard !== 'childShort')).toBe(false);
});

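Replacing the mockRejectedValue line with a mock that resolves to a board carrying a server id lets the test observe the self-heal (a second, corrected push, at which point the final assertion no longer holds). That is why the persistent corruption is intermittent rather than constant, even though the bad first push happens every time.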

Relevant code

  • pushLocalChangesToApi — sync push, per-board, parent-before-child order: Board.actions.js:747
  • Push loop, swallowed errors: Board.actions.js:786
  • updateApiObjectsNoChild (no parent rewrite): Board.actions.js:1045
  • updateApiMarkedBoards return-instead-of-continue: Board.actions.js:1098
  • updateApiObjects (atomic edit path, for contrast): Board.actions.js:1152
  • CREATE_API_BOARD_SUCCESS local rewrite: Board.reducer.js:379
  • REPLACE_BOARD (does not rewrite parent refs): Board.reducer.js:216
  • Offline creation marks boards PENDING: Board.reducer.js:298, :328
  • Edit path early-returns when logged out: Board.container.js:1087
