@acwhite211
Member

@acwhite211 acwhite211 commented Sep 26, 2025

Fixes #6490

Looking at the Django docs, it seems that explicitly locking MySQL tables from within a Django atomic transaction can leave tables locked. The post_resource function has the transaction.atomic decorator and calls create_obj -> autonumber_and_save -> do_autonumbering -> lock_tables, so the table lock is taken inside the transaction. This needs to be avoided to fix the issue.

LOCK TABLES implicitly commits any active transaction, and while tables are locked the connection can only access the locked tables. When Django later tries to manage its savepoint/transaction stack, the connection state no longer matches what Django expects. This can leave the connection stuck.

This solution uses GET_LOCK and RELEASE_LOCK, which act like a mutex. Using these instead of LOCK TABLES should avoid the problem of transactions getting stuck. These SQL named locks are scoped so that only one autonumbering operation can be performed on a table at a time, which avoids race conditions between autonumbering operations.
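As a rough illustration of the mutex idea (not the code in this PR), a named lock can be wrapped in a context manager around any DB-API style cursor. The names `named_lock`, `autonumber_lock_name`, and `LockTimeout` are hypothetical, and the `%s` placeholders assume a MySQL driver or Django's `connection.cursor()`.

```python
from contextlib import contextmanager

# MySQL 5.7+ limits named-lock names to 64 characters.
MAX_LOCK_NAME_LENGTH = 64


class LockTimeout(Exception):
    """Raised when GET_LOCK does not return 1 (timeout or error)."""


def autonumber_lock_name(db_name, table_name):
    # Named locks are server-wide, so the database name is included
    # to keep two databases on one server from blocking each other.
    return f"autonumber:{db_name}:{table_name}"


@contextmanager
def named_lock(cursor, name, timeout_seconds=10):
    """Hold a MySQL named lock for the duration of the with-block."""
    if len(name) > MAX_LOCK_NAME_LENGTH:
        raise ValueError(f"lock name longer than {MAX_LOCK_NAME_LENGTH} characters")
    # GET_LOCK returns 1 on success, 0 on timeout, NULL on error.
    cursor.execute("SELECT GET_LOCK(%s, %s)", [name, timeout_seconds])
    (acquired,) = cursor.fetchone()
    if acquired != 1:
        raise LockTimeout(f"could not acquire lock {name!r}")
    try:
        yield
    finally:
        # Release even if the body raised, so the lock is not left
        # held by a long-lived connection.
        cursor.execute("SELECT RELEASE_LOCK(%s)", [name])
        cursor.fetchone()
```

Unlike LOCK TABLES, acquiring or releasing a named lock does not commit the open transaction, which is why it can be used inside transaction.atomic.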

In addition to the SQL named lock, we'll use Django's select_for_update method to perform row-level locking during the autonumbering operation. This replaces the need for explicit table locks. While the SQL named lock prevents race conditions between autonumbering processes, the row-level locking ensures that other processes do not create or edit records that the autonumbering operation depends on while it is running. Indexes are added to the common autonumbering fields so that the row-level locking works with Django.
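A minimal sketch of the row-locking step, assuming the highest existing number within a scope is what the autonumbering logic needs. The helper name `highest_number_in_scope` is hypothetical and not taken from this PR. It uses only standard QuerySet methods and has to be called inside transaction.atomic, since Django raises an error when a select_for_update queryset is evaluated outside a transaction.

```python
def highest_number_in_scope(queryset, scope_field, scope_value, number_field):
    """Return the highest value of number_field within one scope.

    The matching rows are selected FOR UPDATE, so other transactions
    cannot modify them until the surrounding transaction ends.
    """
    return (
        queryset.select_for_update()
        .filter(**{scope_field: scope_value})
        .order_by(f"-{number_field}")
        .values_list(number_field, flat=True)
        .first()
    )


# Hypothetical usage; the model and scope value are assumptions:
#
#   with transaction.atomic():
#       highest = highest_number_in_scope(
#           Collectionobject.objects, "collectionmemberid", 4, "catalognumber"
#       )
```

On InnoDB, a locking read locks the index records it scans, so without an index covering the scope and number fields the statement may lock far more rows than the ones it returns.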

Checklist

  • Self-review the PR after opening it to make sure the changes look good and
    self-explanatory (or properly documented)
  • Add relevant issue to release milestone
  • Add PR to documentation list

Testing instructions

  • Test creating a collection object to make sure that the autonumbering functionality is working.
  • Try creating two collection objects, in two different windows, at the same time, and see that both get created without timing out.
  • Set up two different medium-to-large batch edits in separate windows. Run them both at roughly the same time. See that both complete and neither hangs indefinitely.
  • Try running queries in the QB while a batch edit is running. The queries should complete before timing out.

@acwhite211
Member Author

Hey @melton-jason, can you test this PR out before I open it up to other testers?

@acwhite211
Member Author

I created some other solutions that try to avoid race conditions when doing autonumbering, but none are fully satisfactory:

@acwhite211 acwhite211 marked this pull request as ready for review October 24, 2025 20:25
@acwhite211 acwhite211 changed the title from "Fix autonumbering locking with GET_LOCK and RELEASE_LOCK" to "Fix autonumbering locking with SQL named locks and Django row locking" Oct 24, 2025
@acwhite211
Member Author

TODO: add indexes to all the commonly used autonumbering fields.

Contributor

@melton-jason melton-jason left a comment

Sorry for taking so long on this.
I haven't yet finished reviewing the autonumbering/locking changes, but here are my comments so far regarding the migrations!

Contributor

(just a note/consideration)
Very nice, glad we're considering some more indices here! 🏆

I would love to see other indices (like those proposed in Add missing field indexes (#7482)) if the benefit of adding them would outweigh the cost, although that could definitely be a future PR.

I will warn that indexes can result in a performance loss for write, insert, and delete operations, as each index on a table needs to be maintained (i.e., the B-Tree or similar data structure updated).

In the case of Specify for example, this could slow down (non-negligibly or perhaps even significantly, depending on the index data structure and data being inserted/updated) WorkBench and BatchEdit operations along with common DataEntry operations.

If this is a fine trade-off (such as the case where there will be significantly more reads than writes overall and the index is well-structured for the intended optimization), I say shoot for the stars 🚀

Member Author

Ya, sounds good. I'm looking through the issue and creating a separate PR to create those specific indexes. The difference for this PR is that the indexes are always on two values (scoping_field, autonumbering_field). So far we have added these two indexes:

operations = [
    migrations.AddIndex(
        model_name='accession',
        index=models.Index(fields=['division_id', 'accessionnumber'], name='AccScopeAccessionsnumberIDX'),
    ),
    migrations.AddIndex(
        model_name='collectionobject',
        index=models.Index(fields=['collectionmemberid', 'catalognumber'], name='ColObjScopeCatalognumberIDX'),
    ),
]

For this PR, I'm going through all the proposed indexes and picking out the ones that are common for autonumbering. Let me know if there are any specific ones you want to include.
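To illustrate why a two-column (scoping_field, autonumbering_field) index suits this lookup, here is a small sketch that uses SQLite from the Python standard library as a stand-in for MySQL. The table is a simplified, assumed version of collectionobject, and SQLite's planner output differs from MySQL's EXPLAIN, so this only demonstrates the general idea: a query that filters on the scope and asks for the highest number can be answered from the composite index.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Simplified stand-in table; the real schema has many more columns.
conn.execute(
    "CREATE TABLE collectionobject ("
    " collectionobjectid INTEGER PRIMARY KEY,"
    " collectionmemberid INTEGER NOT NULL,"
    " catalognumber TEXT)"
)
# Same column order as the index added in the migration above.
conn.execute(
    "CREATE INDEX ColObjScopeCatalognumberIDX"
    " ON collectionobject (collectionmemberid, catalognumber)"
)
conn.executemany(
    "INSERT INTO collectionobject (collectionmemberid, catalognumber) VALUES (?, ?)",
    [(4, "000000001"), (4, "000000002"), (8, "000000001")],
)

query = "SELECT MAX(catalognumber) FROM collectionobject WHERE collectionmemberid = ?"

# The last column of each EXPLAIN QUERY PLAN row describes the access path.
plan = conn.execute("EXPLAIN QUERY PLAN " + query, (4,)).fetchall()
plan_detail = " ".join(row[3] for row in plan)

highest = conn.execute(query, (4,)).fetchone()[0]
print(plan_detail)
print(highest)
```

Putting the scope column first matters: the index can then narrow to one scope by equality and read the numbers within it in order.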

@github-project-automation github-project-automation bot moved this from 📋Back Log to Dev Attention Needed in General Tester Board Nov 19, 2025
Projects

Status: Dev Attention Needed

Development

Successfully merging this pull request may close these issues.

The Specify Worker unexpectedly stops responding

3 participants