
Add Python 3.14 support #428

Draft
Copilot wants to merge 2 commits into add-python3.14 from copilot/sub-pr-402-yet-again

Conversation

Contributor

Copilot AI commented Feb 26, 2026

Adds Python 3.14 support via DevOps updates and minimal implementation changes to the parallel extract machinery. Python 3.14 restricts sharing bare multiprocessing synchronization objects across spawned processes, requiring manager-backed primitives.
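The manager-backed pattern described above can be sketched as follows (a minimal illustration, not the exact zstash code; `make_monitor_state` is a hypothetical helper):

```python
import multiprocessing


def make_monitor_state():
    # Hypothetical helper: manager-backed primitives live in a separate
    # manager server process and are handed to workers as proxies, so
    # spawned processes never unpickle bare synchronization objects
    # (which Python 3.14 restricts).
    manager = multiprocessing.Manager()
    current_tar_index = manager.Value("i", 0)  # shared integer counter proxy
    lock = manager.Lock()                      # shared lock proxy
    return manager, current_tar_index, lock
```

Workers receive only the proxies; incrementing the counter under the lock works across spawn-started processes, and `manager.shutdown()` tears down the server process when all workers have exited.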

Summary

Objectives:

  • Add support for Python 3.14
  • Update DevOps files to include Python 3.14 in the CI matrix
  • Make minimal implementation changes to parallel.py and extract.py to support Python 3.14

Issue resolution:

Select one: This pull request is...

  • a bug fix: increment the patch version
  • a small improvement: increment the minor version
  • a new feature: increment the minor version
  • an incompatible (non-backwards compatible) API change: increment the major version

Small Change

  • To merge, I will use "Squash and merge". That is, this change should be a single commit.
  • Logic: I have visually inspected the entire pull request myself.
  • Pre-commit checks: All the pre-commit checks have passed.

Details

  • DevOps: Add Python 3.14 to CI matrix; pin setuptools >=65.0.0 in conda/dev.yml
  • parallel.py:
    • Switch PrintMonitor to multiprocessing.Manager-backed Value and Lock (Python 3.14 compat — Python 3.14 restricts sharing unpickled synchronization objects across spawned processes)
    • Precompute tar → index map; raise RuntimeError with informative message on double-increment (defensive)
    • Add comments on 10 ms polling trade-off and why manager-backed primitives are needed
  • extract.py:
    • Use args.hpss with fallback to config.hpss so --hpss is not required when the DB has the value
    • Each worker opens its own sqlite3 connection (connections are not process-safe)
    • Wrap the extraction loop in try/finally to guarantee DB connection cleanup on exception; guard sqlite3.connect() with try/except to close on cursor-creation failure
    • Add break after assigning a tar to its worker in the greedy loop (each tar belongs to exactly one worker)
    • Add comment on sleep(0.01) latency trade-off in the main-process drain loop
    • manager.shutdown() called after all workers exit and the failure queue is fully drained
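The drain loop and shutdown ordering in the last two bullets might look roughly like this (a hedged sketch; `drain_failures` and the worker interface are illustrative, not the actual zstash API):

```python
import queue
import time


def drain_failures(failure_queue, workers):
    # Hypothetical sketch of the main-process drain loop: collect
    # failures while any worker is alive, then do a final drain so
    # nothing enqueued near exit is lost.
    failures = []
    while any(w.is_alive() for w in workers):
        try:
            while True:
                failures.append(failure_queue.get_nowait())
        except queue.Empty:
            pass
        # Sleep briefly to avoid busy-waiting; ~10 ms worst-case latency.
        time.sleep(0.01)
    # Final drain after all workers have exited.
    try:
        while True:
            failures.append(failure_queue.get_nowait())
    except queue.Empty:
        pass
    return failures
```

Only after this function returns (all workers dead, queue fully drained) is it safe to call `manager.shutdown()`, since the queue proxy becomes unusable once the manager process exits.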

Note on extract.py diff size: The ~400-line diff is dominated by 4-space re-indentation from wrapping the for i in range(nfiles): loop in try/finally. With --ignore-all-space, the diff reduces to ~20 lines covering the 4 substantive logic changes listed above.



…tools version, manager comment, polling comment, race detection

Co-authored-by: forsyth2 <30700190+forsyth2@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add support for Python 3.14" to "Add Python 3.14 support" on Feb 26, 2026
@forsyth2
Collaborator

@copilot Can you explain the changes in zstash/extract.py? They look pretty extensive for just addressing code review comments. Is it mostly indentation changes?

Contributor Author

Copilot AI commented Feb 26, 2026

> @copilot Can you explain the changes in zstash/extract.py? They look pretty extensive for just addressing code review comments. Is it mostly indentation changes?

Yes, almost entirely indentation. The ~400-line "change" in the diff is dominated by the four spaces of indentation added by wrapping the for i in range(nfiles): loop in try/finally.

With --ignore-all-space, the diff shrinks to 20 lines covering just 4 substantive changes:

  1. Line 359: break added after finding the right worker in the tar-assignment loop
  2. Lines 398-399: Two comment lines added above time.sleep(0.01)
  3. Lines 525-538: DB connection opening refactored — con declared as Optional[sqlite3.Connection] = None upfront, and the sqlite3.connect() call wrapped in try/except so the connection is closed if cursor creation fails
  4. Lines 557 / 750-754: try: added before the for loop and finally: + DB cleanup moved inside it (the for loop body itself is unchanged, just re-indented by 4 spaces)
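The connection-handling pattern in changes 3 and 4 boils down to the following (an illustrative sketch; the table schema and the `extract_worker` helper are hypothetical, not zstash's actual code):

```python
import sqlite3
from typing import List, Optional


def extract_worker(db_path: str) -> List[str]:
    # Open this worker's own connection (sqlite3 connections are not
    # process-safe); close it immediately if cursor creation fails.
    con: Optional[sqlite3.Connection] = None
    try:
        con = sqlite3.connect(db_path, detect_types=sqlite3.PARSE_DECLTYPES)
        cur = con.cursor()
    except Exception:
        if con is not None:
            con.close()
        raise
    names: List[str] = []
    try:
        # The extraction loop runs under try/finally so the connection
        # is released even if extraction raises partway through.
        for (name,) in cur.execute("SELECT name FROM files"):
            names.append(name)
    finally:
        cur.close()
        con.close()
    return names
```

The Optional annotation plus the `con is not None` guard mirror the diff's defensive handling: the connection is cleaned up on every exit path, whether the failure happens at open time or mid-loop.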

@forsyth2
Collaborator

For reference, with --ignore-all-space, we get simply:

> git diff HEAD^ HEAD --ignore-all-space | cat
diff --git a/conda/dev.yml b/conda/dev.yml
index 85f64d3..9ea3757 100644
--- a/conda/dev.yml
+++ b/conda/dev.yml
@@ -6,7 +6,7 @@ dependencies:
   # =================
   - pip
   - python >=3.11,<3.15
-  - setuptools
+  - setuptools >=65.0.0
   - sqlite
   - six >=1.16.0
   - globus-sdk >=3.15.0,<4.0
diff --git a/zstash/extract.py b/zstash/extract.py
index c72432f..3dfaa81 100644
--- a/zstash/extract.py
+++ b/zstash/extract.py
@@ -356,6 +356,7 @@ def multiprocess_extract(
         for workers_idx in range(len(workers_to_tars)):
             if db_row.tar in workers_to_tars[workers_idx]:
                 workers_to_matches[workers_idx].append(db_row)
+                break
 
     # Ensure each worker processes tars in order
     for worker_matches in workers_to_matches:
@@ -395,6 +396,8 @@ def multiprocess_extract(
                 failures.append(failure_queue.get_nowait())
         except queue.Empty:
             pass
+        # Sleep briefly to avoid busy-waiting. This limits failure detection
+        # to ~100 checks/second (~10 ms worst-case latency for short jobs).
         time.sleep(0.01)
 
     # Drain any remaining failures after all processes have exited.
@@ -520,14 +523,19 @@ def extractFiles(  # noqa: C901
     will be opened for this worker process.
     """
     # Open database connection if not provided (parallel case)
+    con: Optional[sqlite3.Connection] = None
+    close_db: bool = False
     if cur is None:
-        con: sqlite3.Connection = sqlite3.connect(
+        try:
+            con = sqlite3.connect(
                 get_db_filename(cache), detect_types=sqlite3.PARSE_DECLTYPES
             )
             cur = con.cursor()
-        close_db: bool = True
-    else:
-        close_db = False
+            close_db = True
+        except Exception:
+            if con is not None:
+                con.close()
+            raise
 
     failures: List[FilesRow] = []
     tfname: str
@@ -546,6 +554,7 @@ def extractFiles(  # noqa: C901
         # Don't have the logger print to the console as the message come in.
         logger.propagate = False
 
+    try:
         for i in range(nfiles):
             files_row: FilesRow = files[i]
 
@@ -738,8 +747,9 @@ def extractFiles(  # noqa: C901
                     else:
                         raise TypeError("Invalid tfname={}".format(tfname))
 
+    finally:
         # Close database connection if we opened it
-    if close_db:
+        if close_db and con is not None:
             cur.close()
             con.close()
 
diff --git a/zstash/parallel.py b/zstash/parallel.py
index 4bcda59..741572a 100644
--- a/zstash/parallel.py
+++ b/zstash/parallel.py
@@ -43,7 +43,10 @@ class PrintMonitor(object):
             tar: i for i, tar in enumerate(tars_to_print)
         }
 
-        # Use a simple counter to track which tar we're on
+        # Use manager-backed primitives instead of bare multiprocessing primitives
+        # for Python 3.14 compatibility: Python 3.14 restricts sharing unpickled
+        # synchronization objects across processes (see bpo-38119 / gh-84582).
+        # A simple counter tracks which tar we're on.
         self._current_tar_index: multiprocessing.managers.ValueProxy = manager.Value(
             "i", 0
         )
@@ -70,6 +73,7 @@ class PrintMonitor(object):
                 raise NotYourTurnError()
 
             attempted = True
+            # Poll every 10 ms: lower CPU usage vs 1 ms at cost of ~10 ms latency.
             time.sleep(0.01)
 
     def done_enqueuing_output_for_tar(
@@ -86,6 +90,14 @@ class PrintMonitor(object):
         with self._lock:
             if self._current_tar_index.value == tar_index:
                 self._current_tar_index.value += 1
+            else:
+                raise RuntimeError(
+                    "Attempted to advance tar index for tar {} (expected index"
+                    " {}) but current index is {}. This indicates a programming"
+                    " error.".format(
+                        workers_curr_tar, tar_index, self._current_tar_index.value
+                    )
+                )
 
 
 class ExtractWorker(object):

@forsyth2
Collaborator

Using the performance plots of #427, following up on #402 (comment):

PR 428 (this one) vs main
[Performance plot: extract_pr428]

PR 428 (this one) vs PR #402
[Performance plot: extract_pr428_vs_pr402]
