Optimize thread_queue_hack with std::mutex #83

Open
lucifer4330k wants to merge 1 commit into jesec:master from lucifer4330k:fix-pr14

@lucifer4330k lucifer4330k commented Feb 18, 2026

Replaced the manual spinlock implementation (using `__sync_bool_compare_and_swap` and `usleep(0)`) in `thread_queue_hack` with `std::mutex`.

This change:
- Replaces busy-waiting with efficient OS-level blocking.
- Fixes a potential lock leak in `push_back` when the queue overflows (previously threw without unlocking).
- Avoids using `memset` on non-trivial types (which was unsafe if member types were changed).
- Standardizes synchronization using C++17 primitives.

Benchmark results (standalone test with 10k iterations):
- 4 threads: ~5% throughput improvement (293k -> 309k ops/s).
- 8 threads: Comparable throughput, but with better CPU scheduling behavior.

#gsoc

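The lock-leak fix described above can be illustrated with a minimal sketch. This is not the code from the PR: `bounded_queue_sketch`, its capacity, and the helper `overflow_releases_lock` are hypothetical names, and the sketch assumes an RAII guard (`std::lock_guard`) rather than a manual unlock before the throw. The point is only that a scope-bound guard releases the mutex during stack unwinding, so the overflow path cannot leave the queue locked.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <stdexcept>

// Hypothetical stand-in for thread_queue_hack; not the project's real class.
template <typename T, std::size_t MaxSize>
class bounded_queue_sketch {
public:
  // Appends a value; throws std::overflow_error if the queue is full.
  void push_back(T value) {
    std::lock_guard<std::mutex> guard(m_mutex);
    if (m_size == MaxSize)
      throw std::overflow_error("bounded_queue_sketch: overflow");  // guard unlocks while unwinding
    m_queue[m_size++] = value;
  }

  std::size_t size() {
    std::lock_guard<std::mutex> guard(m_mutex);
    return m_size;
  }

  // Sanity probe: reports whether the mutex can be acquired right now.
  // try_lock() is allowed to fail spuriously, so treat this as a hint only.
  bool lock_is_free() {
    if (!m_mutex.try_lock())
      return false;
    m_mutex.unlock();
    return true;
  }

private:
  std::mutex  m_mutex;
  T           m_queue[MaxSize] = {};
  std::size_t m_size = 0;
};

// Fills a two-slot queue, forces one overflow, and checks the lock afterwards.
inline bool overflow_releases_lock() {
  bounded_queue_sketch<int, 2> q;
  q.push_back(1);
  q.push_back(2);
  assert(q.size() == 2);
  try {
    q.push_back(3);  // expected to throw
  } catch (const std::overflow_error&) {
    return q.lock_is_free() && q.size() == 2;
  }
  return false;  // no exception means the overflow check did not fire
}
```

With the old spinlock pattern, the equivalent throw would have left the lock word set, so the next caller would spin indefinitely. Here the second call to `size()` after the exception returns normally.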
Copilot AI review requested due to automatic review settings February 18, 2026 14:06

Copilot AI left a comment

Pull request overview

This PR modernizes the thread_queue_hack synchronization mechanism by replacing a custom spinlock implementation with std::mutex.

Changes:

  • Replaced manual spinlock (__sync_bool_compare_and_swap with busy-waiting) with std::mutex for more efficient OS-level blocking
  • Fixed lock leak in push_back when queue overflows by unlocking before throwing exception
  • Replaced unsafe memset calls with std::fill to properly handle the now-non-trivial type containing std::mutex
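As a rough illustration of the `memset` to `std::fill` point, the sketch below resets only the slot array, element by element, and leaves the mutex alone. The struct `slot_queue_sketch`, the function-pointer `value_type`, and `max_size = 4` are assumptions made for the example; the real `thread_queue_hack` members may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <mutex>

// Hypothetical queue layout; the real class is not shown in this thread.
struct slot_queue_sketch {
  using value_type = void (*)();
  static constexpr std::size_t max_size = 4;

  std::mutex m_mutex;
  value_type m_queue[max_size + 1];

  slot_queue_sketch() { clear(); }

  // Resets every slot to a null pointer by assignment. Unlike a byte-wise
  // memset, this goes through the element type and never touches m_mutex.
  void clear() {
    std::lock_guard<std::mutex> guard(m_mutex);
    std::fill(std::begin(m_queue), std::end(m_queue), value_type());
  }
};

inline void noop_task() {}

// Dirties one slot, clears the queue, and reports whether all slots are null.
inline bool clear_resets_all_slots() {
  slot_queue_sketch q;
  q.m_queue[0] = &noop_task;
  assert(q.m_queue[0] != nullptr);
  q.clear();
  return std::all_of(std::begin(q.m_queue), std::end(q.m_queue),
                     [](slot_queue_sketch::value_type v) { return v == nullptr; });
}
```

`value_type()` value-initializes the function pointer to null, so the fill stays correct if the element type later changes to something with a non-trivial default state.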

Comment thread src/thread_base.cc
Comment on lines +88 to 89
std::mutex m_mutex;
value_type m_queue[max_size + 1];

Copilot AI Feb 18, 2026

The placement of std::mutex as the first member may affect cacheline alignment strategy. The original code used a simple int m_lock which is 4 bytes, while std::mutex is typically much larger (often 40+ bytes on Linux). This could cause the m_queue array to span additional cachelines, potentially impacting the performance benefits of the lt_cacheline_aligned attribute on this class. Consider placing m_mutex after m_queue to keep the frequently-accessed queue data at the beginning of the cacheline, or verify that this new layout doesn't negatively impact false sharing patterns.

Suggested change
- std::mutex m_mutex;
- value_type m_queue[max_size + 1];
+ value_type m_queue[max_size + 1];
+ std::mutex m_mutex;
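One way to check the layout concern rather than guess is to measure both member orders. The sketch below is an assumption-laden stand-in: the 64-byte cache line, `kMaxSize = 32`, the function-pointer `value_type`, and both struct names are hypothetical, and `alignas` is used in place of the project's `lt_cacheline_aligned` attribute, whose definition is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>

constexpr std::size_t kCacheLine = 64;  // assumed line size; platform dependent
constexpr std::size_t kMaxSize   = 32;  // hypothetical capacity
using value_type = void (*)();          // hypothetical element type

// Layout as in the PR: mutex first, queue second.
struct alignas(kCacheLine) mutex_first_sketch {
  std::mutex m_mutex;
  value_type m_queue[kMaxSize + 1];
};

// Layout from the suggested change: queue first, mutex second.
struct alignas(kCacheLine) queue_first_sketch {
  value_type m_queue[kMaxSize + 1];
  std::mutex m_mutex;
};

// Number of cache lines covered by `bytes` bytes starting at `offset`
// inside an object that itself begins on a cache-line boundary.
constexpr std::size_t lines_spanned(std::size_t offset, std::size_t bytes) {
  return (offset + bytes - 1) / kCacheLine - offset / kCacheLine + 1;
}

inline std::size_t queue_lines_mutex_first() {
  return lines_spanned(offsetof(mutex_first_sketch, m_queue),
                       sizeof(mutex_first_sketch::m_queue));
}

inline std::size_t queue_lines_queue_first() {
  return lines_spanned(offsetof(queue_first_sketch, m_queue),
                       sizeof(queue_first_sketch::m_queue));
}

// Returns true if reordering actually reduces the lines the queue touches.
inline bool reordering_saves_a_line() {
  assert(queue_lines_queue_first() <= queue_lines_mutex_first());
  return queue_lines_queue_first() < queue_lines_mutex_first();
}
```

Starting the array at offset 0 can never span more lines than starting it after the mutex, but whether it spans fewer depends on `sizeof(std::mutex)` and the array size on the target platform, so the result of `reordering_saves_a_line()` is worth checking before reordering members. The count also says nothing about false sharing between the mutex and hot queue slots, which would need profiling.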

Comment thread src/thread_base.cc
#include <algorithm>
#include <cassert>
#include <cstdlib>
#include <cstring>

Copilot AI Feb 18, 2026

The <cstring> header is no longer needed after removing the memset calls. Consider removing this unused include.

Suggested change
- #include <cstring>

Comment thread src/thread_base.cc
#include <cstring>
#include <iostream>
#include <iterator>
#include <mutex>

Copilot AI Feb 18, 2026

The PR description mentions "C++17 primitives", but std::mutex has been available since C++11. This is a minor inaccuracy in the description, though the change is still correct and beneficial.
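To make the version distinction concrete, here is a small sketch of which locking helpers belong to which standard. `std::mutex` and `std::lock_guard` are C++11; `std::scoped_lock`, which can acquire several mutexes at once, is the C++17 addition. The `counter_sketch` type and both functions are hypothetical and unrelated to the PR's code.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical shared counter used only for illustration.
struct counter_sketch {
  std::mutex m_mutex;
  int        m_value = 0;
};

// C++11: std::mutex with std::lock_guard.
inline void add_cxx11(counter_sketch& c, int n) {
  std::lock_guard<std::mutex> guard(c.m_mutex);
  c.m_value += n;
}

#if __cplusplus >= 201703L
// C++17: std::scoped_lock locks both mutexes with deadlock avoidance.
inline void transfer_cxx17(counter_sketch& from, counter_sketch& to, int n) {
  std::scoped_lock guard(from.m_mutex, to.m_mutex);
  from.m_value -= n;
  to.m_value += n;
}
#endif

// Adds 5 to one counter, optionally moves 2 to another, and returns the sum.
inline int demo_total() {
  counter_sketch a, b;
  add_cxx11(a, 5);
#if __cplusplus >= 201703L
  transfer_cxx17(a, b, 2);
#endif
  assert(a.m_value + b.m_value == 5);  // a transfer does not change the total
  return a.m_value + b.m_value;
}
```

A queue guarded by a single mutex only needs the C++11 half, which is consistent with the reviewer's note.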
