Skip to content

Prevent concurrent access to data in InMemoryPerProcess' resolveXXX methods #3216

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

eduar-hte
Copy link
Contributor

@eduar-hte eduar-hte commented Aug 7, 2024

what

Lock access to InMemoryPerProcess' data in resolveXXX methods.

why

As reported in #3054, the resolveXXX methods in InMemoryPerProcess are not acquiring a lock/mutex to prevent concurrent access to the data structures that may be modified at the same time from other threads, and thus triggering undefined behaviour.

changes

  • Replaced inheritance of std::unordered_multimap data structure with data member to prevent potential clients to use it without acquiring the mutex to protect concurrent access.
  • Updates to InMemoryPerProcess::resolveFirst
    • Updated the code to store the expired var in expiredVars to delete them after iterating over the range (and releasing the read lock, as delIfExpired needs to acquire it for exclusive access), as the current call to delIfExpired would invalidate the range triggering undefined behaviour on the followingiteration.
    • Noticed that in commit 118e1b3 the call to delIfExpired in this function is done using it->second.getValue() instead of it->first, which seems incorrect (based on similar code in other resolveXXX functions).
  • Updated InMemoryPerProcess::delIfExpired to use std::find_if (with a lambda that matches both the key and the isExpired condition) because the data structure is a multimap. The version introduced in commit 118e1b3 could find an entry (not necessarily the first, because the map is unordered) where isExpired is false and exit, while another entry could be expired.
  • Replaced pthreads lock with std C++11 std::shared_mutex
    • Provides exclusive/shared lock access so that multiple readers can access the data at the same time, but only one writer. this is used to favor query performance by allowing more concurrent access to the data until an update needs to be performed.
    • Simplifies acquisition and disposal of lock/mutex with std::lock_guard, which has RAII semantics.
    • NOTE: Because std::shared_mutex is not recursive, calls to another function that tries to acquire the lock will fail. Introduced __store & __updateFirst helper methods to workaround this.

@eduar-hte
Copy link
Contributor Author

eduar-hte commented Aug 7, 2024

@airween: I took a look at the multithreaded example you mentioned here (and adapted it to align with the similar code in the reading_logs_via_rule_message example). I could run it successfully (even after adding more threads and requests).

Do you want me to upload it as part of this PR both as an example and also if you want to test some other rules to stress test this code?

@eduar-hte eduar-hte force-pushed the inmemory-collection-shared-mutex branch from 4e4c044 to 3718eca Compare August 7, 2024 22:06
@marcstern marcstern added the 3.x Related to ModSecurity version 3.x label Aug 8, 2024
@eduar-hte eduar-hte force-pushed the inmemory-collection-shared-mutex branch from 3718eca to c51500c Compare August 8, 2024 14:06
@airween
Copy link
Member

airween commented Aug 8, 2024

@eduar-hte - please let us know if you finished the work on this PR.

Btw: have you seen the Sonarcloud's notifications?

@eduar-hte
Copy link
Contributor Author

@eduar-hte - please let us know if you finished the work on this PR.

I think I'm done with the changes, as I tried to limit them to introduce the std::shared_mutex to support exclusive & shared access to the data (protecting the areas that @martinhsv alluded to in #3054.

There is still this question I tried to ask you in this previous comment about whether we should add the multithreaded example you were working on in the context of that issue and that I reworked to align with the changes to the reading_logs_via_rule_message example.

Btw: have you seen the Sonarcloud's notifications?

Yes, I went through all of them.

Most of them (Extract this nested code block into a separate function & Replace this declaration by a structured binding declaration) are about the existing InMemoryPerProcess code, which I tried not to update to limit the changes in this PR to introducing the std::shared_mutex. However, because I had to introduce scopes to unlock the mutex before deleting expired entries, Sonarcloud looks at that as new/updated code.

Some of them (Replace this use of "std::thread" with "std::jthread") are related to suggestions to use C++20 features, which we have to disregard as we're currently targeting C++17 (as I said before, it'd be nice to see how to change Sonarcloud's project configuration for ModSecurity to report issues targeting C++17 instead).

@eduar-hte eduar-hte force-pushed the inmemory-collection-shared-mutex branch from c51500c to 3d9027b Compare August 8, 2024 18:13
@airween
Copy link
Member

airween commented Aug 9, 2024

Do you want me to upload it as part of this PR both as an example and also if you want to test some other rules to stress test this code?

Yes, I think that would be fine - thanks!

@airween
Copy link
Member

airween commented Aug 9, 2024

Do you want me to upload it as part of this PR both as an example and also if you want to test some other rules to stress test this code?

Yes, I think that would be fine - thanks!

@eduar-hte sorry, seems you've already uploaded with commit 3d9027b - thanks.

airween
airween previously approved these changes Aug 9, 2024
Copy link
Member

@airween airween left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - thanks for this very impressive PR.

@eduar-hte
Copy link
Contributor Author

@eduar-hte sorry, seems you've already uploaded with commit 3d9027b - thanks.

No, those were updates to the existing example, reading_logs_via_rule_message, which is multithreaded.

I've just uploaded your test from #3054 (and rewritten it to use libModSecurity C++ API and std::thread, to align it with the changes to the other example).

@eduar-hte eduar-hte force-pushed the inmemory-collection-shared-mutex branch 4 times, most recently from df25d02 to da2bc50 Compare August 9, 2024 18:30
- As reported in owasp-modsecurity#3054, the resolve methods in InMemoryPerProcess are
  not acquiring a lock/mutex to prevent concurrent access to the data
  structures that may be modified at the same time from other threads,
  and thus triggering undefined behaviour.
- Replace inheritance of std::unordered_multimap data structure with
  data member to prevent potential clients to use it without acquiring
  the mutex to protect concurrent access.
- Replace pthreads lock with std C++11 std::shared_mutex
  - Provides exclusive/shared lock access so that multiple readers can
    access the data at the same time, but only one writer. this is used
    to favor query performance by allowing more concurrent access to the
    data until an update needs to be performed.
  - Simplifies acquisition and disposal of lock/mutex with
    std::lock_guard, which has RAII semantics.
  - NOTE: Because std::shared_mutex is not recursive, calls to another
    function that tries to acquire the lock will fail. Introduced
    __store & __updateFirst helper methods to workaround this.
- Updates to InMemoryPerProcess::resolveFirst
  - Updated the code to store the expired var in 'expiredVars' to delete
    them after iterating over the range (and releasing the read lock, as
    'delIfExpired' needs to acquire it for exclusive access), as the
    current call to 'delIfExpired' would invalidate the range triggering
    undefined behaviour on the following iteration.
  - Noticed that in commit 118e1b3 the call to 'delIfExpired' in this
    function is done using 'it->second.getValue()'' instead of
    'it->first', which seems incorrect (based on similar code in other
    resolveXXX functions).
- Updated InMemoryPerProcess::delIfExpired to use 'std::find_if' (with a
  lambda that matches both the key and the 'isExpired' condition)
  because the data structure is a multimap. The version introduced in
  commit 118e1b3 could find an entry (not necessarily the first, because
  the map is unordered) where 'isExpired' is 'false' and exit, while
  another entry could be expired.
…other platforms

- Replaced WITHOUT_XXX build options with WITH_XXX to make it easier to
  understand and configure.
- Updated GitHub workflow to align with these changes and include a
  build 'with lmdb' (again, analogous to non-Windows configurations)
- Replaced pthread_mutex_t in modsecurity::operators::Pm with std::mutex
- Replaced pthread's thread usage in reading_logs_via_rule_message
  example with std::thread.
  - Simplified and modernized C++ code.
- Removed unnecessary includes of pthread.h
…rween)

- Rewritten to use C++ libModSecurity API and std::thread (instead of
  pthreads)
@eduar-hte eduar-hte force-pushed the inmemory-collection-shared-mutex branch from da2bc50 to 4bf9616 Compare August 9, 2024 18:35
Copy link

sonarqubecloud bot commented Aug 9, 2024

@eduar-hte
Copy link
Contributor Author

@eduar-hte - please let us know if you finished the work on this PR.

@airween - this is now done. when this is merged, issue #3054 should be closed.

@airween airween merged commit 718d121 into owasp-modsecurity:v3/master Aug 13, 2024
49 checks passed
@eduar-hte eduar-hte deleted the inmemory-collection-shared-mutex branch August 13, 2024 17:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.x Related to ModSecurity version 3.x
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants