-
Notifications
You must be signed in to change notification settings - Fork 2.3k
Description
Block issue happen when I do a websocket function. I will describe the reason directly as follow:
This function will be called when I use websocket function
...
int getNewMode(SocketImpl* sockImpl, int mode)
{
ScopedLock lock(_mutex);
auto it = _socketMap.find(sockImpl);
if (it != _socketMap.end())
mode |= it->second.second;
return mode;
}
We can see that there is an mutex operation which try to lock resource. At the same time, there is another thread executing poll option:
oid SocketReactor::run()
{
_pThread = Thread::current();
while (!_stop)
{
try
{
if (!hasSocketHandlers())
{
onIdle();
Thread::trySleep(static_cast<long>(_timeout.totalMilliseconds()));
}
else
{
bool readable = false;
PollSet::SocketModeMap sm = _pollSet.poll(_timeout);
...................
in _pollSet.poll,
PollSet::SocketModeMap poll(const Poco::Timespan& timeout)
{
PollSet::SocketModeMap result;
Poco::Timespan remainingTime(timeout);
int rc;
ScopedLock lock(_mutex);
do
{
Poco::Timestamp start;
rc = epoll_wait(_epollfd, &_events[0],
static_cast<int>(_events.size()), static_cast<int>(remainingTime.totalMilliseconds()));
.......
The _mutex also can be visit, and when this thread lock successfully, it will sleep remainingTime.totalMilliseconds() at worse case.
This will cause an deep sleep in locking option.
In my machine, following will be shown
The thread of poll always lock resource, and the thread of getNewMode never can lock resource even thread of poll unlock.
To fix this issue, I have a personal advice:
.......
// ScopedLock lock(_mutex); Move this to $$$$1$$$$
do
{
Poco::Timestamp start;
rc = epoll_wait(_epollfd, &_events[0],
static_cast<int>(_events.size()), static_cast<int>(remainingTime.totalMilliseconds()));
if (rc == 0) {
return result;
}
// if we are hitting the events limit, resize it; even without resizing, the subseqent
// calls would round-robin through the remaining ready sockets, but it's better to give
// the call enough room once we start hitting the boundary
if (rc >= _events.size()) _events.resize(_events.size()*2);
else if (rc < 0)
{
// if interrupted and there's still time left, keep waiting
if (SocketImpl::lastError() == POCO_EINTR)
{
Poco::Timestamp end;
Poco::Timespan waited = end - start;
if (waited < remainingTime)
{
remainingTime -= waited;
continue;
}
}
else SocketImpl::error();
}
}
while (false);
ScopedLock lock(_mutex); // $$$$1$$$$
for (int i = 0; i < rc; i++)
......
I checked code and find _mutx is only responsible for _socketMap, so I this this approach could be ok.
To Reproduce
I test multiple platforms, following are results
- Android: ok
- Ubuntu22.04, must be happen
Expected behavior
I want ok
Logs
None
Screenshots
None
Please add relevant environment information:
- OS Type and Version: Ubuntu22.04
- POCO Version: 1.12.4