Skip to content

Fix bugs in datum_queue_free#29

Closed
jesterhodl wants to merge 5 commits intoOCEAN-xyz:masterfrom
jesterhodl:bugs-in-datum_queue_free
Closed

Fix bugs in datum_queue_free#29
jesterhodl wants to merge 5 commits intoOCEAN-xyz:masterfrom
jesterhodl:bugs-in-datum_queue_free

Conversation

@jesterhodl
Copy link
Copy Markdown
Contributor

  1. q->buffer[1] doesn't seem to be free()'d

  2. q->buffer[0] is duplicated, you probably meant to zero out buffer[1]

  3. shouldn't the unlocking order be reverse of locking order to prevent deadlocks?

1. q->buffer[1] doesn't seem to be free()'d

2. q->buffer[0] is duplicated, you probably meant to zero out buffer[1]

3. shouldn't the unlocking order be reverse of locking order to prevent deadlocks?
@wizkid057
Copy link
Copy Markdown
Member

buffer[1] cannot be free'd because its allocated in a single call for buffer[0] and then split.

q->buffer[1] = ((char *)q->buffer[0]) + ((max_items + 16) * item_size);

@wizkid057
Copy link
Copy Markdown
Member

wizkid057 commented Oct 23, 2024

Looking at the locks, in this case the goal is to acquire all locks as write locks, which, once acquired, prevents other threads from acquiring the locks at all until released. There's no contention possible on the ordering of unlocking since the cleanup function has the sole lock at that point.

The unsetting of [0] twice is a bug, though.

Looking more closely, there may be a race issue where the "datum_queue_free" function frees data while a "datum_queue_add_item" tries to get a lock that's already destroyed. Probably need to check if the buffer ptr is not null around here before trying to get the lock.

This fixes the original harmless buglet of zeroing out q->buffer[0] twice

Will think about the rest of the potential race condition now.
@jesterhodl
Copy link
Copy Markdown
Contributor Author

I'm reading through the code, but I think I found one more thing.
Don't we have a one 0 too much in the condition?

        for (i=0;i<10000000;i++) {
                if (i < 99999990)  // should be 9999999, anything larger doesn't make sense 
// 10000000
// 99999990

@wizkid057
Copy link
Copy Markdown
Member

I'm reading through the code, but I think I found one more thing. Don't we have a one 0 too much in the condition?

        for (i=0;i<10000000;i++) {
                if (i < 99999990)  // should be 9999999, anything larger doesn't make sense 
// 10000000
// 99999990

Yep, seems so.

Anything larger than 9999999 doesn't make sense
@jesterhodl
Copy link
Copy Markdown
Contributor Author

jesterhodl commented Oct 24, 2024

Is this what you had in mind?

			pthread_rwlock_unlock(&q->active_buffer_rwlock);
			
			// Check potential race condition with datum_queue_free() before acquiring a lock
			if (q->buffer[buffer_id] == NULL) {
				DLOG_ERROR("Buffer pointer null. Queue already destroyed?");
				return -1;
			}
			
			// get a write lock for that buffer
			pthread_rwlock_wrlock(&q->buffer_rwlock[buffer_id]);

@luke-jr
Copy link
Copy Markdown
Contributor

luke-jr commented Mar 8, 2025

This was merged...

@luke-jr luke-jr closed this Mar 8, 2025
@jesterhodl jesterhodl deleted the bugs-in-datum_queue_free branch April 1, 2025 10:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants