Personal/linuxsmiths/wait for ongoing flush must drain flush callback #162

Open
wants to merge 794 commits into base: main
Conversation

linuxsmiths
Contributor

No description provided.

Nagendra Tomar added 30 commits October 12, 2024 01:09
Notify from inside the lock, else it's possible (though extremely unlikely)
that another thread may delete ctx while we are accessing it.
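The pattern the commit describes can be sketched as follows. This is a minimal illustration with hypothetical names (`ctx`, `finish()`), not the repository's code: notifying while still holding the lock means the waiter cannot re-acquire the mutex, observe completion, and delete the context while the notifier is still touching it.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

// Hypothetical context object; a waiter deletes it once done is set.
struct ctx {
    std::mutex m;
    std::condition_variable cv;
    bool done = false;
};

void finish(ctx* c) {
    std::lock_guard<std::mutex> lk(c->m);
    c->done = true;
    // Notify while holding the lock: the waiter blocks on the mutex
    // inside its wait() and cannot see done==true and delete c while
    // we are still accessing it here.
    c->cv.notify_one();
}
```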
Make max_ino, min_ino atomic.
Make mt19937 thread local.
If shutdown() deletes an inode, reset the existing inode_map iterator, as
deleting the current inode can result in deletion of other inodes: if the
current inode is a directory inode, purge of that directory can cause
some other constituent inodes to be deleted.
Since write can get a bc allocated by readahead, we need to enforce
rsize <= wsize
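The constraint above can be expressed as a simple clamp. This is an illustrative sketch only (the function name is hypothetical): since a write may reuse a bytes_chunk (bc) allocated by readahead, the effective read size must never exceed the write size.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: clamp rsize so that any bc allocated by
// readahead (sized up to rsize) is always usable by a write (wsize).
size_t effective_rsize(size_t rsize, size_t wsize) {
    return rsize <= wsize ? rsize : wsize;
}
```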
make num_no_readahead atomic.
Actually we intentionally didn't make it atomic as we are ok with occasional
inaccuracy, but since it's not a common path, let's silence TSAN.
rpc_transport::last_context
Actually we don't fix it, but add the following TSAN suppression.
- race:rpc_transport::last_context
Misc fixes and comments added.
This makes progress very slow with file reads/writes which need to
get the bc vector.
… directory cache purge of the current directory inode
…e not changed.

w/o this, in read-only scenarios we can have attrs expire, and then get_file_size() will
return -1 and we will not do readahead.
…of whether it could or could not add the entry to directory cache.

This is because if a file/dir is created inside a directory, our cache technically
becomes out of sync with the server. Note that we may fail to add to the directory
cache because it may be full or due to some other error.
…) and send_readdir_or_readdirplus_response() to fix an inode leak

Now we hold an inode lookupcnt ref for all entries with a valid inode.
This way, send_readdir_or_readdirplus_response() can call decref() for the inodes
it doesn't pass to fuse. Previously we were decrementing dircachecnt, and there was a possibility
of that becoming 0 while lookupcnt was already 0. Those inodes would be sitting in inode_map
with lookupcnt=dircachecnt=0.
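The invariant behind this fix can be sketched in a few lines. The struct and method names below are hypothetical stand-ins, not the actual nfs_inode API: every directory-cache entry with a valid inode also pins lookupcnt, so the two counts can never both be nonzero-then-stranded at 0 while the inode still sits in inode_map.

```cpp
#include <cassert>

// Hypothetical minimal model of the refcounting rule described above.
struct inode_refs {
    int lookupcnt = 0;
    int dircachecnt = 0;

    // A readdir entry referencing this inode is created: it holds both
    // a dircache ref and a lookupcnt ref.
    void add_dircache_entry() {
        ++dircachecnt;
        ++lookupcnt;
    }

    // send_readdir_or_readdirplus_response() drops both refs for
    // inodes it doesn't pass to fuse.
    void drop_dircache_entry() {
        assert(dircachecnt > 0 && lookupcnt > 0);
        --dircachecnt;
        --lookupcnt;
    }

    // The inode may leave inode_map only when nothing references it.
    bool can_destroy() const { return lookupcnt == 0 && dircachecnt == 0; }
};
```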
We rename all locks to the form <context>_lock_N, where N is the order of the
lock. A thread can only acquire a lock of higher order (greater N) than the highest
order lock it's currently holding, i.e., a thread holding a lock *_lock_N
cannot acquire any lock from *_lock_0 to *_lock_N-1 (it can only acquire *_lock_N+1
and higher order locks).
Later I plan to add assertions to catch violations of the above rule.

nfs_client::inode_map_lock -> nfs_client::inode_map_lock_0
nfs_inode::ilock -> nfs_inode::ilock_1
nfs_inode::readdircache_lock -> nfs_inode::readdircache_lock_2
nfs_client::jukebox_seeds_lock -> nfs_client::jukebox_seeds_lock_39
ra_state::ra_lock -> ra_state::ra_lock_40
rpc_task_helper::task_index_lock -> rpc_task_helper::task_index_lock_41
rpc_stats_az::stats_lock -> rpc_stats_az::stats_lock_42
bytes_chunk_cache::chunkmap_lock -> bytes_chunk_cache::chunkmap_lock_43
membuf::mb_lock -> membuf::mb_lock_44
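The commit mentions that assertions to catch ordering violations are planned. One way such an assertion could look is sketched below; `ordered_lock` and `highest_held_order` are hypothetical names, not the repository's implementation.

```cpp
#include <cassert>
#include <mutex>

// Highest lock order currently held by this thread; -1 means none.
thread_local int highest_held_order = -1;

// Hypothetical wrapper enforcing the "only acquire higher-order
// locks" rule from the commit message.
class ordered_lock {
    std::mutex mtx;
    const int order;
    int prev_order = -1;   // order held just before this lock was taken
public:
    explicit ordered_lock(int ord) : order(ord) {}

    void lock() {
        // Rule: a thread may only acquire a lock of strictly higher
        // order than the highest-order lock it currently holds.
        assert(order > highest_held_order);
        mtx.lock();
        prev_order = highest_held_order;
        highest_held_order = order;
    }

    void unlock() {
        // Assumes LIFO (reverse-order) release, which the ordering
        // rule naturally implies.
        highest_held_order = prev_order;
        mtx.unlock();
    }
};
```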
…correctly

as we were clearing the nfs_inode.
Added a TODO to fix it.
Also fixed a log for readdir.
Nagendra Tomar and others added 30 commits January 19, 2025 16:19
… at the server.

w/o this we might return stale data for a file that's truncated by some other client.
* Inline flush to free the fuse thread.

* Fix

* Addressed the review comments.

* Audit changes.

* Refactored flushing and inline flushing code, with good comments.

---------

Co-authored-by: Nagendra Tomar <[email protected]>
(cherry picked from commit 222c5db)
* Invalidate attribute cache if post-op attributes are missing

* Address comment

* address comment

---------

Co-authored-by: amulyan13 <root@amnNfsCanaryVM3.vnwjqyduitfelpwtytwaokslhe.cbnx.internal.cloudapp.net>
* Working change of inline writes handling.

* Code Refactoring.

* Changed to vector pointer.

* Nitpicks.
* Removing tenant and subid from config

* Minor fixes

* Fixed run command

* Fixes

* Audit changes.

* Updated with clientid processing and ten+sub in cb

* Addressed comments

* Updated

* Updated

* Revert "Updated"

This reverts commit 44693ce.

* Update

* Update

* Update

---------

Co-authored-by: Nagendra Tomar <[email protected]>
This can help us see if any specific fuse callback handler is blocking or taking more time.
…side sync_membufs() is seen to fail sometimes.
… before starting the first write in sync_membufs()
Co-authored-by: Ubuntu <shubham@shubham808VM-westeurope.2vwp434kux0ehoz4y3jbj5wcwa.ax.internal.cloudapp.net>
This can be called when cache is being shutdown on file close.
This will delete dirty/uncommitted membufs too.
Co-authored-by: Nagendra Tomar <[email protected]>
…e flush to complete.

We cannot hold flush_lock while waiting for any WRITE RPCs to complete as
write_iov_callback()->on_flush_complete() also grabs the flush_lock.
This can cause a deadlock.
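The rule above — never wait for WRITE RPCs while holding flush_lock non-releasably — can be illustrated with a condition variable, which atomically releases the lock while waiting so the completion path can acquire it. The names below are hypothetical simplifications of the code described in the commit, not the actual implementation.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>

std::mutex flush_lock;              // stand-in for the real flush_lock
std::condition_variable flush_cv;
int writes_in_flight = 0;

// Completion path: write_iov_callback()->on_flush_complete() takes
// flush_lock. If a waiter held flush_lock across the wait, this
// would deadlock.
void on_write_complete() {
    std::lock_guard<std::mutex> lk(flush_lock);
    --writes_in_flight;
    flush_cv.notify_all();
}

// Correct waiting pattern: condition_variable::wait() releases
// flush_lock while blocked, so on_write_complete() can make progress.
void wait_for_writes() {
    std::unique_lock<std::mutex> lk(flush_lock);
    flush_cv.wait(lk, [] { return writes_in_flight == 0; });
}
```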
…the flush to complete

We have a rule that anyone waiting for a write to complete must not
hold the flush_lock, as write_iov_callback() may also take the flush
lock.
Otherwise we can have a deadlock where write_iov_callback() first releases
the membuf lock, which makes wait_for_ongoing_flush() believe that all
flushes are done and that it can take the flush_lock safely, but write_iov_callback()
calls on_flush_complete() later, which grabs the flush_lock. This can
cause a deadlock. The fix is to let the callback drain. Since
wait_for_ongoing_flush() is a rarely used function, we make that
check there, so that write_iov_callback() can run w/o any additional check or
complexity.
Another way of solving this would be to release the membuf lock
after on_flush_complete() runs, but that would have added unnecessary
restrictions on what we can do inside on_flush_complete().
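The drain described above can be sketched with an in-flight counter: the waiter spins until no callback can still grab flush_lock, keeping the hot callback path free of extra synchronization. This is an illustrative model with hypothetical names, not the PR's actual code.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>

// Count callbacks still executing, so wait_for_ongoing_flush() never
// takes flush_lock while a write_iov_callback() may still grab it.
std::atomic<int> callbacks_in_flight{0};
std::mutex flush_lock;

void write_iov_callback() {
    callbacks_in_flight.fetch_add(1);
    // ... membuf lock is released around here; without the counter, a
    // waiter could now wrongly conclude the flush is fully done ...
    {
        std::lock_guard<std::mutex> lk(flush_lock);
        // on_flush_complete() work happens here, under flush_lock.
    }
    callbacks_in_flight.fetch_sub(1);
}

void wait_for_ongoing_flush() {
    // Drain: wait until no callback can still take flush_lock. The
    // cost lands on this rarely used path, not on the callback.
    while (callbacks_in_flight.load() != 0)
        std::this_thread::yield();
    std::lock_guard<std::mutex> lk(flush_lock);
    // ... flush state can be safely examined here ...
}
```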
6 participants