Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Enable FUSE SDK with S3 test #16808

Draft
wants to merge 37 commits into
base: main
Choose a base branch
from
Draft

Conversation

LuQQiu
Copy link
Contributor

@LuQQiu LuQQiu commented Jan 20, 2023

Blocked by #16767

JiamingMai and others added 30 commits November 21, 2022 14:09
### What changes are proposed in this pull request?

Remove the restriction of UFS for SDK.

### Why are the changes needed?

To meet the business requirements.

### Does this PR introduce any user facing changes?

No, it doesn't.

pr-link: Alluxio#16570
change-id: cid-cca2be339f42d6035968999f258c04ad76d1b85e
### What changes are proposed in this pull request?
Add property key to enable dora client read location polic

### Why are the changes needed?

enable dora client read location polic

### Does this PR introduce any user facing changes?

addition of property keys

pr-link: Alluxio#16601
change-id: cid-30229fa9aac78359ebdd12b1cb317245016c370c
### What changes are proposed in this pull request?

Implement the skeleton of dora work

### Why are the changes needed?


### Does this PR introduce any user facing changes?

no

pr-link: Alluxio#16615
change-id: cid-e48f5bd1e4611d5e6b300cf9851fce18adb808c1
### What changes are proposed in this pull request?

Add method to get file status.

### Why are the changes needed?

We need to get file status in Dora.

### Does this PR introduce any user facing changes?

No, it doesn't.

pr-link: Alluxio#16627
change-id: cid-d877d3d1240663afc89bc6d6d079bf75a8751711
### What changes are proposed in this pull request?

Please outline the changes and how this PR fixes the issue.

### Why are the changes needed?

Please clarify why the changes are needed. For instance,
1. If you propose a new API, clarify the use case for a new API.
2. If you fix a bug, describe the bug.

### Does this PR introduce any user facing changes?

Please list the user-facing changes introduced by your change, including
1. change in user-facing APIs
2. addition or removal of property keys
3. webui

pr-link: Alluxio#16632
change-id: cid-6affd9d6cb0a01ea8352bfa7af7731dab49333ae
### What changes are proposed in this pull request?

Please outline the changes and how this PR fixes the issue.

### Why are the changes needed?

Please clarify why the changes are needed. For instance,
1. If you propose a new API, clarify the use case for a new API.
2. If you fix a bug, describe the bug.

### Does this PR introduce any user facing changes?

Please list the user-facing changes introduced by your change, including
1. change in user-facing APIs
2. addition or removal of property keys
3. webui

pr-link: Alluxio#16637
change-id: cid-99bc5802f28799347181313efb9e37b2cd46b186
Use mdtest to get file status of 1000 files, by removing the isFile
call, the get status performance can double.

pr-link: Alluxio#16678
change-id: cid-31edf9b34080ddd5595c9ce613ddaa8e29901a66
### What changes are proposed in this pull request?

Remove extra get status call in fuse.open()

### Why are the changes needed?

Reduce one unneeded get metadata call

### Does this PR introduce any user facing changes?

NA

pr-link: Alluxio#16664
change-id: cid-5a9abb7673491737ba376cdf2eeb2a13a1baf63e
### What changes are proposed in this pull request?

Remove some overhead in big file read (cache hit/ cache miss) in fuse
sdk.

### Why are the changes needed?

Improve overall read performance

### Does this PR introduce any user facing changes?

NA

pr-link: Alluxio#16670
change-id: cid-d0d32eeebf9039753881b160443791325ccf5dad
### What changes are proposed in this pull request?

Enable default client-side metadata cache (20,000 entries with 40MB
memory).

### Why are the changes needed?

Each fuse.open() has two extra get file status calls which cannot be
provided via kernel metadata cache but can via user space metadata
cache. Enable user space metadata cache can largely improve the overall
fuse FIO small file read performance. S3FS-FUSE has similar settings of
enable user space metadata cache (100,000 entries with 40MB memory) by
default.

FIO benchmarking FUSE SDK against s3-fuse in terms of small file read:
fio --name=sequentialread --rw=read --bs=100k --numjobs=1
--filesize=100k --direct=1 --group_reporting --nrfiles 1000

<google-sheets-html-origin><style type="text/css"><!--td {border: 1px
solid #cccccc;}br {mso-data-placement:same-cell;}--></style>

| S3 FUSE | ALLUXIO FUSE
-- | -- | --
Original | 9000 KiB/s | 1500 KiB/s
Kernal metadata cache (attr_timeout=7200, entry_timeout=7200) | 9000
KiB/s | 2400 KiB/s
User metadata cache | 9000 KiB/s | 1150 MiB/s


### Does this PR introduce any user facing changes?

Yeah, alluxio-fuse mount <..> metadata cache option with default value
changed
-o metadata_cache_size=<size> (Default=\"20000\" (around 40MB memory))
-o metadata_cache_expire=<timeout> (Default=\"no expire\")

pr-link: Alluxio#16662
change-id: cid-5423f1ab7f37ed89cdaa1ac0e85328478fbd1ce4
Implement basic functionalities for Dora worker.

pr-link: Alluxio#16644
change-id: cid-522ecaf0f7f39915fc8d55a2aa83b5b114355458
### What changes are proposed in this pull request?

Fix an NPE in PermissionCheckTest.

### Why are the changes needed?

The `mMetricsMaster` was initialized to `null` and the setup was adding
a null pointer to the registry. A null pointer should have been rejected
by the registry but it was not the case until refactoring was done in
Alluxio#16644.

### Does this PR introduce any user facing changes?
No.

pr-link: Alluxio#16734
change-id: cid-7c61bd50e97dba6c211d6fea29f1555d0fa91e16
Set default fuse version to 3 because of easy to use and better
performance.
One can sudo yum install fuse3 and then alluxio-fuse to be able to enjoy
the
better performance brought by MAX_IDLE_THREADS=64 (64 threads instead 10
idle threads to execute fuse operations for better performance).

pr-link: Alluxio#16682
change-id: cid-bfc194d628a887b2508e249a2bb583c64540d8f9
### What changes are proposed in this pull request?

Add mission proto lock content.

### Why are the changes needed?

Alluxio#16578 Modified the proto file but not the `proto.lock` file.

### Does this PR introduce any user facing changes?
No user facing changes.

pr-link: Alluxio#16651
change-id: cid-adbcdf461ceb5f8c43379ff510c3c9ffabcd992f
### What changes are proposed in this pull request?

Dora client side affinity

### Why are the changes needed?

skip grpc call of the get block location from master

### Does this PR introduce any user facing changes?
new configruations

pr-link: Alluxio#16726
change-id: cid-bfda07a61fc0a9ed21e9f47652e973cc4e59ca9b
### What changes are proposed in this pull request?

Enable data transmission with Netty.

User is able to enable Netty data transmission by setting the following
configurations:
alluxio.user.netty.data.transmission.enabled=true
alluxio.worker.network.netty.channel=nio

### Why are the changes needed?

Support Netty data transmission that has higher performance.

### Does this PR introduce any user facing changes?

No, it doesn't.

pr-link: Alluxio#16599
change-id: cid-0af83e5128edee32c1512beea6ad79c545ada424
Implement `getChannel`. The implementation does not yet enable zero-copy
transfer with netty.

pr-link: Alluxio#16742
change-id: cid-a2ecf8d06c0c82f56c259f45e2348f6a90d0eeb1
### What changes are proposed in this pull request?

A temporary solution for metadata caching on worker

### Why are the changes needed?

reduce the api call to ufs

### Does this PR introduce any user facing changes?

Please list the user-facing changes introduced by your change, including
add properties

pr-link: Alluxio#16732
change-id: cid-28a01ea4081202b80a31393e097df3f59530e821
In the async page store, when worker threads run out, the caller thread
should perform the action.

The test had a race condition where all the threads could block,
including the calling thread.

pr-link: Alluxio#16731
change-id: cid-81d809c343968d68b57988978575d8b9e4ace63a

Original commit
Alluxio@7a4e0a5

pr-link: Alluxio#16797
change-id: cid-c0113006a76824e1cef41c6f44820e344e047ebe
@github-actions
Copy link

github-actions bot commented Jun 5, 2023

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in two weeks if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale The PR/Issue does not have recent activities and will be closed automatically label Jun 5, 2023
@apc999 apc999 force-pushed the main branch 2 times, most recently from 2fec0ec to b597c61 Compare October 17, 2023 19:11
@github-actions github-actions bot removed the stale The PR/Issue does not have recent activities and will be closed automatically label Jun 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants