really make timestamp reproducible: use UTC instead of forcing 'Z' st… #18529

Open
wants to merge 1,136 commits into base: main
Conversation

hboutemy

No description provided.

huanghua78 and others added 30 commits September 5, 2023 23:14
### What changes are proposed in this pull request?

Add isReadOnly() for FuseFileStream and implement this interface in various Streams.

### Why are the changes needed?

This interface is used to determine if the stream is read-only, which in turn determines whether a real data flush is needed.
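The shape of this change can be sketched as follows. This is a minimal, hypothetical model (`StreamSketch`, `ReadOnlyStream`, and `maybeFlush` are illustrative names, and the interface here is simplified, not Alluxio's actual `FuseFileStream` API):

```java
// Sketch: a stream interface gains isReadOnly() so callers can skip a real flush.
public final class StreamSketch {
  interface FuseFileStream {
    boolean isReadOnly();
    void flush();
  }

  static final class ReadOnlyStream implements FuseFileStream {
    @Override
    public boolean isReadOnly() {
      return true;
    }

    @Override
    public void flush() {
      // Nothing to persist for a read-only stream.
    }
  }

  // Only perform a real data flush for streams that may have buffered writes.
  static void maybeFlush(FuseFileStream stream) {
    if (!stream.isReadOnly()) {
      stream.flush();
    }
  }

  public static void main(String[] args) {
    maybeFlush(new ReadOnlyStream());
    System.out.println("flush skipped for read-only stream");
  }
}
```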

### Does this PR introduce any user facing changes?

N/A
			pr-link: Alluxio#18114
			change-id: cid-6a60e9e917bb0f8a439d76b51beaaba2b723dde4
Script `launch-process` was renamed to `launch-process-bash`. Update the name of the script in `entrypoint.sh`
			pr-link: Alluxio#18115
			change-id: cid-7053532e9fac93ef3f987e1525ae7260ac3a8dc0
### What changes are proposed in this pull request?

Add a config to turn on/off the sdk cache fallback

### Why are the changes needed?

The read fallback should be turned off by default to avoid a retry storm against the UFS.

### Does this PR introduce any user facing changes?

Addition of property keys.


			pr-link: Alluxio#18099
			change-id: cid-2c893a617ac5e90086db98610434db6d763c1fe9
One more place to count the cache hit requests
			pr-link: Alluxio#17878
			change-id: cid-0cb55b7ca627832ce2b53e80ea58cfd6a2730bfa
### What changes are proposed in this pull request?
Fix the array out of bound exception in presto

### Why are the changes needed?

Because we saw earlier that LocalCacheManager swallows all exceptions, an error accessing one of the cached files is ignored. However, the offset may have changed during the failed read attempt in LocalCacheManager. When LocalCacheManager returns -1 on error and the caller rereads from the lower-layer storage, the offset could be out of bounds.

Regarding why this issue only appears in versions 2.9.3 and later, it's because the offset became a member variable of our target buffer starting from 2.9.3. In earlier versions it was always a local variable, so there was no compounding of offset errors.
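The failure mode described above can be sketched as follows. This is a hypothetical, simplified model (`CacheReadSketch`, `readWithFallback`, and the `Cache` interface are illustrative names, not Alluxio's actual classes): if the cache layer advances the target buffer's position before failing with -1, the caller must restore the position before falling back to lower-layer storage.

```java
import java.nio.ByteBuffer;

public final class CacheReadSketch {
  interface Cache {
    int read(ByteBuffer target); // returns -1 on error
  }

  static int readWithFallback(Cache cache, ByteBuffer target, byte[] ufsData) {
    int savedPosition = target.position();
    int n = cache.read(target);
    if (n < 0) {
      // The failed attempt may have moved the position; rewind before the
      // fallback read, otherwise the retry starts past the intended offset.
      target.position(savedPosition);
      int len = Math.min(target.remaining(), ufsData.length);
      target.put(ufsData, 0, len);
      return len;
    }
    return n;
  }

  public static void main(String[] args) {
    ByteBuffer buf = ByteBuffer.allocate(8);
    // A faulty cache that advances the position, then reports failure.
    Cache faulty = b -> {
      b.put((byte) 0xFF);
      return -1;
    };
    int read = readWithFallback(faulty, buf, new byte[] {1, 2, 3, 4});
    System.out.println(read + " bytes read, position=" + buf.position());
  }
}
```

Without the `target.position(savedPosition)` reset, the stray byte written by the failed attempt stays in the buffer and the fallback read lands at the wrong offset, which is the compounding described above.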

### Does this PR introduce any user facing changes?

no

			pr-link: Alluxio#18098
			change-id: cid-c7b949d9146847b14ba672df56e21bc4bc5ad705
Count the hit and miss of ListStatus and GetFileInfo in worker
			pr-link: Alluxio#17848
			change-id: cid-97c8efb81088d7e9b636ef927fcbc62a166fe085
### What changes are proposed in this pull request?

Change metric type of LocalCacheState from counter to gauge.

### Why are the changes needed?

The counter type is not appropriate for the LocalCacheState metric, since its value is an enum, not a monotonically increasing count.

### Does this PR introduce any user facing changes?

No

			pr-link: Alluxio#18070
			change-id: cid-61037d3580e2f25f782e1c876ced0ecad5b312a4
### What changes are proposed in this pull request?

Change name of NoopMembershipManager to MasterMembershipManager.

### Why are the changes needed?

N/A

### Does this PR introduce any user facing changes?

N/A

			pr-link: Alluxio#18097
			change-id: cid-9f39e3b1a9b8cdad51e49bbb5d12e86d21b07e23
Reverts Alluxio#18070
Reverted per request due to backward-compatibility concerns.
			pr-link: Alluxio#18119
			change-id: cid-aeecbff3bc9d6bff41eec25405e87a2f2b79079f
### What changes are proposed in this pull request?

Refactor the Netty read handler of worker.

### Why are the changes needed?

The previous implementation creates a state machine per read request, instead of per channel. This implies that if two read requests are sent over the same channel, the worker would possibly use one channel to send data of different files or regions. This can lead to data corruption.

This PR proposes to use a state machine per channel, and handle channel events throughout the whole lifecycle of the channel. Situations like a faulty client sending a second request over the same channel before the first request completes are handled gracefully with a client error.

The state transitions look like the following:
![graph(3)](https://github.com/Alluxio/alluxio/assets/6999708/8088f14c-6224-4af4-929a-d6d3e0b8b2ef)


### Does this PR introduce any user facing changes?

No.

### Tests

Tests have been done with basic Alluxio CLI tools, as well as automated PrestoDB and TPC-DS tests.



			pr-link: Alluxio#17479
			change-id: cid-bb8b2c70f0bf0bd84e73d9e858bf6e80706427aa
Refactor Netty read handler to allow subclassing.

			pr-link: Alluxio#18120
			change-id: cid-d0d4125459644d8e92c04dd9decdb622f14cbdb6
### What changes are proposed in this pull request?
Implement per-thread cache context

### Why are the changes needed?
Enable fine-grained cache admission.

### Does this PR introduce any user facing changes?
no

			pr-link: Alluxio#18029
			change-id: cid-839bc71b2df158a4aeaedf22c5c7fb40dfd769e8
Fuse is supported on K8s. Remove the outdated limitation.
			pr-link: Alluxio#18122
			change-id: cid-99cacf64405867a720a5304ab5604b778c0b2127
### What changes are proposed in this pull request?

Change `runClass` in Benchmark to `exec class`

### Why are the changes needed?

Since the new Alluxio CLI changed from `runClass` to `exec class`, the related benchmark code should be changed as well to keep it working.

### Does this PR introduce any user facing changes?

no

			pr-link: Alluxio#18079
			change-id: cid-2e8b23a7fc272ed0266c81f5e816442fdb9cd25b
`uname -m` on arm == `arm64`
			pr-link: Alluxio#18123
			change-id: cid-c10f5760052ff7cc3757843df86710200d70b090
### What changes are proposed in this pull request?

Remove hostname from metrics key

### Why are the changes needed?

For easy aggregation on prometheus and grafana side

### Does this PR introduce any user facing changes?

Add a flag to disable this for compatibility

			pr-link: Alluxio#18121
			change-id: cid-ba6c2f9fae625747192044168fce7dc026c66b9c
besides the User-CLI.md doc, update other doc files that refer to `bin/alluxio` commands
- remove docs on path config
- remove starting/stopping job master/worker from contributor docs
			pr-link: Alluxio#18128
			change-id: cid-fc71dd493b16ef3aeeb0b1b190941c43b9af9cab
### What changes are proposed in this pull request?

Add a unit test for DoraWorkerClientServiceHandler, including a test for its listStatus function.

### Why are the changes needed?

Add unit test coverage for DoraWorkerClientServiceHandler.

### Does this PR introduce any user facing changes?

No.
			pr-link: Alluxio#18059
			change-id: cid-b82706a4419700f017584f3e5579d2ef3410aeb3
Make fuse max reader concurrency configurable. The default value was 64 and it was unchangeable.
			pr-link: Alluxio#18129
			change-id: cid-9c55821622329bd1e608da2e7445e8ab591df38a
Fix typo from alluxio.max.fuse.reader.concurrency to alluxio.fuse.max.reader.concurrency
			pr-link: Alluxio#18134
			change-id: cid-434086cf6ba9e9f8d173e3417fc8518963dfa102
update usages of bin/alluxio, bin/alluxio-start.sh and bin/alluxio-stop.sh to their new counterparts

simplify section of CephFS.md and remove sections related to mounting. the ufs must be configured as the root mount via alluxio-site.properties.
			pr-link: Alluxio#18136
			change-id: cid-fa7d0eec00c8fb136680ef6d5a2c7ee78571d123
dbw9580 and others added 27 commits December 14, 2023 05:29
### What changes are proposed in this pull request?

Fix outdated worker address info returned by consistent hash policy

Summary of changes:

1. `ConsistentHashProvider` is concerned only with `WorkerIdentity` when building the hash ring. Therefore, its APIs have been limited to accepting and returning `WorkerIdentity`s.
2. `ConsistentHashProvider.refresh` now accepts a set of worker identities instead of a list, as the order does not matter.
3. Added a test to cover the bug fix.

### Why are the changes needed?

Fix a bug where the consistent hash provider caches the `BlockWorkerInfo` of all workers, and when a worker changes its network addresses but its ID stays the same, the hash provider won't update the worker's info. A client will continue to use the outdated network address.

The fix is to make the hash provider consider only `WorkerIdentity`s, and let the client figure out the worker's address from the ID provided by the consistent hash provider.
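The separation described above can be sketched as follows. This is a hypothetical, heavily simplified model (`RingSketch` and plain `String` identities stand in for Alluxio's actual `ConsistentHashProvider` and `WorkerIdentity`): the ring stores only stable identities, and the possibly-changing address is resolved from fresh membership information at lookup time.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public final class RingSketch {
  private final TreeMap<Integer, String> mRing = new TreeMap<>();

  // Rebuild the ring from a set of identities; order does not matter.
  public void refresh(Set<String> workerIds) {
    mRing.clear();
    for (String id : workerIds) {
      mRing.put(id.hashCode(), id);
    }
  }

  // Returns the identity owning this key; the caller maps it to a current address.
  public String pick(String key) {
    Map.Entry<Integer, String> e = mRing.ceilingEntry(key.hashCode());
    return (e != null ? e : mRing.firstEntry()).getValue();
  }

  public static void main(String[] args) {
    RingSketch ring = new RingSketch();
    ring.refresh(Set.of("worker-1", "worker-2"));
    String id = ring.pick("/data/file");
    // Address lookup happens outside the ring, against fresh membership info,
    // so an address change never leaves a stale entry cached in the ring.
    Map<String, String> addresses =
        Map.of("worker-1", "10.0.0.1:29999", "worker-2", "10.0.0.2:29999");
    System.out.println(id + " -> " + addresses.get(id));
  }
}
```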

### Does this PR introduce any user facing changes?

No.

			pr-link: Alluxio#18434
			change-id: cid-93f1601d846385f314c79a556c8705d3983a1199
This PR improves the RESTful load API by using JSON as the response format.

## Examples:
### Submit Job Example:
```
// 20231211174310
// http://localhost:28080/v1/load?path=hdfs://node01:8020/testRoot/testDirectory2&opType=submit&verbose=true
{
  "success": true,
  "jobId": "0dbc0f47-580b-420f-b50e-d08a170746c8",
  "path": "hdfs://node01:8020/testRoot/testDirectory2",
  "message": "Load 'hdfs://node01:8020/testRoot/testDirectory2' is successfully submitted. JobId: 0dbc0f47-580b-420f-b50e-d08a170746c8\n"
}
```
### Get Job Progress Example:
```
// 20231211180110
// http://localhost:28080/v1/load?path=hdfs://node01:8020/testRoot/testDirectory2&opType=progress&verbose=true

{
  "jobState": "RUNNING",
  "path": "hdfs://node01:8020/testRoot/testDirectory2",
  "message": "Progress for loading path 'hdfs://node01:8020/testRoot/testDirectory2':\n\tSettings:\tbandwidth: unlimited\tverify: false\tmetadata-only: false\n\tTime Elapsed: 00:00:03\n\tJob State: RUNNING\n\tStage: RETRYING\n\tInodes Scanned: 4\n\tInodes Processed: 4\n\tBytes Loaded: 0B out of 0B\n\tThroughput: 0B/s\n\tFile Failure rate: 0.00%\n\tSubtask Failure rate: 0.00%\n\tFiles Failed: 0\n\tRecent failed subtasks: \n\tRecent retrying subtasks: \n\tSubtask Retry rate: 0.00%\n\tSubtasks on Retry Dead Letter Queue: 0\n",
  "respProperties": {
    "Files Failed": "0",
    "Recent failed subtasks": "",
    "Subtask Retry rate": "0.00%",
    "Throughput": "0B/s",
    "File Failure rate": "0.00%",
    "Subtasks on Retry Dead Letter Queue": "0",
    "Time Elapsed": "00",
    "Bytes Loaded": "0B out of 0B",
    "Stage": "RETRYING",
    "Inodes Scanned": "4",
    "Inodes Processed": "4",
    "Recent retrying subtasks": "",
    "Subtask Failure rate": "0.00%",
    "Settings": "bandwidth",
    "Job State": "RUNNING",
    "Progress for loading path 'hdfs": "//node01"
  }
}
```

```
// 20231211174358
// http://localhost:28080/v1/load?path=hdfs://node01:8020/testRoot/testDirectory2&opType=progress&verbose=true

{
  "jobState": "SUCCEEDED",
  "path": "hdfs://node01:8020/testRoot/testDirectory2",
  "message": "Progress for loading path 'hdfs://node01:8020/testRoot/testDirectory2':\n\tSettings:\tbandwidth: unlimited\tverify: false\tmetadata-only: false\n\tTime Elapsed: 00:00:16\n\tJob State: SUCCEEDED\n\tInodes Scanned: 4\n\tInodes Processed: 4\n\tBytes Loaded: 0B out of 0B\n\tThroughput: 0B/s\n\tFile Failure rate: 0.00%\n\tSubtask Failure rate: 0.00%\n\tFiles Failed: 0\n\tRecent failed subtasks: \n\tRecent retrying subtasks: \n\tSubtask Retry rate: 0.00%\n\tSubtasks on Retry Dead Letter Queue: 0\n",
  "respProperties": {
    "Files Failed": "0",
    "Recent failed subtasks": "",
    "Subtask Retry rate": "0.00%",
    "Throughput": "0B/s",
    "File Failure rate": "0.00%",
    "Subtasks on Retry Dead Letter Queue": "0",
    "Time Elapsed": "00",
    "Bytes Loaded": "0B out of 0B",
    "Inodes Scanned": "4",
    "Inodes Processed": "4",
    "Recent retrying subtasks": "",
    "Subtask Failure rate": "0.00%",
    "Settings": "bandwidth",
    "Job State": "SUCCEEDED",
    "Progress for loading path 'hdfs": "//node01"
  }
}
```
### Stop Job Example:
```
// 20231211180219
// http://localhost:28080/v1/load?path=hdfs://node01:8020/testRoot/testDirectory2&opType=stop&verbose=true

{
  "success": true,
  "path": "hdfs://node01:8020/testRoot/testDirectory2",
  "message": "Load 'hdfs://node01:8020/testRoot/testDirectory2' is successfully stopped.\n"
}
```

```
// 20231211180153
// http://localhost:28080/v1/load?path=hdfs://node01:8020/testRoot/testDirectory2&opType=stop&verbose=true

{
  "success": false,
  "path": "hdfs://node01:8020/testRoot/testDirectory2",
  "message": "Cannot find load job for path hdfs://node01:8020/testRoot/testDirectory2, it might have already been stopped or finished\n"
}
```

			pr-link: Alluxio#18464
			change-id: cid-1fff9a23457064ab71534909449c60a6b0123f22
### What changes are proposed in this pull request?

Remove additional white space in alluxio-fuse script

### Why are the changes needed?

`alluxio-fuse unmount <mnt_point>` is unable to find the pid of the AlluxioFuse process because the grep pattern isn't correct.

### Does this PR introduce any user facing changes?

No

			pr-link: Alluxio#18465
			change-id: cid-3e70b0c8edbaa0ba50d744fd6155b0d494a243f9
Support get page with RESTful API by specifying offset and length.

PAGE_URL_FORMAT = (
    "http://{worker_host}:{http_port}/v1/file/{path_id}/page/{page_index}?offset=100&length=1024"
)
			pr-link: Alluxio#18474
			change-id: cid-ba5b0c5050843ccc5642950beadc8a0b049948be
Support write page with RESTful API

### Usage
```
HTTP Method: POST
Request URL: http://localhost:28080/v1/file/<fileId>/page/<pageIndex>
HTTP Body: <page bytes>
```
			pr-link: Alluxio#18481
			change-id: cid-cab175a007bfcaf294e89adbe47531419036a245
```
$ df -h /mnt/fuse/
Filesystem      Size  Used Avail Use% Mounted on
alluxio-fuse    910T     0  910T   0% /mnt/fuse
```

### What changes are proposed in this pull request?

Add fake numbers for statfs

### Why are the changes needed?

Some application checks the available space in a file system before continuing to do file operations.

### Does this PR introduce any user facing changes?

A fake number (1 Petabytes) is provided to statfs. This number does not reflect real available storage space.

			pr-link: Alluxio#18482
			change-id: cid-9f60d185393b616be02bf8f473b2026f2047f28c
### What changes are proposed in this pull request?

Add Ketama Hashing, Jump Consistent Hashing, Maglev Hashing, and Multi Probe Hashing.

### Why are the changes needed?

Currently, Alluxio's worker selection policy is the Consistent Hash Policy. It brings too much time cost, is not sufficiently uniform, and is not strictly consistent.

Ketama: https://github.com/RJ/ketama
Jump Consistent Hashing: https://arxiv.org/pdf/1406.2294.pdf
Maglev Hashing: https://static.googleusercontent.com/media/research.google.com/zh-CN//pubs/archive/44824.pdf
Multi-Probe Hashing: https://arxiv.org/pdf/1505.00062.pdf

We strongly recommend using Maglev Hashing for the worker selection policy. In most situations it has the minimum time cost, and it is the most uniform and balanced hashing policy.

### Does this PR introduce any user facing changes?

`alluxio.user.worker.selection.policy` accepts the following values: `CONSISTENT`, `JUMP`, `KETAMA`, `MAGLEV`, `MULTI_PROBE`, `LOCAL`, `REMOTE_ONLY`, corresponding to the consistent hash, jump consistent hash, ketama hash, maglev hash, multi-probe hash, local worker, and remote-only policies respectively.

The current default value is `CONSISTENT`.

We recommend using Maglev Hashing, which has the best hash consistency and is the least time-consuming; that is, set `alluxio.user.worker.selection.policy` to `MAGLEV`. We will also consider making this the default value in the future.

**Ketama Hashing**
`alluxio.user.ketama.hash.replicas`: the number of replicas in the ketama hashing algorithm. When the set of workers changes, it guarantees the hash table changes only minimally. The number of replicas should be X times the number of physical nodes in the cluster, where X balances efficiency against cost.

**Jump Consistent Hashing**
None.

**Maglev Hashing**
`alluxio.user.maglev.hash.lookup.size`: the size of the lookup table in the maglev hashing algorithm; it must be a prime number. Maglev hashing generates a lookup table over the workers. The bigger the lookup table, the smaller the variance of the hashing algorithm, but a bigger lookup table consumes more time and memory.

**Multi Probe Hashing**
`alluxio.user.multi.probe.hash.probe.num`: the number of probes in the multi-probe hashing algorithm. The more probes, the smaller the variance of the hashing algorithm, but more probes consume more time and memory.
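Of the algorithms above, jump consistent hashing is compact enough to show in full. The following is a minimal sketch of the published algorithm (Lamping &amp; Veach), not Alluxio's actual implementation; it maps a 64-bit key to a bucket in `[0, numBuckets)` with no lookup table, and when the bucket count grows from `n` to `n+1`, a key either keeps its bucket or moves to the new bucket `n`.

```java
public final class JumpHash {
  // Returns a bucket in [0, numBuckets) for the given key.
  public static int jumpHash(long key, int numBuckets) {
    long b = -1;
    long j = 0;
    while (j < numBuckets) {
      b = j;
      // Linear congruential step from the paper; drives the pseudo-random jumps.
      key = key * 2862933555777941757L + 1;
      j = (long) ((b + 1) * ((double) (1L << 31) / (double) ((key >>> 33) + 1)));
    }
    return (int) b;
  }

  public static void main(String[] args) {
    // The same key always maps to the same bucket for a fixed bucket count.
    System.out.println(JumpHash.jumpHash(42L, 10));
  }
}
```

This is why the Jump Consistent Hashing policy needs no tuning property: there is no replica count or table size to configure.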


			pr-link: Alluxio#17817
			change-id: cid-bad21c6e5ad83eb3da15a8960ba372b14c67b081
…b.login.autorenewal' in HDFS docs



### What changes are proposed in this pull request?

Update the correct kerberos configuration 'alluxio.hadoop.kerberos.keytab.login.autorenewal' in HDFS docs to avoid user confusion.

### Why are the changes needed?

Fix Alluxio#18486 



			pr-link: Alluxio#18487
			change-id: cid-b8f08e2f67e5f10aa1426de7629b8e268e339433
### What changes are proposed in this pull request?

Create metadata directory in initiateMultipartUpload method.

### Why are the changes needed?

Each request calls the initialization method of the handler and sends an `exists` request to the master, which is unnecessary.


			pr-link: Alluxio#18462
			change-id: cid-3efd076d7eb33cc063609fa1e1003e3aff480be6
`format` commands no longer exist in the CLI, and neither do job services. Delete some entrypoint code.

Solves Alluxio#18466
			pr-link: Alluxio#18490
			change-id: cid-0ee0b45a012a29df4a793107d7f8cce4ca98fc99
### What changes are proposed in this pull request?

Fix the issue of S3 range read.

### Why are the changes needed?

There is a bug when setting the S3 range offset.

### Does this PR introduce any user facing changes?

Users can try the following command to validate S3 range read:
`aws --endpoint http://localhost:39999/api/v1/s3 s3api get-object --range bytes=10-20 --bucket [bucket-name] --key=[key-name] [output-file]`
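The kind of bug this PR describes can be illustrated with a small sketch. This is hypothetical code (`RangeParser` is an illustrative name, not an Alluxio class): an HTTP `Range` header such as `bytes=10-20` is inclusive on both ends, so the read length is `end - start + 1`, and an off-by-one in that conversion is exactly the sort of range-offset bug being fixed.

```java
public final class RangeParser {
  /** Returns {offset, length} parsed from an inclusive byte-range header. */
  public static long[] parse(String rangeHeader, long fileSize) {
    String spec = rangeHeader.substring("bytes=".length());
    String[] parts = spec.split("-", -1);
    long start = Long.parseLong(parts[0]);
    // An open-ended range like "bytes=90-" extends to the last byte of the object.
    long end = parts[1].isEmpty() ? fileSize - 1 : Long.parseLong(parts[1]);
    // Inclusive range: "bytes=10-20" covers 11 bytes, not 10.
    return new long[] {start, end - start + 1};
  }

  public static void main(String[] args) {
    long[] r = parse("bytes=10-20", 100);
    System.out.println("offset=" + r[0] + " length=" + r[1]);
  }
}
```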



			pr-link: Alluxio#18484
			change-id: cid-b5fd9832a9900fba1105bb494a96f315b20f507d
…ressWorkerBench

### What changes are proposed in this pull request?

Change to use the long type to store file sizes.

### Why are the changes needed?

Without this PR, we cannot specify a file size larger than Integer.MAX_VALUE.

### Does this PR introduce any user facing changes?

No

			pr-link: Alluxio#18492
			change-id: cid-2b816d1f2cbc9ebcf888b06802eb682fb76d55c2
### What changes are proposed in this pull request?
Add an etcd membership manager that includes only active workers.
The Python client then only needs to look at the active worker list.
### Why are the changes needed?
We only need the active worker list in large deployments. If a worker is down, we don't care; we just reshard a small portion of the data to other workers.

### Does this PR introduce any user facing changes?

na

			pr-link: Alluxio#18495
			change-id: cid-70ec6f27539f5f47b99be1ce3ff85cb9c117c3bf
### What changes are proposed in this pull request?

This PR enables registering the worker's HTTP server port in etcd. This helps the Python client find the worker's RESTful APIs.

### Why are the changes needed?

The Alluxio Python client (e.g. in ML use cases) needs to connect to the worker's REST APIs, but as the HTTP server port isn't included in the worker's information in etcd, the client fails to find the API endpoint.

### Does this PR introduce any user facing changes?

No.

			pr-link: Alluxio#18499
			change-id: cid-1cf7e0bdc7cc0c9702949bc313de5583d9cc2fb8
### What changes are proposed in this pull request?

Fix DoraLoadCommandIntegrationTest.

### Why are the changes needed?

Without this PR, DoraLoadCommandIntegrationTest may fail with the following exception

```
[ERROR] alluxio.client.cli.fs.command.DoraLoadCommandIntegrationTest.testCommand  Time elapsed: 7.995 s  <<< FAILURE!
java.lang.AssertionError
        at org.junit.Assert.fail(Assert.java:87)
        at org.junit.Assert.assertTrue(Assert.java:42)
        at org.junit.Assert.assertTrue(Assert.java:53)
        at alluxio.client.cli.fs.command.DoraLoadCommandIntegrationTest.testCommand(DoraLoadCommandIntegrationTest.java:107)
```

The cause is that
- The second job is using the same path as the first job
- While submitting the second job, the first job is still in the "cleaning" state, as the following log shows

    ```
    2024-01-31 09:52:33,057 [master-rpc-executor-TPE-thread-494] WARN  scheduler.Scheduler (Scheduler.java:submitJob) - There's concurrent submit while job is still in cleaning state
    ```

- "progress" returns the progress of the first job instead of the second one

The PR changes to use a different path for the second job, which avoids this issue.

### Does this PR introduce any user facing changes?

NO

			pr-link: Alluxio#18504
			change-id: cid-331ba5508e86e8161006073d452ab1ba6230473a
Fix the bug that the get page RESTful API doesn't support nullable offset and length.
			pr-link: Alluxio#18506
			change-id: cid-85eab5152e501b97bc9b4678e92b0d8e665a95ce
### What changes are proposed in this pull request?

2000 is too much

### Why are the changes needed?

2000 is too much

### Does this PR introduce any user facing changes?

na

			pr-link: Alluxio#18516
			change-id: cid-98762f4a176f30b7a83399183aef6f11d5113132
### What changes are proposed in this pull request?

- Throw a PageCorruptedException when the length of a page is inconsistent with the metadata.
- Make a best effort to delete the corrupted page file.
- Reset the buffer offset when corrupted data is found, to avoid an ArrayIndexOutOfBoundsException.

### Why are the changes needed?
We found that the cache kept making Presto fail when some of the page files were corrupted.

### Does this PR introduce any user facing changes?

No

			pr-link: Alluxio#18498
			change-id: cid-9012c5432e2b979f8242f3b247deed94b501d194
### What changes are proposed in this pull request?
Fix runtime stats when cache hit

### Why are the changes needed?

We already count the runtime stats in the local cache manager, but we didn't pass in a proper cache context.

### Does this PR introduce any user facing changes?


			pr-link: Alluxio#18503
			change-id: cid-233916f6b410432f8300f10ac8a2385afe969f2c
Fix Unrecognized error


			pr-link: Alluxio#18527
			change-id: cid-9753dc51317f4ebf4ba50cd5463155e821191ac3
@alluxio-bot
Contributor

Thank you for your pull request.
In order for us to evaluate and accept your PR, we ask that you sign a contribution license agreement (CLA).
It's all electronic and will take just a few minutes. Please download CLA form here, sign, and e-mail back to [email protected]
