Commit 3831943

Cloudstore: cloudstore-1.1 plus etags
* new `etag` command: consult the docs
* new version: cloudstore-1.1
1 parent fd9516a commit 3831943

23 files changed

Lines changed: 395 additions & 70 deletions

BUILDING.md

Lines changed: 27 additions & 4 deletions
@@ -14,7 +14,9 @@
1414

1515
# Building
1616

17-
With maven, with profiles for AWS java v1 and v2 SDK.
17+
## Compiling
18+
19+
With maven
1820

1921
To build a production release
2022
1. Use java8
@@ -24,6 +26,26 @@ Joint build
2426
```bash
2527
mvn clean install
2628
```
29+
## Updating cloudstore release versions
30+
31+
For a long time the version was fixed at 1.0 so that curl and other tools could retrieve it.
32+
This was convenient for some use cases, but has led to a condition where the JAR downloaded
33+
for support calls was never updated, even after new releases were made, because without
34+
a version change this wasn't apparent.
35+
36+
Therefore release number increments are required for anything other than a rapid-iteration multiple-releases-in-a-day workflow.
37+
38+
Update the version, for example from 1.0 to 1.1:
39+
```bash
40+
mvn versions:set -DnewVersion=1.1
41+
```
42+
43+
Search and replace all uses of `cloudstore-1.0.jar` with the new version of the artifact.
44+
45+
*Note:* the `-SNAPSHOT` suffix is not currently used. Downstream build tools treat that suffix
46+
as a signal that the artifact should be refreshed nightly.
47+
This artifact is not currently intended for such use.
48+
2749

2850
## Releasing
2951

@@ -38,12 +60,13 @@ mvn clean install
3860
set -gx now (date '+%Y-%m-%d-%H.%M'); echo [$now]
3961
git add .; git status
4062
git commit -S --allow-empty -m "release $now"; git push
41-
gh release create tag-release-$now -t release-$now -n "release of $now" -d target/cloudstore-1.0.jar
63+
gh release create tag-release-$now -t release-$now -n "release of $now" -d target/cloudstore-1.1.jar
4264
# then go to the web ui to review and finalize the release
4365
```
4466

4567
* If a new release is made the same day, remember to create a new tag.
46-
* The version `cloudstore-1.0.jar` is always used, not just from laziness but because it allows
47-
for bash scripts to always be able to fetch the latest version through curl then execute it.
68+
* If you have an env var pointing to the cloudstore JAR, update it!
69+
70+
4871

4972

README.md

Lines changed: 12 additions & 5 deletions
@@ -187,6 +187,13 @@ This is harmless; it comes from the SDK thread pool being closed while
187187
a list page prefetch is in progress.
188188

189189

190+
## Command `etag`
191+
192+
Prints the etag of an object, when implemented by the filesystem
193+
and returned by the object store.
194+
195+
See [etag](src/main/site/etag.md)
196+
190197
## Command `fetchdt`
191198

192199
This is an extension of `hdfs fetchdt` which collects delegation tokens
@@ -204,7 +211,7 @@ Also prints the time to execute each operation (including instantiating the stor
204211
and with the `-verbose` option, the store statistics.
205212

206213
```
207-
hadoop jar cloudstore-1.0.jar \
214+
hadoop jar cloudstore-1.1.jar \
208215
filestatus \
209216
s3a://guarded-table/example
210217
@@ -226,7 +233,7 @@ it does this with some better diagnostics of parsing problems.
226233
warning: at -verbose, this prints your private key
227234

228235
```
229-
hadoop jar cloudstore-1.0.jar gcscreds gs://bucket/
236+
hadoop jar cloudstore-1.1.jar gcscreds gs://bucket/
230237
231238
key uses \n for separator -gs connector must convert to line endings
232239
2022-01-19 17:55:51,016 [main] INFO gs.PemReader (PemReader.java:readNextSection(86)) - title match at line 1
@@ -252,7 +259,7 @@ Usage: list
252259
Example: list some of the AWS public landsat store.
253260

254261
```bash
255-
> bin/hadoop jar cloudstore-1.0.jar list -limit 10 s3a://landsat-pds/
262+
> bin/hadoop jar cloudstore-1.1.jar list -limit 10 s3a://landsat-pds/
256263

257264
Listing up to 10 files under s3a://landsat-pds/
258265
2019-04-05 21:32:14,523 [main] INFO tools.ListFiles (StoreDurationInfo.java:<init>(53)) - Starting: Directory list
@@ -300,7 +307,7 @@ Probes a filesystem for offering a specific named capability on the given path.
300307
Requires a version of Hadoop with the `PathCapabilities` interface, which includes Hadoop 3.3 onwards.
301308

302309
```bash
303-
bin/hadoop jar cloudstore-1.0.jar pathcapability
310+
bin/hadoop jar cloudstore-1.1.jar pathcapability
304311
Usage: pathcapability [options] <capability> <path>
305312
-D <key=value> Define a property
306313
-tokenfile <file> Hadoop token file to load
@@ -309,7 +316,7 @@ Usage: pathcapability [options] <capability> <path>
309316
```
310317

311318
```bash
312-
hadoop jar cloudstore-1.0.jar pathcapability fs.s3a.capability.select.sql s3a://landsat-pds/
319+
hadoop jar cloudstore-1.1.jar pathcapability fs.s3a.capability.select.sql s3a://landsat-pds/
313320

314321
Using filesystem s3a://landsat-pds
315322
Path s3a://landsat-pds/ has capability fs.s3a.capability.select.sql

pom.xml

Lines changed: 1 addition & 1 deletion
@@ -19,7 +19,7 @@
1919

2020
<groupId>com.cloudera.cloud</groupId>
2121
<artifactId>cloudstore</artifactId>
22-
<version>1.0</version>
22+
<version>1.1</version>
2323
<packaging>jar</packaging>
2424

2525
<name>cloudstore</name>

src/main/java/etag.java

Lines changed: 33 additions & 0 deletions
@@ -0,0 +1,33 @@
1+
/*
2+
* Licensed to the Apache Software Foundation (ASF) under one
3+
* or more contributor license agreements. See the NOTICE file
4+
* distributed with this work for additional information
5+
* regarding copyright ownership. The ASF licenses this file
6+
* to you under the Apache License, Version 2.0 (the
7+
* "License"); you may not use this file except in compliance
8+
* with the License. You may obtain a copy of the License at
9+
*
10+
* http://www.apache.org/licenses/LICENSE-2.0
11+
*
12+
* Unless required by applicable law or agreed to in writing, software
13+
* distributed under the License is distributed on an "AS IS" BASIS,
14+
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
15+
* See the License for the specific language governing permissions and
16+
* limitations under the License.
17+
*/
18+
19+
import org.apache.hadoop.fs.store.commands.Command;
20+
import org.apache.hadoop.fs.store.commands.EtagCommand;
21+
import org.apache.hadoop.fs.store.commands.PrintStatus;
22+
23+
public class etag extends Command {
24+
25+
public static void main(String[] args) throws Exception {
26+
EtagCommand.main(args);
27+
}
28+
29+
public static void help() {
30+
printCommand("etag", "print the etag of an object (where supported)");
31+
}
32+
33+
}

src/main/java/help.java

Lines changed: 1 addition & 0 deletions
@@ -38,6 +38,7 @@ public static void main(String[] args) {
3838
constval.help();
3939
distcpdiag.help();
4040
dux.help();
41+
etag.help();
4142
fetchdt.help();
4243
filestatus.help();
4344
jobtokens.help();

src/main/java/org/apache/hadoop/fs/store/commands/EtagCommand.java

Lines changed: 121 additions & 0 deletions
@@ -0,0 +1,121 @@
1+
/*
2+
* Licensed to the Apache Software Foundation (ASF) under one
3+
* or more contributor license agreements. See the NOTICE file
4+
* distributed with this work for additional information
5+
* regarding copyright ownership. The ASF licenses this file
6+
* to you under the Apache License, Version 2.0 (the
7+
* "License"); you may not use this file except in compliance
8+
* with the License. You may obtain a copy of the License at
9+
*
10+
* http://www.apache.org/licenses/LICENSE-2.0
11+
*
12+
* Unless required by applicable law or agreed to in writing, software
13+
* distributed under the License is distributed on an "AS IS" BASIS,
14+
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
15+
* See the License for the specific language governing permissions and
16+
* limitations under the License.
17+
*/
18+
19+
package org.apache.hadoop.fs.store.commands;
20+
21+
import java.io.FileNotFoundException;
22+
import java.util.List;
23+
24+
import org.slf4j.Logger;
25+
import org.slf4j.LoggerFactory;
26+
27+
import org.apache.hadoop.conf.Configuration;
28+
import org.apache.hadoop.fs.EtagSource;
29+
import org.apache.hadoop.fs.FileStatus;
30+
import org.apache.hadoop.fs.FileSystem;
31+
import org.apache.hadoop.fs.Path;
32+
import org.apache.hadoop.fs.store.StoreDurationInfo;
33+
import org.apache.hadoop.fs.store.StoreEntryPoint;
34+
import org.apache.hadoop.util.ExitUtil;
35+
import org.apache.hadoop.util.ToolRunner;
36+
37+
import static org.apache.hadoop.fs.store.CommonParameters.STANDARD_OPTS;
38+
import static org.apache.hadoop.fs.store.StoreExitCodes.E_NOT_FOUND;
39+
import static org.apache.hadoop.fs.store.StoreExitCodes.E_SERVICE_UNAVAILABLE;
40+
import static org.apache.hadoop.fs.store.StoreExitCodes.E_UNIMPLEMENTED;
41+
42+
/**
43+
 * Print the etag of a path.
44+
 * <p>
45+
 * Fails if the file status is not an {@code EtagSource} or if the etag is null or empty.
46+
*/
47+
public class EtagCommand extends StoreEntryPoint {
48+
49+
private static final Logger LOG = LoggerFactory.getLogger(EtagCommand.class);
50+
51+
public static final String USAGE
52+
= "Usage: etag\n"
53+
+ STANDARD_OPTS
54+
+ " <path>";
55+
56+
57+
public EtagCommand() {
58+
createCommandFormat(1, 999);
59+
}
60+
61+
@Override
62+
public int run(String[] args) throws Exception {
63+
List<String> paths = processArgs(args, 1, 1, USAGE);
64+
final Configuration conf = createPreconfiguredConfig();
65+
66+
final Path source = new Path(paths.get(0));
67+
FileSystem fs = source.getFileSystem(conf);
68+
FileStatus st = null;
69+
70+
try (StoreDurationInfo duration = new StoreDurationInfo(LOG,
71+
"get path status for %s", source)) {
72+
st = fs.getFileStatus(source);
73+
} catch (FileNotFoundException e) {
74+
throw new ExitUtil.ExitException(E_NOT_FOUND, "Not found: " + source, e);
75+
}
76+
77+
if (st instanceof EtagSource) {
78+
final String etag = ((EtagSource) st).getEtag();
79+
println("Etag of %s = %s", source, etag);
80+
if (etag == null) {
81+
errorln("File status of path %s is an EtagSource but the value is null:\n%s", source, st);
82+
throw new ExitUtil.ExitException(E_SERVICE_UNAVAILABLE, "Etag is null");
83+
}
84+
if (etag.isEmpty()) {
85+
errorln("File status of path %s is an EtagSource but the value is the empty string:\n%s",
86+
source, st);
87+
throw new ExitUtil.ExitException(E_SERVICE_UNAVAILABLE, "Etag is empty string");
88+
}
89+
90+
} else {
91+
errorln("File status of path %s is not an EtagSource:\n%s", source, st);
92+
throw new ExitUtil.ExitException(E_UNIMPLEMENTED,
93+
"Filesystem does not provide Etag information");
94+
}
95+
return 0;
96+
}
97+
98+
/**
99+
* Execute the command, return the result or throw an exception,
100+
* as appropriate.
101+
 * @param args argument varargs.
102+
* @return return code
103+
* @throws Exception failure
104+
*/
105+
public static int exec(String... args) throws Exception {
106+
return ToolRunner.run(new EtagCommand(), args);
107+
}
108+
109+
/**
110+
* Main entry point. Calls {@code System.exit()} on all execution paths.
111+
* @param args argument list
112+
*/
113+
public static void main(String[] args) {
114+
try {
115+
exit(exec(args), "");
116+
} catch (Throwable e) {
117+
exitOnThrowable(e);
118+
}
119+
}
120+
121+
}
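
The branching in `run()` above (not an `EtagSource`, null etag, empty etag, valid etag) can be sketched in isolation. This is a minimal sketch, not the committed code: `EtagSourceSketch`, `EtagCheckSketch` and the `Outcome` enum are hypothetical stand-ins introduced here so the example runs without Hadoop on the classpath. The comments note which exit code constant the commit raises for each case; their numeric values are not shown in this diff and are not assumed.

```java
/** Hypothetical stand-in for org.apache.hadoop.fs.EtagSource (assumption: one getter). */
interface EtagSourceSketch {
  String getEtag();
}

/** Hypothetical helper mirroring the decision order used in EtagCommand.run(). */
final class EtagCheckSketch {

  /** Illustrative outcomes; the commit maps the failures to StoreExitCodes constants. */
  enum Outcome {
    OK,               // etag printed, command returns 0
    NULL_ETAG,        // commit throws with E_SERVICE_UNAVAILABLE
    EMPTY_ETAG,       // commit throws with E_SERVICE_UNAVAILABLE
    NOT_ETAG_SOURCE   // commit throws with E_UNIMPLEMENTED
  }

  private EtagCheckSketch() {
  }

  /** Classify a status object the same way the command does. */
  static Outcome classify(Object status) {
    if (!(status instanceof EtagSourceSketch)) {
      return Outcome.NOT_ETAG_SOURCE;
    }
    String etag = ((EtagSourceSketch) status).getEtag();
    if (etag == null) {
      return Outcome.NULL_ETAG;
    }
    if (etag.isEmpty()) {
      return Outcome.EMPTY_ETAG;
    }
    return Outcome.OK;
  }

  public static void main(String[] args) {
    EtagSourceSketch withEtag = () -> "abc123";
    EtagSourceSketch nullEtag = () -> null;
    EtagSourceSketch emptyEtag = () -> "";
    System.out.println(classify(withEtag));      // OK
    System.out.println(classify(nullEtag));      // NULL_ETAG
    System.out.println(classify(emptyEtag));     // EMPTY_ETAG
    System.out.println(classify(new Object()));  // NOT_ETAG_SOURCE
  }
}
```

Note that the committed code prints the etag before the null and empty checks, so a null value is echoed before the command fails.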

src/main/java/org/apache/hadoop/fs/store/diag/DiagnosticsEntryPoint.java

Lines changed: 8 additions & 3 deletions
@@ -64,6 +64,8 @@ public class DiagnosticsEntryPoint extends StoreEntryPoint {
6464
/** {@value}. */
6565
public static final String JARS = "j";
6666

67+
public static final String COUNTER = "[%03d]";
68+
6769
/**
6870
* Sort the keys.
6971
* @param keys keys to sort.
@@ -208,12 +210,14 @@ public final void lookupAndPrintSanitizedValues(
208210

209211
/**
210212
* Resolve and print values.
211-
* Takes a collection off (name, obfuscate) tuples..
213+
* Takes a collection of (name, obfuscate) tuples.
214+
 * The transformation can be things like option lookup.
212215
* @param entries variables/properties.
213216
* @param section section name
214217
* @param lookup lookup function
218+
* @return count of entries.
215219
*/
216-
private void lookupAndPrint(
220+
private int lookupAndPrint(
217221
final String section,
218222
final Collection<Object[]> entries,
219223
final Function<String, String> lookup) {
@@ -232,9 +236,10 @@ private void lookupAndPrint(
232236
} else {
233237
value = "(unset)";
234238
}
235-
println("[%03d] %s = %s", ++index, var, value);
239+
println(COUNTER + " %s = %s", ++index, var, value);
236240
}
237241
}
242+
return index;
238243
}
239244

240245
/**

src/main/java/org/apache/hadoop/fs/store/diag/StoreDiagnosticsInfo.java

Lines changed: 5 additions & 4 deletions
@@ -47,7 +47,6 @@
4747
import static org.apache.hadoop.fs.store.StoreEntryPoint.DEFAULT_HIDE_ALL_SENSITIVE_CHARS;
4848
import static org.apache.hadoop.fs.store.StoreEntryPoint.getOrigins;
4949
import static org.apache.hadoop.fs.store.StoreUtils.sanitize;
50-
import static org.apache.hadoop.fs.store.diag.DiagnosticsEntryPoint.toURI;
5150
import static org.apache.hadoop.fs.store.diag.OptionSets.STANDARD_SECURITY_PROPS;
5251
import static org.apache.hadoop.fs.store.diag.OptionSets.STANDARD_SYSPROPS;
5352
import static org.apache.hadoop.fs.store.diag.StoreDiag.sortKeys;
@@ -87,7 +86,7 @@ public StoreDiagnosticsInfo(final URI fsURI, final Printout output) {
8786
/**
8887
* Bind the diagnostics to a store.
8988
* @param fsURI filesystem URI
90-
* @param output
89+
* @param output output.
9190
* @return the diagnostics info provider.
9291
*/
9392
public static StoreDiagnosticsInfo bindToStore(URI fsURI,
@@ -418,9 +417,10 @@ protected void warnOnInvalidDomain(final Printout printout,
418417
protected void printPrefixedOptions(final Printout printout,
419418
final Configuration conf,
420419
final String prefix) {
421-
printout.heading("Configuration options with prefix \"%s\" :", prefix);
422420
Map<String, String> propsWithPrefix = conf.getPropsWithPrefix(prefix);
421+
printout.heading("Configuration options with prefix \"%s\" :", prefix);
423422
if (!propsWithPrefix.isEmpty()) {
423+
int index = 0;
424424
Set<String> sorted = sortKeys(propsWithPrefix.keySet());
425425
for (String key : sorted) {
426426
final String propertyVal = propsWithPrefix.get(key);
@@ -429,7 +429,8 @@ protected void printPrefixedOptions(final Printout printout,
429429
if (propertyName.contains(".secret.") || propertyName.contains(".pass")) {
430430
value = sanitize(propertyVal, DEFAULT_HIDE_ALL_SENSITIVE_CHARS);
431431
}
432-
printout.println("%s=%s", propertyName, value);
432+
printout.println(DiagnosticsEntryPoint.COUNTER +
433+
" %s=%s", ++index, propertyName, value);
433434
}
434435
} else {
435436
printout.println("No configuration options with prefix \"%s\"", prefix);
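
The effect of the shared `COUNTER` format on the prefixed-option listing can be sketched as follows. This is an illustrative, self-contained sketch: `CounterFormatSketch` and `numbered` are hypothetical names, a `TreeMap` stands in for the project's `sortKeys` helper, and sanitization of secrets is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of numbering sorted key=value lines with a shared counter format. */
final class CounterFormatSketch {

  /** Same format string as DiagnosticsEntryPoint.COUNTER in this commit. */
  static final String COUNTER = "[%03d]";

  private CounterFormatSketch() {
  }

  /** Return one numbered line per property, in sorted key order. */
  static List<String> numbered(Map<String, String> props) {
    List<String> lines = new ArrayList<>();
    int index = 0;
    // TreeMap is an assumption standing in for sortKeys(); it orders keys lexicographically.
    for (Map.Entry<String, String> e : new TreeMap<>(props).entrySet()) {
      lines.add(String.format(Locale.ROOT, COUNTER + " %s=%s",
          ++index, e.getKey(), e.getValue()));
    }
    return lines;
  }

  public static void main(String[] args) {
    Map<String, String> props = new TreeMap<>();
    props.put("fs.s3a.b", "2");
    props.put("fs.s3a.a", "1");
    // prints "[001] fs.s3a.a=1" then "[002] fs.s3a.b=2"
    numbered(props).forEach(System.out::println);
  }
}
```

Zero-padding to three digits keeps the entries aligned for up to 999 options, and the final index doubles as the entry count that `lookupAndPrint` now returns.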

src/main/site/bandwidth.md

Lines changed: 3 additions & 3 deletions
@@ -19,7 +19,7 @@
1919
Measure upload/download bandwidth; support different read policies, and optionally save the output to a CSV file.
2020

2121
```
22-
> hadoop jar cloudstore-1.0.jar bandwidth
22+
> hadoop jar cloudstore-1.1.jar bandwidth
2323
2424
Usage: bandwidth [options] size <path>
2525
-D <key=value> Define a property
@@ -42,7 +42,7 @@ Upload 128M of data to s3 with a block size of 8 megabytes; use `-verbose` outpu
4242
statistics. Save the summary to a CSV file for review.
4343

4444
```
45-
> hadoop jar cloudstore-1.0.jar bandwidth -csv tmp/s3a128m.csv -block 8 -verbose -policy whole-file 128m s3a://stevel-london/tmp
45+
> hadoop jar cloudstore-1.1.jar bandwidth -csv tmp/s3a128m.csv -block 8 -verbose -policy whole-file 128m s3a://stevel-london/tmp
4646
4747
Bandwidth test against s3a://stevel-london/tmp with data size 128m
4848
==================================================================
@@ -285,7 +285,7 @@ Here's the experiment repeated with prefetching enabled and changes to block upl
285285

286286
```
287287
```bash
288-
bin/hadoop jar cloudstore-1.0.jar bandwidth -D fs.s3a.prefetch.enabled=true -csv tmp/s3a128mp.csv -block 8 -verbose -policy whole-file 128m s3a://stevel-london
288+
bin/hadoop jar cloudstore-1.1.jar bandwidth -D fs.s3a.prefetch.enabled=true -csv tmp/s3a128mp.csv -block 8 -verbose -policy whole-file 128m s3a://stevel-london
289289

290290
Upload Summary
291291
==============
