|
1 | | -[](https://travis-ci.org/spatialcurrent/railgun) [](https://godoc.org/github.com/spatialcurrent/railgun) |
| 1 | +[](https://travis-ci.org/spatialcurrent/railgun) [](https://goreportcard.com/report/spatialcurrent/railgun) [](https://godoc.org/github.com/spatialcurrent/railgun) [](https://github.com/spatialcurrent/railgun/blob/master/LICENSE) |
2 | 2 |
|
3 | 3 | # Railgun |
4 | 4 |
|
5 | 5 | # Description |
6 | 6 |
|
7 | | -**Railgun** is a simple and fast data processing tool. **Railgun** uses [go-simple-serializer](https://github.com/spatialcurrent/go-simple-serializer) (GSS) for reading/writing objects to standard formats. **Railgun** uses [go-dfl](https://github.com/spatialcurrent/go-dfl) for filtering. |
| 7 | +**Railgun** is a simple and fast data processing tool. **Railgun** uses: |
| 8 | +- [go-reader](https://github.com/spatialcurrent/go-reader) for opening and reading from URIs, |
| 9 | +- [go-simple-serializer](https://github.com/spatialcurrent/go-simple-serializer) (GSS) for reading/writing objects to standard formats, and |
| 10 | +- [go-dfl](https://github.com/spatialcurrent/go-dfl) for filtering and transforming data. |
8 | 11 |
|
9 | | -GSS supports `bson`, `csv`, `tsv`, `hcl`, `hcl2`, `json`, `jsonl`, `properties`, `toml`, `yaml`. `hcl` and `hcl2` implementation is fragile and very much in `alpha`. |
| 12 | +go-reader can read from `stdin`, `http/https`, the local filesystem, [AWS S3](https://aws.amazon.com/s3/), and [HDFS](https://hortonworks.com/apache/hdfs/). |
| 13 | + |
| 14 | +go-simple-serializer (GSS) supports `bson`, `csv`, `tsv`, `hcl`, `hcl2`, `json`, `jsonl`, `properties`, `toml`, and `yaml`. The `hcl` and `hcl2` implementations are fragile and very much in `alpha`. |
10 | 15 |
|
11 | 16 | # Usage |
12 | 17 |
|
13 | 18 | **CLI** |
14 | 19 |
|
15 | | -You can use the command line tool to convert between formats. |
| 20 | +You can use the command line tool to process data. |
16 | 21 |
|
17 | 22 | ``` |
18 | | -Usage: railgun -input_format INPUT_FORMAT -o OUTPUT_FORMAT [-input_uri INPUT_URI] [-input_compression [bzip2|gzip|snappy]] [-h HEADER] [-c COMMENT] [-object_path PATH] [-f FILTER] [-output_path OUTPUT_PATH] [-max MAX_COUNT] |
| 23 | +Usage: railgun -input_format INPUT_FORMAT -o OUTPUT_FORMAT [-input_uri INPUT_URI] [-input_compression [bzip2|gzip|snappy]] [-h HEADER] [-c COMMENT] [-object_path PATH] [-dfl_exp DFL_EXPRESSION] [-dfl_file DFL_FILE] [-output_path OUTPUT_PATH] [-max MAX_COUNT] |
19 | 24 | Options: |
| 25 | + -aws_access_key_id string |
| 26 | + Defaults to value of environment variable AWS_ACCESS_KEY_ID |
| 27 | + -aws_default_region string |
| 28 | + Defaults to value of environment variable AWS_DEFAULT_REGION. |
| 29 | + -aws_secret_access_key string |
| 30 | + Defaults to value of environment variable AWS_SECRET_ACCESS_KEY. |
| 31 | + -aws_session_token string |
| 32 | + Defaults to value of environment variable AWS_SESSION_TOKEN. |
20 | 33 | -c string |
21 | 34 | The input comment character, e.g., #. Commented lines are not sent to output. |
22 | | - -f string |
23 | | - The output filter |
| 35 | + -dfl_exp string |
| 36 | + Process using dfl expression |
| 37 | + -dfl_file string |
| 38 | + Process using dfl file. |
24 | 39 | -h string |
25 | 40 | The input header if the stdin input has no header. |
| 41 | + -hdfs_name_node string |
| 42 | + Defaults to value of environment variable HDFS_DEFAULT_NAME_NODE. |
26 | 43 | -help |
27 | 44 | Print help. |
28 | 45 | -input_compression string |
29 | | - The input compression: none, gzip, snappy (default "none") |
| 46 | + The input compression: none, bzip2, gzip, snappy (default "none") |
30 | 47 | -input_format string |
31 | | - The input format: csv, tsv, hcl, hcl2, json, jsonl, properties, toml, yaml |
| 48 | + The input format: bson, csv, tsv, hcl, hcl2, json, jsonl, properties, toml, yaml |
| 49 | + -input_reader_buffer_size int |
| 50 | + The input reader buffer size (default 4096) |
32 | 51 | -input_uri string |
33 | 52 | The input uri (default "stdin") |
34 | 53 | -max int |
35 | 54 | The maximum number of objects to output (default -1) |
36 | | - -o string |
37 | | - The output format: csv, tsv, hcl, hcl2, json, jsonl, properties, toml, yaml |
38 | | - -object_path string |
39 | | - The output path |
40 | | - -output_path string |
41 | | - The output path |
| 55 | + -output_format string |
| 56 | + The output format: bson, csv, tsv, hcl, hcl2, json, jsonl, properties, toml, yaml |
| 57 | + -output_uri string |
| 58 | + The output uri (default "stdout") |
42 | 59 | -version |
43 | 60 | Prints version to stdout. |
44 | 61 | ``` |
45 | 62 |
|
46 | | -**Go** |
| 63 | +# Releases |
47 | 64 |
|
48 | | -You can import **railgun** as a library with: |
| 65 | +**Railgun** is currently in **alpha**. See releases at https://github.com/spatialcurrent/railgun/releases. |
49 | 66 |
|
50 | | -```go |
51 | | -import ( |
52 | | - "github.com/spatialcurrent/go-railgun/railgun" |
53 | | -) |
54 | | -``` |
| 67 | +# Examples |
55 | 68 |
|
56 | | -The `Process` function is the core functions to use. |
| 69 | +**Search for Cuisine** |
57 | 70 |
|
58 | | -```go |
59 | | -... |
60 | | - output_object, err := railgun.Process(input_object, object_path, filter, funcs, max_count, output_path) |
61 | | -... |
62 | | - output_string, err := gss.Serialize(output_object, output_format) |
63 | | -... |
| 71 | +``` |
| 72 | +~/go/src/github.com/spatialcurrent/go-osm/bin/osm_linux_amd64 -input_uri 'http://download.geofabrik.de/north-america/us/district-of-columbia-latest.osm.bz2' -ways_to_nodes -output_format geojsonl -filter_keys_keep amenity -output_uri stdout | railgun -input_format jsonl -output_format json -dfl_file ~/go/src/github.com/spatialcurrent/railgun/examples/mexican.dfl -output_uri mexican.json |
64 | 73 | ``` |
65 | 74 |
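As a rough illustration of what `-input_format jsonl -output_format json` implies in the command above, the sketch below turns JSON Lines text into a single JSON array in plain JavaScript. It is not railgun's implementation: `jsonlToArray` is a hypothetical helper and the sample records are made up. The filtering done by `mexican.dfl` is not reproduced here.

```javascript
// Hypothetical helper, not part of railgun: parse JSON Lines into an array.
function jsonlToArray(text) {
  return text
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0) // skip blank lines
    .map((line) => JSON.parse(line)); // one JSON object per line
}

// Made-up sample input: two objects, one per line.
const input = '{"amenity":"restaurant","cuisine":"mexican"}\n{"amenity":"cafe"}\n';
const objects = jsonlToArray(input);

// Serialize the whole array as a single JSON document.
console.log(JSON.stringify(objects));
```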
|
66 | | -# Releases |
67 | | - |
68 | | -**Railgun** is currently in **alpha**. See releases at https://github.com/spatialcurrent/railgun/releases. |
69 | | - |
70 | | -# Examples |
| 75 | +**Tsunami Feed** |
71 | 76 |
|
72 | | -TBD |
| 77 | +``` |
| 78 | +const pipeline = ["filter(@features, '(@properties?.tsunami != null) and (@properties.tsunami == 1)')", "sort(@, '@properties?.mag', true)", "map(@, '@properties?.place ?: \"\"')", "limit(@, 10)"]; |
| 79 | +(await fetch("https://earthquake.usgs.gov/earthquakes/feed/v1.0/summary/2.5_month.geojson")).json().then(earthquakes => { |
| 80 | + result = railgun.process(earthquakes, {"dfl": pipeline, "output_format": "yaml"}); |
| 81 | + console.log(result); |
| 82 | +}) |
| 83 | +``` |
73 | 84 |
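To make the four pipeline steps above easier to follow, here is a minimal plain-JavaScript sketch of the same filter, sort, map, and limit sequence. It does not call railgun or evaluate DFL. `topTsunamiPlaces` is a hypothetical name, the sample data is made up, and the sketch assumes that the `true` passed to `sort` requests descending order.

```javascript
// Plain-JavaScript sketch of the four DFL steps; this is not railgun's API.
// Assumption: the `true` argument to sort() in the pipeline means descending.
function topTsunamiPlaces(collection, limit = 10) {
  // Treat a missing magnitude as the lowest possible value.
  const mag = (f) => f.properties?.mag ?? -Infinity;
  return collection.features
    // filter: keep features whose tsunami flag is 1
    .filter((f) => f.properties?.tsunami === 1)
    // sort: strongest magnitude first
    .sort((a, b) => mag(b) - mag(a))
    // map: keep only the place name, defaulting to an empty string
    .map((f) => f.properties?.place ?? "")
    // limit: keep at most `limit` entries
    .slice(0, limit);
}

// Made-up sample shaped like a GeoJSON FeatureCollection.
const sample = {
  features: [
    { properties: { mag: 5.1, place: "A", tsunami: 0 } },
    { properties: { mag: 6.3, place: "B", tsunami: 1 } },
    { properties: { mag: 7.0, place: "C", tsunami: 1 } },
    { properties: { mag: 4.2, tsunami: 1 } },
  ],
};

console.log(topTsunamiPlaces(sample));
```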
|
74 | 85 | # Building |
75 | 86 |
|
76 | 87 | **CLI** |
77 | 88 |
|
78 | 89 | The `build_cli.sh` script is used to build executables for Linux and Windows. |
79 | 90 |
|
| 91 | +**JavaScript** |
| 92 | + |
| 93 | +You can compile GSS to pure JavaScript with the `scripts/build_javascript.sh` script. |
| 94 | + |
| 95 | +**Changing Destination** |
| 96 | + |
| 97 | +The default destination for build artifacts is `railgun/bin`, but you can change the destination with a CLI argument. When building on a Chromebook, consider saving the artifacts in `/usr/local/go/bin`, e.g., `bash scripts/build_cli.sh /usr/local/go/bin`. |
| 98 | + |
80 | 99 | # Contributing |
81 | 100 |
|
82 | 101 | [Spatial Current, Inc.](https://spatialcurrent.io) is currently accepting pull requests for this repository. We'd love to have your contributions! Please see [Contributing.md](https://github.com/spatialcurrent/railgun/blob/master/CONTRIBUTING.md) for how to get started. |
|