|
| 1 | +GOBBLIN 0.9.0 |
| 2 | +------------- |
| 3 | + |
| 4 | +### Created Date: 12/13/2016 |
| 5 | + |
| 6 | +## Highlights |
| 7 | + |
| 8 | +* Refactored project structure in Gobblin. If not importing dependencies transitively, you may need to import "gobblin-core-base". |
| 9 | +* New sources: Google Analytics / Drive (PR 1301), Google Webmaster (PR 1422), Oracle (PR 1304). |
| 10 | +* New writers: Teradata (http://gobblin.readthedocs.io/en/latest/user-guide/Gobblin-JDBC-Writer/), object store (PR 1348). |
| 11 | +* Retention job is more generic, allowing arbitrary actions on dataset versions (https://gobblin.readthedocs.io/en/latest/data-management/Gobblin-Retention). |
| 12 | +* Docker integration (https://gobblin.readthedocs.io/en/latest/user-guide/Docker-Integration). |
| 13 | +* Gobblin jobs can be run embedded into other applications (http://gobblin.readthedocs.io/en/latest/user-guide/Gobblin-as-a-Library/). |
| 14 | +* Gobblin jobs can be run from CLI with full support for templates, plugins, etc. (http://gobblin.readthedocs.io/en/latest/user-guide/Gobblin-CLI/) |
| 15 | +* Topology based data replication: users can specify a topology for their data copy in config store, Gobblin Distcp will handle replication. (PR 1278, PR 1306, PR 1328, PR 1405) |
| 16 | +* Prioritization of work units when there is more work than can be run in a single job (PR 1283). |
| 17 | +* Enabled speculative execution in MR mode (PR 1347). |
| 18 | + |
| 19 | +## NEW FEATURES |
| 20 | + |
| 21 | +* [Writers] [PR 1181] Teradata Writer implemented. |
| 22 | +* [Converters] [PR 1246] Added some new core converters: schema injector, avro to json string, json to string, string to bytes. |
| 23 | +* [Testing] [PR 1247] Added end-to-end testing framework for Gobblin job execution. |
| 24 | +* [Job Execution] [PR 1248] [PR 1249] Added Quartz scheduler for new Gobblin launch model. |
| 25 | +* [Core] [PR 1278] Added dataset finder using Gobblin config library. |
| 26 | +* [Retention] [PR 1279] Retention job can now apply other arbitrary actions to datasets (for example change ACL). |
| 27 | +* [Core] [PR 1280] Added a converter for parsing GoldenGate messages. |
| 28 | +* [Core] [PR 1283] Added utilities to prioritize work when there are more work units available than can be run in a single job. |
| 29 | +* [Sources] [PR 1301] Added Google Analytics and Google Drive sources. |
| 30 | +* [Sources] [PR 1304] Added Oracle extractor. |
| 31 | +* [Core] [PR 1305] Added a schema based partitioner. |
| 32 | +* [Deploy] [PR 1308] Docker integration. |
| 33 | +* [Core] [PR 1313] [PR 1331] Gobblin in embedded mode. |
| 34 | +* [Core] [PR 1333] Support for plugins in Gobblin instances. |
| 35 | +* [Core] [PR 1337] Kerberos login plugin implemented. |
| 36 | +* [Core] [PR 1340] New Gobblin cli capable of using templates, plugins, etc. |
| 37 | +* [Core] [PR 1347] Support speculative execution in MR mode. |
| 38 | +* [Writers] [PR 1348] Object store writer. |
| 39 | +* [Compaction] [PR 1354] Delta support in Gobblin compaction. |
| 40 | +* [Core] [PR 1440] Added email notification plugin. |
| 41 | +* [Sources] [PR 1422] Google webmaster source |
| 42 | + |
| 43 | +## IMPROVEMENTS |
| 44 | + |
| 45 | +* [Templating] [PR 1228] Templates read *.conf files as `Config` objects, allowing for better interpolation of configurations. |
| 46 | +* [Core] [PR 1246] Wikipedia source changed to actually use state store. |
| 47 | +* [Core] [PR 1246] Robustness improvements to `JobScheduler`; previously it silently failed on certain exceptions. |
| 48 | +* [Core] [PR 1339] Gobblin can gracefully skip work units. |
| 49 | +* [Build] [PR 1417] Refactoring of Kafka dependent classes into separate modules for improved dependency management. |
| 50 | +* [Build] [PR 1424] Refactoring of Gobblin core module for improved dependency management. |
| 51 | +* Improved documentation for various features. |
| 52 | +* Fixed many intermittently failing unit tests (special thanks to htran1). |
| 53 | +* Various bug fixes. |
| 54 | + |
| 55 | +## EXTERNAL CONTRIBUTIONS |
| 56 | +We would like to thank all our external contributors for helping improve Gobblin. |
| 57 | + |
| 58 | +* lbending |
| 59 | + - Teradata writer (PR 1181) |
| 60 | + - Oracle extractor (PR 1304) |
| 61 | + |
| 62 | +* jsavolainen |
| 63 | + - Bug fixes in job configuration loading (PR 1259) |
| 64 | + |
| 65 | +* klyr |
| 66 | + - Update lib versions for AWS (PR 1368) |
| 67 | + |
| 68 | +* enjoyear |
| 69 | + - Google webmaster source |
| 70 | + |
1 | 71 | GOBBLIN 0.8.0
|
2 | 72 | -------------
|
3 | 73 |
|
@@ -157,19 +227,19 @@ GOBBLIN 0.8.0
|
157 | 227 |
|
158 | 228 | We would like to thank all our external contributors for helping improve Gobblin.
|
159 | 229 |
|
160 |
| -* singhd10: |
| 230 | +* singhd10: |
161 | 231 | -Add metadata after completion of job to a specific metadata directory (PR 980)
|
162 | 232 | * shelocks:
|
163 | 233 | -Fixing SOURCE_QUERYBASED_LOW_WATERMARK_BACKUP_SECS no default value (PR 1005)
|
164 |
| -* lbendig,Lorand Bendig: |
| 234 | +* lbendig,Lorand Bendig: |
165 | 235 | -Document changes in PR#952 (PR 1012)
|
166 | 236 | -Make topic suffix configurable for lookup in Confluent Schema Registry (PR 1210)
|
167 |
| -* jinhyukchang, Jinhyuk Chang: |
| 237 | +* jinhyukchang, Jinhyuk Chang: |
168 | 238 | -JDBCWriter. Bug fix on SQL statements. Bug fix on data type mapping. (PR 1050)
|
169 | 239 | -HttpWriter including SalesForceRestWriter, ThrottleWriter, etc (PR 1186)
|
170 | 240 | * ypopov, Eugene Popov:
|
171 | 241 | -Teradata JDBC Extractor and Source (PR 1090)
|
172 |
| -* pldash |
| 242 | +* pldash |
173 | 243 | -Added JsonConverter to parse Json files to a format such that JsonIntermediateToAvro converter can parse (PR 1092)
|
174 | 244 |
|
175 | 245 | GOBBLIN 0.7.0
|
@@ -434,7 +504,7 @@ NEW FEATURES
|
434 | 504 | * [Runtime] Added CliLocalJobLauncher for launching single jobs from the command line.
|
435 | 505 | * [Converters] Added AvroSchemaFieldRemover that can remove specific fields from a (possibly recursive) Avro schema.
|
436 | 506 | * [DQ] Added new row-level policies RecordTimestampLowerBoundPolicy and AvroRecordTimestampLowerBoundPolicy for checking if a record timestamp is too far in the past.
|
437 |
| -* [Kafka] Added schema registry API to KafkaAvroExtractor which enables supports for various Kafka schema registry implementations (e.g. Confluent's schema registry). |
| 507 | +* [Kafka] Added schema registry API to KafkaAvroExtractor which enables supports for various Kafka schema registry implementations (e.g. Confluent's schema registry). |
438 | 508 | * [Build/Release] Added build instrumentation to publish artifacts to Maven Central
|
439 | 509 |
|
440 | 510 | BUG FIXES
|
@@ -467,20 +537,20 @@ EXTERNAL CONTRIBUTIONS
|
467 | 537 |
|
468 | 538 | We would like to thank all our external contributors for helping improve Gobblin.
|
469 | 539 |
|
470 |
| -* kadaan, joel.baranick: |
| 540 | +* kadaan, joel.baranick: |
471 | 541 | - Separate publisher filesystem from writer filesystem
|
472 | 542 | - Support for generating Idea projects with the correct language level (Java 7)
|
473 | 543 | - Fixed yarn conf path in gobblin-yarn.sh
|
474 |
| -* mwol(Maurice Wolter) |
| 544 | +* mwol(Maurice Wolter) |
475 | 545 | - Implemented new class AvroCombineFileSplit which stores the avro schema for each split, determined by the corresponding input file.
|
476 | 546 | * cheleb(NOUGUIER Olivier)
|
477 | 547 | - Add support for maven install
|
478 |
| -* dvenkateshappa |
| 548 | +* dvenkateshappa |
479 | 549 | - bugfix to RestApiExtractor.java
|
480 | 550 | - Added an excluding column list, which can be used for Salesforce configurations with a huge list of columns.
|
481 |
| -* klyr (Julien Barbot) |
| 551 | +* klyr (Julien Barbot) |
482 | 552 | - bugfix to gobblin-mapreduce.sh
|
483 |
| -* gheo21 |
| 553 | +* gheo21 |
484 | 554 | - Bumped kafka dependency to 2.11
|
485 | 555 | * ahollenbach (Andrew Hollenbach)
|
486 | 556 | - configuration improvements for standalone mode
|
|