|
| 1 | + |
| 2 | +GOBBLIN 0.6.0 |
| 3 | +-------------- |
| 4 | + |
| 5 | +NEW FEATURES |
| 6 | + |
| 7 | +* [Compaction] Added M/R compaction/de-duping for hourly data |
| 8 | +* [Compaction] Added late data handling for hourly and daily M/R compaction: https://github.com/linkedin/gobblin/wiki/Compaction#handling-late-records; added support for triggering M/R compaction if late data exceeds a threshold |
| 9 | +* [I/O] Added support for using Hive SerDes through HiveWritableHdfsDataWriter |
| 10 | +* [I/O] Added the concept of data partitioning to writers: https://github.com/linkedin/gobblin/wiki/Partitioned-Writers |
| 11 | +* [Runtime] Added CliLocalJobLauncher for launching single jobs from the command line. |
| 12 | +* [Converters] Added AvroSchemaFieldRemover that can remove specific fields from a (possibly recursive) Avro schema. |
| 13 | +* [DQ] Added new row-level policies RecordTimestampLowerBoundPolicy and AvroRecordTimestampLowerBoundPolicy for checking if a record timestamp is too far in the past. |
| 14 | +* [Kafka] Added a schema registry API to KafkaAvroExtractor, which enables support for various Kafka schema registry implementations (e.g. Confluent's schema registry). |
| 15 | +* [Build/Release] Added build instrumentation to publish artifacts to Maven Central |
| 16 | + |
| 17 | +BUG FIXES |
| 18 | + |
| 19 | +* [Retention management] Trash now correctly handles deletion of files that already exist in trash. |
| 20 | +* [Kafka] Fixed an issue that could cause the Kafka adapter to miss data if the fork fails. |
| 21 | + |
| 22 | +OTHER IMPROVEMENTS |
| 23 | + |
| 24 | +* [Runtime] Added metrics for job executions |
| 25 | +* [Metrics] Added a root metric context to keep track of GC of metrics and metric contexts and make sure those are properly reported |
| 26 | +* [Compaction] Improved topic isolation in MRCompactor |
| 27 | +* [Build/release] Java version compatibility raised to Java 7. |
| 28 | +* [Runtime] Deprecated COMMIT_ON_PARTIAL_SUCCESS and added a new policy for successful extracts |
| 29 | +* [Retention management] Async trash implementation for parallel deletions. |
| 30 | +* [Metrics] Added tracking events emission when data gets published |
| 31 | +* [Retention management] Added support for parallel execution to the dataset cleaner |
| 32 | +* [Runtime] Job execution info in the execution history store is now updated upon every task completion |
| 33 | + |
| 34 | +INCUBATION |
| 35 | + |
| 36 | +Note: these are new features which are under active development and may be subject to significant changes. |
| 37 | + |
| 38 | +* [gobblin-ce] Added support for Gobblin Continuous Execution on Yarn |
| 39 | +* [distcp-ng] Started work on bulk transfer (file copies) using Gobblin |
| 40 | +* [distcp-ng] Added a light-weight Hadoop FileSystem implementation for file transfer from SFTP |
| 41 | +* [gobblin-config] Added API for dataset driven |
| 42 | + |
| 43 | +EXTERNAL CONTRIBUTIONS |
| 44 | + |
| 45 | +We would like to thank all our external contributors for helping improve Gobblin. |
| 46 | + |
| 47 | +* kadaan, joel.baranick: |
| 48 | + - Separate publisher filesystem from writer filesystem |
| 49 | + - Support for generating Idea projects with the correct language level (Java 7) |
| 50 | + - Fixed yarn conf path in gobblin-yarn.sh |
| 51 | +* mwol (Maurice Wolter) |
| 52 | + - Implemented new class AvroCombineFileSplit, which stores the Avro schema for each split, determined by the corresponding input file. |
| 53 | +* cheleb (NOUGUIER Olivier) |
| 54 | + - Add support for maven install |
| 55 | +* dvenkateshappa |
| 56 | + - Bugfix to RestApiExtractor.java |
| 57 | + - Added an excluded-column list, which can be used for Salesforce configurations with a huge list of columns. |
| 58 | +* klyr (Julien Barbot) |
| 59 | + - bugfix to gobblin-mapreduce.sh |
| 60 | +* gheo21 |
| 61 | + - Bumped Kafka dependency to 2.11 |
| 62 | +* ahollenbach (Andrew Hollenbach) |
| 63 | + - configuration improvements for standalone mode |
| 64 | +* lbendig (Lorand Bendig) |
| 65 | + - fixed a bug in DatasetState creation |