|
1 | 1 | # Changelog |
2 | 2 |
|
3 | | -## [Unreleased](https://github.com/MarquezProject/marquez/compare/0.26.0...HEAD) |
4 | | - |
5 | | -### Added |
6 | | -* Implemented dataset symlink feature which allows providing multiple names for a dataset and adds edges to lineage graph based on symlinks [`#2066`](https://github.com/MarquezProject/marquez/pull/2066) [@pawel-big-lebowski](https://github.com/pawel-big-lebowski) |
7 | | -* Store column lineage facets in separate table [`#2096`](https://github.com/MarquezProject/marquez/pull/2096) [@mzareba382](https://github.com/mzareba382) [@pawel-big-lebowski](https://github.com/pawel-big-lebowski) |
8 | | -* Lineage graph endpoint for column lineage [`#2124`](https://github.com/MarquezProject/marquez/pull/2124) [@pawel-big-lebowski](https://github.com/pawel-big-lebowski) |
9 | | -* Enrich returned dataset resource with column lineage information [`#2113`](https://github.com/MarquezProject/marquez/pull/2113) [@pawel-big-lebowski](https://github.com/pawel-big-lebowski) |
10 | | -* Downstream column lineage [`#2159`](https://github.com/MarquezProject/marquez/pull/2159) [@pawel-big-lebowski](https://github.com/pawel-big-lebowski) |
11 | | -* Column lineage within Marquez Java client [`#2163`](https://github.com/MarquezProject/marquez/pull/2163) [@pawel-big-lebowski](https://github.com/pawel-big-lebowski) |
12 | | -* Endpoint to get column lineage by a job [`#2204`](https://github.com/MarquezProject/marquez/pull/2204) [@pawel-big-lebowski](https://github.com/pawel-big-lebowski) |
13 | | -* Python client for column lineage [`#2209`](https://github.com/MarquezProject/marquez/pull/2209) [@pawel-big-lebowski](https://github.com/pawel-big-lebowski) |
14 | | - |
| 3 | +## [Unreleased](https://github.com/MarquezProject/marquez/compare/0.27.0...HEAD) |
| 4 | + |
| 5 | +## [0.27.0](https://github.com/MarquezProject/marquez/compare/0.26.0...0.27.0) - 2022-10-24 |
| 6 | + |
| 7 | +### Added |
| 8 | + |
| 9 | +* Implement dataset symlink feature [`#2066`](https://github.com/MarquezProject/marquez/pull/2066) [@pawel-big-lebowski](https://github.com/pawel-big-lebowski) |
| 10 | + *Adds support for multiple dataset names and adds edges to the lineage graph based on symlinks.* |
| 11 | +* Store column lineage facets in separate table [`#2096`](https://github.com/MarquezProject/marquez/pull/2096) [@mzareba382](https://github.com/mzareba382) [@pawel-big-lebowski](https://github.com/pawel-big-lebowski) |
| 12 | + *Adds a column-level lineage representation and API endpoint to retrieve column-level lineage data from the Marquez database.* |
| 13 | +* Add a lineage graph endpoint for column lineage [`#2124`](https://github.com/MarquezProject/marquez/pull/2124) [@pawel-big-lebowski](https://github.com/pawel-big-lebowski) |
| 14 | + *Stores column-lineage information from events in the Marquez database and exposes column lineage through a graph endpoint.*
| 15 | +* Enrich returned dataset resource with column lineage information [`#2113`](https://github.com/MarquezProject/marquez/pull/2113) [@pawel-big-lebowski](https://github.com/pawel-big-lebowski) |
| 16 | + *Extends the `/api/v1/namespaces/{namespace}/datasets` endpoint to return the `columnLineage` facet.* |
| 17 | +* Add downstream column lineage [`#2159`](https://github.com/MarquezProject/marquez/pull/2159) [@pawel-big-lebowski](https://github.com/pawel-big-lebowski) |
| 18 | + *Extends the recursive query that returns column lineage nodes to traverse the graph for downstream nodes.* |
| 19 | +* Implement column lineage within Marquez Java client [`#2163`](https://github.com/MarquezProject/marquez/pull/2163) [@pawel-big-lebowski](https://github.com/pawel-big-lebowski) |
| 20 | + *Adds Marquez API client methods for column lineage.* |
| 21 | +* Provide `dataset_symlinks` table for `SymlinkDatasetFacet` [`#2087`](https://github.com/MarquezProject/marquez/pull/2087) [@pawel-big-lebowski](https://github.com/pawel-big-lebowski) |
| 22 | + *Modifies Marquez to handle the new `SymlinkDatasetFacet` in the OpenLineage spec.* |
| 23 | +* Display current run state for job node in lineage graph [`#2146`](https://github.com/MarquezProject/marquez/pull/2146) [@wslulciuc](https://github.com/wslulciuc) |
| 24 | + *Fills job nodes in the lineage graph with the latest run state and makes some minor changes to column names used to display dataset and job metadata.* |
| 25 | +* Include column lineage in dataset resource [`#2148`](https://github.com/MarquezProject/marquez/pull/2148) [@pawel-big-lebowski](https://github.com/pawel-big-lebowski) |
| 26 | + *Creates a method in `ColumnLineageService` to enrich `Dataset` with column lineage information and uses the method in `DatasetResource`.* |
| 27 | +* Add indices on the job table [`#2161`](https://github.com/MarquezProject/marquez/pull/2161) [@phixMe](https://github.com/phixMe) |
| 28 | + *Adds indices to the fields joined on inside the lineage query to speed up the join operation in the `/lineage` query.*
| 29 | +* Add endpoint to get column lineage by a job [`#2204`](https://github.com/MarquezProject/marquez/pull/2204) [@pawel-big-lebowski](https://github.com/pawel-big-lebowski) |
| 30 | + *Changes the API to make column lineage available for jobs.* |
| 31 | +* Add column lineage methods to Python client [`#2209`](https://github.com/MarquezProject/marquez/pull/2209) [@pawel-big-lebowski](https://github.com/pawel-big-lebowski) |
| 32 | + *Implements methods for column lineage in the Python client.* |
| 33 | + |
| 34 | +### Changed |
| 35 | + |
| 36 | +* Update insert job function to avoid joining on symlinks for jobs with no symlinks [`#2144`](https://github.com/MarquezProject/marquez/pull/2144) [@collado-mike](https://github.com/collado-mike) |
| 37 | + *Radically reduces the database compute load in Marquez installations that frequently create a large number of new jobs.* |
| 38 | +* Increase size of `column-lineage.description` column [`#2205`](https://github.com/MarquezProject/marquez/pull/2205) [@pawel-big-lebowski](https://github.com/pawel-big-lebowski) |
| 39 | + *`VARCHAR(255)` was too small for some users.* |
15 | 40 |
|
16 | 41 | ### Fixed |
17 | | -* Add support for `parentRun` facet as reported by older Airflow OpenLineage versions [@collado-mike](https://github.com/collado-mike) |
| 42 | + |
| 43 | +* Add support for `parentRun` facet as reported by older Airflow OpenLineage versions [`#2130`](https://github.com/MarquezProject/marquez/pull/2130) [@collado-mike](https://github.com/collado-mike) |
| 44 | + *Adds a `parentRun` alias to the `LineageEvent` `RunFacet`.* |
| 45 | +* Add fix and tests for handling Airflow DAGs with dots and task groups [`#2126`](https://github.com/MarquezProject/marquez/pull/2126) [@collado-mike](https://github.com/collado-mike) [@wslulciuc](https://github.com/wslulciuc)
| 46 | + *Fixes a recent change that broke how Marquez handles DAGs with dots and tasks within task groups, and adds test cases to validate the fix.*
| 47 | +* Fix version bump in `docker/up.sh` [`#2129`](https://github.com/MarquezProject/marquez/pull/2129) [@wslulciuc](https://github.com/wslulciuc)
| 48 | + *Defines a `VERSION` variable to bump on a release.* |
| 49 | +* Use `clean` when running `shadowJar` in Dockerfile [`#2145`](https://github.com/MarquezProject/marquez/pull/2145) [@wslulciuc](https://github.com/wslulciuc)
| 50 | + *Ensures the directory `api/build/libs/` is cleaned before building the JAR again and updates `.dockerignore` to ignore `api/build/*`.* |
| 51 | +* Fix bug that caused a single run event to create multiple jobs [`#2162`](https://github.com/MarquezProject/marquez/pull/2162) [@collado-mike](https://github.com/collado-mike) |
| 52 | + *Checks to see if a run with the given ID already exists and uses the pre-associated job if so.* |
| 53 | +* Fix column lineage returning multiple entries for a job run multiple times [`#2176`](https://github.com/MarquezProject/marquez/pull/2176) [@pawel-big-lebowski](https://github.com/pawel-big-lebowski)
| 54 | + *Makes column lineage return a column dependency only once if a job has been run several times.* |
| 55 | +* Fix API spec issues [`#2178`](https://github.com/MarquezProject/marquez/pull/2178) [@phixMe](https://github.com/phixMe) |
| 56 | + *Fixes issues with type generators in the `putDataset` API.* |
| 57 | +* Fix downstream recursion [`#2181`](https://github.com/MarquezProject/marquez/pull/2181) [@pawel-big-lebowski](https://github.com/pawel-big-lebowski) |
| 58 | + *Fixes an issue causing the same node to be added to the recursive table multiple times.*
| 59 | +* Update `jobs_current_version_uuid_index` and `jobs_symlink_target_uuid_index` to ignore `NULL` values [`#2186`](https://github.com/MarquezProject/marquez/pull/2186) [@collado-mike](https://github.com/collado-mike) |
| 60 | + *Avoids writing to the indices when the indexed values added by [#2161](https://github.com/MarquezProject/marquez/pull/2161) are null.* |
18 | 61 |
|
19 | 62 | ## [0.26.0](https://github.com/MarquezProject/marquez/compare/0.25.0...0.26.0) - 2022-09-15 |
20 | 63 |
|
|