- N/A
- Depends on `protobuf>=4.21.6,<6.32` for 3.9 and 3.10
- N/A
- N/A
- N/A
- Relax dependency on Protobuf to include version 6.x
- Remove upper bound for Protobuf dependency
- N/A
- N/A
- N/A
- Bumped the minimum bazel version required to build `tfmd` to 6.5.0.
- N/A
- N/A
- N/A
- Add Audio as a schema domain.
- Add Video as a schema domain.
- Resolve issue where pre-release versions of protobuf are installed.
- Depends on `protobuf>=4.25.2,<5` for Python 3.11 and on `protobuf>=4.21.6,<4.22` for 3.9 and 3.10
- N/A
- N/A
- Relax dependency on Protobuf to include version 5.x
- N/A
- For nested features with N nested levels (N > 1), the statistics counting the number of values in `CommonStatistics` and `WeightedCommonStatistics` will rely on the innermost level.
- N/A
- N/A
- N/A
- Bump the Ubuntu version on which TFMD is tested to 20.04 (previously was 16.04).
- Bumped the minimum bazel version required to build `tfmd` to 6.1.0.
- Depends on `protobuf>=4.25.2,<5` for Python 3.11 and on `protobuf>3.20.3,<4.21` for 3.9 and 3.10.
- Depends on `googleapis-common-protos>=1.56.4,<2` for Python 3.11 and on `googleapis-common-protos>=1.52.0,<2` for 3.9 and 3.10.
- Relax dependency on `absl-py` to include version 2.
- Removed `NaturalLanguageDomain.location_constraint_regex`. It was documented as "please do not use" and never implemented.
- Change to the semantics of min/max/avg/tot num-values for nested features (see above).
- Deprecated Python 3.8 support.
- N/A
- Add `joint_group` to `SequenceMetadata` to specify which group this sequence feature belongs to so that they can be modeled jointly.
- Add `BOOL_TYPE_INVALID_CONFIG` anomaly type.
- Add `embedding_dim` to `FloatDomain` to specify the embedding dimension, which is useful for use cases such as restoring shapes for flattened sequence of embeddings.
- Add `sequence_truncation_limit` to `SequenceMetadata` to specify the maximum sequence length that should be processed.
- Depends on `protobuf>=3.20.3,<4.21`. Upper bound is required to avoid breaking changes.
- Add `embedding_type` to `FloatDomain` to specify the semantic type of the embedding. This is useful for use cases where the embedding dimension is inferred from the embedding type.
- N/A
- N/A
- N/A
- Depends on `protobuf>=3.20.3,<5`.
- N/A
- N/A
- Introduce `Schema.represent_variable_length_as_ragged` knob to automatically generate `RaggedTensor`s for variable length features.
- Introduces a Schema option `HistogramSelection` to allow numeric drift/skew calculations to use QUANTILES histograms, which are more robust to outliers.
- N/A
- N/A
- Deprecated Python 3.7 support.
- N/A
- N/A
- N/A
- N/A
- N/A
- Add a categorical indicator to the schema for `StringDomain`.
- Add ProblemStatement Task.is_auxiliary field to allow specifying auxiliary tasks in multi-task learning problems.
- Add the SequenceMetadata field to the schema to specify if this feature could be treated as a sequence feature.
- Add a `CUSTOM_VALIDATION` Type in anomalies.proto.
- Histogram Buckets include their upper bound instead of their lower bound.
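The bucket-boundary change above can be illustrated with a minimal sketch. It is not TFMD code: the helper name `bucket_index` and the sample edges are hypothetical, and treating the first bucket as also including its lower bound is an assumption made so that the minimum value lands somewhere.

```python
from bisect import bisect_left


def bucket_index(value, boundaries):
    """Return the index of the bucket that holds `value`.

    `boundaries` is a sorted list of bucket edges, so bucket i spans
    boundaries[i] to boundaries[i + 1] and includes its upper bound.
    Assumption: the first bucket also includes its lower bound so that the
    minimum value has a home.
    """
    if not boundaries[0] <= value <= boundaries[-1]:
        raise ValueError("value lies outside the histogram range")
    # bisect_left finds the first edge >= value; the bucket ending at that
    # edge is the one that owns the value.
    return max(bisect_left(boundaries, value) - 1, 0)


# Hypothetical edges for three buckets: 0-10, 10-20, 20-30.
edges = [0.0, 10.0, 20.0, 30.0]
```

Under this convention a value sitting exactly on an interior edge, such as 10.0, is counted in the bucket that ends at that edge, not in the bucket that starts there.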
- N/A
- N/A
- ThresholdConfig.threshold field is made into a oneof.
- Clarifies the meaning of num_non_missing in statistics.proto.
- N/A
- ProblemStatement Task.task_weight and MetaOptimizationTarget.weight are deprecated.
- N/A
- N/A
- N/A
- N/A
- N/A
- Adds experimental support within statistics.proto and schema.proto for marking features that are derived during statistics generation for data exploration or validation, but not actually present in input data.
- Adds experimental DERIVED_FEATURE_BAD_LIFECYCLE and DERIVED_FEATURE_INVALID_SOURCE anomaly types.
- N/A
- N/A
- N/A
- N/A
- N/A
- N/A
- N/A
- statistics.proto: Includes a field `invalid_utf8_count` in `StringStatistics` to store the number of non-utf8 encoded strings for a feature.
- Depends on `absl-py>=0.9,<2.0.0`.
- Removes deprecated field `objective_function` from ProblemStatement.
- Deprecates `multi_objective` field in ProblemStatement.
- Deprecates several unused PerformanceMetrics.
- N/A
- A `threshold_config` is added to MetaOptimizationTarget to allow for expressing thresholded optimization goals.
- N/A
- N/A
- N/A
- Added a new field to `FloatDomain` in schema to allow expression of categorical floats.
- N/A
- Deprecated Python 3.6 support.
- To maintain version consistency among TFX Family libraries we skipped the 1.3.x release for TFMD library.
- Added `PositiveNegativeSpec` to `ProblemStatement.BinaryClassification` for specifying positive and negative class values.
- N/A
- N/A
- N/A
- N/A
- Depends on `protobuf>=3.13,<4`.
- N/A
- N/A
- Added public Python interface for proto/* in `proto/__init__.py`.
- N/A
- N/A
- N/A
- N/A
- Added new anomaly types: `MULTIPLE_REASONS` and `INVALID_DOMAIN_SPECIFICATION`.
- Added new anomaly type: `STATS_NOT_AVAILABLE`.
- N/A
- N/A
- Adding the ability to specify and detect sequence length issues.
- Depends on `absl-py>=0.9,<0.13`.
- N/A
- N/A
- Added new anomaly type `MAX_IMAGE_BYTE_SIZE_EXCEEDED` for image_domain.
- Added new anomaly type `INVALID_FEATURE_SHAPE`.
- The `RaggedTensor` TensorRepresentation now supports additional partitions.
- N/A
- N/A
- N/A
- Added new anomaly types to AnomalyInfo to report data issues with NL features.
- Added new FloatDomain field and anomaly type to designate and validate features that represent fixed dimensional embeddings.
- N/A
- N/A
- Added new fields to NaturalLanguageDomain message in the schema, including support for specifying vocabularies, constraints on sequence values (SequenceValueConstraints), constraints on vocabulary coverage (FeatureCoverageConstraints), and constraints on token location (location_constraints_regex).
- Added new NaturalLanguageStatistics message to the statistics.proto so that we can compute statistics corresponding to Natural Language features.
- N/A
- N/A
- N/A
- Added new Anomaly and Schema field to support drift and distribution skew detection for numeric features.
- Added a new field in Anomalies proto to report the raw measurements of distribution skew detection.
- From this release TFMD will also be hosting nightly packages on https://pypi-nightly.tensorflow.org. To install the nightly package use the following command: `pip install --extra-index-url https://pypi-nightly.tensorflow.org/simple tensorflow-metadata`. Note: These nightly packages are unstable and breakages are likely to happen. The fix could often take a week or more depending on the complexity involved for the wheels to be available on the PyPI cloud service. You can always use the stable version of TFMD available on PyPI by running the command `pip install tensorflow-metadata`.
- Added new Anomaly type to describe when a domain is incompatible with the data type.
- Added new Anomaly types for invalid schema configurations (missing name, missing type, etc).
- Added new Anomaly type to describe when type does not match the data.
- Added new LifecycleStage:DISABLED.
- N/A
- N/A
- From this version we will be releasing python 3.8 wheels.
- When installing from source, you don't need any steps other than `pip install` (needs Bazel).
- Labels can be specified as Paths in addition to string names.
- Depends on `absl-py>=0.9,<0.11`.
- Depends on `googleapis-common-protos>=1.52.0,<2`.
- N/A
- Deprecated Python 3.5 support.
- Added disallow_inf to FloatDomain message in schema.proto.
- Added new Anomaly type to describe data that has unexpected Infs / -Infs.
- Added new Anomaly and Schema field for specifying ratio of supported images.
- Added value_counts field to Feature message in schema.proto, which describes the number of values for features that have more than one nestedness level.
- Added new anomaly type VALUE_NESTEDNESS_MISMATCH to describe data that has a nestedness level that does not match the schema.
- Added new Any type value to CustomStatistic.
- Add ProblemStatement and Metric Python proto stubs.
- Use absltest instead of unittest.
- N/A
- Drops Python 2 support.
- Note: We plan to remove Python 3.5 support after this release.
- Added UniqueConstraints to Feature message in schema.proto.
- Added new Anomaly types to describe data that does not conform to UniqueConstraints.
- Added PresenceAndValencyStatistics to CommonStatistics.
- Added RaggedTensor in TensorRepresentation
- Added a new type of Anomaly: DATASET_HIGH_NUM_EXAMPLES
- Added a new field to dataset_constraints: max_examples_count
- Added a multi-label TaskType.
- Removed ProblemStatementNamespace proto
- Removed ProblemStatementReference proto
- Removed field ProblemStatement.implements
- Fixed a compatibility issue with newer bazel versions.
- Started pulling TF 1.15.2 source for building.
- Added support for specifying behavior of rare / OOV multiclass labels.
- Added anomaly types related to weighted features.
- Added support for storing lift stats on weighted examples.
- The removal of `lift_series` from `CategoricalCrossStats` and the change of type of `LiftSeries.LiftValue.lift` from float to double will cause parsing failures for serialized protos written by version 0.21.0 which contained the deleted or changed fields.
- Added protos for categorical cross statistics using lift.
- Added a new type of Anomaly: FLOAT_TYPE_HAS_NAN
- Added a new field to float_domain: disallow_nans
- Added SparseTensor to TensorRepresentation.
- Added a new type of Anomaly
- Add WeightedFeature to schema.
- Add min_examples_count to DatasetConstraints and DATASET_LOW_NUM_EXAMPLES anomaly type.
- Add TimeOfDay domain and UNIX_DAY granularity for TimeDomain in schema.
- Added TensorRepresentation to schema.
No significant changes. Upgrading to keep version alignment.
- Adding CustomMetric to PerformanceMetric.
- Added an Any field to Schema Feature, for storing arbitrary structured data.
- Refactoring ProblemStatement and related protos. At present, these are not stable.
- Added ProblemStatement.
- Add support for declaring sparse features.
- Add support for schema diff regions.
- Adding functionality for handling structured data.
- StructStatistics.common_statistics changed to StructStatistics.common_stats to agree with Facets.
- The change from StructStatistics.common_statistics to StructStatistics.common_stats may break code that had this field set and was serializing to some text format. The wire format should be fine.
- Use the same version of protobuf as tensorflow.
- Added support for structural statistics.
- Added new error types.
- Removed DiffRegion.
- Added RankHistogram to CustomStatistics.
- Removed DiffRegion.
- Established tf.Metadata as a standalone package.
- Moved tf.Metadata code out of TF-Transform code tree, requiring package dependency updates and import updates.