Releases: tensorflow/serving
2.5.0-rc0
TensorFlow Serving using TensorFlow 2.5.0-rc0
Major Features and Improvements
- Upgrade to CUDA 11.2. (commit: 1975e3e)
- Experimental support for serving JAX and XLA/CPU models. (commit: 3c1b2b3)
- Add latency and availability metrics to the Prometheus API (#1623); see the example after this list. (commit: dfb41f1)
- Update TF Text to v2.4.3. (commit: ccfb606)
- Support URL reserved characters for REST API (#1726) (commit: dd9c467)
- Add Cross-Origin Resource Sharing (CORS) headers to REST API (#1817)
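The Prometheus metrics noted above are exposed only when the model server is started with a monitoring config (for example, a file passed via --monitoring_config_file containing a prometheus_config block that enables a metrics path). A minimal sketch of scraping the endpoint, assuming the illustrative path /monitoring/prometheus/metrics and the default REST port 8501:

```python
# Sketch: scrape TensorFlow Serving's Prometheus endpoint and print latency metrics.
# Assumes the server was started with a monitoring config along the lines of:
#   prometheus_config { enable: true  path: "/monitoring/prometheus/metrics" }
# supplied via --monitoring_config_file; host, port, and path are illustrative.
import requests

METRICS_URL = "http://localhost:8501/monitoring/prometheus/metrics"

resp = requests.get(METRICS_URL, timeout=5)
resp.raise_for_status()

for line in resp.text.splitlines():
    # Keep only the request latency series and skip Prometheus comment lines.
    if "request_latency" in line and not line.startswith("#"):
        print(line)
```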
Breaking Changes
- No breaking changes
Bug Fixes and Other Changes
- Fix typo in REQUIRED_PACKAGES for grpcio (commit: b9ed0f8)
- Update resnet_k8s.yaml file (commit: e7b7b33)
- Fix a compile warning thrown by gcc-9 (commit: 38a017d)
- Fix typo (commit: dbcd54f)
- Update json_tensor.cc (commit: a0a9d14)
- Add TfLiteInterpreterPool to make concurrent use of TfliteSession better (commit: d9efa43)
- Enable download of TF Serving sources at arbitrary commit for CPU docker image. (commit: de1ab9e)
- Updated tests to newer API (commit: 30dd2fe)
- Control number of grpc threads for request handling to avoid OOM (Fixes #1785). (commit: ac0eb73)
- Add dedicated aliases field to ModelServerConfig. (commit: 358f7d1)
- Update docker command line to work with GPUs (Fixes #1768). (commit: b41a28b)
- Option to disable grpc over http (Fixes #1764) (commit: f087290)
- Remove an unused experimental config option "experimental_fixed_input_tensors_filepath". (commit: 3234fca)
- Removing CurriedSession, since it is no longer used. (commit: 87793ad)
- Improve error message when a file does not exist. (commit: 78d47f7)
- Fix inference request delay when model is switched (Fixes #1796). (commit: 803dd42)
- Transition TensorFlow Serving to TensorFlow's new WORKSPACE protocol. (commit: 50a7ef3)
- Clarify that object values in REST requests may include B64-encoded data and similar key/value pair objects; see the example after this list. (commit: 0536678)
- Remove experimental comment on TfLiteSession (commit: ab7f9a5)
- Register custom TfLite ParseExample and add benchmark (commit: 20fe3ca)
- Use respectful terms. (commit: b73bd7b)
- Pre-allocate memory for certain vectors where the size is known. (commit: e208b6e)
- Update the serving_basic guide (serving_basic.md) to bring it up to date with TF 2.x, including: (commit: cea306a)
- Use NullSafeStringView for potentially null pointer returned from libevent (commit: a46fdb2)
- Replace nullptr constructor for string_view with empty strings (commit: a98d164)
- Fixing MKL builds due to missing 'build_with_openmp' option (commit: 0ed23df)
- Implement batch parallelism for tflite sessions (commit: fec1d5d)
- Fix GPU docker image massive increase in size (#1813) (commit: 5a0dfd9)
- Fix TensorFlow Serving build with MKL+OpenMP (commit: ddad074)
- Remove hashtable custom op dependencies (commit: bb51722)
- Enable aspired version which failed to load to attempt reload. (commit: 2530a33)
- Fixed a compilation error in aspired_versions_manager.cc (commit: 4ca9a4b)
- Add "_r" root event annotation to ProcessBatch events. (commit: e5c3aec)
- Bump minimum bazel version 3.7.2. (commit: 5edcd13)
- Don't hardcode the path to python3. (commit: 63b2d1c)
- Fix package build due to config move in: (commit: 18dd766)
- Add model_service_cc_grpc_proto (commit: a670ff5)
- Fix memory leak from allocating input tensors (commit: 2f9b6a0)
- Allowing lossy floating point conversions for JSON inputs (commit: 57dac6c)
- Adding enable_profiler command line flag. (commit: 7e8720d)
- Add logging in ServerCore. (commit: 623da67)
- Removes mention of ASCII (commit: 8e97b59)
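For the REST note on B64-encoded object values above, a minimal sketch of a predict call whose binary input is sent as a {"b64": ...} object (model name, input key, port, and file are illustrative):

```python
# Sketch: send a binary feature to the REST predict API as a {"b64": ...} object.
# The model name ("my_model"), input key ("image_bytes"), and file are illustrative.
import base64
import json
import requests

url = "http://localhost:8501/v1/models/my_model:predict"

with open("example.jpg", "rb") as f:
    encoded = base64.b64encode(f.read()).decode("utf-8")

payload = {
    # Row format: one object per instance; binary values use the "b64" key.
    "instances": [
        {"image_bytes": {"b64": encoded}}
    ],
}

resp = requests.post(url, data=json.dumps(payload), timeout=10)
print(resp.json())
```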
Thanks to our Contributors
This release contains contributions from many people at Google, as well as:
Abhinav Pundir, Abolfazl Shahbazi, Aurélien Geron, Bairen Yi, gbaned, handong, Hao Ziyu, Junqin Zhang, kiddos, Oliver Sampson, OniB, Runzhen Wang, skawasak, zou000
2.4.0
TensorFlow Serving using TensorFlow 2.4.0
Major Features and Improvements
- Update TF Text to v2.3.0.
- Upgrade to CUDA Version 11.0.
- Update CUDNN_VERSION to 8.0.4.30.
- Adds user guide for Remote Predict Op.
- Add support for serving regress/classify (native Keras) TF2 models; see the example after this list.
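For the regress/classify support above, a minimal sketch of calling the REST classify endpoint of a model exported with a classification signature (model name, signature name, and feature keys are illustrative):

```python
# Sketch: call the REST classify endpoint of a TF2 model exported with a
# classification signature. Model name, signature name, and feature keys are illustrative.
import json
import requests

url = "http://localhost:8501/v1/models/my_classifier:classify"

payload = {
    "signature_name": "serving_default",
    # classify/regress take tf.Example-style feature maps under "examples".
    "examples": [
        {"age": 42, "hours_per_week": 40},
        {"age": 23, "hours_per_week": 20},
    ],
}

resp = requests.post(url, data=json.dumps(payload), timeout=10)
print(resp.json())  # e.g. {"results": [[["label", score], ...], ...]}
```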
Breaking Changes
Bug Fixes and Other Changes
- Adding /usr/local/cuda/extras/CUPTI/lib64 to LD_LIBRARY_PATH in order to unblock profiling (commit: 1270b8c)
- Improve error message when version directory is not found (commit: d687d3e)
- Migrate the remaining references of tf.app to compat.v1. (commit: 06fbf87)
- Cleanup TraceMe idioms (commit: f22f802)
- Adds LICENSE file to tensorflow-serving-api python package. (commit: 41188d4)
- Enable a way to 'forget' unloaded models in the ServableStateMonitor. (commit: 53c5a65)
- Added abstract layer for remote predict op over different RPC protocols with template. (commit: c54ca7e)
- Add an example which call the Remote Predict Op directly. (commit: d5b980f)
- For batching session in TF serving model server, introduce options to enable large batch splitting. (commit: f84187e)
- Add multi-inference support for TF2 models that use (commit: abb8d3b)
- Use absl::optional instead of tensorflow::serving::optional. (commit: c809305)
- Use absl::optional instead of tensorflow::serving::optional. (commit: cf1cf93)
- Remove tensorflow::serving::MakeCleanup and use tensorflow::gtl::MakeCleanup. (commit: 6ccb003)
- Use absl::optional and remove tensorflow::serving::optional. (commit: e8e5222)
- Deprecate tensorflow::CreateProfilerService() and update serving client. (commit: 98a5503)
- Change the SASS & PTX we ship with TF (commit: 0869292)
- Adding custom op support. (commit: 892ea42)
- Upgrade to PY3 for tests. (commit: 02624a8)
- Makes clear how to make a default config file for serving multiple models. (commit: 084eaeb)
- Use TraceMeEncode in BatchingSession's TraceMe. (commit: 78ff058)
- Export metrics for runtime latency for predict/classify/regress. (commit: c317582)
- Refactor net_http/client to expose request/response functionality as a public API (not yet finalized) for usage testing ServerRequestInterface and HttpServerInterface instances. (commit: 0b951c8)
- In model warm-up path, re-write error code out-of-range (intended when reading EOF in a file) to ok. (commit: d9bde73)
- Fix client REST API endpoint (commit: b847bac)
- Support multiple SignatureDefs by key in TFLite models (commit: 2e14cd9)
- Add dedicated aliases field to ModelServerConfig. (commit: 718152d)
- Remove deprecated flag fail_if_no_model_versions_found from tensorflow serving binary (commit: 4b62462)
- Fix TraceMe instrumentation for the padding size. (commit: 0cb94cd)
- Add vlog to dump updated model label map (for debugging) each time the map is updated. (commit: ac10e74)
- Add python wrapper for remote predict op and clean the build and include files. (commit: d0daa10)
- Add `portpicker` module required to run modelserver e2e tests. (commit: 82f8cc0)
- Change "infinity" to "really high value". (commit: c96474c)
- Minimal commandline client to trigger profiler on the modelserver; see the example after this list. (commit: c0a5619)
- Add signature name to RPOp. (commit: 84dfc8b)
- When RPC error occurs, the output tensors should still get allocated. (commit: 9113de2)
- Fix BM_MobileNet benchmark (commit: af66562)
- Add JSPB BUILD targets for inference and example proto files. (commit: f1009eb)
- Fall back to legacy TFLite tensor naming when parsing signature defs in TFLiteSession. (commit: 3884187)
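The profiler-related entries above (the minimal commandline client and the profiler service changes) can also be driven from Python; a minimal sketch, assuming the model server exposes the profiler service on its gRPC port and a recent TF 2.x is installed locally:

```python
# Sketch: capture a short trace from a running model server with the TF profiler client.
# Assumes the profiler service is reachable on the server's gRPC port (8500 here);
# the address, log directory, and duration are illustrative.
import tensorflow as tf

tf.profiler.experimental.client.trace(
    service_addr="grpc://localhost:8500",
    logdir="/tmp/tfserving_profile",  # inspect later in TensorBoard's Profile tab
    duration_ms=2000,
)
```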
Thanks to our Contributors
This release contains contributions from many people at Google, as well as:
Adarshreddy Adelli, Lescurel
2.4.0-rc4
TensorFlow Serving using TensorFlow 2.4.0-rc4.
Major Features and Improvements
- Update TF Text to v2.3.0.
- Upgrade to CUDA Version 11.0.
- Update CUDNN_VERSION to 8.0.4.30.
- Adds user guide for Remote Predict Op
- Upgrade to PY3 for tests.
Breaking Changes
Bug Fixes and Other Changes
- Migrate the remaining references of tf.app to compat.v1. (commit: 06fbf87)
- Enable a way to 'forget' unloaded models in the ServableStateMonitor. (commit: 53c5a65)
- Added abstract layer for remote predict op over different RPC protocols with template. (commit: c54ca7e)
- Adds user guide for Remote Predict Op (commit: fc82463)
- Add support for serving regress/classify (native) TF2 models. (commit: b724ced)
- For batching session in TF serving model server, introduce options to enable large batch splitting. (commit: f84187e)
- Add multi-inference support for TF2 models that use (commit: abb8d3b)
- Change the SASS & PTX we ship with TF (commit: 0869292)
- Adding custom op support. (commit: 892ea42)
- Broaden net_http client visibility and use HTTPStatusCode in ClientResponse status (commit: 7292714)
- In model warm-up path, re-write error code out-of-range (intended when reading EOF in a file) to ok. (commit: d9bde73)
- Support multiple SignatureDefs by key in TFLite models (commit: 2e14cd9)
- Add dedicated aliases field to ModelServerConfig. (commit: 718152d)
- Add `portpicker` module required to run modelserver e2e tests. (commit: 82f8cc0)
- Minimal commandline client to trigger profiler on the modelserver. (commit: c0a5619)
- Add signature name to RPOp. (commit: 84dfc8b)
Thanks to our Contributors
This release contains contributions from many people at Google.
2.4.0-rc3
TensorFlow Serving using TensorFlow 2.4.0-rc3
Major Features and Improvements
- Update TF Text to v2.3.0.
- Upgrade to CUDA Version 11.0.
- Update CUDNN_VERSION to 8.0.4.30.
- Upgrade to PY3 for tests.
Breaking Changes
Bug Fixes and Other Changes
- Migrate the remaining references of tf.app to compat.v1. (commit: 06fbf87)
- Enable a way to 'forget' unloaded models in the ServableStateMonitor. (commit: 53c5a65)
- Added abstract layer for remote predict op over different RPC protocols with template. (commit: c54ca7e)
- Adds user guide for Remote Predict Op (commit: fc82463)
- Add support for serving regress/classify (native) TF2 models. (commit: b724ced)
- For batching session in TF serving model server, introduce options to enable large batch splitting. (commit: f84187e)
- Add multi-inference support for TF2 models that use (commit: abb8d3b)
- Change the SASS & PTX we ship with TF (commit: 0869292)
- Adding custom op support. (commit: 892ea42)
- Broaden net_http client visibility and use HTTPStatusCode in ClientResponse status (commit: 7292714)
- In model warm-up path, re-write error code out-of-range (intended when reading EOF in a file) to ok. (commit: d9bde73)
- Support multiple SignatureDefs by key in TFLite models (commit: 2e14cd9)
- Add dedicated aliases field to ModelServerConfig. (commit: 718152d)
- Add `portpicker` module required to run modelserver e2e tests. (commit: 82f8cc0)
- Minimal commandline client to trigger profiler on the modelserver. (commit: c0a5619)
- Add signature name to RPOp. (commit: 84dfc8b)
Thanks to our Contributors
This release contains contributions from many people at Google.
2.4.0-rc2
TensorFlow Serving using TensorFlow 2.4.0-rc2
Major Features and Improvements
- Update TF Text to v2.3.0.
- Upgrade to CUDA Version 11.0.
- Update CUDNN_VERSION to 8.0.4.30.
- Upgrade to PY3 for tests.
Breaking Changes
Bug Fixes and Other Changes
- Update TF Text to v2.3.0. (commit: 390732a)
- Migrate the remaining references of tf.app to compat.v1. (commit: 06fbf87)
- Enable a way to 'forget' unloaded models in the ServableStateMonitor. (commit: 53c5a65)
- Added abstract layer for remote predict op over different RPC protocols with template. (commit: c54ca7e)
- Adds user guide for Remote Predict Op (commit: fc82463)
- Add support for serving regress/classify (native) TF2 models. (commit: b724ced)
- For batching session in TF serving model server, introduce options to enable large batch splitting. (commit: f84187e)
- Add multi-inference support for TF2 models that use (commit: abb8d3b)
- Change the SASS & PTX we ship with TF (commit: 0869292)
- Adding custom op support. (commit: 892ea42)
- Upgrade to PY3 for tests. (commit: 02624a8)
- Upgrade to CUDA Version 11.0. (commit: b291c4d)
- Broaden net_http client visibility and use HTTPStatusCode in ClientResponse status (commit: 7292714)
- In model warm-up path, re-write error code out-of-range (intended when reading EOF in a file) to ok. (commit: d9bde73)
- Update CUDNN_VERSION to 8.0.4.30. (commit: 84e0189)
- Support multiple SignatureDefs by key in TFLite models (commit: 2e14cd9)
- Add dedicated aliases field to ModelServerConfig. (commit: 718152d)
- Add `portpicker` module required to run modelserver e2e tests. (commit: 82f8cc0)
- Minimal commandline client to trigger profiler on the modelserver. (commit: c0a5619)
- Add signature name to RPOp. (commit: 84dfc8b)
Thanks to our Contributors
This release contains contributions from many people at Google.
2.3.0
TensorFlow Serving using TensorFlow 2.3.0
Bug Fixes and Other Changes
- Add a ThreadPoolFactory abstraction for returning inter- and intra- thread pools, and update PredictRequest handling logic to use the new abstraction. (commit: 8e3a00c)
- Update Dockerfile.devel* with py3.6 installed. (commit: b3f46d4)
- Add more metrics for batching. (commit: f0bd9cf)
- Rename method to clarify intent. (commit: 9feac12)
- Plug ThreadPoolFactory into Classify request handling logic. (commit: 975f474)
- Plug ThreadPoolFactory into Regress request handling logic. (commit: ff9ebf2)
- Plug ThreadPoolFactory into MultiInference request handling logic. (commit: 9a2db1d)
- Add a tflite benchmark for Mobilenet v1 quant (commit: e266822)
- Allow batch size of zero in row format JSON (commit: fee9d12)
- Add tests for zero-sized batch (commit: b064c1d)
- Support for MLMD (https://www.tensorflow.org/tfx/guide/mlmd) broadcast in TensorFlow Serving. (commit: 4f8d3b7)
- Fix docker based builds (fixes #1596) (commit: ca2e003)
- Fix order dependency in batching_session_test. (commit: 58540f7)
- Split BasicTest in machine_learning_metadata_test into multiple test methods without order dependency. (commit: 745c735)
- Revert pinning the version for "com_google_absl". (commit: ff9e950)
- Minimize the diffs between mkl and non-mkl Dockerfiles (commit: e783014)
- Pin "com_google_absl" at the same version(with same patch) with Tensorflow. (commit: f46b88a)
- Update TF Text to v2.2.0. (commit: f8ea95d)
- Fix broken web link (commit: 0cb123f)
- Test zero-sized batch with tensors of different shapes (commit: 1f7aebd)
- Test inconsistent batch size between zero and non-zero (commit: 91afd42)
- Fix broken GetModelMetadata request processing (#1612) (commit: c1ce075)
- Adds support for SignatureDefs stored in metadata buffers to tflite sessions (commit: 4867fed)
- Update ICU library to include knowledge of built-in data files. (commit: c32ebd5)
- Add support for version labels to the REST API (Fixes #1555); see the example after this list. (commit: 3df0362)
- Update TF Text regression model to catch errors thrown from within ops. (commit: 425d596)
- Upgrade to CUDA Version 10.1. (commit: fd5a2a2)
- Migrates profiler_client trace to the new api in tensorflow_model_server_test. (commit: 8d7d1d6)
- Update the testing model for TRT to fix the test. (commit: 28f812d)
- Add release notes for TF Serving 2.2.0 (commit: 54475e6)
- Update bazel version requirement and version used in the docker images to match with TF (3.0.0). (commit: 56854d3)
- Fixes instructions on sample commands to serve a model with docker. (commit: a5cd1ca)
- Change use_tflite_model to prefer_tflite_model to allow multi-tenancy of Tensorflow models with Tensorflow Lite models. (commit: 8589d81)
- Introducing Arena usage to TensorFlow Serving's HTTP handlers. (commit: a33978c)
- Fix tensorflow::errors::* calls, which use StrCat instead of StrFormat (commit: 2c0bcec)
- Instrumentation for BatchingSession: (commit: 3ca9e89)
- Adjust error message for incorrect keys in the instances object. (commit: 83863b8)
- Update rules_pkg to latest (0.2.5) release. (commit: 932358e)
- In batching session, implement the support for 'enable_large_batch_splitting'. (commit: d7c6a65)
- Update version for 2.3.0-rc0 release. (commit: 3af3303)
- Set cuda compute capabilities for `cuda` build config. (commit: 731a34f)
- Update version for 2.3.0 release. (commit: 8b4c709)
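For the version-labels entry above, a minimal sketch of addressing a labeled version through the REST API (the model name, label, and input are illustrative; the label itself must be assigned via version_labels in the model server config):

```python
# Sketch: call a specific labeled version of a model through the REST API.
# Model name ("my_model"), label ("stable"), and the input are illustrative; the label
# must have been assigned to a version via version_labels in the model server config.
import json
import requests

url = "http://localhost:8501/v1/models/my_model/labels/stable:predict"

payload = {"instances": [[1.0, 2.0, 5.0]]}
resp = requests.post(url, data=json.dumps(payload), timeout=10)
print(resp.json())
```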
Thanks to our Contributors
This release contains contributions from many people at Google.
2.3.0-rc0
TensorFlow Serving using TensorFlow 2.3.0-rc2
Major Features and Improvements
- Update TF Text to v2.2.0. (commit: f8ea95d)
Breaking Changes
Bug Fixes and Other Changes
- Add a ThreadPoolFactory abstraction for returning inter- and intra- thread pools, and update PredictRequest handling logic to use the new abstraction. (commit: 8e3a00c)
- Update Dockerfile.devel* with py3.6 installed. (commit: b3f46d4)
- Add more metrics for batching. (commit: f0bd9cf)
- Plug ThreadPoolFactory into Classify request handling logic. (commit: 975f474)
- Plug ThreadPoolFactory into Regress request handling logic. (commit: ff9ebf2)
- Plug ThreadPoolFactory into MultiInference request handling logic. (commit: 9a2db1d)
- Allow batch size of zero in row format JSON (commit: fee9d12)
- Support for MLMD (https://www.tensorflow.org/tfx/guide/mlmd) broadcast in TensorFlow Serving. (commit: 4f8d3b7)
- Adds support for SignatureDefs stored in metadata buffers to tflite sessions (commit: 4867fed)
- Add support for version labels to REST API (Fixes #1555). (commit: 3df0362)
- Update TF Text regression model to catch errors thrown from within ops. (commit: 425d596)
- Upgrade to CUDA Version 10.1. (commit: fd5a2a2)
- Migrates profiler_client trace to the new api in tensorflow_model_server_test. (commit: 8d7d1d6)
Thanks to our Contributors
This release contains contributions from many people at Google.
2.2.0
TensorFlow Serving using TensorFlow 2.2.0
Major Features and Improvements
Breaking Changes
Bug Fixes and Other Changes
- This release is based on TensorFlow version 2.2.0
- Add a SourceAdapter that adds a prefix to StoragePath. (commit: f337623)
- Switch users of `tensorflow::Env::Now*()` to `EnvTime::Now*()`. (commit: 8a0895e)
- Remove SessionBundle support from Predictor. (commit: 2090d67)
- Replace the error_codes.proto references in tf serving. (commit: ab475bf)
- Adds performance guide and documentation for TensorBoard integration (commit: f1e4eb2)
- Remove SessionBundleSourceAdapter as we load Session bundles via (commit: d50aa2b)
- Use SavedModelBundleSourceAdapterConfig instead of (commit: 8ed3cee)
- Update minimum bazel version to 1.2.1. (commit: 1a36026)
- Drop support for beta gRPC APIs. (commit: 13d01fc)
- API spec for httpserver response-streaming (with flow-control). (commit: fd597f0)
- Change Python version to PY3. (commit: 7516746)
- Update Python tests in PY3. (commit: 0cf65d2)
- Upgrade bazel version for Dockerfiles. (commit: e507aa1)
- Change dockerfile for PY3. (commit: 7cbd06e)
- Reduce contention in FastReadDynamicPtr by sharding the ReadPtrs, by default one per CPU. (commit: d3b374b)
- Ensure that all outstanding ReadPtrs are destroyed before allowing a (commit: e41ee40)
- Allow splitting fields from batched session metadata into individual sessions (commit: caf2a92)
- Allow passing ThreadPoolOptions in various Session implementations. (commit: 2b6212c)
- Update bazel version used in the docker images. (commit: 162f729)
- Format error strings correctly in JSON response (Fixes #1600). (commit: 1ff4d31)
- Fix broken GetModelMetadata request processing (#1612) (commit: 55c4037)
- Support Python 3.7 in tensorflow-serving-api package (Fixes #1640); see the client example at the end of these notes. (commit: f775bb2)
- Update ICU library to include knowledge of built-in data files. (commit: 774f248)
- Adds storage.googleapis.com as the primary download location for the ICU, and resets the sha256 to match this archive. (commit: 028d050)
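As referenced from the tensorflow-serving-api entry above, a minimal sketch of a gRPC predict client built on that package (model name, signature, input key, and tensor values are illustrative):

```python
# Sketch: minimal gRPC predict client using the tensorflow-serving-api package.
# Model name ("my_model"), signature, and input key ("x") are illustrative.
import grpc
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

channel = grpc.insecure_channel("localhost:8500")
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

request = predict_pb2.PredictRequest()
request.model_spec.name = "my_model"
request.model_spec.signature_name = "serving_default"
request.inputs["x"].CopyFrom(tf.make_tensor_proto([[1.0, 2.0, 5.0]], dtype=tf.float32))

response = stub.Predict(request, timeout=10.0)
print(response.outputs)
```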