
Add command-line flags to set model memory limit #1561

Open · wants to merge 1 commit into master

6 changes: 5 additions & 1 deletion tensorflow_serving/model_servers/main.cc
@@ -192,7 +192,11 @@ int main(int argc, char** argv) {
                        "EXPERIMENTAL; CAN BE REMOVED ANYTIME! Load and use "
                        "TensorFlow Lite model from `model.tflite` file in "
                        "SavedModel directory instead of the TensorFlow model "
-                       "from `saved_model.pb` file.")};
+                       "from `saved_model.pb` file."),
+      tensorflow::Flag("total_memory_limit_megabytes",
+                       &options.total_model_memory_limit_megabytes,
+                       "Total model memory limit in megabytes"),
+  };
 
   const auto& usage = tensorflow::Flags::Usage(argv[0], flag_list);
   if (!tensorflow::Flags::Parse(&argc, argv, flag_list)) {
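With the new flag in place, the limit could be supplied at server startup. A hypothetical invocation (the port and model flags are illustrative context, not part of this change):

tensorflow_model_server --port=8500 \
    --model_name=my_model \
    --model_base_path=/models/my_model \
    --total_memory_limit_megabytes=4096
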
6 changes: 5 additions & 1 deletion tensorflow_serving/model_servers/server.cc
@@ -19,8 +19,10 @@ limitations under the License.
 
 #include <iostream>
 #include <memory>
+#include <limits>
 #include <utility>
 #include <vector>
+#include <algorithm>
 
 #include "google/protobuf/wrappers.pb.h"
 #include "grpc/grpc.h"
@@ -285,7 +287,9 @@ Status Server::BuildAndStart(const Options& server_options) {
   options.flush_filesystem_caches = server_options.flush_filesystem_caches;
   options.allow_version_labels_for_unavailable_models =
       server_options.allow_version_labels_for_unavailable_models;
-
+  options.total_model_memory_limit_bytes = std::min(
+      ((uint64)server_options.total_model_memory_limit_megabytes) << 20,
+      std::numeric_limits<uint64>::max());
   TF_RETURN_IF_ERROR(ServerCore::Create(std::move(options), &server_core_));
 
   // Model config polling thread must be started after the call to

Collaborator (review comment on the new std::min expression):
Update total_model_memory_limit_bytes only when server_options.total_model_memory_limit_megabytes is not zero (implying someone changed the value via the command line). This ensures that the default limit continues to be applied as before. Your current change assumes that the default is uint64 max.
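A minimal sketch of what the reviewer is asking for, assuming the header default is changed to 0 as requested below (the shift by 20 converts megabytes to bytes, since 1 MB = 2^20 bytes):

  // Only override ServerCore's built-in default when the flag was
  // explicitly set on the command line (0 means "not set").
  if (server_options.total_model_memory_limit_megabytes != 0) {
    options.total_model_memory_limit_bytes =
        static_cast<uint64>(server_options.total_model_memory_limit_megabytes)
        << 20;  // megabytes -> bytes
  }

This also makes the std::min clamp unnecessary, since the flag value is only applied when the operator chose one.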
3 changes: 3 additions & 0 deletions tensorflow_serving/model_servers/server.h
@@ -17,6 +17,7 @@ limitations under the License.
 #define TENSORFLOW_SERVING_MODEL_SERVERS_SERVER_H_
 
 #include <memory>
+#include <limits>
 
 #include "grpcpp/server.h"
 #include "tensorflow/core/kernels/batching_util/periodic_function.h"
@@ -59,6 +60,8 @@ class Server {
     tensorflow::string batching_parameters_file;
     tensorflow::string model_name;
     tensorflow::int32 max_num_load_retries = 5;
+    tensorflow::int64 total_model_memory_limit_megabytes =
+        std::numeric_limits<uint64>::max() >> 20;
     tensorflow::int64 load_retry_interval_micros = 1LL * 60 * 1000 * 1000;
     tensorflow::int32 file_system_poll_wait_seconds = 1;
     bool flush_filesystem_caches = true;

Collaborator (review comment on the new default):
Set the default to 0.
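Following that suggestion, the declaration would presumably become (a sketch, not the merged code):

    // 0 acts as a sentinel for "not set via --total_memory_limit_megabytes",
    // letting ServerCore keep applying its own default limit.
    tensorflow::int64 total_model_memory_limit_megabytes = 0;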