Skip to content

configuring genai model deployment with model readonly filesystem #4139

Open
dtrawins wants to merge 11 commits intomainfrom
ro-model_path-configuration
Open

configuring genai model deployment with model readonly filesystem #4139
dtrawins wants to merge 11 commits intomainfrom
ro-model_path-configuration

Conversation

@dtrawins
Copy link
Copy Markdown
Collaborator

🛠 Summary

JIRA/Issue if applicable.
Describe the changes.

🧪 Checklist

  • Unit tests added.
  • The documentation updated.
  • Change follows security best practices.
    ``

@dtrawins dtrawins marked this pull request as ready for review April 15, 2026 23:27
Copilot AI review requested due to automatic review settings April 15, 2026 23:27
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates OVMS HF pull-and-start / --task startup flows to support deployments where the model directory is read-only by generating and consuming graph.pbtxt purely in memory instead of writing it to disk.

Changes:

  • Add an in-memory graph export path (GraphExport) and update graph loading logic to accept in-memory graph content.
  • Adjust CLI/config validation and module startup logic to allow --task + --model_path without requiring HF download parameters.
  • Update and add tests to validate “no graph.pbtxt written” behavior in pull-and-start and local model-path task scenarios.

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
src/test/test_utils.hpp Adds a new SetUpServer overload supporting --task.
src/test/test_utils.cpp Implements the new SetUpServer overload building argv with --task.
src/test/pull_hf_model_test.cpp Updates tests to validate graph is stored in memory (no graph.pbtxt on disk) in pull-and-start mode.
src/test/llm/llmnode_test.cpp Adds a test ensuring --task + --model_path works without writing graph.pbtxt (read-only model dir).
src/server.cpp Updates module startup flow: skip HF pull when sourceModel is empty and generate graph in memory from model_path.
src/pull_module/hf_pull_model_module.cpp Controls whether graph is written to disk or stored in memory based on server mode.
src/modelmanager.cpp Treats in-memory graph content as satisfying “graph available” checks for Mediapipe startup.
src/mediapipe_internal/mediapipegraphdefinition.cpp Loads graph definition from in-memory content when available.
src/mediapipe_internal/mediapipegraphconfig.cpp Logs in-memory graph content instead of reading from file when available.
src/mediapipe_internal/BUILD Adds Bazel dependency on //src/graph_export:graph_export.
src/graph_export/graph_export.hpp Extends createServableConfig with writeToFile and adds in-memory graph accessors.
src/graph_export/graph_export.cpp Implements process-global in-memory graph storage and conditional file writing.
src/config.cpp Relaxes validation for HF pull-and-start when using --task with --model_path.
src/cli_parser.cpp Prevents model_name from being treated as source_model when model_path is set; preserves explicit --model_path.
src/BUILD Adds Bazel dependency on //src/graph_export:graph_export for server build.

Comment thread src/server.cpp
Comment on lines 381 to 404
if (config.getServerSettings().serverMode == HF_PULL_MODE || config.getServerSettings().serverMode == HF_PULL_AND_START_MODE) {
INSERT_MODULE(HF_MODEL_PULL_MODULE_NAME, it);
START_MODULE(it);
if (!status.ok()) {
return status;
bool needsHfPull = !config.getServerSettings().hfSettings.sourceModel.empty();
if (needsHfPull) {
INSERT_MODULE(HF_MODEL_PULL_MODULE_NAME, it);
START_MODULE(it);
if (!status.ok()) {
return status;
}
auto hfModule = dynamic_cast<const HfPullModelModule*>(it->second.get());
status = hfModule->clone();
// Return from modules only in --pull mode or error, otherwise start the rest of modules
if (config.getServerSettings().serverMode == HF_PULL_MODE || !status.ok())
return status;
} else {
// --task with --model_path: create graph in memory without HF download
GraphExport graphExporter;
const auto& hfSettings = config.getServerSettings().hfSettings;
status = graphExporter.createServableConfig(config.modelPath(), hfSettings, false);
if (!status.ok()) {
SPDLOG_ERROR("Failed to create in-memory graph config: {}", status.string());
return status;
}
SPDLOG_INFO("Graph config created in memory from model_path: {}", config.modelPath());
}
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

startModules() can enter HF_PULL(_AND_START)_MODE with an existing GraphExport in-memory graph still set from a previous run (GraphExport is process-global). Because this function doesn’t clear that state, later graph detection paths may incorrectly treat a graph as available even when graph.pbtxt isn’t present on disk. Consider explicitly calling GraphExport::clearInMemoryGraphContent() at the beginning of the HF_PULL/HF_PULL_AND_START handling (and/or after successful disk write) to prevent stale in-memory graphs from affecting subsequent startups.

Copilot uses AI. Check for mistakes.
Comment thread src/modelmanager.cpp
Comment on lines 232 to 235
std::ifstream ifs(mpConfig.getGraphPath());
if (ifs.is_open()) {
bool graphAvailable = ifs.is_open() || GraphExport::hasInMemoryGraphContent();
if (graphAvailable) {
// Single model with graph.pbtxt, check if user passed model unsupported model parameters in cmd arguments
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ModelManager::startFromConfig() treats GraphExport::hasInMemoryGraphContent() as equivalent to having a graph.pbtxt on disk. Because the in-memory graph is process-global and not tied to mpConfig/modelPath, a stale value could make graphAvailable true and skip the non-graph startup path incorrectly. To avoid incorrect graph detection, gate this on the current server mode/CLI path (only use in-memory graph in the specific HF/task+model_path flow) and/or ensure the in-memory content is cleared deterministically after use.

Copilot uses AI. Check for mistakes.
Comment on lines 63 to +70
Status MediapipeGraphDefinition::validateForConfigFileExistence() {
if (GraphExport::hasInMemoryGraphContent()) {
const std::string& content = GraphExport::getInMemoryGraphContent();
this->chosenConfig = content;
this->mgconfig.setCurrentGraphPbTxtMD5(ovms::FileSystem::getStringMD5(content));
SPDLOG_LOGGER_DEBUG(modelmanager_logger, "Using in-memory graph content for mediapipe graph definition: {}", this->getName());
return StatusCode::OK;
}
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

validateForConfigFileExistence() unconditionally prefers GraphExport in-memory content whenever it is set. Since that content is global and not scoped to a particular graph/model, this can cause the wrong graph to be loaded (e.g., if a previous startup/test left in-memory content set) and will also bypass filesystem-based reload semantics. Consider scoping the in-memory graph to the specific startup flow (or at least clearing it after it has been consumed) so normal file-based graphs aren’t shadowed.

Copilot uses AI. Check for mistakes.
Comment on lines +360 to 362
ASSERT_TRUE(ovms::GraphExport::hasInMemoryGraphContent());
std::string graphContents = ovms::GraphExport::getInMemoryGraphContent();
ASSERT_EQ(expectedGraphContents, removeVersionString(graphContents)) << graphContents;
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests rely on GraphExport’s in-memory graph content, but GraphExport uses a process-global static string. Without clearing it in TearDown (or at the start of each test), the in-memory graph can leak into later tests and change their behavior (e.g., later graph existence checks may pass unexpectedly). Add GraphExport::clearInMemoryGraphContent() to the fixture TearDown (and possibly SetUp) to keep tests isolated.

Copilot uses AI. Check for mistakes.
Comment on lines +4543 to +4547
void SetUp() override {
// Rename graph.pbtxt so it's not used from file
if (std::filesystem::exists(graphPath)) {
std::filesystem::rename(graphPath, graphPathRenamed);
}
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SetUp() renames graph.pbtxt to graph.pbtxt.bak without handling the case where graph.pbtxt.bak already exists (e.g., from a previous failed run). std::filesystem::rename will throw in that situation and can make the test flaky. Consider removing/uniquifying the backup path or using copy+remove with overwrite semantics so the setup is idempotent.

Copilot uses AI. Check for mistakes.
Comment on lines +4551 to +4559
void TearDown() override {
ovms::Server& server = ovms::Server::instance();
server.setShutdownRequest(1);
if (t && t->joinable())
t->join();
server.setShutdownRequest(0);
// Restore write permissions and rename graph.pbtxt back
RemoveReadonlyFileAttributeFromDir(modelDir);
if (std::filesystem::exists(graphPathRenamed)) {
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fixture starts the server in a mode that generates an in-memory graph (GraphExport uses a process-global static string). TearDown() restores filesystem state, but it doesn’t clear the in-memory graph content, which can leak into subsequent tests within the same gtest binary. Call GraphExport::clearInMemoryGraphContent() in TearDown (and/or SetUp) to keep test state isolated.

Copilot uses AI. Check for mistakes.
Comment thread src/test/test_utils.cpp
Comment on lines +895 to +903
char* argv[] = {(char*)"ovms",
(char*)"--model_name",
(char*)modelName,
(char*)"--model_path",
(char*)getGenericFullPathForSrcTest(modelPath).c_str(),
(char*)"--port",
(char*)port.c_str(),
(char*)"--task",
(char*)task};
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this overload, argv stores a pointer from getGenericFullPathForSrcTest(modelPath).c_str(), but that c_str() comes from a temporary std::string that is destroyed at the end of the initializer expression. That leaves argv containing a dangling pointer and can cause undefined behavior in server.start(). Keep the full model path in a named std::string (with lifetime spanning the server.start call) and build argv from stable storage (ideally avoid casting away const on string literals as well).

Copilot uses AI. Check for mistakes.
Comment on lines +118 to 124
if (!writeToFile) {
s_inMemoryGraphContent = pbtxtContent;
return StatusCode::OK;
}
// clang-format on
std::string fullPath = FileSystem::joinPath({directoryPath, "graph.pbtxt"});
return FileSystem::createFileOverwrite(fullPath, pbtxtContent);
Copy link

Copilot AI Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When writeToFile==false, s_inMemoryGraphContent is populated, but there is no corresponding clearing when writeToFile==true (or before generating a new graph). This means a previous in-memory graph can persist and later be picked up by GraphExport::hasInMemoryGraphContent(), causing unrelated runs/tests to load stale graph content instead of reading graph.pbtxt from disk. Clear the in-memory content whenever you generate/write a file-based graph (and/or at the start of each server startup) so the in-memory fallback cannot leak across runs.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants