configuring genai model deployment with model readonly filesystem #4139
Conversation
Pull request overview
This PR updates OVMS HF pull-and-start / --task startup flows to support deployments where the model directory is read-only by generating and consuming graph.pbtxt purely in memory instead of writing it to disk.
Changes:
- Add an in-memory graph export path (`GraphExport`) and update graph loading logic to accept in-memory graph content.
- Adjust CLI/config validation and module startup logic to allow `--task` + `--model_path` without requiring HF download parameters.
- Update and add tests to validate "no `graph.pbtxt` written" behavior in pull-and-start and local model-path task scenarios.
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| src/test/test_utils.hpp | Adds a new SetUpServer overload supporting --task. |
| src/test/test_utils.cpp | Implements the new SetUpServer overload building argv with --task. |
| src/test/pull_hf_model_test.cpp | Updates tests to validate graph is stored in memory (no graph.pbtxt on disk) in pull-and-start mode. |
| src/test/llm/llmnode_test.cpp | Adds a test ensuring --task + --model_path works without writing graph.pbtxt (read-only model dir). |
| src/server.cpp | Updates module startup flow: skip HF pull when sourceModel is empty and generate graph in memory from model_path. |
| src/pull_module/hf_pull_model_module.cpp | Controls whether graph is written to disk or stored in memory based on server mode. |
| src/modelmanager.cpp | Treats in-memory graph content as satisfying “graph available” checks for Mediapipe startup. |
| src/mediapipe_internal/mediapipegraphdefinition.cpp | Loads graph definition from in-memory content when available. |
| src/mediapipe_internal/mediapipegraphconfig.cpp | Logs in-memory graph content instead of reading from file when available. |
| src/mediapipe_internal/BUILD | Adds Bazel dependency on //src/graph_export:graph_export. |
| src/graph_export/graph_export.hpp | Extends createServableConfig with writeToFile and adds in-memory graph accessors. |
| src/graph_export/graph_export.cpp | Implements process-global in-memory graph storage and conditional file writing. |
| src/config.cpp | Relaxes validation for HF pull-and-start when using --task with --model_path. |
| src/cli_parser.cpp | Prevents model_name from being treated as source_model when model_path is set; preserves explicit --model_path. |
| src/BUILD | Adds Bazel dependency on //src/graph_export:graph_export for server build. |
```diff
 if (config.getServerSettings().serverMode == HF_PULL_MODE || config.getServerSettings().serverMode == HF_PULL_AND_START_MODE) {
-    INSERT_MODULE(HF_MODEL_PULL_MODULE_NAME, it);
-    START_MODULE(it);
-    if (!status.ok()) {
-        return status;
+    bool needsHfPull = !config.getServerSettings().hfSettings.sourceModel.empty();
+    if (needsHfPull) {
+        INSERT_MODULE(HF_MODEL_PULL_MODULE_NAME, it);
+        START_MODULE(it);
+        if (!status.ok()) {
+            return status;
+        }
+        auto hfModule = dynamic_cast<const HfPullModelModule*>(it->second.get());
+        status = hfModule->clone();
+        // Return from modules only in --pull mode or error, otherwise start the rest of modules
+        if (config.getServerSettings().serverMode == HF_PULL_MODE || !status.ok())
+            return status;
+    } else {
+        // --task with --model_path: create graph in memory without HF download
+        GraphExport graphExporter;
+        const auto& hfSettings = config.getServerSettings().hfSettings;
+        status = graphExporter.createServableConfig(config.modelPath(), hfSettings, false);
+        if (!status.ok()) {
+            SPDLOG_ERROR("Failed to create in-memory graph config: {}", status.string());
+            return status;
+        }
+        SPDLOG_INFO("Graph config created in memory from model_path: {}", config.modelPath());
+    }
```
startModules() can enter HF_PULL(_AND_START)_MODE with an existing GraphExport in-memory graph still set from a previous run (GraphExport is process-global). Because this function doesn’t clear that state, later graph detection paths may incorrectly treat a graph as available even when graph.pbtxt isn’t present on disk. Consider explicitly calling GraphExport::clearInMemoryGraphContent() at the beginning of the HF_PULL/HF_PULL_AND_START handling (and/or after successful disk write) to prevent stale in-memory graphs from affecting subsequent startups.
```diff
 std::ifstream ifs(mpConfig.getGraphPath());
-if (ifs.is_open()) {
+bool graphAvailable = ifs.is_open() || GraphExport::hasInMemoryGraphContent();
+if (graphAvailable) {
     // Single model with graph.pbtxt, check if user passed model unsupported model parameters in cmd arguments
```
ModelManager::startFromConfig() treats GraphExport::hasInMemoryGraphContent() as equivalent to having a graph.pbtxt on disk. Because the in-memory graph is process-global and not tied to mpConfig/modelPath, a stale value could make graphAvailable true and skip the non-graph startup path incorrectly. To avoid incorrect graph detection, gate this on the current server mode/CLI path (only use in-memory graph in the specific HF/task+model_path flow) and/or ensure the in-memory content is cleared deterministically after use.
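The mode gating suggested above can be sketched as follows. This is a minimal model, not OVMS code: the `ServerMode` values and the `InMemoryGraph` store are stand-ins for the real `ServerSettings::serverMode` and `GraphExport` globals, used only to show that a stale process-global graph stops influencing file-based startups once the check is scoped to the producing flow.

```cpp
#include <string>

// Stand-ins for OVMS's real ServerMode enum and GraphExport global state.
enum class ServerMode { STANDARD,
    HF_PULL_AND_START };

struct InMemoryGraph {
    static std::string content;  // process-global, like GraphExport's static string
    static bool hasContent() { return !content.empty(); }
};
std::string InMemoryGraph::content;

// Only consult the in-memory graph in the flow known to have produced it;
// a plain file-based startup ignores any stale global content.
bool isGraphAvailable(bool fileOpen, ServerMode mode) {
    return fileOpen || (mode == ServerMode::HF_PULL_AND_START && InMemoryGraph::hasContent());
}
```

With this shape, leftover content from a previous HF run cannot flip `graphAvailable` for an unrelated config-file startup.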
```diff
 Status MediapipeGraphDefinition::validateForConfigFileExistence() {
+    if (GraphExport::hasInMemoryGraphContent()) {
+        const std::string& content = GraphExport::getInMemoryGraphContent();
+        this->chosenConfig = content;
+        this->mgconfig.setCurrentGraphPbTxtMD5(ovms::FileSystem::getStringMD5(content));
+        SPDLOG_LOGGER_DEBUG(modelmanager_logger, "Using in-memory graph content for mediapipe graph definition: {}", this->getName());
+        return StatusCode::OK;
+    }
```
validateForConfigFileExistence() unconditionally prefers GraphExport in-memory content whenever it is set. Since that content is global and not scoped to a particular graph/model, this can cause the wrong graph to be loaded (e.g., if a previous startup/test left in-memory content set) and will also bypass filesystem-based reload semantics. Consider scoping the in-memory graph to the specific startup flow (or at least clearing it after it has been consumed) so normal file-based graphs aren’t shadowed.
```cpp
ASSERT_TRUE(ovms::GraphExport::hasInMemoryGraphContent());
std::string graphContents = ovms::GraphExport::getInMemoryGraphContent();
ASSERT_EQ(expectedGraphContents, removeVersionString(graphContents)) << graphContents;
```
These tests rely on GraphExport’s in-memory graph content, but GraphExport uses a process-global static string. Without clearing it in TearDown (or at the start of each test), the in-memory graph can leak into later tests and change their behavior (e.g., later graph existence checks may pass unexpectedly). Add GraphExport::clearInMemoryGraphContent() to the fixture TearDown (and possibly SetUp) to keep tests isolated.
```cpp
void SetUp() override {
    // Rename graph.pbtxt so it's not used from file
    if (std::filesystem::exists(graphPath)) {
        std::filesystem::rename(graphPath, graphPathRenamed);
    }
```
SetUp() renames graph.pbtxt to graph.pbtxt.bak without handling the case where graph.pbtxt.bak already exists (e.g., from a previous failed run). std::filesystem::rename will throw in that situation and can make the test flaky. Consider removing/uniquifying the backup path or using copy+remove with overwrite semantics so the setup is idempotent.
```cpp
void TearDown() override {
    ovms::Server& server = ovms::Server::instance();
    server.setShutdownRequest(1);
    if (t && t->joinable())
        t->join();
    server.setShutdownRequest(0);
    // Restore write permissions and rename graph.pbtxt back
    RemoveReadonlyFileAttributeFromDir(modelDir);
    if (std::filesystem::exists(graphPathRenamed)) {
```
This fixture starts the server in a mode that generates an in-memory graph (GraphExport uses a process-global static string). TearDown() restores filesystem state, but it doesn’t clear the in-memory graph content, which can leak into subsequent tests within the same gtest binary. Call GraphExport::clearInMemoryGraphContent() in TearDown (and/or SetUp) to keep test state isolated.
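The fixture-isolation pattern can be sketched without gtest. `GraphStore` below is a stand-in for `GraphExport`'s process-global string, and its `clear()` models the `clearInMemoryGraphContent()` helper the review proposes (an assumed, not existing, API): clearing in both SetUp and TearDown means a graph produced by one test can never leak into the next, regardless of test order.

```cpp
#include <string>

// Stand-in for GraphExport's process-global store; clear() models the
// suggested clearInMemoryGraphContent() helper (an assumption, not real API).
struct GraphStore {
    static std::string content;
    static void clear() { content.clear(); }
    static bool hasContent() { return !content.empty(); }
};
std::string GraphStore::content;

// Fixture-style pattern: reset the global on both ends of every test.
struct ReadonlyModelFixture {
    void SetUp() { GraphStore::clear(); }
    void TearDown() { GraphStore::clear(); }
};
```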
```cpp
char* argv[] = {(char*)"ovms",
    (char*)"--model_name",
    (char*)modelName,
    (char*)"--model_path",
    (char*)getGenericFullPathForSrcTest(modelPath).c_str(),
    (char*)"--port",
    (char*)port.c_str(),
    (char*)"--task",
    (char*)task};
```
In this overload, argv stores a pointer from getGenericFullPathForSrcTest(modelPath).c_str(), but that c_str() comes from a temporary std::string that is destroyed at the end of the initializer expression. That leaves argv containing a dangling pointer and can cause undefined behavior in server.start(). Keep the full model path in a named std::string (with lifetime spanning the server.start call) and build argv from stable storage (ideally avoid casting away const on string literals as well).
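A sketch of the lifetime fix: own every argument in `std::string` storage that outlives the `server.start()` call, then build `argv` from that stable storage. `buildFullModelPath` stands in for `getGenericFullPathForSrcTest`; the helper names are illustrative.

```cpp
#include <string>
#include <vector>

// Stand-in for getGenericFullPathForSrcTest: returns by value, so the caller
// must keep the result alive rather than taking .c_str() of the temporary.
std::string buildFullModelPath(const std::string& modelPath) {
    return "/abs/prefix/" + modelPath;
}

// Build argv from owned strings. Each pointer stays valid as long as `args`
// is alive and not resized, which the caller keeps true across server.start().
std::vector<char*> makeArgv(std::vector<std::string>& args) {
    std::vector<char*> argv;
    argv.reserve(args.size());
    for (auto& a : args)
        argv.push_back(a.data());
    return argv;
}
```

In the test, `std::vector<std::string> args` would be a local declared before `server.start(argv.size(), argv.data())` and destroyed only after it returns, which also avoids casting away const on string literals.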
```cpp
if (!writeToFile) {
    s_inMemoryGraphContent = pbtxtContent;
    return StatusCode::OK;
}
// clang-format on
std::string fullPath = FileSystem::joinPath({directoryPath, "graph.pbtxt"});
return FileSystem::createFileOverwrite(fullPath, pbtxtContent);
```
When writeToFile==false, s_inMemoryGraphContent is populated, but there is no corresponding clearing when writeToFile==true (or before generating a new graph). This means a previous in-memory graph can persist and later be picked up by GraphExport::hasInMemoryGraphContent(), causing unrelated runs/tests to load stale graph content instead of reading graph.pbtxt from disk. Clear the in-memory content whenever you generate/write a file-based graph (and/or at the start of each server startup) so the in-memory fallback cannot leak across runs.
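A minimal model of that fix, with `GraphConfig` standing in for `GraphExport` and a `writeGraphFile` callback standing in for `FileSystem::createFileOverwrite` (both stand-ins, not OVMS API): the single entry point either stores the graph in memory or writes it to disk, and the file-writing branch now clears the global first so stale in-memory content cannot survive a file-based run.

```cpp
#include <string>

struct GraphConfig {
    static std::string inMemoryContent;  // models GraphExport's static string

    template <typename WriteFn>
    static bool create(const std::string& pbtxt, bool writeToFile, WriteFn writeGraphFile) {
        if (!writeToFile) {
            inMemoryContent = pbtxt;  // in-memory mode: stash and return
            return true;
        }
        inMemoryContent.clear();  // file-based run: drop any previous in-memory graph
        return writeGraphFile(pbtxt);
    }
};
std::string GraphConfig::inMemoryContent;
```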