configuring genai model deployment with model readonly filesystem #4139
Conversation
Pull request overview
This PR updates OVMS HF pull-and-start / --task startup flows to support deployments where the model directory is read-only by generating and consuming graph.pbtxt purely in memory instead of writing it to disk.
Changes:
- Add an in-memory graph export path (`GraphExport`) and update graph loading logic to accept in-memory graph content.
- Adjust CLI/config validation and module startup logic to allow `--task` + `--model_path` without requiring HF download parameters.
- Update and add tests to validate "no `graph.pbtxt` written" behavior in pull-and-start and local model-path task scenarios.
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| src/test/test_utils.hpp | Adds a new SetUpServer overload supporting --task. |
| src/test/test_utils.cpp | Implements the new SetUpServer overload building argv with --task. |
| src/test/pull_hf_model_test.cpp | Updates tests to validate graph is stored in memory (no graph.pbtxt on disk) in pull-and-start mode. |
| src/test/llm/llmnode_test.cpp | Adds a test ensuring --task + --model_path works without writing graph.pbtxt (read-only model dir). |
| src/server.cpp | Updates module startup flow: skip HF pull when sourceModel is empty and generate graph in memory from model_path. |
| src/pull_module/hf_pull_model_module.cpp | Controls whether graph is written to disk or stored in memory based on server mode. |
| src/modelmanager.cpp | Treats in-memory graph content as satisfying “graph available” checks for Mediapipe startup. |
| src/mediapipe_internal/mediapipegraphdefinition.cpp | Loads graph definition from in-memory content when available. |
| src/mediapipe_internal/mediapipegraphconfig.cpp | Logs in-memory graph content instead of reading from file when available. |
| src/mediapipe_internal/BUILD | Adds Bazel dependency on //src/graph_export:graph_export. |
| src/graph_export/graph_export.hpp | Extends createServableConfig with writeToFile and adds in-memory graph accessors. |
| src/graph_export/graph_export.cpp | Implements process-global in-memory graph storage and conditional file writing. |
| src/config.cpp | Relaxes validation for HF pull-and-start when using --task with --model_path. |
| src/cli_parser.cpp | Prevents model_name from being treated as source_model when model_path is set; preserves explicit --model_path. |
| src/BUILD | Adds Bazel dependency on //src/graph_export:graph_export for server build. |
```diff
 if (config.getServerSettings().serverMode == HF_PULL_MODE || config.getServerSettings().serverMode == HF_PULL_AND_START_MODE) {
-    INSERT_MODULE(HF_MODEL_PULL_MODULE_NAME, it);
-    START_MODULE(it);
-    if (!status.ok()) {
-        return status;
+    bool needsHfPull = !config.getServerSettings().hfSettings.sourceModel.empty();
+    if (needsHfPull) {
+        INSERT_MODULE(HF_MODEL_PULL_MODULE_NAME, it);
+        START_MODULE(it);
+        if (!status.ok()) {
+            return status;
+        }
+        auto hfModule = dynamic_cast<const HfPullModelModule*>(it->second.get());
+        status = hfModule->clone();
+        // Return from modules only in --pull mode or error, otherwise start the rest of modules
+        if (config.getServerSettings().serverMode == HF_PULL_MODE || !status.ok())
+            return status;
+    } else {
+        // --task with --model_path: create graph in memory without HF download
+        GraphExport graphExporter;
+        const auto& hfSettings = config.getServerSettings().hfSettings;
+        status = graphExporter.createServableConfig(config.modelPath(), hfSettings, false);
+        if (!status.ok()) {
+            SPDLOG_ERROR("Failed to create in-memory graph config: {}", status.string());
+            return status;
+        }
+        SPDLOG_INFO("Graph config created in memory from model_path: {}", config.modelPath());
+    }
```
startModules() can enter HF_PULL(_AND_START)_MODE with an existing GraphExport in-memory graph still set from a previous run (GraphExport is process-global). Because this function doesn’t clear that state, later graph detection paths may incorrectly treat a graph as available even when graph.pbtxt isn’t present on disk. Consider explicitly calling GraphExport::clearInMemoryGraphContent() at the beginning of the HF_PULL/HF_PULL_AND_START handling (and/or after successful disk write) to prevent stale in-memory graphs from affecting subsequent startups.
```diff
 std::ifstream ifs(mpConfig.getGraphPath());
-if (ifs.is_open()) {
+bool graphAvailable = ifs.is_open() || GraphExport::hasInMemoryGraphContent();
+if (graphAvailable) {
     // Single model with graph.pbtxt, check if user passed model unsupported model parameters in cmd arguments
```
ModelManager::startFromConfig() treats GraphExport::hasInMemoryGraphContent() as equivalent to having a graph.pbtxt on disk. Because the in-memory graph is process-global and not tied to mpConfig/modelPath, a stale value could make graphAvailable true and skip the non-graph startup path incorrectly. To avoid incorrect graph detection, gate this on the current server mode/CLI path (only use in-memory graph in the specific HF/task+model_path flow) and/or ensure the in-memory content is cleared deterministically after use.
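The mode gating suggested above can be sketched as follows. This is a minimal model, not OVMS code: the `ServerMode` values and the `InMemoryGraph` store are stand-ins for the real `ServerSettings::serverMode` and `GraphExport` globals, used only to show that a stale process-global graph stops influencing file-based startups once the check is scoped to the producing flow.

```cpp
#include <string>

// Stand-ins for OVMS's real ServerMode enum and GraphExport global state.
enum class ServerMode { STANDARD,
    HF_PULL_AND_START };

struct InMemoryGraph {
    static std::string content;  // process-global, like GraphExport's static string
    static bool hasContent() { return !content.empty(); }
};
std::string InMemoryGraph::content;

// Only consult the in-memory graph in the flow known to have produced it;
// a plain file-based startup ignores any stale global content.
bool isGraphAvailable(bool fileOpen, ServerMode mode) {
    return fileOpen || (mode == ServerMode::HF_PULL_AND_START && InMemoryGraph::hasContent());
}
```

With this shape, leftover content from a previous HF run cannot flip `graphAvailable` for an unrelated config-file startup.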
```diff
 Status MediapipeGraphDefinition::validateForConfigFileExistence() {
+    if (GraphExport::hasInMemoryGraphContent()) {
+        const std::string& content = GraphExport::getInMemoryGraphContent();
+        this->chosenConfig = content;
+        this->mgconfig.setCurrentGraphPbTxtMD5(ovms::FileSystem::getStringMD5(content));
+        SPDLOG_LOGGER_DEBUG(modelmanager_logger, "Using in-memory graph content for mediapipe graph definition: {}", this->getName());
+        return StatusCode::OK;
+    }
```
validateForConfigFileExistence() unconditionally prefers GraphExport in-memory content whenever it is set. Since that content is global and not scoped to a particular graph/model, this can cause the wrong graph to be loaded (e.g., if a previous startup/test left in-memory content set) and will also bypass filesystem-based reload semantics. Consider scoping the in-memory graph to the specific startup flow (or at least clearing it after it has been consumed) so normal file-based graphs aren’t shadowed.
```cpp
ASSERT_TRUE(ovms::GraphExport::hasInMemoryGraphContent());
std::string graphContents = ovms::GraphExport::getInMemoryGraphContent();
ASSERT_EQ(expectedGraphContents, removeVersionString(graphContents)) << graphContents;
```
These tests rely on GraphExport’s in-memory graph content, but GraphExport uses a process-global static string. Without clearing it in TearDown (or at the start of each test), the in-memory graph can leak into later tests and change their behavior (e.g., later graph existence checks may pass unexpectedly). Add GraphExport::clearInMemoryGraphContent() to the fixture TearDown (and possibly SetUp) to keep tests isolated.
```cpp
void SetUp() override {
    // Rename graph.pbtxt so it's not used from file
    if (std::filesystem::exists(graphPath)) {
        std::filesystem::rename(graphPath, graphPathRenamed);
    }
```
SetUp() renames graph.pbtxt to graph.pbtxt.bak without handling the case where graph.pbtxt.bak already exists (e.g., from a previous failed run). std::filesystem::rename will throw in that situation and can make the test flaky. Consider removing/uniquifying the backup path or using copy+remove with overwrite semantics so the setup is idempotent.
```cpp
void TearDown() override {
    ovms::Server& server = ovms::Server::instance();
    server.setShutdownRequest(1);
    if (t && t->joinable())
        t->join();
    server.setShutdownRequest(0);
    // Restore write permissions and rename graph.pbtxt back
    RemoveReadonlyFileAttributeFromDir(modelDir);
    if (std::filesystem::exists(graphPathRenamed)) {
```
This fixture starts the server in a mode that generates an in-memory graph (GraphExport uses a process-global static string). TearDown() restores filesystem state, but it doesn’t clear the in-memory graph content, which can leak into subsequent tests within the same gtest binary. Call GraphExport::clearInMemoryGraphContent() in TearDown (and/or SetUp) to keep test state isolated.
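The fixture-isolation pattern can be sketched without gtest. `GraphStore` below is a stand-in for `GraphExport`'s process-global string, and its `clear()` models the `clearInMemoryGraphContent()` helper the review proposes (an assumed, not existing, API): clearing in both SetUp and TearDown means a graph produced by one test can never leak into the next, regardless of test order.

```cpp
#include <string>

// Stand-in for GraphExport's process-global store; clear() models the
// suggested clearInMemoryGraphContent() helper (an assumption, not real API).
struct GraphStore {
    static std::string content;
    static void clear() { content.clear(); }
    static bool hasContent() { return !content.empty(); }
};
std::string GraphStore::content;

// Fixture-style pattern: reset the global on both ends of every test.
struct ReadonlyModelFixture {
    void SetUp() { GraphStore::clear(); }
    void TearDown() { GraphStore::clear(); }
};
```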
```cpp
char* argv[] = {(char*)"ovms",
    (char*)"--model_name",
    (char*)modelName,
    (char*)"--model_path",
    (char*)getGenericFullPathForSrcTest(modelPath).c_str(),
    (char*)"--port",
    (char*)port.c_str(),
    (char*)"--task",
    (char*)task};
```
In this overload, argv stores a pointer from getGenericFullPathForSrcTest(modelPath).c_str(), but that c_str() comes from a temporary std::string that is destroyed at the end of the initializer expression. That leaves argv containing a dangling pointer and can cause undefined behavior in server.start(). Keep the full model path in a named std::string (with lifetime spanning the server.start call) and build argv from stable storage (ideally avoid casting away const on string literals as well).
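A sketch of the lifetime fix: own every argument in `std::string` storage that outlives the `server.start()` call, then build `argv` from that stable storage. `buildFullModelPath` stands in for `getGenericFullPathForSrcTest`; the helper names are illustrative.

```cpp
#include <string>
#include <vector>

// Stand-in for getGenericFullPathForSrcTest: returns by value, so the caller
// must keep the result alive rather than taking .c_str() of the temporary.
std::string buildFullModelPath(const std::string& modelPath) {
    return "/abs/prefix/" + modelPath;
}

// Build argv from owned strings. Each pointer stays valid as long as `args`
// is alive and not resized, which the caller keeps true across server.start().
std::vector<char*> makeArgv(std::vector<std::string>& args) {
    std::vector<char*> argv;
    argv.reserve(args.size());
    for (auto& a : args)
        argv.push_back(a.data());
    return argv;
}
```

In the test, `std::vector<std::string> args` would be a local declared before `server.start(argv.size(), argv.data())` and destroyed only after it returns, which also avoids casting away const on string literals.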
```cpp
if (!writeToFile) {
    s_inMemoryGraphContent = pbtxtContent;
    return StatusCode::OK;
}
// clang-format on
std::string fullPath = FileSystem::joinPath({directoryPath, "graph.pbtxt"});
return FileSystem::createFileOverwrite(fullPath, pbtxtContent);
```
When writeToFile==false, s_inMemoryGraphContent is populated, but there is no corresponding clearing when writeToFile==true (or before generating a new graph). This means a previous in-memory graph can persist and later be picked up by GraphExport::hasInMemoryGraphContent(), causing unrelated runs/tests to load stale graph content instead of reading graph.pbtxt from disk. Clear the in-memory content whenever you generate/write a file-based graph (and/or at the start of each server startup) so the in-memory fallback cannot leak across runs.
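A minimal model of that fix, with `GraphConfig` standing in for `GraphExport` and a `writeGraphFile` callback standing in for `FileSystem::createFileOverwrite` (both stand-ins, not OVMS API): the single entry point either stores the graph in memory or writes it to disk, and the file-writing branch now clears the global first so stale in-memory content cannot survive a file-based run.

```cpp
#include <string>

struct GraphConfig {
    static std::string inMemoryContent;  // models GraphExport's static string

    template <typename WriteFn>
    static bool create(const std::string& pbtxt, bool writeToFile, WriteFn writeGraphFile) {
        if (!writeToFile) {
            inMemoryContent = pbtxt;  // in-memory mode: stash and return
            return true;
        }
        inMemoryContent.clear();  // file-based run: drop any previous in-memory graph
        return writeGraphFile(pbtxt);
    }
};
std::string GraphConfig::inMemoryContent;
```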