Refresh aspired servables/versions following config update #1518

njhill · 2019-12-18T18:19:17Z

Currently when the configured model list is updated via a call to handleReloadConfigRequest, the request thread blocks until any newly added models become available.

Their availability, however, depends on the filesystem polling thread rescanning the filesystem at some periodic interval, meaning that there's an arbitrary delay before the requested changes actually take effect and the RPC returns.

This problem may not be very noticeable with the default polling interval of 1 second, but it seems undesirable for longer intervals. In particular, it makes API-based dynamic reconfiguration incompatible with the --file_system_poll_wait_seconds=0 setting: in that case all handleReloadConfigRequest calls time out and do not take effect.
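For illustration only, the general idea of waking the poller on demand instead of waiting for its next tick can be sketched as below. This is a minimal, self-contained sketch and not the code in this PR: `PeriodicPoller`, `RefreshNow`, and `poll_fn` are hypothetical names, and treating a zero interval as "no periodic polling" is a simplifying assumption of the sketch.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical stand-in for a periodic filesystem poller.
// It is NOT TensorFlow Serving's actual class; names are illustrative only.
class PeriodicPoller {
 public:
  // poll_fn runs once per scan. In this sketch a zero (or negative) interval
  // disables periodic scans, so scans happen only when explicitly requested.
  PeriodicPoller(std::chrono::milliseconds interval,
                 std::function<void()> poll_fn)
      : interval_(interval),
        poll_fn_(std::move(poll_fn)),
        thread_([this] { Run(); }) {
    assert(poll_fn_ != nullptr);
  }

  ~PeriodicPoller() {
    {
      std::lock_guard<std::mutex> lock(mu_);
      stop_ = true;
    }
    cv_.notify_all();
    thread_.join();
  }

  // Intended to be called after a config update: wakes the polling thread
  // immediately and blocks until a scan that STARTED after this call finishes.
  void RefreshNow() {
    std::unique_lock<std::mutex> lock(mu_);
    // A scan already in flight may predate the config change, so wait for
    // the next one to be started and completed.
    const std::uint64_t target = started_ + 1;
    refresh_requested_ = true;
    cv_.notify_all();
    cv_.wait(lock, [&] { return completed_ >= target || stop_; });
  }

 private:
  void Run() {
    std::unique_lock<std::mutex> lock(mu_);
    while (!stop_) {
      auto wake = [this] { return stop_ || refresh_requested_; };
      if (interval_.count() > 0) {
        cv_.wait_for(lock, interval_, wake);  // periodic tick or early wake-up
      } else {
        cv_.wait(lock, wake);  // no periodic scans: wake only on request
      }
      if (stop_) break;
      refresh_requested_ = false;
      const std::uint64_t id = ++started_;
      lock.unlock();
      poll_fn_();  // scan without holding the lock
      lock.lock();
      completed_ = id;
      cv_.notify_all();  // release any caller blocked in RefreshNow()
    }
  }

  const std::chrono::milliseconds interval_;
  std::function<void()> poll_fn_;
  std::mutex mu_;
  std::condition_variable cv_;
  bool stop_ = false;
  bool refresh_requested_ = false;
  std::uint64_t started_ = 0;
  std::uint64_t completed_ = 0;
  std::thread thread_;  // declared last so it starts after the other members
};
```

With this shape, a config-reload handler would call `RefreshNow()` right after updating the monitored model list, so the wait for newly added models no longer depends on the polling interval.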

Fixes #1519

njhill · 2019-12-18T18:22:00Z

I have opened this against 1.15 since that's the version we are using, but can rebase on a different branch if needed.

Also apologies in advance for the code, I am not very familiar with C++.

tensorflow_serving/sources/storage_path/file_system_storage_path_source.cc

christisg

Can we add unit test coverage?

njhill · 2020-02-12T22:52:42Z

Thanks @christisg, I've pushed a commit to address your logging comment. I will aim to add unit test coverage when I get a chance... it will take me a bit longer due to unfamiliarity with C++ and the codebase/test framework.

astleychen · 2020-02-20T03:22:05Z

@njhill thanks for reporting this bug. It's painful, and it took me more than 3 hours to figure out the real behavior when setting --file_system_poll_wait_seconds=0 to reduce the GCS bucket class A/B operation requests made by polling. Hope we can see your fixes soon. :)

netfs · 2020-11-18T04:54:29Z

@njhill do you want to wrap up this PR by adding unit tests, as requested by the reviewer?

thanks!

njhill · 2020-11-18T19:25:04Z

@netfs apologies for letting this lag. I am not sure when I will realistically have a chance to do this: I'm especially busy right now and not very familiar with C++ or the testing setup, so it would take me a decent chunk of time.

Any help with that part would be appreciated!

netfs · 2020-11-18T19:33:19Z

no worries @njhill.

@astleychen do you want to help here and add tests?

googlebot added the cla: yes label Dec 18, 2019

njhill mentioned this pull request Dec 18, 2019

Aspired servables/versions aren't refreshed following config updates #1519

Closed

Minor rearrangement of logic and rename var for readability

0ef7c5a

njhill requested review from nrobeR and christisg and removed request for nrobeR January 15, 2020 23:56

christisg reviewed Feb 6, 2020

tensorflow_serving/sources/storage_path/file_system_storage_path_source.cc

christisg reviewed Feb 6, 2020

Split logging logic into separate method per @christisg's suggestion

eaba886

njhill mentioned this pull request Apr 13, 2020

Support multi-model serving and container sharing kserve/kserve#773

Closed

gkumbhat mentioned this pull request Jan 22, 2021

Add reconfig poll patch #1801

Open
