@CodexRaunak
Contributor

This PR fixes #481

Description

  • Added an integration test in which meshsync handles a ReSync request and publishes cluster data back:
  1. The test first publishes the ReSync request using the same br broker.
  2. MeshSync's ListenToRequests() goroutine receives the message, sees Entity == resync-discovery, and sends a struct into the ReSync channel.
  3. This in turn calls debouncedRestartDiscovery.
  4. The test then receives the republished objects on the out channel and calls unmarshalObject() on each one to check for a Kubernetes resource.
  5. The final handler checks the count.

Signed commits

  • Yes, I signed my commits.

@welcome

welcome bot commented Dec 1, 2025

Yay, your first pull request! 👍 A contributor will be by to give feedback soon. In the meantime, please review the Meshery Contributors' Welcome Guide and be sure to join the community Slack.
Be sure to double-check that you have signed your commits. Here are instructions for making signing an implicit activity while performing a commit.

@gemini-code-assist
Contributor

Summary of Changes

Hello @CodexRaunak, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the integration test suite by introducing a new test case that specifically validates the meshsync component's behavior when a ReSync request is issued. The test ensures that meshsync correctly responds to such requests by re-discovering and publishing current Kubernetes cluster data, thereby confirming its resilience and data consistency mechanisms.

Highlights

  • Integration Test Coverage: Added a new integration test to verify meshsync's handling of ReSync requests.
  • ReSync Request Handling: The new test confirms that meshsync correctly receives a ReSync request, processes it, and subsequently publishes updated cluster data.
  • Broker Interaction Validation: The test simulates publishing a ReSync request via the NATS broker and then listens for and validates the subsequent data publications from meshsync.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request adds an integration test to verify that meshsync correctly handles a ReSync request and republishes cluster data. The changes look good and the test case is a valuable addition. I've provided a few suggestions to improve maintainability and fix a minor bug in the new test case. Specifically, I've recommended refactoring the use of a global variable for the broker instance in favor of explicit dependency passing, fixing an issue where the test was logging incorrect data, and replacing magic numbers with named constants for better readability.

@CodexRaunak CodexRaunak force-pushed the integration-test-coverage branch from 5e5da85 to a500f92 Compare December 2, 2025 21:44
@CodexRaunak
Contributor Author

@n2h9 could we have a review on this? I added an integration test in which meshsync handles a ReSync request and publishes cluster data back.
Thank you.

Contributor

@n2h9 n2h9 left a comment

Thank you for your contribution @CodexRaunak 🙇‍♀️ .

  1. It could be that the implemented test is not testing the resync functionality but the initial data produced by meshsync. The code reads from the out channel, and this channel is populated even before the call to resync. That is how meshsync works: it starts, discovers the k8s cluster, and pushes data to the broker. We need to first read all the messages from the channel; only when there is no more active publishing from meshsync should we send the resync request and then check that we receive something again. Preferably, do some intelligent checks that we received the same or almost the same data.

  2. I think brokerMessageHandler is not the best place for triggering resync, since this is the part that handles messages from the broker. We can still use it for processing both the initial messages and the messages that arrive after resync is triggered.

  3. Neither of the current tests, TestMeshsyncBinaryWithK8sClusterIntegration or TestMeshsyncLibraryWithK8sClusterCustomBrokerIntegration, provides an option to publish to meshsync topics right now. We will need to either
    enhance one (or both) of them (and hence the test case structure as well),
    or write a new top-level test.

…nd publishes cluster data back

Signed-off-by: Raunak Madan <[email protected]>
@CodexRaunak CodexRaunak force-pushed the integration-test-coverage branch from a500f92 to 49df6cf Compare December 3, 2025 14:45
@github-actions github-actions bot added the language/go Golang related label Dec 7, 2025
@CodexRaunak
Contributor Author

Thanks for the feedback, Nikita.
I have refactored my approach.

Before triggering the resync, we want to wait for meshsync's initial discovery to finish. As far as my research goes, there are two ways to check that:

  • Use timers and drain all the data produced by meshsync.
  • Use an event to signal that the initial discovery is complete.

Timers are not a very good approach: it is uncertain how much time meshsync will take, which may result in flaky tests.
@aabidsofi19 also suggested using events rather than timers. But when I looked, there wasn't any event that tells the broker that the initial discovery has completed. Therefore I added an event locally in messaging.go:
DiscoveryComplete EventType = "DISCOVERY_COMPLETE".
This event is fired when meshsync's initial discovery is done, and then we trigger the resync.

For the intelligent checks, we compare the data received before the resync with the data received after it.
If we follow this approach, we need to modify the broker package in meshkit as well; that is also why the checks are currently failing.
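
One way such a before/after comparison could look is sketched below. It is only an illustration: the resourceKey type and the missingAfterResync function are hypothetical names, not code from this PR, and it assumes each received object can be reduced to a kind, namespace, and name.

```go
package main

import "fmt"

// resourceKey is a hypothetical identity for a published object.
type resourceKey struct {
	Kind, Namespace, Name string
}

// missingAfterResync returns every key that was seen before the resync
// but did not show up again afterwards. Each missing key is reported once,
// in the order it first appears in before.
func missingAfterResync(before, after []resourceKey) []resourceKey {
	seen := make(map[resourceKey]struct{}, len(after))
	for _, k := range after {
		seen[k] = struct{}{}
	}

	var missing []resourceKey
	for _, k := range before {
		if _, ok := seen[k]; ok {
			continue
		}
		missing = append(missing, k)
		seen[k] = struct{}{} // avoid reporting the same key twice
	}
	return missing
}

func main() {
	before := []resourceKey{
		{"Deployment", "default", "web"},
		{"Pod", "default", "web-1"},
	}
	after := []resourceKey{
		{"Pod", "default", "web-1"},
	}
	fmt.Println(missingAfterResync(before, after))
}
```

Comparing sets of keys rather than full payloads tolerates fields that legitimately change between two publications, such as resource versions or timestamps.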

Could you please review this approach? I am open to any suggestions.

@CodexRaunak CodexRaunak requested a review from n2h9 December 8, 2025 16:47
@n2h9
Contributor

n2h9 commented Dec 10, 2025

Hey @CodexRaunak hello 👋 !

Thank you for your contribution 👍

There could be a couple of potential issues with this approach:

  • We add code to the production build that is only used in integration tests.
  • The logic for determining that the initial discovery is complete is probably more complicated than the place where the DISCOVERY_COMPLETE message is emitted in the current PR (see below).

There are dynamic informers, which emit events on any updates to tracked resources. The point in this PR where you added the DISCOVERY_COMPLETE message emission is, as far as I understand, the point where the dynamic informers are set up. It does not necessarily mean that all "initial events" have already been published.
I think the two coincide for our test cluster with 1 deployment and 3 pods, but that might not be the case for a real, bigger cluster.

A possible approach could be to perform this determination only on the integration test side.
You could consider a debounce pattern: set up a small timeout and reset it on every message received, until it eventually fires. When it fires, assume that the initial discovery is finished.

@CodexRaunak
Contributor Author

Can you re-review it?
We are now using a debounce timer of 1 second and then triggering the resync.

@codecov

codecov bot commented Dec 13, 2025

Codecov Report

❌ Patch coverage is 0% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 7.31%. Comparing base (eed65e7) to head (1bbfe98).
⚠️ Report is 39 commits behind head on master.

Files with missing lines Patch % Lines
meshsync/handlers.go 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff            @@
##           master    #493      +/-   ##
=========================================
- Coverage    7.32%   7.31%   -0.01%     
=========================================
  Files          35      35              
  Lines        1762    1764       +2     
=========================================
  Hits          129     129              
- Misses       1623    1625       +2     
  Partials       10      10              
Flag Coverage Δ
unittests 7.31% <0.00%> (-0.01%) ⬇️


Signed-off-by: Raunak Madan <[email protected]>
Contributor

@n2h9 n2h9 left a comment

Good progress here 👍

I left a couple of comments.
Thank you 🙇‍♀️


if msg.Object != nil {
	kr, err := unmarshalObject(msg.Object)
	if err == nil && kr.Kind != "" {

I think we should handle the unmarshaling error, as we do in the other test cases (for both the discovery and the after-resync cases).
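
As an illustration of surfacing the error instead of silently skipping the message, here is a small self-contained sketch. The resource type and the decodeResource and collectKinds helpers are hypothetical stand-ins; the real unmarshalObject in the test suite may be implemented differently, and in a test the error would typically be reported through t.Fatalf or an assertion.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// resource is a hypothetical stand-in for the decoded Kubernetes object.
type resource struct {
	Kind string `json:"kind"`
}

// decodeResource round-trips obj through JSON into a resource.
// Hypothetical helper; the PR's unmarshalObject may differ.
func decodeResource(obj any) (resource, error) {
	var r resource
	raw, err := json.Marshal(obj)
	if err != nil {
		return r, fmt.Errorf("marshal broker object: %w", err)
	}
	if err := json.Unmarshal(raw, &r); err != nil {
		return r, fmt.Errorf("unmarshal broker object: %w", err)
	}
	return r, nil
}

// collectKinds returns the kinds of all non-nil objects and reports the
// first decode error to the caller instead of dropping the object.
func collectKinds(objects []any) ([]string, error) {
	var kinds []string
	for i, obj := range objects {
		if obj == nil {
			continue
		}
		r, err := decodeResource(obj)
		if err != nil {
			return nil, fmt.Errorf("object %d: %w", i, err)
		}
		if r.Kind != "" {
			kinds = append(kinds, r.Kind)
		}
	}
	return kinds, nil
}

func main() {
	kinds, err := collectKinds([]any{
		map[string]any{"kind": "Pod"},
		map[string]any{"kind": "Deployment"},
	})
	fmt.Println(kinds, err)

	// A value that JSON cannot encode exercises the error path.
	_, err = collectKinds([]any{make(chan int)})
	fmt.Println("error reported:", err != nil)
}
```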


// Stop once threshold reached
if len(afterResync) >= resyncSuccessThreshold {
	goto DONE

I think the purpose of this goto is to break out of the loop. In that case it is more common to put a label on the loop and use break <label> inside it; see Effective Go#switch (scroll down a bit to "break out of a surrounding loop").

Also, I think we have a single loop with no nested loops, so a simple break should do as well.
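
For reference, both forms are sketched below on a plain int channel (the function names are made up for the example). A plain break is enough in a single for ... range loop; a label becomes necessary only when the loop body is a select or switch, because there a bare break leaves just the select or switch.

```go
package main

import "fmt"

// collectUntil reads from in until threshold items have arrived or in closes.
func collectUntil(in <-chan int, threshold int) []int {
	var got []int
	for v := range in {
		got = append(got, v)
		if len(got) >= threshold {
			break // single loop, no select: a plain break is enough
		}
	}
	return got
}

// collectUntilSelect does the same inside a select, where a bare break
// would only leave the select, so the loop carries a label.
func collectUntilSelect(in <-chan int, done <-chan struct{}, threshold int) []int {
	var got []int
loop:
	for {
		select {
		case v, ok := <-in:
			if !ok {
				break loop
			}
			got = append(got, v)
			if len(got) >= threshold {
				break loop
			}
		case <-done:
			break loop
		}
	}
	return got
}

func main() {
	in := make(chan int, 5)
	for i := 1; i <= 5; i++ {
		in <- i
	}
	close(in)

	fmt.Println(collectUntil(in, 3))                            // [1 2 3]
	fmt.Println(collectUntilSelect(in, make(chan struct{}), 5)) // [4 5]
}
```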


resyncRequested := false

for msg := range out {

I think the current approach mixes together the "discovery" and "after discovery" stages, which look to me to be independent and sequential.
Triggering the resync message does not depend on the messages received, so it should probably be outside of the loop.

I suggest breaking this loop into 3 sequential pieces:

  • discovery stage: read from out and handle the debounce logic; when the debounce condition is met, stop reading from out;
  • trigger the resync message;
  • after-discovery stage: read from out again.

Development

Successfully merging this pull request may close these issues.

[integration tests] Add integration test for the case when meshsync receive messages from broker

2 participants