[BUG] Simultaneously creating a snapshot and updating the repository can potentially trigger an infinite loop to snapshot creation #17531

Open
@kkewwei

Description

Describe the bug

While adding a unit test for #17488, I found that creating a snapshot while simultaneously updating the repository can trigger an infinite loop in snapshot creation.

The reason is that when the repository is updated, RepositoriesService creates a new repository instance to replace the old one and closes the old one.

But the in-progress snapshot creation still depends on the closed repository, which triggers an infinite loop during snapshot creation.

if (repositoryMetadataStart.equals(getRepoMetadata(currentState))) {

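This condition always fails, because the closed repository still holds the old metadata while the cluster state holds the updated metadata, so the update is retried indefinitely through: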

executeConsistentStateUpdate(createUpdateTask, source, onFailure);
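
To make the loop easier to see, here is a minimal standalone sketch of the comparison described above. It is not the OpenSearch code: the class `StaleRepositoryRetrySketch`, the `Repo` type, the method `attemptsUntilConsistent`, and the setting values are all hypothetical, the metadata is reduced to a plain settings map, and the retry is capped only so that the example terminates.

```java
import java.util.Map;

public class StaleRepositoryRetrySketch {

    /** Hypothetical stand-in for a repository instance; it keeps the settings it was created with. */
    static final class Repo {
        private final Map<String, String> metadata;

        Repo(Map<String, String> metadata) {
            this.metadata = metadata;
        }

        /**
         * Models the retry: each attempt compares this instance's metadata with the
         * metadata in the current cluster state and retries on a mismatch.
         * Returns the attempt number that succeeded, or -1 if maxAttempts ran out.
         * The cap is an assumption of this sketch, added so that it terminates.
         */
        int attemptsUntilConsistent(Map<String, String> clusterStateMetadata, int maxAttempts) {
            for (int attempt = 1; attempt <= maxAttempts; attempt++) {
                if (metadata.equals(clusterStateMetadata)) {
                    return attempt;
                }
                // Mismatch: nothing on this instance changes, so the next attempt sees the same values.
            }
            return -1;
        }
    }

    public static void main(String[] args) {
        Map<String, String> oldSettings = Map.of("max_snapshot_bytes_per_sec", "100b");
        Map<String, String> newSettings = Map.of("max_snapshot_bytes_per_sec", "400b");

        Repo oldRepo = new Repo(oldSettings); // instance the in-flight snapshot still holds
        Repo newRepo = new Repo(newSettings); // replacement created by the repository update

        // After the update, the cluster state carries newSettings.
        System.out.println("old instance: " + oldRepo.attemptsUntilConsistent(newSettings, 1000)); // prints "old instance: -1"
        System.out.println("new instance: " + newRepo.attemptsUntilConsistent(newSettings, 1000)); // prints "new instance: 1"
    }
}
```

The old instance never matches because its stored metadata cannot change after it has been replaced, which is why a retry without a cap would run forever.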

Related component

Storage:Snapshots

To Reproduce

Add the following unit test to RepositoriesServiceIT and run it.

public void testSnapAndRestoreWhenChangeBytesPerSecSetting() throws ExecutionException, InterruptedException {
        // create index
        final InternalTestCluster cluster = internalCluster();
        String indexName = "test-index";
        createIndex(indexName, Settings.builder().put(SETTING_NUMBER_OF_REPLICAS, 0).put(SETTING_NUMBER_OF_SHARDS, 1).build());
        index(indexName, "_doc", "1", Collections.singletonMap("user",
            generateRandomStringArray(1, 1<<19, false, false)));
        flush(indexName);
        IndicesStatsRequest indicesStatsRequest = new IndicesStatsRequest().indices(indexName);
        long indexSize = client().admin().indices().stats(indicesStatsRequest).get().getIndex(indexName).getPrimaries().getStore().getSize().getBytes();
        System.out.println("index size: " + indexSize);

        // create repository
        final String repositoryName = "test-repo";
        // Throttle to indexSize/4 bytes per second so the snapshot takes about 4s
        long costSecond = indexSize/4;
        Settings.Builder repoSettings = Settings.builder().put("location", randomRepoPath()).put("max_snapshot_bytes_per_sec", (costSecond + "b")).put("max_restore_bytes_per_sec", (costSecond + "b"));
        OpenSearchIntegTestCase.putRepositoryWithNoSettingOverrides(
            client().admin().cluster(),
            repositoryName,
            FsRepository.TYPE,
            true,
            repoSettings
        );

        Thread thread = new Thread(() -> {
            String snapshotName = "test-snapshot";
            logger.info("--> starting snapshot");
            // Start the snapshot without waiting for it to complete
            client().admin()
                .cluster()
                .prepareCreateSnapshot(repositoryName, snapshotName)
                .setWaitForCompletion(false)
                .setIndices(indexName)
                .get();
            logger.info("--> snapshot request returned");
        });
        thread.start();


        logger.info("--> begin to reset repository");
        costSecond = indexSize;
        repoSettings = Settings.builder().put("location", randomRepoPath()).put("max_snapshot_bytes_per_sec", (costSecond + "b"));
        OpenSearchIntegTestCase.putRepositoryWithNoSettingOverrides(
            client().admin().cluster(),
            repositoryName,
            FsRepository.TYPE,
            true,
            repoSettings
        );
        logger.info("--> finish to reset repository");

        thread.join();
    }

Expected behavior

Snapshot creation should not loop forever when the repository is updated at the same time.

I'm trying to fix it.

Additional Details

No response
