Fix race condition in Manager.partitionMigrations #5531
Conversation
I've seen problems with the way this code works before. It would probably be best to move balancing out of the status thread and give it its own thread. We could have a balancing thread for each data level; with this setup each balancing thread would only read the migrations for its level, which would avoid trying to read all data levels at once. It would also be good to minimize the dependencies between balancing and the TGWs as much as possible. Because of fundamental problems in the existing code, the fix in this PR may avoid the problem in some situations, but the status thread could still get stuck if things change after it has done its checks.
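The per-level balancing suggestion above could be sketched roughly as follows. This is a hypothetical illustration, not the actual Accumulo implementation: DataLevel, the thread naming, and the scanMigrationsFor() stub are all stand-ins for the real Manager code.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

// Sketch: one balancing thread per DataLevel, so a scan that hangs for one
// level (e.g. an unhosted root tablet) cannot stall the other levels or the
// status thread. All names here are illustrative stand-ins.
public class PerLevelBalancerSketch {
  public enum DataLevel { ROOT, METADATA, USER }

  private final Map<DataLevel, Thread> balancers = new EnumMap<>(DataLevel.class);

  public void start() {
    for (DataLevel level : DataLevel.values()) {
      Thread t = new Thread(() -> balanceLoop(level), "balancer-" + level);
      t.setDaemon(true); // don't keep the JVM alive for balancing alone
      t.start();
      balancers.put(level, t);
    }
  }

  private void balanceLoop(DataLevel level) {
    while (!Thread.currentThread().isInterrupted()) {
      // Each thread only reads the migrations for its own level, so a hung
      // scan of the root tablet delays ROOT balancing without blocking USER.
      scanMigrationsFor(level);
      try {
        TimeUnit.SECONDS.sleep(1);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
    }
  }

  private void scanMigrationsFor(DataLevel level) {
    // stub: the real code would scan the migration column for this level
  }

  public Map<DataLevel, Thread> threads() {
    return balancers;
  }
}
```

A stuck ROOT scan would then only wedge the `balancer-ROOT` thread, leaving status reporting and the other levels free to make progress.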
Saw #5533 recently and it's related to the overall problems with this general code.
if (watchers.size() != 3) {
  log.debug("Skipping migration check, not all TabletGroupWatchers are started");
  skipMigrationCheck = true;
} else {
  for (TabletGroupWatcher watcher : watchers) {
    if (!watcher.isAlive()) {
      log.debug("Skipping migration check, not all TabletGroupWatchers are started");
      skipMigrationCheck = true;
      break;
    }
I had a problem with one of the ITs in this case where the TGWs wouldn't be started and it would hang on the scan. It hung every time, then suddenly after a few changes it stopped hanging, so I assumed it was fixed. Looking back, I should have still been checking that the TGWs were started...
This is a good catch and looks good to me
// Don't try to check migrations if the Root and Metadata
// tables are not hosted.
boolean skipMigrationCheck = false;
I'm thinking we want to do this check everywhere we scan for migrations. This is done in several places in Manager.
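One hedged way to apply this check "everywhere" would be to centralize the guard instead of duplicating it at each migration-scan call site. This is a hypothetical helper, not code from the PR; TabletGroupWatcher here is a minimal stand-in for the real class.

```java
import java.util.List;

// Sketch: a single guard capturing "all three TabletGroupWatchers (ROOT,
// METADATA, USER) are running", so every migration scan in Manager can call
// this instead of repeating the size/isAlive checks inline.
public class MigrationScanGuard {
  public interface TabletGroupWatcher {
    boolean isAlive();
  }

  /** True only when all three watchers exist and are alive. */
  public static boolean canScanMigrations(List<TabletGroupWatcher> watchers) {
    return watchers.size() == 3
        && watchers.stream().allMatch(TabletGroupWatcher::isAlive);
  }
}
```

Each call site then reduces to `if (!MigrationScanGuard.canScanMigrations(watchers)) { /* skip this cycle */ }`.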
It would be better to structure the code such that it is ok if the thread gets stuck trying to read migrations.
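One way to structure the code along those lines would be to run the migration scan on a worker thread and bound the wait, so the caller (e.g. the status thread) skips a cycle instead of wedging when the scan blocks. This is a sketch under assumptions: BoundedMigrationScan and the readMigrations() stub are hypothetical, not the Manager's actual API.

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Sketch: bound the time spent waiting on a migration scan. If the scan is
// stuck (say, the root tablet has no location yet), give up for this cycle
// and return an empty result rather than blocking the caller indefinitely.
public class BoundedMigrationScan {
  private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
    Thread t = new Thread(r, "migration-scan");
    t.setDaemon(true);
    return t;
  });

  public Set<String> partitionMigrations(long timeout, TimeUnit unit) {
    Future<Set<String>> f = worker.submit(this::readMigrations);
    try {
      return f.get(timeout, unit);
    } catch (TimeoutException e) {
      f.cancel(true); // skip this cycle rather than wedging the caller
      return Collections.emptySet();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return Collections.emptySet();
    } catch (ExecutionException e) {
      return Collections.emptySet();
    }
  }

  protected Set<String> readMigrations() {
    // stub: the real code scans the migration columns and may block on an
    // unhosted tablet; "2<" is a made-up extent-style value for illustration
    return Set.of("2<");
  }
}
```

With this shape the status thread stays live even when a scan hangs, which also lets it keep populating state (like tserverStatus) that the hung scan may itself depend on.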
if (level == DataLevel.ROOT || level == DataLevel.METADATA) {
  final TableId tid = level == DataLevel.ROOT ? SystemTables.ROOT.tableId()
      : SystemTables.METADATA.tableId();
This use of DataLevel doesn't seem correct. See:
accumulo/core/src/main/java/org/apache/accumulo/core/metadata/schema/Ample.java
Lines 82 to 85 in f676217
public enum DataLevel {
  ROOT(null, null),
  METADATA(SystemTables.ROOT.tableName(), SystemTables.ROOT.tableId()),
  USER(SystemTables.METADATA.tableName(), SystemTables.METADATA.tableId());
Closing this in favor of a different solution. @keith-turner created #5533 and suggested above a different change in the Manager where balancing is done in its own thread.
Opened #5537 as a first step in cleaning up some of the code and thread dependencies in the balancing code.
PR #5416 modified the Manager to move the migrations from an in-memory data structure
in the Manager to new columns in the root and metadata tables. The Manager.partitionMigrations method was changed to scan the migration column of the root and metadata tables to gather the current migrations. However, if the root and metadata tables are not hosted, then this scan will hang until the tablet locations can be resolved.
ScanServerUpgrade11to12TestIT.testScanRefTableCreation has been failing since #5416 was merged. The test deletes the scanref table, shuts down the TabletServers and Manager, then restarts them. This leaves the root tablet in a state where it needs to perform recovery. The Manager.partitionMigrations method is called via the StatusThread, and it appears that the Manager starts up and the StatusThread gets hung trying to scan the root tablet (it's waiting for a location). Meanwhile, the root tablet can't be assigned a location because the Manager.tserverStatus map is not populated, which is done from the StatusThread as well.
Also modified the IT to set the filesystem to the RawLocalFileSystem so that warnings about missing checksum files were not in the logs.