Description
Hi, I am trying to implement dynamic core management, where I can set at runtime which cores each process runs on. At this point I have a fixed policy: whenever execution reaches certain functions, I migrate the entire process from one set of cores to another. That is, I want to run function X on core set B and the rest of the program on core set A.
Since each process already has a field called mask, I first added a new field called newmask to each process, which represents the new set of cores the process can run on (so mask is A and newmask is B in the example). I then added two zsim hooks in the source code, at the beginning and end of function X, which mark the start and end of the migration. Finally, I tried to add support for process migration in scheduler.h. I followed sync() in scheduler.h, which handles context switches, and my code looks like this (I have another function, migrateBack(), that is similar to this one):
    uint32_t migrate(uint32_t pid, uint32_t tid) {
        // migrate this thread to another set of cores
        futex_lock(&schedLock);
        uint32_t gid = getGid(pid, tid);
        assert((gidMap.find(gid) != gidMap.end()));
        ThreadInfo* th = gidMap[gid];
        ContextInfo* ctx = &contexts[th->cid];

        zinfo->cores[th->cid]->leave();
        deschedule(th, ctx, QUEUED);
        freeList.push_back(ctx);
        th->updateMask(); // update the core mask of this thread to newmask

        ctx = schedThread(th);
        if (ctx) {
            schedule(th, ctx);
            zinfo->cores[ctx->cid]->join();
            bar.join(ctx->cid, &schedLock);
            info("switched to core %u", ctx->cid);
        } else {
            runQueue.push_back(th);
            waitForContext(th);
        }
        assert(th->state == RUNNING);
        return th->cid;
    }
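To make the intent of the mask handling concrete, here is a minimal standalone sketch of the idea. It is not my actual zsim patch and does not use any zsim types: ThreadInfoSketch and pickFreeContext() are hypothetical names made up for illustration, and implementing updateMask() as a swap of mask and newmask (so that a second call migrates back) is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the per-thread scheduling state (not zsim's ThreadInfo).
struct ThreadInfoSketch {
    std::vector<bool> mask;     // cores the thread may currently run on (core set A)
    std::vector<bool> newmask;  // cores the thread may run on inside function X (core set B)

    // Assumption: migration swaps the two masks, so calling this a second
    // time restores the original core set.
    void updateMask() {
        assert(mask.size() == newmask.size());
        mask.swap(newmask);
    }
};

// Return the index of the first free core allowed by the thread's current
// mask, or -1 if every allowed core is busy (the thread would then be queued).
int32_t pickFreeContext(const ThreadInfoSketch& th, const std::vector<bool>& coreFree) {
    assert(th.mask.size() == coreFree.size());
    for (std::size_t c = 0; c < coreFree.size(); c++) {
        if (th.mask[c] && coreFree[c]) return static_cast<int32_t>(c);
    }
    return -1;
}
```

In this sketch, a return value of -1 corresponds to the else branch of migrate() above, where the thread is pushed onto the run queue and waits for a context.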
However, I am getting ACCESS_INVALID_ADDRESS after a thread is migrated to a new core, and the exception is thrown from insWindow.schedule(). I'm not sure what this exception means. Is there a memory leak somewhere? I am also getting a deadlock in sync() when another thread is scheduled onto the old core after the previous thread has been migrated to a new core. I'm not sure whether the second problem is related to the first.
I am having trouble figuring out what is wrong. I would appreciate any advice. Thanks!