Dynamic core pinning #233

@sc2682cornell

Hi, I am trying to implement dynamic core management, where I can set at runtime which cores each process runs on. For now I have a fixed policy: whenever execution reaches certain functions, I migrate the entire process from one set of cores to another. That is, I want to run function X on core set B and the rest of the program on core set A.

Each process already has a field called mask, so I first added a new field called newmask, which represents the new set of cores the process can run on (in the example above, mask is A and newmask is B). I then added two zsim hooks in the source code, at the beginning and end of function X, to mark the start and end of the migration. Finally, I tried to add support for process migration in scheduler.h. I followed sync() in scheduler.h, which handles context switches, and my code looks like this (I have another function, migrateBack(), that is similar to this one):

```cpp
uint32_t migrate(uint32_t pid, uint32_t tid) {
    // migrate this thread to another set of cores
    futex_lock(&schedLock);
    uint32_t gid = getGid(pid, tid);
    assert((gidMap.find(gid) != gidMap.end()));
    ThreadInfo* th = gidMap[gid];
    ContextInfo* ctx = &contexts[th->cid];
    zinfo->cores[th->cid]->leave();
    deschedule(th, ctx, QUEUED);
    freeList.push_back(ctx);
    th->updateMask(); // update the core mask of this thread to newmask
    ctx = schedThread(th);
    if (ctx) {
        schedule(th, ctx);
        zinfo->cores[ctx->cid]->join();
        bar.join(ctx->cid, &schedLock);
        info("switched to core %u", ctx->cid);
    } else {
        runQueue.push_back(th);
        waitForContext(th);
    }
    assert(th->state == RUNNING);
    return th->cid;
}
```
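To make the intended mask/newmask behavior concrete, here is a minimal standalone sketch of the bookkeeping only. It is not zsim code: `ThreadMasks`, `restoreMask()`, `allowed()`, and `pickCore()` are hypothetical names used for illustration, and `updateMask()` is assumed to simply swap the two masks.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical stand-in for the per-thread mask state; NOT zsim's ThreadInfo.
struct ThreadMasks {
    std::vector<bool> mask;     // cores the thread may currently run on (set A)
    std::vector<bool> newmask;  // cores to use while inside function X (set B)
    bool migrated;              // true between the start and end hooks

    ThreadMasks(std::vector<bool> a, std::vector<bool> b)
        : mask(std::move(a)), newmask(std::move(b)), migrated(false) {}

    // Assumed behavior of updateMask(): swap in the alternate core set.
    void updateMask() {
        assert(!migrated);  // the start hook must not fire twice in a row
        std::swap(mask, newmask);
        migrated = true;
    }

    // Hypothetical counterpart used by migrateBack(): restore the original set.
    void restoreMask() {
        assert(migrated);  // the end hook must follow a start hook
        std::swap(mask, newmask);
        migrated = false;
    }

    // True if core cid is permitted by the currently active mask.
    bool allowed(uint32_t cid) const {
        return cid < mask.size() && mask[cid];
    }
};

// Return the first free core permitted by the active mask, or -1 if none.
int pickCore(const ThreadMasks& th, const std::vector<bool>& coreFree) {
    for (uint32_t cid = 0; cid < coreFree.size(); cid++) {
        if (coreFree[cid] && th.allowed(cid)) return static_cast<int>(cid);
    }
    return -1;
}
```

With four cores, A = {0, 1} and B = {2, 3}, `pickCore()` should only return a core from B between `updateMask()` and `restoreMask()`, and it returns -1 when every core in the active set is busy, which corresponds to the branch in `migrate()` that queues the thread and waits for a context.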

However, I get an ACCESS_INVALID_ADDRESS exception after a thread is migrated to a new core; the exception is thrown from insWindow.schedule(). I'm not sure what this exception means. Could it be caused by a memory leak? I also get a deadlock when another thread is scheduled onto the old core after the previous thread has been migrated to a new core; the deadlock happens in the sync() function. I'm not sure whether the second problem is related to the first.

I am having trouble debugging what's wrong. I would appreciate any advice that may be helpful. Thanks!
