Description
Ideas
Segmented Flat Array - Custom Domain
Use the segmented idea from FlatObjectPool to allow concurrent operations during resizing. Could potentially work with distributions as well as layouts.
FCHLock - Localization
Localize the data when we obtain combiner status, perform the combined operations on the local copy, then write it back later.
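A minimal sketch of the localization idea, assuming a hypothetical FCHLock API in which acquiring the lock reports combiner status ('acquire', 'requests', and 'apply' are illustrative names, not an actual API):

```chapel
// Sketch only: all names here are hypothetical.
if lock.acquire() == Status.Combiner {
  var localData = sharedData;            // localize: one remote read
  for req in lock.requests() do          // serve pending requests locally
    req.result = apply(req.op, localData);
  sharedData = localData;                // write back: one remote write
  lock.release();
}
```

The point is that the combiner pays the communication cost once per combining pass, rather than once per request.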
MPMC List - Per Task Blocks
Maintain here.maxTaskPar blocks of data, one block per task.
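This could look roughly as follows, assuming each task can derive a stable slot index (the task-id parameter is an assumption for illustration):

```chapel
use List;

// Sketch: one block of data per task slot on this locale.
record Block {
  var data: list(int);                 // element type is illustrative
}

var blocks: [0..#here.maxTaskPar] Block;

proc add(taskId: int, x: int) {
  // Each task touches only its own block, avoiding contention.
  blocks[taskId % here.maxTaskPar].data.pushBack(x);
}
```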
Global Queue using GDT
Idea: Enqueue 'promises' like normal, but this time perform an exchange operation on the tail.
// Inside 'add' function...
var node = new unmanaged Node();   // assumes a 'Node' class with a 'next' field
var prev = tail.exchange(node);    // wait-free: claim the old tail
if prev == nil then
  head.write(node);                // queue was empty; node is also the head
else
  prev.next = node;                // link the old tail to the new node
Of course, using GDT for global objects!
Active Request Bias
Have separate active-request lists so that a locale serves only its own tasks. Fairness or starvation-freedom can be ensured by giving the combiner thread access to another locale's requests when that locale has pending work.
FIFO Extension
Use the above algorithm for obtaining the head index. This eliminates contention and ensures good system throughput (aside from communication cost, which is necessary anyway). Where the number of contending compare-and-swap operations would otherwise be O(numLocales), here it is zero, with the drawback of requiring communication (which is unavoidable).
Benchmark Framework Tool
Idea: Convert this from Go to Chapel so that benchmarks and tests can be run with more ease.
Queue
Global wait free queue + FCHLock
Idea: Wait-freedom guarantees a bounded number of remote operations, which ensures forward progress and (eventual) scalability. By using global atomics, we can make use of a bounded global queue that is proven to scale very well. Since the queue is only SPMC (without MCAS), we can use the flat-combining strategy to ensure scalability even with a single producer. Consumers use the normal wait-free dequeue.
Overall this ensures global wait-freedom! The queue itself is wait-free, and because the operations to be combined are wait-free, the wait for a task to have its operation completed is bounded, assuming that scheduling inside FCHLock is adequate. Overall, this can guarantee scalability for bounded, globally FIFO queues.
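The shape of the combination might be sketched like this ('fcLock', 'EnqueueRequest', and 'waitFreeDequeue' are illustrative names, not an actual API):

```chapel
// Producers funnel through flat combining, so the underlying SPMC queue
// only ever sees a single producer: the current combiner.
proc enqueue(x: int) {
  fcLock.combine(new EnqueueRequest(x));
}

// Consumers bypass the lock and use the queue's wait-free dequeue directly.
proc dequeue(): int {
  return queue.waitFreeDequeue();
}
```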
Round Robin + Work Stealing
Idea: Reduce the amount of work stealing by performing a round-robin enqueue like FIFO does.
Rationale: The queue has more consumers than producers, meaning it won't need to steal work too often. This might increase the overall throughput of the queue.
Drawback: Enqueue does scale but is orders of magnitude slower than a local enqueue. Good only if enqueue is not a high priority.
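A round-robin enqueue could be sketched as follows ('localQueue' is an assumed per-locale queue, not an actual API):

```chapel
var rr: atomic int;

proc enqueue(x: int) {
  // Spread enqueues across locales like FIFO does, so that consumers
  // find local work and rarely need to steal.
  const loc = Locales[rr.fetchAdd(1) % numLocales];
  on loc do localQueue.enqueue(x);
}
```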
Optimization
FIFO
Allow enqueues and dequeues to skip indexes...
Idea: Have enqueue and dequeue rely solely on fetchAdd to claim an index, then have them update another set of global arrays using fetchAdd for enqueue or fetchSub for dequeue. If during an enqueue a fetchAdd yields a negative value, skip that index. If during a dequeue a fetchSub yields a negative value, skip it.
Rationale: Enqueues proceed unhindered with only one additional wait-free atomic operation, and dequeues rely on just two wait-free atomic operations.
Drawback: A large number of dequeues means an equally large number of enqueues must retry, but the scheme is fair and should scale.
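The skip logic might be sketched like this, with 'slots[i]' counting pending elements at index i (all names and the wraparound scheme are illustrative assumptions):

```chapel
var slots: [0..#capacity] atomic int;   // per-index element counters
var buffer: [0..#capacity] int;
var enqIdx, deqIdx: atomic int;

proc enqueue(x: int) {
  do {
    const i = enqIdx.fetchAdd(1) % capacity;   // claim an index, wait-free
    // A negative result means a dequeuer already gave up on this slot: skip.
    if slots[i].fetchAdd(1) >= 0 {
      buffer[i] = x;
      return;
    }
  } while true;
}

proc dequeue(): (bool, int) {
  const i = deqIdx.fetchAdd(1) % capacity;
  // A fetchSub yielding a non-positive old value means the slot is empty: skip.
  if slots[i].fetchSub(1) > 0 then
    return (true, buffer[i]);
  return (false, 0);
}
```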
Abstractions
QueueSet
Idea: Have a set of queues linked together that allow work stealing. This is the MPMC queue, but in a more reusable form that allows multiple queues, even multiple local queues, all linked together.
RCU With Descriptors
Idea: Have descriptors into a locale's address space which adds one more step of indirection.
Rationale: Allows atomic updates of variables, especially with RCU for record types. A compare-and-swap becomes a single operation on descriptors; reads and writes likewise operate only on descriptors. RCU can read the descriptor, copy and modify what was read into a new descriptor, and swap the pointer to the new one (with reference counting to reclaim the old one).
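One way to picture a descriptor-based RCU update for a record type. This is a sketch only: Chapel atomics do not directly support class references, so the atomic descriptor pointer here is purely for illustration.

```chapel
record Data { var x, y: int; }

class Descriptor {
  var payload: Data;
  var refCount: atomic int;            // reclaim old descriptors by refcount
}

var current: atomic unmanaged Descriptor;   // one extra level of indirection

proc rcuUpdate(modify) {
  do {
    const old = current.read();                        // read the descriptor
    var copy = new unmanaged Descriptor(old.payload);  // copy...
    modify(copy.payload);                              // ...and modify
    // Replace: the compare-and-swap touches only the descriptor pointer.
  } while !current.compareAndSwap(old, copy);
  // The old descriptor is dropped via its reference count (not shown).
}
```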