Currently, Futhark's multicore backend only uses a reduce-then-scan. Switching to a chained-scan where possible should improve performance on its own, and should give a further gain for my current work on scan-scatter fusion. Some context for chained-scan can be found here.
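To make the difference concrete, here is a minimal sequential sketch of the two strategies in Python. It is not Futhark's implementation: the function names, the `chunk` parameter, and the plain loops standing in for parallel workers are all assumptions made for illustration. The point it shows is that reduce-then-scan reads the input twice (one reduce pass, one scan pass), while a chained scan reads it once and passes a running prefix from each chunk to the next.

```python
from operator import add


def split_chunks(xs, chunk):
    """Split xs into consecutive chunks of at most `chunk` elements."""
    return [xs[i:i + chunk] for i in range(0, len(xs), chunk)]


def reduce_then_scan(xs, chunk, op=add, ne=0):
    """Inclusive scan in two passes over the input (hypothetical sketch)."""
    chunks = split_chunks(xs, chunk)

    # Pass 1: reduce every chunk to a single total (parallel across chunks).
    totals = []
    for c in chunks:
        acc = ne
        for x in c:
            acc = op(acc, x)
        totals.append(acc)

    # Exclusive scan of the per-chunk totals gives each chunk its offset.
    offsets = []
    acc = ne
    for t in totals:
        offsets.append(acc)
        acc = op(acc, t)

    # Pass 2: re-read the input and scan each chunk starting from its offset.
    out = []
    for c, off in zip(chunks, offsets):
        acc = off
        for x in c:
            acc = op(acc, x)
            out.append(acc)
    return out


def chained_scan(xs, chunk, op=add, ne=0):
    """Inclusive scan in one pass over the input (hypothetical sketch).

    `carry` stands in for the prefix a chunk would wait for from its
    predecessor in a real parallel implementation.
    """
    out = []
    carry = ne
    for c in split_chunks(xs, chunk):
        # Local inclusive scan of this chunk; the only read of the input.
        local = []
        acc = ne
        for x in c:
            acc = op(acc, x)
            local.append(acc)

        # Combine with the predecessor's prefix, keeping operand order so
        # non-commutative operators still work.
        out.extend(op(carry, v) for v in local)

        # Publish this chunk's inclusive prefix for the successor.
        if local:
            carry = op(carry, local[-1])
    return out
```

Both functions compute the same inclusive scan for any associative `op` with neutral element `ne`. In a real parallel setting, the trade-off is that the chained variant introduces a dependency between neighbouring chunks in exchange for the single pass.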