Skip to content

Panic-safety issue with Zip specializations #137255

@steffahn

Description

@steffahn

while trying to come up with a more precise formulation of TrustedRandomAccess properties, I wondered

  • what should size_hint’s behavior be on subsequent next_back calls?
    • answer: probably it should keep decrementing size by 1 for each call; i.e. length changes at the back via next_back are tracked in the size information, but __iterator_get_unchecked has no impact on the size
  • then… what should be the size_hint/size/len change when next_back is called but panics!?

That’s where I took a detour reviewing all the relevant code that uses it. And sure enough, this case wasn’t consistently handled.


E.g. this section

if A::MAY_HAVE_SIDE_EFFECT || B::MAY_HAVE_SIDE_EFFECT {
let sz_a = self.a.size();
let sz_b = self.b.size();
// Adjust a, b to equal length, make sure that only the first call
// of `next_back` does this, otherwise we will break the restriction
// on calls to `self.next_back()` after calling `get_unchecked()`.
if sz_a != sz_b {
let sz_a = self.a.size();
if A::MAY_HAVE_SIDE_EFFECT && sz_a > self.len {
for _ in 0..sz_a - self.len {
// since next_back() may panic we increment the counters beforehand
// to keep Zip's state in sync with the underlying iterator source
self.a_len -= 1;
self.a.next_back();
}
debug_assert_eq!(self.a_len, self.len);
}
let sz_b = self.b.size();
if B::MAY_HAVE_SIDE_EFFECT && sz_b > self.len {
for _ in 0..sz_b - self.len {
self.b.next_back();
}
}
}
}

containing in particular

self.a_len -= 1;
self.a.next_back();

– whilst being conditional’d under logic based on self.a.size()
for sure looks like it assumes that a panicking next_back call will always at least decrement the size.

But – funnily enough – the very same code section lives in a next_back impl – at the beginning of that impl – being code that can panic but up to this point nothing happened that would touch self.index or self.len, the information in Zip that determines size_hint.

fn size_hint(&self) -> (usize, Option<usize>) {
let len = self.len - self.index;
(len, Some(len))
}

So okay… this is potentially bad… well… it in fact is bad:


use std::{
    iter::zip,
    panic::{catch_unwind, resume_unwind, AssertUnwindSafe},
};

fn main() {
    let mut panic = true;
    let a = [42, 1337];
    let a2 = zip(
        a.iter().map(|i| {
            println!("reading item {i:>11} from slice");
            if panic {
                panic = false;
                resume_unwind(Box::new(()))
            }
        }),
        &[(); 1],
    ); // inner `zip` of length `1`, but `a` has length `2`:
    // so here inner.len == 1, inner.a_len == 2, inner.a.size() == 2

    let mut i = zip(a2, &[(); 0]); // outer `zip` of length `0`, but `a2` has length `1`
    // so here outer.len == 0, outer.a_len == 1, outer.a.size() == 1

    catch_unwind(AssertUnwindSafe(|| i.next_back())).ok(); // outer’s `next_back` tries to shorten outer.a by 1 element first
    // based on outer.a.size() == inner.size() == 1 > 0 == outer.len
    // which decrements `outer.a_len from 1 to 0`, then recurses into `inner`:

    // inner’s `next_back` tries to shorten inner.a by 1 element first
    // based on inner.a.size() == 2 > 1 == inner.len
    // which which decrements `inner.a_len from 2 to 1`, then calls `next_back` on the `Map<slice::Iter>`
    // shortening the slice iterator by 1 element, now `inner.a.size()` is 1
    // then panicking from the `Map`

    // panic is caught. NEW STATE:
    // ===========================
    // inner.len == 1, inner.a_len == 1, inner.a.size() == 1
    // outer.len == 0, outer.a_len == 0, outer.a.size() == … well … inner.len == 1

    // (note that all .index fields are untouched and still 0, else more accurately
    // the formula would be outer.a.size() == inner.len - inner.index)

    // the IMPORTANT effect: outer.a.size() is unchanged, still larger than outer.len.

    i.next_back(); // <- next line of code, one more `next_back`, no more `panic` necessary…
    // outer’s `next_back` tries to shorten outer.a by 1 element first
    // based on outer.a.size() == inner.size() == 1 > 0 == outer.len
    //                                         !!!!!!!!!!!!!!!!!!!!
    // which decrements `outer.a_len from 0 to …INTEGER UNDERFLOWS… to usize::MAX`, then recurses into `inner`:
    //                                         !!!!!!!!!!!!!!!!!!!!

    // inner’s `next_back` needs no further shortening, based on inner.a.size() == 1 == inner.b.size()
    // (which is the so-far untouched iterator over &[(); 1],)
    // so we reach the main logic which decrements inner.len (1 -> 0) and inner.a_len (1 -> 0),
    // and successfully reads the items at position 0

    // NEW STATE:
    // ==========
    // inner.len == 0, inner.a_len == 0, inner.a.size() == 1 [inner.b.size() == 1]
    // outer.len == 0, outer.a_len == usize::MAX, outer.a.size() == inner.len == 0

    // now … the `Zip` is empty (`outer.len == 0`) – okay really it was never not empty but anyway
    // we poll it from the front now, reaching the side-effects simulation code (linked below)

    println!("-----------------------------------");
    i.next(); // each of these now does the action of
/*          self.index += 1;
            self.len += 1;
            // match the base implementation's potential side effects
            // SAFETY: we just checked that `i` < `self.a.len()`
            unsafe {
                self.a.__iterator_get_unchecked(i); // `i` is previous value of `.index`
            }
*/  // which means for the first call, we get outer.len (0 -> 1) and outer.index(0 -> 1)
    // notably, this code is always executed since the overflow we had
    // disarmed the `self.index < self.a_len` conditional – which is always true with `outer.a_len == usize::MAX`
    // so this time we access `__iterator_get_unchecked(0)` which accesses the 42 again

    i.next(); // and this time it accesses `__iterator_get_unchecked(1)` which accesses the 1337 again
    i.next(); // and this time it accesses `__iterator_get_unchecked(2)` which is out of bounds for the slice :-/
}

(side-effect simulation code mentioned above)

} else if A::MAY_HAVE_SIDE_EFFECT && self.index < self.a_len {
let i = self.index;
// as above, increment before executing code that may panic
self.index += 1;
self.len += 1;
// match the base implementation's potential side effects
// SAFETY: we just checked that `i` < `self.a.len()`
unsafe {
self.a.__iterator_get_unchecked(i);
}
None
} else {

output:

reading item        1337 from slice
reading item          42 from slice
-----------------------------------
reading item          42 from slice
reading item        1337 from slice
reading item   957756800 from slice

And here is a playground version that runs this i.next() in a longer loop up to segfault

@rustbot label T-libs, A-iterators, A-specialization, I-unsound

Metadata

Metadata

Assignees

Labels

A-iteratorsArea: IteratorsA-specializationArea: Trait impl specializationC-bugCategory: This is a bug.I-unsoundIssue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/SoundnessP-highHigh priorityT-libsRelevant to the library team, which will review and decide on the PR/issue.

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions