Panic-safety issue with `Zip` specializations

while trying to come up with a more precise formulation of `TrustedRandomAccess` properties, I wondered
* what should `size_hint`’s behavior be on subsequent `next_back` calls?
  * answer: probably it should keep decrementing `size` by 1 for each call; i.e. length changes at the back via `next_back` are tracked in the `size` information, but `__iterator_get_unchecked` has no impact on the size
* then… what should be the `size_hint`/`size`/`len` change when `next_back` is called but *panics*!?

That’s where I took a detour reviewing all the relevant code that uses it. And sure enough, this case wasn’t consistently handled.

---

E.g. this section

https://github.com/rust-lang/rust/blob/f44efbf9e11b1b6bba77c046d7dd150de37e0e0f/library/core/src/iter/adapters/zip.rs#L374-L398

containing in particular

https://github.com/rust-lang/rust/blob/f44efbf9e11b1b6bba77c046d7dd150de37e0e0f/library/core/src/iter/adapters/zip.rs#L386-L387

_– whilst being conditional’d under logic based on `self.a.size()` –_
for sure looks like it assumes that a panicking `next_back` call *will* always at least decrement the size.

But – funnily enough – the *very same* code section lives in a `next_back` impl – at the beginning of that impl – being code that **can panic** but up to this point nothing happened that would touch `self.index` or `self.len`, the information in `Zip` that determines `size_hint`.

https://github.com/rust-lang/rust/blob/f44efbf9e11b1b6bba77c046d7dd150de37e0e0f/library/core/src/iter/adapters/zip.rs#L335-L338

So okay… this is potentially bad… well… it in fact **is bad**:

---

```rust
use std::{
    iter::zip,
    panic::{catch_unwind, resume_unwind, AssertUnwindSafe},
};

fn main() {
    let mut panic = true;
    let a = [42, 1337];
    let a2 = zip(
        a.iter().map(|i| {
            println!("reading item {i:>11} from slice");
            if panic {
                panic = false;
                resume_unwind(Box::new(()))
            }
        }),
        &[(); 1],
    ); // inner `zip` of length `1`, but `a` has length `2`:
    // so here inner.len == 1, inner.a_len == 2, inner.a.size() == 2

    let mut i = zip(a2, &[(); 0]); // outer `zip` of length `0`, but `a2` has length `1`
    // so here outer.len == 0, outer.a_len == 1, outer.a.size() == 1

    catch_unwind(AssertUnwindSafe(|| i.next_back())).ok(); // outer’s `next_back` tries to shorten outer.a by 1 element first
    // based on outer.a.size() == inner.size() == 1 > 0 == outer.len
    // which decrements `outer.a_len from 1 to 0`, then recurses into `inner`:

    // inner’s `next_back` tries to shorten inner.a by 1 element first
    // based on inner.a.size() == 2 > 1 == inner.len
    // which which decrements `inner.a_len from 2 to 1`, then calls `next_back` on the `Map<slice::Iter>`
    // shortening the slice iterator by 1 element, now `inner.a.size()` is 1
    // then panicking from the `Map`

    // panic is caught. NEW STATE:
    // ===========================
    // inner.len == 1, inner.a_len == 1, inner.a.size() == 1
    // outer.len == 0, outer.a_len == 0, outer.a.size() == … well … inner.len == 1

    // (note that all .index fields are untouched and still 0, else more accurately
    // the formula would be outer.a.size() == inner.len - inner.index)

    // the IMPORTANT effect: outer.a.size() is unchanged, still larger than outer.len.

    i.next_back(); // <- next line of code, one more `next_back`, no more `panic` necessary…
    // outer’s `next_back` tries to shorten outer.a by 1 element first
    // based on outer.a.size() == inner.size() == 1 > 0 == outer.len
    //                                         !!!!!!!!!!!!!!!!!!!!
    // which decrements `outer.a_len from 0 to …INTEGER UNDERFLOWS… to usize::MAX`, then recurses into `inner`:
    //                                         !!!!!!!!!!!!!!!!!!!!

    // inner’s `next_back` needs no further shortening, based on inner.a.size() == 1 == inner.b.size()
    // (which is the so-far untouched iterator over &[(); 1],)
    // so we reach the main logic which decrements inner.len (1 -> 0) and inner.a_len (1 -> 0),
    // and successfully reads the items at position 0

    // NEW STATE:
    // ==========
    // inner.len == 0, inner.a_len == 0, inner.a.size() == 1 [inner.b.size() == 1]
    // outer.len == 0, outer.a_len == usize::MAX, outer.a.size() == inner.len == 0

    // now … the `Zip` is empty (`outer.len == 0`) – okay really it was never not empty but anyway
    // we poll it from the front now, reaching the side-effects simulation code (linked below)

    println!("-----------------------------------");
    i.next(); // each of these now does the action of
/*          self.index += 1;
            self.len += 1;
            // match the base implementation's potential side effects
            // SAFETY: we just checked that `i` < `self.a.len()`
            unsafe {
                self.a.__iterator_get_unchecked(i); // `i` is previous value of `.index`
            }
*/  // which means for the first call, we get outer.len (0 -> 1) and outer.index(0 -> 1)
    // notably, this code is always executed since the overflow we had
    // disarmed the `self.index < self.a_len` conditional – which is always true with `outer.a_len == usize::MAX`
    // so this time we access `__iterator_get_unchecked(0)` which accesses the 42 again

    i.next(); // and this time it accesses `__iterator_get_unchecked(1)` which accesses the 1337 again
    i.next(); // and this time it accesses `__iterator_get_unchecked(2)` which is out of bounds for the slice :-/
}
```

(side-effect simulation code mentioned above)

https://github.com/rust-lang/rust/blob/f44efbf9e11b1b6bba77c046d7dd150de37e0e0f/library/core/src/iter/adapters/zip.rs#L318-L329

output:
```
reading item        1337 from slice
reading item          42 from slice
-----------------------------------
reading item          42 from slice
reading item        1337 from slice
reading item   957756800 from slice
```


And [here is a playground version](https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=14bf7a016f410203bdab0237c748c458) that runs this `i.next()` in a longer loop up to segfault

@rustbot label T-libs, A-iterators, A-specialization, I-unsound

	if A::MAY_HAVE_SIDE_EFFECT \|\| B::MAY_HAVE_SIDE_EFFECT {
	let sz_a = self.a.size();
	let sz_b = self.b.size();
	// Adjust a, b to equal length, make sure that only the first call
	// of `next_back` does this, otherwise we will break the restriction
	// on calls to `self.next_back()` after calling `get_unchecked()`.
	if sz_a != sz_b {
	let sz_a = self.a.size();
	if A::MAY_HAVE_SIDE_EFFECT && sz_a > self.len {
	for _ in 0..sz_a - self.len {
	// since next_back() may panic we increment the counters beforehand
	// to keep Zip's state in sync with the underlying iterator source
	self.a_len -= 1;
	self.a.next_back();
	}
	debug_assert_eq!(self.a_len, self.len);
	}
	let sz_b = self.b.size();
	if B::MAY_HAVE_SIDE_EFFECT && sz_b > self.len {
	for _ in 0..sz_b - self.len {
	self.b.next_back();
	}
	}
	}
	}

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Panic-safety issue with `Zip` specializations #137255

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

	fn size_hint(&self) -> (usize, Option<usize>) {
	let len = self.len - self.index;
	(len, Some(len))
	}

	} else if A::MAY_HAVE_SIDE_EFFECT && self.index < self.a_len {
	let i = self.index;
	// as above, increment before executing code that may panic
	self.index += 1;
	self.len += 1;
	// match the base implementation's potential side effects
	// SAFETY: we just checked that `i` < `self.a.len()`
	unsafe {
	self.a.__iterator_get_unchecked(i);
	}
	None
	} else {

Panic-safety issue with Zip specializations #137255

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions

Panic-safety issue with `Zip` specializations #137255