Description
while trying to come up with a more precise formulation of TrustedRandomAccess
properties, I wondered
- what should
size_hint
’s behavior be on subsequentnext_back
calls?- answer: probably it should keep decrementing
size
by 1 for each call; i.e. length changes at the back vianext_back
are tracked in thesize
information, but__iterator_get_unchecked
has no impact on the size
- answer: probably it should keep decrementing
- then… what should be the
size_hint
/size
/len
change whennext_back
is called but panics!?
That’s where I took a detour reviewing all the relevant code that uses it. And sure enough, this case wasn’t consistently handled.
E.g. this section
rust/library/core/src/iter/adapters/zip.rs
Lines 374 to 398 in f44efbf
containing in particular
rust/library/core/src/iter/adapters/zip.rs
Lines 386 to 387 in f44efbf
– whilst being conditional’d under logic based on self.a.size()
–
for sure looks like it assumes that a panicking next_back
call will always at least decrement the size.
But – funnily enough – the very same code section lives in a next_back
impl – at the beginning of that impl – being code that can panic but up to this point nothing happened that would touch self.index
or self.len
, the information in Zip
that determines size_hint
.
rust/library/core/src/iter/adapters/zip.rs
Lines 335 to 338 in f44efbf
So okay… this is potentially bad… well… it in fact is bad:
use std::{
iter::zip,
panic::{catch_unwind, resume_unwind, AssertUnwindSafe},
};
fn main() {
let mut panic = true;
let a = [42, 1337];
let a2 = zip(
a.iter().map(|i| {
println!("reading item {i:>11} from slice");
if panic {
panic = false;
resume_unwind(Box::new(()))
}
}),
&[(); 1],
); // inner `zip` of length `1`, but `a` has length `2`:
// so here inner.len == 1, inner.a_len == 2, inner.a.size() == 2
let mut i = zip(a2, &[(); 0]); // outer `zip` of length `0`, but `a2` has length `1`
// so here outer.len == 0, outer.a_len == 1, outer.a.size() == 1
catch_unwind(AssertUnwindSafe(|| i.next_back())).ok(); // outer’s `next_back` tries to shorten outer.a by 1 element first
// based on outer.a.size() == inner.size() == 1 > 0 == outer.len
// which decrements `outer.a_len from 1 to 0`, then recurses into `inner`:
// inner’s `next_back` tries to shorten inner.a by 1 element first
// based on inner.a.size() == 2 > 1 == inner.len
// which which decrements `inner.a_len from 2 to 1`, then calls `next_back` on the `Map<slice::Iter>`
// shortening the slice iterator by 1 element, now `inner.a.size()` is 1
// then panicking from the `Map`
// panic is caught. NEW STATE:
// ===========================
// inner.len == 1, inner.a_len == 1, inner.a.size() == 1
// outer.len == 0, outer.a_len == 0, outer.a.size() == … well … inner.len == 1
// (note that all .index fields are untouched and still 0, else more accurately
// the formula would be outer.a.size() == inner.len - inner.index)
// the IMPORTANT effect: outer.a.size() is unchanged, still larger than outer.len.
i.next_back(); // <- next line of code, one more `next_back`, no more `panic` necessary…
// outer’s `next_back` tries to shorten outer.a by 1 element first
// based on outer.a.size() == inner.size() == 1 > 0 == outer.len
// !!!!!!!!!!!!!!!!!!!!
// which decrements `outer.a_len from 0 to …INTEGER UNDERFLOWS… to usize::MAX`, then recurses into `inner`:
// !!!!!!!!!!!!!!!!!!!!
// inner’s `next_back` needs no further shortening, based on inner.a.size() == 1 == inner.b.size()
// (which is the so-far untouched iterator over &[(); 1],)
// so we reach the main logic which decrements inner.len (1 -> 0) and inner.a_len (1 -> 0),
// and successfully reads the items at position 0
// NEW STATE:
// ==========
// inner.len == 0, inner.a_len == 0, inner.a.size() == 1 [inner.b.size() == 1]
// outer.len == 0, outer.a_len == usize::MAX, outer.a.size() == inner.len == 0
// now … the `Zip` is empty (`outer.len == 0`) – okay really it was never not empty but anyway
// we poll it from the front now, reaching the side-effects simulation code (linked below)
println!("-----------------------------------");
i.next(); // each of these now does the action of
/* self.index += 1;
self.len += 1;
// match the base implementation's potential side effects
// SAFETY: we just checked that `i` < `self.a.len()`
unsafe {
self.a.__iterator_get_unchecked(i); // `i` is previous value of `.index`
}
*/ // which means for the first call, we get outer.len (0 -> 1) and outer.index(0 -> 1)
// notably, this code is always executed since the overflow we had
// disarmed the `self.index < self.a_len` conditional – which is always true with `outer.a_len == usize::MAX`
// so this time we access `__iterator_get_unchecked(0)` which accesses the 42 again
i.next(); // and this time it accesses `__iterator_get_unchecked(1)` which accesses the 1337 again
i.next(); // and this time it accesses `__iterator_get_unchecked(2)` which is out of bounds for the slice :-/
}
(side-effect simulation code mentioned above)
rust/library/core/src/iter/adapters/zip.rs
Lines 318 to 329 in f44efbf
output:
reading item 1337 from slice
reading item 42 from slice
-----------------------------------
reading item 42 from slice
reading item 1337 from slice
reading item 957756800 from slice
And here is a playground version that runs this i.next()
in a longer loop up to segfault
@rustbot label T-libs, A-iterators, A-specialization, I-unsound