I read this a few weeks ago and was inspired to run some experiments of my own on Compiler Explorer. I found the following interesting things:

  // This compiles down to a memmove() call
  let my_vec: Vec<_> = my_vec.into_iter().skip(n).collect();

  // This results in significantly smaller machine code than `v.retain(f)`
  let v: Vec<_> = v.into_iter().filter(f).collect();

This was all with `-C opt-level=2`. I only looked at generated code size; I didn't have time to benchmark any of these.
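
For anyone who wants to repeat this, here is a minimal sketch of the kind of thing I mean, not the exact code I used. The function names and the `u32` element type are made up for illustration. The functions are `pub` so that the compiler actually emits them and you can compare the assembly side by side. It says nothing about speed; it only checks that the `retain` and `filter` variants produce the same result.

```rust
// Hypothetical helper names; only the iterator chains come from the comment above.

/// Drops the first `n` elements by consuming the Vec and collecting the rest.
pub fn drop_first_n(v: Vec<u32>, n: usize) -> Vec<u32> {
    v.into_iter().skip(n).collect()
}

/// Keeps the elements for which `f` returns true, using `Vec::retain`.
pub fn keep_with_retain(mut v: Vec<u32>, f: impl Fn(&u32) -> bool) -> Vec<u32> {
    v.retain(|x| f(x));
    v
}

/// Same filtering, written as into_iter + filter + collect.
pub fn keep_with_filter(v: Vec<u32>, f: impl Fn(&u32) -> bool) -> Vec<u32> {
    v.into_iter().filter(|x| f(x)).collect()
}

fn main() {
    let v: Vec<u32> = (0..10).collect();
    let is_even = |x: &u32| x % 2 == 0;

    // Skipping the first three elements leaves 3..10.
    assert_eq!(drop_first_n(v.clone(), 3), vec![3, 4, 5, 6, 7, 8, 9]);

    // Both filtering variants agree on the result and on element order.
    assert_eq!(keep_with_retain(v.clone(), is_even), keep_with_filter(v, is_even));
}
```

Paste the three functions into Compiler Explorer with `-C opt-level=2` and compare the output for each; the result may well differ between compiler versions.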