The auto-labeling work (partially described at Tesla AI Day events) seems more like engineering than research: a grab bag of techniques that I would guess the whole team contributed to. For example, they auto-label low-resolution or indeterminate objects (image segments) by temporal continuity. Something that is a low-res blob in the distance becomes a hi-res, easy-to-identify object when you drive by it, so by tracking objects backwards across frames you can label the low-res blob more confidently. Things like this are useful, but it's the sort of stuff that engineers and developers come up with every day.
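To make the backward-tracking idea concrete, here is a minimal sketch of how it could work. It is not Tesla's pipeline. It assumes a tracker has already given each detection a track ID, and the names `Detection`, `backfill_labels`, and `min_confidence` are all made up for illustration. For each track it takes the most confident label and copies it back onto earlier, low-confidence detections of the same object.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Detection:
    frame: int              # frame index in the clip
    track_id: int           # ID from an upstream tracker (assumed to exist)
    label: Optional[str]    # None if the classifier could not decide
    confidence: float       # classifier confidence in [0, 1]


def backfill_labels(detections: List[Detection],
                    min_confidence: float = 0.8) -> List[Detection]:
    """Copy each track's most confident label back onto earlier weak detections."""
    # Pass 1: find the single most confident labelled detection per track.
    anchors: Dict[int, Detection] = {}
    for d in detections:
        if d.label is None or d.confidence < min_confidence:
            continue
        current = anchors.get(d.track_id)
        if current is None or d.confidence > current.confidence:
            anchors[d.track_id] = d

    # Pass 2: relabel earlier, low-confidence detections of the same track.
    result: List[Detection] = []
    for d in detections:
        anchor = anchors.get(d.track_id)
        if (anchor is not None
                and d.frame < anchor.frame
                and d.confidence < min_confidence):
            result.append(Detection(d.frame, d.track_id, anchor.label, d.confidence))
        else:
            result.append(d)
    return result
```

A track with no confident detection anywhere is left untouched, so the sketch never invents a label. It only moves one that was already earned at close range.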
Not back in 2016.
You don't think that tracking objects from frame to frame is obvious?!
I can guarantee you this was built in from day one.
I'm guessing you're not a developer if you don't automatically think of edge cases like "what if car #1 isn't in the preceding frame?" (then you look at some relevant test data and see it was there, unlabelled...)