If the above list gives the mistaken impression that flagging is basically random, that's an artifact of the way I cherry-picked the list. The flagging system has problems, for sure, but it's a vital part of how HN's system functions.
If you squint and look closely, though, I think you can see the system working in the above list. The weirdest "wtf?" cases of flagging are ones where the threads had a lot of comments and were on the frontpage. That means upvotes won the tug-of-war with flags, as they should have in most of those cases.
Conversely, if you look at the submissions in the list which had 0 comments or very few, it looks to me like most were either spam, low-quality articles, or dupes.
Remember, also, that some flags are just mistakes - the flag link is easy to fat-finger or misclick, and the UI doesn't provide any feedback when that happens. That's likely to change soon as part of work that tomhow and I are planning.