You are describing a garbage-in, garbage-out problem. However, LLMs introduce a new type of issue: the “valid data in, garbage out” problem. The existence of the former doesn’t make the latter any less of an issue.
“Just” checking a million rows can be trivial, depending on the types of checks you’re running. In any case, you would never want a check that itself yields false positives and false negatives, since that defeats the entire purpose of the check.
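
To make the "trivial" case concrete, here is a rough sketch of the kind of deterministic, rule-based check I have in mind. The schema (`id`, `amount`), the range limits, and the helper names are all made up for illustration. The point is only that the same row always gets the same verdict, so the check itself never produces false positives or false negatives with respect to its own rules.

```python
import random


def validate_row(row):
    """Return a list of rule violations for one row (empty list means it passes)."""
    errors = []
    row_id = row.get("id")
    if not isinstance(row_id, int) or row_id < 0:
        errors.append("id must be a non-negative integer")
    amount = row.get("amount")
    if not isinstance(amount, (int, float)) or not (0 <= amount <= 10_000):
        errors.append("amount out of range")
    return errors


def make_rows(n, seed=0):
    """Build n synthetic rows that all satisfy the rules (hypothetical data)."""
    rng = random.Random(seed)
    return [{"id": i, "amount": rng.uniform(0, 10_000)} for i in range(n)]


def count_failures(rows):
    """Count rows with at least one violation."""
    return sum(1 for row in rows if validate_row(row))


if __name__ == "__main__":
    rows = make_rows(1_000_000)
    rows[123]["amount"] = -5.0  # corrupt a single row on purpose
    print(count_failures(rows))
```

A million rows of this is a single linear pass in plain Python. Checks that need semantic judgment are a different story, which is exactly the "depending on the types of checks" caveat.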