Sure, I like using LLMs this way, and it often shows how important it is to verify: a claim is frequently "sourced" by what appears to be more of a fuzzy text or semantic match, sometimes even ignoring logical negations.
Especially in niche subjects.
For factual claims, I've fared better with Wikipedia and looking up the sources linked there.
Anyway, as AI text and media generation erodes the credibility of all online sources, these questions about source checking matter less and less: what if the source itself is a long and convincing-sounding text with poor sources?
This problem already existed before, but it boils down to a simple fact:
logic or maths alone cannot derive an authority that verifies claims about the real world; the only thing left is weighting texts.
The question "what is the current population of Paris" can be answered by LLMs, but basically only by weighting sources and assigning some credibility to them.
There's no real point in getting some weighted average of sources on this question, but so far, it doesn't hurt either.