I would use an agent (Codex) for this task. First, use the Pro model in ChatGPT for deep research, to assemble the information and citations. Then have Codex work through the citations systematically from a task list, web searching each one to verify or correct it. Used this way, Codex acts like a test suite for the citations.
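As a rough illustration of the task-list idea, here is a minimal Python sketch of what the agent could be handed. Everything in it is an assumption for illustration: the names `CitationTask`, `Status`, `build_task_list`, `to_checklist`, and `all_resolved` are hypothetical, the example claims and URLs are placeholders, and nothing here is a Codex or ChatGPT API. It only tracks state; the agent does the web search and updates each entry.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending"      # not yet checked
    VERIFIED = "verified"    # source supports the claim as written
    CORRECTED = "corrected"  # claim or citation was fixed after checking
    FAILED = "failed"        # no supporting source could be found


@dataclass
class CitationTask:
    claim: str               # the statement made in the research report
    source: str              # the citation given for it (URL or reference)
    status: Status = Status.PENDING
    note: str = ""           # what the agent found when it checked


def build_task_list(citations):
    """Turn (claim, source) pairs into one pending task per citation."""
    return [CitationTask(claim=c, source=s) for c, s in citations]


def to_checklist(tasks):
    """Render the tasks as a plain checklist the agent can work through."""
    lines = []
    for i, t in enumerate(tasks, start=1):
        done = t.status in (Status.VERIFIED, Status.CORRECTED)
        box = "x" if done else " "
        lines.append(f"- [{box}] {i}. {t.claim} ({t.source}) [{t.status.value}]")
    return "\n".join(lines)


def all_resolved(tasks):
    """The 'test suite' check: every citation is verified or corrected."""
    return all(t.status in (Status.VERIFIED, Status.CORRECTED) for t in tasks)


if __name__ == "__main__":
    # Placeholder data, not real citations.
    tasks = build_task_list([
        ("Example claim A", "https://example.com/a"),
        ("Example claim B", "https://example.com/b"),
    ])
    print(to_checklist(tasks))
    print("all resolved:", all_resolved(tasks))
```

The checklist text is what I would paste into the agent's instructions. `all_resolved` plays the role of the passing test run: the job is not done until it returns `True`.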