Seems to me the last thing you want is to have to worry about whether the LLM's context window is large enough to keep track of every duplicate. So I'd argue for deduplicating directly, rather than leaving it to the model, wherever possible.
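
To make that concrete, here's a minimal sketch of deduplicating up front in plain code. It assumes the items are plain strings and that "duplicate" means the same text once case and extra whitespace are ignored; `normalize` and `deduplicate` are just illustrative names, and your own definition of a duplicate may well differ.

```python
def normalize(text: str) -> str:
    # Assumed notion of "same item": identical after collapsing
    # whitespace runs and ignoring case. Adjust to your data.
    return " ".join(text.split()).casefold()


def deduplicate(items: list[str]) -> list[str]:
    # Keep the first occurrence of each item and preserve input order.
    seen: set[str] = set()
    unique: list[str] = []
    for item in items:
        key = normalize(item)
        if key not in seen:
            seen.add(key)
            unique.append(item)
    return unique


if __name__ == "__main__":
    records = ["Foo bar", "foo  bar", "baz", "Foo bar"]
    print(deduplicate(records))  # ['Foo bar', 'baz']
```

This runs in a single pass regardless of how many items there are, so the context window never enters into it; only the already-unique list would need to go to the model.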