I mean, that's the dirty secret of any RAG chatbot: the concept of "grounding" is arbitrary. It doesn't matter whether you use embeddings or a tool that runs your usual search and takes the top results, like most web search tools or Google's. It still relies on the model not hallucinating given that info, which is very hard: too much info and the model gets confused; too little info and the model assumes the answer might not be there, so it's useless. The right balance depends on the user's query, and approaches like score cutoffs for embeddings just don't generalize.
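To make the cutoff problem concrete, here's a toy sketch (all vectors, chunk names, and threshold values are made up for illustration): the same fixed similarity cutoff that admits several chunks for an in-distribution query can admit nothing for an off-distribution one.

```python
# Toy illustration of why a fixed similarity cutoff doesn't generalize.
# The corpus "embeddings" below are hypothetical 2-d vectors, not real ones.

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return dot / (na * nb)

def retrieve(query_vec, corpus, cutoff):
    """Return chunk texts whose similarity to the query meets the cutoff."""
    scored = [(cosine(query_vec, vec), text) for text, vec in corpus]
    return [text for score, text in sorted(scored, reverse=True) if score >= cutoff]

corpus = [
    ("pricing page",   [0.9, 0.1]),
    ("refund policy",  [0.8, 0.2]),
    ("api rate limits", [0.1, 0.9]),
]

# A query near the corpus clears the 0.95 cutoff for multiple chunks...
broad = retrieve([0.85, 0.15], corpus, cutoff=0.95)
# ...while an equally legitimate query between clusters retrieves nothing,
# so the model is left to guess with zero context.
narrow = retrieve([0.5, 0.5], corpus, cutoff=0.95)

print(len(broad), len(narrow))  # → 2 0
```

Whatever cutoff you pick, some real user query lands on the wrong side of it, which is why per-query behavior, not a global threshold, is what actually matters.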

This is the exact same problem in coding assistants when they hallucinate functions or can't find the needed dependencies.

There are better and more complex approaches that use multiple agents to summarize different smaller queries and then iteratively build up an answer. Internally we, and a lot of companies, have them, but for external customer queries they're way too expensive. You can't spend 30 cents on every query.
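The shape of that multi-agent approach looks roughly like the sketch below. This is a hedged illustration, not any specific company's pipeline: `llm` and `retrieve` are stand-in stubs for a real model call and a real retriever, and the fan-out they imply is exactly where the per-query cost balloons.

```python
# Rough sketch of the "decompose -> retrieve -> summarize -> build up" pattern.
# All function names and prompts here are hypothetical placeholders.

def llm(prompt: str) -> str:
    # Placeholder: a real implementation would call a hosted model here,
    # and every one of these calls costs money and latency.
    return f"summary({prompt[:30]}...)"

def retrieve(subquery: str) -> list[str]:
    # Placeholder retriever: a real one would hit a vector index or search API.
    return [f"doc about {subquery}"]

def answer(query: str) -> str:
    # 1. Break the user query into smaller, focused sub-queries (1 model call).
    subqueries = [s.strip() for s in
                  llm(f"Split into sub-questions: {query}").split(";")]
    # 2. Retrieve and summarize context per sub-query (1 call each).
    partials = [llm(f"Summarize {retrieve(sq)} for: {sq}") for sq in subqueries]
    # 3. Iteratively fold partial summaries into one running answer
    #    (another call per merge step).
    draft = partials[0]
    for partial in partials[1:]:
        draft = llm(f"Merge: {draft} + {partial}")
    return draft
```

Each user query fans out into N sub-query calls plus merge calls, so a query that costs a fraction of a cent as a single prompt can easily hit tens of cents, which is fine for an internal tool but not for an external customer-facing chatbot.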