The base model is Qwen2.5-VL-3B, and the announcement lists as a limitation: "Model can suffer from hallucination".
Seems a bit scary that the "source" text from the pdfs could actually be hallucinated.
Given that the input is an image and not the raw PDF, it's not completely unexpected.