What order of magnitude would you define as "large" in this case?

Like over 1 TB.

Some people are using DuckDB for large datasets (see https://duckdb.org/docs/stable/guides/performance/working_wi... ), but you'd probably want to do some testing under the specific conditions of your rig to figure out whether it's a good match.

It's clear that many DuckDB SQL queries can handle terabytes of data, but the question here was about vector search.
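
For a rough sense of what "over 1 TB" means for vector search specifically, here is a back-of-envelope sketch. It only counts raw float32 vectors and ignores any index overhead. The dimensions (384, 768, 1536) and the 2 GB/s read throughput are illustrative assumptions, not numbers from this thread, and the helper names are made up for the example.

```python
# Back-of-envelope sizing: how many float32 embeddings fit in a byte budget,
# and how long one brute-force pass over them takes at a given read speed.
# Dimensions and throughput are illustrative assumptions; measure your own rig.

BYTES_PER_FLOAT32 = 4
ONE_TB = 10**12  # decimal terabyte


def vectors_in(budget_bytes: int, dims: int) -> int:
    """Number of raw float32 vectors of `dims` dimensions that fit in the budget."""
    return budget_bytes // (dims * BYTES_PER_FLOAT32)


def scan_seconds(budget_bytes: int, bytes_per_second: float) -> float:
    """Time for one sequential pass over the data at the given throughput."""
    return budget_bytes / bytes_per_second


if __name__ == "__main__":
    for dims in (384, 768, 1536):
        print(f"{dims:>5} dims: {vectors_in(ONE_TB, dims):,} vectors per TB")
    # Assumed 2 GB/s sustained read throughput.
    print(f"full scan at 2 GB/s: {scan_seconds(ONE_TB, 2e9):.0f} s")
```

So at 768 dimensions, 1 TB is on the order of a few hundred million vectors, and an exact scan per query is minutes rather than milliseconds under these assumptions. That is why the answer depends on whether an approximate index is usable at that size, which is a different question from whether ordinary SQL scans scale.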