But isn’t that what “training” is anyway? LLMs are trained like that today, and the database effectively becomes the parameters. You can then post-train on a smaller corpus for purpose-built stuff.