i wonder if multiple choice is the best method to test this. given the ubiquity of LLMs, perhaps an open-ended, free-text field would be better. that way you’re forced to define the word as you see fit, and an LLM checks your answer.
also, some of these words aren’t really good ‘obscure vocabulary’, just trivia crap. overall it feels a bit like AI slop and is too easy.