Animals do in fact "string words together", e.g. parrots. You're also misidentifying what "language" is. Language in this context is not just the ability to string words together. Consider a musician: when they learn to play an instrument, they are learning the language of that instrument. Notes are tokens; ensembles are sentences and paragraphs. I'm afraid you're experiencing confirmation bias, because every piece of evidence presented to you has been dismissed with things like "stringing together words is not synonymous with intelligence, since my cat can't do that".