In practice, very short texts rarely carry much value, so watermarking them is (usually) less important. For longer texts, false positives are essentially a non-issue, since there are many more tokens to extract the signal from.
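To put rough numbers on the length argument, here is a minimal sketch. It assumes a "green list" style watermark, which is my assumption and not something stated above: each token of unwatermarked text lands on the green list with probability `gamma`, while watermarked text lands there at a higher rate `green_rate`. The function names and the 0.7 rate are made up for illustration, and the tail probability uses a normal approximation.

```python
import math

def expected_z(n_tokens, green_rate, gamma=0.5):
    """Expected z-score of watermarked text with n_tokens tokens.

    Assumes unwatermarked tokens are green with probability gamma and
    watermarked tokens are green with probability green_rate (hypothetical).
    """
    return (green_rate - gamma) * math.sqrt(n_tokens) / math.sqrt(gamma * (1 - gamma))

def false_positive_rate(z):
    """One-sided normal tail: chance that unwatermarked text scores at least z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

# Put the detection threshold at the z-score a watermarked text is expected
# to reach, then see how often unwatermarked text would clear it by chance.
for n in (25, 100, 400, 1600):
    z = expected_z(n, green_rate=0.7)
    print(f"{n:5d} tokens: z = {z:5.1f}, false positive rate = {false_positive_rate(z):.2e}")
```

The z-score grows with the square root of the token count, so the chance of unwatermarked text clearing the same bar falls off very quickly as the text gets longer.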