The phrase "any size" can be a little misleading, because we know there is a "lottery" effect going on during training, in which a much smaller neural net emerges that does all the correct predicting work, while the rest of the nodes get left behind as the class dummies. It is that winning smaller subgraph that gets poisoned.
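To make the "smaller subgraph" idea concrete, here is a toy sketch of magnitude pruning, which is one common way people expose such a subnetwork. It is only an illustration: the names `magnitude_mask` and `keep_fraction`, the random weights, and the 20% figure are all made up for the example, and nothing here is a real trained model.

```python
import random

def magnitude_mask(weights, keep_fraction):
    """Return a 0/1 mask that keeps the largest-magnitude weights.

    Hypothetical helper for illustration only.
    """
    k = max(1, int(round(len(weights) * keep_fraction)))
    # Indices ordered from largest to smallest absolute weight.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    keep = set(order[:k])
    return [1 if i in keep else 0 for i in range(len(weights))]

random.seed(0)
# Stand-in for one layer's weights (assumed values, not trained).
weights = [random.gauss(0.0, 1.0) for _ in range(20)]

mask = magnitude_mask(weights, keep_fraction=0.2)
# The "winning" subnetwork: everything outside the mask is zeroed out.
subnetwork = [w * m for w, m in zip(weights, mask)]
kept = sum(mask)
```

If the behaviour of the full net is mostly carried by the weights that survive a mask like this, then the total parameter count says less about how much has to be poisoned than the size of that surviving part does.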