There's only so much the image encoder can process at once. It's pretty apparent when you give the models a big table in an image.
But isn't this basically what the conv layer does...?