It's quite sad that application interoperability requires parsing text passed via pipes instead of exchanging structured information.

Like others said, worse is better.

Your comment takes about 630 bits; a screenshot of it on my computer takes 2.1 MB, roughly 27,000 times the size. Either that's compute overhead the LLM has to burn before it can even think about the meaning of the text, or, if it's an end-to-end feedforward architecture, less capacity left for thinking about it. This is easy for us because the retina pre-processes the visual stream so that less than 0.8% of it is sent to the visual cortex, and because we have evolved to extract meaning from vision very quickly and efficiently. It's a prime example of Moravec's paradox.
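A quick back-of-the-envelope check of that ratio, using the two sizes quoted above (630 bits of text, a 2.1 MB screenshot) as the assumed inputs:

```python
# Assumed figures from the comment above: ~630 bits of text vs. a 2.1 MB screenshot.
text_bits = 630
screenshot_bits = 2.1e6 * 8  # 2.1 MB expressed in bits

# How many times larger the pixel representation is than the text.
ratio = screenshot_bits / text_bits
print(f"screenshot is ~{ratio:,.0f}x the size of the text")
```

Same information content either way; the difference is purely how much raw signal the model has to chew through before it gets to the meaning.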