> the amount of memory that gets copied is miniscule per process

This is not always true. Sometimes your processing context is gigabytes in size, in which case scaling via multiple processes is not as simple as it would be with threads, which share that context by design. You will run out of memory quickly if you spin up enough processes, and even with unlimited RAM, n processes use roughly n times the memory that threads sharing a single copy would need. If you implement some method of sharing memory between processes to consume less overall, it still comes at the expense of speed and complexity.
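
To make the thread case concrete, here is a rough sketch, not a benchmark. The names SharedContext, sumSlice, and parallelSum are made up for illustration, and the int array is just a small stand-in for a multi-GB context. Every worker thread reads the same array, so adding workers does not add copies of it:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SharedContext {
    // Sum one slice of the shared array. Read-only access, so no locking is needed.
    static long sumSlice(int[] context, int from, int to) {
        long sum = 0;
        for (int i = from; i < to; i++) {
            sum += context[i];
        }
        return sum;
    }

    // Split the work across `workers` threads that all read the same array instance.
    static long parallelSum(int[] context, int workers) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            int chunk = (context.length + workers - 1) / workers; // ceiling division
            List<Future<Long>> parts = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                final int from = Math.min(context.length, w * chunk);
                final int to = Math.min(context.length, from + chunk);
                parts.add(pool.submit(() -> sumSlice(context, from, to)));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get(); // wait for each slice to finish
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int[] context = new int[1_000_000]; // hypothetical stand-in for a large context
        Arrays.fill(context, 1);
        System.out.println(parallelSum(context, 4));
    }
}
```

With processes instead of threads, each worker would need its own copy of the context or some explicit shared-memory mechanism, which is where the extra memory or the extra complexity comes from.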

> Forgetting the main class there

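In Java 21+ you can omit it (it is a preview feature in 21 through 24, so it needs --enable-preview there; it is final as of Java 25):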

    void main() {
        System.out.println("Hello, World!");
    }

But this is neither here nor there. I think it's well worth paying that tax. You only type it once, right?