Once you have three layers (i.e. one "hidden" layer) then you can map to arbitrary functions, so a three layer network has the same "power" as an arbitrarily large network.

I'm sure that's what the text book meant, rather than any point about the expense of computing power.