Since my comment, I remembered I had an RK3588 board, a Rock 5B, and tried llama.cpp CPU inference on it; performance was not amazing. But I also realized this board has LPDDR4X, so don't get the cheapest RK3588 boards. My Orange Pi 5 is actually worse, since it only has LPDDR4. Looking at the rest of Orange Pi's line-up, they don't actually have a board with both LPDDR5 and 32GB, only 16GB or LPDDR4(X).

Using llama-bench and Llama 2 7B Q4_0, like https://github.com/ggml-org/llama.cpp/discussions/10879, how does yours compare? I'm also comparing it with a few Ryzen 5 3000 series mini-PCs for under $150, which get 8 t/s on that list.
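For anyone wanting to reproduce the numbers, the invocation is roughly this (the model path is my assumption; the prompt/generation sizes are just llama-bench's defaults spelled out):

```shell
# CPU-only llama-bench run, matching the linked discussion's setup:
# Llama 2 7B quantized to Q4_0, default pp512 prompt and tg128 generation tests
./llama-bench -m models/llama-2-7b.Q4_0.gguf -p 512 -n 128
```

The tg128 (token generation) row is the t/s figure people usually quote, since generation is memory-bandwidth bound, which is exactly why LPDDR4 vs LPDDR4X vs LPDDR5 matters so much on these boards.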

With my Rock 5B and this bench, I get 3.65 t/s. On my Orange Pi 5 (not the B) with 8GB of LPDDR4 (not X), I get 2.44 t/s.