Can someone explain how he is running inference at decent speeds on a CPU or an integrated mobile GPU? That seems like the most important part, and he doesn't mention it at all.