How are you handling the on-device speech pipeline, especially the tradeoffs among model size, latency, and accuracy on consumer hardware?