Awesome, but given the size of the Apple Silicon install base and its typical configurations, how does this fare on an M1 with 8 GB of total RAM? I'd imagine that makes running another LLM for tool calls and inference tough to impossible.
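For a rough sense of the constraint, here's a back-of-envelope estimate (the `overhead_gb` figure for KV cache and runtime is an assumption, and actual usage varies by runtime and context length):

```python
def model_mem_gb(params_billions: float, bits_per_weight: int, overhead_gb: float = 1.0) -> float:
    """Rough RAM estimate: weight bytes plus a flat allowance for KV cache/runtime."""
    weights_gb = params_billions * bits_per_weight / 8  # 1e9 params * (bits/8) bytes ~= GB
    return weights_gb + overhead_gb

# A 7B model at 4-bit quantization:
print(round(model_mem_gb(7, 4), 1))  # ~4.5 GB
```

Since an 8 GB Mac shares that unified memory with macOS and everything else, a single quantized 7B model is already tight, and a second model alongside it looks unrealistic.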