Use llama.cpp with CUDA.

The problem may be that it's a 7800XT, which handles memory contention by freezing.