The paper is a bit old, but it matches the current empirical recommendation: a good starting point is the biggest model you can fit at 4-bit.
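
To make "fits at 4-bit" concrete, here is a rough back-of-the-envelope sketch. The function names, the candidate sizes, and the 1.2 overhead factor are all assumptions for illustration, not something from the paper. Real 4-bit formats also store scales and zero-points, so the effective bits per weight is a little above 4, and the KV cache and activations are only covered by the crude overhead multiplier.

```python
def weight_gib(params_billions: float, bits: int = 4) -> float:
    """Approximate weight storage in GiB at `bits` per parameter."""
    total_bytes = params_billions * 1e9 * bits / 8
    return total_bytes / 2**30


def largest_fitting(sizes_billions, budget_gib, bits=4, overhead=1.2):
    """Return the largest size (in billions of params) whose estimate fits.

    `overhead` is a hypothetical fudge factor for quantization metadata,
    KV cache, and activations; tune it for your own setup.
    Returns None if nothing fits.
    """
    fitting = [
        s for s in sizes_billions
        if weight_gib(s, bits) * overhead <= budget_gib
    ]
    return max(fitting, default=None)


if __name__ == "__main__":
    # Hypothetical candidate sizes and a 24 GiB memory budget.
    candidates = [7, 13, 34, 70]
    print(largest_fitting(candidates, budget_gib=24))
```

Treat the result as a first guess only: long contexts or large batch sizes can need considerably more headroom than a fixed multiplier suggests.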