tools · just now
llama.cpp adds 1-bit inference
New 1-bit quantization runs 70B models on a laptop.
Local inference just got dramatically cheaper for indie builders.
The whole AI world, one second per item.