
Commit b31877e (2 parents: c04832b + 9d34231)

Merge branch 'upstream' into concedo_experimental

# Conflicts:
#	.github/pull_request_template.md
#	.gitignore
#	docs/backend/SYCL.md
#	docs/ops.md
#	docs/ops/WebGPU.csv
#	examples/sycl/test.sh
#	examples/sycl/win-test.bat
#	ggml/src/ggml-sycl/common.hpp
#	ggml/src/ggml-sycl/ggml-sycl.cpp
#	ggml/src/ggml-sycl/sycl_hw.cpp
#	ggml/src/ggml-sycl/sycl_hw.hpp
#	ggml/src/ggml-webgpu/ggml-webgpu-shader-lib.hpp
#	ggml/src/ggml-webgpu/ggml-webgpu.cpp

1 file changed: src/llama-quant.cpp (1 addition, 1 deletion)
```diff
@@ -1285,7 +1285,7 @@ static void llama_model_quantize_impl(const std::string & fname_inp, const std::
 llama_model_quantize_params llama_model_quantize_default_params() {
     llama_model_quantize_params result = {
         /*.nthread              =*/ 0,
-        /*.ftype                =*/ LLAMA_FTYPE_MOSTLY_Q5_1,
+        /*.ftype                =*/ LLAMA_FTYPE_MOSTLY_Q8_0,
         /*.output_tensor_type   =*/ GGML_TYPE_COUNT,
         /*.token_embedding_type =*/ GGML_TYPE_COUNT,
         /*.allow_requantize     =*/ false,
```

0 commit comments