Description
Version: 0.7.1
Describe the Bug
Using the Docker MCP Toolkit from Jan doesn't work: every tool call fails with a tool-calling error saying "Transport closed".
Steps to Reproduce
1. Install Docker Desktop and add some MCP servers (e.g. "Fetch (Reference)") in the MCP Toolkit section.
2. Connect Claude Desktop or Cursor to the Docker MCP Toolkit and make sure the servers work by having the model list the top 3 quotes from the URL https://quotes.toscrape.com/ in a chat.
3. Connect Jan to the Docker MCP Toolkit and enable it. The JSON of the MCP server should look like this:
{
  "Docker MCP Toolkit": {
    "command": "docker",
    "args": ["mcp", "gateway", "run"],
    "env": {},
    "type": "stdio",
    "active": true
  }
}
4. In Jan → Settings → Model Providers → Your model, make sure "Tools" is enabled under "Capabilities".
5. Create a new chat with the prompt "List me the top 3 quotes from here: https://quotes.toscrape.com/".
6. The model launches the fetch tool with the URL as its argument, and the call fails with the following output (a standalone way to probe this outside Jan is sketched after these steps):
{
  "text": "Error calling tool fetch: Transport closed",
  "type": "text"
}
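
For what it's worth, the failure can be probed outside Jan. Below is a minimal Python sketch (my own diagnostic, not part of Jan) that spawns the same `docker mcp gateway run` command and speaks newline-delimited JSON-RPC to it over stdio. The message shapes follow the MCP spec, but the `protocolVersion` string and the one-response-per-line reading are assumptions on my part.

```python
#!/usr/bin/env python3
# Minimal stdio MCP probe -- a diagnostic sketch, not part of Jan.
# A real client would match responses to requests by id instead of
# just reading the next stdout line; this is enough to see whether
# the gateway itself closes the transport mid-call.
import json
import subprocess
import sys

proc = subprocess.Popen(
    ["docker", "mcp", "gateway", "run"],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,
)

def send(msg):
    proc.stdin.write(json.dumps(msg) + "\n")
    proc.stdin.flush()

def recv():
    line = proc.stdout.readline()
    if not line:  # EOF here is the stdio equivalent of "Transport closed"
        sys.exit("gateway closed its stdout (transport closed)")
    return json.loads(line)

# MCP handshake: initialize request, then the initialized notification.
send({"jsonrpc": "2.0", "id": 1, "method": "initialize",
      "params": {"protocolVersion": "2024-11-05", "capabilities": {},
                 "clientInfo": {"name": "probe", "version": "0.0.1"}}})
print("initialize ->", recv())
send({"jsonrpc": "2.0", "method": "notifications/initialized"})

# List the gateway's tools, then call fetch the same way Jan would.
send({"jsonrpc": "2.0", "id": 2, "method": "tools/list"})
print("tools/list ->", recv())
send({"jsonrpc": "2.0", "id": 3, "method": "tools/call",
      "params": {"name": "fetch",
                 "arguments": {"url": "https://quotes.toscrape.com/"}}})
print("tools/call ->", recv())
```

If this script also loses the connection mid-call, the gateway process itself is dropping the transport; if it succeeds, the problem is on Jan's side of the stdio connection.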
Log
[2025-10-13][06:19:55][tauri_plugin_llamacpp::gguf::commands][INFO] modelSize: 2263241856
[2025-10-13][06:19:55][tauri_plugin_llamacpp::gguf::commands][INFO] Using ctx_size: 8192
[2025-10-13][06:19:55][tauri_plugin_llamacpp::gguf::utils][INFO] Received ctx_size parameter: Some(8192)
[2025-10-13][06:19:55][tauri_plugin_llamacpp::gguf::utils][INFO] Received model metadata:
{"quantize.imatrix.chunks_count": "663", "general.file_type": "30", "quantize.imatrix.dataset": "unsloth_calibration_gemma-3-4b-it.txt", "tokenizer.chat_template": "{{ bos_token }}\n{%- if messages[0]['role'] == 'system' -%}\n {%- if messages[0]['content'] is string -%}\n {%- set first_user_prefix = messages[0]['content'] + '\n\n' -%}\n {%- else -%}\n {%- set first_user_prefix = messages[0]['content'][0]['text'] + '\n\n' -%}\n {%- endif -%}\n {%- set loop_messages = messages[1:] -%}\n{%- else -%}\n {%- set first_user_prefix = \"\" -%}\n {%- set loop_messages = messages -%}\n{%- endif -%}\n{%- for message in loop_messages -%}\n {%- if (message['role'] == 'user') != (loop.index0 % 2 == 0) -%}\n {{ raise_exception(\"Conversation roles must alternate user/assistant/user/assistant/...\") }}\n {%- endif -%}\n {%- if (message['role'] == 'assistant') -%}\n {%- set role = \"model\" -%}\n {%- else -%}\n {%- set role = message['role'] -%}\n {%- endif -%}\n {{ '<start_of_turn>' + role + '\n' + (first_user_prefix if loop.first else \"\") }}\n {%- if message['content'] is string -%}\n {{ message['content'] | trim }}\n {%- elif message['content'] is iterable -%}\n {%- for item in message['content'] -%}\n {%- if item['type'] == 'image' -%}\n {{ '<start_of_image>' }}\n {%- elif item['type'] == 'text' -%}\n {{ item['text'] | trim }}\n {%- endif -%}\n {%- endfor -%}\n {%- else -%}\n {{ raise_exception(\"Invalid content type\") }}\n {%- endif -%}\n {{ '<end_of_turn>\n' }}\n{%- endfor -%}\n{%- if add_generation_prompt -%}\n {{'<start_of_turn>model\n'}}\n{%- endif -%}\n", "gemma3.rope.scaling.type": "linear", "tokenizer.ggml.scores": "<Array of type Float32 with 262208 elements, data skipped>", "general.quantization_version": "2", "tokenizer.ggml.token_type": "<Array of type Int32 with 262208 elements, data skipped>", "tokenizer.ggml.pre": "default", "tokenizer.ggml.add_eos_token": "false", "gemma3.block_count": "34", "general.name": "Gemma-3-4B-It", "general.finetune": "it", "general.quantized_by": "Unsloth", "general.size_label": "4B", "gemma3.attention.head_count_kv": "4", "tokenizer.ggml.bos_token_id": "2", "tokenizer.ggml.unknown_token_id": "3", "quantize.imatrix.entries_count": "238", "general.repo_url": "https://huggingface.co/unsloth", "tokenizer.ggml.add_space_prefix": "false", "tokenizer.ggml.padding_token_id": "0", "tokenizer.ggml.eos_token_id": "106", "general.type": "model", "general.basename": "Gemma-3-4B-It", "tokenizer.ggml.model": "llama", "gemma3.embedding_length": "2560", "tokenizer.ggml.tokens": "<Array of type String with 262208 elements, data skipped>", "gemma3.attention.layer_norm_rms_epsilon": "0.000001", "gemma3.context_length": "131072", "gemma3.rope.scaling.factor": "8", "gemma3.attention.head_count": "8", "gemma3.feed_forward_length": "10240", "gemma3.attention.sliding_window": "1024", "gemma3.attention.key_length": "256", "gemma3.attention.value_length": "256", "gemma3.rope.freq_base": "1000000", "quantize.imatrix.file": "gemma-3-4b-it-GGUF/imatrix_unsloth.dat", "general.architecture": "gemma3", "tokenizer.ggml.add_bos_token": "true"}
[2025-10-13][06:19:55][tauri_plugin_llamacpp::gguf::utils][INFO] KV estimates -> sliding: 142606336 bytes (~136.00 MB), full: 1140850688 bytes (~1088.00 MB), middle: 641728512 bytes (~612.00 MB)
[2025-10-13][06:19:55][tauri_plugin_llamacpp::gguf::commands][INFO] isModelSupported: Total memory requirement: 2904970368 for /Users/joedoe/Library/Application Support/Jan/data/llamacpp/models/gemma-3-4b-it-IQ4_XS/model.gguf; Got kvCacheSize: 641728512 from BE
[2025-10-13][06:19:55][tauri_plugin_llamacpp::gguf::commands][INFO] No GPUs detected (likely unified memory system), using total RAM as VRAM
[2025-10-13][06:19:55][tauri_plugin_llamacpp::gguf::commands][INFO] Total VRAM reported/calculated (in bytes): 51539607552
[2025-10-13][06:19:55][tauri_plugin_llamacpp::gguf::commands][INFO] System RAM: 0 bytes
[2025-10-13][06:19:55][tauri_plugin_llamacpp::gguf::commands][INFO] Total VRAM: 51539607552 bytes
[2025-10-13][06:19:55][tauri_plugin_llamacpp::gguf::commands][INFO] Usable total memory: 49251117363 bytes
[2025-10-13][06:19:55][tauri_plugin_llamacpp::gguf::commands][INFO] Usable VRAM: 49251117363 bytes
[2025-10-13][06:19:55][tauri_plugin_llamacpp::gguf::commands][INFO] Required: 2904970368 bytes
[2025-10-13][06:20:00][tauri_plugin_llamacpp::gguf::commands][INFO] modelSize: 2263241856
[2025-10-13][06:20:00][tauri_plugin_llamacpp::gguf::commands][INFO] Using ctx_size: 8192
[2025-10-13][06:20:00][tauri_plugin_llamacpp::gguf::utils][INFO] Received ctx_size parameter: Some(8192)
[2025-10-13][06:20:00][tauri_plugin_llamacpp::gguf::utils][INFO] Received model metadata:
{"quantize.imatrix.chunks_count": "663", "gemma3.feed_forward_length": "10240", "quantize.imatrix.entries_count": "238", "gemma3.block_count": "34", "gemma3.attention.head_count_kv": "4", "quantize.imatrix.dataset": "unsloth_calibration_gemma-3-4b-it.txt", "tokenizer.ggml.add_space_prefix": "false", "gemma3.embedding_length": "2560", "quantize.imatrix.file": "gemma-3-4b-it-GGUF/imatrix_unsloth.dat", "general.name": "Gemma-3-4B-It", "general.size_label": "4B", "tokenizer.ggml.token_type": "<Array of type Int32 with 262208 elements, data skipped>", "general.type": "model", "general.finetune": "it", "gemma3.attention.value_length": "256", "gemma3.rope.freq_base": "1000000", "tokenizer.ggml.pre": "default", "gemma3.rope.scaling.factor": "8", "tokenizer.ggml.unknown_token_id": "3", "tokenizer.ggml.tokens": "<Array of type String with 262208 elements, data skipped>", "gemma3.rope.scaling.type": "linear", "gemma3.attention.sliding_window": "1024", "tokenizer.ggml.model": "llama", "tokenizer.chat_template": "{{ bos_token }}\n{%- if messages[0]['role'] == 'system' -%}\n {%- if messages[0]['content'] is string -%}\n {%- set first_user_prefix = messages[0]['content'] + '\n\n' -%}\n {%- else -%}\n {%- set first_user_prefix = messages[0]['content'][0]['text'] + '\n\n' -%}\n {%- endif -%}\n {%- set loop_messages = messages[1:] -%}\n{%- else -%}\n {%- set first_user_prefix = \"\" -%}\n {%- set loop_messages = messages -%}\n{%- endif -%}\n{%- for message in loop_messages -%}\n {%- if (message['role'] == 'user') != (loop.index0 % 2 == 0) -%}\n {{ raise_exception(\"Conversation roles must alternate user/assistant/user/assistant/...\") }}\n {%- endif -%}\n {%- if (message['role'] == 'assistant') -%}\n {%- set role = \"model\" -%}\n {%- else -%}\n {%- set role = message['role'] -%}\n {%- endif -%}\n {{ '<start_of_turn>' + role + '\n' + (first_user_prefix if loop.first else \"\") }}\n {%- if message['content'] is string -%}\n {{ message['content'] | trim }}\n {%- elif message['content'] is iterable -%}\n {%- for item in message['content'] -%}\n {%- if item['type'] == 'image' -%}\n {{ '<start_of_image>' }}\n {%- elif item['type'] == 'text' -%}\n {{ item['text'] | trim }}\n {%- endif -%}\n {%- endfor -%}\n {%- else -%}\n {{ raise_exception(\"Invalid content type\") }}\n {%- endif -%}\n {{ '<end_of_turn>\n' }}\n{%- endfor -%}\n{%- if add_generation_prompt -%}\n {{'<start_of_turn>model\n'}}\n{%- endif -%}\n", "general.repo_url": "https://huggingface.co/unsloth", "tokenizer.ggml.bos_token_id": "2", "gemma3.attention.layer_norm_rms_epsilon": "0.000001", "general.quantization_version": "2", "general.basename": "Gemma-3-4B-It", "tokenizer.ggml.add_eos_token": "false", "gemma3.context_length": "131072", "gemma3.attention.key_length": "256", "gemma3.attention.head_count": "8", "general.architecture": "gemma3", "tokenizer.ggml.scores": "<Array of type Float32 with 262208 elements, data skipped>", "tokenizer.ggml.padding_token_id": "0", "tokenizer.ggml.eos_token_id": "106", "general.quantized_by": "Unsloth", "tokenizer.ggml.add_bos_token": "true", "general.file_type": "30"}
[2025-10-13][06:20:00][tauri_plugin_llamacpp::gguf::utils][INFO] KV estimates -> sliding: 142606336 bytes (~136.00 MB), full: 1140850688 bytes (~1088.00 MB), middle: 641728512 bytes (~612.00 MB)
[2025-10-13][06:20:00][tauri_plugin_llamacpp::gguf::commands][INFO] isModelSupported: Total memory requirement: 2904970368 for /Users/joedoe/Library/Application Support/Jan/data/llamacpp/models/gemma-3-4b-it-IQ4_XS/model.gguf; Got kvCacheSize: 641728512 from BE
[2025-10-13][06:20:00][tauri_plugin_llamacpp::gguf::commands][INFO] No GPUs detected (likely unified memory system), using total RAM as VRAM
[2025-10-13][06:20:00][tauri_plugin_llamacpp::gguf::commands][INFO] Total VRAM reported/calculated (in bytes): 51539607552
[2025-10-13][06:20:00][tauri_plugin_llamacpp::gguf::commands][INFO] System RAM: 0 bytes
[2025-10-13][06:20:00][tauri_plugin_llamacpp::gguf::commands][INFO] Total VRAM: 51539607552 bytes
[2025-10-13][06:20:00][tauri_plugin_llamacpp::gguf::commands][INFO] Usable total memory: 49251117363 bytes
[2025-10-13][06:20:00][tauri_plugin_llamacpp::gguf::commands][INFO] Usable VRAM: 49251117363 bytes
[2025-10-13][06:20:00][tauri_plugin_llamacpp::gguf::commands][INFO] Required: 2904970368 bytes
[2025-10-13][06:20:07][webview:info@asset://localhost/%2FUsers%2Fjoedoe%2FLibrary%2FApplication%20Support%2FJan%2Fdata%2Fextensions%2F%40janhq%2Fllamacpp-extension%2Fdist%2Findex.js:1908:7][INFO] Calling Tauri command llama_load with args: --no-webui,--jinja,-m,/Users/joedoe/Library/Application Support/Jan/data/llamacpp/models/gemma-3-4b-it-IQ4_XS/model.gguf,-a,gemma-3-4b-it-IQ4_XS,--port,3780,--mmproj,/Users/joedoe/Library/Application Support/Jan/data/llamacpp/models/gemma-3-4b-it-IQ4_XS/mmproj.gguf,-ngl,100,--batch-size,2048,--ubatch-size,512,--flash-attn,off,--no-mmap,--ctx-size,8192
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] Attempting to launch server at path: "/Users/joedoe/Library/Application Support/Jan/data/llamacpp/backends/b6673/macos-arm64/build/bin/llama-server"
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] Using arguments: ["--no-webui", "--jinja", "-m", "/Users/joedoe/Library/Application Support/Jan/data/llamacpp/models/gemma-3-4b-it-IQ4_XS/model.gguf", "-a", "gemma-3-4b-it-IQ4_XS", "--port", "3780", "--mmproj", "/Users/joedoe/Library/Application Support/Jan/data/llamacpp/models/gemma-3-4b-it-IQ4_XS/mmproj.gguf", "-ngl", "100", "--batch-size", "2048", "--ubatch-size", "512", "--flash-attn", "off", "--no-mmap", "--ctx-size", "8192"]
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] MMPROJ Path string: /Users/joedoe/Library/Application Support/Jan/data/llamacpp/models/gemma-3-4b-it-IQ4_XS/mmproj.gguf
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] Waiting for model session to be ready...
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] ggml_metal_library_init: using embedded metal library
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] ggml_metal_library_init: loaded in 0.005 sec
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] ggml_metal_device_init: GPU name: Apple M4 Pro
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] ggml_metal_device_init: GPU family: MTLGPUFamilyApple9 (1009)
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] ggml_metal_device_init: GPU family: MTLGPUFamilyCommon3 (3003)
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] ggml_metal_device_init: GPU family: MTLGPUFamilyMetal3 (5001)
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] ggml_metal_device_init: simdgroup reduction = true
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] ggml_metal_device_init: simdgroup matrix mul. = true
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] ggml_metal_device_init: has unified memory = true
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] ggml_metal_device_init: has bfloat = true
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] ggml_metal_device_init: use residency sets = true
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] ggml_metal_device_init: use shared buffers = true
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] ggml_metal_device_init: recommendedMaxWorkingSetSize = 38654.71 MB
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] build: 1 (02010ec) with Apple clang version 14.0.0 (clang-1400.0.29.202) for arm64-apple-darwin21.6.0
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] system info: n_threads = 10, n_threads_batch = 10, total_threads = 14
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] system_info: n_threads = 10 (n_threads_batch = 10) / 14 | Metal : EMBED_LIBRARY = 1 | CPU : NEON = 1 | ARM_FMA = 1 | FP16_VA = 1 | MATMUL_INT8 = 1 | DOTPROD = 1 | ACCELERATE = 1 | REPACK = 1 |
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] Web UI is disabled
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] main: binding port with default address family
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] main: HTTP server is listening, hostname: 127.0.0.1, port: 3780, http threads: 13
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] main: loading model
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] srv load_model: loading model '/Users/joedoe/Library/Application Support/Jan/data/llamacpp/models/gemma-3-4b-it-IQ4_XS/model.gguf'
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_load_from_file_impl: using device Metal (Apple M4 Pro) (unknown id) - 36863 MiB free
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: loaded meta data with 40 key-value pairs and 444 tensors from /Users/joedoe/Library/Application Support/Jan/data/llamacpp/models/gemma-3-4b-it-IQ4_XS/model.gguf (version GGUF V3 (latest))
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: Dumping metadata keys/values. Note: KV overrides do not apply in this output.
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 0: general.architecture str = gemma3
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 1: general.type str = model
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 2: general.name str = Gemma-3-4B-It
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 3: general.finetune str = it
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 4: general.basename str = Gemma-3-4B-It
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 5: general.quantized_by str = Unsloth
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 6: general.size_label str = 4B
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 7: general.repo_url str = https://huggingface.co/unsloth
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 8: gemma3.context_length u32 = 131072
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 9: gemma3.embedding_length u32 = 2560
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 10: gemma3.block_count u32 = 34
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 11: gemma3.feed_forward_length u32 = 10240
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 12: gemma3.attention.head_count u32 = 8
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 13: gemma3.attention.layer_norm_rms_epsilon f32 = 0.000001
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 14: gemma3.attention.key_length u32 = 256
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 15: gemma3.attention.value_length u32 = 256
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 16: gemma3.rope.freq_base f32 = 1000000.000000
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 17: gemma3.attention.sliding_window u32 = 1024
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 18: gemma3.attention.head_count_kv u32 = 4
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 19: gemma3.rope.scaling.type str = linear
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 20: gemma3.rope.scaling.factor f32 = 8.000000
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 21: tokenizer.ggml.model str = llama
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 22: tokenizer.ggml.pre str = default
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 23: tokenizer.ggml.tokens arr[str,262208] = ["<pad>", "<eos>", "<bos>", "<unk>", ...
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 24: tokenizer.ggml.scores arr[f32,262208] = [-1000.000000, -1000.000000, -1000.00...
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 25: tokenizer.ggml.token_type arr[i32,262208] = [3, 3, 3, 3, 3, 4, 3, 3, 3, 3, 3, 3, ...
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 26: tokenizer.ggml.bos_token_id u32 = 2
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 27: tokenizer.ggml.eos_token_id u32 = 106
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 28: tokenizer.ggml.unknown_token_id u32 = 3
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 29: tokenizer.ggml.padding_token_id u32 = 0
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 30: tokenizer.ggml.add_bos_token bool = true
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 31: tokenizer.ggml.add_eos_token bool = false
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 32: tokenizer.chat_template str = {{ bos_token }}\n{%- if messages[0]['r...
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 33: tokenizer.ggml.add_space_prefix bool = false
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 34: general.quantization_version u32 = 2
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 35: general.file_type u32 = 30
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 36: quantize.imatrix.file str = gemma-3-4b-it-GGUF/imatrix_unsloth.dat
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 37: quantize.imatrix.dataset str = unsloth_calibration_gemma-3-4b-it.txt
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 38: quantize.imatrix.entries_count i32 = 238
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - kv 39: quantize.imatrix.chunks_count i32 = 663
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - type f32: 205 tensors
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - type q6_K: 1 tensors
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_model_loader: - type iq4_xs: 238 tensors
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: file format = GGUF V3 (latest)
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: file type = IQ4_XS - 4.25 bpw
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: file size = 2.10 GiB (4.65 BPW)
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] load: printing all EOG tokens:
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] load: - 106 ('<end_of_turn>')
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] load: special tokens cache size = 6415
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] load: token to piece cache size = 1.9446 MB
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: arch = gemma3
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: vocab_only = 0
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: n_ctx_train = 131072
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: n_embd = 2560
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: n_layer = 34
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: n_head = 8
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: n_head_kv = 4
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: n_rot = 256
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: n_swa = 1024
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: is_swa_any = 1
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: n_embd_head_k = 256
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: n_embd_head_v = 256
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: n_gqa = 2
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: n_embd_k_gqa = 1024
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: n_embd_v_gqa = 1024
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: f_norm_eps = 0.0e+00
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: f_norm_rms_eps = 1.0e-06
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: f_clamp_kqv = 0.0e+00
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: f_max_alibi_bias = 0.0e+00
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: f_logit_scale = 0.0e+00
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: f_attn_scale = 6.2e-02
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: n_ff = 10240
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: n_expert = 0
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: n_expert_used = 0
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: causal attn = 1
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: pooling type = 0
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: rope type = 2
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: rope scaling = linear
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: freq_base_train = 1000000.0
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: freq_scale_train = 0.125
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: n_ctx_orig_yarn = 131072
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: rope_finetuned = unknown
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: model type = 4B
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: model params = 3.88 B
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: general.name = Gemma-3-4B-It
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: vocab type = SPM
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: n_vocab = 262208
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: n_merges = 0
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: BOS token = 2 '<bos>'
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: EOS token = 106 '<end_of_turn>'
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: EOT token = 106 '<end_of_turn>'
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: UNK token = 3 '<unk>'
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: PAD token = 0 '<pad>'
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: LF token = 248 '<0x0A>'
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: EOG token = 106 '<end_of_turn>'
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] print_info: max token length = 48
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] load_tensors: loading model tensors, this can take a while... (mmap = false)
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] load_tensors: offloading 34 repeating layers to GPU
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] load_tensors: offloading output layer to GPU
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] load_tensors: offloaded 35/35 layers to GPU
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] load_tensors: Metal model buffer size = 2152.16 MiB
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] load_tensors: CPU model buffer size = 525.13 MiB
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] ...............................................................
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_context: constructing llama_context
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_context: n_seq_max = 1
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_context: n_ctx = 8192
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_context: n_ctx_per_seq = 8192
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_context: n_batch = 2048
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_context: n_ubatch = 512
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_context: causal_attn = 1
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_context: flash_attn = disabled
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_context: kv_unified = false
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_context: freq_base = 1000000.0
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_context: freq_scale = 0.125
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_context: n_ctx_per_seq (8192) < n_ctx_train (131072) -- the full capacity of the model will not be utilized
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] ggml_metal_init: allocating
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] ggml_metal_init: found device: Apple M4 Pro
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] ggml_metal_init: picking default device: Apple M4 Pro
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] ggml_metal_init: use bfloat = true
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] ggml_metal_init: use fusion = true
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] ggml_metal_init: use concurrency = true
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] ggml_metal_init: use graph optimize = true
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_context: CPU output buffer size = 1.00 MiB
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_kv_cache_iswa: creating non-SWA KV cache, size = 8192 cells
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_kv_cache: Metal KV buffer size = 160.00 MiB
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_kv_cache: size = 160.00 MiB ( 8192 cells, 5 layers, 1/1 seqs), K (f16): 80.00 MiB, V (f16): 80.00 MiB
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_kv_cache_iswa: creating SWA KV cache, size = 1536 cells
[2025-10-13][06:20:07][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_kv_cache: Metal KV buffer size = 174.00 MiB
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_kv_cache: size = 174.00 MiB ( 1536 cells, 29 layers, 1/1 seqs), K (f16): 87.00 MiB, V (f16): 87.00 MiB
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_context: Metal compute buffer size = 517.12 MiB
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_context: CPU compute buffer size = 32.01 MiB
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_context: graph nodes = 1537
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] llama_context: graph splits = 2
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] common_init_from_params: added <end_of_turn> logit bias = -inf
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] common_init_from_params: setting dry_penalty_last_n to ctx_size = 8192
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] common_init_from_params: warming up the model with an empty run - please wait ... (--no-warmup to disable)
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] clip_model_loader: model name: Gemma-3-4B-It
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] clip_model_loader: description:
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] clip_model_loader: GGUF version: 3
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] clip_model_loader: alignment: 32
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] clip_model_loader: n_tensors: 439
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] clip_model_loader: n_kv: 21
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] clip_model_loader: has vision encoder
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] ggml_metal_init: allocating
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] ggml_metal_init: found device: Apple M4 Pro
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] ggml_metal_init: picking default device: Apple M4 Pro
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] clip_ctx: CLIP using Metal backend
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] ggml_metal_init: use bfloat = true
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] ggml_metal_init: use fusion = true
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] ggml_metal_init: use concurrency = true
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] ggml_metal_init: use graph optimize = true
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] load_hparams: projector: gemma3
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] load_hparams: n_embd: 1152
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] load_hparams: n_head: 16
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] load_hparams: n_ff: 4304
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] load_hparams: n_layer: 27
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] load_hparams: ffn_op: gelu
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] load_hparams: projection_dim: 2560
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] --- vision hparams ---
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] load_hparams: image_size: 896
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] load_hparams: patch_size: 14
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] load_hparams: has_llava_proj: 0
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] load_hparams: minicpmv_version: 0
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] load_hparams: proj_scale_factor: 4
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] load_hparams: n_wa_pattern: 0
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] load_hparams: model size: 811.79 MiB
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] load_hparams: metadata size: 0.15 MiB
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] alloc_compute_meta: Metal compute buffer size = 1132.00 MiB
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] alloc_compute_meta: CPU compute buffer size = 9.19 MiB
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] srv load_model: loaded multimodal model, '/Users/joedoe/Library/Application Support/Jan/data/llamacpp/models/gemma-3-4b-it-IQ4_XS/mmproj.gguf'
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] srv init: initializing slots, n_slots = 1
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] slot init: id 0 | task -1 | new slot n_ctx_slot = 8192
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] srv init: Enable thinking? 0
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] main: model loaded
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] main: chat template, chat_template: {{ bos_token }}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {%- if messages[0]['role'] == 'system' -%}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {%- if messages[0]['content'] is string -%}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {%- set first_user_prefix = messages[0]['content'] + '
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] ' -%}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {%- else -%}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {%- set first_user_prefix = messages[0]['content'][0]['text'] + '
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] ' -%}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {%- endif -%}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {%- set loop_messages = messages[1:] -%}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {%- else -%}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {%- set first_user_prefix = "" -%}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {%- set loop_messages = messages -%}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {%- endif -%}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {%- for message in loop_messages -%}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {%- if (message['role'] == 'user') != (loop.index0 % 2 == 0) -%}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {{ raise_exception("Conversation roles must alternate user/assistant/user/assistant/...") }}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {%- endif -%}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {%- if (message['role'] == 'assistant') -%}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {%- set role = "model" -%}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {%- else -%}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {%- set role = message['role'] -%}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {%- endif -%}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {{ '<start_of_turn>' + role + '
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] ' + (first_user_prefix if loop.first else "") }}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {%- if message['content'] is string -%}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {{ message['content'] | trim }}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {%- elif message['content'] is iterable -%}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {%- for item in message['content'] -%}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {%- if item['type'] == 'image' -%}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {{ '<start_of_image>' }}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {%- elif item['type'] == 'text' -%}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {{ item['text'] | trim }}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {%- endif -%}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {%- endfor -%}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {%- else -%}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {{ raise_exception("Invalid content type") }}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {%- endif -%}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {{ '<end_of_turn>
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] ' }}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {%- endfor -%}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {%- if add_generation_prompt -%}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {{'<start_of_turn>model
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] '}}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] {%- endif -%}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] , example_format: '<start_of_turn>user
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] You are a helpful assistant
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] Hello<end_of_turn>
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] <start_of_turn>model
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] Hi there<end_of_turn>
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] <start_of_turn>user
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] How are you?<end_of_turn>
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] <start_of_turn>model
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] '
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] main: server is listening on http://127.0.0.1:3780 - starting the main loop
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] Model appears to be ready based on logs: 'main: server is listening on http://127.0.0.1:3780 - starting the main loop'
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] srv update_slots: all slots are idle
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] Model is ready to accept requests!
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] Server process started with PID: 50086 and is ready
[2025-10-13][06:20:08][tauri_plugin_llamacpp::gguf::commands][INFO] modelSize: 2263241856
[2025-10-13][06:20:08][reqwest::connect][DEBUG] starting new connection: http://localhost:3780/
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] srv log_server_r: request: GET /health 127.0.0.1 200
[2025-10-13][06:20:08][reqwest::connect][DEBUG] starting new connection: http://localhost:3780/
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] srv params_from_: Chat format: Generic
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] slot get_availabl: id 0 | task -1 | selected slot by LRU, t_last = -1
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] slot launch_slot_: id 0 | task 0 | processing task
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] slot update_slots: id 0 | task 0 | new prompt, n_ctx_slot = 8192, n_keep = 0, n_prompt_tokens = 1456
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] slot update_slots: id 0 | task 0 | kv cache rm [0, end)
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] slot update_slots: id 0 | task 0 | prompt processing progress, n_past = 1456, n_tokens = 1456, progress = 1.000000
[2025-10-13][06:20:08][tauri_plugin_llamacpp::commands][INFO] [llamacpp] slot update_slots: id 0 | task 0 | prompt done, n_past = 1456, n_tokens = 1456
[2025-10-13][06:20:08][tauri_plugin_llamacpp::gguf::commands][INFO] Using ctx_size: 8192
[2025-10-13][06:20:08][tauri_plugin_llamacpp::gguf::utils][INFO] Received ctx_size parameter: Some(8192)
[2025-10-13][06:20:08][tauri_plugin_llamacpp::gguf::utils][INFO] Received model metadata:
{"gemma3.attention.head_count_kv": "4", "general.repo_url": "https://huggingface.co/unsloth", "quantize.imatrix.entries_count": "238", "quantize.imatrix.file": "gemma-3-4b-it-GGUF/imatrix_unsloth.dat", "general.type": "model", "gemma3.context_length": "131072", "gemma3.attention.sliding_window": "1024", "tokenizer.ggml.scores": "<Array of type Float32 with 262208 elements, data skipped>", "tokenizer.ggml.eos_token_id": "106", "gemma3.rope.scaling.factor": "8", "gemma3.rope.scaling.type": "linear", "gemma3.attention.layer_norm_rms_epsilon": "0.000001", "gemma3.rope.freq_base": "1000000", "tokenizer.chat_template": "{{ bos_token }}\n{%- if messages[0]['role'] == 'system' -%}\n {%- if messages[0]['content'] is string -%}\n {%- set first_user_prefix = messages[0]['content'] + '\n\n' -%}\n {%- else -%}\n {%- set first_user_prefix = messages[0]['content'][0]['text'] + '\n\n' -%}\n {%- endif -%}\n {%- set loop_messages = messages[1:] -%}\n{%- else -%}\n {%- set first_user_prefix = \"\" -%}\n {%- set loop_messages = messages -%}\n{%- endif -%}\n{%- for message in loop_messages -%}\n {%- if (message['role'] == 'user') != (loop.index0 % 2 == 0) -%}\n {{ raise_exception(\"Conversation roles must alternate user/assistant/user/assistant/...\") }}\n {%- endif -%}\n {%- if (message['role'] == 'assistant') -%}\n {%- set role = \"model\" -%}\n {%- else -%}\n {%- set role = message['role'] -%}\n {%- endif -%}\n {{ '<start_of_turn>' + role + '\n' + (first_user_prefix if loop.first else \"\") }}\n {%- if message['content'] is string -%}\n {{ message['content'] | trim }}\n {%- elif message['content'] is iterable -%}\n {%- for item in message['content'] -%}\n {%- if item['type'] == 'image' -%}\n {{ '<start_of_image>' }}\n {%- elif item['type'] == 'text' -%}\n {{ item['text'] | trim }}\n {%- endif -%}\n {%- endfor -%}\n {%- else -%}\n {{ raise_exception(\"Invalid content type\") }}\n {%- endif -%}\n {{ '<end_of_turn>\n' }}\n{%- endfor -%}\n{%- if add_generation_prompt -%}\n {{'<start_of_turn>model\n'}}\n{%- endif -%}\n", "general.quantized_by": "Unsloth", "general.finetune": "it", "gemma3.block_count": "34", "general.basename": "Gemma-3-4B-It", "tokenizer.ggml.add_bos_token": "true", "tokenizer.ggml.add_eos_token": "false", "quantize.imatrix.dataset": "unsloth_calibration_gemma-3-4b-it.txt", "quantize.imatrix.chunks_count": "663", "tokenizer.ggml.token_type": "<Array of type Int32 with 262208 elements, data skipped>", "tokenizer.ggml.pre": "default", "tokenizer.ggml.add_space_prefix": "false", "gemma3.attention.value_length": "256", "tokenizer.ggml.bos_token_id": "2", "gemma3.attention.key_length": "256", "gemma3.feed_forward_length": "10240", "general.name": "Gemma-3-4B-It", "gemma3.embedding_length": "2560", "general.file_type": "30", "general.size_label": "4B", "gemma3.attention.head_count": "8", "tokenizer.ggml.model": "llama", "tokenizer.ggml.unknown_token_id": "3", "general.architecture": "gemma3", "tokenizer.ggml.padding_token_id": "0", "tokenizer.ggml.tokens": "<Array of type String with 262208 elements, data skipped>", "general.quantization_version": "2"}
[2025-10-13][06:20:08][tauri_plugin_llamacpp::gguf::utils][INFO] KV estimates -> sliding: 142606336 bytes (~136.00 MB), full: 1140850688 bytes (~1088.00 MB), middle: 641728512 bytes (~612.00 MB)
[2025-10-13][06:20:08][tauri_plugin_llamacpp::gguf::commands][INFO] isModelSupported: Total memory requirement: 2904970368 for /Users/joedoe/Library/Application Support/Jan/data/llamacpp/models/gemma-3-4b-it-IQ4_XS/model.gguf; Got kvCacheSize: 641728512 from BE
[2025-10-13][06:20:08][tauri_plugin_llamacpp::gguf::commands][INFO] No GPUs detected (likely unified memory system), using total RAM as VRAM
[2025-10-13][06:20:08][tauri_plugin_llamacpp::gguf::commands][INFO] Total VRAM reported/calculated (in bytes): 51539607552
[2025-10-13][06:20:08][tauri_plugin_llamacpp::gguf::commands][INFO] System RAM: 0 bytes
[2025-10-13][06:20:08][tauri_plugin_llamacpp::gguf::commands][INFO] Total VRAM: 51539607552 bytes
[2025-10-13][06:20:08][tauri_plugin_llamacpp::gguf::commands][INFO] Usable total memory: 49251117363 bytes
[2025-10-13][06:20:08][tauri_plugin_llamacpp::gguf::commands][INFO] Usable VRAM: 49251117363 bytes
[2025-10-13][06:20:08][tauri_plugin_llamacpp::gguf::commands][INFO] Required: 2904970368 bytes
[2025-10-13][06:20:10][reqwest::connect][DEBUG] starting new connection: http://localhost:3780/
[2025-10-13][06:20:10][tauri_plugin_llamacpp::commands][INFO] [llamacpp] srv log_server_r: request: GET /health 127.0.0.1 200
[2025-10-13][06:20:10][reqwest::connect][DEBUG] starting new connection: http://localhost:3780/
[2025-10-13][06:20:10][tauri_plugin_llamacpp::commands][INFO] [llamacpp] srv log_server_r: request: POST /apply-template 127.0.0.1 200
[2025-10-13][06:20:10][reqwest::connect][DEBUG] starting new connection: http://localhost:3780/
[2025-10-13][06:20:10][tauri_plugin_llamacpp::commands][INFO] [llamacpp] srv log_server_r: request: POST /tokenize 127.0.0.1 200
[2025-10-13][06:20:10][tauri_plugin_llamacpp::commands][INFO] [llamacpp] slot update_slots: id 0 | task 0 | SWA checkpoint create, pos_min = 0, pos_max = 1455, size = 164.955 MiB, total = 1/3 (164.955 MiB)
[2025-10-13][06:20:11][tauri_plugin_llamacpp::commands][INFO] [llamacpp] slot release: id 0 | task 0 | stop processing: n_past = 1504, truncated = 0
[2025-10-13][06:20:11][tauri_plugin_llamacpp::commands][INFO] [llamacpp] slot print_timing: id 0 | task 0 |
[2025-10-13][06:20:11][tauri_plugin_llamacpp::commands][INFO] [llamacpp] prompt eval time = 1613.50 ms / 1456 tokens ( 1.11 ms per token, 902.39 tokens per second)
[2025-10-13][06:20:11][tauri_plugin_llamacpp::commands][INFO] [llamacpp] eval time = 742.45 ms / 49 tokens ( 15.15 ms per token, 66.00 tokens per second)
[2025-10-13][06:20:11][tauri_plugin_llamacpp::commands][INFO] [llamacpp] total time = 2355.95 ms / 1505 tokens
[2025-10-13][06:20:11][tauri_plugin_llamacpp::commands][INFO] [llamacpp] srv update_slots: all slots are idle
[2025-10-13][06:20:11][tauri_plugin_llamacpp::commands][INFO] [llamacpp] srv log_server_r: request: POST /v1/chat/completions 127.0.0.1 200
[2025-10-13][06:20:12][reqwest::connect][DEBUG] starting new connection: http://localhost:3780/
[2025-10-13][06:20:12][tauri_plugin_llamacpp::commands][INFO] [llamacpp] srv log_server_r: request: GET /health 127.0.0.1 200
[2025-10-13][06:20:12][reqwest::connect][DEBUG] starting new connection: http://localhost:3780/
[2025-10-13][06:20:12][tauri_plugin_llamacpp::commands][INFO] [llamacpp] srv params_from_: Chat format: Generic
[2025-10-13][06:20:12][tauri_plugin_llamacpp::commands][INFO] [llamacpp] slot get_availabl: id 0 | task 0 | selected slot by lcs similarity, lcs_len = 1462, similarity = 0.972 (> 0.100 thold)
[2025-10-13][06:20:12][tauri_plugin_llamacpp::commands][INFO] [llamacpp] slot launch_slot_: id 0 | task 50 | processing task
[2025-10-13][06:20:12][tauri_plugin_llamacpp::commands][INFO] [llamacpp] slot update_slots: id 0 | task 50 | new prompt, n_ctx_slot = 8192, n_keep = 0, n_prompt_tokens = 1613
[2025-10-13][06:20:12][tauri_plugin_llamacpp::commands][INFO] [llamacpp] slot update_slots: id 0 | task 50 | kv cache rm [1462, end)
[2025-10-13][06:20:12][tauri_plugin_llamacpp::commands][INFO] [llamacpp] slot update_slots: id 0 | task 50 | prompt processing progress, n_past = 1613, n_tokens = 151, progress = 0.093614
[2025-10-13][06:20:12][tauri_plugin_llamacpp::commands][INFO] [llamacpp] slot update_slots: id 0 | task 50 | prompt done, n_past = 1613, n_tokens = 151
[2025-10-13][06:20:12][reqwest::connect][DEBUG] starting new connection: http://localhost:3780/
[2025-10-13][06:20:12][tauri_plugin_llamacpp::commands][INFO] [llamacpp] srv log_server_r: request: GET /health 127.0.0.1 200
[2025-10-13][06:20:12][reqwest::connect][DEBUG] starting new connection: http://localhost:3780/
[2025-10-13][06:20:12][tauri_plugin_llamacpp::commands][INFO] [llamacpp] srv log_server_r: request: POST /apply-template 127.0.0.1 200
[2025-10-13][06:20:12][reqwest::connect][DEBUG] starting new connection: http://localhost:3780/
[2025-10-13][06:20:12][tauri_plugin_llamacpp::commands][INFO] [llamacpp] srv log_server_r: request: POST /tokenize 127.0.0.1 200
[2025-10-13][06:20:13][tauri_plugin_llamacpp::commands][INFO] [llamacpp] slot update_slots: id 0 | task 50 | SWA checkpoint create, pos_min = 77, pos_max = 1612, size = 174.018 MiB, total = 2/3 (338.973 MiB)
[2025-10-13][06:20:13][tauri_plugin_llamacpp::commands][INFO] [llamacpp] slot release: id 0 | task 50 | stop processing: n_past = 1646, truncated = 0
[2025-10-13][06:20:13][tauri_plugin_llamacpp::commands][INFO] [llamacpp] slot print_timing: id 0 | task 50 |
[2025-10-13][06:20:13][tauri_plugin_llamacpp::commands][INFO] [llamacpp] prompt eval time = 289.97 ms / 151 tokens ( 1.92 ms per token, 520.75 tokens per second)
[2025-10-13][06:20:13][tauri_plugin_llamacpp::commands][INFO] [llamacpp] eval time = 472.68 ms / 34 tokens ( 13.90 ms per token, 71.93 tokens per second)
[2025-10-13][06:20:13][tauri_plugin_llamacpp::commands][INFO] [llamacpp] total time = 762.65 ms / 185 tokens
[2025-10-13][06:20:13][tauri_plugin_llamacpp::commands][INFO] [llamacpp] srv update_slots: all slots are idle
[2025-10-13][06:20:13][tauri_plugin_llamacpp::commands][INFO] [llamacpp] srv log_server_r: request: POST /v1/chat/completions 127.0.0.1 200
[2025-10-13][06:20:13][app_lib::core::mcp::helpers][WARN] MCP server Docker MCP Toolkit health check failed: Transport closed
[2025-10-13][06:20:13][app_lib::core::mcp::helpers][ERROR] MCP server Docker MCP Toolkit failed health check, removing from active servers
[2025-10-13][06:20:13][app_lib::core::mcp::helpers][INFO] Stopping server Docker MCP Toolkit...
[2025-10-13][06:20:13][app_lib::core::mcp::helpers][INFO] MCP server Docker MCP Toolkit quit with reason: Some(Closed)
[2025-10-13][06:20:13][app_lib::core::mcp::helpers][WARN] MCP server Docker MCP Toolkit terminated unexpectedly: Closed
[2025-10-13][06:20:13][app_lib::core::mcp::helpers][INFO] Restarting MCP server Docker MCP Toolkit (Attempt 1/3)
[2025-10-13][06:20:13][app_lib::core::mcp::helpers][INFO] Waiting 1234ms before restart attempt 1 for MCP server Docker MCP Toolkit
[2025-10-13][06:20:13][reqwest::connect][DEBUG] starting new connection: http://localhost:3780/
[2025-10-13][06:20:13][tauri_plugin_llamacpp::commands][INFO] [llamacpp] srv log_server_r: request: GET /health 127.0.0.1 200
[2025-10-13][06:20:13][reqwest::connect][DEBUG] starting new connection: http://localhost:3780/
[2025-10-13][06:20:13][tauri_plugin_llamacpp::commands][INFO] [llamacpp] srv log_server_r: request: POST /apply-template 127.0.0.1 200
[2025-10-13][06:20:13][reqwest::connect][DEBUG] starting new connection: http://localhost:3780/
[2025-10-13][06:20:13][tauri_plugin_llamacpp::commands][INFO] [llamacpp] srv log_server_r: request: POST /tokenize 127.0.0.1 200
[2025-10-13][06:20:15][app_lib::core::mcp::helpers][INFO] Server Docker MCP Toolkit started successfully.
[2025-10-13][06:20:16][app_lib::core::mcp::helpers][INFO] Marked MCP server Docker MCP Toolkit as successfully connected
[2025-10-13][06:20:16][app_lib::core::mcp::helpers][INFO] MCP server Docker MCP Toolkit restarted successfully.
[2025-10-13][06:20:16][app_lib::core::mcp::helpers][INFO] MCP server Docker MCP Toolkit restarted successfully, resetting restart count from 1 to 0.
[2025-10-13][06:20:16][app_lib::core::mcp::helpers][INFO] Monitoring MCP server Docker MCP Toolkit health
Operating System
- [x] MacOS
- [ ] Windows
- [ ] Linux