Replies: 6 comments 12 replies
-
How can I tell it to use my AMD GPU?
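A hedged sketch of one approach, assuming Alpaca's bundled Ollama gets GPU support through separate Flatpak add-ons and that the ROCm add-on is published as com.jeffser.Alpaca.Plugins.AMD (that ID and the override value below are assumptions, not confirmed):
# Check which Alpaca add-ons actually exist on Flathub
flatpak remote-ls flathub | grep -i alpaca
# Assumed ROCm add-on ID
flatpak install flathub com.jeffser.Alpaca.Plugins.AMD -y
# Many ROCm setups also need a GPU target override; 10.3.0 is just an RDNA2 example
flatpak override --user --env=HSA_OVERRIDE_GFX_VERSION=10.3.0 com.jeffser.Alpaca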
-
Is there a way to change the number of workers and threads, enable optimizations, etc.? I'm running on a 32-core hyperthreaded system and wondering whether it can be tuned manually to make full use of every CPU core for all models.
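A hedged sketch of the knobs a stock Ollama server exposes, assuming Alpaca's bundled instance behaves the same (the model name and values below are illustrative): thread count can be set per request via options.num_thread, and request parallelism via environment variables.
# Per-request thread count against the default local API
curl http://127.0.0.1:11434/api/generate -d '{
  "model": "qwen2.5-coder:14b",
  "prompt": "hello",
  "options": { "num_thread": 32 }
}'
# Server-side parallelism for a stock ollama serve
OLLAMA_NUM_PARALLEL=4 OLLAMA_MAX_LOADED_MODELS=2 ollama serve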
-
In the model download window there are no quantized variants to be seen anymore, but there used to be. Is this a bug or a feature? It's as if .gguf download support is gone...
-
Why do I randomly see the app in the background apps list, even though I didn't enable the option for that? (There's no portal description in the entry either.)
-
How do I set the context size for a local Ollama instance? The default is only 2048, while the underlying models support 32K or 64K, but the larger value has to be passed as a request parameter.
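For reference, a minimal sketch of that request parameter, assuming a stock Ollama API on the default port (the model name and value are illustrative): options.num_ctx raises the context window per request, and PARAMETER num_ctx in a Modelfile makes it persistent.
curl http://127.0.0.1:11434/api/chat -d '{
  "model": "qwen2.5-coder:14b",
  "messages": [{ "role": "user", "content": "hello" }],
  "options": { "num_ctx": 32768 }
}'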
-
How do I use the GPU in Alpaca, please? This isn't explained on https://flathub.org/apps/com.jeffser.Alpaca. I ran these commands: flatpak install flathub com.jeffser.Alpaca -y; flatpak install com.jeffser.Alpaca.Plugins.Ollama -y; and in Flatseal I enabled GPU acceleration for Alpaca. Then I ran Alpaca and installed the Qwen2.5 Coder model. When I ask it a question my CPU usage rises to 50% and it uses 10 GB of RAM, but the NVIDIA app shows my GPU usage at its normal 20%. Output is extremely slow, so I think it's not using the GPU. What should I do, please? Here's my Alpaca log for the instance:
print_info: ssm_d_inner = 0
print_info: ssm_d_state = 0
print_info: ssm_dt_rank = 0
print_info: ssm_dt_b_c_rms = 0
print_info: model type = 14B
print_info: model params = 14.77 B
print_info: general.name = Qwen2.5 Coder 14B Instruct
print_info: vocab type = BPE
print_info: n_vocab = 152064
print_info: n_merges = 151387
print_info: BOS token = 151643 '<|endoftext|>'
print_info: EOS token = 151645 '<|im_end|>'
print_info: EOT token = 151645 '<|im_end|>'
print_info: PAD token = 151643 '<|endoftext|>'
print_info: LF token = 198 'Ċ'
print_info: FIM PRE token = 151659 '<|fim_prefix|>'
print_info: FIM SUF token = 151661 '<|fim_suffix|>'
print_info: FIM MID token = 151660 '<|fim_middle|>'
print_info: FIM PAD token = 151662 '<|fim_pad|>'
print_info: FIM REP token = 151663 '<|repo_name|>'
print_info: FIM SEP token = 151664 '<|file_sep|>'
print_info: EOG token = 151643 '<|endoftext|>'
print_info: EOG token = 151645 '<|im_end|>'
print_info: EOG token = 151662 '<|fim_pad|>'
print_info: EOG token = 151663 '<|repo_name|>'
print_info: EOG token = 151664 '<|file_sep|>'
print_info: max token length = 256
load_tensors: loading model tensors, this can take a while... (mmap = false)
load_tensors: CPU model buffer size = 8566.04 MiB
llama_init_from_model: n_seq_max = 4
llama_init_from_model: n_ctx = 8192
llama_init_from_model: n_ctx_per_seq = 2048
llama_init_from_model: n_batch = 2048
llama_init_from_model: n_ubatch = 512
llama_init_from_model: flash_attn = 0
llama_init_from_model: freq_base = 1000000.0
llama_init_from_model: freq_scale = 1
llama_init_from_model: n_ctx_per_seq (2048) < n_ctx_train (32768) -- the full capacity of the model will not be utilized
llama_kv_cache_init: kv_size = 8192, offload = 1, type_k = 'f16', type_v = 'f16', n_layer = 48, can_shift = 1
llama_kv_cache_init: CPU KV buffer size = 1536.00 MiB
llama_init_from_model: KV self size = 1536.00 MiB, K (f16): 768.00 MiB, V (f16): 768.00 MiB
llama_init_from_model: CPU output buffer size = 2.40 MiB
llama_init_from_model: CPU compute buffer size = 696.01 MiB
llama_init_from_model: graph nodes = 1686
llama_init_from_model: graph splits = 1
time=2025-04-26T18:58:23.282+02:00 level=INFO source=server.go:619 msg="llama runner started in 7.03 seconds"
[GIN] 2025/04/26 - 18:58:40 | 200 | 25.040628849s | 127.0.0.1 | POST "/v1/chat/completions"
[GIN] 2025/04/26 - 18:58:47 | 200 | 31.493391109s | 127.0.0.1 | POST "/v1/chat/completions"
[GIN] 2025/04/26 - 19:00:50 | 200 | 822.202µs | 127.0.0.1 | GET "/api/tags"
And here's the nvidia-smi output:
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.120 Driver Version: 550.120 CUDA Version: 12.4 |
|-----------------------------------------+------------------------+----------------------+
| GPU Name Persistence-M | Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap | Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|=========================================+========================+======================|
| 0 NVIDIA GeForce RTX 3060 Off | 00000000:09:00.0 On | N/A |
| 0% 54C P8 16W / 170W | 827MiB / 12288MiB | 13% Default |
| | | N/A |
+-----------------------------------------+------------------------+----------------------+
+-----------------------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=========================================================================================|
| 0 N/A N/A 3315 G /usr/lib/xorg/Xorg 241MiB |
| 0 N/A N/A 3539 G /usr/bin/gnome-shell 184MiB |
| 0 N/A N/A 3777 G /usr/bin/ckb-next 2MiB |
| 0 N/A N/A 4417 G /usr/libexec/xdg-desktop-portal-gnome 63MiB |
| 0 N/A N/A 4463 G ...erProcess --variations-seed-version 45MiB |
| 0 N/A N/A 8671 G /app/lib/firefox/firefox 161MiB |
| 0 N/A N/A 283017 G /usr/bin/gnome-system-monitor 25MiB |
| 0 N/A N/A 289563 C+G /usr/bin/gjs 24MiB |
| 0 N/A N/A 289762 G ...erProcess --variations-seed-version 47MiB |
| 0 N/A N/A 295618 G /usr/bin/nvidia-settings 0MiB |
+-----------------------------------------------------------------------------------------+
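For what it's worth, the log itself suggests no offload happened: load_tensors reports only a CPU model buffer and no CUDA device. A hedged sketch of checks from the host, assuming standard Flatpak tooling (these are illustrative checks, not a confirmed fix):
# See which device/GPU overrides are applied to the sandbox
flatpak override --user --show com.jeffser.Alpaca
# Check whether the sandbox can see the NVIDIA device nodes
flatpak run --command=sh com.jeffser.Alpaca -c 'ls /dev/nvidia*'
# Check that a matching NVIDIA userspace driver extension is installed
flatpak list | grep -i nvidia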
-
The app doesn't work the way it's supposed to 😠
Please report the problem on the issues page.
I want to suggest something! ✨
Open an issue or comment on this discussion.
I need help with something 🙏🏻
Comment on this discussion.